Identifying the metrics that make crowdfunding campaigns markedly successful, particularly on the Kickstarter website, is of growing interest to investors, companies, and entrepreneurs. This study seeks to gauge the importance of key predictive variables (features) through statistical analysis, to identify, on the basis of performance assessment, machine learning methods that predict the success of a campaign, and to compare the selected machine learning algorithms. To achieve these research objectives and maximize insight into the dataset used, feature engineering was performed. Machine learning models, including Logistic Regression (LR), Support Vector Machines (SVMs), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), and random forest analysis (bagging and boosting), were then fitted and compared via cross-validation approaches in terms of their test error rates, F1 scores, accuracy, precision, and recall. Across the three cross-validation approaches, the test error rates and the other classification metrics identified bagging and gradient boosting as the more robust methods for predicting the success of Kickstarter projects. The major research objectives of this paper have been achieved by assessing the performance of key statistical learning methods, which guides the choice of learning method or model and gives a measure of the quality of the ultimately chosen model. Bayesian semi-parametric approaches remain a consideration for future research. These methods allow the use of an infinite number of parameters to capture information about the underlying distributions of even more complex data.
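The classification metrics named above can be illustrated with a minimal sketch. It assumes binary labels in which 1 marks a successful campaign, and it uses plain contiguous k-fold splitting as one possible cross-validation scheme; the function names `classification_metrics` and `k_fold_indices` and the toy labels are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return error rate, accuracy, precision, recall and F1 for binary labels.

    Assumption: label 1 means "successful campaign", label 0 means "failed".
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    n = len(y_true)
    accuracy = (tp + tn) / n
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    return {
        "error_rate": 1.0 - accuracy,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) index lists."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")

    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Toy labels (hypothetical, for illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
folds = k_fold_indices(10, 3)
```

In a comparison of this kind, each candidate model would be fitted on the training indices of every fold, scored on the held-out indices with `classification_metrics`, and the per-fold scores averaged to estimate its test error rate.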
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).