Comparison of machine learning methods in bad loan predictions: analysis of micro finance data
Keywords:
credit risk management, Machine Learning Algorithms, Finance
Abstract
Credit risk management is crucial for lending institutions to safeguard themselves against financial losses and maintain financial stability. Machine learning methods have proved useful in analyzing borrower data and identifying bad loans that would be almost impossible for humans to detect. Client data from a microfinance institution (MFI) in Uzbekistan were used to build machine learning models that predict loan delinquency. The models' performance was evaluated on five metrics: accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). The findings suggest that none of the machine learning methods used in the study has an absolute advantage over the rest across all five performance metrics. However, Extreme Gradient Boosting (XGBoost) produced the highest average performance compared to the other methods.
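As an illustration of the five evaluation metrics named in the abstract, the following minimal Python sketch computes them from a confusion matrix. It is not the study's code: the function names, the toy labels, the coding of a bad loan as the positive class (1), and the reading of "average performance" as the unweighted mean of the five metrics are all assumptions made for this example.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the bad-loan class (assumed coding)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else math.nan


def five_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV and NPV from confusion counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # share of bad loans that were caught
        "specificity": _ratio(tn, tn + fp),  # share of good loans that were cleared
        "ppv": _ratio(tp, tp + fp),          # share of flagged loans that are bad
        "npv": _ratio(tn, tn + fn),          # share of cleared loans that are good
    }


def average_performance(metrics):
    """Unweighted mean of the metrics (an assumed definition of 'average performance')."""
    return sum(metrics.values()) / len(metrics)


# Hypothetical toy labels: 1 = bad (delinquent) loan, 0 = good loan.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = five_metrics(*counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"average: {average_performance(metrics):.3f}")
```

Comparing several classifiers on the same held-out labels with a routine like this makes the abstract's point concrete: one model can lead on sensitivity while another leads on specificity, so a single summary such as the mean is only one of several reasonable ways to rank them.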