MATHEMATICAL MODELS DEVELOPMENT FOR CREDIT INSTITUTIONS CLASSIFICATION USING DECISION TREES AND THEIR ENSEMBLES
https://doi.org/10.26583/vestnik.2024.350
EDN: LGWWEH
Abstract
The article examines the use of decision trees and their ensembles (decision forests) for classifying credit institutions as objects of economic security. Although these methods have been applied successfully in the banking sector, they have not previously been used for automated identification of unreliable credit institutions. The classification was based on bank reporting form No. 101. The analysis identified key performance indicators of credit institutions, namely "Profit", "Accounts with the Bank of Russia", and "Securities". Using these indicators, the CART model achieved a classification accuracy of 85%. The Random Forest, AdaBoost, and XGBoost models used all 23 indicators from the financial statements (form No. 101) and achieved accuracies of 83%, 80%, and 80%, respectively. Thus, an urgent scientific and practical problem has been solved: mathematical models have been developed that make it possible to identify high-risk credit institutions and predict the risk of license revocation. Applying these models produced a list of potentially unreliable credit institutions to which government authorities are advised to pay close attention.
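To illustrate how a CART-style tree could separate institutions on a single indicator such as "Profit", the following is a minimal sketch of choosing one split threshold. It assumes Gini impurity as the splitting criterion, which the abstract does not specify, and the numbers are invented toy values, not data from form No. 101. The names `gini` and `best_split` are hypothetical helpers, not the authors' code.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, weighted_impurity). The threshold is None when no
    split improves on the impurity of the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Invented toy values: a "Profit" indicator and whether the license was revoked (1) or not (0).
profit = [-3.0, -1.5, -0.5, 0.8, 1.2, 2.5]
revoked = [1, 1, 1, 0, 0, 0]

threshold, impurity = best_split(profit, revoked)
print(f"split at Profit <= {threshold:.2f}, weighted Gini = {impurity:.2f}")
```

A full tree repeats this search over every indicator at every node. Ensembles such as Random Forest, AdaBoost, and XGBoost combine many such trees, each in its own way.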
About the Authors
E. P. Akishina
Russian Federation
V. V. Ivanov
Russian Federation
A. V. Kryanev
Russian Federation
A. S. Prikazchikova
Russian Federation
For citations:
Akishina E.P., Ivanov V.V., Kryanev A.V., Prikazchikova A.S. MATHEMATICAL MODELS DEVELOPMENT FOR CREDIT INSTITUTIONS CLASSIFICATION USING DECISION TREES AND THEIR ENSEMBLES. Vestnik natsional'nogo issledovatel'skogo yadernogo universiteta "MIFI". 2024;13(4):242-250. (In Russ.) https://doi.org/10.26583/vestnik.2024.350. EDN: LGWWEH