METHODS FOR SPLITTING A DATASET INTO TRAINING, VALIDATION, AND TEST SETS
Keywords:
dataset training, training set, validation set, test set, Machine Learning, ML, accuracy, precision, recall, F1 score
Abstract
Machine learning algorithms learn patterns in data and use them to make predictions on new data. It is important to evaluate the performance of a trained artificial intelligence model on data it has not seen during training. This is achieved by partitioning, or splitting, the data. This article presents methods for splitting a dataset into training, validation, and test sets in machine learning, and for evaluating a model using accuracy, precision, recall, and the F1 score.
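As an illustrative sketch only (not code from the article), the snippet below shows one common way to realize such a split with scikit-learn and to compute the four metrics; the 80/10/10 ratio, the synthetic dataset, and the LogisticRegression model are assumptions made solely for this example.

# A minimal sketch: train/validation/test split plus the four evaluation metrics.
# Assumptions: 80/10/10 split, synthetic binary data, logistic regression model.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Synthetic binary-classification data (1000 samples, 20 features).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# First hold out the test set (10%), then carve the validation set
# (10% of the whole, i.e. 1/9 of the remainder) out of the rest.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=1/9, stratify=y_trainval, random_state=42)

# Fit on the training set, tune/monitor on the validation set,
# and report final metrics only on the held-out test set.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))

Stratified splitting (the stratify argument) keeps the class proportions similar across the three subsets, which matters for the precision, recall, and F1 comparisons discussed in the article.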
License
Copyright (c) 2024 Oʻtkir Hamdamov, Botir Elov, Ruhillo Alayev
This work is licensed under a Creative Commons Attribution 4.0 International License.