METHODS FOR SPLITTING A DATASET INTO TRAINING, VALIDATION, AND TEST SETS
Keywords:
dataset training, training set, validation set, test set, Machine Learning, ML, accuracy, precision, recall, F1 score
Abstract
Machine learning algorithms learn patterns in data and use them to make predictions on new data. It is important to evaluate the performance of a trained artificial intelligence model on data it has not seen during training. This is achieved by partitioning, or splitting, the data. This article presents methods for splitting a dataset into training, validation, and test sets in machine learning, and for evaluating a model using accuracy, precision, recall, and the F1 score.
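As an illustrative sketch only (not code from the article), the snippet below shows one common way to realize such a split with scikit-learn and to compute the four metrics; the 80/10/10 ratio, the synthetic dataset, and the LogisticRegression model are assumptions made solely for this example.

# A minimal sketch: train/validation/test split plus the four evaluation metrics.
# Assumptions: 80/10/10 split, synthetic binary data, logistic regression model.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Synthetic binary-classification data (1000 samples, 20 features).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# First hold out the test set (10%), then carve the validation set
# (10% of the whole, i.e. 1/9 of the remainder) out of the rest.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=1/9, stratify=y_trainval, random_state=42)

# Fit on the training set, tune/monitor on the validation set,
# and report final metrics only on the held-out test set.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))

Stratified splitting (the stratify argument) keeps the class proportions similar across the three subsets, which matters for the precision, recall, and F1 comparisons discussed in the article.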
License
Copyright (c) 2024 Oʻtkir Hamdamov, Botir Elov, Ruhillo Alayev
This work is licensed under a Creative Commons Attribution 4.0 International License.