APPLYING A MAXIMUM ENTROPY-BASED BiLSTM-CRF MODEL FOR POS TAGGING
Keywords:
Natural language processing, NLP, LSTM, BiLSTM, CRF, POS, NER, orthographic features, machine learning, artificial intelligence, neural networks

Abstract
This paper proposes models based on different Long Short-Term Memory (LSTM) networks for correcting spelling errors and for part-of-speech tagging of Uzbek-language texts. These models include plain LSTM networks, BiLSTM networks, an LSTM combined with a Conditional Random Field (CRF) layer (LSTM-CRF), and a BiLSTM combined with a CRF layer (BiLSTM-CRF). This study is the first to apply the BiLSTM-CRF model to standard natural language processing (NLP) datasets for spelling correction and POS tagging in Uzbek. The advantage of the BiLSTM-CRF model is that, through the BiLSTM, it can efficiently learn features from both the preceding and the following context, and, through the CRF layer, it also takes sentence-level tag information into account. The model can reach accuracy at or near the state of the art on part-of-speech (POS) tagging, chunking, and named entity recognition (NER), while remaining robust and less dependent on word embeddings. The best result on the test set is 93.6%.
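The abstract describes the BiLSTM-CRF architecture only at a high level, so the sketch below is not the paper's implementation. It is a minimal illustration of the named technique, assuming PyTorch and the third-party pytorch-crf package for the CRF layer; the vocabulary size, embedding and hidden dimensions, and tag-set size are placeholder values chosen for the example, not values from the paper.

```python
# Minimal BiLSTM-CRF tagger sketch (assumption: PyTorch + the third-party
# `pytorch-crf` package, installed with `pip install pytorch-crf`).
# All sizes below are illustrative placeholders, not the paper's settings.
import torch
import torch.nn as nn
from torchcrf import CRF


class BiLSTMCRFTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # The bidirectional LSTM reads each sentence left-to-right and
        # right-to-left, so every position sees past and future context.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim // 2,
                              bidirectional=True, batch_first=True)
        self.emissions = nn.Linear(hidden_dim, tagset_size)
        # The CRF layer scores whole tag sequences, enforcing sentence-level
        # consistency between neighbouring POS tags.
        self.crf = CRF(tagset_size, batch_first=True)

    def forward(self, token_ids, tags, mask):
        feats, _ = self.bilstm(self.embedding(token_ids))
        scores = self.emissions(feats)
        # Negative log-likelihood of the gold tag sequence (training loss).
        return -self.crf(scores, tags, mask=mask, reduction='mean')

    def decode(self, token_ids, mask):
        feats, _ = self.bilstm(self.embedding(token_ids))
        # Viterbi decoding returns the best tag sequence for each sentence.
        return self.crf.decode(self.emissions(feats), mask=mask)


# Toy usage with random indices (real input would be Uzbek token/tag ids).
model = BiLSTMCRFTagger(vocab_size=5000, tagset_size=12)
tokens = torch.randint(1, 5000, (2, 7))        # batch of 2 sentences, length 7
tags = torch.randint(0, 12, (2, 7))            # gold POS tag ids
mask = torch.ones(2, 7, dtype=torch.bool)      # no padding in this toy batch
loss = model(tokens, tags, mask)
predicted = model.decode(tokens, mask)
```

In this kind of architecture the BiLSTM output at each position supplies per-tag emission scores, while the CRF learns transition scores between adjacent tags and Viterbi decoding picks the globally best sequence, which corresponds to the sentence-level tag information the abstract refers to.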
License
Copyright (c) 2025 Turayev Boburxon Shuhrat o‘g‘li

This work is licensed under a Creative Commons Attribution 4.0 International License.