IDENTIFYING PARAPHRASE UNITS IN UZBEK-LANGUAGE TEXTS
Keywords:
paraphrase sentences, semantic similarity, Jaccard algorithm, GloVe algorithm, machine learning
Abstract
Identifying paraphrase pairs within Uzbek-language texts makes many other NLP tasks easier to solve. Unlike plagiarism detection systems, this task looks for duplicates within a single text. Many ML algorithms are used to identify paraphrased sentences. This article sets out the sequence of steps for using the Jaccard algorithm to measure sentence similarity and the GloVe algorithm to build sets of semantically similar words. It also describes the users of the paraphrase detection information system and the processes involved in using it.
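As an illustration of the Jaccard step named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that sentences are compared as sets of lowercased word tokens and that a pair counts as a paraphrase candidate when its score reaches a threshold. The function names, the tokenization rule, the sample sentences, and the 0.5 threshold are all hypothetical choices made for this example.

```python
import re
from itertools import combinations


def tokenize(sentence: str) -> set[str]:
    """Lowercase a sentence and return its set of word tokens.

    Assumption: a token is a run of letters, optionally joined by an
    apostrophe-like mark so that Uzbek forms such as "o‘qiyapman" stay whole.
    """
    pattern = r"[^\W\d_]+(?:[‘’'ʻ][^\W\d_]+)*"
    return set(re.findall(pattern, sentence.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def find_paraphrase_candidates(
    sentences: list[str], threshold: float = 0.5
) -> list[tuple[int, int, float]]:
    """Compare every sentence pair in one text and keep pairs at or above threshold."""
    token_sets = [tokenize(s) for s in sentences]
    candidates = []
    for i, j in combinations(range(len(sentences)), 2):
        score = jaccard(token_sets[i], token_sets[j])
        if score >= threshold:
            candidates.append((i, j, score))
    return candidates


# Hypothetical sample text, split into sentences beforehand.
sentences = [
    "Bugun havo juda issiq",
    "Bugun havo issiq",
    "Men kitob o‘qiyapman",
]
for i, j, score in find_paraphrase_candidates(sentences):
    print(i, j, score)
```

Plain token overlap misses paraphrases that use different words. One way to address this, under the same assumptions, would be to map each token to a representative of its semantically similar word set (for example, one built from GloVe vectors) before computing the score.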
License
Copyright (c) 2025 Axmedova Xusniya Xusanovna, Sultanov Djamshid Baxodirovich

This work is licensed under a Creative Commons Attribution 4.0 International License.