CREATING LANGUAGE MODELS BASED ON TEXTS OF THE UZBEK LANGUAGE CORPUS
Keywords:
language models, NLP, language corpus, UzbNlp package, unigram, bigram, trigram, n-gram, word prediction
Abstract
Language models are used in NLP tasks such as speech recognition, machine translation, POS tagging, intelligent text analysis, text generation, and word prediction, and they are an important component of many such systems. This article presents the stages of developing language models and describes the n-gram method for building them from the texts of the Uzbek language corpus.
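As a rough illustration of the n-gram idea mentioned in the abstract, the following minimal Python sketch counts n-grams over tokenized sentences and ranks next-word candidates by relative frequency (a maximum-likelihood estimate without smoothing). The function names `build_ngram_counts` and `predict_next`, the boundary markers `<s>` and `</s>`, and the three toy sentences are all assumptions made for illustration; they are not taken from the article, from the Uzbek language corpus, or from the UzbNlp package.

```python
from collections import Counter, defaultdict


def build_ngram_counts(sentences, n):
    """Map each (n-1)-word context to a Counter of the words that follow it.

    `sentences` is a list of token lists. Hypothetical helper, not UzbNlp API.
    """
    counts = defaultdict(Counter)
    for tokens in sentences:
        # Pad so that the first real word also has a full-length context.
        padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
        for i in range(len(padded) - n + 1):
            context = tuple(padded[i:i + n - 1])
            word = padded[i + n - 1]
            counts[context][word] += 1
    return counts


def predict_next(counts, context, k=3):
    """Return up to k (word, probability) pairs for the given context.

    Probabilities are plain relative frequencies (no smoothing), so an
    unseen context yields an empty list.
    """
    followers = counts.get(tuple(context))
    if not followers:
        return []
    total = sum(followers.values())
    return [(word, count / total) for word, count in followers.most_common(k)]


# Toy sentences invented for this example (not corpus data).
corpus = [
    "men kitob o'qiyman".split(),
    "men kitob yozaman".split(),
    "men maktabga boraman".split(),
]

bigrams = build_ngram_counts(corpus, 2)    # n = 2: one-word context
trigrams = build_ngram_counts(corpus, 3)   # n = 3: two-word context

predictions = predict_next(bigrams, ["men"])
print(predictions)
```

Setting `n` to 1, 2, or 3 gives unigram, bigram, and trigram counts respectively. A practical model built from a real corpus would normally add smoothing or backoff so that unseen contexts still receive non-zero probabilities.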