CHALLENGES IN CORPUS LINGUISTICS AND MACHINE TRANSLATION

Authors

  • Nazirova Elmira
  • Abdurakhmonova Nilufar
  • Usmonova Kamola

Keywords:

corpus linguistics, language patterns, machine translation, morphological complexity, lexical gaps, cultural nuances, linguistic diversity, data collection.

Abstract

This article addresses systematic problems that arise in the development of corpora. Corpus creation raises linguistic and statistical issues that ultimately govern the entire process. Building a corpus requires careful attention to factors such as corpus size, data collection methods, and the organization of textual materials. These issues matter not only for widely spoken languages such as English and Turkish but are also of vital importance for less-resourced languages used in developing countries. The article explores these issues in detail.

Published

2024-06-24

How to Cite

CHALLENGES IN CORPUS LINGUISTICS AND MACHINE TRANSLATION. (2024). «CONTEMPORARY TECHNOLOGIES OF COMPUTATIONAL LINGUISTICS», 2(22.04), 130-133. https://myscience.uz/index.php/linguistics/article/view/33
