DETERMINING WORD STATISTICS USING COUNTVECTORIZER
Keywords:
word statistics, text processing, frequency, text, tokenization
Abstract
This article gives an overview of CountVectorizer, an important tool for preparing text for natural language processing and machine learning. It explains how CountVectorizer works and uses it to obtain word frequencies and a document-term matrix, which form the basis for word statistics and text analysis.