FUZZY LOGIC BASED ON LANGUAGE MODELS: BUILDING A HYBRID SYSTEM FOR DIALECT CLASSIFICATION AND DIALECTISM DETECTION
Keywords:
dialect classification, neural network, language model, BERT, fuzzy logic, Mamdani model.
Abstract
This work aims to build an information system that classifies dialects and identifies dialectisms by combining a neural network model with an inference mechanism based on fuzzy logic rules. The system has no direct analogues for the Russian language, which underscores the relevance of the work.
The system was implemented in Python, using libraries for natural language processing and neural network training, Google's BERT model, and a fuzzy rule base built for the Mamdani model.
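To make the role of the Mamdani rule base concrete, the following is a minimal sketch of Mamdani inference in plain Python. It is not the system's actual rule base: the two inputs (`confidence`, a dialect classifier's confidence, and `rarity`, a word rarity score), the three triangular sets, and every rule in `RULES` are assumptions chosen only for illustration.

```python
# Fuzzy sets shared by both inputs and the output, all on the universe [0, 1].
# Each set is a triangle (left foot, peak, right foot); the outer feet of
# "low" and "high" lie outside [0, 1] so they act as shoulders at the edges.
SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Hypothetical rule base (an assumption, not taken from the paper):
# (dialect confidence, word rarity) -> degree to which a word is a dialectism.
RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "high",
}


def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mamdani(confidence, rarity, steps=100):
    """Return a crisp dialectism score in [0, 1] for two crisp inputs."""
    # Clamp inputs to the universe so that at least one rule always fires.
    confidence = min(max(confidence, 0.0), 1.0)
    rarity = min(max(rarity, 0.0), 1.0)

    # 1. Fuzzification: membership of each input in each set.
    mu_conf = {name: tri(confidence, *abc) for name, abc in SETS.items()}
    mu_rar = {name: tri(rarity, *abc) for name, abc in SETS.items()}

    # 2. Rule evaluation: AND is min; keep the strongest firing per output set.
    strength = {name: 0.0 for name in SETS}
    for (c_set, r_set), out_set in RULES.items():
        fire = min(mu_conf[c_set], mu_rar[r_set])
        strength[out_set] = max(strength[out_set], fire)

    # 3. Implication (clip with min), aggregation (max), and
    # 4. centroid defuzzification over a discretized output universe.
    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps
        mu = max(min(strength[name], tri(y, *SETS[name])) for name in SETS)
        num += y * mu
        den += mu
    return num / den


likely = mamdani(0.9, 0.8)    # confident dialect prediction, rare word
unlikely = mamdani(0.1, 0.2)  # weak dialect prediction, common word
print(likely, unlikely)
```

The min/max operators and centroid defuzzification are the classic Mamdani choices; a library such as scikit-fuzzy could replace the hand-written loop, but the plain version keeps each inference step visible.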
The research included a review of the field of dialectology and a survey of natural language processing and machine learning methods applicable to dialect tasks. The author proposed a neural network architecture for dialect classification and formulated the basic concepts of an algorithm for detecting dialectisms. The neural network model was then trained on BERT outputs, and an experimental version of the dialectism detection algorithm was implemented.
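The idea of training a classifier on BERT outputs can be sketched as a softmax layer fitted on fixed sentence vectors. The abstract does not specify the actual architecture, so everything below is an assumption made for illustration: the label set in `LABELS`, the 4-dimensional toy vectors in `TOY_DATA` (standing in for BERT sentence embeddings, which are 768-dimensional for BERT-base), and the single linear layer trained by stochastic gradient descent.

```python
import math
import random

# Hypothetical dialect groups; the real label set is not given in the abstract.
LABELS = ["northern", "southern", "central"]

# Toy stand-ins for BERT sentence embeddings, paired with a label index.
TOY_DATA = [
    ([1.0, 0.0, 0.0, 0.1], 0),
    ([0.9, 0.1, 0.0, 0.0], 0),
    ([0.0, 1.0, 0.0, 0.1], 1),
    ([0.1, 0.9, 0.0, 0.0], 1),
    ([0.0, 0.0, 1.0, 0.1], 2),
    ([0.0, 0.1, 0.9, 0.0], 2),
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def predict(W, b, x):
    """Class probabilities of a linear softmax head for one embedding x."""
    logits = [sum(w * v for w, v in zip(row, x)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)


def train(data, dim, n_classes, lr=0.5, epochs=200, seed=0):
    """Fit the head with plain SGD on the cross-entropy loss."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.01, 0.01) for _ in range(dim)]
         for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in data:
            p = predict(W, b, x)
            for k in range(n_classes):
                # Gradient of cross-entropy with respect to logit k.
                g = p[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    W[k][j] -= lr * g * x[j]
                b[k] -= lr * g
    return W, b


W, b = train(TOY_DATA, dim=4, n_classes=len(LABELS))
probs = predict(W, b, [0.95, 0.05, 0.0, 0.0])
print(LABELS[probs.index(max(probs))], probs)
```

In a real pipeline the input vectors would come from a pretrained BERT encoder and the head would typically be trained with a deep learning framework. The class probabilities produced this way are the kind of confidence signal that a fuzzy rule base can take as input when deciding whether a word is a dialectism.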