Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Context resolution of homonymy based on a centroid-context model

https://doi.org/10.15514/ISPRAS-2022-34(5)-11

Abstract

The article describes a new method for the contextual resolution of homonymy based on a centroid-context model (CCM). The method detects cases of homonymy in a text corpus and resolves them using the CCM. It rests on the theoretical concept of phraseological conceptual analysis of texts (FCAT) and on a unique machine grammar built on a system of inflectional classes of Russian words. The concept of inflectional classes establishes a strict correspondence between the form of a word and its grammatical information. This correspondence made it possible to create new classes on that basis: classes of words that share the same sets of grammatical features, matching their forms of representation in similar contextual environments. In developing the model, the authors proceeded from the following hypothesis: identical sequences of generalized symbols of word classes (generalized syntagms) should correspond to identical syntactic structures in different text fragments. It was further assumed that this hypothesis holds for any syntactic model and can be useful in solving both general and specific problems of text analysis. Using this approach, the authors propose a new solution to the problem of homonymy resolution based on the CCM.
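
The abstract does not specify how the CCM is computed, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a hypothetical toy lexicon (English words are used purely for readability; the article deals with Russian), hypothetical class symbols (DET, NOUN, VERB and so on), a context defined as the class symbols of the immediate left and right neighbors, and cosine similarity to a per-class centroid of such contexts. All names, data, and the choice of similarity measure are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon: word form -> possible generalized class symbols.
LEXICON = {
    "the": {"DET"}, "a": {"DET"}, "old": {"ADJ"},
    "we": {"PRON"}, "they": {"PRON"},
    "wood": {"NOUN"}, "hammer": {"NOUN"}, "bought": {"VERB"},
    "saw": {"NOUN", "VERB"},   # homonymous form
    "cuts": {"NOUN", "VERB"},  # homonymous form
}

# Hypothetical training sentences with every word already assigned a class.
TRAIN = [
    [("the", "DET"), ("hammer", "NOUN"), ("cuts", "VERB"), ("wood", "NOUN")],
    [("we", "PRON"), ("bought", "VERB"), ("a", "DET"), ("hammer", "NOUN")],
    [("they", "PRON"), ("bought", "VERB"), ("the", "DET"),
     ("old", "ADJ"), ("wood", "NOUN")],
]


def context_features(classes, i):
    """Describe position i by the class symbols of its two neighbors."""
    left = classes[i - 1] if i > 0 else "<s>"
    right = classes[i + 1] if i + 1 < len(classes) else "</s>"
    return Counter({f"L:{left}": 1, f"R:{right}": 1})


def build_centroids(tagged_sentences):
    """Average the context vectors observed for each class symbol."""
    sums, counts = {}, Counter()
    for sentence in tagged_sentences:
        classes = [cls for _, cls in sentence]
        for i, cls in enumerate(classes):
            sums.setdefault(cls, Counter()).update(context_features(classes, i))
            counts[cls] += 1
    return {cls: {f: v / counts[cls] for f, v in vec.items()}
            for cls, vec in sums.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(f, 0.0) for f, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve(words, centroids):
    """Assign one class symbol per word; ambiguous forms take the class
    whose context centroid is closest to the observed context."""
    options = [LEXICON[w] for w in words]
    # Ambiguous neighbors are represented by the placeholder symbol "AMB".
    classes = [next(iter(o)) if len(o) == 1 else "AMB" for o in options]
    result = list(classes)
    for i, candidates in enumerate(options):
        if len(candidates) > 1:
            ctx = context_features(classes, i)
            result[i] = max(sorted(candidates),
                            key=lambda c: cosine(ctx, centroids.get(c, {})))
    return result


CENTROIDS = build_centroids(TRAIN)
print(resolve(["we", "saw", "the", "wood"], CENTROIDS))
print(resolve(["the", "saw", "cuts", "wood"], CENTROIDS))
```

In this toy setting the form "saw" is resolved as a verb after a pronoun and as a noun after a determiner, because the resulting sequences of class symbols match contexts seen for those classes in the training sentences. This mirrors, in a very simplified way, the hypothesis that identical generalized syntagms correspond to identical syntactic structures.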

About the Authors

Alexander Alexeevich KHOROSHILOV
Federal Research Center «Computer Science and Control» of the Russian Academy of Sciences, Moscow Aviation Institute, 27th Central Research Institute of the Ministry of Defence of the Russian Federation
Russian Federation

Doctor of Science, Professor of the Moscow Aviation Institute (National Research University), Lead Researcher of the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Senior Researcher of the 27th Central Research Institute of the Ministry of Defence of the Russian Federation



Yuri Viktorovich NIKITIN
Federal Research Center «Computer Science and Control» of the Russian Academy of Sciences, Scientific and Industrial Company «High Technologies and Strategic Systems»
Russian Federation

Researcher of the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Development Team Leader of the Scientific and Industrial Company "High Technologies and Strategic Systems"



Anna Vladimirovna KAN
Moscow Aviation Institute, National Research Center «Zhukovsky Institute»
Russian Federation

Candidate of Technical Sciences, Associate Professor of the Moscow Aviation Institute, Head of the Analytical Department of the National Research Center "Zhukovsky Institute"



Yana Dmitrievna KOZLOVSKAYA
Scientific and Industrial Company «High Technologies and Strategic Systems»
Russian Federation

Development team member



Ekaterina Andreevna EVDOKIMOVA
Moscow Aviation Institute
Russian Federation

Student



For citations:


KHOROSHILOV A.A., NIKITIN Yu.V., KAN A.V., KOZLOVSKAYA Ya.D., EVDOKIMOVA E.A. Context resolution of homonymy based on a centroid-context model. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2022;34(5):171-182. (In Russ.) https://doi.org/10.15514/ISPRAS-2022-34(5)-11



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)