Automatic search for fragments containing biographical information in a natural language text
https://doi.org/10.15514/ISPRAS-2018-30(6)-12
For citations:
Glazkova A.V. Automatic search for fragments containing biographical information in a natural language text. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2018;30(6):221-236. (In Russ.) https://doi.org/10.15514/ISPRAS-2018-30(6)-12