The Problem of Validating Modern Grammatical Error Correction Systems: The Case of Character-Level Errors

Vladimir Mironovich STARCHENKO; Alexey Mironovich STARCHENKO

doi:10.15514/ISPRAS-2022-35(5)-14

Abstract

This study examines how modern grammatical error correction (GEC) systems handle character-level errors. It discusses how such errors can affect model performance and evaluates how models with different architectures cope with them. We conclude that specialized GEC systems struggle to correct errors that produce non-existent words, and that preprocessing the input with a simple spellchecker considerably improves a model's overall performance. To assess this, the models are tested on several validation datasets. In addition to the validation dataset of the CoNLL-2014 shared task, the paper contributes a synthetic dataset with a higher density of character-level errors. Comparing model performance on the two datasets, we conclude that validation datasets with a high density of errors that are difficult for models are a useful tool for model comparison. The paper also identifies cases in which non-existent words are annotated incorrectly in the expert annotation and offers a cleaned version of the dataset. Unlike specialized GEC systems, the LLaMA model, when used for the GEC task, handles character-level errors well. We hypothesize that this result is explained by the fact that the model is not trained on a dedicated annotated corpus containing errors but instead receives grammatically and orthographically correct texts as input.
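The two ideas in the abstract, a synthetic dataset with denser character-level errors and a simple spellchecking pass before the GEC model, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `corrupt_word`, `inject_errors` and `spellcheck`, the toy vocabulary, the error rate, and the similarity cutoff are all assumptions made for illustration, and the standard-library `difflib` matcher stands in for whatever spellchecker the paper actually uses.

```python
import difflib
import random
import string


def corrupt_word(word: str, rng: random.Random) -> str:
    """Apply one random character-level edit: delete, insert, substitute or transpose."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["delete", "insert", "substitute", "transpose"])
    letter = rng.choice(string.ascii_lowercase)
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + letter + word[i:]
    if op == "substitute":
        return word[:i] + letter + word[i + 1:]
    # transpose two adjacent characters
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def inject_errors(tokens, vocab, rate, seed=0):
    """Corrupt roughly `rate` of the alphabetic tokens (hypothetical helper).

    A corruption is kept only if it yields a non-existent word,
    i.e. a string that is absent from `vocab`.
    """
    rng = random.Random(seed)
    noisy = []
    for tok in tokens:
        if tok.isalpha() and rng.random() < rate:
            candidate = corrupt_word(tok.lower(), rng)
            if candidate != tok.lower() and candidate not in vocab:
                noisy.append(candidate)
                continue
        noisy.append(tok)
    return noisy


def spellcheck(tokens, vocab, cutoff=0.75):
    """Replace out-of-vocabulary tokens with the closest vocabulary word, if any.

    The cutoff value is an assumption chosen for this toy example.
    """
    candidates = sorted(vocab)
    fixed = []
    for tok in tokens:
        if tok.isalpha() and tok.lower() not in vocab:
            match = difflib.get_close_matches(tok.lower(), candidates, n=1, cutoff=cutoff)
            fixed.append(match[0] if match else tok)
        else:
            fixed.append(tok)
    return fixed


if __name__ == "__main__":
    vocab = {"the", "spelling", "is", "hard"}
    clean = ["the", "spelling", "is", "hard", "."]
    noisy = inject_errors(clean, vocab, rate=0.5, seed=3)
    print(noisy)
    print(spellcheck(noisy, vocab))  # this output would then be passed to a GEC model
```

Raising `rate` gives a validation set in which non-existent words are more frequent than in naturally occurring learner text, which is the property the abstract attributes to the synthetic dataset. The `spellcheck` step runs before the GEC model, so the model would only see tokens that exist in the vocabulary.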

Keywords

automatic grammatical error correction, validation, spellchecking, preprocessing, synthetic datasets.

About the authors

Vladimir Mironovich STARCHENKO

National Research University Higher School of Economics
Russia

Doctor of Technical Sciences, Professor, Head of the Department of Applied Mathematics and Informatics at the Institute for System Programming since 2004. Research interests: algebraic structures in Galois fields, modular arithmetic, neurocomputer technologies, digital signal processing, cryptographic methods of information protection.

Alexey Mironovich STARCHENKO

National Research University Higher School of Economics
Russia

Specialist at the Department of System Programming of Lomonosov Moscow State University. His research interests include pattern recognition and residue number systems.

For citation:

STARCHENKO V.M., STARCHENKO A.M. Here We Go Again: Modern GEC Models Need Help with Spelling. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(5):215-228. https://doi.org/10.15514/ISPRAS-2022-35(5)-14

Content is available under the Creative Commons Attribution 4.0 License.

ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)
