
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)


Here We Go Again: Modern GEC Models Need Help with Spelling

https://doi.org/10.15514/ISPRAS-2022-35(5)-14

Abstract

The study focuses on how modern GEC systems handle character-level errors. We discuss how these errors affect model performance and test how models of different architectures handle them. We conclude that specialized GEC systems do struggle to correct non-existent words, and that a simple spellchecker considerably improves a model's overall performance. To evaluate this, we assess the models on several datasets. In addition to the CoNLL-2014 validation dataset, we contribute a synthetic dataset with a higher density of character-level errors and conclude that, given that models generally show very high scores, validation datasets with a higher density of tricky errors are a useful tool for comparing models. Lastly, we notice cases of incorrect treatment of non-existent words in the experts' annotation and contribute a cleaned version of this dataset. In contrast to specialized GEC systems, the LLaMA model used for the GEC task handles character-level errors well. We suggest that this better performance is explained by the fact that Alpaca is not extensively trained on annotated texts with errors, but instead receives grammatically and orthographically correct texts as input.
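To illustrate the idea discussed in the abstract, below is a minimal sketch (not the authors' implementation) of a spellchecker pre-pass: non-existent words are replaced by a simple dictionary spellchecker before the text is handed to a GEC model. It assumes the pyspellchecker package; run_gec_model is a hypothetical stand-in for any GEC system (e.g. a GECToR or BART-based corrector).

```python
# Sketch only: spellchecker pre-pass before a GEC model, assuming pyspellchecker.
import re
from spellchecker import SpellChecker

spell = SpellChecker(language="en")

def spellcheck_pass(sentence: str) -> str:
    """Replace out-of-vocabulary (non-existent) words with the spellchecker's best guess."""
    tokens = re.findall(r"\w+|\W+", sentence)  # keep punctuation and spacing intact
    fixed = []
    for tok in tokens:
        if tok.isalpha() and tok.lower() in spell.unknown([tok.lower()]):
            suggestion = spell.correction(tok.lower())
            if suggestion:  # keep the original token if no suggestion is found
                tok = suggestion
        fixed.append(tok)
    return "".join(fixed)

def correct(sentence: str, run_gec_model) -> str:
    """Fix character-level errors first, then let the GEC model handle grammar."""
    return run_gec_model(spellcheck_pass(sentence))

# Usage (run_gec_model is any callable wrapping a GEC system):
# correct("He go to the scholl yesterday .", run_gec_model=my_gec_wrapper)
# the spellchecker repairs "scholl", the GEC model is left with the grammatical error.
```

The design choice mirrors the paper's observation: the spellchecker only touches words absent from the dictionary, so grammatical corrections remain entirely the GEC model's responsibility.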

About the Authors

Vladimir Mironovitch STARCHENKO
HSE University
Russian Federation




Alexei Mironovitch STARCHENKO
HSE University
Russian Federation




References

1. Qorib M. R., Ng H. T. Grammatical error correction: Are we there yet? In Proceedings of the 29th International Conference on Computational Linguistics. Gyeongju, Republic of Korea: International Committee on Computational Linguistics, 2022, pp. 2794–2800.

2. Leacock C., Chodorow M., Gamon M., Tetreault J. Automated Grammatical Error Detection for Language Learners. Morgan & Claypool Publishers, 2014. 154 p.

3. Wang Y., Wang Y., Dang K., Liu J., and Liu Z. A comprehensive survey of grammatical error correction. ACM Trans. Intell. Syst. Technol., 12(5), 2021, pp. 1–51. doi: 10.1145/3474840.

4. Bryant C., Yuan Z., Qorib M. R., Cao H., Ng H. T., Briscoe T. Grammatical Error Correction: A Survey of the State of the Art. Computational Linguistics, 49 (3), 2023, pp. 643–701. doi: 10.1162/coli_a_00478.

5. Susanto R. H., Phandi P., Ng H. T. System combination for grammatical error correction. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar: Association for Computational Linguistics, 2014, pp. 951–962. doi: 10.3115/v1/D14-1102.

6. Rozovskaya A., Roth D. Grammatical error correction: Machine translation and classifiers. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Berlin, Germany: Association for Computational Linguistics, 2016, pp. 2205–2215. doi: 10.18653/v1/P16-1208.

7. Chollampatt S., Wang W., Ng H. T. Cross-sentence grammatical error correction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, 2019, pp. 435–445. doi: 10.18653/v1/P19-1042.

8. Gotou T., Nagata R., Mita M., Hanawa K. Taking the correction difficulty into account in grammatical error correction evaluation. In Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain (Online): International Committee on Computational Linguistics, 2020, pp. 2085–2095. doi: 10.18653/v1/2020.coling-main.188.

9. Omelianchuk K., Atrasevych V., Chernodub A., Skurzhanskyi O. GECToR – grammatical error correction: Tag, not rewrite. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications. Seattle, WA, USA (Online): Association for Computational Linguistics, 2020, pp. 163–170. doi: 10.18653/v1/2020.bea-1.16.

10. Cargill T. The design of a spelling checker’s user interface. ACM SIGOA Newsletter, 1(3), 1980, pp. 3–4.

11. Bentley J. Programming pearls: A spelling checker. Communications of the ACM, 28(5), 1985, pp. 456–462.

12. Chollampatt S., Ng H. T. Connecting the dots: Towards human-level grammatical error correction. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications. Copenhagen, Denmark: Association for Computational Linguistics, 2017, pp. 327–333. doi: 10.18653/v1/W17-5037.

13. Ge T., Wei F., Zhou M. Fluency boost learning and inference for neural grammatical error correction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Melbourne, Australia: Association for Computational Linguistics, 2018, pp. 1055–1065. doi: 10.18653/v1/P18-1097.

14. Sakaguchi K., Post M., Van Durme B. Grammatical error correction with neural reinforcement learning. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Taipei, Taiwan: Asian Federation of Natural Language Processing, 2017, pp. 366–372.

15. Katsumata S., Komachi M. Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, Suzhou, China: Association for Computational Linguistics, 2020, pp. 827–832.

16. Rothe S., Mallinson J., Malmi E., Krause S., Severyn A. A Simple Recipe for Multilingual Grammatical Error Correction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Online, Association for Computational Linguistics, 2021, pp. 702–707. doi: 10.18653/v1/2021.acl-short.89.

17. Touvron H., Lavril T., Izacard G., Martinet X., Lachaux M. A., Lacroix T., Rozière B., Goyal N., Hambro E., Azhar F., Rodriguez A., Joulin A., Grave E., Lample G. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023 (online). Available at: https://arxiv.org/abs/2302.13971v1, accessed 18.12.2023.

18. Wang Y., Kordi Y., Mishra S., Liu A., Smith N. A., Khashabi D., Hajishirzi H. Self-Instruct: Aligning language models with self-generated instructions. arXiv preprint arXiv:2212.10560, 2022 (online). Available at: https://arxiv.org/abs/2212.10560, accessed 18.12.2023.

19. Taori R., Gulrajani I., Zhang T., Dubois Y., Li X., Guestrin C., Liang P., Hashimoto T. B. Alpaca: A Strong, Replicable Instruction-Following Model. The Center for Research on Foundation Models of Stanford Institute for Human-Centered Artificial Intelligence. Available at: https://crfm.stanford.edu/2023/03/13/alpaca.html, accessed 18.12.2023.

20. Floridi L., Chiriatti M. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30, 2020, pp. 1–14.

21. Coyne S., Sakaguchi K., Galvan-Sosa D., Zock M., Inui K. Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction. arXiv preprint arXiv:2303.14342 (online). Available at: https://arxiv.org/abs/2303.14342, accessed 18.12.2023.

22. Östling R., Gillholm K., Kurfalı M., Mattson M., Wirén M. Evaluation of really good grammatical error correction. arXiv preprint arXiv:2308.08982, 2023 (online). Available at: https://arxiv.org/abs/2308.08982v1, accessed 18.12.2023.

23. Zhang Yu., Zhang B., Li Zh., Bao Z., Li Ch., Zhang M. SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, 2022, pp. 2518–2531.

24. Zhou H., Liu Y., Li Z., Zhang M., Zhang B., Li C., Zhang J., Huang F. Improving Seq2Seq Grammatical Error Correction via Decoding Interventions. arXiv preprint arXiv:2310.14534, 2023 (online). Available at: https://arxiv.org/abs/2310.14534, accessed 18.12.2023.

25. Ng H. T., Wu S. M., Briscoe T., Hadiwinoto C., Susanto R. H., Bryant C. The CoNLL-2014 shared task on grammatical error correction. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task. Baltimore, Maryland: Association for Computational Linguistics, 2014, pp. 1–14. doi: 10.3115/v1/W14-1701.

26. Bryant C., Ng H. T. How far are we from fully automatic high quality grammatical error correction? In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Beijing, China: Association for Computational Linguistics, 2015, pp. 697–707. doi: 10.3115/v1/P15-1068.

27. Grundkiewicz R., Junczys-Dowmunt M., Gillian E. Human evaluation of grammatical error correction systems. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal: Association for Computational Linguistics, 2015, pp. 461–470. doi: 10.18653/v1/D15-1052.

28. Napoles C., Sakaguchi K., Post M., Tetreault J. Ground truth for grammatical error correction metrics. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Beijing, China: Association for Computational Linguistics, 2015, pp. 588–593. doi: 10.3115/v1/P15-2097.

29. Chollampatt S., Ng H. T. A reassessment of reference-based grammatical error correction metrics. In Proceedings of the 27th International Conference on Computational Linguistics. Santa Fe, New Mexico, USA: Association for Computational Linguistics, 2018, pp. 2730–2741.

30. Mizumoto T., Hayashibe Y., Komachi M., Nagata M., Matsumoto Y. The effect of learner corpus size in grammatical error correction of ESL writings. In Proceedings of COLING 2012: Posters, Kay M. and Boitet C., Eds. Mumbai, India: The COLING 2012 Organizing Committee, 2012, pp. 863–872.

31. Tajiri T., Komachi M., Matsumoto Y. Tense and aspect error correction for ESL learners using global context. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Li H., Lin C.-Y., Osborne M., Lee G. G., and Park J. C., Eds. Jeju Island, Korea: Association for Computational Linguistics, 2012, pp. 198–202.

32. Yannakoudakis H., Briscoe T., Medlock B. A new dataset and method for automatically grading ESOL texts. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Lin D., Matsumoto Y., Mihalcea R., Eds. Portland, Oregon, USA: Association for Computational Linguistics, 2011, pp. 180–189.

33. Coltheart M., Rastle K., Perry C., Langdon R., Ziegler J. DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108(1), 2001, pp. 204–256. doi: 10.1037/0033-295X.108.1.204.

34. Castles A., Rastle K., Nation K. Ending the reading wars: Reading acquisition from novice to expert. Psychological Science in the Public Interest, 19(1), 2018, pp. 5–51. PMID: 29890888. doi: 10.1177/1529100618772271.

35. Lichtarge J., Alberti C., Kumar S., Shazeer N., Parmar N., Tong S. Corpora generation for grammatical error correction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Burstein J., Doran C., Solorio T., Eds. Minneapolis, Minnesota: Association for Computational Linguistics, 2019, pp. 3291–3301. Available at: https://aclanthology.org/N19-1333.

36. Näther M. An in-depth comparison of 14 spelling correction tools on a common benchmark. In Proceedings of the Twelfth Language Resources and Evaluation Conference, Calzolari N., Béchet F., Blache P., Choukri K., Cieri C., Declerck T., Goggi S., Isahara H., Maegaard B., Mariani J., Mazo H., Moreno A., Odijk J., Piperidis S., Eds. Marseille, France: European Language Resources Association, 2020, pp. 1849–1857.

37. Napoles C., Sakaguchi K., Tetreault J., JFLEG: A fluency corpus and benchmark for grammatical error correction. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Lapata M., Blunsom P., Koller A., Eds. Valencia, Spain: Association for Computational Linguistics, 2017, pp. 229–234.

38. Qiu Z., Qu Y. A two-stage model for Chinese grammatical error correction. IEEE Access, 7, 2019, pp. 146772–146777.

39. Hinson C., Huang H.-H., Chen H.-H. Heterogeneous recycle generation for Chinese grammatical error correction. In Proceedings of the 28th International Conference on Computational Linguistics, Scott D., Bel N., Zong C., Eds. Barcelona, Spain (Online): International Committee on Computational Linguistics, 2020, pp. 2191–2201. doi: 10.18653/v1/2020.coling-main.199.



For citations:


STARCHENKO V.M., STARCHENKO A.M. Here We Go Again: Modern GEC Models Need Help with Spelling. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(5):215-228. https://doi.org/10.15514/ISPRAS-2022-35(5)-14



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)