Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

PereFlex: A Tool for Automated Evaluation of Error Recovery in Parsers

https://doi.org/10.15514/ISPRAS-2025-37(6)-40

Abstract

Error recovery is a critical component of parsing technology, particularly in applications such as IDEs and compilers, where a single syntax error should not prevent further analysis of the input. This paper presents PereFlex, a tool for extensive experimental evaluation of error recovery in JVM-based parsers. Our evaluation is based on real-world parsers for Java and on erroneous programs written by real users. The results demonstrate that while some recovery strategies are fast, they often fail to provide meaningful recovery, whereas advanced methods offer better recovery quality at the cost of increased computational overhead.

About the Authors

Olga Igorevna BACHISHCHE
ITMO University
Russian Federation

Postgraduate student at the Institute of Applied Computer Technologies, ITMO University. Research interests: static program analysis.



Yaroslav Stanislavovich VOROBIEV
HSE University
Russian Federation

Master at HSE University. Research interests: data analysis.



Grigoriy Romanovich RAYKIN
ITMO University
Russian Federation

Postgraduate student at the ITMO Institute of Computer Science. Research interests: static and dynamic software analysis, fuzzing, formal software specification.



Darya Vladimirovna VASINA
ITMO University
Russian Federation

Lead Engineer at the St. Petersburg Cloud Software Development Tools Laboratory of Huawei. She graduated from the Faculty of Computer Technologies and Control at ITMO University, majoring in Computer Science and Engineering. She specializes in creating software development tools that integrate artificial intelligence technologies.



Daniil Sergeyevich SHUSHAKOV
ITMO University
Russian Federation

Master at ITMO University. Specializes in the development of integrated development environments and static code analysis.



Semyon Vyacheslavovich GRIGORIEV
St. Petersburg State University
Russian Federation

Cand. Sci. (Phys.-Math.), an associate professor in the Department of Software Engineering at St. Petersburg State University. His research interests include static program analysis, parsing algorithms and tools, and high-performance graph analysis.



For citations:


BACHISHCHE O.I., VOROBIEV Ya.S., RAYKIN G.R., VASINA D.V., SHUSHAKOV D.S., GRIGORIEV S.V. PereFlex: A Tool for Automated Evaluation of Error Recovery in Parsers. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(6):121-132. https://doi.org/10.15514/ISPRAS-2025-37(6)-40



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)