Using Large Language Models for Table Header Recognition

Ilia Igorevich OKHOTIN; Nikita Olegovych DORODNYKH

doi:10.15514/ISPRAS-2025-37(6)-9

Abstract

Automatic table header recognition remains a challenging task due to the diversity of table layouts, including multi-level headers, merged cells, and non-standard formatting. This paper is the first to propose a methodology for evaluating the performance of large language models on this task using prompt engineering. The study covers eight models and six prompt strategies in zero-shot and few-shot settings, on a dataset of 237 tables. The results demonstrate that model size critically affects accuracy: large models (405 billion parameters) achieve F1 ≈ 0.80–0.85, while small ones (7 billion parameters) show F1 ≈ 0.06–0.30. Making prompts more complex with step-by-step instructions, search criteria, and examples improves the results only for large models; for small ones it degrades them because of context overload. The largest errors occur on tables with hierarchical headers and merged cells, where even large models lose recognition accuracy. The practical significance of this paper lies in identifying optimal prompt configurations for different types of models. For example, short instructions are effective for large models, and step-by-step instructions with search criteria are effective for medium ones. This study opens up new possibilities for creating universal tools for automatic analysis of table headers.
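
As a rough illustration of the evaluation setup described above, the following Python sketch builds a zero-shot or few-shot prompt and scores a prediction with F1. It is not the authors' implementation: the abstract gives neither the prompt wording nor the level at which F1 is computed, so the instruction text, the `(row, column)` cell representation, and the function names `build_prompt` and `header_f1` are all assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical representation: a header cell is identified by (row, column).
Cell = Tuple[int, int]


def build_prompt(table_text: str,
                 examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Assemble a prompt; with no examples it is zero-shot, otherwise few-shot.

    The instruction wording is illustrative, not taken from the paper.
    """
    parts = ["Identify the header cells of the table below. "
             "Answer with a list of (row, column) pairs."]
    for example_table, example_answer in examples:
        parts.append(f"Example table:\n{example_table}\n"
                     f"Header cells: {example_answer}")
    parts.append(f"Table:\n{table_text}\nHeader cells:")
    return "\n\n".join(parts)


def header_f1(predicted: Iterable[Cell], gold: Iterable[Cell]) -> float:
    """Cell-level F1 between predicted and gold header cells (assumed metric)."""
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set and not gold_set:
        return 1.0  # nothing to find and nothing predicted
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

With this sketch, a model that returns three cells of which two match a three-cell gold header gets precision and recall of 2/3 each, and therefore F1 = 2/3.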

Keywords

table, table headers, table structure recognition, header recognition, large language model, prompt engineering.

About the Authors

Ilia Igorevich OKHOTIN

Matrosov Institute for System Dynamics and Control Theory of the Siberian Branch of Russian Academy of Sciences (ISDCT SB RAS)
Russian Federation

A master's student at the Institute of Mathematics and Information Technology of Irkutsk State University (IMIT ISU) since 2024. Research interests: large language models, tabular data processing, table structure recognition, table header processing.

Nikita Olegovych DORODNYKH

Matrosov Institute for System Dynamics and Control Theory of the Siberian Branch of Russian Academy of Sciences (ISDCT SB RAS)
Russian Federation

Cand. Sci. (Tech.), senior associate researcher at the Matrosov Institute for System Dynamics and Control Theory of SB RAS (ISDCT SB RAS) since 2021. Research interests: computer-aided development of intelligent systems and knowledge bases, knowledge acquisition based on the transformation of conceptual models and tables.

For citations:

OKHOTIN I.I., DORODNYKH N.O. Using Large Language Models for Table Header Recognition. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(6):149-166. (In Russ.) https://doi.org/10.15514/ISPRAS-2025-37(6)-9

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)
