
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)


A Preliminary Analysis of Prompt Engineering in Large Language Models for Code Generation

https://doi.org/10.15514/ISPRAS-2025-37(6)-57

Abstract

Large language models (LLMs) have significantly advanced code generation tasks by enabling natural language-to-code translation. However, the effectiveness of these models depends heavily on prompt engineering, the practice of crafting input prompts that guide model behavior. While prior surveys have explored prompt engineering across general NLP applications, they provide limited insight into its role in code generation. In this survey, we examine 19 prompt engineering strategies specifically designed for code synthesis. We introduce a functional taxonomy dividing these strategies into simple and complex categories, and propose a penalty-based evaluation framework that quantifies the trade-off between model performance and resource consumption. Our analysis consolidates fragmented findings, identifies emerging patterns, and offers actionable guidance for practitioners aiming to optimize LLM-driven code generation. This work establishes a foundation for future research on adaptive and cost-efficient prompting methods in program synthesis.
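The paper's penalty-based framework is not detailed on this page; as a purely illustrative sketch, one simple form it could take is a quality score (e.g. pass@1) discounted by a normalized resource-consumption penalty, with a coefficient controlling the trade-off. All names and values below (`penalty_adjusted_score`, `lambda_`, the token budget) are hypothetical, not taken from the paper.

```python
def penalty_adjusted_score(quality: float, tokens_used: int,
                           token_budget: int, lambda_: float = 0.5) -> float:
    """Hypothetical penalty-adjusted score: quality minus a scaled,
    normalized resource cost, clamped to be non-negative."""
    cost = min(tokens_used / token_budget, 1.0)  # normalized consumption in [0, 1]
    return max(quality - lambda_ * cost, 0.0)

# Under such a scheme, a complex strategy (e.g. multi-agent prompting) can
# outscore a simple one on raw quality yet lose once resources are penalized:
simple = penalty_adjusted_score(quality=0.60, tokens_used=500, token_budget=4000)
complex_ = penalty_adjusted_score(quality=0.72, tokens_used=4000, token_budget=4000)
```

Here `simple` evaluates to 0.5375 while `complex_` evaluates to 0.22, illustrating how a resource penalty can reverse a raw-performance ranking.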

About the Authors

Yaroslav Olegovich YUDINSKIKH
Innopolis University
Russian Federation

Postgraduate student at Innopolis University. Research interests: large language models (LLM), code generation, multi-agent systems.



Vladimir Vladimirovich IVANOV
Innopolis University
Russian Federation

Cand. Sci. (Phys.-Math.), Associate Professor at the Center for Top-Level Educational Programs in Artificial Intelligence, Autonomous Non-Profit Organization of Higher Education "Innopolis University". Research interests: natural language processing (NLP), large language models (LLM), code generation.





For citations:


YUDINSKIKH Ya.O., IVANOV V.V. A Preliminary Analysis of Prompt Engineering in Large Language Models for Code Generation. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(6):175-186. https://doi.org/10.15514/ISPRAS-2025-37(6)-57



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)