Could an LLM Like chatGPT Perform a Functional Size Measurement using the COSMIC Method?

Francisco VALDÉS-SOUTO; Daniel TORRES-ROBLEDO

doi:10.15514/ISPRAS-2024-36(6)-6

Abstract

Software development is an intricate and time-consuming process, and resource estimation is one of its most important tasks. Because it is currently the only accepted metric, the functional size of the software is widely used to build estimation models. However, functional size measurement itself takes time. In recent years, the use of artificial intelligence (AI) to automate software development tasks has gained popularity, and software functional sizing and estimation is one area where AI may be applied. In this study, we investigate how the concepts and rules of the COSMIC method can be applied to measurement using ChatGPT 4o, a large language model (LLM). We examined whether ChatGPT can perform COSMIC measurements and found that it could not reliably produce accurate results. Its primary shortcomings are an inability to accurately extract data movements, data groups, and functional users from the text. As a result, ChatGPT's measurements fail to meet two essential requirements of measurement: accuracy and reproducibility.

Keywords

COSMIC, CFP, functional size measurement, LLM, chatGPT, software engineering, AI, automatization.

About the Authors

Francisco VALDÉS-SOUTO

National Autonomous University of Mexico, Science Faculty, CDMX
Mexico

Holds a PhD in Software Engineering, with a specialty in Software Measurement and Estimation, from the École de Technologie Supérieure (ETS) in Canada, and two master's degrees, obtained in Mexico and France. President of COSMIC. Associate Professor at the Faculty of Sciences of the National Autonomous University of Mexico (UNAM). Founder of the Mexican Association of Software Metrics (AMMS). Has more than 25 years of experience in critical software development and more than 50 publications, including articles in indexed journals, proceedings, books, and book chapters. He is the main promoter of formal software metrics in Mexico, promoting COSMIC (ISO/IEC 19761) as a national standard. Member of the National System of Researchers (SNI). Research interests: software measurement and estimation applied to software project management, scope management, productivity, and economics in software projects.

Daniel TORRES-ROBLEDO

National Autonomous University of Mexico, Research Institute in Applied Mathematics and Systems, CDMX
Mexico

Master's student at the Research Institute in Applied Mathematics and Systems; holds a degree in Computer Science from the Faculty of Sciences of UNAM.

For citations:

VALDÉS-SOUTO F., TORRES-ROBLEDO D. Could an LLM Like chatGPT Perform a Functional Size Measurement using the COSMIC Method? Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2024;36(6):103-114. https://doi.org/10.15514/ISPRAS-2024-36(6)-6

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)
