Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Software project estimation using smooth curve methods and variable selection and regularization methods using a wedge-shape form database

https://doi.org/10.15514/ISPRAS-2023-35(1)-9

Abstract

Context: Accurate estimation affects planning, budgeting, and control, which makes estimation activities essential to software project success. Several estimation techniques have been developed during the last seven decades. Traditional regression-based estimation is the method most often used in the literature. Building a model requires a reference database, which is usually a wedge-shaped dataset when real projects are considered. Regression-based estimation techniques provide low accuracy with this type of database.

Objective: Evaluate and provide an alternative to the general practice of using regression-based models, examining whether smooth curve methods and variable selection and regularization methods yield more reliable estimates on wedge-shaped databases.

Method: A previous study used a wedge-shaped reference database to build a regression-based estimation model. This paper uses smooth curve methods and variable selection and regularization methods to build estimation models, providing an alternative to linear regression models.

Results: The results show an improvement in estimation when smooth curve methods and variable selection and regularization methods are used instead of regression-based models on wedge-shaped databases. For example, the generalized additive model (GAM) with all the variables gives an R-squared of 0.6864 for Effort and 0.7581 for Cost, and an MMRE of 0.1095 for Effort and 0.0578 for Cost. The GAM with LASSO gives an R-squared of 0.6836 for Effort and 0.7519 for Cost, and an MMRE of 0.1105 for Effort and 0.0585 for Cost. By comparison, multiple linear regression (MLR) gives an R-squared of 0.6790 for Effort and 0.7540 for Cost, and an MMRE of 0.1107 for Effort and 0.0582 for Cost.
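The two accuracy measures reported above can be illustrated with a minimal sketch. It assumes the standard textbook definitions: MMRE as the mean of |actual - predicted| / actual, and R-squared as 1 - SS_res / SS_tot. The abstract does not state the exact formulas or data scale used in the paper, so the function names and the three sample projects below are hypothetical and serve only to show the arithmetic.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error, assuming the usual definition:
    mean(|actual_i - predicted_i| / actual_i)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical effort values (person-hours) for three projects;
# these are illustrative numbers, not data from the paper.
actual_effort = [100.0, 200.0, 400.0]
predicted_effort = [110.0, 180.0, 400.0]

print(f"MMRE      = {mmre(actual_effort, predicted_effort):.4f}")
print(f"R-squared = {r_squared(actual_effort, predicted_effort):.4f}")
```

A lower MMRE and a higher R-squared both indicate a closer fit, which is the sense in which the abstract compares the GAM, GAM with LASSO, and MLR figures.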

About the Authors

Francisco VALDÉS-SOUTO
Universidad Nacional Autónoma de México
Mexico

Ph.D. in Software Engineering, Associate Professor



Lizbeth NARANJO-ALBARRÁN
Universidad Nacional Autónoma de México
Mexico

Ph.D. in Mathematics, Professor





For citations:


VALDÉS-SOUTO F., NARANJO-ALBARRÁN L. Software project estimation using smooth curve methods and variable selection and regularization methods using a wedge-shape form database. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(1):123-140. https://doi.org/10.15514/ISPRAS-2023-35(1)-9



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)