Discovering Near Duplicate Text in Software Documentation
https://doi.org/10.15514/ISPRAS-2017-29(4)-21
About the authors
L. D. Kanteev
Russia
Yu. O. Kostyukov
Russia
D. V. Luciv
Russia
D. V. Koznov
Russia
M. N. Smirnov
Russia
For citation:
Kanteev L.D., Kostyukov Yu.O., Luciv D.V., Koznov D.V., Smirnov M.N. Discovering Near Duplicate Text in Software Documentation. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2017;29(4):303-314. (In Russian.) https://doi.org/10.15514/ISPRAS-2017-29(4)-21