Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Framework for Machine Instruction Usage Analysis

https://doi.org/10.15514/ISPRAS-2023-35(3)-12

Abstract

Migrating software to a new hardware architecture, including developing optimizing compilers for the new platform, requires statistical analysis of how often individual machine instructions, or groups of instructions, occur in the machine code of programs. This paper describes a new, extensible framework for statistical research on machine opcodes, together with a dataset that other researchers can use. The framework automatically collects data across different GNU/Linux distributions and architectures and provides facilities for statistical analysis of that data.

About the Authors

Danila Evgenevich PECHENEV
St. Petersburg State University
Russian Federation

Student and researcher at St. Petersburg State University



Iakov Aleksandrovich KIRILENKO
St. Petersburg State University
Russian Federation

Head of the Infrastructure Solutions Programming Technologies Laboratory at St. Petersburg State University



Olga Andreevna AFONINA
St. Petersburg State University
Russian Federation

Student and researcher at St. Petersburg State University




For citations:


PECHENEV D.E., KIRILENKO I.A., AFONINA O.A. Framework for Machine Instruction Usage Analysis. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2023;35(3):163-170. https://doi.org/10.15514/ISPRAS-2023-35(3)-12



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)