
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)


Methods and software tools for combined binary code analysis

https://doi.org/10.15514/ISPRAS-2014-26(1)-8

Abstract

This paper presents methods and tools for binary code analysis developed at ISP RAS and their application to algorithm and data format recovery. The subject of analysis is executable code for various general-purpose CPU architectures. The analysis requires no source code or debug information and places no requirements on the OS version. The approach consists of collecting a detailed execution trace at the machine instruction level, successively raising the level of representation, and extracting the code that belongs to the algorithm, followed by structuring both that code and the data formats it processes. Two main results are reported. First, an intermediate representation has been developed that allows most preliminary processing and algorithm code extraction tasks to be carried out without regard to the specifics of a given machine. Second, a method and software tool have been developed for automated recovery of network message and file formats. The tools have been incorporated into a unified analysis platform that supports their combined use. The paper also describes the architecture of the platform and gives examples of its application to real programs.
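
The idea of extracting the code that belongs to an algorithm from an execution trace can be illustrated with a minimal sketch of backward dynamic slicing. This is not the ISP RAS tool or its intermediate representation: the `Step` record, the location names, and the five-instruction trace are all hypothetical, and the sketch follows data dependencies only, ignoring control dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One executed instruction in a hypothetical machine-independent trace."""
    text: str          # disassembly, kept for display only
    reads: frozenset   # locations (registers or memory cells) the step reads
    writes: frozenset  # locations the step writes


def backward_slice(trace, start, wanted):
    """Return indices of the steps that the values in `wanted` depend on.

    Walks the trace backwards from index `start`, tracking the locations
    whose defining step has not been found yet (data dependencies only).
    """
    live = set(wanted)
    picked = []
    for i in range(start, -1, -1):
        step = trace[i]
        if live & step.writes:
            picked.append(i)
            live -= step.writes  # these locations are now explained
            live |= step.reads   # their inputs become relevant instead
    return sorted(picked)


# Hypothetical trace: only the eax chain contributes to the output buffer.
trace = [
    Step("mov eax, [buf]", frozenset({"mem:buf"}), frozenset({"eax"})),
    Step("mov ebx, 7",     frozenset(),            frozenset({"ebx"})),
    Step("xor eax, 0x5a",  frozenset({"eax"}),     frozenset({"eax"})),
    Step("add ebx, 1",     frozenset({"ebx"}),     frozenset({"ebx"})),
    Step("mov [out], eax", frozenset({"eax"}),     frozenset({"mem:out"})),
]

slice_indices = backward_slice(trace, len(trace) - 1, {"mem:out"})
for i in slice_indices:
    print(i, trace[i].text)
```

Because each step is described only by the locations it reads and writes, the same loop works regardless of the CPU architecture that produced the trace, which is the kind of machine independence the abstract attributes to the intermediate representation.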

About the Authors

V. A. Padaryan
Institute for System Programming of RAS
Russian Federation


A. I. Getman
Institute for System Programming of RAS
Russian Federation


M. A. Solovyev
Institute for System Programming of RAS
Russian Federation


M. G. Bakulin
Institute for System Programming of RAS
Russian Federation


A. I. Borzilov
Institute for System Programming of RAS
Russian Federation


V. V. Kaushan
Institute for System Programming of RAS
Russian Federation


I. N. Ledovskich
Institute for System Programming of RAS
Russian Federation


U. V. Markin
Institute for System Programming of RAS
Russian Federation


S. S. Panasenko
Institute for System Programming of RAS
Russian Federation





For citations:


Padaryan V.A., Getman A.I., Solovyev M.A., Bakulin M.G., Borzilov A.I., Kaushan V.V., Ledovskich I.N., Markin U.V., Panasenko S.S. Methods and software tools for combined binary code analysis. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2014;26(1):251-276. (In Russ.) https://doi.org/10.15514/ISPRAS-2014-26(1)-8



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)