Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Recovery of binary data structures from program traces

Abstract

This paper considers the problem of recovering binary data formats and describes the format recovery system implemented at ISP RAS. First, we enumerate the general approaches to this problem (static analysis, dynamic analysis, and network trace analysis) together with their advantages and limitations. We also describe the fundamental limitation of dynamic analysis, incomplete code coverage, and several methods that partly compensate for it in this particular problem. Second, we discuss data sources and the specifics of analyzing such objects as files, network packets of different levels and of different kinds of protocols (stateful and stateless), and incoming and outgoing messages. We also discuss the problem of protocol analysis, specifically the recovery of the protocol state machine. Third, we describe our function specification facility, which lets us define models of functions and their parameters and makes format recovery more accurate by taking into account the user's knowledge of a specific software environment. We also present the general scheme of our approach and test results for the implemented system. Finally, we discuss future research directions: encrypted traffic analysis and several possible applications of the recovery results.

About the Authors

A. I. Avetisyan
ISP RAS
Russian Federation


A. I. Getman
ISP RAS
Russian Federation



For citations:


Avetisyan A.I., Getman A.I. Recovery of binary data structures from program traces. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2012;22. (In Russ.)



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)