Down to the Bare Metal: Using Processor Features for Binary Analysis

Carsten Willems, Ralf Hund, Dennis Felsch, Andreas Fobian, Thorsten Holz

TR-HGI-2012-001, Ruhr-Universität Bochum, Horst Görtz Institut für IT-Sicherheit (HGI), November 2012


A detailed understanding of the behavior of exploits and malicious software is necessary to obtain a comprehensive overview of vulnerabilities in operating systems or client applications, and to develop protection techniques and tools. To this end, a lot of research has been done in the last few years on binary analysis techniques to efficiently and precisely analyze code. Most of the common analysis frameworks are based on software emulators, since such tools offer fine-grained control over the execution of a given program. Naturally, this leads to an arms race in which attackers are constantly searching for new methods and techniques to detect such analysis frameworks in order to successfully evade analysis. In this paper, we focus on two aspects. As a first contribution, we introduce several novel mechanisms by which an attacker can delude an emulator. In contrast to existing detection approaches that perform a dedicated test on the environment and combine the test with an explicit conditional branch, our detection mechanisms introduce code sequences that implicitly behave differently on a native machine than on an emulator. Such differences in behavior are caused by the side effects of the particular operations and by imperfections in the emulation process that cannot be mitigated easily. Even powerful analysis techniques such as multi-path execution cannot analyze our detection mechanisms, since the emulator itself is deluded. Motivated by these findings, we introduce a novel approach to generate execution traces. We propose to utilize the processor itself to generate such traces. More precisely, we propose to use a hardware feature called branch tracing, available on commodity x86 processors, in which the log of all branches taken during code execution is generated directly by the processor. Effectively, the logging is thus performed at the lowest level possible. We present implementation details for both Intel and AMD x86 CPUs and evaluate the practical viability and effectiveness of this approach.


tags: analysis, Malware