Mario Vuksan and Tomislav Pericin, ReversingLabs

File analysis and unpacking in the age of 40M new samples per year

With daily unique malware counts exceeding 100,000, sample analysis and automated unpacking systems are under growing pressure. The 400+ known packer families, along with custom packers, can be combined in layers and in parallel. Today's systems must handle all known format schemas, both statically and dynamically, while anticipating further increases in complexity.
We will discuss the creation of a complex file identity model that maps out the layers of the entire binary object. This makes it possible to apply the correct unpacking and analysis model to each identified segment. Segmentation covers all aspects of the binary object, including multiple packing layers, resources, sections and the overlay. Identification methods will cover traditional file identification, with special attention to techniques used to fool detection tools, as well as generic detection methods. We will describe the creation and performance of a complex system that identifies and unpacks large quantities of files, and contrast it with methods in use today. Static, dynamic and generic file unpacking models will be described, showing their benefits and flaws in all viable blacklisting and whitelisting scenarios. Finally, the use of these binary content processors on each identified segment will be examined for performance and scalability.
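
To make the idea of object segmenting concrete, here is a minimal Python sketch that splits a Windows PE file into a header region, its sections and a trailing overlay. It is an illustration only and not the system described in the talk: the names `Segment` and `segment_pe` are hypothetical, the header region is approximated as everything up to the end of the section table, and resources and nested packing layers are not handled.

```python
import struct
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One contiguous region of the file (hypothetical type for this sketch)."""
    kind: str    # "headers", "section" or "overlay"
    name: str    # section name, empty for other kinds
    offset: int  # file offset of the region
    size: int    # size of the region in bytes


def segment_pe(data: bytes) -> List[Segment]:
    """Split a PE image into headers, sections and overlay by file offset."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # COFF header follows the signature: NumberOfSections is at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)
    table = pe_off + 24 + opt_size  # start of the section table

    # Assumption: treat everything up to the end of the section table as headers.
    headers_end = table + 40 * num_sections
    segments = [Segment("headers", "", 0, headers_end)]
    end_of_image = headers_end

    for i in range(num_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = data[entry:entry + 8].rstrip(b"\0").decode("latin-1")
        # SizeOfRawData and PointerToRawData sit at +16 and +20.
        raw_size, raw_off = struct.unpack_from("<II", data, entry + 16)
        if raw_size == 0:
            continue  # section has no bytes on disk
        segments.append(Segment("section", name, raw_off, raw_size))
        end_of_image = max(end_of_image, raw_off + raw_size)

    # Anything past the last section's raw data is the overlay.
    if end_of_image < len(data):
        segments.append(
            Segment("overlay", "", end_of_image, len(data) - end_of_image)
        )
    return segments
```

Each returned segment could then be handed to whichever identification or unpacking routine suits its kind; an overlay, for example, often carries an appended archive or installer payload and deserves its own identification pass.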