Using Value Set Analysis for Classification of Metamorphic Malware Samples

Metamorphic malware changes its structure from infection to infection. Different variants often have only sequences of very few bytes in common. While the appearance changes completely, the core functionality has to stay the same. This functionality is reflected in the data used at specific points inside the malware. We have found that specific data values, such as constants, flags, or computation results, are suitable as characteristics of metamorphic malware. Simple examples of such characteristic values are the constants used when creating network sockets or opening files. Others are loop counters and encoding or encryption keys. We have developed a value set analysis that tracks the possible values along the dataflow graph and through computations. The resulting over-approximation of specific value sets is suitable for characterizing metamorphic malware families. We have tested different schemes, both for finding characteristic value sets and for matching them to identify infected files, using a multidimensional sensitivity analysis. We have previously presented results of 100% detection with 0 false positives. At CARO, we would like to show that our approach is also suitable for classifying metamorphic malware, and to present performance data with respect to computation time. While the earlier results were based on a small set of only 50 files per family, more than 4000 files were used to evaluate whether the characteristic sets are unique to each family. Again, a perfect clustering of all considered families was observed: all members of each family were identified, while members of other families lacked that family's characteristic value sets.
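
The idea of characteristic value sets can be illustrated with a minimal sketch. It is not the implementation evaluated here. It assumes that the value set analysis has already produced, for every sample, one over-approximated set of integer values, and it uses a deliberately simple scheme: a family's characteristic set is the set of values shared by all of its samples and absent from every sample of the other families. The names `build_signatures` and `classify`, the family labels, and all values in the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional

# Assumption: each sample is summarized by one over-approximated set of
# integer values (constants, flags, keys) found by a value set analysis.
ValueSet = FrozenSet[int]


def build_signatures(families: Dict[str, List[ValueSet]]) -> Dict[str, ValueSet]:
    """Derive one characteristic value set per family.

    A value is characteristic if it occurs in every sample of the family
    and in no sample of any other family (a simplifying assumption).
    """
    signatures: Dict[str, ValueSet] = {}
    for name, samples in families.items():
        if not samples:
            signatures[name] = frozenset()
            continue
        # Values common to all samples of this family.
        common = set(samples[0])
        for sample in samples[1:]:
            common &= sample
        # Drop values that also show up in other families.
        for other_name, other_samples in families.items():
            if other_name == name:
                continue
            for sample in other_samples:
                common -= sample
        signatures[name] = frozenset(common)
    return signatures


def classify(sample: ValueSet, signatures: Dict[str, ValueSet]) -> Optional[str]:
    """Return the single family whose characteristic set is contained in
    the sample's value set, or None if no family or several match."""
    matches = [name for name, sig in signatures.items() if sig and sig <= sample]
    return matches[0] if len(matches) == 1 else None


# Hypothetical toy data: 1 and 2 stand in for generic API constants shared
# by both families, 0x5A and 0xC3 for family-specific encoding keys.
families = {
    "A": [frozenset({1, 2, 0x5A, 100}), frozenset({1, 2, 0x5A, 7})],
    "B": [frozenset({1, 2, 0xC3, 16}), frozenset({1, 2, 0xC3, 9})],
}
signatures = build_signatures(families)

unknown = frozenset({1, 2, 0x5A, 42})
family = classify(unknown, signatures)  # "A": only the signature {0x5A} is contained
```

Because the value sets are over-approximations, containment of the characteristic set is a natural matching test: extra values in a sample do not prevent a match, while a missing characteristic value rules the family out.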