[Paper of the day][#18] How do you triage #malware? How do you tell whether two files are similar? One interesting static analysis approach is to use #similarity #hashing tools such as #ssdeep and #sdhash. To be effective, however, these tools can't be applied naively; their use should follow a protocol. In this paper, we discuss how to apply these functions efficiently for malware family classification. We show that hashing only the instruction disassembly has a greater impact on classification than hashing the entire file. Check out this result and much more.
Academic paper: https://www.sciencedirect.com/science/article/pii/S2666281721001281
Archived version: https://secret.inf.ufpr.br/papers/marcus_similarity_hashing.pdf
#malware #similarity #hashing #ssdeep #sdhash