So, I was wrong and that algorithm doesn't work for this case. Well... It is always the same with #bindiffing: some algorithms look like the obvious solution (like bipartite graph matching) and then it turns out they do not really work.
So, continuing my rant about academic research in the #bindiffing area not releasing the required material: one paper says that 2 malware samples aren't properly diffed by either #Diaphora or #BinDiff, so I tried to find the samples to do the diffing myself and see why, if at all, they fail. There is no dataset and there are no sample hashes anywhere, only a set of assembly instructions for a specific basic block... #Fail
#bindiffing #Diaphora #bindiff #fail
If I were to decide whether to continue developing #Diaphora based on what academic research in the #BinDiffing area says, I should stop, because in almost every academic paper I read the authors consider the problem already solved, and they even claim to improve on previous papers.