A #hashfunction is a neat tool for rapid DNA lookups: break the DNA into small pieces, map each piece to an integer, put the integers in a table, and in theory you get near-O(1) lookup. Then you actually build one, the performance sucks, you look at the distribution of keys, and it turns out that DNA is decidedly non-random, so the bucket usage is really lumpy and there are lots of collisions.
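To make the lumpiness concrete, here is a minimal sketch of the "DNA -> integer -> table" idea. It assumes a 2-bit-per-base encoding and a plain modulo as the bucket function; the names `kmer_to_int` and `bucket_counts` and the toy sequence are made up for illustration, not taken from any real tool.

```python
from collections import Counter

# Assumed encoding: 2 bits per base (illustrative choice).
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_to_int(kmer):
    """Pack a DNA k-mer into an integer, 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | ENCODE[base]
    return value


def bucket_counts(sequence, k, num_buckets):
    """Count how many k-mers of `sequence` land in each bucket."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        key = kmer_to_int(sequence[i:i + k])
        counts[key % num_buckets] += 1  # naive bucket function: plain modulo
    return counts


# A repetitive toy sequence: most k-mers are ATAT or TATA,
# so a couple of buckets receive most of the keys.
seq = "ATATATATATATGCGCATATATAT"
for bucket, n in sorted(bucket_counts(seq, k=4, num_buckets=8).items()):
    print(bucket, "#" * n)
```

Repeats in the input become repeats in the keys, so no choice of bucket function can spread identical k-mers apart; what a better hash can fix is distinct but similar k-mers piling into the same bucket.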
@badtuple Urgh, yes, hash functions are such an interesting and delicate topic!
An upper bound on the collision probability can be calculated for so-called universal hash functions:
https://en.wikipedia.org/wiki/Universal_hashing
Not sure whether that is possible for other kinds of hash functions, though.
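As a rough illustration of that bound, here is a sketch of the classic Carter-Wegman family h(x) = ((a*x + b) mod p) mod m with random a and b. For any two distinct keys below p, the probability over the choice of (a, b) that they collide is at most 1/m. The prime `P`, the helper names, and the test keys are my own choices for the example.

```python
import random

# Assumed prime modulus: the Mersenne prime 2^61 - 1; keys must be smaller than P.
P = (1 << 61) - 1


def make_universal_hash(num_buckets, rng=random):
    """Draw one random member of the family h(x) = ((a*x + b) mod P) mod m."""
    a = rng.randrange(1, P)  # a must be nonzero
    b = rng.randrange(0, P)

    def h(x):
        return ((a * x + b) % P) % num_buckets

    return h


def empirical_collision_rate(x, y, num_buckets, trials=20000, seed=0):
    """Estimate Pr[h(x) == h(y)] over random draws of h, for fixed x != y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_universal_hash(num_buckets, rng)
        if h(x) == h(y):
            hits += 1
    return hits / trials


# Two arbitrary distinct keys; the estimate should be close to or below 1/8.
print(empirical_collision_rate(51, 204, num_buckets=8))
```

The guarantee holds for every fixed pair of keys, however structured the input is, because the randomness comes from picking the function rather than from the data.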
I can highly recommend the following resource by Tomek Czajka
How to pick a hash function, part 1:
https://sortingsearching.com/2020/05/21/hashing.html
How to pick a hash function, part 2:
https://sortingsearching.com/2020/06/28/hashing-part-2.html
1/2
"There is a very real advantage here when it comes to security against quantum computers: While most currently used schemes would be broken by a large enough quantum computer running Shor’s algorithm, no generic quantum attacks (better than Grover’s algorithm) are known against hash functions. As long as we can build a quantum-secure hash function, we can plug it into a hash-based signature scheme and prove security."
Hash-based digital signatures (almost) from scratch
https://medium.com/@georgwiese/hash-based-digital-signatures-almost-from-scratch-da57e54dd774
#hashfunction #digitalsignature #cryptography
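For a feel of what "plugging a hash function into a signature scheme" can look like, here is a minimal sketch of a Lamport one-time signature, one of the simplest hash-based schemes. It is my own illustration, assuming SHA-256 as the hash, and is not necessarily the construction used in the linked article. Each key pair must sign only one message, and this is not production code.

```python
import hashlib
import secrets


def H(data):
    """Hash function used throughout (assumed: SHA-256)."""
    return hashlib.sha256(data).digest()


def keygen():
    """Secret key: 256 pairs of random values. Public key: their hashes."""
    sk = [[secrets.token_bytes(32), secrets.token_bytes(32)] for _ in range(256)]
    pk = [[H(s0), H(s1)] for s0, s1 in sk]
    return sk, pk


def message_bits(message):
    """The 256 bits of the message digest, most significant bit first."""
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, message):
    """Reveal one secret per digest bit: sk[i][0] for a 0 bit, sk[i][1] for a 1 bit."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(pk, message, signature):
    """Check that every revealed secret hashes to the matching public value."""
    bits = message_bits(message)
    return len(signature) == 256 and all(
        H(signature[i]) == pk[i][bit] for i, bit in enumerate(bits)
    )


sk, pk = keygen()
sig = sign(sk, b"hello")
print(verify(pk, b"hello", sig))    # valid signature
print(verify(pk, b"goodbye", sig))  # signature does not match this message
```

Forging a signature requires finding preimages of the public hashes, which is why the security rests on the hash function alone. Signing a second message with the same key reveals more secrets, hence "one-time".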