A common bioinformatics task is to decompose a DNA sequence into its constituent k-mers and compute a hash value for each k-mer. Rolling hash functions are an appealing solution for this task, since they can be computed very quickly. A rolling hash does not compute each the hash value from scratch with each k-mer: rather it updates a running hash value using an update strategy and a sliding window over the data.
It's also very useful for many applications to have a k-mer hash to the same value as its reverse complement. Unless the data were generated using a strand-specific sample prep, it's impossible to distinguish a k-mer from its reverse complement, and they should be treated as the same sequence.
Are there any rolling hashes that will map reverse complements to the same value? If not, how would we develop such an algorithm?
UPDATE: Ideally the hash function would be able to support k > 32, which would be lossy unless using something larger than a 64-bit integer.
ANOTHER UPDATE: I don't think it's necessary to store both the running k-mer and its reverse complement in a single value. If storing two k-mer strings and/or two hash values makes this easier, I'm totally cool with that.