11

I have read some times that hashing is a one way function, that is you can make the hash of a message, but you can't recover the original message from the hash, just check its integrity.

However, if this were true, why can we decrypt MD5 hashes and get the original data?

  • 13
    Since you posit MD5 is "decryptable", please recover the original data for this hash: 0cca9b3eeae7b8747eaf61f8d282156d. I'll even give you a hint, it's a real md5 of 42 character ASCII string. PS, using the best public attacks on MD5 this should only take work 2**123 ~ 10,633,823,966,279,326,983,230,456,482,242,756,608 or at a billion hashes per second on a billion computers, it should only take about 170 billion years on average. – dr jimbob Jun 28 '13 at 18:22
  • 1
    @drjimbob: Assuming 42 printable ASCII characters, there are probably about 3.4e+44 such strings with that same MD5 hash. (9542 strings divided by 2128 possible MD5 hashes.) So your 170 billion years of work will give you a few hundred tredecillion possible answers, and no way to know which one is correct. If I were you, I'd almost consider not bothering. – Keith Thompson Jun 29 '13 at 01:00
  • @drjimbob I'm sure I can find a collision before that with rainbow tables. Used to be a time when Google had indexed at least one collision of most md5 hashes until the modified their search input to ignore hex strings of certain lengths. – ewanm89 Jun 29 '13 at 05:48
  • @ewanm89 - Want to bet? I'm willing to bet a $200 donation to any legitimate charity of loser's choice (say Cancer Research Institute - cancerresearch.org) if you can prove me wrong in the next month (I'd agree for longer terms if you need it, but risk forgetting about the wager). Full disclosure, I only vaguely remember the original string. So if you can generate an input (not necessarily my string) md5sums to that hash or this one 0113fd21d9ec4e367abb761b26ef6010 (also 42 ascii chars but I saved this string to disk). Or if you want we could do a straight-up bet via bitcoin. – dr jimbob Jun 29 '13 at 06:51
  • @ewanm89 - I take you couldn't find a collision with your MD5 rainbow tables. A rainbow table is just a time-memory tradeoff, every hash still has to be computed (plus apply a reduction function). Imagine a powerful adversary built a rainbow table with 280 MD5s. (If one MD5 takes 1 cpu-nanosecond this would require 38 billion CPU years, 10 billion computers for ~4 years, to construct). With this and a 128-bit MD5 of a high-entropy passphrase, the chance of one of my MD5s being broken is ~1 in 248 (281,474,976,710,656). Granted a stronger unpublished preimage attack on MD5 could exist. – dr jimbob Jul 01 '13 at 16:04
  • @KeithThomspon - I'd be satisfied with any input that matches (like a password that matches the md5 for login). Note, 96^42 ~ 10^83 ~ 2^277 >> 2^123 ~ 10^37. Brute-forcing all printable ASCII 42-char phrases would not take 170 billion years and give 10^44 possible answers. It would take 10^44 times longer than 170 billion years. Or it would take 7700 trillion trillion years using a trillion trillion computers that each test a trillion trillion MD5 hashes per second. A trillion trillion (10^24 aka septillion) is a very large number; e.g., Earth's mass is 6 trillion trillion kilograms. – dr jimbob Jul 18 '13 at 19:40

2 Answers2

46

Hashing is not encryption (it is hashing), so we do not "decrypt" MD5 hashes, since they were not "encrypted" in the first place.

Hashing is one-way, but deterministic: hash twice the same value, and you get twice the same output. So cracking a MD5 hash is about trying potential inputs (passwords) until a match is found. It works well when the input is "a password which a human user came up with" because human users are awfully unimaginative when it comes to choosing passwords.

Tom Leek
  • 172,594
  • 29
  • 349
  • 481
  • 2
    You might want to rewrite "hash twice the same value" which can be interpreted as hashing the hash of a value. – mowwwalker Jun 28 '13 at 23:10
  • 3
    @Walkerneo - Tom said "hash twice the same value" not "hash twice a value". It is not ambiguous by any stretch of imagination, unless you can find a hash of a certain input that is exactly the same to its input. Needless to say, if you have h(v) = v your h doesn't do its job correctly as a hash function. – TildalWave Jun 29 '13 at 13:21
  • @TildalWave: technically, a good hash function h has probability about 63.2% of actually having a fixed point v, i.e. a value v such that h(v) = v. We do expect, however, that it is not feasible to demonstrate the existence or inexistence of such a fixed point for any particular hash function, let alone compute it. (If we can compute the fixed point, then the function is definitely not a random oracle, which does not mathematically imply that it is broken as a hash function, but is still bad news in practice.) – Tom Leek Jul 02 '13 at 15:41
  • I suggest to clarify that cryptographic hashing can be believed to be a one way function, but this is not proven. No one-way function it is proven to exist (https://en.wikipedia.org/wiki/One-way_function). In addition, MD5 for sure is not a cryptographic hashing function because it's broken (https://en.wikipedia.org/wiki/MD5#Security): it is easy to find a collision so it's vulnerable to second pre-image attacks (at least) – infosec-guy Mar 03 '19 at 17:08
16

You cannot "decrypt" MD5.

What you can do is try to match a large number of possible inputs in the hopes of stumbling upon the input that matches your hash. There are several attacks against the MD5 algorithm that makes this significantly easier.