40

What's wrong with this code?

$password = "hello";
$password = md5($password);
for ($i = 1; $i < 20; $i++) {
  $password = md5($password);
}

I don't think an attacker with access to the hashes storage would be able to decrypt any password using more than 2 characters.

The attacker would have to decrypt this list of hashes to be able to gain the plaintext password.

69a329523ce1ec88bf63061863d9cb14
0dcd649d4ef5f787e39ddf48d8e625a5
5d6aaee903365197ede6f325eb3716c5
cbe8d0c48ab0ed8d23eacb1621f6c5c3
8fa852c5f5b1d0d6b1cb0fad32596c71
91a84cf929b73800d2ff81da28834c64
45b7d5e4d3fca6a4868d46a941076b72
e5b7d9d10fef132829731255ef644319
b3af6ff5f5c7ae757ca816a6cb62f092
150f3682b2e58d1d0e1f789f9ba06982
3f76626950bf31dbc815c667ca4b2b43
44f4c75517671e12946aab3c8c293f98
442256b098b2d88e93428b08b5155308
7fd8ebc5bdff94f24a10decaa1ab64e8
d04bbc863839b720d932a697f0bf443b
de737c934db23c2d1d1026863e7583a8
d745f6394700c4ab1e9ded0924ea35d2
ce9428b51d3a63431c57a435423877b6
7017f40bdb4f1be1f5fae7dd0fc7b907

With brute force, an attacher should try 3632 (*19) combinations, which is pretty unachievable. Isn't that true?

apaderno
  • 111
  • 5
genesis
  • 718
  • 6
  • 15
  • See also http://stackoverflow.com/questions/420843/need-some-help-understanding-password-salt –  Jul 24 '11 at 14:35
  • 1
    Why is this bad? 1) You didn't design it before you wrote it. 2) There don't appear to be any error checks. 3) You don't provide a check on the strength of the password. 4) You havn't specified the compiler or interpreter options for secure operation. 5) You haven't provided any test vectors. 6) It hasn't been independently reviewed and tested. – this.josh Jul 24 '11 at 23:40
  • 3
    Not 36^32. 16^32. You are looking at hex-digest of the md5 hash, so there are thirty-two characters selected from an 16 character alphabet (0-9,a-f) not 36. – dr jimbob Jul 25 '11 at 14:24
  • @drjimbob: aaah, it'S earier than I thought! Could you add it as an answer? I would like to accept it – genesis Jul 25 '11 at 14:29
  • @genesis, not to nitpick, but I pointed that out in my answer about two hours before dr jimbob's comment. http://security.stackexchange.com/questions/5586/why-do-people-think-that-this-is-bad-way-to-hash-passwords/5627#5627 – user Jul 26 '11 at 08:02
  • because you need a minimum of 21 nested md5()s. it's the Golden Rule of Cryptography! (yes i am joking) – geoff Feb 24 '14 at 23:44

15 Answers15

76

The wrong things on your method are:

  • You use way too few iterations (20 is too low, it should be 20000 or more): password processing is still too fast, an attacker with a basic PC will still be able to "try" dozens of millions of passwords per second.
  • There is no salt: an attacker may attack several passwords with very low per-password cost, e.g. with precomputed tables of hashed passwords (in particular rainbow tables).
  • You are in the process of inventing your own cryptography. There is nothing wrong with being inquisitive and trying to understand things, but since there is no sure test for knowing whether a given algorithm is secure or not, inventing your own cryptography is often a recipe for disaster. Don't do it.

What you should do is to use bcrypt; there is a PHP implementation in the Portable PHP password hashing framework.

Thomas Pornin
  • 326,555
  • 60
  • 792
  • 962
  • 2
    more iterationgs don't bring higher security. The oftener you hash, the higher is the probability of hash collisions. I'd say using md5 twice would be better than using md5 once, but if you use it 2, 20 or 20,000 times doesn't really matter. Using it twice might be better, as standard cracking websites like http://www.md5cracker.org/ will not work any longer. – Martin Thoma Jul 24 '11 at 16:39
  • 19
    @moose: collisions are not an issue here -- we work on preimage resistance, not collision resistance, and that's good because MD5 is utterly broken with regards to collisions anyway. Reduction of the output space might be viewed as an issue, but even after 2^64 iterations, the space should still have size about 2^64; the weakness here is not MD5, but the password (i.e. the human brain which can memorize a password). The multiple iterations are there to make dictionary attacks more expensive. – Thomas Pornin Jul 24 '11 at 17:16
  • 2
    Actually, the iteration count in today's software is much higher than 20000 - sometimes even in millions. But solid advice never the less. – Nakedible Jul 24 '11 at 18:00
  • 4
    @moose, that's not correct. Collisions are not a worry here; the likelihood of experiencing a collision in this context is negligible (like, less than the chances of being struck by lightning). For the parameters experienced here (iterating thousands, millions, or billions of times), more iterations do bring higher security, and the probability of collisions are negligibly small. – D.W. Aug 18 '11 at 08:11
  • @ThomasPornin Bcrypt is the best option, that's understood, but out of curiosity (situation I have no control over) is 50 iterations of SHA-256 with salting effective? Why or why not? Does that fight rainbow tables, and does the 50 iterations increase computational cost? – temporary_user_name Aug 28 '14 at 16:33
  • I'm actually going to contest the "Bcrypt is the best option" claim being thrown about (by @ThomasPornin and others) here. Bcrypt is better than PBKDF2 (which is just iterated salting and hashing) because it requires an amount of memory that isn't completely trivial, but it's still easy to parallelize on modern hardware. RAM is cheap these days. Scrypt or the even-newer Argon2 (winner of a three-year competition for best password hashing algorithm) are better because they are memory-hard as well as compute-hard, and the memory-hardness is tunable (Bcrypt's memory cost is low, and fixed). – CBHacking Apr 22 '18 at 02:48
35

Others have described the limitations of this hashing method; I'd like to point out a conceptual mistake in the question:

I don't think that attacker with my DB would be able to decrypt any password with lenght > 2

Attacker would have to decrypt this list of md5 hashes to be able to gain plain-password:

[list of intermediate results]

The mistake here is thinking that the complexity of the intermediate results provides any protection at all against a brute-force dictionary attack. I think the asker is thinking the attack must work backward, starting from the stored hash, and brute-force each intermediate result in turn.

This isn't true at all; reasonable dictionary attacks will start with possible passwords, and attack the whole 20-hash stack at once. Here's the algorithm sketch:

for each candidate password:
    hash 20 times
    compare with stored hash

Using this to check all possible 3-character passwords (assuming printable ASCII) would require only 20 * 95^3 = 17147500 hashes, which is basically trivial. Using SHA-512 instead of MD5, despite having much larger intermediate values, would be more secure only because each hash takes a little longer to compute.

tl;dr a complex hash function can't save you if the password itself doesn't have enough entropy.

Gordon Davisson
  • 2,611
  • 1
  • 18
  • 13
  • 2
    and how does the cracker know he must run the hash 20 times? – Enrique Feb 19 '12 at 23:46
  • 14
    @Enrique: the cracker might not know that, but you shouldn't count on it. This is Kerckhoffs's principle/Shannon's maxim: the enemy knows the system. – Gordon Davisson Feb 20 '12 at 03:54
  • OK, but then it is a good idea to hash X times just because if the cracker doesn't know X it will be more difficult to him to crack the hash. Am I wrong? – Enrique Feb 20 '12 at 16:08
  • 1
    @Enrique: No, that just forces the cracker to compare after each hash, which doesn't slow him down much. You should hash many times (thousands, not just 20) so the cracker has much more work to check each possible password (see Thomas and oliland's answers). – Gordon Davisson Feb 21 '12 at 02:02
  • 4
    @Enrique, To be clear, MD5 is the wrong hash for the job. Using bcrypt, and adjusting the repetition parameter until the operation takes 1ms minimum is the way to go. (perhaps even 100ms won't be that big a deal if this just happens once per login) Using bcrypt will prevent the necessity of installing your own repetition logic, and another feature which is very important (unless there are a limited number of users), which is that each password gets its own salt, requiring one brute force job per user, instead of once for all users together. – 700 Software Mar 08 '12 at 18:56
17

20x MD5 is a fast hashing algorithm, meaning it can generate passwords at an astonishing rate.

Please stop using fast hashing algorithms to store passwords. Even with individual salts; if someone has direct (read: offline) access to your database they can be computed very easily.

This article explains why much better than I can:

http://codahale.com/how-to-safely-store-a-password/

The article heavily mentions BCrypt (with a link to a PHP library), but bear in mind there are other slow hashing algorithms that may suit you.

oliland
  • 279
  • 1
  • 3
  • even 19 * 36 ^ 32 combinations + unknown lenght, characters, salt in last md5 hash is bad? – genesis Jul 24 '11 at 15:51
  • 2
    using a reversible symmetric-key algo to encrypt your password is almost as bad as storing data in plaintext. As if you've compromised the server, there is a good chance that that you've compromised the code/key to decipher data. – dvhh Jul 25 '11 at 02:22
  • 1
    @dvhh - a MD5 cannot be reversed. – Ramhound Mar 08 '12 at 18:15
  • @Ramhound old topic, but with today's computers it is possible to compute hash fast enough for a dictionary or brute force attack, there is even precomputed tables to test against non salted passwords. So yes it cannot be reversed, but if you are motivated enough, you can find out the input that match the md5 fast enough. – dvhh Mar 09 '12 at 05:51
  • 1
    @dvhh from the article: "bcrypt is an adaptive password hashing algorithm which uses the Blowfish keying schedule, not a symmetric encryption algorithm." – oliland Aug 07 '13 at 05:22
11

The problem is that this is a pretty obvious "algorithm", and pretty fast to boot.
It's very likely that there's a precomputed rainbow table available for this "algorithm", and even if there isn't, md5 is fast enough to be able to precompute one in a realistic amount of time.

You should always use an individual salt for each password to avoid this kind of attack.

deceze
  • 715
  • 3
  • 12
  • But what would attacker need in order to get my password(s) fast? And how long would it take with rainbow table? – genesis Jul 24 '11 at 14:35
  • 7
    @gen If you have a rainbow table for the "20xMD5" algorithm, looking up a password is instantaneous. – deceze Jul 24 '11 at 14:38
  • @genesis How long it will take to crack one password? Check this: http://md5hashcracker.appspot.com/ – Chris Ghenea Jul 24 '11 at 14:38
  • @deceze: is there such a table? – genesis Jul 24 '11 at 14:39
  • @gen I can't give you a definite "yes", since I don't know, but I think it's very likely. – deceze Jul 24 '11 at 14:40
  • @Chris: Those “crackers” are just using a lookup table of known (input,Hash(input)) pairs. – Gumbo Jul 24 '11 at 14:40
  • @deceze: A rainbow table of 30+ chars. Don't think it exists. At least not a complete one – PeeHaa Jul 24 '11 at 14:41
  • @Gumbo. Yes, is true. But the table is growing and growing and growing... – Chris Ghenea Jul 24 '11 at 14:43
  • The fact that the algorithm is "obvious" is not a security problem. 2. Using a salt is a good idea, but also more iterations are needed (and PBKDF2, bcrypt, or scrypt would be better).
  • – D.W. Aug 18 '11 at 08:12
  • @D.W. That "obvious" actually went together with the next sentence, that it's not unlikely that a rainbow table already exists for such an algorithm. – deceze Aug 18 '11 at 08:14
  • 1
    Even if it weren't "obvious", and even if no rainbow table already existed, it would still be a bad idea. If your password hash algorithm is crummy, the fact that no one else is using it doesn't make it OK to use the crummy algorithm. – D.W. Aug 18 '11 at 08:22
  • @D.W. And I'm in total agreement with that, in case that wasn't clear. – deceze Aug 18 '11 at 08:26