Can't I just prevent [password shucking] by salting the pre-hash?
Yes.
To understand how password shucking works, and how to prevent it, consider the following hypothetical scenario:
Suppose a user uses a password that is hard (but not impossible) to crack, such as 0e6d5b95c11d. The user uses this same password on both your site and another site.
The SHA256 hash of this password is 31903c9394eb17e176898d31b2ac06d0cfd04b077192341f8e8f3b5866ea0da2.
The other site hashes passwords by simply using unsalted SHA256, i.e. sha256(password), so that site stores 31903c9394eb17e176898d31b2ac06d0cfd04b077192341f8e8f3b5866ea0da2 for this user in its password database.
Now suppose your site hashes passwords using bcrypt(sha256(password)). This effectively means that your site stores bcrypt(31903c9394eb17e176898d31b2ac06d0cfd04b077192341f8e8f3b5866ea0da2) for this user in your password database.
Now suppose your site and the other site are both breached. The attacker sees that your site uses bcrypt(sha256(password)) as its password hashing algorithm, so he runs all of the unsalted SHA256 hashes from the other site (even though these are not yet cracked) through bcrypt, using the bcrypt salt stored with each of your hashes, to see if any of them match an entry in your password database. Lo and behold, when he gets to 31903c9394eb17e176898d31b2ac06d0cfd04b077192341f8e8f3b5866ea0da2, he finds the result in your password database.
Now the attacker can concentrate his resources on cracking the unsalted SHA256 hash 31903c9394eb17e176898d31b2ac06d0cfd04b077192341f8e8f3b5866ea0da2. Cracking a single iteration of unsalted SHA256 is much easier for the attacker than cracking bcrypt, so this effectively removes any added security that bcrypt provides. This is how password shucking works.
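The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: bcrypt is not in the Python standard library, so a hypothetical slow_hash() helper built on PBKDF2-HMAC-SHA256 stands in for it here, and the user names, database layouts, and the shuck() function are invented for the example.

```python
import hashlib
import hmac
import os


def slow_hash(data: bytes, salt: bytes) -> bytes:
    # ASSUMPTION: stand-in for bcrypt (PBKDF2 from the standard library).
    # The iteration count is kept low so the example runs quickly.
    return hashlib.pbkdf2_hmac("sha256", data, salt, 10_000)


def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()


password = "0e6d5b95c11d"

# The other site stores unsalted sha256(password).
other_site_db = {"alice@example.com": sha256_hex(password)}

# Your site stores slow_hash(sha256(password)) plus the slow hash's own salt.
slow_salt = os.urandom(16)
your_site_db = {
    "alice": (slow_salt, slow_hash(sha256_hex(password).encode(), slow_salt)),
}


def shuck(leaked_sha256_hashes, target_db):
    """Return (user, sha256_hash) pairs whose slow hash wraps a leaked hash."""
    matches = []
    for user, (salt, stored) in target_db.items():
        for candidate in leaked_sha256_hashes:
            # The attacker never needs the plaintext password for this step.
            if hmac.compare_digest(slow_hash(candidate.encode(), salt), stored):
                matches.append((user, candidate))
    return matches


shucked = shuck(list(other_site_db.values()), your_site_db)
print(shucked)
```

Every pair returned by shuck() tells the attacker which fast, unsalted SHA256 hash to crack in place of the slow hash.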
But none of this would be possible if you used bcrypt(sha256(password + salt)), because the attacker would presumably not have sha256(password + salt) from another breach, assuming that the salt you use for the SHA256 'prehash' is globally unique for each user.
To ensure that the salt used in the prehash is globally unique for each user, it's best to use a CSPRNG to generate a random salt for each user. This means that you'll need to store the prehash salts in your password database for each user.
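A minimal sketch of the salted prehash, assuming a 16-byte salt drawn from Python's secrets module (a CSPRNG). As an assumption for the sake of a self-contained example, PBKDF2-HMAC-SHA256 again stands in for bcrypt via a hypothetical slow_hash() helper; the record layout and the register() and verify() functions are likewise illustrative, not a prescribed schema.

```python
import hashlib
import hmac
import secrets


def slow_hash(data: bytes, salt: bytes) -> bytes:
    # ASSUMPTION: stand-in for bcrypt (PBKDF2 from the standard library).
    return hashlib.pbkdf2_hmac("sha256", data, salt, 10_000)


def prehash(password: str, prehash_salt: bytes) -> bytes:
    # sha256(password + salt), hex-encoded before it is fed to the slow hash.
    return hashlib.sha256(password.encode() + prehash_salt).hexdigest().encode()


def register(password: str) -> dict:
    prehash_salt = secrets.token_bytes(16)  # random per-user prehash salt
    slow_salt = secrets.token_bytes(16)     # salt for the slow hash itself
    return {
        "prehash_salt": prehash_salt,       # must be stored to verify later
        "slow_salt": slow_salt,
        "hash": slow_hash(prehash(password, prehash_salt), slow_salt),
    }


def verify(password: str, record: dict) -> bool:
    candidate = slow_hash(
        prehash(password, record["prehash_salt"]), record["slow_salt"]
    )
    return hmac.compare_digest(candidate, record["hash"])


record = register("0e6d5b95c11d")

# A shucking attempt with the unsalted SHA256 hash leaked from the other site.
leaked = hashlib.sha256(b"0e6d5b95c11d").hexdigest().encode()
shuck_attempt = hmac.compare_digest(
    slow_hash(leaked, record["slow_salt"]), record["hash"]
)
print(verify("0e6d5b95c11d", record), shuck_attempt)
```

Because the prehash salt is mixed in before SHA256, the unsalted hash leaked from the other site no longer matches anything in your database, while a normal login still verifies.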