I am working on an application in which I have to compare 2 hashed passwords in a database. One password is generated in PHP with $Password = password_hash($RawPassword, PASSWORD_BCRYPT);
The other password, which is sent to the database to be compared with the PHP-hashed password, is generated in Java with String hashedPassword = BCrypt.hashpw(password);
As of PHP 7.0 the salt is generated automatically. How can I know what salt is being applied in PHP so I can apply it in my Java code? Or is there still a way to specify the salt that is no longer in the documentation for PHP hashing?
The standard idea behind the vast majority of bcrypt impls is that the thing in the database looks like $2y$10$AB, where A is 22 characters and B is 31 characters, for a grand total of 60. A is left(base64(salt + 00 + 00), 22) and B is left(base64(bcryptraw(salt + pass)), 31). The 2y refers to the hash algorithm. EDIT: 2y and 2a are more or less interchangeable; most bcrypt impls treat them the same, and it is unlikely to matter which one is there. The 10 is the cost: the number of bcrypt rounds expressed as a power of two, so 10 means 2^10 rounds. 10 is common and usually what you want.
where:
base64(X) = apply base64 conversion using bcrypt's own alphabet (./A-Za-z0-9, in that order), which is not the standard base64 alphabet.
+ is concatenation, i.e. the salt (a 16-byte byte array) gets 2 zero bytes added.
left(chars, size) means: take the first size chars and discard the rest.
salt is the salt in bytes and pass is the password, converted to bytes via UTF-8. (If not converting via UTF-8, it's generally $2a$ instead, and you should upgrade: folks with non-ASCII chars in their password get pretty bad hashes in the older $2a$ mode!)
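To make the layout concrete, here is a minimal Python sketch that splits such a string into its fields. The names BcryptParts and split_bcrypt_hash and the sample value are made up for illustration. The code only slices the string and does no hashing, so it is no substitute for your library's check method.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    version: str  # e.g. "2y" or "2a"
    cost: int     # log2 of the number of rounds, e.g. 10
    salt: str     # 22 chars in bcrypt's base64 alphabet
    digest: str   # 31 chars in bcrypt's base64 alphabet


def split_bcrypt_hash(stored: str) -> BcryptParts:
    """Split a 60-char bcrypt string: $version$cost$ + 22 + 31 chars."""
    if len(stored) != 60:
        raise ValueError("expected a 60-character bcrypt string")
    # The leading "$" yields an empty first field, which is discarded.
    _, version, cost, tail = stored.split("$")
    if len(tail) != 53:
        raise ValueError("expected 22 salt chars plus 31 hash chars")
    return BcryptParts(version, int(cost), tail[:22], tail[22:])


# Placeholder with the right shape; not a real hash of anything.
example = "$2y$10$" + "A" * 22 + "B" * 31
print(split_bcrypt_hash(example))
```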
This one string contains everything that a bcrypt impl needs to check if a given password is correct or not. Thus, all non-idiotic bcrypt library impls have just two methods and no others:
// This is for when the user creates an account or edits their password.
// send the password to this method, then take the string it returns,
// and store this in your database.
hash = crypto.hashPass(password);
// This is for when the user tries to log in. For 'hash', send the thing
// that the hashPass method made for you.
boolean passwordIsCorrect = crypto.checkPass(password, hash);
EDIT: NB: A truly well designed crypto library calls these methods processNewPassword and checkExistingPassword to avoid the kind of confusion that caused you to ask this question, but unfortunately, nobody out there seems to have had the wherewithal to think for 5 seconds about what their names suggest. Unfortunate. Security is hard.
If your BCrypt API doesn't work like this, get rid of it, and find a standard implementation that does.
It sounds like you're using the wrong method. To check passwords, don't use hashPass. Use checkPass, or whatever goes for checkPass in your impl (it might be called checkPw or verifyPw or validate, etcetera; it takes 2 strings). In jBCrypt that is BCrypt.checkpw(plaintext, hashed); in PHP it is password_verify($password, $hash).
Thus, you should never generate a salt, nor ever extract a salt from such a string. Let the bcrypt lib do it. Those 'hashes' that standard bcrypt libraries generate (the $2y$ string) are interchangeable; your PHP library can make 'em and your Java library can check 'em, and vice versa.
If you MUST extract the salt (but don't):
take those 22 characters, after the $protocol$rounds$ part.
append 'aa' to this.
base64decode the result.
this gets you 18 bytes. toss the last 2 bytes, which contain garbage.
The remaining 16 bytes are the salt.
You should absolutely not write this - your bcrypt library will do this.
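For the curious only, the steps above can be sketched in Python. This is an illustration, assuming bcrypt's base64 alphabet is ./A-Za-z0-9 and that its bit packing otherwise matches standard base64. decode_bcrypt_salt is a made-up helper name, not part of any bcrypt library.

```python
import base64

# bcrypt's alphabet and the standard one, position for position.
BCRYPT_ALPHABET = "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
STD_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
TO_STD = str.maketrans(BCRYPT_ALPHABET, STD_ALPHABET)


def decode_bcrypt_salt(salt22: str) -> bytes:
    """Recover the 16 salt bytes from the 22-char salt field."""
    if len(salt22) != 22:
        raise ValueError("the salt field must be exactly 22 characters")
    # Append 'aa' to reach 24 chars, then map to the standard alphabet.
    padded = (salt22 + "aa").translate(TO_STD)
    raw = base64.b64decode(padded)  # 24 chars decode to 18 bytes
    return raw[:16]                 # the last 2 bytes are garbage
```

Feed it the 22 characters that follow $protocol$rounds$ in the stored string. Again: your bcrypt library already does this for you when it checks a password.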
When running the encode method of a spring security Pbkdf2PasswordEncoder instance multiple times, the method returns different results for the same inputs. The snippet
String salt = "salt";
int iterations = 100000;
int hashWidth = 128;
String clearTextPassword = "secret_password";
Pbkdf2PasswordEncoder pbkdf2PasswordEncoder = new Pbkdf2PasswordEncoder(salt, iterations, hashWidth);
String derivedKey = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey: " + derivedKey);
String derivedKey2 = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey2: " + derivedKey2);
results in an output like
derivedKey: b6eb7098ee52cbc4c99c4316be0343873575ed4fa4445144
derivedKey2: 2bef620cc0392f9a5064c0d07d182ca826b6c2b83ac648dc
The expected output would be the same value for both derivations. In addition, when running the application another time, the outputs are different again. The same behavior also appears for two different Pbkdf2PasswordEncoder instances with the same inputs. The encoding method behaves more like a random number generator. The Spring Boot version used is 2.6.1; the spring-security-core version is 5.6.0.
Is there any obvious setting that I am missing? The documentation does not give additional hints. Is there a conceptual error in the spring boot project set up?
Is there any obvious setting that I am missing?
Yes. The documentation you linked to is fairly clear, I guess you missed it. That string you pass to the Pbkdf2PasswordEncoder constructor is not a salt!
The encoder generates a salt for you, and generates a new one every time you ask it to encode something, which is how you're supposed to do this stuff [1]. (The returned string contains both this randomly generated salt and the result of applying the encoding, in a single string.) Because a new salt is made every time you call .encode, the .encode call returns a different value every time you call it, even if you call it with the same inputs.
The string you pass in is merely 'another secret' - which can sometimes be useful (for example, if you can store this secret in a secure enclave, or it is sent by another system / entered upon boot and never stored on disk, then if somebody runs off with your server they can't check passwords. PBKDF means that if they did have the secret the checking will be very slow, but if they don't, they can't even start).
This seems like a solid plan - otherwise people start doing silly things. Such as using the string "salt" as the salt for all encodes :)
The real problem is:
The expected output would be the same values for both derivations
No. Your expectation is broken. Whatever code you are writing that made this assumption needs to be tossed. For example, this is how you are intended to use the encoder:
When a user creates a new password, you use .encode and store what this method returns in a database.
When a user logs in, you take what they typed, and you take the string from your database (the one .encode sent you) and call .matches.
It sounds like you want to run .encode again and see if the result equals the stored string. That is not how you're supposed to use this code.
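To see why two encode calls differ while matches still works, here is a rough Python sketch of the same idea using only the standard library. It is not Spring's implementation. The salt size, the HMAC-SHA1 choice, the hex layout (salt followed by derived key) and the way the extra secret is mixed in are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # same count as in the question
SALT_BYTES = 8        # assumption for this sketch
KEY_BYTES = 16        # 128 bits, like hashWidth in the question


def encode(password: str, secret: str = "") -> str:
    """Return hex(salt + derived key), with a fresh random salt per call."""
    salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt + secret.encode("utf-8"),
        ITERATIONS, KEY_BYTES,
    )
    return (salt + key).hex()


def matches(password: str, stored: str, secret: str = "") -> bool:
    """Re-derive the key with the salt embedded in the stored string."""
    raw = bytes.fromhex(stored)
    salt, expected = raw[:SALT_BYTES], raw[SALT_BYTES:]
    key = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt + secret.encode("utf-8"),
        ITERATIONS, len(expected),
    )
    return hmac.compare_digest(key, expected)


first = encode("secret_password", secret="salt")
second = encode("secret_password", secret="salt")
print(first, second)  # two different strings for the same input
print(matches("secret_password", first, secret="salt"))
print(matches("secret_password", second, secret="salt"))
```

Both stored strings verify against the same password even though they differ, because the salt needed to redo the derivation travels inside each string.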
Footnote 1: The why
You also need to review your security policies. The idea you have in your head of how this stuff works is thoroughly broken. Imagine it worked like you wanted, and there is a single salt used for all password encodes. Then if you hand me a dump of your database, I can trivially crack about 5% of all accounts within about 10 minutes!!
How? Well, I sort all hashed strings and then count occurrences. There will be a bunch of duplicate strings inside. I can then take all users whose passhash is in this top 10 of most common hashes and then log in as them. Because their password is iloveyou, welcome123, princess, dragon, 12345678, alexsawesomeservice!, etcetera - the usual crowd of extremely oft-used passwords. How do I know that's their password? Because their password is the same as that of many other users on your system.
Furthermore, if none of the common passwords work, I can tell that likely these are really different accounts from the same user.
These are all things that I definitely should not be able to derive from the raw data. The solution is, naturally, to have a unique salt for everything, and then store the salt in the DB along with the hash value so that one can 'reconstruct' when a user tries to log in. These tools try to make your life easy by doing the work for you. This is a good idea, because errors in security implementations (such as forgetting to salt, or using the same salt for all users) are not (easily) unit testable. So a well-meaning developer writes code, it seems to work, a casual glance at the password hashes seems to indicate "it is working" (the hashes seem random enough to the naked eye), and then it gets deployed, security issue and all.
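The duplicate-hash leak is easy to reproduce. A small Python sketch, with made-up user names and a deliberately low iteration count so it runs quickly; the shared salt "salt" mirrors the mistake described above:

```python
import hashlib
import os
from collections import Counter

# Hypothetical accounts; two of them picked the same common password.
users = {"alice": "iloveyou", "bob": "dragon", "carol": "iloveyou"}


def derive(password: str, salt: bytes) -> str:
    """PBKDF2-HMAC-SHA256 with few iterations (demo only), hex encoded."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 1000).hex()


def most_repeated(db: dict) -> int:
    """How often the most common stored hash occurs in the table."""
    return Counter(db.values()).most_common(1)[0][1]


shared_db = {user: derive(pw, b"salt") for user, pw in users.items()}
per_user_db = {user: derive(pw, os.urandom(16)) for user, pw in users.items()}

print(most_repeated(shared_db))    # 2: alice and carol visibly share a password
print(most_repeated(per_user_db))  # 1: nothing to group on
```

In the per-user table each salt would be stored next to its hash, which is exactly what the encoder's single output string does for you.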
When using Jasypt, the encrypted passwords contain = (the equals character) at the end. Is it guaranteed that the encrypted passwords will always have = at the end?
How can we control this behavior, if at all?
For example: test is encrypted to Nv4nMcuVwsvWVuYD7Av44Q==
It looks like the =s come from padding the Base64 representation of the encryption / hash output.
In that case, the answer is generally no, it won't necessarily end with "=".
However, if the algorithm you're using produces constant-length output (e.g. if it uses hashing along the way), it might, by chance, end up producing those "="s all the time. There's no way of knowing that for sure unless you fully understand all the steps the algorithm performs.
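The link between output length and padding can be checked directly. A short Python sketch, using all-zero byte strings as stand-ins for ciphertext (what Jasypt does internally is not modelled here):

```python
import base64


def padding_chars(size: int) -> int:
    """Number of '=' characters when base64-encoding `size` bytes."""
    return base64.b64encode(bytes(size)).decode("ascii").count("=")


for size in (15, 16, 17, 18):
    print(size, padding_chars(size))  # 15 -> 0, 16 -> 2, 17 -> 1, 18 -> 0

# The example from the question decodes to 16 bytes, hence the "==".
print(len(base64.b64decode("Nv4nMcuVwsvWVuYD7Av44Q==")))  # 16
```

So the padding depends only on the byte length modulo 3: output that is always 16 bytes long will always end in "==", while other lengths end in "=" or in nothing.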
We encrypt with AES 256 CBC PKCS5PADDING in Java with the libraries one has to download from Oracle, with Base64 encoding of the resulting byte arrays. I have read that a static, common initialization vector drastically decreases security, as texts that start with the same chars will look the same when encrypted. Is this still true for short strings (12 numeric chars)?
I have encrypted a large set and I cannot find any reoccurring substrings in the resulting encrypted strings, even when they start with the same sequence.
Example (plaintext on the left and resulting encrypted string on the right)
555555555501 -> U0Mkd0PPloB5iLBy5jM6nw==
555555555502 -> NUHWaFs62LMEeyoGA0mGoQ==
555555555503 -> X3/XJNd4TzEsMv7V0bXwqg==
This is separate from the question, but to preempt some suggestions: we need to be able to do lookups based on plaintext strings and to be able to decrypt. We could do both hashing and encryption, but we would prefer to avoid that if it does not improve security significantly, as it adds complexity.
I have read that static common initialization vector are bad as one can derive the key from encrypted strings.
I'm curious: where have you read that?
With short (<=16 bytes) plaintext, a random IV effectively works as a salt, i.e. it causes the ciphertext to differ even if the plaintext is the same. This is an important feature in a lot of applications. But you write:
We need to be able to do look ups based on plaintext strings.
So you want to build some sort of pseudonymization database? If that is a requirement for you, the feature that salt, and in your case random IV adds, is actually one that you specifically don't want. Depending on your other requirements you can probably get away with using a static IV here. But for pseudonymization in general, it is recommended to use a dedicated pseudonym. In your case the data seems to be atomic. But in the general case of, for example, address data, you want to hash the name, the zip code, the city and whatever else your pseudonym is, separately, both to allow more specific queries, and to keep access to and information flow from your data under strict control.
I am writing a Django app that needs to work with an existing Java Play framework app. The Play app uses PasswordHash.java to store passwords. It stores passwords in a colon separated format. Each hash is stored as iterations:salt:pbkdf2_hash.
For example, here is an entry for the password "test":
1000:f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39:b69139f51bc4098afc36b4ff804291b0bc697f87be9c1ced
Here we can split the string by : and find:
Iterations: 1000
Salt: f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39
PBKDF2 Hash: b69139f51bc4098afc36b4ff804291b0bc697f87be9c1ced.
I modified Django's check_password mechanism to be compatible with this format, but found that it didn't think the password was correct. I used Django's crypto.py to regenerate a hash for "test" using the same salt that Play used, and came up with this:
hash = crypto.pbkdf2('test', 'f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39', 1000, 24)
base64.b16encode(hash)
'9A8725BA1025803028ED5B92748DD61DFC2625CC39E45B91'
The PBKDF2 hash from play does not match this hash. (For those wondering, I used 24 as the fourth parameter because that is what is used in PasswordHash.java).
After I was unable to make Django's generated hash match Java's, I tried it on a website that does it for you.
I plugged in the same salt, used SHA-1 with 1000 iterations and a 24-byte key size, and found that the website matched what Django had created!
I am not sure what is going on with PasswordHash.java, but I desperately need to get Django and Play to "play nicely" (couldn't help myself haha). Does anyone have an idea as to what is going on here?
Try salt = base64.b16decode(salt.upper()).
I did and I got the hash in your initial example, albeit uppercased B69139F5...
Explanation:
The hash and salt are both stored in Base16 (hex) in your initial example. So you decode the salt before using it, and then encode the resulting hash to compare it with the stored one.
The upper() is there because Python's b16decode is strict about uppercase Base16 by default. It will raise an error if you give it lowercase input.
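Put together, a check against the iterations:salt:hash format could look roughly like this in Python 3, using only the standard library. This is a sketch under the assumptions stated in the question (PBKDF2 with HMAC-SHA1, hex-encoded salt and hash). check_play_hash is a made-up name, and bytes.fromhex is used instead of b16decode because it accepts lowercase hex.

```python
import hashlib
import hmac


def check_play_hash(password: str, stored: str) -> bool:
    """Check a password against an iterations:salt:hash entry (hex fields)."""
    iterations, salt_hex, hash_hex = stored.split(":")
    # Decode the hex fields; the hex text itself is not the salt.
    salt = bytes.fromhex(salt_hex)
    expected = bytes.fromhex(hash_hex)
    derived = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt, int(iterations), len(expected)
    )
    # Constant-time comparison of the raw bytes.
    return hmac.compare_digest(derived, expected)


# The entry for "test" quoted in the question.
entry = ("1000:f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39:"
         "b69139f51bc4098afc36b4ff804291b0bc697f87be9c1ced")
print(check_play_hash("test", entry))
```

Comparing raw bytes also sidesteps the uppercase/lowercase difference between the two hex encodings.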
I'm planning to hash user passwords using bcrypt, and to store these hashed passwords in a database.
The server that handles user account creation, and inserts the hashed password to the database is written in Java.
Another server that needs to access user information (including the hashed passwords) is written in Python.
I was planning to use jBCrypt for the Java side, but before I do that I want to make sure that I'll be able to recognise/use these hashed passwords from the Python side.
As I understand it, this should be no problem as long as the Python BCrypt implementation is compatible with the Java implementation.
So, can I use the passwords hashed using jBCrypt from Python? How?
Thanks in advance!
The best way to know is to actually try it.
Assuming both implementations are correct, they should be compatible, as long as you take care to re-encode data as necessary.
Typically, a hash is stored in memory either as a byte array of the raw hash, or as an ASCII hexadecimal representation. The best way to know which encoding it's using is to actually print it to the console: if it looks like garbage, it's a raw byte array; if it prints a hexadecimal string (0-9 and a-f), it's ASCII-encoded hexadecimal.
The salt will probably be stored like the hash. The number of rounds is an integer. It's up to you to store all this data in a common format. If you need to convert between a byte array (actually, a string) and an ASCII hex string in Python 2, you can use str.encode and str.decode with the 'hex' codec (in Python 3, the equivalents are bytes.hex() and bytes.fromhex()):
>>> 'hello world'.encode('hex')
'68656c6c6f20776f726c64'
>>> '68656c6c6f20776f726c64'.decode('hex')
'hello world'
For a bcrypt implementation in python, you may want to try py-bcrypt