How to get reproducible Pbkdf2PasswordEncoder output in spring boot?


When running the encode method of a spring security Pbkdf2PasswordEncoder instance multiple times, the method returns different results for the same inputs. The snippet
String salt = "salt";
int iterations = 100000;
int hashWidth = 128;
String clearTextPassword = "secret_password";
Pbkdf2PasswordEncoder pbkdf2PasswordEncoder = new Pbkdf2PasswordEncoder(salt, iterations, hashWidth);
String derivedKey = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey: " + derivedKey);
String derivedKey2 = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey2: " + derivedKey2);
results in output like
derivedKey: b6eb7098ee52cbc4c99c4316be0343873575ed4fa4445144
derivedKey2: 2bef620cc0392f9a5064c0d07d182ca826b6c2b83ac648dc
The expected output would be the same values for both derivations. In addition, when running the application another time, the outputs would be different again. The different output behavior also appears for two different Pbkdf2PasswordEncoder instances with the same inputs. The encoding method behaves more like a random number generator. The Spring Boot version used is 2.6.1; the spring-security-core version is 5.6.0.
Is there any obvious setting that I am missing? The documentation does not give additional hints. Is there a conceptual error in the spring boot project set up?

Is there any obvious setting that I am missing?
Yes. The documentation you linked to is fairly clear; I guess you missed it. The string you pass to the Pbkdf2PasswordEncoder constructor is not a salt!
The encoder generates a salt for you, and generates a new salt every time you ask it to encode something, which is how you're supposed to do this stuff (see footnote 1). (The returned string contains both this randomly generated salt and the result of applying the encoding, in a single string.) Because a new salt is made every time you call .encode, the .encode call returns a different value every time you call it, even if you call it with the same inputs.
The string you pass in is merely 'another secret' - which can sometimes be useful (for example, if you can store this secret in a secure enclave, or it is sent by another system / entered upon boot and never stored on disk, then if somebody runs off with your server they can't check passwords. PBKDF means that if they did have the secret the checking will be very slow, but if they don't, they can't even start).
This seems like a solid plan - otherwise people start doing silly things. Such as using the string "salt" as the salt for all encodes :)
The real problem is:
The expected output would be the same values for both derivations
No. Your expectation is broken. Whatever code you are writing that made this assumption needs to be tossed. For example, this is how you are intended to use the encoder:
When a user creates a new password, you use .encode and store what this method returns in a database.
When a user logs in, you take what they typed, and you take the string from your database (the one .encode sent you) and call .matches.
It sounds like you want to run .encode again and see if the result matches. That is not how you're supposed to use this code.
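The encode-once, matches-at-login flow can be sketched with only the JDK. This is a minimal illustration, not Spring Security's implementation: the class name SaltedHashSketch, the "salt:hash" storage format and the 256-bit key length are assumptions made for the example. It shows why two encode calls differ while matches still succeeds for both.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class SaltedHashSketch {
    private static final int ITERATIONS = 100_000; // same count as in the question
    private static final int KEY_BITS = 256;       // assumption for this sketch
    private static final int SALT_BYTES = 16;      // assumption for this sketch

    // Derives a key from the password and salt with PBKDF2-HMAC-SHA256.
    private static byte[] pbkdf2(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } finally {
            spec.clearPassword(); // clears the spec's internal copy only
        }
    }

    // Call once when the password is created; store the returned string.
    public static String encode(char[] password) throws GeneralSecurityException {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt); // fresh random salt on every call
        byte[] hash = pbkdf2(password, salt);
        Base64.Encoder b64 = Base64.getEncoder();
        return b64.encodeToString(salt) + ":" + b64.encodeToString(hash);
    }

    // Call at login: re-derive with the stored salt and compare the results.
    public static boolean matches(char[] password, String stored) throws GeneralSecurityException {
        String[] parts = stored.split(":");
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        return MessageDigest.isEqual(expected, pbkdf2(password, salt));
    }

    public static void main(String[] args) throws GeneralSecurityException {
        char[] pw = "secret_password".toCharArray();
        String first = encode(pw);
        String second = encode(pw);
        System.out.println("same string:    " + first.equals(second));
        System.out.println("first matches:  " + matches(pw, first));
        System.out.println("second matches: " + matches(pw, second));
        System.out.println("wrong password: " + matches("wrong".toCharArray(), first));
    }
}
```

The two stored strings differ because their salts differ, yet both verify the same password, which is why the database keeps the output of encode and the login path calls matches.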
Footnote 1: The why
You also need to review your security policies. The idea you have in your head of how this stuff works is thoroughly broken. Imagine it worked like you wanted, and there is a single salt used for all password encodes. Then if you hand me a dump of your database, I can trivially crack about 5% of all accounts within about 10 minutes!!
How? Well, I sort all hashed strings and then count occurrences. There will be a bunch of duplicate strings inside. I can then take all users whose passhash is in this top 10 of most common hashes and then log in as them. Because their password is iloveyou, welcome123, princess, dragon, 12345678, alexsawesomeservice!, etcetera - the usual crowd of extremely oft-used passwords. How do I know that's their password? Because their password is the same as that of many other users on your system.
Furthermore, if none of the common passwords work, I can tell that likely these are really different accounts from the same user.
These are all things that I definitely should not be able to derive from the raw data. The solution is, naturally, to have a unique salt for everything, and then store the salt in the DB along with the hash value so that one can 'reconstruct' when a user tries to log in. These tools try to make your life easy by doing the work for you. This is a good idea, because errors in security implementations (such as forgetting to salt, or using the same salt for all users) are not (easily) unit testable: a well-meaning developer writes code, it seems to work, a casual glance at the password hashes seems to indicate "it is working" (the hashes seem random enough to the naked eye), and then it gets deployed, security issue and all.
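The duplicate-hash attack from the footnote can be sketched as follows. This is deliberately insecure demonstration code: the class SharedSaltLeak, the user names, and the reduced iteration count are assumptions chosen only to keep the example short. It hashes every password with the same salt and then counts identical hashes, as someone holding a database dump could.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class SharedSaltLeak {
    // Deliberately insecure: every password is hashed with the same salt.
    static String hashWithSharedSalt(String password) throws GeneralSecurityException {
        byte[] salt = "salt".getBytes(StandardCharsets.UTF_8);
        PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, 10_000, 256);
        byte[] hash = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return Base64.getEncoder().encodeToString(hash);
    }

    // Counts how often each stored hash occurs across all users.
    static Map<String, Integer> countHashes(Map<String, String> userToPassword)
            throws GeneralSecurityException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String password : userToPassword.values()) {
            counts.merge(hashWithSharedSalt(password), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        Map<String, String> users = new LinkedHashMap<>();
        users.put("alice", "iloveyou");
        users.put("bob", "iloveyou");
        users.put("carol", "a much less common passphrase");
        // Any hash with a count above 1 marks accounts that share a password.
        countHashes(users).forEach((hash, n) -> System.out.println(n + " x " + hash));
    }
}
```

With a fresh random salt per password, every hash would occur exactly once and the counting step would reveal nothing.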

Related

PHP Bcrypt Salt as of 7.0

I am working on an application in which I have to compare 2 hashed passwords in a database, one password is being generated in PHP with $Password = password_hash($RawPassword, PASSWORD_BCRYPT);
While the other password that is being sent to the database to compare with the PHP hashed password is generated in Java with String hashedPassword = BCrypt.hashpw(password);
As of PHP 7.0 the salt is generated automatically. How can I know what salt is being applied in PHP so I can apply it in my Java code? Or is there still a way to specify the salt that is no longer in the documentation for PHP hashing?

The standard idea behind the vast majority of bcrypt impls is that the thing that is in the database looks like $2y$10$AB where A is 22 characters and B is 31 characters for a grand total of 60. A is: left(base64(salt + 00 + 00), 22) and B is: left(base64(bcryptraw(salt + pass)), 31). (2y refers to the hash algorithm. EDIT: 2y and 2a are more or less interchangeable; most bcrypt impls treat them the same, and it is unlikely to matter which one is there. The 10 is the cost factor: bcrypt applies 2^10 rounds. 10 is common and usually what you want.)
where:
base64(X) = apply base64 conversion, using . and / as the 63rd and 64th char.
+ is concatenate, i.e. salt (a 16-byte byte array) gets 2 zero bytes added.
left(chars, size) means: Take the first size chars and discard the rest.
salt is the salt in bytes and pass is the password, converted to bytes via UTF_8. (If not converting via UTF-8, it's generally $2a$ instead, and you should upgrade; folks with non-ASCII chars in their password get pretty bad hashes in the older $2a$ mode!)
This one string contains everything that a bcrypt impl needs to check if a given password is correct or not. Thus, all non-idiotic bcrypt library impls have just two methods and no others:
// This is for when the user creates an account or edits their password.
// send the password to this method, then take the string it returns,
// and store this in your database.
hash = crypto.hashPass(password);
// This is for when the user tries to log in. For 'hash', send the thing
// that the hashPass method made for you.
boolean passwordIsCorrect = crypto.checkPass(password, hash);
EDIT: NB: A truly well designed crypto library calls these methods processNewPassword and checkExistingPassword to avoid the kind of confusion that caused you to ask this question, but unfortunately, nobody out there seems to have had the wherewithal to think for 5 seconds about what their names suggest. Unfortunate. Security is hard.
If your BCrypt API doesn't work like this, get rid of it, and find a standard implementation that works like this.
It sounds like you're using the wrong method. To check passwords, don't use hashPass. Use checkPass, or whatever goes for checkPass in your impl (it might be called checkPw or verifyPw or validate, etcetera; it takes 2 strings).
Thus, you should never generate a salt, nor ever extract a salt from such a string. Let the bcrypt lib do it. Those 'hashes' that standard bcrypt libraries generate (the $2y$ string) are interchangeable; your PHP library can make 'em and your Java library can check 'em, and vice versa.
If you MUST extract the salt (but don't):
take those 22 characters, after the $protocol$rounds$ part.
append 'aa' to this.
base64decode the result.
this gets you 18 bytes. toss the last 2 bytes, which contain garbage.
The remaining 16 bytes are the salt.
You should absolutely not write this - your bcrypt library will do this.

Will every password encrypted by Jasypt contain "=" at the end?

While using Jasypt, the encrypted passwords contain = (the equals character) at the end. Is it guaranteed that the encrypted passwords will always have = at the end?
How can we control this behavior, if at all?
For example: test is encrypted to Nv4nMcuVwsvWVuYD7Av44Q==

It looks like the =s come from padding the Base64 representation of the encryption / hash output.
In that case, the answer is generally no, it won't necessarily end with "=".
However, if the algorithm you're using produces constant-length output (e.g. if it uses hashing along the way), it might by chance end up producing those "="s all the time - but there's no way of knowing that for sure unless you fully understand all the steps that the algorithm performs.
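How the padding depends on the length of the encoded bytes can be checked with the JDK's java.util.Base64. This is a small sketch with a hypothetical class name, independent of Jasypt itself: Base64 works in groups of 3 bytes, so the number of trailing "=" characters is determined by the byte count modulo 3.

```java
import java.util.Base64;

public class Base64PaddingDemo {
    // Encodes byteCount zero bytes; only the length matters for the padding.
    static String encode(int byteCount) {
        return Base64.getEncoder().encodeToString(new byte[byteCount]);
    }

    public static void main(String[] args) {
        for (int n = 15; n <= 17; n++) {
            System.out.println(n + " bytes -> " + encode(n));
        }
        // The value from the question decodes to 16 bytes, hence the "==".
        int len = Base64.getDecoder().decode("Nv4nMcuVwsvWVuYD7Av44Q==").length;
        System.out.println("question's example decodes to " + len + " bytes");
    }
}
```

A length divisible by 3 gives no padding, a remainder of 1 gives "==", and a remainder of 2 gives "=". Output that always ends in "==" therefore suggests a fixed output length with remainder 1, such as 16 bytes.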

What should I do after using a password to log in to a system? [duplicate]

In Swing, the password field has a getPassword() (returns char[]) method instead of the usual getText() (returns String) method. Similarly, I have come across a suggestion not to use String to handle passwords.
Why does String pose a threat to security when it comes to passwords?
It feels inconvenient to use char[].

Strings are immutable. That means once you've created the String, if another process can dump memory, there's no way (aside from reflection) you can get rid of the data before garbage collection kicks in.
With an array, you can explicitly wipe the data after you're done with it. You can overwrite the array with anything you like, and the password won't be present anywhere in the system, even before garbage collection.
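Wiping an array after use might look like the sketch here. The class WipePasswordDemo and its authenticate method are hypothetical placeholders for whatever actually checks the password. The try/finally ensures the array is overwritten even if the check throws.

```java
import java.util.Arrays;

public class WipePasswordDemo {
    // Hypothetical stand-in for a real credential check.
    static boolean authenticate(char[] password) {
        return password.length > 0;
    }

    // Uses the password, then overwrites it regardless of the outcome.
    static boolean login(char[] password) {
        try {
            return authenticate(password);
        } finally {
            Arrays.fill(password, '\0'); // the caller's array no longer holds the password
        }
    }

    public static void main(String[] args) {
        char[] pw = {'s', 'e', 'c', 'r', 'e', 't'};
        boolean ok = login(pw);
        System.out.println("authenticated: " + ok);
        System.out.println("first char code after wipe: " + (int) pw[0]);
    }
}
```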
So yes, this is a security concern - but even using char[] only reduces the window of opportunity for an attacker, and it's only for this specific type of attack.
As noted in the comments, it's possible that arrays being moved by the garbage collector will leave stray copies of the data in memory. I believe this is implementation-specific - the garbage collector may clear all memory as it goes, to avoid this sort of thing. Even if it does, there's still the time during which the char[] contains the actual characters as an attack window.

While other suggestions here seem valid, there is one other good reason. With plain String you have much higher chances of accidentally printing the password to logs, monitors or some other insecure place. char[] is less vulnerable.
Consider this:
public static void main(String[] args) {
    Object pw = "Password";
    System.out.println("String: " + pw);
    pw = "Password".toCharArray();
    System.out.println("Array: " + pw);
}
Prints:
String: Password
Array: [C@5829428e

To quote an official document, the Java Cryptography Architecture guide says this about char[] vs. String passwords (about password-based encryption, but this is more generally about passwords of course):
It would seem logical to collect and store the password in an object
of type java.lang.String. However, here's the caveat: Objects of
type String are immutable, i.e., there are no methods defined that
allow you to change (overwrite) or zero out the contents of a String
after usage. This feature makes String objects unsuitable for
storing security sensitive information such as user passwords. You
should always collect and store security sensitive information in a
char array instead.
Guideline 2-2 of the Secure Coding Guidelines for the Java Programming Language, Version 4.0 also says something similar (although it is originally in the context of logging):
Guideline 2-2: Do not log highly sensitive information
Some information, such as Social Security numbers (SSNs) and
passwords, is highly sensitive. This information should not be kept
for longer than necessary nor where it may be seen, even by
administrators. For instance, it should not be sent to log files and
its presence should not be detectable through searches. Some transient
data may be kept in mutable data structures, such as char arrays, and
cleared immediately after use. Clearing data structures has reduced
effectiveness on typical Java runtime systems as objects are moved in
memory transparently to the programmer.
This guideline also has implications for implementation and use of
lower-level libraries that do not have semantic knowledge of the data
they are dealing with. As an example, a low-level string parsing
library may log the text it works on. An application may parse an SSN
with the library. This creates a situation where the SSNs are
available to administrators with access to the log files.

Character arrays (char[]) can be cleared after use by setting each character to zero, whereas Strings cannot. If someone can somehow see the memory image, they can see a password in plain text if Strings are used; if char[] is used and the data has been overwritten with zeros, the password is no longer there to find.

Some people believe that you have to overwrite the memory used to store the password once you no longer need it. This reduces the time window an attacker has to read the password from your system and completely ignores the fact that the attacker already needs enough access to hijack the JVM memory to do this. An attacker with that much access can catch your key events making this completely useless (AFAIK, so please correct me if I am wrong).
Update
Thanks to the comments I have to update my answer. Apparently there are two cases where this can add a (very) minor security improvement as it reduces the time a password could land on the hard drive. Still I think it's overkill for most use cases.
Your target system may be badly configured or you have to assume it is and you have to be paranoid about core dumps (can be valid if the systems are not managed by an administrator).
Your software has to be overly paranoid to prevent data leaks with the attacker gaining access to the hardware - using things like TrueCrypt (discontinued), VeraCrypt, or CipherShed.
If possible, disabling core dumps and the swap file would take care of both problems. However, they would require administrator rights and may reduce functionality (less memory to use) and pulling RAM from a running system would still be a valid concern.

I don't think this is a valid suggestion, but, I can at least guess at the reason.
I think the motivation is wanting to make sure that you can erase all trace of the password in memory promptly and with certainty after it is used. With a char[] you could overwrite each element of the array with a blank or something for sure. You can't edit the internal value of a String that way.
But that alone isn't a good answer; why not just make sure a reference to the char[] or String doesn't escape? Then there's no security issue. But the thing is that String objects can be intern()ed in theory and kept alive inside the constant pool. I suppose using char[] forbids this possibility.

The answer has already been given, but I'd like to share an issue that I discovered lately with Java standard libraries. While they take great care now of replacing password strings with char[] everywhere (which of course is a good thing), other security-critical data seems to be overlooked when it comes to clearing it from memory.
I'm thinking of e.g. the PrivateKey class. Consider a scenario where you would load a private RSA key from a PKCS#12 file, using it to perform some operation. Now in this case, sniffing the password alone wouldn't help you much as long as physical access to the key file is properly restricted. As an attacker, you would be much better off if you obtained the key directly instead of the password. The desired information can be leaked manifold, core dumps, a debugger session or swap files are just some examples.
And as it turns out, there is nothing that lets you clear the private information of a PrivateKey from memory, because there's no API that lets you wipe the bytes that form the corresponding information.
This is a bad situation, as this paper describes how this circumstance could be potentially exploited.
The OpenSSL library for example overwrites critical memory sections before private keys are freed. Since Java is garbage-collected, we would need explicit methods to wipe and invalidate private information for Java keys, which are to be applied immediately after using the key.

As Jon Skeet states, there is no way except by using reflection.
However, if reflection is an option for you, you can do this.
// Requires: java.lang.reflect.Field, java.util.Arrays, java.util.Scanner
public static void main(String[] args) {
    System.out.println("please enter a password");
    // don't actually do this, this is an example only.
    Scanner in = new Scanner(System.in);
    String password = in.nextLine();
    usePassword(password);
    clearString(password);
    System.out.println("password: '" + password + "'");
}

private static void usePassword(String password) {
}

private static void clearString(String password) {
    try {
        // Works on Java 8 and earlier, where String.value is a char[].
        // From Java 9 on the field is a byte[], and newer JDKs also
        // restrict reflective access to java.lang internals.
        Field value = String.class.getDeclaredField("value");
        value.setAccessible(true);
        char[] chars = (char[]) value.get(password);
        Arrays.fill(chars, '*');
    } catch (Exception e) {
        throw new AssertionError(e);
    }
}
when run
please enter a password
hello world
password: '***********'
Note: if the String's char[] has been copied as a part of a GC cycle, there is a chance the previous copy is somewhere in memory.
This old copy wouldn't appear in a heap dump, but if you have direct access to the raw memory of the process you could see it. In general you should avoid anyone having such access.

There is nothing that char array gives you vs String unless you clean it up manually after use, and I haven't seen anyone actually doing that. So to me the preference of char[] vs String is a bit exaggerated.
Take a look at the widely used Spring Security library here and ask yourself - are Spring Security guys incompetent or char[] passwords just don't make much sense. When some nasty hacker grabs memory dumps of your RAM be sure s/he'll get all the passwords even if you use sophisticated ways to hide them.
However, Java changes all the time, and some scary features like String Deduplication feature of Java 8 might intern String objects without your knowledge. But that's a different conversation.

Edit: Coming back to this answer after a year of security research, I realize it makes the rather unfortunate implication that you would ever actually compare plaintext passwords. Please don't. Use a secure one-way hash with a salt and a reasonable number of iterations. Consider using a library: this stuff is hard to get right!
Original answer: What about the fact that String.equals() uses short-circuit evaluation, and is therefore vulnerable to a timing attack? It may be unlikely, but you could theoretically time the password comparison in order to determine the correct sequence of characters.
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        // Quits here if Strings are different lengths.
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            // Quits here at first different character.
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}
Some more resources on timing attacks:
A Lesson In Timing Attacks
A discussion about timing attacks over on Information Security Stack Exchange
And of course, the Timing Attack Wikipedia page
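For contrast with the short-circuiting equals, a comparison whose running time does not depend on where the inputs differ can be sketched with MessageDigest.isEqual, which current JDKs implement as a time-constant comparison. The class and method names are hypothetical, and hashing the inputs with plain SHA-256 first is only a way to get equal-length byte arrays for the demonstration; it is not a password storage scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ConstantTimeCompare {
    // SHA-256 digest of the UTF-8 bytes; used here only to get fixed-length inputs.
    static byte[] digest(String text) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256")
                .digest(text.getBytes(StandardCharsets.UTF_8));
    }

    // Compares two secrets without returning early at the first differing byte.
    static boolean sameSecret(String expected, String actual) throws NoSuchAlgorithmException {
        return MessageDigest.isEqual(digest(expected), digest(actual));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(sameSecret("token-123", "token-123")); // equal inputs
        System.out.println(sameSecret("token-123", "token-124")); // differ in the last character
    }
}
```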

Strings are immutable and cannot be altered once they have been created. Creating a password as a string will leave stray references to the password on the heap or on the String pool. Now if someone takes a heap dump of the Java process and carefully scans through he might be able to guess the passwords. Of course these non used strings will be garbage collected but that depends on when the GC kicks in.
On the other hand, char[] is mutable: as soon as the authentication is done, you can overwrite the array with any character, such as all M's or backslashes. Now even if someone takes a heap dump, he might not be able to get the passwords that are no longer in use. This gives you more control, in the sense that you clear the object's content yourself instead of waiting for the GC to do it.

String is immutable and it goes to the string pool. Once written, it cannot be overwritten.
char[] is an array which you should overwrite once you used the password and this is how it should be done:
char[] passw = request.getPassword().toCharArray();
if (comparePasswords(dbPassword, passw)) {
    allowUser = true;
}
// Wipe both arrays whether or not the comparison succeeded.
cleanPassword(passw);
cleanPassword(dbPassword);
passw = null;

// Requires: java.util.Arrays
private static void cleanPassword(char[] pass) {
    Arrays.fill(pass, '0');
}
One scenario where the attacker could use it is a crashdump—when the JVM crashes and generates a memory dump—you will be able to see the password.
That is not necessarily a malicious external attacker. This could be a support user that has access to the server for monitoring purposes. He/she could peek into a crashdump and find the passwords.

The short and straightforward answer would be because char[] is mutable while String objects are not.
Strings in Java are immutable objects. That is why they can't be modified once created, and therefore the only way for their contents to be removed from memory is to have them garbage collected. It will be only then when the memory freed by the object can be overwritten, and the data will be gone.
Now garbage collection in Java doesn't happen at any guaranteed interval. The String can thus persist in memory for a long time, and if a process crashes during this time, the contents of the string may end up in a memory dump or some log.
With a character array, you can read the password, finish working with it as soon as you can, and then immediately change the contents.

Case String:
String password = "ill stay in StringPool after Death !!!";
// some long code goes
// ...Now I want to remove traces of password
password = null;
password = "";
// the above attempts will change the value of the password variable,
// but the actual password can be traced from the String pool through a memory dump, if not garbage collected
Case CHAR ARRAY:
char[] passArray = {'p','a','s','s','w','o','r','d'};
// some long code goes
// ...Now I want to remove traces of password
for (int i = 0; i < passArray.length; i++) {
    passArray[i] = 'x';
}
// Now you have ACTUALLY DESTROYED the traces of the password in this array

A string in Java is immutable. So whenever a string is created, it will remain in memory until it is garbage collected. So anyone who has access to the memory can read the value of the string.
If the value of the string is modified then it will end up creating a new string. So both the original value and the modified value stay in the memory until it is garbage collected.
With the character array, the contents of the array can be modified or erased once the purpose of the password is served. The original contents of the array will not be found in memory after it is modified and even before the garbage collection kicks in.
Because of the security concern it is better to store password as a character array.

It is debatable as to whether you should use String or use Char[] for this purpose because both have their advantages and disadvantages. It depends on what the user needs.
Since Strings in Java are immutable, whenever someone tries to manipulate your string, a new object is created and the existing String remains unaffected. This could be seen as an advantage for storing a password as a String, but the object remains in memory even after use. So if anyone somehow got the memory location of the object, that person could easily trace the password stored at that location.
Char[] is mutable, which gives it the advantage that after use the programmer can explicitly clean the array or overwrite its values. Once that is done, the information that was stored in it can no longer be read from the array.
Based on the above circumstances, one can get an idea whether to go with String or to go with Char[] for their requirements.

A lot of the previous answers are great. There is another point which I am assuming (please correct me if I am wrong).
By default Java uses UTF-16 for storing strings. Using character arrays, char[] array, facilitates use of Unicode, regional characters, etc. This technique allows all character set to be respected equally for storing the passwords and henceforth will not initiate certain crypto issues due to character set confusion. Finally, using the character array, we can convert the password array to our desired character set string.

Is it safe (in matter of uniqueness) to use UUID to generate a unique identifier for specific string?

String myText;
UUID.nameUUIDFromBytes((myText).getBytes()).toString();
I am using above code to generate a representative for specific texts.
For example 'Moien' should always be represented by "e9cad067-56f3-3ea9-98d2-26e25778c48f", and nothing such as a project rebuild should be able to change that UUID.
The reason why I'm doing this is so that I don't want those specific texts to be readable(understandable) to human.
Note: I don't need the ability to regenerate the main text (e.g "Moien") after hashing .
I have an alternative way too :
MessageDigest digest = MessageDigest.getInstance("SHA-256");
byte[] hash = digest.digest((matcher.group(1)).getBytes("UTF-8"));
String a = Base64.encode(hash);
Which do you think is better for my problem?

UUID.nameUUIDFromBytes appears to basically just be MD5 hashing, with the result being represented as a UUID.
It feels clearer to me to use a base64-encoded hash explicitly, partly as you can then control which hash gets used - which could be relevant if collisions pose any sort of security risk. (SHA-256 is likely a better option than MD5 for exactly that reason.) The string will be longer from SHA-256 of course, but hopefully that's not a problem.
Note that in either case, I'd convert the string to bytes using a fixed encoding via StandardCharsets. Don't use the platform default (as per your first snippet) and prefer StandardCharsets over magic string values (as per your second snippet).
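Both options with a fixed encoding can be sketched like this. The class and method names are hypothetical, and java.util.Base64 is used here as an assumption, since the question does not say which Base64 class it imports. Either method returns the same string for the same text on every run and platform.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.UUID;

public class StableIds {
    // Name-based (version 3, MD5) UUID of the UTF-8 bytes of the text.
    static String md5Uuid(String text) {
        return UUID.nameUUIDFromBytes(text.getBytes(StandardCharsets.UTF_8)).toString();
    }

    // Base64-encoded SHA-256 digest of the UTF-8 bytes of the text.
    static String sha256Base64(String text) throws NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(text.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(hash);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(md5Uuid("Moien"));
        System.out.println(sha256Base64("Moien"));
    }
}
```

The UUID form is 36 characters long and the Base64 SHA-256 form is 44, which is the length trade-off mentioned above.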

Optimized way of doing string.endsWith() work.

I need to look for all web requests received by Application Server to check if the URL has extensions like .css, .gif, etc
Referred how tomcat is listening for every request and they pick the right configured Servlet to serve.
CharChunk , MessageBytes , Mapper
Here is my idea to implement:
Load all the extensions we like to compare and get the byte
representation of them.
Get a unique value for this extension by summing up the bytes in the byte array // eg: "css".getBytes()
Add the result value to Sorted List
Whenever we receive the request, get the byte representation of the URL // eg: "flipkart.com/eshopping/images/theme.css".getBytes()
Start summing the bytes from the byte array's last index and break when we encounter "." dot byte value
Search for existence of the value thus summed with the Sorted List // Use binary Search here
Kindly give your feedback about the implementation and any issues.
-With thanks, Krishna

This sounds way more complicated than it needs to be.
Use String.lastIndexOf to find the last dot in the URL
Use String.substring to get the extension based on that
Have a HashSet<String> for a set of supported extensions, or a HashMap<String, Whatever> if you want to map the extension to something else
I would be absolutely shocked to discover that this simple approach turned out to be a performance bottleneck - and indeed I suspect it would be more efficient than the approach you suggested, given that it doesn't require the entire URL to be converted into a byte array... (It's not clear why your approach uses byte arrays anyway instead of forming the hash from char values.)
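The simple approach described above could be sketched as follows. The class name, the particular set of extensions, and the decision to ignore case are assumptions for illustration. The sketch also assumes the URL carries no query string or fragment after the extension.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class StaticResourceFilter {
    // Hypothetical set of extensions to look for.
    private static final Set<String> EXTENSIONS =
            new HashSet<>(Arrays.asList("css", "gif", "js", "png"));

    // Returns true if the text after the last dot is a known extension.
    static boolean isStaticResource(String url) {
        int dot = url.lastIndexOf('.');
        if (dot < 0 || dot == url.length() - 1) {
            return false; // no dot, or nothing after it
        }
        String extension = url.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(extension);
    }

    public static void main(String[] args) {
        System.out.println(isStaticResource("flipkart.com/eshopping/images/theme.css"));
        System.out.println(isStaticResource("flipkart.com/eshopping/checkout"));
    }
}
```

The lookup is one backwards scan for the dot, one substring and one hash lookup per request, with no conversion of the URL to bytes.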
Fundamentally, my preferred approach to performance is:
Do up-front design and testing around things which are hard to change later, architecturally
For everything else:
Determine the performance criteria first so you know when you can stop
Write the simplest code that works
Test it with realistic data
If it doesn't perform well enough, use profilers (etc) to work out where the bottleneck is, and optimize that making sure that you can prove the benefits using your existing tests
