Why is using char[] secure while using String insecure? [duplicate] - java

In Swing, the password field has a getPassword() (returns char[]) method instead of the usual getText() (returns String) method. Similarly, I have come across a suggestion not to use String to handle passwords.
Why does String pose a threat to security when it comes to passwords?
It feels inconvenient to use char[].

Strings are immutable. That means once you've created the String, if another process can dump memory, there's no way (aside from reflection) you can get rid of the data before garbage collection kicks in.
With an array, you can explicitly wipe the data after you're done with it. You can overwrite the array with anything you like, and the password won't be present anywhere in the system, even before garbage collection.
So yes, this is a security concern - but even using char[] only reduces the window of opportunity for an attacker, and it's only for this specific type of attack.
As noted in the comments, it's possible that arrays being moved by the garbage collector will leave stray copies of the data in memory. I believe this is implementation-specific - the garbage collector may clear all memory as it goes, to avoid this sort of thing. Even if it does, there's still the time during which the char[] contains the actual characters as an attack window.
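The wiping pattern described above can be sketched as follows. This is a minimal illustration, not a complete login flow: the class name PasswordWipeSketch and the authenticate method are hypothetical stand-ins, and the sketch only shortens the window in which the array holds the password. It does nothing about copies the garbage collector may have left elsewhere.

```java
import java.util.Arrays;

final class PasswordWipeSketch {

    // Hypothetical stand-in for real authentication logic.
    static boolean authenticate(char[] password) {
        return password.length > 0;
    }

    // Uses the password, then overwrites it even if authenticate() throws.
    static boolean loginAndWipe(char[] password) {
        try {
            return authenticate(password);
        } finally {
            // Overwrite every element so this array no longer holds the secret.
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', 'e', 'c', 'r', 'e', 't'};
        boolean ok = loginAndWipe(password);
        System.out.println("authenticated: " + ok);
        System.out.println("wiped: " + Arrays.equals(password, new char[password.length]));
    }
}
```

Putting the wipe in a finally block means the array is cleared on every exit path, including exceptions.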

While other suggestions here seem valid, there is one other good reason. With plain String you have much higher chances of accidentally printing the password to logs, monitors or some other insecure place. char[] is less vulnerable.
Consider this:
public static void main(String[] args) {
    Object pw = "Password";
    System.out.println("String: " + pw);
    pw = "Password".toCharArray();
    System.out.println("Array: " + pw);
}
Prints:
String: Password
Array: [C@5829428e

To quote an official document, the Java Cryptography Architecture guide says this about char[] vs. String passwords (about password-based encryption, but this is more generally about passwords of course):
It would seem logical to collect and store the password in an object
of type java.lang.String. However, here's the caveat: Objects of
type String are immutable, i.e., there are no methods defined that
allow you to change (overwrite) or zero out the contents of a String
after usage. This feature makes String objects unsuitable for
storing security sensitive information such as user passwords. You
should always collect and store security sensitive information in a
char array instead.
Guideline 2-2 of the Secure Coding Guidelines for the Java Programming Language, Version 4.0 also says something similar (although it is originally in the context of logging):
Guideline 2-2: Do not log highly sensitive information
Some information, such as Social Security numbers (SSNs) and
passwords, is highly sensitive. This information should not be kept
for longer than necessary nor where it may be seen, even by
administrators. For instance, it should not be sent to log files and
its presence should not be detectable through searches. Some transient
data may be kept in mutable data structures, such as char arrays, and
cleared immediately after use. Clearing data structures has reduced
effectiveness on typical Java runtime systems as objects are moved in
memory transparently to the programmer.
This guideline also has implications for implementation and use of
lower-level libraries that do not have semantic knowledge of the data
they are dealing with. As an example, a low-level string parsing
library may log the text it works on. An application may parse an SSN
with the library. This creates a situation where the SSNs are
available to administrators with access to the log files.
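For the password-based encryption case that the JCA guide talks about, the JDK's PBEKeySpec accepts a char[] and offers clearPassword() to wipe its internal copy. The following is a rough sketch under assumptions: the class name PbeKeySketch, the choice of PBKDF2WithHmacSHA256, and the key length are illustrative, and the iteration count is left to the caller.

```java
import java.security.GeneralSecurityException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class PbeKeySketch {

    // Derives a 256-bit key from the password, then clears the spec's internal copy.
    static byte[] deriveKey(char[] password, byte[] salt, int iterations) {
        // PBEKeySpec keeps its own copy of the password array.
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } finally {
            // Wipes the copy held inside the spec; the caller's array is untouched.
            spec.clearPassword();
        }
    }
}
```

The caller remains responsible for overwriting its own char[] (for example with Arrays.fill) once the key has been derived.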

Character arrays (char[]) can be cleared after use by setting each character to zero; Strings cannot. If someone can somehow see the memory image, they can see a password in plain text if Strings are used, but if char[] is used and the data has been overwritten with zeros, the password is no longer in that array.

Some people believe that you have to overwrite the memory used to store the password once you no longer need it. This reduces the time window an attacker has to read the password from your system and completely ignores the fact that the attacker already needs enough access to hijack the JVM memory to do this. An attacker with that much access can catch your key events making this completely useless (AFAIK, so please correct me if I am wrong).
Update
Thanks to the comments I have to update my answer. Apparently there are two cases where this can add a (very) minor security improvement as it reduces the time a password could land on the hard drive. Still I think it's overkill for most use cases.
Your target system may be badly configured or you have to assume it is and you have to be paranoid about core dumps (can be valid if the systems are not managed by an administrator).
Your software has to be overly paranoid to prevent data leaks with the attacker gaining access to the hardware - using things like TrueCrypt (discontinued), VeraCrypt, or CipherShed.
If possible, disabling core dumps and the swap file would take care of both problems. However, they would require administrator rights and may reduce functionality (less memory to use) and pulling RAM from a running system would still be a valid concern.

I don't think this is a valid suggestion, but, I can at least guess at the reason.
I think the motivation is wanting to make sure that you can erase all trace of the password in memory promptly and with certainty after it is used. With a char[] you could overwrite each element of the array with a blank or something for sure. You can't edit the internal value of a String that way.
But that alone isn't a good answer; why not just make sure a reference to the char[] or String doesn't escape? Then there's no security issue. But the thing is that String objects can be intern()ed in theory and kept alive inside the constant pool. I suppose using char[] forbids this possibility.

The answer has already been given, but I'd like to share an issue that I discovered lately with Java standard libraries. While they take great care now of replacing password strings with char[] everywhere (which of course is a good thing), other security-critical data seems to be overlooked when it comes to clearing it from memory.
I'm thinking of e.g. the PrivateKey class. Consider a scenario where you would load a private RSA key from a PKCS#12 file, using it to perform some operation. Now in this case, sniffing the password alone wouldn't help you much as long as physical access to the key file is properly restricted. As an attacker, you would be much better off if you obtained the key directly instead of the password. The desired information can leak in many ways; core dumps, a debugger session, or swap files are just some examples.
And as it turns out, there is nothing that lets you clear the private information of a PrivateKey from memory, because there's no API that lets you wipe the bytes that form the corresponding information.
This is a bad situation, as this paper describes how this circumstance could be potentially exploited.
The OpenSSL library for example overwrites critical memory sections before private keys are freed. Since Java is garbage-collected, we would need explicit methods to wipe and invalidate private information for Java keys, which are to be applied immediately after using the key.

As Jon Skeet states, there is no way except by using reflection.
However, if reflection is an option for you, you can do this.
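import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Scanner;
public static void main(String[] args) {
    System.out.println("please enter a password");
    // don't actually do this, this is an example only.
    Scanner in = new Scanner(System.in);
    String password = in.nextLine();
    usePassword(password);
    clearString(password);
    System.out.println("password: '" + password + "'");
}
private static void usePassword(String password) {
}
// Overwrites the String's backing array in place via reflection.
// This relies on String.value being a char[], which holds on Java 8 and earlier;
// from Java 9 on the field is a byte[], and recent JDKs restrict this reflective access.
private static void clearString(String password) {
    try {
        Field value = String.class.getDeclaredField("value");
        value.setAccessible(true);
        char[] chars = (char[]) value.get(password);
        Arrays.fill(chars, '*');
    } catch (Exception e) {
        throw new AssertionError(e);
    }
}
When run, this prints: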
please enter a password
hello world
password: '***********'
Note: if the String's char[] has been copied as a part of a GC cycle, there is a chance the previous copy is somewhere in memory.
This old copy wouldn't appear in a heap dump, but if you have direct access to the raw memory of the process you could see it. In general you should avoid anyone having such access.

There is nothing that char array gives you vs String unless you clean it up manually after use, and I haven't seen anyone actually doing that. So to me the preference of char[] vs String is a bit exaggerated.
Take a look at the widely used Spring Security library here and ask yourself: are the Spring Security guys incompetent, or do char[] passwords just not make much sense? When some nasty hacker grabs memory dumps of your RAM, be sure s/he'll get all the passwords even if you use sophisticated ways to hide them.
However, Java changes all the time, and some scary features like the String Deduplication feature of Java 8 might make String objects share their underlying character data without your knowledge. But that's a different conversation.

Edit: Coming back to this answer after a year of security research, I realize it makes the rather unfortunate implication that you would ever actually compare plaintext passwords. Please don't. Use a secure one-way hash with a salt and a reasonable number of iterations. Consider using a library: this stuff is hard to get right!
Original answer: What about the fact that String.equals() uses short-circuit evaluation, and is therefore vulnerable to a timing attack? It may be unlikely, but you could theoretically time the password comparison in order to determine the correct sequence of characters.
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        // Quits here if Strings are different lengths.
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            // Quits here at first different character.
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}
Some more resources on timing attacks:
A Lesson In Timing Attacks
A discussion about timing attacks over on Information Security Stack Exchange
And of course, the Timing Attack Wikipedia page
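One way to avoid the short-circuit behaviour shown above is to compare fixed-length digests with MessageDigest.isEqual, which the JDK documents as taking time that depends only on the length of its input. The sketch below is an assumption-laden illustration (the class and method names are made up here), and it is meant for values such as tokens or already-derived hashes, not as an endorsement of comparing plaintext passwords.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ConstantTimeCompareSketch {

    static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hashing both sides first gives equal-length inputs, so the comparison
    // does not reveal the length of either secret or the position of a mismatch.
    static boolean secretsEqual(byte[] a, byte[] b) {
        return MessageDigest.isEqual(sha256(a), sha256(b));
    }
}
```

Unlike String.equals(), this does not return early at the first differing byte.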

Strings are immutable and cannot be altered once they have been created. Creating a password as a String will leave stray copies of the password on the heap or in the String pool. If someone takes a heap dump of the Java process and carefully scans through it, they might be able to recover the passwords. Of course these unused strings will be garbage collected, but that depends on when the GC kicks in.
On the other hand, char[] is mutable: as soon as the authentication is done, you can overwrite the array with any character, such as all M's or backslashes. Now even if someone takes a heap dump, they might not be able to get passwords that are no longer in use. This gives you more control, in the sense that you clear the object's content yourself instead of waiting for the GC to do it.

String is immutable, and it may end up in the string pool (literals and interned strings do). Once written, it cannot be overwritten.
char[] is an array which you should overwrite once you have used the password, and this is how it should be done:
// Note: if getPassword() returns a String, an immutable copy already exists at this point.
char[] passw = request.getPassword().toCharArray();
try {
    if (comparePasswords(dbPassword, passw)) {
        allowUser = true;
    }
} finally {
    // Wipe both arrays whether or not the comparison succeeded.
    cleanPassword(passw);
    cleanPassword(dbPassword);
    passw = null;
}
private static void cleanPassword(char[] pass) {
    Arrays.fill(pass, '0');
}
One scenario where the attacker could use it is a crashdump—when the JVM crashes and generates a memory dump—you will be able to see the password.
That is not necessarily a malicious external attacker. This could be a support user that has access to the server for monitoring purposes. He/she could peek into a crashdump and find the passwords.

The short and straightforward answer would be because char[] is mutable while String objects are not.
Strings in Java are immutable objects. That is why they can't be modified once created, and therefore the only way for their contents to be removed from memory is to have them garbage collected. Only then can the memory freed by the object be overwritten, and only once it is overwritten is the data gone.
Now garbage collection in Java doesn't happen at any guaranteed interval. The String can thus persist in memory for a long time, and if a process crashes during this time, the contents of the string may end up in a memory dump or some log.
With a character array, you can read the password, finish working with it as soon as you can, and then immediately change the contents.

Case String:
String password = "ill stay in StringPool after Death !!!";
// some long code goes
// ...Now I want to remove traces of password
password = null;
password = "";
// the assignments above only change what the variable password refers to,
// but the actual password can be traced from the String pool through a memory dump, if not garbage collected
Case CHAR ARRAY:
char[] passArray = {'p','a','s','s','w','o','r','d'};
// some long code goes
// ...Now I want to remove traces of password
for (int i = 0; i < passArray.length; i++) {
    passArray[i] = 'x';
}
// Now you have ACTUALLY overwritten the password in this array

A string in Java is immutable. So whenever a string is created, it will remain in memory until it is garbage collected. So anyone who has access to the memory can read the value of the string.
If the value of the string is modified then it will end up creating a new string. So both the original value and the modified value stay in the memory until it is garbage collected.
With the character array, the contents of the array can be modified or erased once the purpose of the password is served. The original contents of the array will not be found in memory after it is modified and even before the garbage collection kicks in.
Because of this security concern, it is better to store a password as a character array.

It is debatable whether you should use String or char[] for this purpose, because both have their advantages and disadvantages. It depends on what the user needs.
Since Strings in Java are immutable, whenever someone tries to manipulate your string, a new object is created and the existing String remains unaffected. This could be seen as an advantage for storing a password as a String, but the object remains in memory even after use. So if anyone somehow got the memory location of the object, that person could easily read the password stored at that location.
char[] is mutable, which has the advantage that after its usage the programmer can explicitly clear the array or overwrite its values. So once it is no longer needed it can be wiped, and the information you had stored can no longer be read from that array.
Based on the above circumstances, one can get an idea of whether to go with String or with char[] for their requirements.

A lot of the previous answers are great. There is another point which I am assuming (please correct me if I am wrong).
By default Java uses UTF-16 for storing strings. Using a character array (char[]) facilitates the use of Unicode, regional characters, etc. This technique allows all character sets to be treated equally when storing passwords, and so avoids certain crypto issues caused by character-set confusion. Finally, from the character array we can convert the password to our desired character set.
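As a rough sketch of that conversion, the char[] can be encoded to bytes in a chosen charset without ever creating a String. The class name CharsToBytesSketch is hypothetical and UTF-8 is just an example target; the temporary encoding buffer is wiped on the assumption that it is heap-backed, which hasArray() checks.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsToBytesSketch {

    // Encodes the characters as UTF-8 without going through java.lang.String.
    static byte[] toUtf8(char[] chars) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        // Wipe the intermediate buffer so only 'result' still holds the bytes.
        if (encoded.hasArray()) {
            Arrays.fill(encoded.array(), (byte) 0);
        }
        return result;
    }
}
```

The returned byte[] is itself sensitive and should be overwritten by the caller when it is no longer needed.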

Related

How to get reproducible Pbkdf2PasswordEncoder output in spring boot?

When running the encode method of a spring security Pbkdf2PasswordEncoder instance multiple times, the method returns different results for the same inputs. The snippet
String salt = "salt";
int iterations = 100000;
int hashWidth = 128;
String clearTextPassword = "secret_password";
Pbkdf2PasswordEncoder pbkdf2PasswordEncoder = new Pbkdf2PasswordEncoder(salt, iterations, hashWidth);
String derivedKey = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey: " + derivedKey);
String derivedKey2 = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey2: " + derivedKey2);
results in an output like
derivedKey: b6eb7098ee52cbc4c99c4316be0343873575ed4fa4445144
derivedKey2: 2bef620cc0392f9a5064c0d07d182ca826b6c2b83ac648dc
The expected output would be the same values for both derivations. In addition, when running the application another time, the outputs would be different again. The different output behavior also appears for two different Pbkdf2PasswordEncoder instances with the same inputs. The encoding method behaves more like a random number generator. The Spring Boot version used is 2.6.1; the spring-security-core version is 5.6.0.
Is there any obvious setting that I am missing? The documentation does not give additional hints. Is there a conceptual error in the spring boot project set up?
Is there any obvious setting that I am missing?
Yes. The documentation you linked to is fairly clear, I guess you missed it. That string you pass to the Pbkdf2PasswordEncoder constructor is not a salt!
The encoder generates a salt for you, and generates a salt every time you ask it to encode something, which is how you're supposed to do this stuff (see footnote 1). (The returned string contains both this randomly generated salt as well as the result of applying the encoding, in a single string). Because a new salt is made every time you call .encode, the .encode call returns a different value every time you call it, even if you call it with the same inputs.
The string you pass in is merely 'another secret' - which can sometimes be useful (for example, if you can store this secret in a secure enclave, or it is sent by another system / entered upon boot and never stored on disk, then if somebody runs off with your server they can't check passwords. PBKDF means that if they did have the secret the checking will be very slow, but if they don't, they can't even start).
This seems like a solid plan - otherwise people start doing silly things. Such as using the string "salt" as the salt for all encodes :)
The real problem is:
The expected output would be the same values for both derivations
No. Your expectation is broken. Whatever code you are writing that made this assumption needs to be tossed. For example, this is how you are intended to use the encoder:
When a user creates a new password, you use .encode and store what this method returns in a database.
When a user logs in, you take what they typed, and you take the string from your database (the one .encode sent you) and call .matches.
It sounds like you want to run .encode again and see if the result matches. That is not how you're supposed to use this code.
Footnote 1: The why
You also need to review your security policies. The idea you have in your head of how this stuff works is thoroughly broken. Imagine it worked like you wanted, and there is a single salt used for all password encodes. Then if you hand me a dump of your database, I can trivially crack about 5% of all accounts within about 10 minutes!!
How? Well, I sort all hashed strings and then count occurrences. There will be a bunch of duplicate strings inside. I can then take all users whose passhash is in this top 10 of most common hashes and then log in as them. Because their password is iloveyou, welcome123, princess, dragon, 12345678, alexsawesomeservice!, etcetera - the usual crowd of extremely oft-used passwords. How do I know that's their password? Because their password is the same as that of many other users on your system.
Furthermore, if none of the common passwords work, I can tell that likely these are really different accounts from the same user.
These are all things that I definitely should not be able to derive from the raw data. The solution is, naturally, to have a unique salt for everything, and then store the salt in the DB along with the hash value so that one can 'reconstruct' when a user tries to log in. These tools try to make your life easy by doing the work for you. This is a good idea, because errors in security implementations (such as forgetting to salt, or using the same salt for all users) are not (easily) unit testable, so a well meaning developer writes code, it seems to work, a casual glance at the password hashes seem to indicate "it is working" (the hashes seem random enough to the naked eye), and then it gets deployed, security issue and all.
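To make the "unique salt, stored next to the hash" idea concrete, here is a JDK-only sketch. It is not Spring Security's implementation: the class name SaltedHashSketch, the "salt:hash" storage format, and the iteration count are all assumptions chosen for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class SaltedHashSketch {
    private static final int ITERATIONS = 10_000; // illustrative value only
    private static final SecureRandom RANDOM = new SecureRandom();

    private static byte[] pbkdf2(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } finally {
            spec.clearPassword();
        }
    }

    // A fresh random salt per call, so equal passwords give different stored strings.
    static String encode(char[] password) {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        Base64.Encoder b64 = Base64.getEncoder();
        return b64.encodeToString(salt) + ":" + b64.encodeToString(pbkdf2(password, salt));
    }

    // Re-derives the hash with the stored salt and compares the results.
    static boolean matches(char[] password, String stored) {
        String[] parts = stored.split(":");
        if (parts.length != 2) {
            return false;
        }
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        return MessageDigest.isEqual(expected, pbkdf2(password, salt));
    }
}
```

Calling encode twice on the same password yields two different strings, yet matches accepts both, which mirrors the encode-then-matches workflow described above.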

Best way to optimize string data in an application that allocates quite a bit of it

I have an application that uses a ton of String objects. One of my objects (let's call it Person) contains 9 of them. The data that is written to each String object is never written more than once, but will be read several times after. There will be several hundred thousand or so Person objects at a given time, and many of these Person objects will share first name, last name, etc.
I am trying to think of immediate ways to reduce the amount of memory that is consumed by the Person object, but I am no expert when it comes to how Java manages its memory underneath.
Before I go down this rabbit hole, I would like to know what drawbacks there would be if I went down these paths, and if it even makes sense in the first place:
1. Using StringBuilder or StringBuffer solely because of the trimToSize() method, which would allow me to reduce the number of allocated bytes used in the string.
2. Storing the strings as byte[] arrays and providing a getter that would convert the byte[] to String and a setter that would accept a String and convert it to byte[]. Data is being read quite a bit, so would this be too expensive?
3. Creating a hash table for (let's just say) "names" that would prevent duplicate allocations (using a pointer) for the same name over and over (there could be thousands of names with 10+ characters).
Before I pointlessly head down any of these roads, does it make sense to do? Maybe Java is already reducing String allocations and checking for duplicates?
I don't mind a good read either. I have found some documentation but nothing that explores to this depth.
Obviously StringBuilder and StringBuffer couldn't help in this case. String is an immutable object, and these two classes were introduced for building Strings, not for storing them. You may (and in most cases should) still use StringBuilder when you concatenate Strings, insert characters in the middle, or delete characters.
In my opinion, the second option could lead to increased memory consumption, because a new String will be created every time the byte[] is converted to a String.
A handwritten StringDeduplicator is a very reasonable solution, especially if you are stuck with Java 5, 6, or 7.
Java 8/9 has a String Deduplication option. By default, this option is disabled. To use it in Java 8, you must enable the G1 garbage collector, while in Java 9 G1 is the default.
-XX:+UseStringDeduplication
Regarding String Deduplication, see:
JEP 192: String Deduplication in G1
Java 8 Update 20 Release Notes
Other Stack Overflow posts
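A handwritten deduplicator of the kind mentioned above could look roughly like this. It is only a sketch: the class name StringCanonicalizerSketch is made up, and the map keeps every distinct string alive for as long as the pool itself is reachable, which may or may not suit the application.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class StringCanonicalizerSketch {
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns one shared instance for every group of equal strings.
    String canonical(String s) {
        if (s == null) {
            return null;
        }
        // putIfAbsent returns the previously stored instance, or null if s was added.
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        StringCanonicalizerSketch names = new StringCanonicalizerSketch();
        String a = names.canonical(new String("Smith"));
        String b = names.canonical(new String("Smith"));
        System.out.println("same instance: " + (a == b));
    }
}
```

Each Person field would be passed through canonical(...) once when it is set, so equal names end up pointing at a single String object.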

copymethod() with Java string

I have a requirement to use the copyMemory() method of the Unsafe class on a string. I came across the link http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/, where I found the following example:
String password = new String("l00k#myHor$e");
String fake = new String(password.replaceAll(".", "?"));
System.out.println(password); // l00k#myHor$e
System.out.println(fake); // ????????????
getUnsafe().copyMemory(
    fake, 0L, null, toAddress(password), sizeOf(password));
System.out.println(password); // ????????????
System.out.println(fake); // ????????????
static long toAddress(Object obj) {
    Object[] array = new Object[] {obj};
    long baseOffset = getUnsafe().arrayBaseOffset(Object[].class);
    return normalize(getUnsafe().getInt(array, baseOffset));
}
private static long normalize(int value) {
    if (value >= 0) return value;
    return (~0L >>> 32) & value;
}
I tried the example but I got an IllegalArgumentException. Can anyone please help in getting this example to work?
My advice: Don't do this!
If you need to erase a String, you can do it "safely" by using reflection to dig out the String object's private chars array and filling it with NUL characters. Of course, you need to be sure that the string you are erasing is not shared with other code; e.g. it has not been interned.
The code you are copying looks broken. For a start, it seems to be assuming that references fit in an int ... which is not so for a 64 bit JVM. I wouldn't trust that code. I wouldn't even try to fix it. This is simply the wrong approach.
Unsafe should only be used by people who really, really know what they are doing. Copying someone else's code is not a substitute for knowledge.
And in this case, I don't think that the original code erased the characters properly. In fact, I think it merely dismantles the String object, leaving the char[] containing the super-secret password intact in the heap. The blog post actually hints at this.
In fact, it is next to impossible to guarantee that you have erased a String. Even if you succeed in erasing the char[] containing the characters, you can't be sure that is the only copy of the data. For instance, if the string has been relocated by the GC, there could still be an old copy of the characters in memory at the original location. The characters will eventually be overwritten but there are no guarantees on when that will happen.
Probably the best you can do is to use JNI / JNA (or Unsafe) to allocate some off-heap memory (that won't be relocated by the GC) and overwrite it with zeros before you release it. Obviously, you can't do this with a String.
And even then ... there could be some stale pages on the paging device containing the characters. Or someone with sufficient privilege could set a breakpoint and read the secret out of memory.
My advice: just make sure that the >>platform<< is properly secured, both physically and from access over the network.
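For readers who still want to try the off-heap idea mentioned above, a direct ByteBuffer is a simpler route than JNI, JNA, or Unsafe; note that this swaps in a different mechanism from the ones the answer names. The sketch below assumes that direct buffer contents live outside the garbage-collected heap, which the ByteBuffer documentation says they may. The class name OffHeapSecretSketch is hypothetical, and none of this protects against swapped-out pages or a privileged debugger.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class OffHeapSecretSketch {

    // Copies the secret into a direct buffer and wipes the on-heap source array.
    static ByteBuffer store(byte[] secret) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(secret.length);
        buffer.put(secret);
        buffer.flip();
        Arrays.fill(secret, (byte) 0);
        return buffer;
    }

    // Overwrites the whole buffer with zeros before it is released.
    static void wipe(ByteBuffer buffer) {
        buffer.clear(); // resets position and limit; does not erase any data
        for (int i = 0; i < buffer.capacity(); i++) {
            buffer.put(i, (byte) 0);
        }
    }
}
```

The secret still has to pass through an ordinary array on its way in, so this narrows the exposure rather than removing it.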

What should I do after using a password to log in to a system? [duplicate]

In Swing, the password field has a getPassword() (returns char[]) method instead of the usual getText() (returns String) method. Similarly, I have come across a suggestion not to use String to handle passwords.
Why does String pose a threat to security when it comes to passwords?
It feels inconvenient to use char[].
Strings are immutable. That means once you've created the String, if another process can dump memory, there's no way (aside from reflection) you can get rid of the data before garbage collection kicks in.
With an array, you can explicitly wipe the data after you're done with it. You can overwrite the array with anything you like, and the password won't be present anywhere in the system, even before garbage collection.
So yes, this is a security concern - but even using char[] only reduces the window of opportunity for an attacker, and it's only for this specific type of attack.
As noted in the comments, it's possible that arrays being moved by the garbage collector will leave stray copies of the data in memory. I believe this is implementation-specific - the garbage collector may clear all memory as it goes, to avoid this sort of thing. Even if it does, there's still the time during which the char[] contains the actual characters as an attack window.
While other suggestions here seem valid, there is one other good reason. With plain String you have much higher chances of accidentally printing the password to logs, monitors or some other insecure place. char[] is less vulnerable.
Consider this:
public static void main(String[] args) {
Object pw = "Password";
System.out.println("String: " + pw);
pw = "Password".toCharArray();
System.out.println("Array: " + pw);
}
Prints:
String: Password
Array: [C@5829428e
To quote an official document, the Java Cryptography Architecture guide says this about char[] vs. String passwords (about password-based encryption, but this is more generally about passwords of course):
It would seem logical to collect and store the password in an object
of type java.lang.String. However, here's the caveat: Objects of
type String are immutable, i.e., there are no methods defined that
allow you to change (overwrite) or zero out the contents of a String
after usage. This feature makes String objects unsuitable for
storing security sensitive information such as user passwords. You
should always collect and store security sensitive information in a
char array instead.
Guideline 2-2 of the Secure Coding Guidelines for the Java Programming Language, Version 4.0 also says something similar (although it is originally in the context of logging):
Guideline 2-2: Do not log highly sensitive information
Some information, such as Social Security numbers (SSNs) and
passwords, is highly sensitive. This information should not be kept
for longer than necessary nor where it may be seen, even by
administrators. For instance, it should not be sent to log files and
its presence should not be detectable through searches. Some transient
data may be kept in mutable data structures, such as char arrays, and
cleared immediately after use. Clearing data structures has reduced
effectiveness on typical Java runtime systems as objects are moved in
memory transparently to the programmer.
This guideline also has implications for implementation and use of
lower-level libraries that do not have semantic knowledge of the data
they are dealing with. As an example, a low-level string parsing
library may log the text it works on. An application may parse an SSN
with the library. This creates a situation where the SSNs are
available to administrators with access to the log files.
Character arrays (char[]) can be cleared after use by setting each character to zero and Strings not. If someone can somehow see the memory image, they can see a password in plain text if Strings are used, but if char[] is used, after purging data with 0's, the password is secure.
Some people believe that you have to overwrite the memory used to store the password once you no longer need it. This reduces the time window an attacker has to read the password from your system and completely ignores the fact that the attacker already needs enough access to hijack the JVM memory to do this. An attacker with that much access can catch your key events making this completely useless (AFAIK, so please correct me if I am wrong).
Update
Thanks to the comments I have to update my answer. Apparently there are two cases where this can add a (very) minor security improvement as it reduces the time a password could land on the hard drive. Still I think it's overkill for most use cases.
Your target system may be badly configured or you have to assume it is and you have to be paranoid about core dumps (can be valid if the systems are not managed by an administrator).
Your software has to be overly paranoid to prevent data leaks with the attacker gaining access to the hardware - using things like TrueCrypt (discontinued), VeraCrypt, or CipherShed.
If possible, disabling core dumps and the swap file would take care of both problems. However, they would require administrator rights and may reduce functionality (less memory to use) and pulling RAM from a running system would still be a valid concern.
I don't think this is a valid suggestion, but, I can at least guess at the reason.
I think the motivation is wanting to make sure that you can erase all trace of the password in memory promptly and with certainty after it is used. With a char[] you could overwrite each element of the array with a blank or something for sure. You can't edit the internal value of a String that way.
But that alone isn't a good answer; why not just make sure a reference to the char[] or String doesn't escape? Then there's no security issue. But the thing is that String objects can be intern()ed in theory and kept alive inside the constant pool. I suppose using char[] forbids this possibility.
The answer has already been given, but I'd like to share an issue that I discovered lately with Java standard libraries. While they take great care now of replacing password strings with char[] everywhere (which of course is a good thing), other security-critical data seems to be overlooked when it comes to clearing it from memory.
I'm thinking of, e.g., the PrivateKey class. Consider a scenario where you load a private RSA key from a PKCS#12 file and use it to perform some operation. In this case, sniffing the password alone wouldn't help you much as long as physical access to the key file is properly restricted. As an attacker, you would be much better off obtaining the key directly instead of the password. The key can leak in many ways: core dumps, a debugger session, and swap files are just some examples.
And as it turns out, there is nothing that lets you clear the private information of a PrivateKey from memory, because there's no API that lets you wipe the bytes that form the corresponding information.
This is a bad situation, as this paper describes how this circumstance could be potentially exploited.
The OpenSSL library for example overwrites critical memory sections before private keys are freed. Since Java is garbage-collected, we would need explicit methods to wipe and invalidate private information for Java keys, which are to be applied immediately after using the key.
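As a sketch of what such explicit wiping could look like for a secret holder you write yourself: the javax.security.auth.Destroyable interface declares destroy() and isDestroyed(), and since Java 8 PrivateKey extends it. The interface's default destroy() simply throws DestroyFailedException, though, so whether a particular key class really wipes anything has to be checked rather than assumed. WipeableSecret and tryDestroy below are hypothetical names, not JDK classes.

```java
import javax.security.auth.DestroyFailedException;
import javax.security.auth.Destroyable;
import java.util.Arrays;

final class WipeableSecret implements Destroyable {
    private final byte[] material;
    private boolean destroyed;

    WipeableSecret(byte[] material) {
        // Keep a private copy so the holder controls the only array it wipes.
        this.material = material.clone();
    }

    // Returns a copy of the secret; refuses once the secret has been wiped.
    byte[] copy() {
        if (destroyed) {
            throw new IllegalStateException("secret has been destroyed");
        }
        return material.clone();
    }

    @Override
    public void destroy() {
        Arrays.fill(material, (byte) 0);
        destroyed = true;
    }

    @Override
    public boolean isDestroyed() {
        return destroyed;
    }

    // Attempts to destroy any Destroyable and reports whether it actually worked.
    static boolean tryDestroy(Destroyable d) {
        try {
            d.destroy();
        } catch (DestroyFailedException e) {
            return false; // the interface default (and some implementations) land here
        }
        return d.isDestroyed();
    }
}
```

Calling tryDestroy on a key object is a cheap way to find out whether its implementation supports wiping at all.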
As Jon Skeet states, there is no way except by using reflection.
However, if reflection is an option for you, you can do this.
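import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Scanner;

// ClearStringDemo is just a wrapper name so the snippet compiles.
public class ClearStringDemo {
    public static void main(String[] args) {
        System.out.println("please enter a password");
        // don't actually do this, this is an example only.
        Scanner in = new Scanner(System.in);
        String password = in.nextLine();
        usePassword(password);
        clearString(password);
        System.out.println("password: '" + password + "'");
    }

    private static void usePassword(String password) {
        // placeholder for whatever consumes the password
    }

    // Overwrites the String's backing array in place via reflection.
    // Assumes a JDK where String.value is a char[] (Java 8 and earlier); from Java 9 on
    // the field is a byte[], and recent JDKs block this reflective access by default.
    private static void clearString(String password) {
        try {
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true);
            char[] chars = (char[]) value.get(password);
            Arrays.fill(chars, '*');
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }
}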
When run, this prints:
please enter a password
hello world
password: '***********'
Note: if the String's char[] has been copied as a part of a GC cycle, there is a chance the previous copy is somewhere in memory.
This old copy wouldn't appear in a heap dump, but if you have direct access to the raw memory of the process you could see it. In general you should avoid anyone having such access.
There is nothing that a char array gives you over a String unless you clean it up manually after use, and I haven't seen anyone actually doing that. So to me the preference for char[] over String is a bit exaggerated.
Take a look at the widely used Spring Security library here and ask yourself: are the Spring Security developers incompetent, or do char[] passwords just not make much sense? When an attacker grabs a memory dump of your RAM, you can be sure they'll get all the passwords even if you use sophisticated ways to hide them.
However, Java changes all the time, and some scary features like the String Deduplication feature of Java 8 might make String objects share their backing arrays without your knowledge. But that's a different conversation.
Edit: Coming back to this answer after a year of security research, I realize it makes the rather unfortunate implication that you would ever actually compare plaintext passwords. Please don't. Use a secure one-way hash with a salt and a reasonable number of iterations. Consider using a library: this stuff is hard to get right!
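As one possible illustration of that advice, the sketch below derives a salted, iterated hash with the JDK's built-in PBKDF2 support. The class name, the iteration count, and the salt and key sizes are assumptions chosen for the example, not recommendations from this answer; a maintained password-hashing library is still the safer choice.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class PasswordHashSketch {
    static final int ITERATIONS = 100_000; // assumed value; tune for your hardware
    static final int KEY_BITS = 256;       // assumed output size

    // Generates a fresh random 16-byte salt (size is an assumption).
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives the hash from a char[] password; the spec's internal copy is cleared afterwards.
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword();
        }
    }

    // Compares hashes rather than plaintext passwords.
    static boolean verify(char[] candidate, byte[] salt, byte[] expectedHash) {
        byte[] actual = hash(candidate, salt);
        try {
            return MessageDigest.isEqual(expectedHash, actual);
        } finally {
            Arrays.fill(actual, (byte) 0);
        }
    }
}
```

The caller stores the salt and the hash, never the password, and remains responsible for wiping its own char[] once hash or verify returns.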
Original answer: What about the fact that String.equals() uses short-circuit evaluation, and is therefore vulnerable to a timing attack? It may be unlikely, but you could theoretically time the password comparison in order to determine the correct sequence of characters.
public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
if (anObject instanceof String) {
String anotherString = (String)anObject;
int n = value.length;
// Quits here if Strings are different lengths.
if (n == anotherString.value.length) {
char v1[] = value;
char v2[] = anotherString.value;
int i = 0;
// Quits here at first different character.
while (n-- != 0) {
if (v1[i] != v2[i])
return false;
i++;
}
return true;
}
}
return false;
}
Some more resources on timing attacks:
A Lesson In Timing Attacks
A discussion about timing attacks over on Information Security Stack Exchange
And of course, the Timing Attack Wikipedia page
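For contrast with the short-circuiting loop in String.equals() shown above, a comparison that always examines every character might look like the sketch below. ConstantTimeCompare is a hypothetical helper. It still returns early on a length mismatch, so it hides where two equal-length inputs differ but not their lengths, and JIT optimizations mean its timing behaviour is not strictly guaranteed. The JDK's MessageDigest.isEqual offers a similar comparison for byte arrays.

```java
final class ConstantTimeCompare {

    // Compares two char arrays without stopping at the first mismatch.
    static boolean constantTimeEquals(char[] expected, char[] actual) {
        if (expected.length != actual.length) {
            return false; // reveals only that the lengths differ
        }
        int diff = 0;
        for (int i = 0; i < expected.length; i++) {
            // Accumulate differences instead of returning early.
            diff |= expected[i] ^ actual[i];
        }
        return diff == 0;
    }
}
```

As the edit above says, in practice the values compared this way should be password hashes, not plaintext passwords.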
Strings are immutable and cannot be altered once they have been created. Creating a password as a String leaves copies of the password on the heap or in the String pool. If someone takes a heap dump of the Java process and scans through it carefully, they may be able to recover the passwords. These unused strings will eventually be garbage collected, but that depends on when the GC kicks in.
On the other hand, char[] is mutable: as soon as authentication is done, you can overwrite the array with any character, such as all M's or backslashes. Even if someone then takes a heap dump, they may not be able to get passwords that are no longer in use. This gives you more control, in the sense that you clear the object's content yourself instead of waiting for the GC to do it.
A String is immutable, and string literals and interned strings also live in the string pool. Once written, a String cannot be overwritten.
char[] is an array that you should overwrite once you have used the password, and this is how it should be done:
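// Note: if getPassword() returns a String, that String copy cannot be wiped.
char[] passw = request.getPassword().toCharArray();
try {
    if (comparePasswords(dbPassword, passw)) {
        allowUser = true;
    }
} finally {
    // Wipe both arrays whether or not the comparison succeeded.
    cleanPassword(passw);
    cleanPassword(dbPassword);
    passw = null;
}

private static void cleanPassword(char[] pass) {
    Arrays.fill(pass, '0');
}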
One scenario where the attacker could use it is a crashdump—when the JVM crashes and generates a memory dump—you will be able to see the password.
That is not necessarily a malicious external attacker. This could be a support user that has access to the server for monitoring purposes. He/she could peek into a crashdump and find the passwords.
The short and straightforward answer would be because char[] is mutable while String objects are not.
Strings in Java are immutable objects. They can't be modified once created, so the only way for their contents to be removed from memory is to have them garbage collected. Only then can the memory freed by the object be overwritten, and the data will be gone.
Now garbage collection in Java doesn't happen at any guaranteed interval. The String can thus persist in memory for a long time, and if a process crashes during this time, the contents of the string may end up in a memory dump or some log.
With a character array, you can read the password, finish working with it as soon as you can, and then immediately change the contents.
Case String:
String password = "ill stay in StringPool after Death !!!";
// some long code goes
// ...Now I want to remove traces of password
password = null;
password = "";
// the attempts above will change the value of the password variable
// but the actual password can still be recovered from the String pool through a memory dump, if not garbage collected
Case CHAR ARRAY:
char[] passArray = {'p','a','s','s','w','o','r','d'};
// some long code goes
// ...Now I want to remove traces of password
for (int i=0; i<passArray.length;i++){
passArray[i] = 'x';
}
// Now you ACTUALLY DESTROYED traces of the password from memory
A string in Java is immutable. So whenever a string is created, it will remain in memory until it is garbage collected. So anyone who has access to the memory can read the value of the string.
If the value of the string is modified, a new string is created. So both the original value and the modified value stay in memory until they are garbage collected.
With a character array, the contents of the array can be modified or erased once the password has served its purpose. The original contents of the array will no longer be found in that array after it is modified, even before garbage collection kicks in.
Because of this security concern, it is better to store a password as a character array.
It is debatable whether you should use String or char[] for this purpose, because both have their advantages and disadvantages. It depends on what the user needs.
Since Strings in Java are immutable, whenever someone tries to manipulate your string, a new object is created and the existing String remains unaffected. This could be seen as an advantage for storing a password as a String, but the object remains in memory even after use. So if anyone somehow gets hold of the memory location of the object, that person can easily read the password stored at that location.
char[] is mutable, which has the advantage that after use the programmer can explicitly clear the array or overwrite its values. Once the password is no longer needed, it is wiped, and the information is no longer sitting in that array for anyone to read.
Based on the above, one can decide whether to go with String or char[] for a given requirement.
A lot of the previous answers are great. There is another point which I am assuming (please correct me if I am wrong).
By default, Java uses UTF-16 for storing strings. Using a character array (char[]) facilitates the use of Unicode, regional characters, etc. This technique allows all character sets to be treated equally when storing passwords, and so avoids certain crypto issues caused by character-set confusion. Finally, using the character array, we can convert the password to a string in our desired character set.
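To make the last point concrete, here is a small sketch that encodes a char[] password into the bytes of a chosen charset without going through a String. PasswordBytes and toUtf8 are hypothetical names, and the sketch assumes UTF-8 is the desired encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PasswordBytes {

    // Encodes the characters as UTF-8 bytes; no intermediate String is created.
    static byte[] toUtf8(char[] password) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(password));
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        // Wipe the encoder's temporary buffer when its backing array is accessible.
        if (encoded.hasArray()) {
            Arrays.fill(encoded.array(), (byte) 0);
        }
        return out;
    }
}
```

The caller still has to wipe both the char[] and the returned byte[] once they are no longer needed.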

Why is my hashset so memory-consuming?

I found that my program's memory growth comes from the code below. I am currently reading a file that is about 7GB, and I believe the data stored in the HashSet is less than 10M, but the program's memory keeps increasing to 300MB and then it crashes with an OutOfMemoryError. If the HashSet is the problem, which data structure should I choose?
if(tagsStr!=null) {
if(tagsStr.contains("a")||tagsStr.contains("b")||tagsStr.contains("c")) {
maTable.add(postId);
}
} else {
if(maTable.contains(parentId)) {
//do sth else, no memories added here
}
}
You haven't really told us what you're doing, but:
If your file is currently in something like ASCII, each character you read will be one byte in the file but two bytes in memory.
Each string will have an object overhead - this can be significant if you're storing lots of small strings
If you're reading lines with BufferedReader (or taking substrings from large strings), each one may have a large backing buffer - you may want to use maTable.add(new String(postId)) to avoid this
Each entry in the hash set needs a separate object to keep the key/hashcode/value/next-entry values. Again, with a lot of entries this can add up
In short, it's quite possible that you're doing nothing wrong, but a combination of memory-increasing factors is working against you. Most of these are unavoidable, but the third one may be relevant.
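If the post IDs happen to be non-negative integers (an assumption; the question does not say), one way to sidestep the per-string and per-entry overhead listed above is to store them as bits rather than as String objects in a HashSet. PostIdSet is a hypothetical wrapper around java.util.BitSet, which needs roughly one bit per possible ID up to the largest ID stored.

```java
import java.util.BitSet;

final class PostIdSet {
    private final BitSet ids = new BitSet();

    // Assumes postId is the decimal text of a non-negative int; anything else throws.
    void add(String postId) {
        ids.set(Integer.parseInt(postId.trim()));
    }

    boolean contains(String postId) {
        return ids.get(Integer.parseInt(postId.trim()));
    }

    // Number of distinct IDs stored.
    int size() {
        return ids.cardinality();
    }
}
```

In the question's code this would replace maTable, with add(postId) and contains(parentId) used in the same places. If the IDs are not numeric, this does not apply.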
You've either got a memory leak or your understanding of the amount of string data that you are storing is incorrect. We can't tell which without seeing more of your code.
The scientific solution is to run your application using a memory profiler, and analyze the output to see which of your data structures is using an unexpectedly large amount of memory.
If I was to guess, it would be that your application (at some level) is doing something like this:
String line;
while ((line = br.readLine()) != null) {
// search for tag in line
String tagStr = line.substring(pos1, pos2);
// code as per your example
}
This uses a lot more memory than you'd expect. On older JVMs (before Java 7 update 6, where substring(...) shares the array instead of copying it), the substring(...) call creates a tagStr object that refers to the backing array of the original line string. Your tag strings that you expect to be short actually refer to a char[] object that holds all characters in the original line.
The fix is to do this:
String tagStr = new String(line.substring(pos1, pos2));
This creates a String object that does not share the backing array of the argument String.
UPDATE - this or something similar is an increasingly likely explanation ... given your latest data.
To expand on another of Jon Skeet's points, the overheads of a small String are surprisingly high. For instance, on a typical 32 bit JVM, the memory usage of a one character String is:
String object header for String object: 2 words
String object fields: 3 words
Padding: 1 word (I think)
Backing array object header: 3 words
Backing array data: 1 word
Total: 10 words - 40 bytes - to hold one char of data ... or one byte of data if your input is in an 8-bit character set.
(This is not sufficient to explain your problem, but you should be aware of it anyway.)
Couldn't it be that the data read into memory (from the 7GB file) is somehow not freed? Something like what Jon describes, i.e., since strings are immutable, every string read requires creating a new String object, which might lead to running out of memory if the GC is not quick enough...
If that is the case, then you might insert some 'breakpoints' into your code/iteration, i.e., at some defined points, request a GC and wait until it finishes.
Run your program with -XX:+HeapDumpOnOutOfMemoryError. You'll then be able to use a memory analyser like MAT to see what is using up all of the memory - it may be something completely unexpected.
