Recently we have added a tool to find security holes in our organization. One of the issues that was found is that when connecting to a database (ex. using Hikari), we have to provide a String containing the password (encrypted, of course, which will be decrypted when used).
Now, keeping passwords in Strings is not safe, because it can be extracted, until garbage collector comes and clears it.
So we started changing our code to use char[] and byte[] (not sure it's the best, but the idea is that we can clear the array after usage, and not wait for garbage collector to clear it for us) to set our passwords on Hikari, but the last part of the flow is setting an unencrypted password String to Hikari. So all this fuss to find out that Hikari is keeping the password inside a String..
So am I supposed to change Hikari code and recompile it as our own organization implementation of Hikari which use passwords from a char[]? or what?
How can we avoid this?
Now, keeping passwords in Strings is not safe, because it can be extracted, until garbage collector comes and clears it.
Only if someone has sufficient access to capture what is in memory (or swap space on disk). If someone can do that, they can probably also do one or more of the following:
modify your application at the bytecode level to inject code to capture the secret
attach a debugger and use it to set a breakpoint at the point where the secret is used
read the secret from the file system, database, whatever
find the private key for your server's SSL certs and use it to snoop on the HTTPS traffic to your server,
walk out of your machine room with your hard drives, etc and then attack them at their leisure
and so on.
Spending a lot of effort to use char[] for handling passwords won't address any of those other ways of stealing the secrets.
And it won't address various other security blunders ... like porous firewalls, unencrypted backups saved to the cloud, keys on a stolen devops laptop, successful spear phishing, etc.
So am I supposed to change Hikari code and recompile it as our own organization implementation of Hikari which use passwords from a char[]? or what?
That is what you would need to do if you wanted to address this attack vector. Don't ever hold passwords in String objects, and make sure that you clear the char[] or byte[] or whatever that you use to hold them as soon as possible.
Are you "supposed" to do that? Shrug.
My advice would be to do a full security risk assessment, look at all of the issues and decide whether or not addressing this one will make a significant difference to overall security. Balance that against the costs of creating and maintaining the Hikari patches. On the flip-side, quantify the costs to your organization if (these) passwords were stolen.
But it is not up to us to decide what you should do. And it is not even possible to give you an objective recommendation, because we don't understand the full context.
Related
I learned about using char[] to store passwords back in the Usenet days in comp.lang.java.*.
Searching Stack Overflow, you can also easily find highly upvoted questions like this:
Why is char[] preferred over String for passwords? which agrees with what I learned a long, long time ago.
I still write my APIs to use char[] for passwords. But are those just hollow ideals now?
For example, look at Atlassian Jira's Java API:
LoginManager.authenticate which takes your password as a String.
Or Thales' Luna Java API:
login() method in LunaSlotManager. Of all people, an HSM vendor using String for the HSM slot password.
I think I've also read somewhere that the internals of URLConnection (and many other classes) uses String internally to handle the data. So if you ever send a password (although the password is encrypted by TLS over the wire), it will be in a String in your server's memory.
Is accessing server memory an attack vector so difficult to achieve that it is okay to store passwords as String now? Or is Thales doing it because your password will end up in a String anyway due to classes written by others?
First, let’s recall the reason for the recommendation to use char[] instead of String: Strings are immutable, so once the string is created, there is limited control over the contents of the string until (potentially well after) the memory is garbage collected. An attacker that can dump the process memory can thus potentially read the password data. Meanwhile the contents of the char[] object can be overwritten after it has been created. Assuming that this is done, and that the GC hasn’t moved the object to another physical memory location in the interim, this means that the password contents can be destroyed (somewhat) deterministically after it has been used. An attacker reading the process memory after that point won’t be able to get the password.
So using char[] instead of String lets you prevent a very specific attack scenario, where the attacker has full access to the process memory,1 but only at specific points in time rather than continuously. And even under this scenario, using char[] and overwriting its contents does not prevent the attack, it just reduces its chance of success (if the attacker happens to read the process memory between the creation and the erasure of the password, they can read it).
I am not aware of any evidence that shows (a) how frequent this scenario is, nor (b) how much this mitigation reduces the probability of success under that scenario. As far as I know, this is pure guesswork.
In fact, on most systems, this scenario likely does not exist at all: an attacker who can get access to another process’ memory can also gain full tracing access. For instance, on both Linux and Windows any process that can read another process’ memory can also inject arbitrary logic into that process (e.g. via LD_PRELOAD and similar mechanisms2). So I would say that this mitigation at best has a limited benefit, and potentially none at all.
… Actually I can think of one specific counter-example: an application that loads an untrusted plugin library. As soon as that library is loaded via conventional means (i.e. in the same memory space), it has access to the parent application. In this scenario, it might make sense to use char[] instead of String, and overwrite its contents when done with it, if the password is handled before the plugin is loaded. But a better solution would be not to load untrusted plugins into the same memory space. A common alternative is to launch it in a separate process and communicate via IPC.
(See the answer by Gilles for more vulnerable scenarios. I still think that the benefit is relatively limited, but it’s clearly not nil.)
1 As shown in Gilles’ answer, this is not correct: no full memory access is required to mount a successful attack.
2 Although LD_PRELOAD specifically requires the attacker to not only have access to another process but either to launch that process, or to have access to its parent process.
(Note: I am a security expert but not a Java expert.)
Yes, there is a significant security advantage in using char[] rather than strings for passwords. This also applies to some extent to other highly confidential data, although most highly confidential data (e.g. cryptographic keys) tends to be bytes and not characters.
The old, and still valid, reason to use char[] is to clean up memory as soon as it is used, which is not possible with String. This is a very firmly established security practice. For example, in the (in)famous FIPS 140 requirements for cryptographic processing, which are generally considered to be security requirements, there are in fact extremely few security requirements at level 1 (the easiest level). Just two, in fact: one is that you may only use approved cryptographic algorithms, and the other one is that keys, passwords and other sensitive data must be wiped after use.
This practice is one of the reasons why production implementations of cryptographic primitives are usually implemented in languages with manual memory management such as C, C++ or Rust: cryptography implementers want to retain control of where sensitive data goes, and to be sure to wipe all copies of sensitive material.
As an example of what can go wrong, consider the (in)famous Heartbleed bug. It allowed anyone on the Internet connecting to a vulnerable server to dump some of the memory of the server, without being detected. The attacker didn't get much control over which part of the memory, but could try again and again. An attacker could make requests that would cause the dumpable part to move around the heap, and thus could potentially dump the whole memory.
Are such bugs common? No. This one got a lot of buzz because it was in very popular software and the consequences were bad. But such bugs do exist and it's good to protect against them.
In addition, since Java 8, there is another reason, which is to avoid string deduplication. String deduplication means that if two String objects have the same content, they may be merged. String deduplication is problematic if an attacker can mount a side channel attack when the deduplication is attempted. The attack does not require the password to be deduplicated (although it is easier in this case): there's a problem as soon as some code compares the password against another string.
The usual way to compare strings for equality is:
If the lengths are different, return false.
Otherwise compare the characters one by one. As soon as there are different characters at one position, return false.
If the end of the strings is reached without encountering a difference, return true.
This has a timing side channel: the time of the middle step depends on the number of identical characters at the beginning of the string. Suppose that an attacker can measure this time, and can upload some strings for comparison (e.g. by making legitimate requests to a server). The attacker notices that comparing with sssssssss takes slightly longer than comparing with aaaaaaaaa, so the password must begin with s. Then the attacker tries to vary the second character, and finds that comparing with swwwwwwww takes again slightly longer. And thus, in relatively short time, the attacker can reconstruct the password character by character.
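To make the difference concrete, here is a minimal sketch of the early-exit comparison described above next to one that always visits every position. The class and method names are made up for illustration. For byte arrays, the JDK's MessageDigest.isEqual is meant to serve the same purpose.

```java
import java.security.MessageDigest;

final class TimingSafeCompare {

    // Early-exit comparison: the loop stops at the first mismatch, so the
    // running time depends on how many leading characters match.
    static boolean leakyEquals(char[] a, char[] b) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return false;
            }
        }
        return true;
    }

    // Visits every position regardless of where the first difference is;
    // differences are accumulated with OR instead of returning early.
    static boolean constantTimeEquals(char[] a, char[] b) {
        if (a.length != b.length) {
            return false; // note: the length is still revealed
        }
        int diff = 0;
        for (int i = 0; i < a.length; i++) {
            diff |= a[i] ^ b[i];
        }
        return diff == 0;
    }

    // Byte-array variant that delegates to the JDK.
    static boolean constantTimeEquals(byte[] a, byte[] b) {
        return MessageDigest.isEqual(a, b);
    }
}
```

Both methods return the same answers; only the amount of work done on a mismatch differs. Whether a JIT compiler preserves that property in every case is not something this sketch can promise.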
In the context of string deduplication, the attack is harder, because (as far as I know) the deduplication code first hashes the strings to compare. This may mean that the attacker has to first guess the hash value. But the total number of hash values in a given hash table (that's the number of hash buckets, not the full range of the hash method) is small enough that it's practical to enumerate.
This is not an easy attack, to be sure. But I would absolutely not rule it out, especially with a local attacker, but even with a remote attacker. Remote timing attacks are practical (still).
In conclusion, yes, you should not use String for passwords. Read them as char[], keep careful track of any copies, hash them as soon as possible if you're verifying them, and wipe all copies.
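As a rough sketch of "hash as soon as possible and wipe all copies", assuming PBKDF2 is an acceptable hash for your use case: PBEKeySpec accepts a char[] directly and keeps its own copy, so both the spec and the caller's array need clearing. The class name, iteration count and key length below are illustrative choices of mine, not recommendations.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class PasswordHashing {

    // Illustrative parameters only.
    private static final int ITERATIONS = 100_000;
    private static final int KEY_BITS = 256;

    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives a hash from the password, then wipes both the spec's internal
    // copy and the caller's array, whether or not hashing succeeded.
    static byte[] hashAndWipe(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA256 is not available", e);
        } finally {
            spec.clearPassword();        // PBEKeySpec holds a clone of the password
            Arrays.fill(password, '\0'); // wipe the caller's copy
        }
    }
}
```

The password never passes through a String here, but that only holds if whatever produced the char[] avoided Strings as well.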
If you need to store a password for a third-party service, it's a good idea to store it in encrypted form even if there is no separate access control for the encryption key. Copies of an encrypted password are less prone to leaking through side channels than copies of the password itself, which is a printable string with low entropy.
I think I've also read somewhere that the internals of URLConnection (and many other classes) uses String internally to handle the data. So if you ever send a password (although the password is encrypted by TLS over the wire), it will be in a String in your server's memory.
I'm not a Java expert, but this doesn't sound right: the plaintext of a connection (TLS or otherwise) is a byte stream, not a character stream. It should be arrays of 8-bit bytes, not arrays of Unicode code points.
Or that your password will end up in a String anyway due to classes written by others, is that why Thales' doing it.
Possibly. Or possibly because they aren't Java experts, or because the people who write the high-level layers are often not the foremost security experts.
Almost everyone else's answer plus one additional point:
Swap space on a storage media.
If the JVM heap is ever paged out to disk and the password is still in memory as a string (immutable and not GC'd), it will be written to the swap file. This swap file can then be scanned for password values, so, essentially another attack vector that's time based and still rather difficult to utilize but obviously not that difficult because we're here :D.
Wiping the mutable array at least reduces the time where the password is in memory.
The story I heard was that if an attacker can attack your process (like a DDOS) and trigger the process to swap out, it's somewhat easier to attack the swap space than the memory, AND swap space is preserved across boots/crashes/etc. This allows for yet another attack vector where the attacker pulls the swap drive out to scan the swap space.
Lots of detail in the answers but here's the short of it: yes, in theory, putting the password in an array and wiping it provides security benefits. In practice, that only helps if you can avoid the password ever being stored in a String. That is, if you take a password stored in a String and put the contents of the String into a char[], it doesn't magically make the String disappear from the heap. The necessary requirement is that the password never is placed in a String at all. I'd be interested to see that successfully implemented in a real Java application.
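A tiny demonstration of that point (the class name is just for illustration): toCharArray() returns a fresh copy, so wiping the copy leaves the original String untouched on the heap.

```java
import java.util.Arrays;

final class StringCopyDemo {

    // Copies the String into a char[] and wipes the copy.
    // The String itself is immutable and is not affected.
    static char[] copyAndWipe(String password) {
        char[] copy = password.toCharArray(); // a new array, not a view of the String
        Arrays.fill(copy, '\0');
        return copy;
    }

    public static void main(String[] args) {
        String password = "hunter2";
        char[] wiped = copyAndWipe(password);
        System.out.println("String still intact: " + password.equals("hunter2"));
        System.out.println("Copy length: " + wiped.length);
    }
}
```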
This advice is not about the moment of transfer over the network. There you're indeed better off using a String, as it's just more convenient to send over the network, provided of course that it's properly encrypted.
For using passwords in applications it's different due to stack-dumps and reverse engineering, and the problem of the String being immutable:
In case the password has been entered, even if the reference to the variable is changed to another string, there is no certainty about when the garbage collector will actually remove the String from the heap. So a hacker being able to see the dump will also be able to see the password. Using an array of char prevents this as you can change the data in the array directly without relying on the garbage collector.
Now you might say: well then when sending it over the network as a String it'll still be visible no? Well yes, but that's why encrypting it before sending it is important. Never send plain text passwords over the network when possible.
My application uses Google protocol buffers to send sensitive data between client and server instances. The network link is encrypted with SSL, so I'm not worried about eavesdroppers on the network. I am worried about the actual loading of sensitive data into the protobuf because of memory concerns explained in this SO question.
For example:
Login login = Login.newBuilder()
    .setPassword(password) // problem
    .build();
Is there no way to do this securely since protocol buffers are immutable?
Protobuf does not provide any option to use char[] instead of String. On the contrary, Protobuf messages are intentionally designed to be fully immutable, which provides a different kind of security: you can share a single message instance between multiple sandboxed components of a program without worrying that one may modify the data in order to interfere with another.
In my personal opinion as a security engineer -- though others will disagree -- the "security" described in the SO question to which you link is security theater, not actually worth pursuing, for a number of reasons:
If an attacker can read your process's memory, you've already lost. Even if you overwrite the secret's memory before discarding it, if the attacker reads your memory at the right time, they'll find the password. But, worse, if an attacker is in a position to read your process's memory, they're probably in a position to do much worse things than extract temporary passwords: they can probably extract long-lived secrets (e.g. your server's TLS private key), overwrite parts of memory to change your app's behavior, access any and all resources to which your app has access, etc. This simply isn't a problem that can be meaningfully addressed by zeroing certain fields after use.
Realistically, there are too many ways that your secrets may be copied anyway, over which you have no control, making the whole exercise moot:
Even if you are careful, the garbage collector could have made copies of the secret while moving memory around, defeating the purpose. To avoid this you probably need to use a ByteBuffer backed by non-managed memory.
When reading the data into your process, it almost certainly passes through library code that doesn't overwrite its data in this way. For example, an InputStream may do internal buffering, and probably doesn't zero out its buffer afterwards.
The operating system may page your data out to swap space on disk at any time, and is not obliged to zero that data afterwards. So even if you zero out the memory, it may persist in swap. (Encrypting swap ensures that these secrets are effectively gone when the system shuts down, but doesn't necessarily protect against an attacker present on the local machine who is able to extract the swap encryption key out of the kernel.)
Etc.
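For what it's worth, the ByteBuffer idea mentioned above might look roughly like this. OffHeapSecret is a hypothetical wrapper of my own, not a library class. A direct buffer's contents may live outside the garbage-collected heap, which addresses the GC-copy point only; it does nothing about library buffers or swap.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class OffHeapSecret implements AutoCloseable {

    private final ByteBuffer buffer;

    // Moves the secret into a direct buffer and wipes the caller's array.
    OffHeapSecret(byte[] secret) {
        buffer = ByteBuffer.allocateDirect(secret.length);
        buffer.put(secret);
        buffer.flip();
        Arrays.fill(secret, (byte) 0);
    }

    int length() {
        return buffer.limit();
    }

    byte byteAt(int index) {
        return buffer.get(index);
    }

    // Overwrites the buffer contents with zeros.
    @Override
    public void close() {
        for (int i = 0; i < buffer.limit(); i++) {
            buffer.put(i, (byte) 0);
        }
    }
}
```

Used with try-with-resources, the buffer is zeroed as soon as the block exits. The secret still had to arrive as a byte[] from somewhere, which is exactly the problem the other bullet points describe.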
So, in my opinion, using mutable objects in Java specifically to be able to overwrite secrets in this way is not a useful strategy. These threats need to be addressed elsewhere.
Strings in Java are kept around for a while, potentially a long while. This is a good thing, unless that String contains a user's actual password. Character arrays are suggested because they aren't immutable and can be cleared faster. (Let's hope there's never a "Bitter Coffee" attack that works like Heartbleed but against the JVM (remote heap dump)).
I notice that Spring Security PasswordEncoder takes a CharSequence not a String, possibly for this reason. However I'm not sure what object I should use to keep the password in memory pre-hashing. Would StringBuilder be appropriate? what would that look like? I'm even less sure if I'm creating a REST API (with Jackson under the hood, via either Spring Data or Spring MVC) how I can keep that from ever being a String.
How can I code a JSON REST API for creating/updating passwords whilst being as secure as possible and avoiding the various problems with using Strings?
How can I code a JSON REST API for creating/updating passwords whilst
being as secure as possible and avoiding the various problems with
using Strings?
Well API keys are not true passwords. You have control on how API keys are created so you can create some random string which will have a very low collision (ie double UUID) and very low common substring (in the case of dedup). After the client logs in through the REST API using the key you could use temporary tokens thus improving the likelihood of the API key getting garbage collected.
As for dealing with real passwords, which is the case for a human logging in (perhaps to reset the API key), you don't really have many options, given that almost every servlet container will turn request parameters into strings. One cheesy option is to have the client, through JavaScript (or whatever your clients are), Base64 encode the password and then add a separator and then add a randomly generated number or string to the password. This is not really for obfuscation but again to lower the probability of keeping the same string around. You'll have to be careful, of course, to decode into a char or byte array and then remove the random suffix by manipulating the char or byte array (see CharBuffer).
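A rough sketch of the decoding side, under assumptions of my own: the raw request body is available as bytes, and the client sent Base64(password + separator + random padding) in UTF-8, with a newline as the separator. The class name and the wire format are hypothetical. Nothing here creates a String.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class PaddedPasswordDecoder {

    // Assumed wire format: Base64( password + SEPARATOR + randomPadding ), UTF-8.
    static final char SEPARATOR = '\n';

    // Returns the password as a char[]; the caller is responsible for wiping it.
    static char[] decode(byte[] base64Payload) {
        byte[] raw = Base64.getDecoder().decode(base64Payload);
        CharBuffer chars = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(raw));
        try {
            // Everything after the last separator is random padding.
            int end = chars.limit();
            for (int i = chars.limit() - 1; i >= 0; i--) {
                if (chars.get(i) == SEPARATOR) {
                    end = i;
                    break;
                }
            }
            char[] password = new char[end];
            chars.get(password, 0, end);
            return password;
        } finally {
            // Wipe the intermediate copies made during decoding.
            Arrays.fill(raw, (byte) 0);
            if (chars.hasArray()) {
                Arrays.fill(chars.array(), '\0');
            }
        }
    }
}
```

This only helps if the container hands you the body as bytes in the first place. Once it has been parsed as a request parameter, a String already exists.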
Another complicated option is the microservice cloud approach. Just make an authentication service composed of a couple of tiny round robin instances that only do authentication. Have those JVM instances get restarted frequently (to flush memory). Or if they are small enough they will hopefully garbage collect more frequently.
I'll assume of course that your data repository has salted passwords (otherwise this safety precaution is pretty moot).
To be honest though, there are so many other threats that I don't really think it's worth the effort for most use cases in an HTTP server environment.
The reason Java Swing uses char[] for passwords is that Swing is used in desktop environments. Desktop environments are far more likely to have malicious programs such as viruses/spyware that could do some memory probing for passwords.
With that in mind, it's really the clients you should worry about and not the server.
Our web-based application has user accounts tied to users with the passwords specified during account creation. In the case of Java, how does one process the password securely before persisting its hash in the database?
To be more specific, how does one ensure that the string holding the password is garbage collected within a sufficiently short interval of time ?
If you have the possibility (may be difficult in web applications), it would be better to store passwords in character arrays than to store them in strings. Once you have finished storing the password, you can overwrite it in memory by using Arrays.fill() and make the array eligible for garbage collection by discarding the reference:
Arrays.fill(password, ' ');
password = null;
I just noticed that nulling the password would be a bit paranoid but you can do if it reassures you :)
You do not use a String. You use a char[] and then overwrite the char[] when done.
There are absolutely no guarantees when it comes to garbage collection (aside from that the finalizer will run before the object is collected). The GC may never run, if it runs it may never GC the String that has the password in it.
If you create the hash on the client side, there should be no need to think about this problem. The plain password is never submitted to the server.
Two words: Local Scope. The declared variables for password processing need to have the absolute smallest scope possible.
Once the variables go out of scope, the objects are eligible for garbage collection.
Often, you're picking things out of a request. You want a very, very small transaction that accepts the request, hashes the password, persists it and redirects. The page to which you redirect can then fetch content and do all the "other" processing that is part of your application.
There is no way to guarantee that clear text passwords are removed from memory in Java.
However a hacker doesn't need access to the memory of a program to get clear text passwords. There are much simpler ways (such as sniffing the packets) so it is highly unlikely anyone would rely on this approach.
The best approach is to have the client encrypt the password as @Mork0075 suggests. However, while it means you cannot easily get the password, a program can still get the encrypted version of passwords and so pretend to be a user. A way around this is to encrypt the whole connection using SSL.
All this is rather academic, as the simplest approach for a hacker is to monitor the packets to the database and get the password for your database. I suspect direct access to your database is more concerning... or perhaps it isn't. ;)
Use a password challenge:
Server chooses a challenge value and sends it to the Client
Server performs a 1-way translation with the password and the challenge, ex. MD5(CONCAT(challenge, password)) and assigns it to the session.
Plain-text password is now out-of-scope and ready for garbage collection.
Client also performs the same translation and sends the result to the Server.
If Server and Client choose the same final value, the client is authenticated.
This method prevents replay attacks, but requires the challenge value to be very unpredictable (random) and not often reused (long).
The plain-text password is only in scope during the handling of the initial connection request - not during authentication. It doesn't matter how long the 1-way translation result is in scope (not garbage collected) because it has little replay value.
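A minimal sketch of the one-way translation step, with one substitution on my part: it uses HMAC-SHA256 keyed with the password instead of MD5(CONCAT(challenge, password)). The class and method names are hypothetical. Both server and client would call respond() with the same challenge and compare the results.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class ChallengeResponse {

    // Server side: a long, unpredictable challenge value.
    static byte[] newChallenge() {
        byte[] challenge = new byte[32];
        new SecureRandom().nextBytes(challenge);
        return challenge;
    }

    // Both sides: HMAC-SHA256(key = password, message = challenge).
    // The password must be non-empty (SecretKeySpec rejects an empty key).
    static byte[] respond(char[] password, byte[] challenge) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(password));
        byte[] key = new byte[encoded.remaining()];
        encoded.get(key);
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(challenge);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        } finally {
            // Wipe the byte copies of the password made above.
            Arrays.fill(key, (byte) 0);
            if (encoded.hasArray()) {
                Arrays.fill(encoded.array(), (byte) 0);
            }
        }
    }

    // Server side: compare the stored value with what the client sent.
    static boolean verify(byte[] expected, byte[] received) {
        return MessageDigest.isEqual(expected, received);
    }
}
```

Note that respond() does not wipe the caller's char[]; that is left to the caller once the session value has been computed. SecretKeySpec also keeps its own copy of the key bytes, which this sketch does not clear.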
I have a project to build a voting desktop application for a class in Java. While security isn't the focus of the project, I would like to be as realistic as I can. What are some of the primary tools to integrate security into a Java application.
Edit: I'm not primarily worried about physical security, we are simply building an application not a whole system. I want to ensure votes are recorded correctly and not able to be changed or read by someone else.
It really depends on what kind of security you are looking to integrate. Do you want security to ensure that the user isn't running any debuggers or such to flip bits in your application to change the votes? Do you want to ensure that the user doesn't install logging software to keep track of who voted for who? Do you want to ensure that the person who is supposed to be voting is actually voting? Security is a very broad subject, and it's hard to give an answer without knowing what exactly you are looking for.
My company recently built an app with very strong security. Maybe this helps.
Our app
It was a Java EE app.
The architecture is as follows:
The client computer has a cryptography package.
A dirty server that stores encrypted user input and output.
A clean server, not accessible from outside, that stores keys and decrypted data.
Users are issued cryptography cards (you may want to use something less safe, e.g. PGP), and are required by the JSP pages to encrypt all input with them. The page contains a component that connects to the cryptography app, asks the user for the key passphrase, encrypts it with the server's public key and signs it with the user's private key, then submits.
Data is stored on the external (dirty) server, then transferred to the internal (clean) server, where it is decrypted and the signature is verified. The data is then processed and re-encrypted, sent back to the dirty server, and from there the user can retrieve it.
So even if someone cracked the dirty server (or even got hold of the database), they would get mostly useless data.
Your app
I'd send encrypted and signed votes to the server. That would ensure two things:
You know who sent the vote.
No one will be able to know what the vote was.
Then get the data from the server, assert that everyone voted at most once, count the votes, and voila!
If you're looking for a "higher-level" explanation of this stuff (as in, not code), Applied Cryptography has quite a few relevant examples (and I believe a section on "secure elections" that covers some voting strategies).
I'm not primarily worried about physical security, we are simply building an application not a whole system. I want to ensure votes are recorded correctly and not able to be changed or read by someone else.
Putting to one side questions of protecting against physical tampering (e.g. of the underlying database), since you've stipulated that physical security is not the present concern...
I think the primary consideration is how to ensure that a given voter votes only once. At a paper poll, each registered voter is restricted to a particular booth/location and verification is done by name+SSN and a signature.
You might need a high resolution digital signature capture and therefore a touchscreen capture peripheral or a touch screen terminal. A more sophisticated approach would be a biometric scanner, but that would require government records of thumb/finger prints or retinal scan - I can already see the privacy advocates lining up at the lawyer's offices.
Another approach would be for the voter "registrar office" to issue digital keys to each voter prior to the election - a (relatively) short (cryptographically strong) random alpha/numeric key that is entered with the voter's name and/or SSN into the application. Knowledge of that key is required for that particular voter in that particular election. These keys would be issued by post in tamper-evident envelopes, like those used by banks for postal confirmation of wire transfers and delivery of PIN numbers. The key must include checksum data so that the user can have the entry of it immediately validated and it should be in groups of 4, so something like XXXX-XXXX-XXXX-CCCC.
Any other "secret" knowledge, such as SSN, is likely too easily discovered for a large percentage of the population (though we don't seem to be able to make credit-granting organizations understand this), and therefore is unsuitable for authentication.
Vote counting can be done by generating a public key encrypted data file which is transferred (by sneaker net?) to the central system. This must include the "voting booth" identity information and a record for each voter including their SSN and the digital key (or signature, or biometric data). Votes with invalid keys are eliminated. Multiple votes with the same key and same votes are treated as a single vote for that candidate. Multiple votes with the same key and different votes are flagged for fraud investigation (with the constituent contacted by phone, issued a new key, and directed to revote).
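The three counting rules above could be sketched like this. Ballot, Result and tally are names I made up, and decryption of the data file is assumed to have happened already.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class VoteTally {

    static final class Ballot {
        final String key;       // the voter's digital key
        final String candidate; // the choice recorded with that key

        Ballot(String key, String candidate) {
            this.key = key;
            this.candidate = candidate;
        }
    }

    static final class Result {
        final Map<String, Integer> counts = new HashMap<>();
        final Set<String> flaggedKeys = new TreeSet<>();
    }

    static Result tally(List<Ballot> ballots, Set<String> validKeys) {
        // Collect the distinct choices made under each valid key.
        Map<String, Set<String>> choicesByKey = new HashMap<>();
        for (Ballot ballot : ballots) {
            if (!validKeys.contains(ballot.key)) {
                continue; // invalid key: the vote is eliminated
            }
            choicesByKey.computeIfAbsent(ballot.key, k -> new HashSet<>()).add(ballot.candidate);
        }
        Result result = new Result();
        for (Map.Entry<String, Set<String>> entry : choicesByKey.entrySet()) {
            Set<String> choices = entry.getValue();
            if (choices.size() == 1) {
                // Same key, same vote (possibly repeated): counts once.
                result.counts.merge(choices.iterator().next(), 1, Integer::sum);
            } else {
                // Same key, different votes: flag for fraud investigation.
                result.flaggedKeys.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Flagged keys contribute nothing to the counts until the voter has been contacted and has revoted with a new key.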
Your problem is that you need to identify the user reliably, so that you can prevent them from re-voting and accessing each others votes.
This is not any different from any other desktop application that requires authentication (and potentially authorization). If your voters are a closed group on a network with user accounts, you could integrate with the directory and require users to log in.
If voters do not have network user accounts, this is where it gets interesting. Each user will still need to authenticate with the application. You could generate accounts with passwords in the application and distribute this information securely prior to voting. Your application could ask users to select a password when they access the application for the first time.
Without knowing the specifics, it is hard to give a more specific answer.
You are aware that electronic voting is an unsolved research problem? Large scale fraud should take a large effort.
I believe that physical security is more important for a voting booth system than, you know, code security.
These machines by their very nature shouldn't be connected to any kind of public network, especially not the internet. But having good physical security to prevent any sort of physical tampering is very important.