We're in the process of defining a time-sensitive token generation scheme for our SOAP services. Currently we have two ideas and we're not sure which is better and why.
Scheme 1:
The rands at the beginning and end don't add any particular value, but the idea is to add a small layer of salting.
Sender
<token>rand(6) ++ encrypt(time) ++ rand(6)</token>
Receiver
enc = token.substring(6,token.len-6)
time = decrypt(enc)
assert(time is within +/- range)
Scheme 2:
Here we encrypt the rands into the token.
Sender
<token>encrypt(rand(6) ++ encrypt(time) ++ rand(6))</token>
Receiver
dec = decrypt(token)
enc = dec.substring(6,dec.len-6)
time = decrypt(enc)
assert(time is within +/- range)
So I'm not only looking for an answer for which is best, but WHY it's best. I've looked for some documents or best practices, but short of IEEE documents, I haven't found much. If you have any documents you can point us to, we'd love that!
Of the two schemes you present, the first doesn't really address the goal of salting. From Wikipedia:
"The primary function of salts is to defend against dictionary attacks and pre-computed rainbow table attacks"
and I would expect the salt to make the same sequence encrypt differently each time you encrypt it, e.g. (using your examples):
encrypt(rand(6) + encryptable)
such that multiple encryptions of the same sequence require individual decrypting.
If you need a unique token, rather than relying on a timestamp, why not use a GUID?
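As a rough illustration of the time-window check both schemes rely on, here is a minimal Python sketch. Note that it swaps in a different technique from the question: instead of encrypting the timestamp, it sends a random nonce (a GUID, as suggested above) and the timestamp in the clear and authenticates both with an HMAC. SECRET_KEY, MAX_SKEW_SECONDS, make_token and check_token are hypothetical names chosen for this example, not part of any existing service.

```python
import hashlib
import hmac
import time
import uuid

SECRET_KEY = b"demo-shared-secret"  # hypothetical; both sides must share it
MAX_SKEW_SECONDS = 300              # hypothetical acceptance window


def _mac(nonce, ts):
    """HMAC-SHA256 over 'nonce:timestamp', hex encoded."""
    return hmac.new(SECRET_KEY, f"{nonce}:{ts}".encode(), hashlib.sha256).hexdigest()


def make_token(now=None):
    """Sender: build 'nonce:timestamp:mac' with a fresh random nonce."""
    ts = str(int(time.time() if now is None else now))
    nonce = uuid.uuid4().hex
    return f"{nonce}:{ts}:{_mac(nonce, ts)}"


def check_token(token, now=None):
    """Receiver: accept only if the MAC is valid and the time is in range."""
    try:
        nonce, ts, mac = token.split(":")
        ts_int = int(ts)
    except ValueError:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(mac.encode(), _mac(nonce, ts).encode()):
        return False
    current = time.time() if now is None else now
    return abs(current - ts_int) <= MAX_SKEW_SECONDS
```

Because the nonce is random, two tokens created in the same second still differ, which is the property the salting in the question was after. Rejecting replayed tokens inside the window would additionally require the receiver to remember recently seen nonces.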
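import java.security.MessageDigest;
import javax.xml.bind.DatatypeConverter;

public static void main(String[] args) throws Exception {
    String s = "text";
    // Hash the UTF-8 bytes of s with MD5 and print the digest as hex.
    String hash = DatatypeConverter.printHexBinary(MessageDigest.getInstance("MD5").digest(s.getBytes("UTF-8")));
    System.err.println(hash);
}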
You can't. That's what hashing is about. This is not an API or library limitation but a mathematical one, which is there by design.
You need to understand that hashing and encryption/decryption are two completely different things (which are often used together).
Hashing
A hash, or to be precise, a cryptographic hash, is the result of a mathematical one-way function which means it is meant to be irreversible. Once something is hashed, it is not possible to get the text that was hashed back from it. This is at least true for good hash algorithms, which are not considered broken. MD5 as well as SHA1 are considered broken, so if you need this in a security context, use SHA-256 or even SHA-512 instead. If you just need this for a checksum, MD5 should be fine.
With MD5, you can often just type a hash value into Google and get the original text back. That is not what you want from a hash function. E.g. take the hash 1cb251ec0d568de6a929b520c4aed8d1, which is the MD5 hash of your message text, and you'll find a website which shows you the original text.
Another property of a hash is that two different inputs should ideally never produce the same output. That is of course impossible, as the set of possible messages is much, much bigger than the set of possible hash values, because hashes have a fixed length. But it should not be feasible to generate such a hash collision deliberately by following some algorithm or using clever input.
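The counting argument above can be made concrete with a deliberately tiny hash. The sketch below is an illustration only: tiny_hash and find_collision are hypothetical helper names, and truncating SHA-256 to one byte is done purely so that a collision is cheap to find. With only 256 possible outputs, any 257 distinct inputs must contain at least one collision.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """First byte of SHA-256: an 8-bit 'hash' used only for illustration."""
    return hashlib.sha256(data).digest()[0]


def find_collision(limit: int = 257):
    """Hash the messages b'0', b'1', ... and return the first colliding pair."""
    seen = {}
    for i in range(limit):
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    return None


collision = find_collision()

# Real digests have a fixed length too, regardless of input size.
short_digest = hashlib.sha256(b"a").hexdigest()
long_digest = hashlib.sha256(b"a" * 1_000_000).hexdigest()
```

For a full-length digest the same argument still applies, but the output space is so large that a collision should not be findable in practice.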
The only way to find out what was hashed is to hash many messages (brute-force) and see if the computed hash matches the hash to be cracked. See Rainbow tables for more information on this.
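A minimal sketch of that brute-force idea in Python follows. The word list is a hypothetical stand-in for the very large lists or precomputed tables a real attacker would use, and crack_md5 is a name made up for this example.

```python
import hashlib


def crack_md5(target_hex, candidates):
    """Return the first candidate whose MD5 digest equals target_hex, else None."""
    target_hex = target_hex.lower()
    for word in candidates:
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hex:
            return word
    return None


# Hypothetical word list; real attacks use lists with millions of entries.
WORDLIST = ["password", "secret", "test", "text", "hello"]

target = hashlib.md5(b"text").hexdigest()  # the hash we pretend to have found
recovered = crack_md5(target, WORDLIST)
```

Nothing is being "decrypted" here: the function only guesses inputs and checks whether the resulting hash matches, so it fails as soon as the original text is not among the candidates.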
Hashes are mostly used to ensure integrity, i.e. that no modification was made to a message sent over the network. They are often used in conjunction with encryption algorithms (that's probably where your confusion originates) because you need more than integrity to guarantee secure communication: you also need authentication (e.g. by using certificates) and confidentiality (provided by encryption algorithms).
Encryption
An encryption function is a function which takes a message and a key and produces a result which is not readable unless you have the key. Side note: there may be one or two keys for encryption/decryption depending on the algorithm you use (symmetric vs. asymmetric). If you have the key, it is reversible by applying the decryption function. If it were a one-way function like a hash, encryption would make no sense, as the recipient of a message would not be able to retrieve the message.
When running the encode method of a Spring Security Pbkdf2PasswordEncoder instance multiple times, the method returns different results for the same inputs. The snippet
String salt = "salt";
int iterations = 100000;
int hashWidth = 128;
String clearTextPassword = "secret_password";
Pbkdf2PasswordEncoder pbkdf2PasswordEncoder = new Pbkdf2PasswordEncoder(salt, iterations, hashWidth);
String derivedKey = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey: " + derivedKey);
String derivedKey2 = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey2: " + derivedKey2);
results in an output like
derivedKey: b6eb7098ee52cbc4c99c4316be0343873575ed4fa4445144
derivedKey2: 2bef620cc0392f9a5064c0d07d182ca826b6c2b83ac648dc
The expected output would be the same values for both derivations. In addition, when running the application another time, the outputs are different again. The same behavior also appears for two different Pbkdf2PasswordEncoder instances with the same inputs. The encoding method behaves more like a random number generator. The Spring Boot version used is 2.6.1; the spring-security-core version is 5.6.0.
Is there any obvious setting that I am missing? The documentation does not give additional hints. Is there a conceptual error in the Spring Boot project setup?
Is there any obvious setting that I am missing?
Yes. The documentation you linked to is fairly clear; I guess you missed it. That string you pass to the Pbkdf2PasswordEncoder constructor is not a salt!
The encoder generates a salt for you, and generates a new one every time you ask it to encode something, which is how you're supposed to do this stuff [1]. (The returned string contains both this randomly generated salt and the result of applying the encoding.) Because a new salt is made every time you call .encode, the call returns a different value each time, even if you call it with the same inputs.
The string you pass in is merely 'another secret', which can sometimes be useful. For example, if you can store this secret in a secure enclave, or it is sent by another system or entered upon boot and never stored on disk, then somebody who runs off with your server can't check passwords. PBKDF2 means that if they did have the secret, the checking would be very slow; if they don't, they can't even start.
This seems like a solid plan - otherwise people start doing silly things. Such as using the string "salt" as the salt for all encodes :)
The real problem is:
The expected output would be the same values for both derivations
No. Your expectation is broken. Whatever code you are writing that made this assumption needs to be tossed. For example, this is how you are intended to use the encoder:
When a user creates a new password, you use .encode and store what this method returns in a database.
When a user logs in, you take what they typed, and you take the string from your database (the one .encode sent you) and call .matches.
It sounds like you want to run .encode again and see if the result matches. That is not how you're supposed to use this code.
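The encode-then-matches flow can be sketched in Python as follows. This is an analogue written for illustration, not Spring Security's implementation or storage format: the function names mirror the Spring methods, but the 'salt_hex:hash_hex' layout and the ITERATIONS, SALT_BYTES and HASH_BYTES values are assumptions made for this example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor
SALT_BYTES = 16       # assumed salt size for this sketch
HASH_BYTES = 32       # assumed derived-key size for this sketch


def encode(password: str) -> str:
    """Hash with a fresh random salt and return 'salt_hex:hash_hex' for storage."""
    salt = os.urandom(SALT_BYTES)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, HASH_BYTES
    )
    return salt.hex() + ":" + derived.hex()


def matches(password: str, stored: str) -> bool:
    """Re-derive using the salt embedded in the stored string, then compare."""
    salt_hex, hash_hex = stored.split(":")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS, HASH_BYTES
    )
    return hmac.compare_digest(derived, bytes.fromhex(hash_hex))


first = encode("secret_password")
second = encode("secret_password")
```

Here first and second differ because each call drew its own salt, yet matches accepts the correct password against either one. Comparing two encode results directly, as in the question, can therefore never work.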
Footnote [1]: The why
You also need to review your security policies. The idea you have in your head of how this stuff works is thoroughly broken. Imagine it worked like you wanted, and there is a single salt used for all password encodes. Then if you hand me a dump of your database, I can trivially crack about 5% of all accounts within about 10 minutes!!
How? Well, I sort all hashed strings and then count occurrences. There will be a bunch of duplicate strings inside. I can then take all users whose passhash is in this top 10 of most common hashes and then log in as them. Because their password is iloveyou, welcome123, princess, dragon, 12345678, alexsawesomeservice!, etcetera - the usual crowd of extremely oft-used passwords. How do I know that's their password? Because their password is the same as that of many other users on your system.
Furthermore, if none of the common passwords work, I can tell that the accounts sharing a hash likely belong to the same user.
These are all things that I definitely should not be able to derive from the raw data. The solution is, naturally, to have a unique salt for everything, and then store the salt in the DB along with the hash value so that one can 'reconstruct' it when a user tries to log in. These tools try to make your life easy by doing the work for you. This is a good idea, because errors in security implementations (such as forgetting to salt, or using the same salt for all users) are not (easily) unit testable. A well-meaning developer writes code, it seems to work, a casual glance at the password hashes seems to indicate "it is working" (the hashes look random enough to the naked eye), and then it gets deployed, security issue and all.
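The duplicate-counting attack described in this footnote can be reproduced in a few lines of Python. Everything in the sketch is made up for illustration: the USERS table, the SHARED_SALT value and the helper names are hypothetical, and the iteration count is kept low only so the demo runs quickly.

```python
import hashlib
import os
from collections import Counter


def hash_password(password: str, salt: bytes) -> str:
    # Low iteration count only to keep the demo fast; not a recommended setting.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 1_000).hex()


# Hypothetical user table: three users picked the same common password.
USERS = {
    "ann": "iloveyou",
    "bob": "iloveyou",
    "cat": "iloveyou",
    "dan": "x9!kQ2#p",
    "eve": "dragon",
}

SHARED_SALT = b"salt"  # the mistake: one salt for every user
shared = {user: hash_password(pw, SHARED_SALT) for user, pw in USERS.items()}
per_user = {user: hash_password(pw, os.urandom(16)) for user, pw in USERS.items()}


def most_common_hash_count(table) -> int:
    """How many users share the single most frequent hash value."""
    return Counter(table.values()).most_common(1)[0][1]
```

With the shared salt, the three users with the same password end up with the same stored hash, so an attacker holding the dump sees the cluster immediately. With a random salt per user, every stored hash is distinct and the dump reveals no such grouping.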
While using Jasypt, the encrypted passwords contain = (the equals character) at the end. Is it guaranteed that the encrypted passwords will always have = at the end?
Can we control this behavior, and if so, how?
For example: test is encrypted to Nv4nMcuVwsvWVuYD7Av44Q==
It looks like the =s come from padding the Base64 representation of the encryption / hash output.
In that case, the answer is generally no, it won't necessarily end with "=". Base64 only adds one or two "=" characters when the number of input bytes is not a multiple of 3.
However, if the algorithm you're using produces constant-length output (e.g. if it uses hashing along the way), it might happen to produce those "="s every time, but there's no way of knowing that for sure unless you fully understand every step the algorithm performs.
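To see how the padding depends only on the length of the binary output, here is a small Python sketch. The helper name padding_chars is made up for this example; the sample value is the one from the question.

```python
import base64


def padding_chars(n_bytes: int) -> int:
    """Number of trailing '=' in the standard Base64 encoding of n_bytes bytes."""
    encoded = base64.b64encode(bytes(n_bytes)).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# The sample from the question decodes to 16 raw bytes.
raw = base64.b64decode("Nv4nMcuVwsvWVuYD7Av44Q==")

# Padding for a few output lengths: it depends on n_bytes modulo 3.
padding_by_length = {n: padding_chars(n) for n in (15, 16, 17, 18, 32)}
```

The sample is 16 bytes long, and 16 leaves a remainder of 1 when divided by 3, which is why it ends in two "=" characters. An output of 15 or 18 bytes would have no padding at all, so whether the "=" appears depends on whether the output length stays the same for every input.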
I am writing a Django app that needs to work with an existing Java Play framework app. The Play app uses PasswordHash.java to store passwords. It stores passwords in a colon separated format. Each hash is stored as iterations:salt:pbkdf2_hash.
For example, here is an entry for the password "test":
1000:f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39:b69139f51bc4098afc36b4ff804291b0bc697f87be9c1ced
Here we can split the string by : and find:
Iterations: 1000
Salt: f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39
PBKDF2 Hash: b69139f51bc4098afc36b4ff804291b0bc697f87be9c1ced.
I modified Django's check_password mechanism to be compatible with this format, but found that it didn't think the password was correct. I used Django's crypto.py to regenerate a hash for "test" using the same salt that Play used, and came up with this:
hash = crypto.pbkdf2('test', 'f7fe4d511bcd33321747a778dd21097f4c0ff98f1e0eba39', 1000, 24)
base64.b16encode(hash)
'9A8725BA1025803028ED5B92748DD61DFC2625CC39E45B91'
The PBKDF2 hash from Play does not match this hash. (For those wondering, I used 24 as the fourth parameter because that is what is used in PasswordHash.java.)
After I was unable to make Django's generated hash match Java's, I tried it on a website that does it for you.
I plugged in the same salt, used SHA-1 with 1000 iterations and a 24-byte key size, and found that the website matched what Django had created!
I am not sure what is going on with PasswordHash.java, but I desperately need to get Django and Play to "play nicely" (couldn't help myself haha). Does anyone have an idea as to what is going on here?
Try salt = base64.b16decode(salt.upper()).
I did and I got the hash in your initial example, albeit uppercased B69139F5...
Explanation:
The hash and salt are both being stored in Base16 (hex) in your initial example. So you decode the salt to use it and then encode the resulting hash to compare it with the stored one.
The upper() is because Python's b16decode is strict about uppercase Base16 by default. It will raise an error if you give it the lowercase form, unless you pass casefold=True.
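Putting the explanation together, a check against the iterations:salt:pbkdf2_hash format might look like the sketch below, using only the Python standard library instead of Django's crypto.py. It assumes HMAC-SHA1 as the PBKDF2 pseudo-random function, based on the SHA-1 setting mentioned in the question; check_play_hash and make_play_hash are hypothetical names for this example.

```python
import base64
import hashlib
import hmac
import os


def check_play_hash(password: str, stored: str) -> bool:
    """Verify a password against an 'iterations:salt_hex:hash_hex' entry."""
    iterations, salt_hex, hash_hex = stored.split(":")
    # The stored salt is hex text; PBKDF2 needs the raw bytes it represents.
    salt = base64.b16decode(salt_hex, casefold=True)
    expected = base64.b16decode(hash_hex, casefold=True)
    derived = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt, int(iterations), len(expected)
    )
    return hmac.compare_digest(derived, expected)


def make_play_hash(password: str, iterations: int = 1000, size: int = 24) -> str:
    """Produce an entry in the same layout, with a fresh random salt."""
    salt = os.urandom(size)
    derived = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt, iterations, size
    )
    return f"{iterations}:{salt.hex()}:{derived.hex()}"


entry = make_play_hash("test")
```

The key point is the b16decode call on the salt: passing the 48-character hex string itself as the salt, as in the question, feeds PBKDF2 different bytes than the 24 raw bytes Java used, so the results cannot agree.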
I've been given a challenge and it has to do with testing a friend's encryption process.
It's a Diffie-Hellman exchange process, and here are the known variables / constants:
P, G
my generated private key (variable)
my generated public key (variable)
the recipient's public key (constant).
When looking at my private key - P and G are both within it. For example, the first 'x' bytes seem to have no relation to anything, then the next 'y' bytes are P, the next two bytes are static, and the next 'z' bytes are G, the remainder are variable.
The process is to encrypt a file and send it to a device, which will in turn decrypt it. My ideas for an attack are these:
Try to duplicate the shared secret key. That works as long as I know my generated private key, but I don't have it for the files he's given me.
Try to find the recipient's private key. Here, I could brute force my way in, but that would take forever unless I had some sort of supercomputer.
Are there any other options to look at when trying to attack this?
I probably should keep my mouth shut, but it is also an opportunity for those interested in Diffie-Hellman to learn something:
A simple implementation of Diffie-Hellman to generate the shared key is vulnerable to man-in-the-middle attacks. However, most implementations of DH tackle this issue properly by adding authentication between Alice and Bob.
If your implementation of DH allows declaring a new set of PQG, you could request the other peer to use a new weak set. If Bob does not verify the quality of this set, then it is vulnerable to attacks.
DH requires Alice to send X = g^x mod p. If Bob does not check the quality of X, he is vulnerable, since the space of possible values of the secret key can be significantly reduced by Eve in the middle.
If your implementation does not remember compromised keys, they can be re-used by Eve.
If your implementation does not remember compromised certificates, they can be re-used by Eve.
If your implementation does not check certificates, Eve will have fun for sure.
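The point about checking the quality of X can be illustrated with a toy example in Python. The parameters below (p = 23, g = 5, private keys 6 and 15) are tiny textbook-style values chosen so the arithmetic is easy to follow; they are assumptions for this sketch and have nothing to do with the implementation in the question. The helper names are hypothetical as well.

```python
# Toy parameters for illustration only; real deployments use far larger primes.
P = 23
G = 5


def is_acceptable_public_value(x: int, p: int) -> bool:
    """Reject the degenerate values 0, 1 and p-1, and anything out of range."""
    return 1 < x < p - 1


def shared_secret(peer_public: int, own_private: int, p: int) -> int:
    """Compute the shared key, refusing degenerate peer values."""
    if not is_acceptable_public_value(peer_public, p):
        raise ValueError("rejected degenerate Diffie-Hellman public value")
    return pow(peer_public, own_private, p)


alice_private, bob_private = 6, 15          # fixed toy private keys
alice_public = pow(G, alice_private, P)     # X = g^x mod p
bob_public = pow(G, bob_private, P)

# Honest exchange: both sides derive the same value.
alice_key = shared_secret(bob_public, alice_private, P)
bob_key = shared_secret(alice_public, bob_private, P)

# If Eve substitutes X = p-1 and Bob does not check it, Bob's "secret"
# can only ever be 1 or p-1, whatever his private key is.
forced_keys = {pow(P - 1, b, P) for b in range(2, P - 1)}
```

This range check only covers the most obvious degenerate values. A careful implementation would typically also check that X lies in the intended subgroup, which depends on how P and G were chosen.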