How to generate a DomainKeys (not DKIM) signature? - java

I am using DKIM for JavaMail to sign outgoing mails with DKIM.
Now, I would like to add a DomainKey-Signature. From reading through the docs, specs and other related posts I know that the signing process is almost identical (using the same algorithm, DNS entries, etc.).
The only difference is that DKIM offers more options, e.g. in choosing which header fields to sign. That makes it easy to select the signing fields (e.g. From, Subject) and generate the right hash values.
For DomainKeys I could not figure out which mail parts to hash though. I read the docs but it is not clearly stated if you should only hash the body or the entire source code.
On a different website it says:
"DomainKeys uses the 'From' and 'Sender' headers, as well as the message body, in combination with the private key to generate a DomainKeys signature."
That makes sense - but what does it mean for my other header fields (e.g. Date, Message-ID) and what is meant by message body?
So my overall question is:
What input (mail parts) do I use to generate the DomainKey hash?

To find out which header fields are signed by "DKIM for JavaMail", have a look at the source file DKIMSigner.java; they are specified in the array String[] defaultHeadersToSign.
Body means the message body itself; the stripped-down, simplified structure of an email is: header fields + one empty line + body.
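To illustrate that structure, a minimal sketch (this is not the DKIM-for-JavaMail API, just the split described above; the sample message is made up):
String rawMessage = "From: a@example.com\r\nSubject: Hi\r\n\r\nHello world\r\n";
int separator = rawMessage.indexOf("\r\n\r\n"); // the first empty line
String headerBlock = rawMessage.substring(0, separator);
String body = rawMessage.substring(separator + 4); // the part DomainKeys calls the body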

There is no need to use the deprecated DomainKeys anymore if you are already using DKIM.
You may want to have a look at this implementation: http://www.badpenguin.co.uk/dkim/

Related

How to get reproducible Pbkdf2PasswordEncoder output in spring boot?

When running the encode method of a Spring Security Pbkdf2PasswordEncoder instance multiple times, the method returns different results for the same inputs. The snippet
String salt = "salt";
int iterations = 100000;
int hashWidth = 128;
String clearTextPassword = "secret_password";
Pbkdf2PasswordEncoder pbkdf2PasswordEncoder = new Pbkdf2PasswordEncoder(salt, iterations, hashWidth);
String derivedKey = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey: " + derivedKey);
String derivedKey2 = pbkdf2PasswordEncoder.encode(clearTextPassword);
System.out.println("derivedKey2: " + derivedKey2);
results in output like
derivedKey: b6eb7098ee52cbc4c99c4316be0343873575ed4fa4445144
derivedKey2: 2bef620cc0392f9a5064c0d07d182ca826b6c2b83ac648dc
The expected output would be the same values for both derivations. In addition, when running the application another time, the outputs are different again. The different output behavior also appears for two different Pbkdf2PasswordEncoder instances with the same inputs. The encode method behaves more like a random number generator. The Spring Boot version used is 2.6.1, the spring-security-core version is 5.6.0.
Is there any obvious setting that I am missing? The documentation does not give additional hints. Is there a conceptual error in the spring boot project set up?
Is there any obvious setting that I am missing?
Yes. The documentation you linked to is fairly clear; I guess you missed it. That string you pass to the Pbkdf2PasswordEncoder constructor is not a salt!
The encoder generates a salt for you, and it generates a new salt every time you ask it to encode something, which is how you're supposed to do this stuff [1]. (The returned string contains both this randomly generated salt and the result of applying the encoding.) Because a new salt is made every time you call .encode, the call returns a different value each time, even with the same inputs.
The string you pass in is merely 'another secret' - which can sometimes be useful (for example, if you can store this secret in a secure enclave, or it is sent by another system / entered upon boot and never stored on disk, then if somebody runs off with your server they can't check passwords. PBKDF means that if they did have the secret the checking will be very slow, but if they don't, they can't even start).
This seems like a solid plan - otherwise people start doing silly things. Such as using the string "salt" as the salt for all encodes :)
The real problem is:
The expected output would be the same values for both derivations
No. Your expectation is broken. Whatever code you are writing that made this assumption needs to be tossed. For example, this is how you are intended to use the encoder:
When a user creates a new password, you use .encode and store what this method returns in a database.
When a user logs in, you take what they typed, and you take the string from your database (the one .encode sent you) and call .matches.
It sounds like you want to run .encode again and see if it matches. That is not how you're supposed to use this code.
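To make the intended flow concrete, here is a sketch against the Spring Security 5.6 API; "app-level-secret" is a placeholder for whatever extra secret you configure (it is not a salt):
import org.springframework.security.crypto.password.Pbkdf2PasswordEncoder;

Pbkdf2PasswordEncoder encoder = new Pbkdf2PasswordEncoder("app-level-secret", 100000, 128);
// At registration: store the returned string (it embeds the random salt) in your database.
String stored = encoder.encode("secret_password");
// At login: compare what the user typed against the stored string.
boolean ok = encoder.matches("secret_password", stored); // true
// Encoding again yields a different string, yet matches still succeeds:
String stored2 = encoder.encode("secret_password"); // differs from stored
boolean ok2 = encoder.matches("secret_password", stored2); // also true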
Footnote [1]: the why
You also need to review your security policies. The idea you have in your head of how this stuff works is thoroughly broken. Imagine it worked like you wanted, with a single salt used for all password encodes. Then if you hand me a dump of your database, I can trivially crack about 5% of all accounts within about 10 minutes!
How? Well, I sort all hashed strings and then count occurrences. There will be a bunch of duplicate strings inside. I can then take all users whose password hash is in the top 10 of most common hashes and log in as them, because their password is iloveyou, welcome123, princess, dragon, 12345678, alexsawesomeservice!, etcetera: the usual crowd of extremely oft-used passwords. How do I know that's their password? Because their password is the same as that of many other users on your system.
Furthermore, if none of the common passwords work, I can still tell that accounts sharing a hash likely belong to the same person.
These are all things that I definitely should not be able to derive from the raw data. The solution is, naturally, to have a unique salt for every password, and to store the salt in the DB along with the hash value so that the check can be 'reconstructed' when a user tries to log in. These tools try to make your life easy by doing that work for you. This is a good idea, because errors in security implementations (such as forgetting to salt, or using the same salt for all users) are not (easily) unit testable: a well-meaning developer writes code, it seems to work, a casual glance at the password hashes seems to indicate it is working (the hashes look random enough to the naked eye), and then it gets deployed, security issue and all.

Comparing if two lists of strings are equal using hashcode?

I am writing a Java/JEE client-server application. I have a requirement where the files present on the server should match the files present on the client. I am only trying to validate whether there is an exact match in file names and the number of files in a specific directory.
Example of what is required:
Server
  DirectoryA
    FileA
    FileB
    FileC
Client
  DirectoryA
    FileA
    FileB
    FileC
What would be the most efficient way for the server to make sure that all clients have the same files, assuming I can have over 100 clients and I do not want my client/server communication to be too chatty?
Here is my current approach, using a REST API and REST client:
Server:
Find list of files in the target directory
Create a checksum for the directory by making use of the hashcodes derived from the file names and summing with 31.
Clients:
Upon receiving a request to verify the integrity of the target directory, the client takes the checksum provided by the server and runs the same algorithm to generate a checksum of its local directory.
If the checksums match, the client responds to the server with success.
Is this approach correct?
Is this approach correct?
The approach is correct, but the proposed implementation is not (IMO).
I assume that "summing with 31" means something like this
int hash = 0;
for (String name : names) {
    hash = hash * 31 + name.hashCode();
}
Java hashcode values are 32 bit quantities. If we assume that the filenames are distributed uniformly, that means that there is a chance of 1 in 2^32 that two different sets of filenames will have the same hash (as calculated above). In other words, a "hash collision".
An algorithm that gets it wrong one time in 4 billion times is probably not acceptable. Worse still, if the algorithm is known, then someone can trivially manufacture a situation (i.e. a set of filenames) where the algorithm gives the wrong answer.
If you want to avoid these problems, you need longer checksums. If you want to protect against people manufacturing collisions, then you need to use a cryptographically strong hash / checksum. MD5 has historically been a popular choice, but its collision resistance is broken; SHA-256 is a safer option.
But if it were me, I would also consider just sending a complete list of filenames ... or using the (cheap) hashcode-based checksum as just a hint that the directory contents could be the same. (Whether the latter makes sense depends on what you need to do next.)
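If you go the stronger-checksum route, here is a sketch of what both sides could run (the directory path is whatever you pass in; note the separator byte, without which ["ab","c"] and ["a","bc"] would hash alike):
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

static String directoryChecksum(Path dir) throws Exception {
    List<String> names;
    try (Stream<Path> files = Files.list(dir)) {
        // Sort so server and client hash the names in the same order
        names = files.map(p -> p.getFileName().toString()).sorted().collect(Collectors.toList());
    }
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    for (String name : names) {
        md.update(name.getBytes(StandardCharsets.UTF_8));
        md.update((byte) 0); // separator between names
    }
    return Base64.getEncoder().encodeToString(md.digest());
}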

As400Text class vs MQC.MQGMO_CONVERT

Why do some people prefer to use an AS400Text object to handle EBCDIC/ASCII conversion (Java code with the IBM MQ jars) when we already have the MQC.MQGMO_CONVERT option to handle this?
My requirement is to convert ASCII->EBCDIC during the PUT operation, which I am doing by setting the character set to 37 and the write format to "STRING", and then to convert EBCDIC->ASCII automatically during the GET operation using the MQC.MQGMO_CONVERT option.
Is there any downfall to using the convert option? Could anyone please let me know if this is not a 100 percent safe option?
Best practice is to write the MQ message in your local code page (where the CCSID and Encoding will normally be filled in automatically with the correct values) and to set the Format field. The getter should then use MQGMO_CONVERT to request the message in the CCSID and Encoding they need.
Get with convert is safe, and will be correct so long as you provide the correct CCSID and Encoding describing the message when you put it.
In the description in your question, you convert from ASCII->EBCDIC before putting the message, and the getter then converts from EBCDIC->ASCII on the MQGET. This means you have paid for two data conversion operations when you could have done none (or, if two different ASCII code pages are involved, only one).
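For the getter side, a hedged sketch using the IBM MQ classes for Java (queue manager and queue names here are placeholders, and the snippet has not been run against a live queue manager):
import com.ibm.mq.*;

MQQueueManager qmgr = new MQQueueManager("QM1"); // placeholder name
MQQueue queue = qmgr.accessQueue("DEV.QUEUE.1", MQC.MQOO_INPUT_AS_Q_DEF); // placeholder

MQGetMessageOptions gmo = new MQGetMessageOptions();
gmo.options = MQC.MQGMO_WAIT | MQC.MQGMO_CONVERT; // let the queue manager convert

MQMessage msg = new MQMessage();
msg.characterSet = 437; // example value: the CCSID *you* want the data delivered in
queue.get(msg, gmo);
String text = msg.readString(msg.getMessageLength()); // assumes a single-byte CCSID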

Is it safe (in matter of uniqueness) to use UUID to generate a unique identifier for specific string?

String myText = "Moien";
String uuid = UUID.nameUUIDFromBytes(myText.getBytes()).toString(); // platform-default charset
I am using the above code to generate a representative for specific texts.
For example, 'Moien' should always be represented by "e9cad067-56f3-3ea9-98d2-26e25778c48f"; no change such as a project rebuild should be able to alter that UUID.
The reason I'm doing this is that I don't want those specific texts to be readable (understandable) by humans.
Note: I don't need the ability to regenerate the original text (e.g. "Moien") after hashing.
I also have an alternative way:
MessageDigest digest = MessageDigest.getInstance("SHA-256");
byte[] hash = digest.digest(myText.getBytes("UTF-8"));
String a = Base64.encode(hash);
Which do you think is better for my problem?
UUID.nameUUIDFromBytes appears to basically just be MD5 hashing, with the result being represented as a UUID.
It feels clearer to me to use a base64-encoded hash explicitly, partly as you can then control which hash gets used - which could be relevant if collisions pose any sort of security risk. (SHA-256 is likely a better option than MD5 for exactly that reason.) The string will be longer from SHA-256 of course, but hopefully that's not a problem.
Note that in either case, I'd convert the string to bytes using a fixed encoding via StandardCharsets. Don't use the platform default (as in your first snippet), and prefer StandardCharsets over magic string values (as in your second snippet).
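Putting that advice together, a sketch of the hash-based variant with a fixed charset (using java.util.Base64 rather than the unspecified Base64 class in the question):
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

MessageDigest digest = MessageDigest.getInstance("SHA-256");
byte[] hash = digest.digest(myText.getBytes(StandardCharsets.UTF_8));
String id = Base64.getEncoder().encodeToString(hash); // stable across rebuilds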

Query regarding Google Protocol Buffer

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}
The above is a snippet from addrbook.proto file mentioned in Google Protocol Buffer tutorials.
The requirement is that the application being developed will need to decode binary data received from a socket. For example, name, id and e-mail represented as binary data.
Now, id can be read as an integer. But I am really not sure how to read name and email considering that these two can be of variable lengths. (And unfortunately, I do not get lengths prefixed before these two fields)
The application is expected to read this kind of data from a variety of sources. Our objective is to make a decoder/adapter for the different types of data originating from these sources. There could also be different types of messages from the same source.
Thanks in advance
But I am really not sure how to read name and email considering that these two can be of variable lengths.
The entire point of a serializer such as protobuf is that you don't need to worry about that. Specifically, in the case of protobuf, strings are always prefixed by their length in bytes (the text is UTF-8 encoded, and the length itself is written as a varint).
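For illustration, a Person with name = "Bob" and id = 150 (using the .proto above) serializes to these bytes under the protobuf encoding rules:
0A 03 42 6F 62   field 1 (name), wire type 2: tag 0x0A, length 3, then "Bob" in UTF-8
10 96 01         field 2 (id), wire type 0: tag 0x10, then 150 as a varint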
And unfortunately, I do not get lengths prefixed before these two fields
Then you aren't processing protobuf data. Protobuf is a specific data format, in the same way that xml or json is a data format. Any conversation involving "protocol buffers" only makes sense if you are actually discussing the protobuf format, or data serialized using that format.
Protocol buffers is not an arbitrary data handling API. It will not allow you to process data in any format other than protobuf.
It sounds like you might be trying to re-implement Protocol Buffers by hand; you don't need to do that (though I'm sure it would be fun). Google provides C++, Java, and Python implementations to serialize and de-serialize content in protobuf format as part of the Protocol Buffers project.
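Assuming you have run protoc on the .proto above to generate a Java Person class (package and class names depend on your options), a sketch of both ends using the standard generated API:
// Sender: serialize
Person person = Person.newBuilder()
        .setName("Bob")
        .setId(150)
        .setEmail("bob@example.com")
        .build();
byte[] wire = person.toByteArray(); // length-prefixed fields, as described above

// Receiver: parse - no manual length handling needed
Person decoded = Person.parseFrom(wire);
System.out.println(decoded.getName() + " / " + decoded.getId());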
