Validate the unique string generated in the system - java

I'm generating a unique ID for each request in the web application (the ID is generated by the framework, so I must use only this ID, and the framework offers no API to check whether a given ID was generated by it; thanks RichieHH for pointing this out). This ID can be presented back as part of another request to the system. Currently I store these generated unique IDs in the database, and for every request a DB query is issued to check whether the ID already exists (this is how the validation is done today). If I have to validate that the ID sent in a request was generated by the application without using persistent storage, which approach should I follow?
My initial approach was to generate IDs that add up to a particular sum after hashing, but that pattern could be identified by observing enough IDs.
It would be great if someone could suggest an approach that validates, within the application, that a unique ID was generated by it. Thanks.

Use a UUID, which is the pretty standard solution for this task. You don't need to validate a UUID; you can assume that it is always unique.
Alternatively, you can use ServerName+Timestamp+some extra data. That can be more convenient for debugging, but it is less secure.
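A minimal sketch of the UUID approach in Java (the class and method names here are illustrative, not from any framework):

```java
import java.util.UUID;

// Sketch: generate a random (version 4) UUID per request.
// UUID.randomUUID() is backed by a cryptographically strong PRNG,
// so collisions are astronomically unlikely and no DB lookup is needed.
public class RequestIds {
    public static String newRequestId() {
        // canonical form, e.g. "6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8"
        return UUID.randomUUID().toString();
    }
}
```

The canonical string form is always 36 characters (32 hex digits plus 4 hyphens).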

Related

JavaEE application create a unique key using java.util.UUID

I want to build a Java EE application where users can register and then confirm their email address by clicking a link they receive by email after filling in their data in the registration form (name, mail, ...).
To do that I am going to generate a long, unique key with java.util.UUID, store it in a database, and then send the user an email with that key as part of the URL (example: www.mysite.com/account.xhtml?id=KEY). When the user clicks the link, I extract the key from the URL and check whether it is stored in the DB. If it is, the user's registration is completed.
My question is: when creating that key with java.util.UUID, how can I know that it is unique? Should I check whether an equal key already exists in the DB and, if so, create a new one until the key is unique?
What's the chance that a randomly-generated 128-bit integer will be equal to another randomly-generated integer?
If you just need peace of mind, use a primary key and if the insert fails due to a key collision, re-create a new UUID and retry the insert.
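The insert-and-retry idea can be sketched like this; an in-memory set stands in for the database's primary-key constraint, and all names are illustrative:

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: treat Set.add() as a stand-in for an INSERT that fails on a
// primary-key violation. In a real application "issuedKeys" would be a
// database table with the key column declared PRIMARY KEY.
public class KeyAllocator {
    private final Set<String> issuedKeys = ConcurrentHashMap.newKeySet();

    public String allocateKey() {
        while (true) {
            String candidate = UUID.randomUUID().toString();
            if (issuedKeys.add(candidate)) { // "insert" succeeded: key is unused
                return candidate;
            }
            // extremely unlikely collision: loop and try a fresh UUID
        }
    }
}
```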
There are a couple of ways you can work with UUIDs in Java.
From Java 5 onwards, the better practice is to use java.util.UUID. Its string representation is 36 characters long. This link gives you a simple example.
This discussion will give you the answer to your question. It is very strong; I have never come across anyone complaining about its uniqueness.
But if you are adding it to a DB, keeping it in storage, or sending it over the network, size may matter. So converting it to another base (Base64, Base85, etc.) is a good solution; please check the discussion here. You can use the Apache library class org.apache.commons.codec.binary.Base64. Note that Base85 is not safe for URLs.
My recommendation: if you have many applications/session beans/web services (many interconnected applications, data transfers, etc.) creating UUIDs, I prefer to add a unique application-name prefix as well, like APP1, APP2, etc., and then encode to another base. So if the UUID is 6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8, it becomes APP1-6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8, and so on.
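A sketch of the shortening idea, assuming plain java.util.Base64 (available since Java 8) instead of the Apache Commons class; the class and method names are illustrative:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Sketch: shorten a UUID by Base64-encoding its 16 raw bytes.
// The URL-safe alphabet without padding yields 22 characters,
// versus 36 for the canonical hex form.
public class ShortUuid {
    public static String encode(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }

    // optional application-name prefix, as suggested above
    public static String withAppPrefix(String appName, UUID uuid) {
        return appName + "-" + encode(uuid);
    }
}
```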
Though it is off topic here: when you use a URL like www.mysite.com/account.xhtml?id=KEY, beware of SQL injection attacks.

Generate truly globally unique id across many clients and servers

Summary
Truly globally unique IDs in Flash and/or JavaScript clients. Can I do this with an RNG available in current browsers/Flash, or must I build a composite ID with server-side randomness?
Details
I need to generate globally unique identifiers for objects. I have multiple server-side "systems" written in java that need to be able to exchange ids; each of these systems also has a set of flex/javascript clients that actually generate the IDs for new objects. I need to guarantee global uniqueness across the set of unrelated systems; for instance I need to be able to merge/sync the databases of two independent systems. I must guarantee that there is never a collision between these ids and that I never need to change the id of an object once created. I need to be able to generate ids in flash and javascript clients without contacting the server for every id. A solution that relies on some server provided seed or system id is fine as long as the server isn't contacted too often. A solution that works completely disconnected is preferable. Similarly a solution that requires no upfront registration of systems is preferable to one that relies on a central authority (like the OUI in a MAC address).
I know the obvious solution is "use a UUID generator," such as UIDUtil in flash. This function specifically disclaims global uniqueness. In general, I'm concerned about relying on a PRNG to guarantee global uniqueness.
Proposed solutions
Rely entirely on a secure random number generator in the client.
Flash 11+ has flash.crypto.generateRandomBytes; Javascript has window.crypto but it's pretty new and not supported in IE. There are solutions like sjcl that use the mouse to add entropy.
I understand that given a perfect RNG the possibility of collision for a random UID with 2^122 possibilities is meteorite-tiny, but I'm worried that I won't actually get this degree of randomness in a JavaScript or Flash client. I'm further concerned that the typical use case for even a cryptographic RNG is different from mine: for session keys etc., collisions are acceptable as long as they are unpredictable by an attacker. In my case, collisions are completely unacceptable. Should I really rely on the raw output of a secure RNG for a unique ID?
Generate a composite ID that includes system, session and object IDs.
An obvious implementation would be to create a system UUID at server install time, keep a per-client-login session id (eg in a database), and then send the system and session ids to the client which would keep a per-session counter. The uid would be the triple: system ID, session ID, client counter.
I could imagine directly concatenating these or hashing them with a cryptographic hash. I'm concerned that the hashing itself may potentially introduce collisions, particularly if the input to the hash is about the same size as the output. But the hash would obscure the system id and counters which could leak information.
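The system ID/session ID/counter triple can be sketched like this; the ':' separator is an arbitrary choice and all names are illustrative:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the composite-ID idea: a per-install system ID, a per-login
// session ID handed out by the server, and a client-side counter.
public class CompositeIdGenerator {
    private final String systemId;   // generated once at server install time
    private final String sessionId;  // issued by the server per client login
    private final AtomicLong counter = new AtomicLong();

    public CompositeIdGenerator(String systemId, String sessionId) {
        this.systemId = systemId;
        this.sessionId = sessionId;
    }

    public String nextId() {
        // unique as long as (systemId, sessionId) pairs are never reused
        return systemId + ":" + sessionId + ":" + counter.incrementAndGet();
    }
}
```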
Instead of generating the system ID at install time, another solution would be to have a central registry that hands out unique system IDs, similar to what DOI does. This requires more coordination, but I guess it is the only way to really guarantee global uniqueness.
Key questions
Random or composite based?
Include system ID?
If system id: generate a random system ID or use a central registry?
Include timestamp or some other nonce?
To hash or not to hash?
The simplest answer is to use a server assigned client ID which is incremented for each client, and a value on each client which is incremented for each fragment on that client. The pair of client ID and fragment ID becomes the globally unique ID for that piece of content.
Another simple approach is to generate a set of unique IDs (say 2k at a time) on the server and send them in a batch to each client. When the client runs out of IDs it contacts the server for more.
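The batch approach can be sketched as follows; RangeSource is a hypothetical callback standing in for the server round-trip, and all names are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: the server hands each client a contiguous range of IDs; the
// client consumes them locally and asks for a new range when it runs out.
public class IdBatchClient {
    // hypothetical server call; returns {startInclusive, endInclusive}
    public interface RangeSource { long[] nextRange(int size); }

    private final RangeSource server;
    private final int batchSize;
    private final Deque<Long> pool = new ArrayDeque<>();

    public IdBatchClient(RangeSource server, int batchSize) {
        this.server = server;
        this.batchSize = batchSize;
    }

    public long nextId() {
        if (pool.isEmpty()) { // only now do we contact the server
            long[] range = server.nextRange(batchSize);
            for (long id = range[0]; id <= range[1]; id++) pool.add(id);
        }
        return pool.poll();
    }
}
```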
Client IDs should be stored in a central repository accessible to all the servers.
It may help to look at methods for distributed hashing, which is used to uniquely identify and locate fragments within a peer-to-peer environment. This may be overkill, considering you have a server which can intervene to assert uniqueness.
To answer your questions you need to determine the benefit that the added complexity of a system ID, nonce or hash would bring.
System ID:
A system ID would typically be used to uniquely identify the system within a domain. So if you don't care who the user is, or how many sessions are open, but only want to make sure you know who the device is, then use a system ID. This is usually less useful in a user-centric environment such as JavaScript or Flash, where the user or session may be relevant.
Nonce:
A nonce/salt/random seed would be used to obfuscate or otherwise scramble the ID. This is important when you don't want others to be able to guess the original value of the ID. If this is necessary then it may be better to encrypt the ID with a private encryption key, and pass a public decryption key to each consumer who needs to read the ID.
Timestamp: Considering the variability of the client's clock (ie you cannot guarantee it adheres to any time or time zone), a timestamp would need to be treated as a pseudo-random value for this application.
Hash: While hashes are often (ab)used to create unique keys, their real purpose is to map a large (possibly infinite) domain to a smaller, more manageable one. For example, MD5 is typically used to generate a unique ID from a timestamp, random number, and/or nonce data. What is actually happening is that the MD5 function is mapping an infinite range of data into a space of 2^128 possibilities. While this is a massive space, it is not infinite, so logic tells you that there will be (even if only in theory) the same hash assigned to two different fragments. On the other hand perfect hashing attempts to assign a unique identifier to each piece of data, however this is entirely unnecessary if you just assign a unique identifier to each client fragment to start with.
Something quick and dirty and also may not work out for your use case --
Use Java's UUID and couple it with something else, say a clientName.
This should solve the multiple client and multiple server issue.
The rationale behind this is that the possibility of getting two calls in the same nanosecond is low; refer to the links provided below. By coupling the clientName with the UUID you ensure unique IDs across clients, which leaves only the case of the same client calling twice within the same nanosecond.
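A minimal sketch of the clientName + UUID coupling (the class name is illustrative):

```java
import java.util.UUID;

// Sketch: prefix a random UUID with the client's name, so IDs are
// disambiguated across clients even in the (theoretical) event that
// two clients draw the same UUID.
public class ClientScopedId {
    public static String newId(String clientName) {
        return clientName + "-" + UUID.randomUUID();
    }
}
```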
You could write a java module to generate the IDs and then get Flash to talk to this module.
For your reference, you could refer to --
Is unique id generation using UUID really unique?
Getting java and flash to talk to each other
A middle ground builds on #ping's answer:
Use client name, high-resolution time, and optionally some other pseudo-random seed
Hash the data to produce the UID (or, just go directly to using UUIDs)
Log the resulting to a central server for entry into a database
Treat any collision as a prominently flagged bug, rather than as a situation that deserves special code.
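Steps 1-2 above can be sketched like this; SHA-256 is chosen as an assumption (the answer leaves the exact hash open), and all names are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Sketch: hash client name + high-resolution time + a random seed into a
// hex UID, which would then be logged to a central server for collision
// detection as described above.
public class HashedUid {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String newUid(String clientName) {
        try {
            byte[] seed = new byte[16];
            RANDOM.nextBytes(seed);
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update((clientName + ":" + System.nanoTime())
                    .getBytes(StandardCharsets.UTF_8));
            md.update(seed);
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            return hex.toString(); // 64 hex characters
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always present
        }
    }
}
```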
With a UUID or a reasonably long hash, the chances of a duplicate are nil. So either:
A) You'll get no duplicates for the life of the application, life is good.
B) You'll see a duplicate, or maybe two (freaky!), over a few decades. Intervene manually to deal with those cases; if you're running servers with your client, you can afford it.
C) If you get a third collision, then there is something fundamentally wrong with the code, and this can be investigated and measures taken to avoid a repetition.
This way, the ID is generated at the client, contact with the server is one-way and operationally non-critical, the seeds don't have to be random, the hashing obscures the origins of the ID and so avoids constructed collisions, and you can be confident that there have been no collisions. (If you test that collision-detection code!) Even UUIDs could be plenty adequate in this scenario.
The only way hashing increases the likelihood of collisions is if your information content in the original seed information approaches the size of the hash. That's extremely unlikely, but if true and you're still thinking about micrometeorites, just increase the size of the hashed value.
My two cents: each server locks a DB table, gets an ID from it, and increments it. This will be the server's unique ID.
Each connecting client gets a unique identifier issued by the server, coupled with that server ID. The client identifier has to be unique per server, but another server might issue the same identifier to a different client.
Finally, each client will generate a unique id for each request.
Coupling all three will guarantee a true unique global id over the entire system, the final id will look something like:
[server id][client id][request id]
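That layout can be sketched with fixed-width, zero-padded fields (the widths are an arbitrary assumption) so the composite sorts and parses unambiguously:

```java
// Sketch of the [server id][client id][request id] layout:
// 4 digits of server ID, 6 of client ID, 10 of request counter.
public class TripleId {
    public static String format(long serverId, long clientId, long requestId) {
        return String.format("%04d%06d%010d", serverId, clientId, requestId);
    }
}
```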

Are there techniques to prevent double submissions in stateless web applications?

I want to implement double submission prevention in an existing java web application (struts actually). Architecture wise we are talking about 2 to N possible application servers (tomcat) and one single database server (mysql). The individual servers do not know each other and are not able to exchange messages. In front of the application servers there is a single load balancer which has the ability to do sticky sessions.
So basically there are two kinds of double submission prevention: client-side and server-side. If possible I want to go server-side, because all client-side techniques seem to fail if people disable cookies and/or JavaScript in their browsers.
This leaves me with the idea of doing some kind of mutex-like synchronisation via database locks. I think it may be possible to calculate a checksum of the user entered data and persisting it to a dedicated database table. On each submit the application would have to check for presence of an equal checksum which would indicate that the given submission is a duplicate. Of course the checksums in this table have to be cleared periodically. The problem is the whole process of checking whether there is a duplicate checksum already in the database and inserting the checksum if there is none is pretty much a critical section. Therefore the checksum table has to be locked beforehand and unlocked again after the section.
My deadlock and bottleneck alarm bells start to ring when I think about table locks. So my question is: are there saner ways to prevent double submissions in stateless web applications?
Please note that the struts TokenInterceptor can not be applied here because it fails miserably when cookies are disabled (it relies on the HTTP session which simply isn't present without session cookies).
A simpler DB-based solution would be something like this. It can be made generic across multiple forms as well.
Have a database table that can be used to store tokens.
When a new form is displayed, insert a new row into the token table and add the token as a hidden field in the form.
When you get a form submit, do a SELECT ... FOR UPDATE on the row corresponding to the token you received as part of the form.
If the row still exists, this is the first submit: process the submit and delete the row.
If the row doesn't exist, the form has already been processed and you can return an error.
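The flow can be sketched like this; a concurrent in-memory set stands in for the token table, and in production the check-and-delete would be the SELECT ... FOR UPDATE plus DELETE described above so it stays atomic across servers:

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the consume-once token flow for double-submit prevention.
public class SubmitTokens {
    private final Set<String> openTokens = ConcurrentHashMap.newKeySet();

    // called when rendering the form: "insert the row",
    // return the token for the hidden field
    public String issueToken() {
        String token = UUID.randomUUID().toString();
        openTokens.add(token);
        return token;
    }

    // called on submit: atomically check-and-delete; only the first
    // submit with a given token succeeds
    public boolean consume(String token) {
        return openTokens.remove(token); // false => duplicate submission
    }
}
```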
The classic technique to prevent double submissions is to assign two IDs (both as "hidden" field in HTML Form tag) - one "session-ID" which stays the same from login to logout...
The second ID changes with every submission... server-side you only need to keep track of the "current valid ID" (session-specific)... if you get a "re-submission" (by click-happy-user or a "refresh-button" or a "back-button" or...) then that wouldn't match the current ID... this way you know: this submission should be discarded and a new ID is generated and sent back with the answer.
Some implementations use an ID that is incremented on every submission, which eases the check/keep-track part a bit, but that could be vulnerable to guessing (a security concern)...
I like to generate cryptographically strong IDs for this kind of protection...
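A sketch of the two-ID scheme, keeping the one currently valid submission ID per session in memory; the IDs come from SecureRandom, as recommended above, and all names are illustrative:

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: per session, track the current submission ID. A submit carrying
// the current ID is accepted and the ID rotates; anything else (re-submit,
// back button, refresh) is rejected.
public class SubmissionGuard {
    private static final SecureRandom RANDOM = new SecureRandom();
    private final Map<String, String> currentIdBySession = new ConcurrentHashMap<>();

    private static String freshId() {
        byte[] bytes = new byte[16]; // 128 bits, not guessable
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // called when rendering the form: issue the ID for the hidden field
    public String register(String sessionId) {
        String id = freshId();
        currentIdBySession.put(sessionId, id);
        return id;
    }

    // called on submit: accept only the current ID, then rotate it
    public boolean accept(String sessionId, String submittedId) {
        boolean ok = submittedId != null
                && submittedId.equals(currentIdBySession.get(sessionId));
        if (ok) currentIdBySession.put(sessionId, freshId());
        return ok;
    }
}
```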
If you have a load-balanced environment with sticky sessions, then you only need to keep track of the ID on the server itself (in memory)... but you can certainly store the ID in the DB... since you store it together with the session ID, the lock would be at row level (not table level), which should be OK.
The way you described goes one step further by examining the content... BUT I see the content part more on the "application logic" level than on the "re-submission prevention level" since it depends on the app logic whether it wants to accepts the same data again...
If you work with sticky sessions, you would be fine with some token management. There exists a DoubleClickFilter which you can add to your web.xml.
Since you have sticky sessions there is no need for a Cross-Tomcat-Solution.

Designing Unique Keys (Primary Keys) for a heavily denormalized NoSQL database

I am working on a web application related to Discussion forums using Java and Cassandra database.
I need to construct 'keys' for the rows storing the users' details and for another set of rows storing the content posted by the users.
One option is to use the randomly generated UUID provided by the Java language, but UUIDs are 16 bytes long, and since a NoSQL database involves heavy denormalization, I am concerned that I would be wasting lots of disk space, RAM, and other resources if the key could be generated in a smaller size.
I need to generate two types of keys, one for the Users & other for Content Posted by Users.
For the content posted by users, would timestamp+userId be a good key, where timestamp is the server time at which the content was posted and userId refers to the key of the user's row?
Any suggestions, comments appreciated ..
Thanks
Marcos
Is this a distributed application?
Then you could use a simple synchronized counter and initialize it on startup with the next available id.
On the other hand a database should be able to handle the UUID hashes as created by java.
This is a standard for creating things like sessionIds, that need to be unique.
Your problem is somewhat similar since a session in your context would represent a set of user input.

What's the best way to secure a query string with Java?

When a user signs up in our Struts application, we want to send them an email that includes a link to a different page. The link needs to include a unique identifier in its query string so the destination page can identify the user and react accordingly.
To improve the security of this system, I'd like to first encrypt the query string containing the identifier and second set the link to expire--after it's been used and/or after a few days.
What Java technologies/methods would you suggest I use to do this?
I'm going to make some assumptions about your concerns:
A user should not be able to guess another user's URL.
Once used, a URL should not be reusable (avoiding session replay attacks.)
Whether used or not, a URL shouldn't live forever, thus avoiding brute-force probing.
Here's how I'd do it.
Keep the user's ID and the expiration timestamp in a table.
Concatenate these into a string, then make an SHA-1 hash out of it.
Generate a URL using the SHA-1 hash value. Map all such URLs to a servlet that will do the validation.
When someone sends you a request for a page with the hash, use it to look up the user and expiration.
After the user has done whatever the landing page is supposed to do, mark the row in the database as "used".
Run a job every day to purge rows that are either used or past their expiration date.
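The hashing step above can be sketched like this (the field names are illustrative; the hash serves only as an opaque lookup key, while the real user/expiry data stays in the database row):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: hash the user ID and expiration timestamp into the token
// that goes into the emailed URL.
public class LinkToken {
    public static String tokenFor(long userId, long expiresAtEpochMillis) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(
                    (userId + ":" + expiresAtEpochMillis)
                            .getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString(); // 40 hex characters
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is always present
        }
    }
}
```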
For the first part have a look at Generating Private, Unique, Secure URLs. For the expiration, you simply need to store the unique key creation timestamp in the database and only allow your action to execute when for example now-keyCreatedAt<3 days. Another way is to have a cron or Quartz job periodically delete those rows which evaluate true for "now-keyCreatedAt<3 days".
I think you can do this in a stateless way, ie without the database table others are suggesting.
As mtnygard suggests, make a SHA-1 hash of the URL parameters AND a secret salt string.
Add the hash value as a required parameter on the URL.
Send the URL in the email.
When the user click on the URL:
Verify the integrity of the URL by calculating the hash again, and comparing the calculated value to the one on the URL.
As long as you never divulge your secret salt string, no one will be able to forge requests to the system. However, unlike the other proposals, this one does not prevent replaying an old URL. That may or may not be desirable, depending on your situation.
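The stateless signing scheme can be sketched like this; HMAC-SHA256 is substituted here for the bare salted SHA-1, since HMAC is the standard construction for keyed signatures, and all names are illustrative:

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch: sign the URL parameters with a server-side secret when building
// the link; verify the signature when the link comes back. No database row
// is needed, but replays are not prevented.
public class SignedUrl {
    public static String sign(String params, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] sig = mac.doFinal(params.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : sig) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static boolean verify(String params, String signature, byte[] secret) {
        // a constant-time comparison would be preferable in production
        return sign(params, secret).equals(signature);
    }
}
```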
