how to compress uuid and timeuuid? - java

A uuid (universally unique identifier) and a timeuuid are both 128-bit values.
Because of the way Cassandra works, I use uuid and timeuuid as identifiers for our entities. Now I want to compress them, or otherwise reduce their size, because the client (user) can see the id in the URL bar.
For example, Twitter also used Cassandra, but when you open a Tweet, its id looks like 10153967535312713, whereas a plain uuid looks like 10646334-2c02-11e6-bb4a-7720eb141b83, which is longer and not user friendly (of course, neither id is really user friendly :D).
Many programming languages offer compression functions, such as gzcompress in PHP and GZIPOutputStream in Java, but these return binary compressed data, which cannot be used in a URL as-is!
So, is there any function or algorithm that produces a smaller or compressed version of a uuid or timeuuid?

Twitter developed Snowflake a long time ago, and I believe Twitter still uses this kind of id. There are now many flake id implementations available that generate a UUID-like number instead of a true UUID. I have used Flake Java in a project of mine.
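If the existing uuid values have to stay, a different option from switching to flake ids is to re-encode the same 128 bits in a denser, URL-safe alphabet. The sketch below is not compression in the GZIP sense and is only one possible approach: it packs the uuid into 16 bytes and encodes them with the URL-safe Base64 variant from java.util.Base64, which yields 22 characters instead of 36. The class name UuidShortener and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

class UuidShortener {
    // Pack the 128 bits into 16 bytes and encode them with the URL-safe Base64 alphabet.
    static String encode(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }

    // Reverse of encode: decode the 22 characters back into the original UUID.
    static UUID decode(String text) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getUrlDecoder().decode(text));
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("10646334-2c02-11e6-bb4a-7720eb141b83");
        String shortId = encode(id);
        System.out.println(shortId + " (" + shortId.length() + " characters)");
        System.out.println("round trip ok: " + decode(shortId).equals(id));
    }
}
```

The encoding is reversible, so the database can keep storing the native uuid type and only the URL carries the short form.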

Related

How to store UUIDs client side

For my app I have a server-side database in which I store users and their data. I am wondering how to keep track of which user has which UUID. I want to make sure that only the same user with their own unique UUID can access their data in the database.
What would be the best way to do this?
In your database, create a table where each row represents one particular user. That table would have their permanently assigned UUID, along with name, email, etc.
Some databases, such as Postgres and H2, support UUID as a built-in data type. Some databases provide a feature, perhaps as a plug-in, to generate a UUID value, for example the uuid-ossp plug-in for Postgres. If not, you can generate a UUID in Java.
When creating an account for a user, create a row in that table. When a user logs in, look up their credentials in this table, retrieving their previously assigned UUID.
During the execution of your app, keep that retrieved UUID in memory as a java.util.UUID object. In a web app built on Jakarta Servlet, a good place to keep their UUID would be as an attribute on the Session. See Binding Attributes into a Session in the spec. See HttpSession#setAttribute and getAttribute.
When you write rows in other tables that belong to a particular user, include a column for their UUID. Include their UUID as a criterion in your queries.
You might want to look into multitenancy as a topic.
After authenticating the user (via your favorite authentication process), add a set-cookie response header with the user id (or any other data you deem appropriate) as the value.
Don't forget to set the cookie properties httponly, secure, and samesite.

Implementing UUID for better performance

Requirement:
We need to assign a random, sufficiently long number or string to each merchant so that merchants are uniquely identifiable and nobody can guess the identifier. This is required because we print this number/string as a QR code and give it to merchants, so that our users can scan the QR code and get information about the merchant.
Current implementation:
We cannot print the ids because they are sequential, so we introduced a new field (externalId) to store a unique id generated by the TimeBasedGenerator of the JUG UUID generator. As I researched more, I found that there are performance issues with UUIDs. All the articles talk about UUIDs as primary keys; I am not using the UUID as a primary key but as a secondary identifier. I am not worried about inserts and updates, as they will be far fewer than searches. Users will send us the UUID read from the printed QR code and we will need to find the merchant by this UUID field, so this lookup will have a huge impact on performance.
Solution as I see it:
Instead of the UUID alone, we will hand out the combination ID:UUID, so that when we get a request we can split it, fetch the merchant by ID, and check whether the externalId (UUID) stored in our DB matches the one provided. If it matches we return the merchant, otherwise not.
Will this solution improve performance significantly, or will the time taken by the string operations cancel out the gain?
Is there any better approach to do this?
Thanks
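The ID:UUID idea described in the question can be sketched roughly as follows. This is an illustration only: the class MerchantLookup, its methods, and the in-memory map standing in for the merchant table are all assumptions, not part of the original system. The point is that the split and parse are a few in-memory string operations, while the lookup itself goes through the primary key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

class MerchantLookup {
    // Hypothetical stand-in for the merchant table: primary key -> externalId.
    private final Map<Long, UUID> externalIdByMerchantId = new HashMap<>();

    void register(long merchantId, UUID externalId) {
        externalIdByMerchantId.put(merchantId, externalId);
    }

    // Build the value that would be printed in the QR code, in the form ID:UUID.
    static String toToken(long merchantId, UUID externalId) {
        return merchantId + ":" + externalId;
    }

    // Returns the merchant id only if the token's UUID matches the stored externalId.
    Optional<Long> resolve(String token) {
        int sep = token.indexOf(':');
        if (sep < 0) {
            return Optional.empty();
        }
        try {
            long merchantId = Long.parseLong(token.substring(0, sep));
            UUID presented = UUID.fromString(token.substring(sep + 1));
            UUID stored = externalIdByMerchantId.get(merchantId);
            return presented.equals(stored) ? Optional.of(merchantId) : Optional.empty();
        } catch (IllegalArgumentException e) {
            // Covers both a malformed number and a malformed UUID.
            return Optional.empty();
        }
    }
}
```

One design consequence worth noting: the printed token now exposes the sequential id, although a request still cannot succeed without the matching UUID.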
UUIDs are generated in various flavors. As you are perhaps already aware, RFC 4122 outlines a variant (i.e. UUID schema) that defines five different versions. I suggest that you look into version 1 of RFC 4122.
Version 1 uses a MAC address of the machine generating the UUID as the last 12 hexadecimal digits (the node field) of the generated UUID.
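To see this field in practice, java.util.UUID exposes version() and node(); node() is only defined for version 1 UUIDs. The small sketch below reads the node from the sample timeuuid quoted earlier on this page. The class name V1Inspector is made up for illustration, and the node value is not guaranteed to be a real MAC address, since a generator may substitute a random value.

```java
import java.util.UUID;

class V1Inspector {
    // Returns the 48-bit node field as 12 hex digits; only defined for version 1 UUIDs.
    static String nodeHex(UUID uuid) {
        if (uuid.version() != 1) {
            throw new IllegalArgumentException("not a version 1 UUID: " + uuid);
        }
        return String.format("%012x", uuid.node());
    }

    public static void main(String[] args) {
        UUID sample = UUID.fromString("10646334-2c02-11e6-bb4a-7720eb141b83");
        System.out.println("version: " + sample.version()); // prints "version: 1"
        System.out.println("node:    " + nodeHex(sample));  // prints "node:    7720eb141b83"
    }
}
```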
Assume you go to a second-hand computer recycler and purchase an obsolete ethernet adaptor; power it up and get the MAC address of that adaptor; and then destroy the adaptor (shredder, shot-gun, acid, etc.). You can now be assured that no other computer in the universe will -ever- generate a V1 UUID using that MAC address.
Now you could write a UUID (RFC 4122, V1) generator that uses the harvested MAC address.
You could harvest a separate MAC address (from additional cards) for each merchant, and then you would be able to identify UUIDs generated for each.
BONUS: You might experiment with V1 UUIDs using Mahonri Moriancumer's UUID and GUID Generator and Forensics.
The best implementation would depend on your UUIDs' character distribution, but what I did when faced with a similar situation was to expose only the UUID to the end user. In the database I added a column filled with the first 2 characters of the UUID and indexed that column.
Either this or, as you proposed, the ID would serve the same purpose. The benefits of indexing a partial UUID are that
you can tweak how many characters to use to improve performance
the id is not used externally, where it could potentially cause issues.
On the other hand, an integer index probably uses less memory and needs no tweaking.
Performance-wise, in my implementation search queries for 1 row in a million were taking 7-8 seconds. With this solution they dropped to a few milliseconds.

JavaEE application create a unique key using java.util.UUID

I want to build a Java EE application where users can register and then confirm their email address by clicking a link in an email they receive after entering their data in the registration form (name, mail...).
To do that I am going to generate a long and unique key with java.util.UUID, store in a database and then send an email to the user with that key being part of the URL (Example: www.mysite.com/account.xhtml?id=KEY). Then the user will click the link, I extract the key from the URL and check if that key is stored in the DB. If it is, the user registration will be completed.
My question is, when creating that key with java.util.UUID, how can I know that it is a unique key? Should I check if there is another equal key in the DB and if so create a new one until the created key is unique?
What's the chance that a randomly-generated 128-bit integer will be equal to another randomly-generated integer?
If you just need peace of mind, use a primary key and if the insert fails due to a key collision, re-create a new UUID and retry the insert.
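The insert-and-retry idea can be sketched as below. UUID.randomUUID() produces a version 4 UUID with 122 random bits, so the retry branch is practically never taken; the loop only exists for peace of mind. The class ConfirmationKeys and the concurrent map standing in for a table with a primary key constraint are assumptions made for this illustration.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

class ConfirmationKeys {
    // Hypothetical stand-in for a table whose primary key is the confirmation key.
    private final Map<UUID, String> emailByKey = new ConcurrentHashMap<>();

    // Generate a random (version 4) UUID and retry if the "insert" hits an existing key.
    UUID issueKey(String email) {
        while (true) {
            UUID key = UUID.randomUUID();
            // putIfAbsent returns null only when the key was not already present.
            if (emailByKey.putIfAbsent(key, email) == null) {
                return key;
            }
        }
    }

    // Returns the email registered for the key and removes it, or null if the key is unknown.
    String confirm(UUID key) {
        return emailByKey.remove(key);
    }
}
```

With a real database, the putIfAbsent call would correspond to an INSERT that fails with a key violation, and the loop would catch that failure and try again with a fresh UUID.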
There are a couple of ways to generate a UUID in Java.
From Java 5 onwards, the better practice is to use java.util.UUID. Its string form is 36 characters long. This link gives you a simple example.
This discussion answers your question. The uniqueness guarantee is very strong; I have never come across anyone complaining about it.
But if you are adding it to a DB, using it in storage, or sending it over the network, size may matter. In that case, converting to another base (Base64, Base85, etc.) is a good solution. Please check this discussion here. You can use the Apache library class org.apache.commons.codec.binary.Base64. Base85 is not safe for URLs.
My recommendation: if you have many applications, session beans, or web services (with many interconnections to other applications, data transfers, etc.) creating UUIDs, prefix each UUID with a unique application name, like APP1, APP2, etc., and then encode it in the other base. If the UUID is 6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8, it becomes APP1-6fcb514b-b878-4c9d-95b7-8dc3a7ce6fd8, and so on.
Though it is off topic here: when you use a URL like www.mysite.com/account.xhtml?id=KEY, beware of SQL injection attacks.

Validate the unique string generated in the system

I'm generating a unique id for each request in the web application, and this id can be presented back as part of another request to the system. (The id is generated by the framework we use, so I have to use exactly this id, and the framework has no API to check whether an id was generated by it. Thanks RichieHH for pointing this out.) Currently I store the generated unique ids in the database, and for every request a DB query checks whether the id already exists; this is how validation is done at the moment. If I have to validate that the id sent in a request was generated by the application without using persistent storage, which approach should I follow?
My initial approach is to generate ids whose hash adds up to a particular sum, but this could be discovered by studying the patterns.
It would be great if someone could help me with an approach that validates the unique id within the application. Thanks.
Use a UUID, which is a pretty standard solution for this task. You don't need to check a UUID for uniqueness; in practice you can assume it is always unique.
You can use ServerName+Timestamp+some extra. It can be more advantageous for debugging, but it is less secure.

Designing Unique Keys(Primary Keys) for a heavily denormalized NoSQL database

I am working on a web application related to Discussion forums using Java and Cassandra database.
I need to construct 'keys' for the rows storing the users' details and for another set of rows storing the content posted by users.
One option is to use the randomly generated UUID provided by Java, but these are 16 bytes long. Since a NoSQL database involves heavy denormalization, I am concerned that I would be wasting a lot of disk space, RAM, and other resources if smaller keys could be generated instead.
I need to generate two types of keys, one for the users and the other for the content posted by users.
For the content posted by users, would timestamp+userId be a good key, where timestamp is the server time at which the content was posted and userId refers to the key of the user row?
Any suggestions or comments appreciated.
Thanks
Marcos
Is this a distributed application?
If not, you could use a simple synchronized counter and initialize it on startup with the next available id.
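A minimal sketch of such a counter, for a single JVM only, might look like this. It uses AtomicLong rather than a synchronized method, which gives the same thread safety for this purpose; the class name IdCounter is made up, and reading the next available id from the database at startup is left out as an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

class IdCounter {
    private final AtomicLong next;

    // nextAvailableId would be read from the database at startup (assumption).
    IdCounter(long nextAvailableId) {
        this.next = new AtomicLong(nextAvailableId);
    }

    // Returns the current value and advances the counter atomically.
    long nextId() {
        return next.getAndIncrement();
    }
}
```

An 8-byte long is half the size of a 16-byte UUID, but a counter like this does not stay unique once several servers hand out ids independently.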
On the other hand, a database should be able to handle the UUIDs created by Java.
They are a standard choice for things like session ids, which need to be unique.
Your problem is somewhat similar, since a session in your context would represent a set of user input.
