I've been wondering what's the correct practice for generating unique ids? The thing is in my web app I'll have a plugin system, when a user registers a plugin I want to generate a unique serial ID for it. I've been thinking about storing all numbers in a DB or a file on the server, generating a random number and checking whether it already exists in the DB/file, but that doesn't seem that good. Are there other ways to do it? Would using the UUID be the preferred way to go?
If the ids are user-facing, which it seems they are, then you want them to be difficult to guess. Use the built-in UUID class, which generates random ids for you and can format them nicely for you. Extract:
UUID idOne = UUID.randomUUID();
UUID idTwo = UUID.randomUUID();
log("UUID One: " + idOne);
log("UUID Two: " + idTwo);
Example output:
UUID One: 067e6162-3b6f-4ae2-a171-2470b63dff00
UUID Two: 54947df8-0e9e-4471-a2f9-9af509fb5889
There are other solutions in the link provided. I think it compares the methods quite well, so choose the one which best suits your needs.
Another interesting method is the one MongoDB uses, but this is possibly overkill for your needs:
A BSON ObjectID is a 12-byte value
consisting of a 4-byte timestamp
(seconds since epoch), a 3-byte
machine id, a 2-byte process id, and a
3-byte counter. Note that the
timestamp and counter fields must be
stored big endian unlike the rest of
BSON
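The layout quoted above can be sketched in Java roughly as follows. This is a minimal illustration, not MongoDB's driver code: the class name `ObjectIdSketch` is hypothetical, and the machine and process ids are random stand-ins chosen once per JVM, which is an assumption made to keep the example self-contained.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of an ObjectId-style generator; not MongoDB's actual implementation.
final class ObjectIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    // Assumption: random stand-ins for the 3-byte machine id and 2-byte process id.
    private static final int MACHINE_ID = RANDOM.nextInt(1 << 24);
    private static final int PROCESS_ID = RANDOM.nextInt(1 << 16);
    // 3-byte counter, started at a random value.
    private static final AtomicInteger COUNTER = new AtomicInteger(RANDOM.nextInt(1 << 24));

    /** Returns 12 bytes: 4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter. */
    static byte[] next() {
        int seconds = (int) (System.currentTimeMillis() / 1000L);
        int count = COUNTER.getAndIncrement() & 0xFFFFFF; // keep the low 3 bytes
        ByteBuffer buf = ByteBuffer.allocate(12);         // ByteBuffer is big-endian by default
        buf.putInt(seconds);
        buf.put((byte) (MACHINE_ID >> 16)).put((byte) (MACHINE_ID >> 8)).put((byte) MACHINE_ID);
        buf.putShort((short) PROCESS_ID);
        buf.put((byte) (count >> 16)).put((byte) (count >> 8)).put((byte) count);
        return buf.array();
    }

    /** Renders the id as lowercase hex, two characters per byte. */
    static String toHex(byte[] id) {
        StringBuilder sb = new StringBuilder(id.length * 2);
        for (byte b : id) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }
}
```

Calling `ObjectIdSketch.toHex(ObjectIdSketch.next())` yields a 24-character hex string. Because the timestamp comes first, ids created later sort after earlier ones at one-second granularity.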
If they weren't user facing, then you could just leave it to the database to do an auto-incrementing id: 1, 2, 3, etc.
Why not go for a static (or, in this case, a context-scoped) AtomicInteger that is incremented whenever a plugin is registered?
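A minimal sketch of that idea, assuming a single JVM. The class and method names are hypothetical, and the counter starts again from zero after a restart unless its value is persisted somewhere.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical generator; one instance would be shared by the whole application context.
final class PluginIdGenerator {
    private final AtomicInteger lastId = new AtomicInteger(0);

    /** Returns 1, 2, 3, ... and is safe to call from several threads at once. */
    int register() {
        return lastId.incrementAndGet();
    }
}
```

Each call to `register()` hands out the next serial number without any locking in the calling code.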
You could base it on user ID and timestamp, ensuring uniqueness but also providing a certain "readability"? E.g. User ID = 101, timestamp = 23 Dec 2010 13:54, you could give it ID:
201012231354101
or
101201012231354
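Both layouts can be produced with a few lines of Java. This is only a sketch: the class `ReadableId` and its method names are made up for illustration. With minute resolution, two registrations by the same user within the same minute would produce the same id.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that combines a user id with a minute-resolution timestamp.
final class ReadableId {
    private static final DateTimeFormatter MINUTE = DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    /** Timestamp first, then the user id. */
    static String timestampFirst(long userId, LocalDateTime when) {
        return when.format(MINUTE) + userId;
    }

    /** User id first, then the timestamp. */
    static String userFirst(long userId, LocalDateTime when) {
        return userId + when.format(MINUTE);
    }
}
```

For user 101 and `LocalDateTime.of(2010, 12, 23, 13, 54)`, the two methods return `201012231354101` and `101201012231354` respectively.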
The alternative, a UUID, is unique for all practical purposes (a random UUID can collide in theory, but the probability is negligible), but it takes up a lot of DB space and is quite unwieldy to work with.
A final idea is to ask your central DB for a unique ID, e.g. Oracle has sequences, or MySQL uses AUTO_INCREMENT fields, to assign unique integers.
I am trying to implement code that generates FHIR messages from some type of input message. When I create each FHIR resource, I need to create a resource logical id that is unique and can be regenerated repeatedly.
From Microsoft's FHIR-Converter GitHub repository, I found that they use SHA256 to hash the input string value to generate some type of 64 character id. I used the same approach to generate a UUID in Java. Here is the code from Microsoft FHIR-Converter in .NET:
public static string GenerateUUID(string input)
{
if (string.IsNullOrWhiteSpace(input))
{
return null;
}
var bytes = Encoding.UTF8.GetBytes(input);
var algorithm = SHA256.Create();
var hash = algorithm.ComputeHash(bytes);
var guid = new byte[16];
Array.Copy(hash, 0, guid, 0, 16);
return new Guid(guid).ToString();
}
It generates uuid like this: e40b96a6-e62e-a67e-3ac7-69a099830e1c
My questions are:
In order to repeatedly generate the same id, MUST the string input be the same as well? Meaning, if I have an input of 123, will it generate e40b96a6-e62e-a67e-3ac7-69a099830e1c every time?
If I HAVE to use a unique id in order to generate this uuid, what is the advantage of this extra step? If my input always has a unique id for each resource, can I just assign the id to be (Resource name)-(id)?
Is there a way to generate an id without having a unique id? I have some resources that do not have anything unique. Are there other techniques where I can generate a unique input that can be repeated on different platforms? I don't see how I can do this without providing a unique id from the input.
A given string will always generate the same id. A different string should generate a different id, though there's a very slim chance of two strings generating the same hash.
There are rules for the format of the id (only certain characters permitted, maximum length allowed), but other than that, no obvious benefit I can see. It's fine to use your 'native' identifier as the resource id. (That said, resource ids generally shouldn't be real-world identifiers like social security numbers, license numbers, etc. as that can leak protected information.)
The expectation in FHIR is that a unique resource id corresponds to a unique real-world object. If you don't have a real identifier on the object, there's a possibility you could have multiple instances that correspond to distinct real-world objects. E.g. multiple Practitioner instances where all you have is a name of "A. Smith" would not be appropriate to presume are always the same instance. If you have no 'identity', you might be better off using the 'contained' mechanism rather than generating an id just from the content.
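The determinism described in the first point can be seen in a Java sketch of the same hashing approach. The class name `DeterministicId` is hypothetical. Note one assumption that matters across platforms: .NET's `Guid(byte[])` constructor reads the first three fields in little-endian order, while this sketch reads the bytes in order, so the text it produces will likely differ from the .NET output for the same input unless those bytes are swapped.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical Java counterpart of the hashing approach shown in the question.
final class DeterministicId {
    /** Hashes the input with SHA-256 and formats the first 16 bytes of the hash as a UUID string. */
    static String generate(String input) {
        if (input == null || input.trim().isEmpty()) {
            return null;
        }
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            ByteBuffer buf = ByteBuffer.wrap(hash, 0, 16);
            long mostSignificant = buf.getLong();
            long leastSignificant = buf.getLong();
            return new UUID(mostSignificant, leastSignificant).toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `DeterministicId.generate("123")` twice returns the same string both times, while a different input such as `"124"` gives a different one.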
I have a Java service that generates a 16 digit unique number using the current time in the format yymmddhhmmssmsms. It handles multiple calls in the same millisecond using an AtomicLong.
But now the problem is that I need this service on multiple cloud machines. How can I handle calls at the same microsecond on different servers and generate a unique number for each of these calls?
And I don't want to use a database for this.
EDIT:
I understand UUID can be a solution. But a UUID is a random number every time, so it is not guaranteed to be unique, though the chances of collision are very low.
I think you can try using UUID.randomUUID().
The UUID v4 is the right choice for distributed systems.
The UUID v4 implementation uses random numbers as the source. The Java
implementation is SecureRandom – which uses an unpredictable value as
the seed to generate random numbers to reduce the chance of
collisions.
Source: https://www.baeldung.com/java-uuid
Secure Random: This class provides a cryptographically strong random number generator (RNG). Source: https://docs.oracle.com/javase/8/docs/api/java/security/SecureRandom.html
Example how to use the UUID:
UUID uuid = UUID.randomUUID();
Using a random number alone (such as a random UUID) should only be done if:
you have a way to check the number for uniqueness across all calls, or
you can tolerate the risk of generating the same number for different calls.
If you find it appropriate, try assigning each server a unique number (e.g., by requesting a unique number from a central database). Then each server can generate a unique ID for each call it makes by appending a random number to that unique number; this may work well because the server can now more easily check the IDs it then generates for uniqueness, as no further contact with a central database or other servers is required.
See also my section on generating unique random identifiers.
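The server-prefix idea above could look roughly like this in Java. It is a sketch under stated assumptions: the class name, the three-digit server id and the twelve-digit random suffix are all illustrative choices, and the in-memory set of issued suffixes grows without bound, so a real service would need to expire or persist it.

```java
import java.security.SecureRandom;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical generator: a per-server prefix plus a random suffix checked locally for reuse.
final class ServerScopedIdGenerator {
    private static final long SUFFIX_RANGE = 1_000_000_000_000L; // 12 decimal digits
    private final int serverId; // assumed to be assigned uniquely to each server, 0..999
    private final SecureRandom random = new SecureRandom();
    private final Set<Long> issued = ConcurrentHashMap.newKeySet();

    ServerScopedIdGenerator(int serverId) {
        if (serverId < 0 || serverId > 999) {
            throw new IllegalArgumentException("serverId must be between 0 and 999");
        }
        this.serverId = serverId;
    }

    /** Returns a 15-digit id: 3 digits of server id followed by 12 random digits. */
    String next() {
        long suffix;
        do {
            suffix = (random.nextLong() >>> 1) % SUFFIX_RANGE; // non-negative value below 10^12
        } while (!issued.add(suffix)); // retry only if this server already used the suffix
        return String.format("%03d%012d", serverId, suffix);
    }
}
```

Two servers with different ids can never produce the same value, and each server only has to check its own history, so no call to a central database is needed after the server id has been assigned.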
I am creating a chat application, so I want to generate a unique message id.
Is it possible to never create a duplicate message id?
MongoDB's ObjectId is fairly complex and is probably one of the better schemes from a unique id point of view.
So you can take a sneak peek in their source code to see how they generate it.
Leaving the definition from their official documentation here for posterity:
ObjectIds are small, likely unique, fast to generate, and ordered.
ObjectId values consists of 12-bytes, where the first four bytes are a
timestamp that reflect the ObjectId’s creation, specifically:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
Example of Mongo's ObjectId:
ObjectId("507f1f77bcf86cd799439011")
There could be many ways to generate one! One common way is to generate a timestamp value and use it as an id, which is unique as long as no two ids are created within the timestamp's resolution.
For example you can do this:
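// Requires java.text.SimpleDateFormat, java.util.Date and java.util.Locale.
public int createID() {
    Date now = new Date();
    // "ddHHmmss" has one-second resolution and repeats every month.
    int id = Integer.parseInt(new SimpleDateFormat("ddHHmmss", Locale.US).format(now));
    return id;
}
You can also make it a string and combine it with any specific string format to make it more unique, according to your app's needs!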
Based on your description, you can create a compound id. For example, you can build your ids from user id + timestamp. If you use this pattern, the user id length must be the same for all ids. If it is not, you have to add "0" before the user id to obtain an equal length for all of your user ids.
To illustrate:
String uniquemsgid= userid+ System.currentTimeMillis();
As a matter of fact, your user has a unique id, and the timestamp is unique for this user.
Caution: if you use only a timestamp or a date in any format, this method can't guarantee a unique message id, because two users can create a message at the same moment.
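The zero-padding mentioned above can be handled with `String.format`. This is a small sketch: the class name `MessageIds` and the fixed width of 10 digits are assumptions, and two messages from the same user within the same millisecond would still collide.

```java
// Hypothetical helper that builds a compound message id from a user id and a timestamp.
final class MessageIds {
    /** Pads the user id with leading zeros to 10 digits, then appends the millisecond timestamp. */
    static String create(long userId, long epochMillis) {
        return String.format("%010d%d", userId, epochMillis);
    }
}
```

In an application the second argument would typically be `System.currentTimeMillis()`, for example `MessageIds.create(userId, System.currentTimeMillis())`.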
You can create a Random: Random random = new Random();
int id = random.nextInt(99999) + 1;
Then you check whether the id is already taken; if yes, try again, and if not, you have an id.
if (id == someOtherId), repeat the same process.
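Put together as a loop, the generate-and-check approach might look like the sketch below. The class name `RandomIdPool` is hypothetical, and the ids already handed out are kept in memory only, which is an assumption; a chat application would more likely check against its message store.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical pool that hands out each value in 1..99999 at most once.
final class RandomIdPool {
    private static final int MAX_ID = 99_999;
    private final Random random = new Random();
    private final Set<Integer> used = new HashSet<>();

    /** Draws random values until one is found that has not been handed out yet. */
    synchronized int next() {
        if (used.size() >= MAX_ID) {
            throw new IllegalStateException("all ids are taken");
        }
        int id;
        do {
            id = random.nextInt(MAX_ID) + 1; // 1..99999 inclusive
        } while (!used.add(id));             // add() returns false when the id already exists
        return id;
    }
}
```

The retry loop gets slower as the range fills up, so this only suits cases where far fewer ids are needed than the range allows.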
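You might want to use the device's IMEI number for this, which is unique per device and quite easy to get. Note that it identifies the device rather than an individual message, so it would need to be combined with something like a timestamp or counter.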
<uses-permission android:name="android.permission.READ_PHONE_STATE" />
Add the above permission to your manifest file and then use the two lines below to get the IMEI.
TelephonyManager mngr = (TelephonyManager)getSystemService(Context.TELEPHONY_SERVICE);
long id = Long.parseLong(mngr.getDeviceId());
Background : I have a database table called Contact. All users of my system have details of their contacts in this table including a firstname, a lastname, and also a varchar field called 'UniqueId'. Users of the system may put anything in the UniqueId field, as long as it is unique from that user's other contact's unique ids.
Aim : I now need to change my code so a unique id is automatically generated if the user does not provide one. This should be short and visually pleasing. Ideally it could just be an auto-incrementing number. However, AUTO_INCREMENT works for an integer field, not a varchar field.
Also note that each contact UniqueId needs to be unique from the other contacts of that user, but not necessarily unique to the entire system. Therefore, the following UniqueIds are valid:
Contact
UserId  Firstname  Lastname  UniqueId
1       Bob        Jones     1
1       Harold     Smith     2
2       Joe        Bloggs    1
Question : So, how can I achieve this? Is there a reliable and clean way to get the database to generate a unique id for each contact in the existing UniqueId varchar field (Which is my preference if possible)? Or am I forced to make Java go and get the next available unique id, and if so, what is the most reliable way of doing this? Or any alternative solution?
Edit - 11th April AM: We use hibernate to map our fields. I'm just beginning to research if that may provide an alternative solution? Any opinions?
Edit - 11th April PM: 2 options are currently standing out, but neither seem as ideal as I would like.
1. As @eis suggests, I could have an auto-incrementing field in addition to my current varchar field. Then, either when a contact is saved the int can also be saved in the varchar field, or when a contact is retrieved the int can be used if the varchar is empty. But it feels messy and wrong to use two fields rather than one.
2. I am looking into using a hibernate generator, as discussed here. But this involves holding a count elsewhere of the next id, and Java code, and seems to massively overcomplicate the process.
If my existing uniqueId field had been an int field, AUTO_INCREMENT would simply work, and work nicely. Is there no way to make the database generate this but save it as a String?
I think what you really should do is ditch your current 'uniqueid' and generate new ones that are really unique across the system, being always autogenerated and never provided by the user. You would need to do separate work to migrate to the new system. That's the only way I see to keep it sane. User could provide something like an alias to be more visually pleasing, if needs be.
On the upside, you could use autoincrement then.
Ok, one additional option, if you really really want what you're asking. You could have a prefix like §§§§ that is never allowed for a user, and always autogenerate ids based on that, like §§§§1, §§§§2 etc. If you disallow anything starting with that prefix from the end user, you would know that there would be no collisions, and you could just generate them one-by-one whenever needed.
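A rough Java sketch of the reserved-prefix idea, for illustration only. The class `PrefixedIds` is hypothetical, and the counter is held in memory here, whereas in the application described it would have to come from the database or another persistent source.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: generated ids carry a prefix that users are never allowed to type.
final class PrefixedIds {
    // Four section signs (U+00A7), written as escapes to avoid source-encoding problems.
    static final String RESERVED_PREFIX = "\u00A7\u00A7\u00A7\u00A7";
    private final AtomicLong next = new AtomicLong(1);

    /** A user-supplied id is acceptable only if it does not start with the reserved prefix. */
    static boolean isAllowedUserId(String candidate) {
        return candidate != null && !candidate.startsWith(RESERVED_PREFIX);
    }

    /** Returns the prefix followed by 1, 2, 3, ... */
    String generate() {
        return RESERVED_PREFIX + next.getAndIncrement();
    }
}
```

Since user input is rejected whenever it starts with the prefix, a generated id can never clash with one that a user typed in.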
Sequences would be ideal for generating the numbers. You don't have sequences in MySQL, but you could emulate them for example like this.
I apologize, I really don't know MySQL syntax, but here's how I'd do it in SQL Server. Hopefully that will still have some value to you. Basically, I'm just counting the number of existing contacts and returning it as a varchar.
CREATE FUNCTION GetNewUniqueId
(@UserId int)
RETURNS varchar(3)
AS
BEGIN
    DECLARE @Count int;
    SELECT @Count = COUNT(*)
    FROM Contacts
    WHERE UserId = @UserId;
    SET @Count = @Count + 1;
    RETURN CAST(@Count AS varchar(3));
END
But if you really want something "visually pleasing," why not try returning something more like Firstname + Lastname?
CREATE FUNCTION GetNewUniqueId
(@UserId int, @FirstName varchar(255), @LastName varchar(255))
RETURNS varchar(515)
AS
BEGIN
    DECLARE @UniqueId varchar(515), @Count int;
    SET @UniqueId = @FirstName + @LastName;
    SELECT @Count = COUNT(*)
    FROM Contacts
    WHERE UserId = @UserId AND LEFT(UniqueId, LEN(@UniqueId)) = @UniqueId;
    IF @Count > 0
        SET @UniqueId = @UniqueId + '_' + CAST(@Count + 1 AS varchar(3));
    RETURN @UniqueId;
END
I wish to store UUIDs created using java.util.UUID in a HSQLDB database.
The obvious option is to simply store them as strings (in the code they will probably just be treated as such), i.e. varchar(36).
What other options should I consider for this, considering issues such as database size and query speed (neither of which are a huge concern due to the volume of data involved, but I would like to consider them at least)
HSQLDB has a built-in UUID type. Use that:
CREATE TABLE t (
id UUID PRIMARY KEY
);
You have a few options:
Store it as a VARCHAR(36), as you already have suggested. This will take 36 bytes (288 bits) of storage per UUID, not counting overhead.
Store each UUID in two BIGINT columns, one for the least-significant bits and one for the most-significant bits; use UUID#getLeastSignificantBits() and UUID#getMostSignificantBits() to grab each part and store it appropriately. This will take 128 bits of storage per UUID, not counting any overhead.
Store each UUID as an OBJECT; this stores it as the binary serialized version of the UUID class. I have no idea how much space this takes up; I'd have to run a test to see what the default serialized form of a Java UUID is.
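For the two-BIGINT option in the list above, the conversion in both directions is short. The class `UuidLongs` is a hypothetical helper, and the column handling itself (JDBC or Hibernate mapping) is left out.

```java
import java.util.UUID;

// Hypothetical helper for storing a UUID as two 64-bit values.
final class UuidLongs {
    /** Returns { most significant bits, least significant bits }. */
    static long[] split(UUID id) {
        return new long[] { id.getMostSignificantBits(), id.getLeastSignificantBits() };
    }

    /** Rebuilds the UUID from the two stored column values. */
    static UUID join(long mostSignificant, long leastSignificant) {
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

A value written with `split` and read back with `join` compares equal to the original UUID.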
The upsides and downsides of each approach depend on how you're passing the UUIDs around your app. If you're passing them around as their string equivalents, then the downside of requiring more than double the storage capacity for the VARCHAR(36) approach is probably outweighed by not having to convert them each time you do a DB query or update. If you're passing them around as native UUIDs, then the BIGINT method is probably pretty low-overhead.
Oh, and it's nice that you're looking to consider speed and storage space issues, but as many better than me have said, it's also good that you recognize that these might not be critically important given the amount of data your app will be storing and maintaining. As always, micro-optimization for the sake of performance is only important if not doing so leads to unacceptable cost or performance. Otherwise, these two issues -- the storage space of the UUIDs, and the time it takes to maintain and query them in the DB -- are reasonably low-importance given the cheap cost of storage and the ability of DB indices to make your life much easier. :)
I would recommend char(36) instead of varchar(36). Not sure about hsqldb, but in many DBMS char is a little faster.
For lookups, if the DBMS is smart, then you can use an integer value to "get closer" to your UUID.
For example, add an int column to your table as well as the char(36). When you insert into your table, insert the uuid.hashCode() into the int column. Then your searches can be like this
WHERE intCol = ? and uuid = ?
As I said, if hsqldb is smart like mysql or sql server, it will narrow the search by the intCol and then only compare at most a few values by the uuid. We use this trick to search through million+ record tables by string, and it is essentially as fast as an integer lookup.
Using BINARY(16) is another possibility. Less storage space than character types. Use CREATE TYPE UUID .. or CREATE DOMAIN UUID .. as suggested above.
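If the BINARY(16) route is taken, the UUID has to be turned into 16 bytes and back. A minimal sketch with a hypothetical helper class `UuidBytes`; it writes the most significant bits first, which is a common convention but an assumption here.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper for storing a UUID in a 16-byte binary column.
final class UuidBytes {
    /** Encodes the UUID as 16 bytes, most significant bits first. */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Decodes 16 bytes produced by toBytes back into a UUID. */
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long mostSignificant = buf.getLong();
        long leastSignificant = buf.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

This uses 16 bytes per value compared with 36 for the string form.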
I think the easiest thing to do would be to create your own domain thus creating your own UUID "type" (not really a type, but almost).
You also should consider the answer to this question (especially if you plan to use it instead of a "normal" primary key)
INT, BIGINT or UUID/GUID in HSQLDB? (deleted by community ...)
HSQLDB: Domain Creation and Manipulation