FHIR Resource logical id generation using SHA256 - java

I am trying to implement code that generates FHIR messages from some type of input message. When I create each FHIR resource, I need to create a resource logical id that is unique and can be regenerated repeatably.
In Microsoft's FHIR-Converter GitHub repository, I found that they use SHA256 to hash the input string and derive an id from the hash. I used the same approach to generate a UUID in Java. Here is the code from Microsoft FHIR-Converter in .NET:
public static string GenerateUUID(string input)
{
    if (string.IsNullOrWhiteSpace(input))
    {
        return null;
    }

    var bytes = Encoding.UTF8.GetBytes(input);
    var algorithm = SHA256.Create();
    var hash = algorithm.ComputeHash(bytes);
    var guid = new byte[16];
    Array.Copy(hash, 0, guid, 0, 16);
    return new Guid(guid).ToString();
}
It generates a UUID like this: e40b96a6-e62e-a67e-3ac7-69a099830e1c
My questions are:
In order to repeatedly generate the same id, MUST the input string be the same as well? Meaning, if I have an input of 123, will it generate e40b96a6-e62e-a67e-3ac7-69a099830e1c every time?
If I HAVE to use a unique id in order to generate this UUID, what is the advantage of this extra step? If my input always has a unique id for each resource, can I just assign the id to be (Resource name)-(id)?
Is there a way to generate an id without having a unique id? I have some resources that do not have anything unique. Are there other techniques to generate a unique input that can be reproduced on different platforms? I don't see how I can do this without a unique id in the input.

A given string will always generate the same id. A different string should generate a different id, though there's a very slim chance of two strings generating the same hash.
There are rules for the format of the id (only certain characters permitted, maximum length allowed), but other than that, no obvious benefit I can see. It's fine to use your 'native' identifier as the resource id. (That said, resource ids generally shouldn't be real-world identifiers like social security numbers, license numbers, etc. as that can leak protected information.)
The expectation in FHIR is that a unique resource id corresponds to a unique real-world object. If you don't have a real identifier on the object, there's a possibility you could have multiple instances that correspond to distinct real-world objects. E.g. multiple Practitioner instances where all you have is a name of "A. Smith" would not be appropriate to presume are always the same instance. If you have no 'identity', you might be better off using the 'contained' mechanism rather than generating an id just from the content.
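As a rough illustration of the first point, here is a minimal Java sketch of the same idea as the .NET method in the question: hash the input with SHA-256 and build a UUID from the first 16 bytes. The class and method names are hypothetical. One detail to be aware of is that .NET's Guid(byte[]) constructor reads the first three fields in little-endian order, so the sketch swaps those bytes; without the swap, Java would print a different string for the same input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical helper; not part of FHIR-Converter.
final class DeterministicFhirId {

    /** Returns null for null or blank input, roughly mirroring the .NET method. */
    static String generateUuid(String input) {
        if (input == null || input.trim().isEmpty()) {
            return null;
        }
        byte[] hash;
        try {
            hash = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        byte[] guid = new byte[16];
        System.arraycopy(hash, 0, guid, 0, 16);

        // Reverse the 4-byte, 2-byte and 2-byte fields to match the
        // little-endian layout that .NET's Guid(byte[]) assumes.
        swap(guid, 0, 3);
        swap(guid, 1, 2);
        swap(guid, 4, 5);
        swap(guid, 6, 7);

        ByteBuffer buffer = ByteBuffer.wrap(guid);
        return new UUID(buffer.getLong(), buffer.getLong()).toString();
    }

    private static void swap(byte[] data, int i, int j) {
        byte tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }
}
```

The same input string always yields the same id, and the 36-character result stays within the 64-character limit for FHIR ids. If the input has no stable, unique content, the output will not be meaningfully unique either.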

Related

Generate Bigger HashCode JAVA

It is a known fact that Java's hashCode() returns an int, so it can produce only about 4 billion distinct values.
I am using the hash code of a string (for example Fname + Lname + DOB + DATE) as the primary key in my database.
In @PrePersist I set the key to that hash code, which gives me an id for each new user (it has to be unique).
Now I am running out of hash codes. A possible alternative for me is to use SHA-2, MD5, etc.
How can I increase the size of the hash code and still avoid collisions?
If your goal is to create a unique identifier for the database, I would suggest using UUID.
UUID Version 3, as it uses a namespace, will fit your case.
Some databases have native support for UUID, for instance PostgreSQL
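A minimal sketch of the name-based approach, assuming the key is derived from the same fields the question mentions. Java's UUID.nameUUIDFromBytes produces a version 3 (MD5-based) UUID from a byte array; it does not take a namespace UUID argument, so the string prefix used here is only a hypothetical stand-in for a namespace, and the class, method and prefix are illustrative names.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper for deriving a repeatable key from user fields.
final class NameBasedUserKey {

    // Assumed prefix that separates these keys from other name-based UUIDs.
    private static final String NAMESPACE = "example.org/users/";

    static UUID forUser(String firstName, String lastName, String dateOfBirth) {
        // A separator avoids ambiguity such as "ab" + "c" versus "a" + "bc".
        String name = NAMESPACE + firstName + "|" + lastName + "|" + dateOfBirth;
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }
}
```

Two users with identical field values would still get the same key, so the input fields need to be unique in combination.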
I think you are confusing int Object.hashCode(), which you can override and which returns an int, with a secure hash function. Those are two different things. Object.hashCode is not intended to return unique integers (returning 1 is a valid implementation). So using String.hashCode() for object identity is not a great idea, since it can and will have collisions. It's intended for use with hash tables (e.g. HashMap), which means it is optimized for performance and not for avoiding collisions.
You can indeed use sha1, sha2, sha3, or md5 if you want some kind of content hash. If not, use SecureRandom or UUID to generate something random. All of these have a very low probability of ever giving you a collision (not completely 0 of course).
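To make the difference concrete, here is a small sketch (the class and method names are made up for illustration). The strings "Aa" and "BB" have the same String.hashCode(), while their SHA-256 digests differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper that returns a SHA-256 digest as lowercase hex.
final class ContentHash {

    static String sha256Hex(String text) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Both lines print the same int: a 32-bit hashCode collision.
        System.out.println("Aa".hashCode());
        System.out.println("BB".hashCode());
        // The 256-bit digests are different.
        System.out.println(sha256Hex("Aa"));
        System.out.println(sha256Hex("BB"));
    }
}
```

The hex digest is 64 characters long, so the primary key column would need to be a string or binary type rather than an int.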

Create Unique Message Id Like whats app

I am creating a chat application, so I want to generate a unique message id.
Is it possible to never create a duplicate message id?
MongoDB's ObjectId is fairly sophisticated and is probably one of the better designs from a unique-id point of view.
So you can take a peek at their source code to see how they generate it.
Leaving the definition from their official documentation here for posterity:
ObjectIds are small, likely unique, fast to generate, and ordered.
ObjectId values consist of 12 bytes, where the first four bytes are a
timestamp that reflects the ObjectId’s creation, specifically:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
Example of Mongo's ObjectId:
ObjectId("507f1f77bcf86cd799439011")
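The layout quoted above can be imitated in plain Java. The following is only a sketch in the spirit of that description, not MongoDB's actual implementation: the class name is hypothetical, and five random bytes chosen once per process are used as a stand-in for the machine identifier and process id.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical generator that follows the 4 + 3 + 2 + 3 byte layout.
final class ObjectIdLikeGenerator {

    private static final SecureRandom RANDOM = new SecureRandom();
    // Assumption: 5 random bytes replace the machine id and process id.
    private static final byte[] PROCESS_UNIQUE = new byte[5];
    // The counter starts at a random value, as in the quoted description.
    private static final AtomicInteger COUNTER = new AtomicInteger(RANDOM.nextInt());

    static {
        RANDOM.nextBytes(PROCESS_UNIQUE);
    }

    static String next() {
        int seconds = (int) (System.currentTimeMillis() / 1000L);
        int count = COUNTER.getAndIncrement() & 0x00FFFFFF; // keep the low 3 bytes

        ByteBuffer buffer = ByteBuffer.allocate(12); // big-endian by default
        buffer.putInt(seconds);
        buffer.put(PROCESS_UNIQUE);
        buffer.put((byte) (count >>> 16));
        buffer.put((byte) (count >>> 8));
        buffer.put((byte) count);

        StringBuilder hex = new StringBuilder(24);
        for (byte b : buffer.array()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Because the timestamp comes first, ids sort roughly by creation time, and the 3-byte counter allows about 16.7 million ids per process per second before it wraps.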
There are many ways to generate one. One common way is to use a timestamp value as the id, although a timestamp alone is only unique if no two messages are created within the same time unit.
For example, you can do this:
public int createID() {
    Date now = new Date();
    // Day of month plus time of day, so the value repeats every month
    // and is shared by all messages created within the same second.
    int id = Integer.parseInt(new SimpleDateFormat("ddHHmmss", Locale.US).format(now));
    return id;
}
You can also make it a string and add an app-specific prefix or suffix to make it more unique, according to your app's needs.
Based on your brief description, you can create a compound id. For example, you can build your ids from user id + timestamp. If you use this pattern, the user id part must have the same length in every id; if it does not, pad the user id with leading "0" characters so that all user ids have equal length.
To illustrate:
String uniquemsgid = userid + System.currentTimeMillis();
The reasoning is that each user has a unique id, and a timestamp is unique for that user as long as they do not send two messages within the same millisecond.
Caution: if you use only a timestamp or a date in any format, this method can't guarantee a unique message id, because two users can create a message at the same moment.
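A sketch of the compound id with the padding spelled out. The class name, the 10-digit width and the trailing per-process sequence number are assumptions added for illustration; the sequence number covers the case of one user sending two messages in the same millisecond.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical compound id: padded user id, timestamp, sequence number.
final class CompoundMessageId {

    private static final AtomicLong SEQUENCE = new AtomicLong();

    static String next(long userId) {
        // %010d left-pads the user id with zeros to 10 digits (assumed width).
        return String.format(Locale.ROOT, "%010d-%d-%d",
                userId, System.currentTimeMillis(), SEQUENCE.getAndIncrement());
    }
}
```

The sequence number only protects against duplicates within one process; with several servers, a server identifier would also have to be part of the id.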
You can create a Random and draw an id from it:
Random random = new Random();
int id = random.nextInt(99999) + 1;
Then check whether that id has already been given out. If it has, try again; if not, you have an id.
In other words, if (id == someOtherId), repeat the process.
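The check-and-retry loop could look like the sketch below. It assumes the ids already handed out are kept in an in-memory set; the class name is hypothetical, and a real chat application would more likely check against its database.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical generator that retries until it finds an unused id.
final class RandomRetryIds {

    private static final int MAX_ID = 99_999;

    private final Random random = new Random();
    private final Set<Integer> used = new HashSet<>();

    synchronized int next() {
        if (used.size() >= MAX_ID) {
            throw new IllegalStateException("all ids are in use");
        }
        int id;
        do {
            id = random.nextInt(MAX_ID) + 1; // range 1..99999
        } while (!used.add(id));             // add() returns false for a duplicate
        return id;
    }
}
```

With only 99,999 possible values, retries become frequent as the set fills up, so this approach suits small id spaces only.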
You might want to use the device IMEI number for this, which is unique per device and quite easy to get. Since it identifies the device rather than a message, combine it with a timestamp or counter.
<uses-permission android:name="android.permission.READ_PHONE_STATE" />
Add the above permission to your manifest file and then use the two lines below to get the IMEI.
TelephonyManager mngr = (TelephonyManager)getSystemService(Context.TELEPHONY_SERVICE);
long id = Long.parseLong(mngr.getDeviceId());

How to create a 7 characters of alphanumeric unique id? [duplicate]

I need to create alphanumeric unique IDs of length 7 or 10. Similar to the shorter version of Git commit IDs (7a471b2).
I tried UUID, but the generated unique ID is longer than I need.
Is there a built-in method / code snippet in Java that can help here?
If you want to generate random values, you should use SecureRandom:
SecureRandom random = new SecureRandom();
byte[] bytes = new byte[15];
random.nextBytes(bytes);
To get the proper key length, you may want to convert that into your expected form. Since the bytes are just numbers, you can generate a longer random value and encode it afterwards. You may want to use Base64 or hex for that. In Java you can use DatatypeConverter:
String key = DatatypeConverter.printBase64Binary(bytes);
or use Apache Commons Codec:
org.apache.commons.codec.binary.Base64
String key = new String(Base64.encodeBase64(bytes));
There is no Java class that supports generating random values in that form directly.
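For the 7-character case specifically, one option is to draw each character from a fixed alphabet with SecureRandom instead of encoding raw bytes. This is a sketch with a hypothetical class name and an assumed lowercase alphabet.

```java
import java.security.SecureRandom;

// Hypothetical generator for short random alphanumeric ids.
final class ShortRandomId {

    // Assumed alphabet: digits plus lowercase letters (36 symbols).
    private static final String ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz";
    private static final SecureRandom RANDOM = new SecureRandom();

    static String next(int length) {
        StringBuilder id = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            id.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }
}
```

Seven characters over 36 symbols give roughly 78 billion combinations, so random collisions become likely after a few hundred thousand ids. A uniqueness check against the stored ids is still needed.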
You did not mention whether you need the IDs to be generated in a stateless manner. You only need this if you have many sources generating IDs, where each source is independent and does not know the state of any other source. For such a case, UUIDs let you generate IDs that are still very unlikely to collide.
If you are the only source generating ID's, then you can make use of state. As an example, in a database you often simply use a sequence to generate IDs (the state being the nextval of the sequence). These numbers are perfectly unique too. If you need it to "look" random, there are algorithms to shuffle the number space by mapping each sequential number onto a random-looking number.
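One simple way to shuffle the number space, sketched below, is to multiply the sequence value by a constant that shares no factor with the size of the id space and reduce it modulo that size. The mapping is then one-to-one, so distinct sequence values give distinct ids. The class name, multiplier and offset are arbitrary values chosen for illustration, and the result only looks random; it is easy to reverse and is not a security measure.

```java
// Hypothetical mapper from a sequence value to a 7-character base-36 id.
final class ShuffledSequenceIds {

    private static final long SPACE = 78_364_164_096L;   // 36^7 possible ids
    // Odd and not divisible by 3, so it is coprime to 36^7 = 2^14 * 3^14.
    private static final long MULTIPLIER = 99_999_989L;
    private static final long OFFSET = 12_345_678_901L;  // arbitrary shift

    static String encode(long sequenceValue) {
        if (sequenceValue < 0 || sequenceValue >= SPACE) {
            throw new IllegalArgumentException("out of range: " + sequenceValue);
        }
        // The product stays below Long.MAX_VALUE for every value in range.
        long mixed = (sequenceValue * MULTIPLIER + OFFSET) % SPACE;
        StringBuilder id = new StringBuilder(Long.toString(mixed, 36));
        while (id.length() < 7) {
            id.insert(0, '0'); // left-pad to a fixed width
        }
        return id.toString();
    }
}
```

The sequence value itself would come from the database sequence mentioned above, so uniqueness is guaranteed by the sequence and the mapping only changes how the id looks.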
A second example of "state" is the set of all IDs already in use. You can use this by generating a "random" number in an arbitrarily primitive way and then matching it against all your existing numbers. If it collides, generate another one.
Try the RandomStringUtils class from Apache Commons Lang.
This is not as simple as it looks. First of all, a UUID is not 100% unique.
A UUID is 128 bits long, so it can take at most 2^128 distinct values.
Making it shorter will only increase the probability of repetition.
The best way I can think of right now is to take the UUID and run a base64 encoder over it.
[EDIT] Alternatively, use Random.nextInt for a starting value and increment by one each time you need a new ID.
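The base64 idea could be sketched as follows, using only java.util classes. The class name is hypothetical. Encoding the 16 bytes of a UUID with the URL-safe alphabet and no padding gives 22 characters, which is shorter than the usual 36 but still longer than the 7 or 10 the question asks for.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper that packs a UUID into 22 URL-safe characters.
final class CompactUuid {

    static String encode(UUID uuid) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(uuid.getMostSignificantBits());
        buffer.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buffer.array());
    }

    static UUID decode(String text) {
        ByteBuffer buffer = ByteBuffer.wrap(Base64.getUrlDecoder().decode(text));
        return new UUID(buffer.getLong(), buffer.getLong());
    }
}
```

Truncating the encoded string to 7 characters would keep only 42 bits, so the collision probability rises accordingly.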

Unique serial number in a java web application

I've been wondering what's the correct practice for generating unique ids? The thing is in my web app I'll have a plugin system, when a user registers a plugin I want to generate a unique serial ID for it. I've been thinking about storing all numbers in a DB or a file on the server, generating a random number and checking whether it already exists in the DB/file, but that doesn't seem that good. Are there other ways to do it? Would using the UUID be the preferred way to go?
If the ids are user-facing, which it seems they are, then you want them to be difficult to guess. Use the built-in UUID class, which generates random ids for you and can format them nicely for you. Extract:
UUID idOne = UUID.randomUUID();
UUID idTwo = UUID.randomUUID();
log("UUID One: " + idOne);
log("UUID Two: " + idTwo);
Example output:
UUID One: 067e6162-3b6f-4ae2-a171-2470b63dff00
UUID Two: 54947df8-0e9e-4471-a2f9-9af509fb5889
There are other solutions in the link provided. I think it compares the methods quite well, so choose the one which best suits your needs.
Another interesting method is the one MongoDB uses, but this is possibly overkill for your needs:
A BSON ObjectID is a 12-byte value consisting of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter. Note that the timestamp and counter fields must be stored big endian unlike the rest of BSON
If they weren't user facing, then you could just leave it to the database to do an auto-incrementing id: 1, 2, 3, etc.
Why not go for a static (or, in this case, a context-scoped) AtomicInteger that would be incremented when a plugin is registered ?
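A minimal sketch of that counter idea, using AtomicLong rather than AtomicInteger to leave more headroom. The class name is hypothetical, and the constructor argument is an assumption: an in-memory counter starts over when the application restarts, so the last issued value has to be loaded from somewhere persistent.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical serial number source for plugin registration.
final class PluginSerials {

    private final AtomicLong lastIssued;

    /** @param lastIssued the highest serial already handed out (assumed to be persisted elsewhere) */
    PluginSerials(long lastIssued) {
        this.lastIssued = new AtomicLong(lastIssued);
    }

    /** Thread-safe: concurrent callers never receive the same serial. */
    long register() {
        return lastIssued.incrementAndGet();
    }
}
```

This yields sequential, guessable serials, which fits ids that are not user-facing.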
You could base it on user ID and timestamp, ensuring uniqueness but also providing a certain "readability"? E.g. User ID = 101, timestamp = 23 Dec 2010 13:54, you could give it ID:
201012231354101
or
101201012231354
The alternative, a UUID, is unique for all practical purposes (collisions are possible in principle but extremely unlikely), but it takes up more DB space than an integer and is quite unwieldy to work with.
A final idea is to ask your central DB for a unique ID, e.g. Oracle has sequences, or MySQL uses AUTO_INCREMENT fields, to assign unique integers.

Exposing Entity IDs of Google Datastore data

Is it safe to expose the entity ids of data that is stored in Google Datastore?
For example, in my code I have an entity with this id:
@PrimaryKey
@Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
@Extension(vendorName="datanucleus", key="gae.encoded-pk", value="true")
private String id;
The id is going to be similar to this: agptZeERtzaWYvSQadLEgZDdRsUYRs
Can anyone extract password, application url and any other information from this string? What is the meaning of that string?
That entity ID contains the object id, application id, and object class name. It's just an encoded string. Not really any sort of security risk.
You can use the KeyFactory methods keyToString() and stringToKey() to convert between a Key and its encoded string, as described in the Google App Engine documentation.
I believe the ID is a unique id for the entity in the Google App Engine datastore.
Key instances can be converted to and from the encoded string representation using the KeyFactory methods keyToString() and stringToKey().
When using encoded key strings, you can provide access to an object's string or numeric ID with an additional field.
I hope it helps.
Tiger.
If you navigate to localhost:4321/_ah/admin , you can take advantage of the sdk datastore viewer, where you will see that every kind of entity has a KEY field and a NAME/ID field;
Whether you use long, String or Key as your @PrimaryKey, there will be an ID/Name column with a String/number, and a KEY column, with the encoded key for said ID. As mentioned in other posts, this encoding hashes {md5s, most likely} your appspot application id, the fully qualified classname of the data object, and whatever you specify as the @PrimaryKey.
The only time you will ever want direct access to this field is if you absolutely don't care what the data is named,{when you need your program to find it, but humans won't be searching for it by guessing words into a text box}, or when you WANT to have multiple objects of the same type and name {maybe using a version int?} then you should use the encoded key syntax. Both KEY and ID are present in the db whether you put a field in your class, using the encoded key syntax just gives you access to this value.
Also, there is an available speed bonus for applications that use encoded keys... There are only two types of queries: SELECT * and SELECT __key__. For large data sets in AJAX apps, the only efficient way to paginate data is to select all the keys, send them to the client, and have the client ask for 0->X number of records, build links for the other X->Y results, and query the server with the first set of encoded keys for full data, parse response into nice little lists, and avoid loading 397 server data objects that aren't immediately useful.
Sending encoded keys up and down the wire might take a little more bandwidth than unencoded keys {unless you're as long winded at naming things as I am!}; but it shaves those cpu cycles on appengine, makes your quotas happier, and everybody's app runs just a fraction bit faster!
This key, even if somehow unhashed, will only expose data as sensitive as whatever you make a PrimaryKey. Your app password is not involved, nor will user passwords be in any sane data model. About the only thing that might {BIG might} leak is a user email address, if you use the provided User class for authentication, or the class names you use in your source.
...Basically, only information already available in watching a firebug request or two could possibly be exposed.
