How to generate fixed length random number without conflict?

How to generate fixed length random number without conflict? - java

I'm working on an application where I've to generate code like Google classroom. When a user creates a class I generate code using following functions
private String codeGenerator(){
StringBuilder stringBuilder=new StringBuilder();
String chars="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
int characterLength=chars.length();
for(int i=0;i<5;i++){
stringBuilder.append(chars.charAt((int)Math.floor(Math.random()*characterLength)));
}
return stringBuilder.toString();
}
As I have 62 different characters. I can generate total 5^62 code total which is quite large. I can generate this code in server or user device. So my question is which one is better approach? How likely a generated code will conflict with another code?

From a comment, it seems that you are generating group codes for your own application.
For the purposes and scale of your app, 5-character codes may be appropriate. But there are several points you should know:
Random number generators are not designed to generate unique numbers. You can generate a random code as you're doing now, but you should check that code for uniqueness (e.g., check it against a table that stores group codes already generated) before you treat that code as unique.
If users are expected to type in a group code, you should include a way to check whether a group code is valid, to avoid users accidentally joining a different group than intended. This is often done by adding a so-called "checksum digit" to the end of the group code. See also this answer.
It seems that you're trying to generate codes that should be hard to guess. In that case, Math.random() is far from suitable (as is java.util.Random) — especially because the group codes are so short. Use a secure random generator instead, such as java.security.SecureRandom (fortunately for you, its security issues were addressed in Android 4.4, which, as I can tell from a comment of yours, is the minimum Android version your application supports; see also this question). Also, if possible, make group codes longer, such as 8 or 12 characters long.
For more information, see Unique Random Identifiers.
Also, there is another concern. There is a serious security issue if the 5-character group code is the only thing that grants access to that group. Ideally, there should be other forms of authorization, such as allowing only logged-in users or certain logged-in users—
to access the group via that group code, or
to accept invitations to join the group via that group code (e.g., in Google Classroom, the PERMISSION_DENIED error code can be raised when a user tries to accept an invitation to join a class).

The only way to avoid duplicates in your scheme is to keep a copy of the ones that you have already generated, and avoid "generating" anything that would result in a duplicate. Since 5^62 is a lot, you could simply store them on a table if using a database; or on a hashset if everything is in-memory and there is only one instance of the application (remember to save the list of generated IDs to disk every time you create a new one, and to re-read it at startup).
The chances of a collision are low: you would need to generate around 5^(62/2) = 5^31 ~= 4.6E21 really-random identifiers for a collision to be more likely than not (see birthday paradox) - and it would take a lot of space to store and check all those identifiers for duplicates to detect that this was the case. But such is the price of security.

Que: A sack contains a blue ball and a red ball. I draw one ball from the sack. What are the chances it is a red ball?
Ans: 1/2
Que: I have a collection of 5^62 unique codes. I choose one code from the collection. What are the chances that it is "ABCDE"?
Ans: 1/(5^62)
NOTE: Random number generators are not actually random.

Well, in case you need a unique generator, what about the following. This is definitely not a random, but it's definitely unique for one instance.
public final class UniqueCodeGenerator implements Supplier<String> {
private int code;
#Override
public synchronized String get() {
return String.format("%05d", code++);
}
public static void main(String... args) {
Supplier<String> generator = new UniqueCodeGenerator();
for (int i = 0; i < 10; i++)
System.out.println(generator.get());
}
}

Related

Impose order in Jsprit with HardActivityConstraint

In a scenario of re-solving a previously solved problem (with some new data, of course), it's typically impossible to re-assign a vehicle's very-first assignment once it was given. The driver is already on its way, and any new solution has to take into account that:
the job must remain his (can't be assigned to another vehicle)
the activity that's been assigned to him as the very-first, must remain so in future solutions
For the sake of simplicity, I'm using a single vehicle scenario, and only trying to impose the second bullet (i.e. ensure that a certain activity will be the first in the solution).
This is how I defined the constraint:
new HardActivityConstraint()
{
#Override
public ConstraintsStatus fulfilled(JobInsertionContext iFacts, TourActivity prevAct, TourActivity newAct, TourActivity nextAct,
double prevActDepTime)
{
String locationId = newAct.getLocation().getId();
// we want to make sure that any solution will have "C1" as its first activity
boolean activityShouldBeFirst = locationId.equals("C1");
boolean attemptingToInsertFirst = (prevAct instanceof Start);
if (activityShouldBeFirst && !attemptingToInsertFirst)
return ConstraintsStatus.NOT_FULFILLED_BREAK;
if (!activityShouldBeFirst && attemptingToInsertFirst)
return ConstraintsStatus.NOT_FULFILLED;
return ConstraintsStatus.FULFILLED;
}
}
This is how I build the algorithm:
VehicleRoutingAlgorithmBuilder vraBuilder;
vraBuilder = new VehicleRoutingAlgorithmBuilder(vrpProblem, "schrimpf.xml");
vraBuilder.addCoreConstraints();
vraBuilder.addDefaultCostCalculators();
StateManager stateManager = new StateManager(vrpProblem);
ConstraintManager constraintManager = new ConstraintManager(vrpProblem, stateManager);
constraintManager.addConstraint(new HardActivityConstraint() { ... }, Priority.HIGH);
vraBuilder.setStateAndConstraintManager(stateManager, constraintManager);
VehicleRoutingAlgorithm algorithm = vraBuilder.build();
The results are not good. I'm only getting solutions with a single job assigned (the one with the required activity). In debug it's clear that the job insertion iterations consider many viable options that appear to solve the problem entirely, but at the bottom line, the best solution returned by the algorithm doesn't include the other jobs.
UPDATE: even more surprising, is that when I use the constraint in scenarios with over 5 vehicles, it works fine (worst results are with 1 vehicle).
I'll gladly attach more information if needed.
Thanks
Zach

First, you can use initial routes to ensure that certain jobs need to be assigned to specific vehicles right from the beginning (see example).
Second, to ensure that no activity will be inserted between start and your initial job(location) (e.g. "C1" in your example), you need to prohibit it the way you defined your HardActConstraint, just modify it so that a newAct can never be between prevAct=Start and nextAct=act(C1).
Third, with regards to your update, just have in mind that the essence of the algorithm is to ruin part of the solution (remove a number of jobs) and recreate the solution again (insert the unassigned jobs). Currently, the schrimpf algorithm ruins a number of jobs relative to the total number of jobs, i.e. noJobs = 0.5 * totalNoJobs for the random ruin and 0.3 * totalNoJobs for the radial ruin. If your problem is very small, the share of jobs to be removed might not sufficiant. This is going to change with next release, where you can use an algorithm out of the box which defines an absolute minimum of jobs that need to be removed. For the time being, modify the shares in your algorithmConfig.xml.

Existing solution to "smart" initial capacity for StringBuilder

I have a piece logging and tracing related code, which called often throughout the code, especially when tracing is switched on. StringBuilder is used to build a String. Strings have reasonable maximum length, I suppose in the order of hundreds of chars.
Question: Is there existing library to do something like this:
// in reality, StringBuilder is final,
// would have to create delegated version instead,
// which is quite a big class because of all the append() overloads
public class SmarterBuilder extends StringBuilder {
private final AtomicInteger capRef;
SmarterBuilder(AtomicInteger capRef) {
int len = capRef.get();
// optionally save memory with expense of worst-case resizes:
// len = len * 3 / 4;
super(len);
this.capRef = capRef;
}
public syncCap() {
// call when string is fully built
int cap;
do {
cap = capRef.get();
if (cap >= length()) break;
} while (!capRef.compareAndSet(cap, length());
}
}
To take advantage of this, my logging-related class would have a shared capRef variable with suitable scope.
(Bonus Question: I'm curious, is it possible to do syncCap() without looping?)
Motivation: I know default length of StringBuilder is always too little. I could (and currently do) throw in an ad-hoc intitial capacity value of 100, which results in resize in some number of cases, but not always. However, I do not like magic numbers in the source code, and this feature is a case of "optimize once, use in every project".

Make sure you do the performance measurements to make sure you really are getting some benefit for the extra work.
As an alternative to a StringBuilder-like class, consider a StringBuilderFactory. It could provide two static methods, one to get a StringBuilder, and the other to be called when you finish building a string. You could pass it a StringBuilder as argument, and it would record the length. The getStringBuilder method would use statistics recorded by the other method to choose the initial size.
There are two ways you could avoid looping in syncCap:
Synchronize.
Ignore failures.
The argument for ignoring failures in this situation is that you only need a random sampling of the actual lengths. If another thread is updating at the same time you are getting an up-to-date view of the string lengths anyway.

You could store the string length of each string in a statistic array. run your app, and at shutdown you take the 90% quartil of your string length (sort all str length values, and take the length value at array pos = sortedStrings.size() * 0,9
That way you created an intial string builder size where 90% of your strings will fit in.
Update
The value could be hard coded (like java does for value 10 in ArrayList), or read from a config file, or calclualted automatically in a test phase. But the quartile calculation is not for free, so best you run your project some time, measure the 90% quartil on the fly inside the SmartBuilder, output the 90% quartil from time to time, and later change the property file to use the value.
That way you would get optimal results for each project.
Or if you go one step further: Let your smart Builder update that value from time to time in the config file.
But this all is not worth the effort, you would do that only for data that have some millions entries, like digital road maps, etc.

How to generate incremental identifier in java

I have requirement in which I continuously receive messages that needs to be written in a file. Every time a new message is received it needs to be written in a separate file. What I want is to generate an unique identifier to be used as a file-name. I also want to preserve the order of the messages as well. By this I mean, the identifier generated as a file-name should always be incremental.
I was using UUID.randomUUID() to generate file-names but the problem with this approach is that UUID only assures randomness of the identifier but is not incremental. As a result I am losing the ordering of the file (I want file generated first should appear first in the list).
Approaches known
Can use System.currentTimeMillis() but I can receive multiple messages at same time stamp.
2.Another approach could be to implement static long value and increment it whenever a file is to be created and use the long value as a file-name. But I am not sure about this approach. Also it doesn't seem to be a proper solution to my problem. I think there could be far better solutions than this one.
If someone could suggest me a better solution to this problem, will be highly appreciated.

If you want your id value to uniformly rise even between server restarts, then you must either base it on the system time or have some elaborately robust logic that persists the last ID used. Note that achieving robustness on its own is not hard, but achieving it in a performant and scalable way is.
If you additionally need the id to be unique across multiple nodes in a redundant server cluster, then you need even more elaborate logic, which definitely involves a persistent store to which all the boxes synchronize access. Making this performant is, of course, even harder.
The best option I can see is to have a quite long ID so there's room for these parts:
System.currentTimeMillis for long-term uniqueness (across restarts);
System.nanotime for finer granularity;
a unique id of each server node (determined in a platform-specific way).
The method will still have to remember the last value generated and retry in case of a duplicate. It won't have to retry too many times, though, just until the next nanoTime clock tick—it could even busy-wait for it.
Sketch of code without point 3 (single-node implementation):
private static long lastNanos;
public static synchronized String uniqueId() {
for (;/*ever*/;) {
final long n = System.nanoTime();
if (n == lastNanos) continue;
lastNanos = n;
return "" + System.currentTimeMillis() + n;
}
}

Ok, my hands up. My last answer was fairly flaky and I've deleted it.
Keeping with the spirit of the site, I thought I'd try a different tac.
If you say you are keeping these messages in a single file then you could try something like creating an unique Id out of the size of the file?
Before you write the message to the file it's id could be the current size of the file.
You could add the filename + size as the id if these messages need to be unique across a number of files.
I'll leave the hot potato of synchronization to another day. But you could wrap all of this up in a syncronized object that keeps track of things.
Also, I am assuming that any messages written to the file will not be removed in the future.
ADDITIONAL NOTE:
You could create an message processing object that opens the file on construction (or via a create method).
This object will get the initial size of the file and this will be used as the unique id.
As each message is added (in a synchronized manner), the id is incremented by the size of the message.
This would address the performance issues. Will not work if more than one JVM/Node accesses the same file.
Skeletal Idea:
public class MessageSink {
private long id = 0;
public MessageSink(String filename) {
id = ... get file size ..
}
public synchronized addMessage(Message msg) {
msg.setId(id);
.. write to file + flush ..
.. or add to stack of messages that need to be written to file
.. at a later stage.
id = id + msg.getSize();
}
public void flushMessages() {
.. open file
.. for each message in stack write ...
.. flush and close file
}
}

I had the same requirement and found a suitable solution. Twitter Snowflake uses a simple algorithm to generate sortable 64bit (long) ids. Snowflake is written on Scala but the approach is simple and could be easily used in a Java code.
id is composed of:
timestamp - 41 bits (millisecond precision w/ a custom epoch gives us 69 years);
machine id - 10 bits (MAC address could be used as a hardware id);
sequence number - 12 bits - rolls over every 4096 per machine (with protection to avoid rollover in the same ms)
Formula looks like: ((timestamp - customEpoch) << timestampShift) | (machineId << machineIdShift) | sequenceNumber;
Shift for each component depends on it's bits position in ID.
Detailed description and source code could be found at github:
Twitter Snowflake
Basic Java implementation of the Snowflake algorithm

Why does my UUID use too much time?

String s = UUID.randomUUID().toString();
return s.substring(0,8) + s.substring(9,13) + s.substring(14,18) +
s.substring(19,23) + s.substring(24);
I use JDK1.5's UUID, but it uses too much time when I connect/disconnect from the net.
I think the UUID may want to access some net.
Can anybody help me?

UUID generation is done locally and doesn't require any alive network connection.

Quoting the API odc:
public static UUID randomUUID()
Static factory to retrieve a type 4 (pseudo randomly generated) UUID.
The UUID is generated using a
cryptographically strong pseudo random
number generator.
Your delay is probably being caused by the intialization of the cryptographically strong RNG - those take some time, and might even depend on the presence of a network connection as a source of entropy. However, this should happen only once during the runtime of the JVM. I don't see a way around this problem, though.

The javadoc for UUID http://java.sun.com/j2se/1.5.0/docs/api/java/util/UUID.html has some good information on how the UUID is generated. It uses the time and clock frequency to generate the UUID. Like sharptooth says, no network interface is required. Is there possibly some other concurrent process running that could possibly be causing this problem?

What's the purpose of those s.substring calls? It looks like you're returning the original string.

If you're appending 5 Strings together, over a large set of data, that could be the issue. Try to use StringBuffer. It's amazing the difference that can make when concatenating more than 1-2 Strings together, especially for larger datasets

For older versions of Java (6 and earlier maybe?), there's a bug in Random that causes it to iterate over the entire temp directory. We've seen seed generation take 10 minutes on some egregiously bad build machines at NVIDIA. You might want to check the size of your temp dir.
Compare: http://www.docjar.com/html/api/sun/security/provider/SeedGenerator.java.html
To: http://www.java2s.com/Open-Source/Java-Document/6.0-JDK-Modules/j2me/sun/security/provider/SeedGenerator.java.htm

Generating a globally unique identifier in Java

Summary: I'm developing a persistent Java web application, and I need to make sure that all resources I persist have globally unique identifiers to prevent duplicates.
The Fine Print:
I'm not using an RDBMS, so I don't have any fancy sequence generators (such as the one provided by Oracle)
I'd like it to be fast, preferably all in memory - I'd rather not have to open up a file and increment some value
It needs to be thread safe (I'm anticipating that only one JVM at a time will need to generate IDs)
There needs to be consistency across instantiations of the JVM. If the server shuts down and starts up, the ID generator shouldn't re-generate the same IDs it generated in previous instantiations (or at least the chance has to be really, really slim - I anticipate many millions of presisted resources)
I have seen the examples in the EJB unique ID pattern article. They won't work for me (I'd rather not rely solely on System.currentTimeMillis() because we'll be persisting multiple resources per millisecond).
I have looked at the answers proposed in this question. My concern about them is, what is the chance that I will get a duplicate ID over time? I'm intrigued by the suggestion to use java.util.UUID for a UUID, but again, the chances of a duplicate need to be infinitesimally small.
I'm using JDK6

Pretty sure UUIDs are "good enough". There are 340,282,366,920,938,463,463,374,607,431,770,000,000 UUIDs available.
http://www.wilybeagle.com/guid_store/guid_explain.htm
"To put these numbers into perspective, one's annual risk of being hit by a meteorite is estimated to be one chance in 17 billion, that means the probability is about 0.00000000006 (6 × 10−11), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs"
http://en.wikipedia.org/wiki/Universally_Unique_Identifier

public class UniqueID {
private static long startTime = System.currentTimeMillis();
private static long id;
public static synchronized String getUniqueID() {
return "id." + startTime + "." + id++;
}
}

If it needs to be unique per PC: you could probably use (System.currentTimeMillis() << 4) | (staticCounter++ & 15) or something like that.
That would allow you to generate 16 per ms. If you need more, shift by 5 and and it with 31...
if it needs to be unique across multiple PCs, you should also combine in your primary network card's MAC address.
edit: to clarify
private static int staticCounter=0;
private final int nBits=4;
public long getUnique() {
return (currentTimeMillis() << nBits) | (staticCounter++ & 2^nBits-1);
}
and change nBits to the square root of the largest number you should need to generate per ms.
It will eventually roll over. Probably 20 years or something with nBits at 4.

From memory the RMI remote packages contain a UUID generator. I don't know whether thats worth looking into.
When I've had to generate them I typically use a MD5 hashsum of the current date time, the user name and the IP address of the computer. Basically the idea is to take everything that you can find out about the computer/person and then generate a MD5 hash of this information.
It works really well and is incredibly fast (once you've initialised the MessageDigest for the first time).

why not do like this
String id = Long.toString(System.currentTimeMillis()) +
(new Random()).nextInt(1000) +
(new Random()).nextInt(1000);

if you want to use a shorter and faster implementation that java UUID take a look at:
https://code.google.com/p/spf4j/source/browse/trunk/spf4j-core/src/main/java/org/spf4j/concurrent/UIDGenerator.java
see the implementation choices and limitations in the javadoc.
here is a unit test on how to use:
https://code.google.com/p/spf4j/source/browse/trunk/spf4j-core/src/test/java/org/spf4j/concurrent/UIDGeneratorTest.java

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.