Is it reproducible to use Arbitrary.sample from within an Action?

We have a stateful test for an order system. There is an Arbitrary that generates an Order object with a number of LineItems.
There are actions to:
Create an Order
Cancel a LineItem
The action to create an order takes the order itself, e.g.:
Arbitraries.defaultFor(Order.class).map(CreateOrderAction::new)
The state for the actions has knowledge about all created orders.
To cancel a LineItem, we need knowledge about what orders are created. Inside CancelLineItemAction is it safe to do the following?
LineItem line = Arbitraries.<Collection<Order>>of(state.orders())
.flatMap(order -> Arbitraries.<Collection<LineItem>>of(order.lineItems()))
.sample();
Based on the javadoc of Arbitrary.sample(), it seems safe, but this construct isn't explicitly mentioned in the documentation on stateful tests, and we don't want to use it extensively only to break the reproducibility of our tests.

TLDR
Arbitrary.sample() is not designed to be used in that way.
I recommend using a random cancel index with modulo over the number of line items.
1. Why Arbitrary.sample() is not recommended
Arbitrary.sample() is designed to be used outside of properties, e.g. to experiment with generated values or to use it in other contexts like JUnit Jupiter. There are at least three reasons not to use it inside a property or an action:
The underlying random seed used for generating values depends on what happens before sampling. Thus the results are not really reproducible.
Sampling will not consider any added domain contexts that may change what's being generated.
Values generated by sample() DO NOT PARTICIPATE IN SHRINKING.
2. Option 1: Hand in a Random object and use it for generating
Hand in a Random instance when generating a CancelLineItemAction:
Arbitraries.random().map(random -> new CancelLineItemAction(random))
Use the random to invoke a generator:
LineItem line = Arbitraries.of(state.orders())
.flatMap(order -> Arbitraries.of(order.lineItems()))
.generator(100).next(random).value();
But actually that's very involved for what you want to do. Here's a simplification:
3. Option 2: Hand in a Random object and use it for picking a line item
Same as above but don't take a detour with sampling:
List<LineItem> lineItems = state.orders().stream()
.flatMap(order -> order.lineItems().stream())
.collect(Collectors.toList());
int randomIndex = random.nextInt(lineItems.size());
LineItem line = lineItems.get(randomIndex);
Both option 1 and 2 will (hopefully) behave reasonably in jqwik's lifecycle
but they won't attempt any shrinking. That's why I recommend the next option.
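As a library-independent illustration of option 2, here is a minimal sketch of picking an element with a handed-in Random. The class and method names (RandomPick, pick) are hypothetical and not part of jqwik; the point is only that the same seed leads to the same pick, which is what makes the choice reproducible.

```java
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of jqwik.
final class RandomPick {
    private RandomPick() {}

    /** Picks one element using the supplied Random; the list must not be empty. */
    static <T> T pick(List<T> items, Random random) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("nothing to pick from");
        }
        // nextInt(bound) returns a value in [0, bound)
        return items.get(random.nextInt(items.size()));
    }
}
```

Two Random instances created with the same seed will select the same element from the same list, but there is no smaller value for a framework to shrink towards.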
4. Option 3: Hand in a cancel index and modulo it over the number of line items
To generate the action:
Arbitraries.integers().between(0, MAX_LINE_ITEMS)
.map(cancelIndex -> new CancelLineItemAction(cancelIndex))
Use it in action:
List<LineItem> lineItems = state.orders().stream()
.flatMap(order -> order.lineItems().stream())
.collect(Collectors.toList());
int randomIndex = cancelIndex % lineItems.size();
LineItem line = lineItems.get(randomIndex);
The approach is described in more detail here: https://blog.johanneslink.net/2020/03/11/model-based-testing/
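The index-with-modulo idea from option 3 can be sketched without any jqwik types. IndexPick and pick are hypothetical names, and returning an empty Optional for an empty list is an assumption of this sketch (an action could equally use a precondition to skip that case).

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of jqwik.
final class IndexPick {
    private IndexPick() {}

    /** Maps any int index onto a valid position; empty if there is nothing to pick. */
    static <T> Optional<T> pick(List<T> items, int index) {
        if (items.isEmpty()) {
            return Optional.empty(); // avoids division by zero in the modulo
        }
        // floorMod keeps the result in [0, size) even for negative indexes
        return Optional.of(items.get(Math.floorMod(index, items.size())));
    }
}
```

Because the generated value is a plain integer, it can shrink towards 0, and the modulo maps whatever value remains onto an existing line item.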
5. Future Outlook
In some more or less distant future, jqwik may allow handing in the current state when generating actions. This would make cases like yours a bit simpler. But this feature has not yet been prioritized.

Related

How to generate fixed length random number without conflict?

I'm working on an application where I have to generate codes like Google Classroom does. When a user creates a class, I generate a code using the following function:
private String codeGenerator(){
StringBuilder stringBuilder=new StringBuilder();
String chars="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
int characterLength=chars.length();
for(int i=0;i<5;i++){
stringBuilder.append(chars.charAt((int)Math.floor(Math.random()*characterLength)));
}
return stringBuilder.toString();
}
As I have 62 different characters, I can generate 62^5 codes in total, which is quite large. I can generate this code on the server or on the user's device. So my question is: which is the better approach? How likely is it that a generated code will conflict with another code?
From a comment, it seems that you are generating group codes for your own application.
For the purposes and scale of your app, 5-character codes may be appropriate. But there are several points you should know:
Random number generators are not designed to generate unique numbers. You can generate a random code as you're doing now, but you should check that code for uniqueness (e.g., check it against a table that stores group codes already generated) before you treat that code as unique.
If users are expected to type in a group code, you should include a way to check whether a group code is valid, to avoid users accidentally joining a different group than intended. This is often done by adding a so-called "checksum digit" to the end of the group code. See also this answer.
It seems that you're trying to generate codes that should be hard to guess. In that case, Math.random() is far from suitable (as is java.util.Random) — especially because the group codes are so short. Use a secure random generator instead, such as java.security.SecureRandom (fortunately for you, its security issues were addressed in Android 4.4, which, as I can tell from a comment of yours, is the minimum Android version your application supports; see also this question). Also, if possible, make group codes longer, such as 8 or 12 characters long.
For more information, see Unique Random Identifiers.
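The first and third points above can be combined in a short sketch: codes are drawn from java.security.SecureRandom and re-drawn until they are not already known. The class name GroupCodes and the in-memory HashSet are assumptions for illustration; in a real application the set would be the table of already issued group codes.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the HashSet stands in for a database table of issued codes.
final class GroupCodes {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    private final SecureRandom random = new SecureRandom();
    private final Set<String> issued = new HashSet<>();

    /** Returns a code of the given length that has not been issued before. */
    synchronized String next(int length) {
        String code;
        do {
            StringBuilder sb = new StringBuilder(length);
            for (int i = 0; i < length; i++) {
                sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
            code = sb.toString();
        } while (!issued.add(code)); // add() returns false for a duplicate, so retry
        return code;
    }
}
```

Calling next(8) instead of next(5) follows the suggestion to make the codes longer.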
Also, there is another concern. There is a serious security issue if the 5-character group code is the only thing that grants access to that group. Ideally, there should be other forms of authorization, such as allowing only logged-in users or certain logged-in users—
to access the group via that group code, or
to accept invitations to join the group via that group code (e.g., in Google Classroom, the PERMISSION_DENIED error code can be raised when a user tries to accept an invitation to join a class).
The only way to avoid duplicates in your scheme is to keep a copy of the ones that you have already generated, and avoid "generating" anything that would result in a duplicate. Since at most 62^5 (about 916 million) codes can ever exist, you could simply store them in a table if using a database, or in a HashSet if everything is in memory and there is only one instance of the application (remember to save the list of generated IDs to disk every time you create a new one, and to re-read it at startup).
The chances of a collision are low at first, but they grow quickly: by the birthday paradox you would only need to generate around 62^(5/2) ~= 30,000 really-random identifiers (roughly the square root of the number of possible codes) for a collision to become about as likely as not. Storing and checking the identifiers for duplicates is the only way to detect that this has happened. Such is the price of security.
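To put numbers on the birthday paradox: with 62 symbols and 5 positions there are 62^5 possible codes, and the usual approximation for the collision probability after n random draws from N possibilities is 1 - exp(-n(n-1)/(2N)). The sketch below (hypothetical class Birthday) evaluates that approximation; it is an estimate, not an exact probability.

```java
// Hypothetical helper evaluating the standard birthday approximation.
final class Birthday {
    private Birthday() {}

    /** Number of distinct codes: alphabetSize raised to the power of length. */
    static long codeSpace(int alphabetSize, int length) {
        long n = 1;
        for (int i = 0; i < length; i++) {
            n = Math.multiplyExact(n, alphabetSize); // throws on overflow
        }
        return n;
    }

    /** Approximate probability of at least one collision among `generated` random codes. */
    static double collisionProbability(long generated, long space) {
        double pairs = (double) generated * (generated - 1) / 2.0;
        return 1.0 - Math.exp(-pairs / space);
    }
}
```

For example, Birthday.collisionProbability(36_000, Birthday.codeSpace(62, 5)) comes out slightly above one half, while for 1,000 codes it is well below one percent.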
Que: A sack contains a blue ball and a red ball. I draw one ball from the sack. What are the chances it is a red ball?
Ans: 1/2
Que: I have a collection of 62^5 unique codes. I choose one code from the collection. What are the chances that it is "ABCDE"?
Ans: 1/(62^5)
NOTE: Random number generators are not actually random.
Well, in case you need a unique generator, what about the following? It is definitely not random, but it is definitely unique within one instance.
public final class UniqueCodeGenerator implements Supplier<String> {
private int code;
@Override
public synchronized String get() {
return String.format("%05d", code++);
}
public static void main(String... args) {
Supplier<String> generator = new UniqueCodeGenerator();
for (int i = 0; i < 10; i++)
System.out.println(generator.get());
}
}
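A counter formatted with %05d only uses digits and runs out of 5-character values after 99999. As a variation on the same idea, the following sketch encodes a counter in base 62 using the alphabet from the question, so the codes keep the original format. SequentialCodes and encode are hypothetical names, and the codes are still predictable, not random.

```java
// Hypothetical variation: sequential but formatted like the question's codes.
final class SequentialCodes {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    private SequentialCodes() {}

    /** Encodes a non-negative counter as a fixed-width base-62 string. */
    static String encode(long counter, int width) {
        if (counter < 0) {
            throw new IllegalArgumentException("counter must not be negative");
        }
        char[] out = new char[width];
        long rest = counter;
        // fill from the right, least significant symbol first
        for (int i = width - 1; i >= 0; i--) {
            out[i] = ALPHABET.charAt((int) (rest % ALPHABET.length()));
            rest /= ALPHABET.length();
        }
        if (rest != 0) {
            throw new IllegalArgumentException("counter does not fit in " + width + " characters");
        }
        return new String(out);
    }
}
```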

Create a flowable with generate function using RxJava2

I need to create a custom Flowable with backpressure implemented. I'm trying to achieve some sort of paging. That means when downstream requests 5 items I will "ask the data source" for items 0 - 5. Then when downstream needs another 5, I'll get items 5 - 10 and emit back.
The best thing I've found so far is to use the Flowable.generate method, but I really don't understand why there is no way (as far as I know) to get the number of items the downstream is requesting. I can use the state property of the generator to save the index of the last item requested, so then I need only the number of newly requested items. The emitter instance I get in the BiFunction apply is a GeneratorSubscription, which extends AtomicLong. So casting the emitter to AtomicLong can get me the requested number. But I know this can't be the "recommended" way.
On the other hand when you use Flowable.create you get the FlowableEmitter which has long requested() method. Using generate is suiting me more for my use-case, but now I'm also curious what is the "correct" way to use Flowable.generate.
Maybe I'm overthinking the whole thing so please point me in the right direction. :) Thank you.
This is what the actual code looks like (in Kotlin):
Flowable.generate(Callable { 0 }, BiFunction { start /*state*/, emitter ->
val requested = (emitter as AtomicLong).get().toInt() //this is bull*hit
val end = start + requested
//get items [start to end] -> items
emitter.onNext(items)
end /*return the new state*/
})
OK, I found out that the apply function of the BiFunction is called as many times as the requested amount (n), so there's no reason to have a getter for it. It's not what I had hoped for, but it is apparently how generate works. :)
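The behaviour described above can be modelled with a toy class in plain Java. This is NOT RxJava and not its implementation; ToyGenerator, request and pagedSource are hypothetical names used only to show the shape: the step function emits one item per call, the state carries the next index, and a request for n items simply runs the step n times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Toy model of a generate-style source; not RxJava's implementation.
final class ToyGenerator<S, T> {
    private S state;
    private final BiFunction<S, Consumer<T>, S> step;
    private final List<T> received = new ArrayList<>();

    ToyGenerator(S initialState, BiFunction<S, Consumer<T>, S> step) {
        this.state = initialState;
        this.step = step;
    }

    /** Downstream asks for n items: the step function runs n times, one item each. */
    List<T> request(long n) {
        for (long i = 0; i < n; i++) {
            state = step.apply(state, received::add);
        }
        return new ArrayList<>(received); // everything delivered so far
    }

    /** Example source: the state is the index of the next item to emit. */
    static ToyGenerator<Integer, String> pagedSource() {
        return new ToyGenerator<>(0, (index, emit) -> {
            emit.accept("item-" + index); // exactly one item per call
            return index + 1;             // new state
        });
    }
}
```

With this shape the step function never needs to know the requested amount; paging from a data source could be layered on top by keeping a fetched page in the state and emitting its items one by one.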

Impose order in Jsprit with HardActivityConstraint

In a scenario of re-solving a previously solved problem (with some new data, of course), it's typically impossible to re-assign a vehicle's very-first assignment once it was given. The driver is already on its way, and any new solution has to take into account that:
the job must remain his (can't be assigned to another vehicle)
the activity that's been assigned to him as the very-first, must remain so in future solutions
For the sake of simplicity, I'm using a single vehicle scenario, and only trying to impose the second bullet (i.e. ensure that a certain activity will be the first in the solution).
This is how I defined the constraint:
new HardActivityConstraint()
{
@Override
public ConstraintsStatus fulfilled(JobInsertionContext iFacts, TourActivity prevAct, TourActivity newAct, TourActivity nextAct,
double prevActDepTime)
{
String locationId = newAct.getLocation().getId();
// we want to make sure that any solution will have "C1" as its first activity
boolean activityShouldBeFirst = locationId.equals("C1");
boolean attemptingToInsertFirst = (prevAct instanceof Start);
if (activityShouldBeFirst && !attemptingToInsertFirst)
return ConstraintsStatus.NOT_FULFILLED_BREAK;
if (!activityShouldBeFirst && attemptingToInsertFirst)
return ConstraintsStatus.NOT_FULFILLED;
return ConstraintsStatus.FULFILLED;
}
}
This is how I build the algorithm:
VehicleRoutingAlgorithmBuilder vraBuilder;
vraBuilder = new VehicleRoutingAlgorithmBuilder(vrpProblem, "schrimpf.xml");
vraBuilder.addCoreConstraints();
vraBuilder.addDefaultCostCalculators();
StateManager stateManager = new StateManager(vrpProblem);
ConstraintManager constraintManager = new ConstraintManager(vrpProblem, stateManager);
constraintManager.addConstraint(new HardActivityConstraint() { ... }, Priority.HIGH);
vraBuilder.setStateAndConstraintManager(stateManager, constraintManager);
VehicleRoutingAlgorithm algorithm = vraBuilder.build();
The results are not good. I'm only getting solutions with a single job assigned (the one with the required activity). In debug it's clear that the job insertion iterations consider many viable options that appear to solve the problem entirely, but at the bottom line, the best solution returned by the algorithm doesn't include the other jobs.
UPDATE: even more surprising, is that when I use the constraint in scenarios with over 5 vehicles, it works fine (worst results are with 1 vehicle).
I'll gladly attach more information if needed.
Thanks
Zach
First, you can use initial routes to ensure that certain jobs need to be assigned to specific vehicles right from the beginning (see example).
Second, to ensure that no activity will be inserted between start and your initial job (location), e.g. "C1" in your example, you need to prohibit it the way you defined your HardActivityConstraint; just modify it so that a newAct can never be between prevAct=Start and nextAct=act(C1).
Third, with regard to your update, keep in mind that the essence of the algorithm is to ruin part of the solution (remove a number of jobs) and recreate the solution again (insert the unassigned jobs). Currently, the schrimpf algorithm ruins a number of jobs relative to the total number of jobs, i.e. noJobs = 0.5 * totalNoJobs for the random ruin and 0.3 * totalNoJobs for the radial ruin. If your problem is very small, the share of jobs to be removed might not be sufficient. This is going to change with the next release, where you can use an algorithm out of the box which defines an absolute minimum of jobs that need to be removed. For the time being, modify the shares in your algorithmConfig.xml.
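The rule from the second point can be written down as a small, library-independent decision function. Everything here is hypothetical: FirstActivityRule is not a Jsprit class, Status only mirrors the three ConstraintsStatus values used in the question, and locations are reduced to their id strings. Inside a real HardActivityConstraint the same checks would be made on prevAct, newAct and nextAct.

```java
// Library-independent sketch; Status mirrors the ConstraintsStatus values used in the question.
final class FirstActivityRule {
    enum Status { FULFILLED, NOT_FULFILLED, NOT_FULFILLED_BREAK }

    static final String PINNED = "C1"; // the location that must stay first

    private FirstActivityRule() {}

    static Status check(boolean prevIsStart, String newLocationId, String nextLocationId) {
        if (PINNED.equals(newLocationId)) {
            // the pinned activity may only go directly after the start
            return prevIsStart ? Status.FULFILLED : Status.NOT_FULFILLED_BREAK;
        }
        if (prevIsStart && PINNED.equals(nextLocationId)) {
            // nothing may be squeezed in between Start and the pinned activity
            return Status.NOT_FULFILLED;
        }
        return Status.FULFILLED;
    }
}
```

Unlike the constraint in the question, this variant only rejects the slot between Start and C1, which assumes C1 is already in the route (for example via an initial route).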

Testing for Randomness in JUnit Framework

I'm working on training myself in a very rigid Test Driven Development JUnit atmosphere. I'm trying to find out what the best method for testing FOR randomness would be in such an atmosphere. For example, I'm working on implementing a randomized queue array that queues an item and immediately switches that item with an item at an index from 0 to (n-1) in the array (thus simulating a random item coming off the queue when it is dequeued). Here's some example code from my enqueue method:
int randIndex = StdRandom.uniform(size); // generate random index to swap with last item
Item tmp = randArray[randIndex];
randArray[size] = item;
randArray[randIndex] = randArray[size]; //perform swap to create a random item for dequeue
randArray[size] = tmp;
size++;
I want to run a few tests to make sure that my enqueue method is actually randomly switching the queued variable with some other index in the array. Normally I'd just throw some code in the Main() method that iterates through a bunch of enqueue() calls and prints the results, then I'd check to make sure it "felt" random.
But, like I said, I want to do this in a very rigid unit testing framework. It seems like JUnit pretty much exclusively uses assert statements, but I'm not sure what I should assert against what, unless I just run some Monte Carlo type thing and check the average against a certain epsilon, but that seems a little much for testing such a simple method.
You can split the test in two parts.
1) Test that, for a given sequence of pseudo-random numbers, your queueing works as expected. For that, define an arbitrary fixed sequence of int values, e.g. "5,2,100,3".
Then test with assert that enqueue and dequeue deliver the expected elements.
2) Test the Random() class of Java: you should most likely omit that test, because Random() is well implemented.
Otherwise, for 2) you could use a chi-square random number test and check that the statistic is within some epsilon, as you stated. But this would be overkill, so stay with point 1).
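Point 1) could look roughly like the sketch below. It assumes the queue receives its source of indexes from outside, which is a design change compared with calling StdRandom directly; RandomizedQueue, scripted and the list-based storage are hypothetical. Note that here the swap index is drawn after the item is added, so the new item may also stay in place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical queue whose index source can be replaced in tests.
final class RandomizedQueue<T> {
    private final List<T> items = new ArrayList<>();
    private final IntUnaryOperator indexSource; // maps a bound to an index in [0, bound)

    RandomizedQueue(IntUnaryOperator indexSource) {
        this.indexSource = indexSource;
    }

    void enqueue(T item) {
        items.add(item);
        int last = items.size() - 1;
        int swapWith = indexSource.applyAsInt(items.size());
        // swap the new item with the element at the chosen index
        T tmp = items.get(swapWith);
        items.set(swapWith, items.get(last));
        items.set(last, tmp);
    }

    T dequeue() {
        return items.remove(items.size() - 1);
    }

    /** Test helper: replays a fixed sequence of indexes and ignores the bound. */
    static IntUnaryOperator scripted(int... indexes) {
        int[] position = {0};
        return bound -> indexes[position[0]++];
    }
}
```

In production the queue would be created with something like new RandomizedQueue<>(bound -> random.nextInt(bound)), while a test passes RandomizedQueue.scripted(0, 0, 1) and asserts on the exact dequeue order.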
I'm not sure what you are really aiming for, but I read it as testing the random number generator itself (because your switching is quite straightforward).
If you use Java's SecureRandom, you should be in good shape regarding entropy; see SecureRandom. If you doubt that, use some entropy checkers, or just a sequence of real random numbers from some source on the internet, like here.

Existing solution to "smart" initial capacity for StringBuilder

I have a piece of logging- and tracing-related code which is called often throughout the code, especially when tracing is switched on. StringBuilder is used to build a String. The Strings have a reasonable maximum length, I suppose on the order of hundreds of chars.
Question: Is there an existing library to do something like this:
// in reality, StringBuilder is final,
// would have to create delegated version instead,
// which is quite a big class because of all the append() overloads
public class SmarterBuilder extends StringBuilder {
    private final AtomicInteger capRef;
    SmarterBuilder(AtomicInteger capRef) {
        // super(...) must be the first statement in a constructor;
        // optionally save memory at the expense of worst-case resizes:
        // super(capRef.get() * 3 / 4);
        super(capRef.get());
        this.capRef = capRef;
    }
    public void syncCap() {
        // call when string is fully built
        int cap;
        do {
            cap = capRef.get();
            if (cap >= length()) break;
        } while (!capRef.compareAndSet(cap, length()));
    }
}
To take advantage of this, my logging-related class would have a shared capRef variable with suitable scope.
(Bonus Question: I'm curious, is it possible to do syncCap() without looping?)
Motivation: I know the default length of StringBuilder is always too little. I could (and currently do) throw in an ad-hoc initial capacity value of 100, which results in a resize in some cases, but not always. However, I do not like magic numbers in the source code, and this feature is a case of "optimize once, use in every project".
Make sure you do the performance measurements to make sure you really are getting some benefit for the extra work.
As an alternative to a StringBuilder-like class, consider a StringBuilderFactory. It could provide two static methods, one to get a StringBuilder, and the other to be called when you finish building a string. You could pass it a StringBuilder as argument, and it would record the length. The getStringBuilder method would use statistics recorded by the other method to choose the initial size.
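A minimal sketch of such a factory might look as follows. The method names getStringBuilder and finished, the starting value of 16 and the choice of "largest length seen so far" as the recorded statistic are all assumptions of this sketch; any other statistic could be recorded instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory; "largest length seen so far" is just one possible statistic.
final class StringBuilderFactory {
    private static final AtomicInteger maxSeen = new AtomicInteger(16);

    private StringBuilderFactory() {}

    /** Returns a builder sized for the longest string recorded so far. */
    static StringBuilder getStringBuilder() {
        return new StringBuilder(maxSeen.get());
    }

    /** Call when the string is fully built; records its length. */
    static void finished(StringBuilder builder) {
        // atomically keeps the maximum of the old value and the new length
        maxSeen.accumulateAndGet(builder.length(), Math::max);
    }
}
```

Callers would obtain a builder, append to it, and pass it to finished before calling toString, so no delegating StringBuilder subclass is needed.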
There are two ways you could avoid looping in syncCap:
Synchronize.
Ignore failures.
The argument for ignoring failures in this situation is that you only need a random sampling of the actual lengths. If another thread is updating at the same time you are getting an up-to-date view of the string lengths anyway.
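The "ignore failures" variant can be sketched as a single compare-and-set attempt with no retry. CapacityHint and record are hypothetical names; the returned boolean is only there to make the behaviour visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for the shared capacity value.
final class CapacityHint {
    private final AtomicInteger cap = new AtomicInteger(16);

    int get() {
        return cap.get();
    }

    /**
     * One attempt, no loop: returns true if the hint was raised.
     * If another thread changed the value in between, the update is simply dropped.
     */
    boolean record(int length) {
        int current = cap.get();
        return length > current && cap.compareAndSet(current, length);
    }
}
```

A dropped update only means that this particular length is not reflected in the hint; a later call with a long string will raise it again.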
You could store the length of each string in a statistics array. Run your app, and at shutdown take the 90th percentile of your string lengths (sort all string length values, and take the length value at array pos = sortedStrings.size() * 0.9).
That way you get an initial StringBuilder size into which 90% of your strings will fit.
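The percentile calculation itself is short. The sketch below uses the nearest-rank definition (the smallest recorded value such that at least the given percentage of samples are less than or equal to it), which differs slightly from indexing at size * 0.9; LengthStats and percentile are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper using the nearest-rank percentile definition.
final class LengthStats {
    private LengthStats() {}

    /** Returns the nearest-rank percentile (1..100) of the recorded lengths. */
    static int percentile(int[] lengths, int percent) {
        if (lengths.length == 0) {
            throw new IllegalArgumentException("no samples recorded");
        }
        if (percent < 1 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 1 and 100");
        }
        int[] sorted = lengths.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (sorted.length * percent + 99) / 100; // integer ceil(n * percent / 100)
        return sorted[rank - 1];
    }
}
```

For the lengths 1 to 10 this yields 9 for the 90th percentile, so nine out of ten strings fit without a resize.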
Update
The value could be hard-coded (like Java does with the value 10 in ArrayList), read from a config file, or calculated automatically in a test phase. But the percentile calculation is not free, so it is best to run your project for some time, measure the 90th percentile on the fly inside the SmartBuilder, output it from time to time, and later change the property file to use that value.
That way you would get optimal results for each project.
Or if you go one step further: Let your smart Builder update that value from time to time in the config file.
But all of this is rarely worth the effort; you would do it only for data with millions of entries, like digital road maps.
