Java command-line application somehow retains state - java

Foreword: I apologise if this is a very silly error or something that is in fact well-documented. To me right now it seems very strange and makes absolutely no sense.
The Application
I have a Java command-line application built in IntelliJ IDEA Ultimate on macOS 10.13.4 that makes use of the four Maven libraries listed below. Its purpose is to download files from a website, navigating across paginated results in doing so.
One of this application's features is the ability to keep running in a loop, checking for new results if enough time has passed by the time it finishes its current scan. To do this, it calls Thread.sleep(remainingMillis) as part of the while condition in a do-while block.
The Problem
The application worked without any issues, but after introducing the Thread.sleep() call (I suspect this is the troublesome line anyway), some very strange behaviour occurs. The application performs the first run without issues, fetching three items from the configured website; it is then configured to ensure that 60 seconds have passed before running again. However, on subsequent runs, rather than scanning the first page of results, the logs indicate that it starts looking at page 31 (as an example), where it finds no results. Having failed to find anything, attempt two of three looks at page 32, and the final attempt looks at page 33; it then once again waits until 60 seconds have passed since the scan iteration began.
I can't confirm this, but it seems as though it then continues this count in subsequent scans: 34, 35, then 36, and waiting again. However, the code would suggest that this should have started at 1 again when another iteration of the while starts up.
This could have been IntelliJ or Java playing up, and it may have simply required cleaning out the bin/obj folders, but if this is something due to my code, I would much rather know about it so I don't encounter the same silly issue in the future.
The Observations
When I ran the application again a few days later with the current configuration, it didn't call Thread.sleep(), because more than 60 seconds had passed and it continued with the next iteration immediately. When this happens, the weird page index incrementing issue doesn't rear its head; instead, the next iteration continues from page 1 as it should.
Afterwards, running it such that it did Thread.sleep() for several seconds before starting the next iteration didn't cause a problem either... very strange. Was this a dream?
The Code
Sidenote: I added Thread.currentThread().interrupt() to try and fix this issue, but it didn't seem to have an effect.
public static void main(String[] args) {
do {
startMillis = System.currentTimeMillis();
int itemsFetched = startFetching(agent, config, record, 1, 0);
} while (shouldRepeat(config.getRepeatSeconds(), startMillis));
}
private static boolean shouldRepeat(int repeatSeconds, long startMillis) {
long passedMillis = System.currentTimeMillis() - startMillis;
int repeatMillis = repeatSeconds * 1000;
boolean repeatSecondsReached = passedMillis >= repeatMillis;
if (repeatSeconds < 0) {
return false;
} else if (repeatSecondsReached) {
return true;
}
long remainingMillis = repeatMillis - passedMillis;
int remainingSeconds = (int) (remainingMillis / 1000);
try {
Thread.sleep(remainingMillis);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new RuntimeException(e);
}
return true;
}
private static int startFetching(Agenter agent, MyApplicationConfig config, MyApplicationRecord record, int pageIndex, int itemsFetched) {
String categoryCode = config.getCategoryCode();
List<Item> items = agent.getPageOfItems(categoryCode, pageIndex, config);
if (items == null) {
return itemsFetched;
}
int maxItems = config.getMaxItems();
try {
for (Item item : items) {
String itemURL = item.getURL();
agent.downloadItem(itemURL, config, item.getItemCount());
itemsFetched++;
if (maxItems > 0 && itemsFetched >= maxItems) {
return itemsFetched;
}
}
} catch (IOException e) {
// Log
}
return startFetching(agent, config, record, pageIndex + 1, itemsFetched);
}
}
Maven Libraries
commons-cli:commons-cli:1.4
org.apache.logging.log4j:log4j-api:2.11.0
org.apache.logging.log4j:log4j-core:2.11.0
org.jsoup:jsoup:1.11.2

Check your Agenter implementation. The pageIndex is passed to agent.getPageOfItems, but the agent could also be storing it (or a counter derived from it) in an instance variable or something like that. If so, the error may be that this stored value is not reset (correctly) on subsequent calls, so later scans continue from where the previous one stopped.
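To illustrate the kind of bug this answer is guessing at, here is a minimal sketch. The real Agenter class is not shown in the question, so LeakyAgent, StatelessAgent, and nextPageToFetch are hypothetical names invented purely for this example; the only point is that a field on a long-lived object keeps its value across iterations of the outer do-while loop.

```java
// Hypothetical stand-in for the asker's Agenter; the real class is not shown in the question.
class LeakyAgent {
    // Instance state: lives as long as the agent object, i.e. across scan iterations.
    private int currentPage = 0;

    // Returns the page that would actually be requested for the given pageIndex.
    int nextPageToFetch(int pageIndex) {
        if (currentPage == 0) {
            currentPage = pageIndex; // the argument is only honoured on the very first call
        } else {
            currentPage++;           // bug: later calls ignore pageIndex and keep counting
        }
        return currentPage;
    }
}

// Fixed variant: no hidden state, the caller's pageIndex is always used.
class StatelessAgent {
    int nextPageToFetch(int pageIndex) {
        return pageIndex;
    }
}

class AgentStateDemo {
    public static void main(String[] args) {
        LeakyAgent leaky = new LeakyAgent();
        System.out.println(leaky.nextPageToFetch(1)); // first scan, page 1 -> prints 1
        System.out.println(leaky.nextPageToFetch(2)); // first scan, page 2 -> prints 2
        System.out.println(leaky.nextPageToFetch(1)); // second scan asks for page 1 -> prints 3
    }
}
```

If the real agent behaves like LeakyAgent, either creating a fresh agent per scan or resetting the field at the start of each scan would restore the expected behaviour.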

Related

Random for-loop in Java?

I have 25 batch jobs that are executed constantly, that is, when number 25 is finished, 1 is immediately started.
These batch jobs are started using a URL that contains a value from 1 to 25. Basically, I use a for loop from 1 to 25 where, in each round, I call a URL with the current value of i: http://batchjobserver/1, http://batchjobserver/2 and so on.
The problem is that some of these batch jobs are a bit unstable and sometimes crash, which causes the for loop to restart at 1. As a consequence, batch job 1 is run every time the loop is initiated, while 25 runs much less frequently.
I like my current solution because it is so simple (in pseudo code)
for (i=1; i < 26; i++) {
getURL ("http://batchjob/" + Integer.toString(i));
}
However, I would like i to be a random number between 1 and 25 so that, in case something crashes, all the batch jobs are, in the long run, run approximately the same number of times.
Is there some nice hack/algorithm that allows me to achieve this?
Other requirements:
The number 25 changes frequently
This is not an absolute requirement, but it would be nice if a batch job wasn't run again until all other jobs have been attempted once. This doesn't mean that it has to "wait" 25 loops before it can run again; instead, if job 8 is executed in the 25th loop (the last loop of the first "set" of loops), the 26th loop (the first loop in the second set of loops) can be 8 as well.
Randomness has another advantage: it is desirable if the execution of these jobs looks a bit manual.
To handle errors, you should use a try-catch statement. It should look something like this:
for (int i = 1; i < 26; i++) {
    try {
        getURL("http://batchjob/" + i);
    } catch (Exception e) {
        // Report the failure and carry on with the next job
        System.out.println(e);
    }
}
This is a very basic example of what can be done. This will, however, only skip the failed attempts, print the error, and continue to the next iteration of the loop.
There are two parts of your requirement:
Randomness: For this, you can use Random#nextInt.
Skip the problematic call and continue with the remaining ones: For this, you can use a try-catch block.
Code:
Random random = new Random();
for (int i = 1; i < 26; i++) {
try {
getURL ("http://batchjob/" + Integer.toString(random.nextInt(25) + 1));
} catch (Exception e) {
System.out.println("Error: " + e.getMessage());
}
}
Note: random.nextInt(25) returns an int value from 0 to 24 and thus, when 1 is added to it, the range becomes 1 to 25.
You could use a set and draw random numbers in the range of your batches, tracking which batches you have already run by adding them to the set, something like this:
int numberOfBatches = 25;
Set<Integer> set = new HashSet<>();
List<Integer> failedBatches = new ArrayList<>();
Random random = new Random();
// Stop once every batch number has been drawn exactly once
while (set.size() < numberOfBatches)
{
    int ran = random.nextInt(numberOfBatches) + 1; // 1..numberOfBatches
    // add() returns false if this number was already drawn
    if (!set.add(ran)) continue;
    try
    {
        getURL("http://batchjob/" + ran);
    } catch (Exception e)
    {
        failedBatches.add(ran);
    }
}
As an extra, you can save which batches failed.
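As an alternative way to meet the "every job once per round, in random order" wish, the job numbers can be shuffled up front instead of being drawn repeatedly until an unused one turns up. The following is only a sketch: ShuffledRounds and shuffledRound are names made up for this example, and the println stands in for the asker's getURL call, which is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ShuffledRounds {
    // Builds the job numbers 1..jobCount in random order for one round.
    static List<Integer> shuffledRound(int jobCount, Random random) {
        List<Integer> order = new ArrayList<>();
        for (int i = 1; i <= jobCount; i++) {
            order.add(i);
        }
        Collections.shuffle(order, random);
        return order;
    }

    public static void main(String[] args) {
        Random random = new Random();
        int jobCount = 25; // assumed to be read from configuration in practice
        for (int round = 0; round < 2; round++) {
            for (int job : shuffledRound(jobCount, random)) {
                try {
                    // Placeholder for getURL("http://batchjob/" + job)
                    System.out.println("Would call http://batchjob/" + job);
                } catch (Exception e) {
                    // A failing job is reported and the round continues
                    System.out.println("Error: " + e.getMessage());
                }
            }
        }
    }
}
```

Because each round is shuffled independently, the last job of one round can also be the first job of the next, which matches the relaxed requirement in the question.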
The following is an example of a single-threaded, infinitely looping (also called round-robin) scheduler with simple retry capabilities. I called the routine that invokes your batch job "scrape" (scraping means indexing a website's contents):
public static void main(String... args) throws Exception {
Runnable[] jobs = new Runnable[]{
() -> scrape("https://www.stackoverfow.com"),
() -> scrape("https://www.github.com"),
() -> scrape("https://www.facebook.com"),
() -> scrape("https://www.twitter.com"),
() -> scrape("https://www.wikipedia.org"),
};
for (int i = 0; true; i++) {
int remainingAttempts = 3;
while (remainingAttempts > 0) {
try {
jobs[i % jobs.length].run();
break;
} catch (Throwable err) {
err.printStackTrace();
remainingAttempts--;
}
}
}
}
private static void scrape(String website) {
System.out.printf("Doing my job against %s%n", website);
try {
Thread.sleep(100); // Simulate network work
} catch (InterruptedException e) {
throw new RuntimeException("Requested interruption");
}
if (Math.random() > 0.5) { // Simulate network failure
throw new RuntimeException("Ooops! I'm a random error");
}
}
You may want to add multi-threading capabilities (achieved by simply adding an ExecutorService guarded by a Semaphore) and more elaborate retry logic (for example, retrying only for certain types of errors and with an exponential backoff).
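The exponential backoff mentioned above could look roughly like the sketch below. BackoffRetry, backoffMillis, and runWithRetry are hypothetical names, and the choice to retry only on RuntimeException is an assumption made for the example rather than something taken from the answer.

```java
class BackoffRetry {
    // Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxDelayMillis.
    static long backoffMillis(int attempt, long baseMillis, long maxDelayMillis) {
        long delay = baseMillis << Math.min(attempt, 20); // cap the shift to avoid overflow
        return Math.min(delay, maxDelayMillis);
    }

    // Runs the job up to maxAttempts times; returns true as soon as one attempt succeeds.
    static boolean runWithRetry(Runnable job, int maxAttempts, long baseMillis, long maxDelayMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                job.run();
                return true;
            } catch (RuntimeException err) {
                if (attempt + 1 < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis(attempt, baseMillis, maxDelayMillis));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // preserve the interrupt for callers
                        return false;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean ok = runWithRetry(() -> {
            if (Math.random() > 0.5) { // simulate a flaky job, as in the answer above
                throw new RuntimeException("Ooops! I'm a random error");
            }
        }, 3, 100, 5_000);
        System.out.println("Job succeeded: " + ok);
    }
}
```

With a base of 100 ms the waits between attempts are 100, 200, 400 ms and so on, until the cap is reached.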

Java: Running potentially blocking code

I am developing a small game, (Java, LibGdx) where the player fills cloze-style functions with predefined lines of code. The game would then compile the code and run a small test suite to verify that the function does the stuff it is supposed to.
Compiling and running the code already works, but I am faced with the problem of detecting infinite loops. Consider the following function:
// should compute the sum of [1 .. n]
public int foo(int n) {
int i = 0;
while (n > 0) {
i += n;
// this is the place where the player inserts one of many predefined lines of code
// the right one would be: n--;
// but the player could also insert something silly like: i++;
}
return i;
}
Please note that the functions actually used may be more complex and in general it is not possible to make sure that there cannot be any infinite loops.
Currently I am running the small test suite (provided for every function) in a Thread using an ExecutorService, setting a timeout to abort waiting in case the thread is stuck. The problem with this is, that the threads stuck in an endless loop will run forever in the background, which of course will at some point have a considerable impact on game performance.
// TestClass is the compiled class containing the function above and the corresponding test suite
Callable<Boolean> task = new Callable<Boolean>() {
@Override
public Boolean call() throws Exception {
// call the test suite
return new TestClass().test();
}
};
Future<Boolean> future = executorService.submit(task);
try {
Boolean result = future.get(100, TimeUnit.MILLISECONDS);
System.out.println("result: " + (result == null ? "null" : result.toString()));
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
} catch (TimeoutException e) {
e.printStackTrace();
future.cancel(true);
}
My question is now: How can I gracefully end the threads that accidentally spin inside an endless loop?
EDIT: To clarify why, in this case, preventing infinite loops is not possible/feasible: The functions, their test suites and the lines to fill the gaps are loaded from disk. There will be hundreds of functions with at least two lines of code that could be inserted. The player can drag any line into any gap. The effort needed to make sure no combination of function gap/code line produces something that loops infinitely or even runs longer than the timeout grows exponentially with the number of functions. This quickly gets to the point where nobody has the time to check all of these combinations manually. Also, in general, determining whether a function will finish in time is pretty much impossible because of the halting problem.
There is no such thing as "graceful termination" of a thread inside the same process. The terminated thread can leave inconsistent shared-memory state behind it.
You can either organize things so that each task is started in its own JVM, or make do with forceful termination using the deprecated Thread.stop() method.
Another option is inserting a check into the generated code, but this would require much more effort to implement properly.
The right way is to change the design so that never-ending loops cannot occur.
For the time being, you could check inside the loop whether the thread has been interrupted, using Thread.currentThread().isInterrupted().
If it has, just exit the loop.
A never-ending loop is not normal unless it is intended.
To solve the problem, you can add a counter to the loop and exit once it reaches a limit.
int counter = 0;
while (n > 0) {
counter++;
if (counter > THRESHOLD) {
break;
}
i += n;
// this is the place where the player inserts one of many predefined lines of code
// the right one would be: n--;
// but the player could also insert something silly like: i++;
}
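The interruption check suggested above only helps if the check is part of the code that runs in the worker thread, and if something actually interrupts that thread. A minimal sketch of both halves follows; InterruptibleLoop, sum, and runWithTimeout are hypothetical names, and the decrement flag merely simulates the player inserting the right or the wrong line.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class InterruptibleLoop {
    // Sums n + (n-1) + ... + 1; with decrement == false the loop only ends when interrupted.
    static int sum(int n, boolean decrement) throws InterruptedException {
        int i = 0;
        while (n > 0) {
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedException("cancelled"); // cooperative exit point
            }
            i += n;
            if (decrement) {
                n--;
            }
        }
        return i;
    }

    // Returns the task's result, or null if it failed, timed out, or was cancelled.
    static Integer runWithTimeout(Callable<Integer> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Integer> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker, so the check in sum() fires
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return null;
        } catch (ExecutionException e) {
            return null;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(() -> sum(4, true), 1000)); // prints 10
        System.out.println(runWithTimeout(() -> sum(4, false), 100)); // prints null
    }
}
```

The limitation is the same as for the counter: the check has to be present in every loop of the player-assembled code, which is why the other answer mentions inserting it into the generated code.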

Java: First iteration of for loop takes longer

I am writing some code to test using the MIDI libraries in Java, and have run across a problem. The pause between notes is much longer (almost twice as long, in fact) after the very first note than after all the others. I can't see any reason why, as the sequence of notes has already been generated (hence it is not also having to perform those calculations within the first iteration of the loop, it is only playing notes).
I think I may have also had this problem in the past with a simulation which, without any explanation I could find, took almost 100% of its tick length to perform calculations on the first tick only, and then used only about 2% on all successive iterations.
Main code (extract):
public void play() {
MidiPlayer player = new MidiPlayer();
for (int i = 0; i < NUMNOTES; i++) {
long tic = System.currentTimeMillis();
player.playNote(10, notes[i]);
try {
Thread.sleep(200);
} catch (InterruptedException e) {
e.printStackTrace();
}
long toc = System.currentTimeMillis();
System.out.println(toc - tic);
}
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
Code for playNote():
public void playNote(int channel, int note) {
channels[channel].allNotesOff();
channels[channel].noteOn(note + 60, volume);
}
There are no 'if' statements that specify the first loop, so surely the delay should be uniform for all notes, as the number of calculations being performed should be the same for all iterations. Please note that the timing variables are just for testing purposes, and the effect was audibly noticeable before I included those.
EDIT: I should also mention that the output produced shows each iteration of the loop taking the expected 200 (occasionally 201) milliseconds. It seems to suggest that there is no gap - yet I clearly hear a gap every time I run the code.
Since you sleep between notes, calculate how long to sleep on each iteration instead of always sleeping for the same fixed time: measure how long playing the note took and sleep only for the remainder of the interval, i.e.
long tic = System.currentTimeMillis();
player.playNote(10, notes[i]);
long time_spent = System.currentTimeMillis() - tic;
// Guard against a negative argument, which would make Thread.sleep throw IllegalArgumentException
Thread.sleep(Math.max(0, 200 - time_spent));
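A variation on the same idea is to sleep until an absolute deadline that advances by a fixed interval, so that a slow first iteration does not shift all later notes. This is only a sketch under assumptions: FixedRateTicker and millisUntil are made-up names, and the println stands in for the asker's player.playNote call. Whether it removes the audible gap depends on where the initial delay really comes from, which the thread does not establish.

```java
class FixedRateTicker {
    // Remaining wait before the deadline; never negative, so it is safe to pass to Thread.sleep.
    static long millisUntil(long deadlineMillis, long nowMillis) {
        return Math.max(0L, deadlineMillis - nowMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        final long intervalMillis = 200;
        long next = System.currentTimeMillis();
        for (int i = 0; i < 5; i++) {
            // player.playNote(10, notes[i]) would go here
            System.out.println("tick " + i);
            next += intervalMillis; // absolute deadline for the following note
            Thread.sleep(millisUntil(next, System.currentTimeMillis()));
        }
    }
}
```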

Adding scheduled items to multi-slot crafting queue

I'm working on an Android game.
Areas in the game have crafting slots which determine how many items can be crafted at once. If there are no slots available, the item is given a scheduled start date which correlates to when a slot will become available.
The problem I'm encountering is that the current code only considers when the first slot will become available, not when any slot will.
Adding scheduled item:
long timeSlotAvailable = getTimeSlotAvailable(location);
Pending_Inventory newScheduledItem = new Pending_Inventory(itemId, state, timeSlotAvailable, quantity, craftTime, location);
Getting the time a slot is available:
public static long getTimeSlotAvailable(Long location) {
List<Pending_Inventory> pendingItems = getPendingItems(location, true);
int locationSlots = Slot.getUnlockedSlots(location);
long timeAvailable = System.currentTimeMillis();
// Works for single slots, not for multi though. Needs to consider slot count.
for (Pending_Inventory pending_inventory : pendingItems) {
long finishTime = pending_inventory.getTimeCreated() + pending_inventory.getCraftTime();
if (finishTime > timeAvailable) {
timeAvailable = finishTime;
}
}
return timeAvailable;
}
The code works by looking through every item currently being crafted or scheduled to craft, and getting the time the last one finishes.
locationSlots is currently unused, but I believe it will be required to calculate the correct time a slot will be available.
I've tried a few approaches (adding all finish times to an array & getting the n value showed promise, but I couldn't get my head around it), but am all out of ideas.
Thanks!
Eventually took another go at the array approach, and succeeded.
public static long getTimeSlotAvailable(Long location) {
List<Pending_Inventory> pendingItems = getPendingItems(location, true);
int numSlots = Slot.getUnlockedSlots(location);
// Add all of the times a slot will become available to a list
List<Long> finishTimes = new ArrayList<>();
for (Pending_Inventory pending_inventory : pendingItems) {
long finishTime = pending_inventory.getTimeCreated() + pending_inventory.getCraftTime();
finishTimes.add(finishTime);
}
// Sort these times so the latest time is first
Collections.sort(finishTimes, Collections.<Long>reverseOrder());
if (finishTimes.size() >= numSlots) {
// If we're all full up, get the first time a slot will become available
return finishTimes.get(numSlots-1);
} else {
// Otherwise, it can go in now
return System.currentTimeMillis();
}
}
Hopefully that helps someone in the future with a similar problem.
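The core of that solution is "take the numSlots-th latest finish time". A self-contained sketch of just that step, using plain long values instead of Pending_Inventory objects, might look like this. SlotAvailability and timeSlotAvailable are hypothetical names, and clamping the result to the current time is an added assumption that the original code does not make.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SlotAvailability {
    // Returns when a slot frees up, given the finish times of all queued items.
    static long timeSlotAvailable(List<Long> finishTimes, int numSlots, long now) {
        if (numSlots <= 0) {
            throw new IllegalArgumentException("numSlots must be positive");
        }
        List<Long> sorted = new ArrayList<>(finishTimes);
        sorted.sort(Collections.reverseOrder()); // latest finish time first
        if (sorted.size() >= numSlots) {
            // The numSlots latest items occupy every slot; the earliest of them frees one first.
            // Assumption: never report a time in the past.
            return Math.max(now, sorted.get(numSlots - 1));
        }
        return now; // a slot is free right away
    }

    public static void main(String[] args) {
        List<Long> finishTimes = Arrays.asList(100L, 300L, 200L);
        System.out.println(timeSlotAvailable(finishTimes, 2, 50L)); // prints 200
        System.out.println(timeSlotAvailable(finishTimes, 4, 50L)); // prints 50
    }
}
```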

In OS X, why does using println() cause my program to run faster than without println()

I've run into a really strange bug, and I'm hoping someone here can shed some light as it's way out of my area of expertise.
First, relevant background information: I am running OS X 10.9.4 on a Late 2013 Macbook Pro Retina with a 2.4GHz Haswell CPU. I'm using JDK SE 8u5 for OS X from Oracle, and I'm running my code on the latest version of IntelliJ IDEA. This bug also seems to be specific only to OS X, as I posted on Reddit about this bug already and other users with OS X were able to recreate it while users on Windows and Linux, including myself, had the program run as expected with the println() version running half a second slower than the version without println().
Now for the bug: In my code, I have a println() statement, and when it is included the program runs in ~2.5 seconds. If I remove the println() statement, either by deleting it or commenting it out, the program counterintuitively takes longer to run, at ~9 seconds. It's extremely strange, as I/O should theoretically slow the program down, not make it faster.
For my actual code, it's my implementation of Project Euler Problem 14. Please keep in mind I'm still a student so it's not the best implementation:
public class ProjectEuler14
{
public static void main(String[] args)
{
final double TIME_START = System.currentTimeMillis();
Collatz c = new Collatz();
int highestNumOfTerms = 0;
int currentNumOfTerms = 0;
int highestValue = 0; //Value which produces most number of Collatz terms
for (double i = 1.; i <= 1000000.; i++)
{
currentNumOfTerms = c.startCollatz(i);
if (currentNumOfTerms > highestNumOfTerms)
{
highestNumOfTerms = currentNumOfTerms;
highestValue = (int)(i);
System.out.println("New term: " + highestValue); //THIS IS THE OFFENDING LINE OF CODE
}
}
final double TIME_STOP = System.currentTimeMillis();
System.out.println("Highest term: " + highestValue + " with " + highestNumOfTerms + " number of terms");
System.out.println("Completed in " + ((TIME_STOP - TIME_START)/1000) + " s");
}
}
public class Collatz
{
private static int numOfTerms = 0;
private boolean isFirstRun = false;
public int startCollatz(double n)
{
isFirstRun = true;
runCollatz(n);
return numOfTerms;
}
private void runCollatz(double n)
{
if (isFirstRun)
{
numOfTerms = 0;
isFirstRun = false;
}
if (n == 1)
{
//Reached last term, does nothing and causes program to return to startCollatz()
}
else if (n % 2 == 0)
{
//Divides n by 2 following Collatz rule, running recursion
numOfTerms = numOfTerms + 1;
runCollatz(n / 2);
}
else if (n % 2 == 1)
{
//Multiples n by 3 and adds one, following Collatz rule, running recursion
numOfTerms = numOfTerms + 1;
runCollatz((3 * n) + 1);
}
}
}
The line of code in question has been commented in with all caps, as it doesn't look like SO does line numbers. If you can't find it, it's within the nested if() statement in my for() loop in my main method.
I've run my code multiple times with and without that line, and I consistently get the above stated ~2.5sec times with println() and ~9sec without println(). I've also rebooted my laptop multiple times to make sure it wasn't my current OS run and the times stay consistent.
Since other OS X 10.9.4 users were able to replicate the behaviour, I suspect it's due to a low-level bug in the compiler, the JVM, or the OS itself. In any case, this is way outside my knowledge. It's not a critical bug, but I am definitely interested in why this is happening and would appreciate any insight.
I did some research, and some more with @ekabanov, and here are the findings.
The effect you are seeing only happens with Java 8 and not with Java 7.
The extra line triggers a different JIT compilation/optimisation
The assembly code of the faster version is ~3 times larger, and a quick glance shows it did loop unrolling.
The JIT compilation log shows that the slower version successfully inlined runCollatz, while the faster one didn't, stating that the callee is too large (probably because of the unrolling).
There is a great tool that helps you analyse such situations, called JITWatch. For assembly-level analysis you also need the HotSpot Disassembler.
I'll also post my log files. You can feed the HotSpot log files to JITWatch, and the assembly output is something you can diff to spot the differences.
Fast version's hotspot log file
Fast version's assembly log file
Slow version's hotspot log file
Slow version's assembly log file
