Java: Splitting Up Work Between Frames

First I'll explain what I want to do, and afterwards I'll provide a proposed solution.
Problem
I'm running a game where I want to do a certain amount of work every frame. For example, I have N objects that are in a queue waiting to be initialized (imagine initialization is a fairly expensive operation and N is large) and after adding them all, I want to create their collision boxes, and after that, I want to merge them together to limit render calls. I can't do these operations on a different thread because all this stuff is heavily coupled with the game world. But I want to split up all these operations into bite-size chunks to run each frame so that there is minimal lag (framerate dips). How would I go about doing this?
Proposed Solution
It would be nice to have a function that can stop after one call and continue where it left off after calling it again:
For example,
// Pseudocode: imagine stop() suspends the function here and the next call
// resumes from the same point. This is not valid Java as written.
boolean loadEverything() {
    for (int i = 0; i < objectsToAdd.length; i++) {
        world.add(objectsToAdd[i]);
        if (i % 10 == 0) {
            return stop(); // pause after each batch of 10
        }
    }
    makeCollision();
    return stop(); // pause again before merging
    mergeObjects();
    return true;
}
Calling loadEverything() the first objectsToAdd.length / 10 times adds 10 objects to the game world at a time. Calling it after that should run makeCollision() and then stop. Calling it once more runs mergeObjects(), and the function returns true. In the caller I would run loadEverything() each frame until it returns true.
I'm aware that yield-return/yield-break implementations like those described here exist, but I'm wondering if there's a more general implementation of them, or whether a better solution exists that doesn't require any extra dependencies.

Have you looked at coroutines yet? There's a native implementation in Kotlin, but in Java there are options here and here.
In any case, we need to make sure that OpenGL or Box2D operations that are required to run on the main thread stay on the main thread, as I believe a coroutine will be created on a new thread. So there might be no gain in splitting up work for those kinds of operations.
Another option
You say you need to split up the work of creating objects at run time. Can you predict or estimate the number of objects you will want beforehand? If you don't really need to create objects dynamically like that, I suggest looking at the Object Pool in libgdx (see more here). That link has a working example of using Pool in your game.
Such a Pool already has initialized objects ready to be grabbed and used on demand, and it can grow at run time if needed. So if you can initially provide a good estimate of the number of objects you intend to use, you're all set.

Why don't you add one static variable that keeps its value between function calls? Then you can loop from the current value to the current value + 10, increase the current value (that static variable) by 10, and exit.
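That saved-index idea generalizes to a small state machine that spreads all three phases across frames. Here's a minimal, self-contained sketch; the Loader class, BATCH_SIZE, and the phase names are illustrative, not from the question, and the expensive calls are stubbed out as comments:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: an incremental loader that does one frame's worth of work per call
// and resumes where it left off. Loader and BATCH_SIZE are hypothetical names.
public class Loader {
    private static final int BATCH_SIZE = 10;
    private enum Phase { ADDING, COLLISION, MERGING, DONE }

    private Phase phase = Phase.ADDING;
    private int index;                         // persists between calls
    private final List<String> world = new ArrayList<>();
    private final String[] objectsToAdd;

    Loader(String[] objectsToAdd) { this.objectsToAdd = objectsToAdd; }

    /** Call once per frame; returns true when everything is loaded. */
    boolean loadEverything() {
        switch (phase) {
            case ADDING:
                int stop = Math.min(index + BATCH_SIZE, objectsToAdd.length);
                while (index < stop) world.add(objectsToAdd[index++]);
                if (index == objectsToAdd.length) phase = Phase.COLLISION;
                return false;
            case COLLISION:
                // makeCollision();  // expensive, gets its own frame
                phase = Phase.MERGING;
                return false;
            case MERGING:
                // mergeObjects();   // expensive, gets its own frame
                phase = Phase.DONE;
                return false;
            default:
                return true;
        }
    }

    public static void main(String[] args) {
        Loader loader = new Loader(new String[25]);
        int frames = 0;
        while (!loader.loadEverything()) frames++;
        // 25 objects -> 3 adding frames + 1 collision frame + 1 merging frame
        System.out.println(frames);  // 5
    }
}
```

The caller just invokes loadEverything() once per frame until it returns true, exactly as the question describes.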

Related

Reducing Complexity of Agent-Based Models

I am developing an agent-based model simulating the in-vitro growth of a cell culture.
I am using the MASON library (Java), but I guess my question could be applicable to different implementations.
Essentially, my agents are programmed to divide every 12 +/- 2 timesteps after their creation. Every time an agent divides, a new one is added to the simulation.
This leads to a very rapid growth of the problem's complexity, which quickly makes the simulation particularly slow.
In order to solve this problem, I decided agents should 'die' after t timesteps from creation.
However, MASON's schedule is built on a BinaryHeap, which does not easily allow the removal of objects (agents) once they have been added. My solution has been to set a boolean flag:
dead = false;
Which is set to true after t time-steps.
So
if (t == 50)
    dead = true;
I then begin my step method, that is the method called each time an agent is stepped, as follows:
if (dead)
    return;
However, I understand that simply accessing the object in the schedule is enough to slow the simulation down.
Does anybody have any suggestions as to how I could unset the agent or prevent it from being called?
Thanks,
Dario
Taken from the MASON documentation, page 94:
If your agent is scheduled repeating, the scheduleRepeating(...) method returned a sim.engine.Stoppable object. To prevent the agent from having its step(...) method ever called again, just call stop() on the Stoppable. This will also make the agent able to be garbage collected.
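To illustrate the pattern described in that quote without pulling in MASON, here is a self-contained sketch; the Schedule class and the Stoppable interface below are simplified stand-ins for MASON's real sim.engine classes, not their actual implementations:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for MASON's scheduleRepeating(...) / Stoppable pattern.
// The real MASON API lives in sim.engine; these classes only mimic its shape.
interface Stoppable { void stop(); }

class Schedule {
    private final List<Runnable> agents = new ArrayList<>();

    Stoppable scheduleRepeating(Runnable agent) {
        agents.add(agent);
        return () -> agents.remove(agent);   // stop() unschedules the agent
    }

    void step() {
        for (Runnable a : new ArrayList<>(agents)) a.run();
    }
}

public class StoppableDemo {
    public static void main(String[] args) {
        Schedule schedule = new Schedule();
        int[] steps = {0};
        Stoppable stopper = schedule.scheduleRepeating(() -> steps[0]++);

        schedule.step();
        schedule.step();
        stopper.stop();      // the agent is never stepped again...
        schedule.step();

        System.out.println(steps[0]);  // 2
    }
}
```

Once nothing else references the agent, stopping it this way also lets it be garbage collected, which is what makes it preferable to the dead-flag check in the question.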

distribution of processes with MPI

My story
I am quite a beginner in parallel programming (I have never done anything more than writing some basic multithreaded things) and I need to parallelize some multithreaded Java code in order to make it run faster. The multithreaded algorithm simply generates threads and passes them to the operating system, which distributes them for me. The results of every thread are gathered by a collector that also handles synchronisation issues with semaphores etc. and calculates the sum of the results of all the different threads. The multithreaded code looks something like this:
public static void main(String[] args) {
    int numberOfProcesses = Integer.parseInt(args[0]);
    ...
    Collector collector = new Collector(numberOfProcesses);
    while (iterator.hasNext()) {
        Object x = iterator.next();
        new OverwrittenThread(x, collector, otherParameters).start();
    }
    if (collector.isReady())
        System.out.println(collector.getResult());
}
My first idea for converting this to MPI was the basic way (I guess): just split up the loop and give every iteration of the loop to another processor, like this (with mpiJava):
public static void main(String[] args) {
    ...
    Object[] foo = new Object[number];
    int i = 0;
    while (iterator.hasNext())
        foo[i++] = iterator.next();
    ...
    int myRank = MPI.COMM_WORLD.Rank();
    int numberOfProcesses = MPI.COMM_WORLD.Size();
    // stride by the number of processes, not by the rank
    // (i += myRank would never advance on rank 0)
    for (int j = myRank; j < numberOfElementsFromIterator; j += numberOfProcesses) {
        // Perform code from OverwrittenThread on foo[j]
    }
    MPI.COMM_WORLD.Reduce(..., MPI.SUM, ...);
}
The problems
This is, so far, the only way that I, as an MPI newbie, could make things work. It is only an idea, because I have no clue how to tackle implementation problems like the conversion of BigIntegers to MPI datatypes, etc. (but I would get that far, I guess).
The real problem, though, is that this approach leaves the distribution of work very unbalanced, because it doesn't take into account how much work a certain iteration takes. This can really cause trouble, as some iterations can be finished in less than a second while others might need several minutes.
My question
Is there a way to get an approach similar to the multithreaded version in an MPI implementation? At first I thought it would just be a lot of non-blocking point-to-point communication, but I don't see a way to make it work that way. I also considered using the scatter functionality, but I have too much trouble understanding how to use it correctly.
Could anybody help me to clear this out, please?
(I do understand basic C etc)
Thanks in advance
The first thing you need to ask yourself when converting a multi-threaded program to a distributed program is:
What am I trying to accomplish by distributing the data across multiple cores/nodes/etc.?
One of the most common issues people face when getting started with MPI is thinking that they can take a program that works well in a small, shared-memory environment (i.e. multi-threading on a single node) and throw more CPUs at it to make it faster.
Sometimes that is true, but often it's not. The most important thing to remember about MPI is that for the most part (unless you're getting into RMA, which is another advanced topic altogether), each MPI process has its own separate memory, distinct from all other MPI processes. This is very different from a multi-threaded environment, where all threads typically share memory. This means you add a new problem on top of the other complexities of parallel programming: now you have to consider how to make sure the data you need to process is in the right place at the right time.
One common way to do this is to ensure that all of the data is already available to all of the other processes outside of MPI, for instance, through a shared filesystem. Then the processes can just figure out what work they should be doing, and get started with their data. Another way is for a single process, often rank 0, to send the important data to the appropriate ranks. There are obviously other ways that you've already discovered to optimize this process. MPI_SCATTER is a great example.
Just remember that it's not necessarily true that MPI is faster than multi-threading, which is faster than single-threading. In fact, sometimes it can be the opposite. The cost of moving your data around via MPI calls can be quite high. Make sure that it's what you actually want to do before trying to rewrite all of your code with MPI.
People don't use MPI only to speed up their code by taking advantage of more processors (though sometimes that's the goal). Sometimes it's because the problem their application is trying to solve is too big to fit in the memory of a single node.
All that being said, if your problem really does map well to MPI, you can do what you want to do. Your application appears to be similar to a master/worker kind of job, which is relatively simple to deal with. Just have your master send non-blocking messages to your workers with their work, and post a non-blocking MPI_ANY_SOURCE receive so it can be notified when the work is done. Whenever it gets a message from a worker, it sends out more work to be done.
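The load-balancing idea behind that master/worker scheme — each worker pulls the next task as soon as it finishes the previous one, so expensive iterations don't pile up on one rank — can be sketched without any MPI at all. The following is a plain-Java sketch with threads standing in for MPI ranks and a shared queue standing in for the master's dispatch loop; it is not mpiJava code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of dynamic master/worker scheduling. Workers (stand-ins for MPI
// ranks) pull tasks on demand, so slow tasks don't unbalance the load.
public class MasterWorkerDemo {
    public static void main(String[] args) throws Exception {
        BlockingQueue<Integer> work = new LinkedBlockingQueue<>();
        for (int i = 1; i <= 100; i++) work.add(i);   // the "iterations"

        AtomicLong sum = new AtomicLong();            // plays the Reduce/SUM role
        int workers = 4;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                Integer task;
                while ((task = work.poll()) != null) {
                    sum.addAndGet(task);              // stand-in for real work
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(sum.get());                // 5050
    }
}
```

In real MPI the queue would be replaced by the master posting MPI_ANY_SOURCE receives and answering each "done" message with the next task, but the scheduling logic is the same.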

How to implement atomic request counter

I'm confronted with the following problem:
I've implemented a crawler, and I would like to know how many requests have been done during the last second, and what amount of data has been downloaded during the last second.
Currently, I've implemented it using locks. My version uses a queue, and two counters (count and sum).
When a task is done, I just increase my counters, and I add an event (with the current date) to the queue
When wanting to get the value of my counters, I check if some stuff in the queue is more than 1second old. If so, I dequeue it and decrease my counters properly. Then, I return the wanted result.
This version works well, but I would like, for training purposes, to reimplement it using atomic operations instead of locks. Nevertheless, I have to admit that I'm stuck on the "cleaning" operation (the dequeuing of old values).
So, is this a good approach to implementing this?
Which other approach could I use ?
Thanks !
This version works well, but I would like, for training purposes, to reimplement it using atomic operations instead of locks.
If you need to make multiple changes to the data when the roll period happens, you will need to lock otherwise you will have problems. Any time you have multiple "atomic operations" you need to have a lock to protect against race conditions. For example, in your case, what if something else was added to the queue while you were doing your roll?
Which other approach could I use ?
I'm not 100% sure why you need to queue up the information. If you are only counting the number of requests and the total size of the data downloaded, then you should be able to use a single AtomicReference<CountSum>. The CountSum class would store your two values. Then, when someone needs to increment it, they would do something like:
CountSum old;
CountSum newVal = new CountSum();
do {
    old = countSumRef.get();
    newVal.setCount(old.getCount() + 1);
    newVal.setSum(old.getSum() + requestDataSize);
    // we need to loop here if someone changed the value behind our back
} while (!countSumRef.compareAndSet(old, newVal));
This ensures that your count and your sum are always in sync. If you used two AtomicLong variables, you'd have to make two atomic requests and would need the lock again.
When you want to reset the values, you'd do the same thing.
CountSum newVal = new CountSum(0, 0);
CountSum old;
do {
    old = countSumRef.get();
    // we need to loop here if someone changed the value behind our back
} while (!countSumRef.compareAndSet(old, newVal));
// now you can display the old value and be sure you got everything
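For a concrete, self-contained version of this idea, here is a sketch using an immutable CountSum (the class name comes from the answer; the rest of the code is illustrative) and AtomicReference.updateAndGet, which performs the same compare-and-set retry loop internally:

```java
import java.util.concurrent.atomic.AtomicReference;

// Runnable sketch of the CountSum idea: an immutable value class behind one
// AtomicReference, so count and sum always change together atomically.
public class RequestCounter {
    static final class CountSum {
        final long count, sum;
        CountSum(long count, long sum) { this.count = count; this.sum = sum; }
    }

    private final AtomicReference<CountSum> ref =
            new AtomicReference<>(new CountSum(0, 0));

    void record(long requestDataSize) {
        // updateAndGet retries the CAS internally if another thread raced us
        ref.updateAndGet(old ->
                new CountSum(old.count + 1, old.sum + requestDataSize));
    }

    CountSum reset() {
        return ref.getAndSet(new CountSum(0, 0));  // returns the old snapshot
    }

    public static void main(String[] args) throws InterruptedException {
        RequestCounter counter = new RequestCounter();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) counter.record(10);
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();
        CountSum snapshot = counter.reset();
        System.out.println(snapshot.count + " " + snapshot.sum);  // 4000 40000
    }
}
```

Making CountSum immutable also removes a subtle hazard of the mutable version: once a CountSum has been installed in the reference, no thread can modify it behind a reader's back.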

Thread safety when only one thread is writing

I know that if two threads are writing to the same place I need to make sure they do it in a safe way, but what if one thread does all the writing while another just reads?
In my case I'm using a thread in a small game for the first time, to keep the updating apart from the rendering. The class that does all the rendering never writes to anything it reads, so I am no longer sure whether I need to guard every read and write of everything they share.
I will take the right steps to make sure the renderer does not try to read anything that no longer exists, but when calling things like the player's and entities' getters, should I be treating them in the same way? Or would making values like the x, y coordinates and booleans like "alive" volatile do the trick?
My understanding has become very murky on this and could do with some enlightening.
Edit: The shared data will be anything that needs to be drawn and moved, stored in lists of objects — for example the player and other entities.
With the given information it is not possible to specify an exact solution, but it is clear that you need some way to synchronize between the threads. The issue is that as long as the write operations are not atomic, you could be reading data at the moment it is being updated. This means you could, for instance, get an old y-coordinate with a new x-coordinate.
Basically, you only don't need to worry about synchronization if both threads only read the information or, even better, if all the data structures are immutable (so neither thread can modify the objects). The best way to proceed is to think first about which operations need to be atomic, and then create a solution that makes those operations atomic.
Don't forget: get it working, get it right, get it optimized (in that order).
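One concrete way to make the x/y read atomic — a sketch, with Player and Position as illustrative names — is to have the update thread publish an immutable snapshot through a single volatile field, so the render thread can never observe a half-updated pair:

```java
// Sketch: the update thread publishes an immutable Position through one
// volatile field, so the render thread never sees a torn x/y pair.
public class Player {
    static final class Position {
        final float x, y;
        Position(float x, float y) { this.x = x; this.y = y; }
    }

    private volatile Position position = new Position(0, 0);

    // called only by the update thread
    void move(float dx, float dy) {
        Position p = position;
        position = new Position(p.x + dx, p.y + dy);  // single atomic publish
    }

    // called by the render thread: both coordinates come from the same update
    Position snapshot() { return position; }

    public static void main(String[] args) {
        Player player = new Player();
        player.move(3, 4);
        Position p = player.snapshot();
        System.out.println(p.x + "," + p.y);  // 3.0,4.0
    }
}
```

A volatile float x and a separate volatile float y would not give you this guarantee: each field would be safely published on its own, but the renderer could still read a new x together with an old y.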
You could have problems in this case if the lists' sizes are variable and you don't synchronize access to them. Consider this:
The read-only thread reads mySharedList's size and sees that it is 15; at that moment its CPU time runs out and the read-write thread is given the CPU.
The read-write thread deletes an element from the list; now its size is 14.
The read-only thread is again granted CPU time. It tries to read the last element using the (now obsolete) size it read before being interrupted, and you get an exception.

Multithread Solver for N-Puzzle Problem

As a homework assignment in my current CS course, we've been instructed to write a program that implements an A* algorithm for the n-puzzle problem. To solve it, you must take in an initial nxn board configuration from StdIn. The catch is that some of the boards may not be solvable. Thankfully for us, if you create a "twin" board by swapping any two non-zero squares and attempt to solve that, either the original or the twin must be solvable. Therefore, in order to implement the algorithm, we are effectively trying to solve two boards at the same time: the original and the twin.
Doing this in a single thread was quite easy, and that's what the actual assignment is. Looking at this problem, it seems like a perfect place to utilize parallelism. I was thinking that from the main thread I would spawn two concurrent threads, each trying to solve its own board. Is this possible without too much crazy code in Java? For that matter, on a multicore chip, would this run significantly faster than the non-multithreaded version? I am trying to read through the Java documentation for threads, but it's a little dense for somebody who has never tried this before, and I find I learn much more quickly by writing and looking at examples than by reading more documentation.
Could somebody please give me some sample code that shows the type of structures, classes, important statements, etc. that would be necessary to do this? So far I'm thinking that I want to implement a private class that implements Runnable, and have the main thread send an interrupt to whichever thread does not finish first, to figure out which board is solvable, plus the number of moves and the sequence of boards to get there.
EDIT:
TL;DR THIS IS NOT PART OF THE GRADED ASSIGNMENT. The assignment was to do a single threaded implementation. For my own enrichment and SOLELY my own enrichment I want to try and make my implementation multithreaded.
Since you don't want the implementation itself threaded (which is arguably a lot more complex — the transposition table is the bottleneck for parallel A* implementations, though in practice parallel IDA* algorithms are easier to implement anyhow and have the usual advantages), the problem is actually quite simple.
Just pack your implementation in a Runnable class and use a thread. For simplicity you can use a global volatile boolean variable that is initialized to false and set to true as soon as one thread has found the solution.
You then just check the flag at appropriate points in your code and return if the other thread has already found a solution. You could also use interrupts, but keeping it simple can't hurt (and in the end it's actually quite similar anyhow; you'd just check the variable in a slightly fancier way).
Trivial example:
public class Main implements Runnable {
    private static volatile boolean finished = false;

    public static void main(String[] args) {
        new Thread(new Main()).start();
        new Main().run();
    }

    @Override
    public void run() {
        while (!finished) {
            // do stuff
            if (solutionFound) { // placeholder for your solver's own check
                finished = true;
                // save result
            }
        }
    }
}
Forget about solving two boards with one of them being unsolvable. I don't see how that is even useful, but ignoring that, parallelization should not stop at two processors. If the system has more of them, the algorithm should use them all. BTW, checking whether a board is solvable is rather easy; check out the Solvability section of the Wikipedia article.
To parallelize things, your implementation of A* should have some kind of priority queue that sorts items by heuristic value. Expanding a node in the search tree involves removing the node from the top of the queue and inserting several nodes back into the queue, keeping it sorted. When things are organized like this, adding more threads to insert and remove items is rather simple: just make access to the queue synchronized.
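One ready-made way to get that synchronized queue in Java is java.util.concurrent.PriorityBlockingQueue. The sketch below is illustrative only — the Node class and the fixed heuristic values are made up, and a real parallel A* would also need a shared closed set:

```java
import java.util.concurrent.PriorityBlockingQueue;

// Sketch: a shared A* open list. Worker threads can safely take the best
// node and push its successors. Node and its priorities are illustrative.
public class SharedOpenList {
    static final class Node implements Comparable<Node> {
        final int heuristic;
        Node(int heuristic) { this.heuristic = heuristic; }
        @Override public int compareTo(Node o) {
            return Integer.compare(heuristic, o.heuristic);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PriorityBlockingQueue<Node> open = new PriorityBlockingQueue<>();
        open.add(new Node(5));
        open.add(new Node(2));
        open.add(new Node(9));

        Node best = open.take();                  // thread-safe removal of the best node
        open.add(new Node(best.heuristic + 1));   // expansion pushes successors back
        System.out.println(best.heuristic + " " + open.size());  // 2 3
    }
}
```

Each worker thread would loop on take(), expand the node it gets, and add() the successors; the queue's internal lock keeps the ordering consistent across threads.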