How to serialize a multi-threaded program in Java

I have many threads performing different operations on objects. When roughly 50% of the work is finished, I want to serialize everything (for example, because I want to shut down my machine).
When I come back, I want to resume from the point where I left off.
How can we achieve this?
This is like saving the state of a game while playing.
Normally we save the state of an object and restore it later. But here we need to store the progress/state of the process itself.
For example:
One thread is creating a salary Excel sheet for 50 thousand employees.
Another thread is creating appraisal letters for the same 50 thousand employees.
A third thread is writing "Happy New Year" e-mails to the 50 thousand employees.
So imagine multiple operations running at once.
Now I want to shut down when about 50% of each task is finished: say salary sheets have been written for 25-30 thousand employees, appraisal letters are done for 25-30 thousand, and so on.
When I come back the next day, I want to continue the process from where I left off.
This is like a resume feature.

I'm not sure if this might help, but you can achieve this if the threads communicate via in-memory queues.
To serialize the whole application, disable consumption from the queues; once all the threads are idle you have reached a "safe point" where you can serialize the whole state. You'll need to keep track of all the threads you spawn, to know whether they are idle.
You might be able to do this with another technology (maybe a java agent?) that freezes the JVM and allows you to dump the whole state, but I don't know if this exists.

Well, it's not much different from saving the state of an object.
Just maintain separate queues for the different kinds of input, and on every launch (first launch or relaunch) check those queues. If a queue is not empty, resume the "stopped process" by starting a new process with the remaining data.
For example, say an app is sending messages and you quit it with 10 messages remaining. Have a global queue that the app's senderMethod checks on every launch. In this case it finds 10 messages in the pending queue, so it continues sending the remaining messages.
Edit:
Basically, for all resumable processes, say pr1, pr2, ..., prN, maintain queues of inputs, say q1, q2, ..., qN. Each queue should drop elements once they are processed, so that it contains only pending inputs. As soon as you suspend the system, store these queues, and restore them on relaunch. Have a common routine, say resumeOperation, which calls all resumable processes (pr1, pr2, ..., prN). It triggers the execution of the methods whose queues are non-empty, which in turn replicates the resuming behavior.
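A minimal sketch of the store/restore idea for a single queue, using plain Java serialization. The class name PendingQueueStore, the String item type, and the file-based storage are assumptions made for illustration; only resumeOperation is a name taken from the answer above.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper (not from the original answer): keeps the pending
// inputs of one resumable process and stores them across restarts.
class PendingQueueStore {

    // Called when suspending: write the still-unprocessed items to disk.
    static void save(Deque<String> pending, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new ArrayDeque<>(pending)); // ArrayDeque is Serializable
        }
    }

    // Called on every launch: read the pending items back.
    @SuppressWarnings("unchecked")
    static Deque<String> load(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new ArrayDeque<>(); // first launch: nothing pending yet
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Deque<String>) in.readObject();
        }
    }

    // Processes at most maxItems; an item leaves the queue only after its work is done.
    static int resumeOperation(Deque<String> pending, int maxItems) {
        int done = 0;
        while (done < maxItems && !pending.isEmpty()) {
            String item = pending.peek();
            // ... the real work for 'item' (send a message, write a row) goes here ...
            pending.remove();
            done++;
        }
        return done;
    }
}
```

With several processes you would keep one such file per queue (q1 ... qN) and call resumeOperation for each queue that is non-empty after loading.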

Java provides the java.io.Serializable interface to indicate serialization support in classes.
You don't provide much information about the task, so it's difficult to give an answer.
One way to think about a task is in terms of a general algorithm which can be split into several steps. Each of these steps is in turn a task itself, so you should see a pattern here.
By cutting each algorithm down into small pieces until you cannot divide any further, you get a pretty good idea of where your task can be interrupted and recovered later.
The result of a task can be:
a success: the task returns a value of the expected type
a failure: something went wrong during the computation
an interrupted computation: the work wasn't finished, but it may be resumed later, and the return value is the state of the task
(Note that the latter case could be considered a subcase of a failure; it's up to you to organize your protocol as you see fit.)
Depending on how you generate the interruption event (will it be a message passed from the main thread to the worker threads? Will it be an exception?), that event will have to bubble within the task tree, and trigger each task to evaluate if its work can be resumed or not, and then provide a serialized version of itself to the larger task containing it.

I don't think serialization is the correct approach to this problem. What you want is persistent queues, which you remove an item from when you've processed it. Every time you start the program you just start processing the queue from the beginning. There are numerous ways of implementing a persistent queue, but a database comes to mind given the scale of your operations.
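As a rough sketch of the "persist what is left, resume on startup" idea: the example below records only the index of the next unprocessed item in a small checkpoint file. The file stands in for the database mentioned above, and the class and method names (CheckpointedJob, run, loadNextIndex, saveNextIndex) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: a job over items 0..total-1 that remembers how far it got.
class CheckpointedJob {
    private final Path checkpoint;

    CheckpointedJob(Path checkpoint) {
        this.checkpoint = checkpoint;
    }

    // Index of the first item that still has to be processed (0 on first launch).
    int loadNextIndex() throws IOException {
        if (!Files.exists(checkpoint)) {
            return 0;
        }
        String text = new String(Files.readAllBytes(checkpoint), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0 : Integer.parseInt(text);
    }

    // Write to a temporary file first, then swap it in, so a crash mid-write
    // does not leave a half-written checkpoint behind.
    void saveNextIndex(int next) throws IOException {
        Path tmp = checkpoint.resolveSibling(checkpoint.getFileName() + ".tmp");
        Files.write(tmp, Integer.toString(next).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, checkpoint, StandardCopyOption.REPLACE_EXISTING);
    }

    // Processes up to 'budget' items starting at the checkpoint; returns the next index.
    int run(int totalItems, int budget) throws IOException {
        int next = loadNextIndex();
        int stop = Math.min(totalItems, next + budget);
        while (next < stop) {
            // ... e.g. write the salary sheet row for employee number 'next' ...
            next++;
            saveNextIndex(next); // record progress only after the item is done
        }
        return next;
    }
}
```

If the program dies between finishing an item and saving the checkpoint, that one item is processed again on restart, so the per-item work should be safe to repeat. Saving after every item is also the slowest option; checkpointing every N items trades repeated work for speed.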

Related

Prevent a slow job from taking over a thread pool

I have a system where currently every job has its own Runnable class and I predefine a fixed number of threads for every job.
My understanding is that this is bad practice, because:
You have to tailor the number of threads to the machine running the process.
Each thread can only take one type of job.
Would you agree on that? (current solution is wrong)
So, I'd like to use something like Java's ThreadPool instead. I was confronted with an argument claiming that by doing so, slow jobs will take over most of the thread pool, leaving no room for the other jobs. Whereas, with the current solution, a fixed number of threads is assigned to the slow worker and it won't hurt the others.
(Notice that you can't know a priori if a job will be "slow")
How can a system be adaptive in the number of threads it uses, but at the same time not be held hostage by the slowest job?
You could try measuring the time it takes for the job to complete (with a hand-made Timer class of sorts). Then normalize this value by dividing it by the maximum time any given thread has taken. Finally, multiply this number by a fixed factor that depends on how many threads you want running per job per second. The result is the number of threads this job should request. You can adjust that accordingly.
Edit: You can set minimum and maximum values that regulate how many threads a job is entitled to. You could alternatively request threads from a very spacious job when another thread enters the system.
Hope that helps!
It's more of a business problem. Let's say I am a telecom operator. I bar my subscribers from making outgoing calls when they don't clear their dues. When they make payment I clear a flag and in a second the subscriber can make calls. But a lot of other activities go on in my system like usage processing, billing, bill formatting etc.
Now let's assume I have a system wide common pool of threads and I started the billing of 50K subscribers. All my threads are now processing the relatively long running billing jobs and a huge queue is building up.
A poor customer now makes a payment and wants to make an urgent call. But I have no thread left in my pool to clear the flag. The customer has to wait for an hour before he can make the call. That's an SLA breach.
What I should have done is create separate thread pools. If the call unblocking jobs are short and not very frequent, I can create a separate pool for them with a core size of maybe 5. For billing jobs I'd rather create a pool with core size 25 and max size 30.
So my system limits won't be exceeded anyway, because I know that even in the worst case the billing pool won't have more than 30 threads.
This will also make it easy to debug. If I have a different thread name pattern for each pool and my system has some issues, I can easily take a thread dump and understand whether the billing or the payment stuff is the culprit.
So, I think the existing design is based on some business use case which you need to thoroughly understand before proposing a solution.
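A small sketch of the separate-pools idea with named threads. The pool sizes are scaled down, fixed-size pools are used for simplicity (a ThreadPoolExecutor only grows from its core size to its max size once its work queue is full, so core 25 / max 30 needs a bounded queue to have any effect), and the names SeparatePools, named and demo are invented for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one pool per kind of job, each with its own thread names.
class SeparatePools {

    // Names threads "<prefix>-1", "<prefix>-2", ... so a thread dump shows the owning pool.
    static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
    }

    // Saturates the billing pool, then shows that an urgent job still runs at once.
    // Returns the name of the thread that ran the urgent job.
    static String demo() throws Exception {
        ExecutorService billing = Executors.newFixedThreadPool(2, named("billing"));
        ExecutorService unblock = Executors.newFixedThreadPool(1, named("unblock"));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 10; i++) {
                // Stand-in for a long-running billing job: blocks until released.
                billing.submit(() -> { release.await(); return null; });
            }
            Callable<String> clearFlag = () -> Thread.currentThread().getName();
            Future<String> urgent = unblock.submit(clearFlag);
            return urgent.get(5, TimeUnit.SECONDS); // not stuck behind billing
        } finally {
            release.countDown();
            billing.shutdown();
            unblock.shutdown();
        }
    }
}
```

The urgent job never waits behind the billing backlog because it has its own queue and its own threads; the cost is that idle billing threads cannot help with unblocking and vice versa.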

Trigger CPU cache write back manually in java: possible? necessary?

I am writing a video game in my spare time and have a question about data consistency when introducing multi-threading.
At the moment my game is single threaded and has a simple game loop as it is taught in many tutorials:
while game window is not closed
{
    poll user input
    react to user input
    update game state
    render game objects
    flip buffers
}
I now want to add a new feature to my game where the player can automate certain tasks that are long and tedious, like walking long distances (fast travel). I may choose to simply "teleport" the player character to their destination but I would prefer not to. Instead, the game will be sped up and the player character will actually walk as if the player was doing it manually. The benefit of this is that the game world will interact with the player character as usual and any special events that might happen will still happen and immediately stop the fast travel.
To implement this feature I was thinking about something like this:
Start a new thread (worker thread) and have that thread update the game state continuously until the player character reaches its destination
Have the main thread no longer update the game state and render the game's objects as usual and instead display the travel progress in a more simplistic manner
Use a synchronized message queue to have the main thread and the worker thread communicate
When the fast travel is finished or canceled (by player interaction or other reasons) have the worker thread die and resume the standard game loop with the main thread
In pseudo code it may look like this:
[main thread]
while game window is not closed
{
    poll user input
    if user wants to cancel fast travel
    {
        write to message queue player input "cancel"
    }
    poll message queue about fast travel status
    if fast travel finished or canceled
    {
        resume regular game loop
    } else {
        render travel status
        flip buffers
    }
}
[worker thread]
while (travel ongoing)
{
    poll message queue
    if user wants to cancel fast travel
    {
        write to message queue fast travel status "canceled"
        return
    }
    update game state
    if fast travel is interrupted by internal game event
    {
        write to message queue fast travel status "canceled"
        return
    }
    write to message queue fast travel status "ongoing"
}
if travel was finished
{
    write to message queue fast travel status "finished"
}
The message queue will be some kind of two-channeled synchronized data structure. Maybe two ArrayDeques with a Lock for each. I am fairly certain this will not be too much trouble.
What I am more concerned about is caching problems with the game data:
1.a) Could it be that the worker thread, after being started, may see old game data because the main thread may run on a different core which has cached some of its results?
1.b) If the above is true: Would I need to declare every single field in the game data as volatile to protect myself with absolute guarantee against inconsistent data?
2) Am I right to assume that performance would take a non trivial hit if all fields are volatile?
3) Since I only need to pass the data between threads at few and well controlled points in time, would it be possible to force all caches to write back to main memory instead of using volatile fields?
4) Is there a better approach? Is my concept perhaps ill conceived?
Thanks for any help and sorry for the big chunk of text. I thought it would be easier to answer the question if you know the intended use.
Since I only need to pass the data between threads at few and well controlled points in time, would it be possible to force all caches to write back to main memory instead of using volatile fields?
No. That's not how any of this works. Let me give you very short answers to explain why you are thinking about this the wrong way:
1.a) Could it be that the worker thread, after being started, may see old game data because the main thread may run on a different core which has cached some of its results?
Sure. Or it might for some other reason. Memory visibility is not guaranteed, so you can't rely on it unless you use something guaranteed to provide memory visibility.
1.b) If the above is true: Would I need to declare every single field in the game data as volatile to protect myself with absolute guarantee against inconsistent data?
No. Any method of assuring memory visibility will work. You don't have to do it any particular way.
2) Am I right to assume that performance would take a non trivial hit if all fields are volatile?
Probably. This would probably be the worst possible way to do it.
3) Since I only need to pass the data between threads at few and well controlled points in time, would it be possible to force all caches to write back to main memory instead of using volatile fields?
No. Since there is no "write cache back to memory" operation that assures memory visibility. Your platform may not even have such caches and the issue might be something else entirely. You're writing Java code, you don't have to think about how your particular CPU works, what cores or caches it has, or anything like that. That's one of the big advantages of using a language with semantics that are guaranteed and don't talk about cores, caches, or anything like this.
4) Is there a better approach? Is my concept perhaps ill conceived?
Absolutely. You are writing Java code. Use the various Java synchronization classes and functions and rely on them to provide the semantics they're documented to provide. Don't even think about cores, caches, flushing to memory, or anything like that. Those are hardware details that, as a Java programmer, you don't even have to ever think about.
Any Java documentation you see that talks about cores, caches, or flushes to memory is not actually talking about real cores, caches, or flushes to memory. It's just giving you some ways to think about hypothetical hardware so you can wrap your brain around why memory visibility and total ordering don't always work perfectly just by themselves. Your real CPU or platform may have completely different issues that bear no resemblance to this hypothetical hardware. (And real-world CPUs and systems have cache coherency guaranteed by hardware and their visibility/ordering issues in fact are completely different!)
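To make "use the Java synchronization classes" concrete, here is one possible sketch: the game state has only plain fields, and the hand-off to and from the worker goes through an ExecutorService and a Future. The java.util.concurrent package documentation states that actions before submitting a task happen-before the task runs, and that actions in the task happen-before Future.get() returns in another thread. GameState, fastTravel and the field names are assumptions for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: hand the game state to a worker and get it back safely.
class GameStateHandoff {

    // Plain, non-volatile fields: visibility comes from the hand-off points.
    static class GameState {
        int playerX;
        long tick;
    }

    // The worker advances the state; the caller must not touch 'state' until get() returns.
    static GameState fastTravel(GameState state, int steps) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            // Everything the caller wrote to 'state' before submit() is visible to the task.
            Future<GameState> done = worker.submit(() -> {
                for (int i = 0; i < steps; i++) {
                    state.playerX++; // stand-in for "update game state"
                    state.tick++;
                }
                return state;
            });
            // Everything the task wrote is visible once get() has returned.
            return done.get();
        } finally {
            worker.shutdown();
        }
    }
}
```

In the real game the main thread would keep rendering and poll done.isDone() (or a status queue) instead of blocking in get(). The point is only that no field needs to be volatile as long as exactly one thread owns the state between hand-offs.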

Increase/Decrease the number of Worker Role instances in Azure

I can increase the number of Worker Role (WR) instances directly from Java using the ServiceManagementRest class in the Azure4Java package. See the tutorial Azure Management through Java.
My question is, when I decrease the number of WR instances, can I decide which WR instances shut down? For cloud elasticity, I would like to stop the instances that are in an IDLE state and not the instances that are in an EXECUTING state.
Regards,
Fabrizio
You can't choose which instance(s) to shut down; you simply change the instance count, and the fabric controller takes care of shutting instances down. One reason is due to fault domains and SLA: if you had, say, 4 instances in 2 fault domains, and shut down two instances in fault domain 0, you'd now have 2 instances in fault domain 1. So now you have two instances in the same rack, perhaps, and that rack goes offline. Now you have zero instances running for a period of time.
Dealing with instance shutdown is a common scenario, and the typical pattern for working around this is by taking advantage of queues to buffer your workload, then have worker role instances consume work items from these queues. If you shut down an instance prior to its work being finished, the item eventually reappears on the queue and another instance can do the work.
This pattern requires idempotency, which is sometimes a challenge. With a recent update to Windows Azure queues, you can now modify queue messages, which makes this a bit easier - you can add information to your queue message as you complete various stages of your work item processing. Then, if your instance is shut down before work is completed, the next worker to pick it up can resume from a point other than "start."
One more detail: you should be able to handle the Stopping event, and tell the "instance being stopped" to stop reading from the queue (maybe set a flag). Then, override OnStop(), and wait for in-process operations to complete before returning. If the still-in-process operations will take more than 5 minutes, you might have to get creative...
You cannot control which instance will shut down; however, it is almost always (as far as I've seen) the instance with the highest number suffix. I.e. if you have IN_0, IN_1 and IN_2 and you remove an instance, it will most likely be IN_2 that shuts down. Maybe you can use this trend to your advantage?
What is probably least obstructive is if you wait for a time of day when the worker roles are less busy to reduce instances?
I think it's wise to never assume what will happen next is based on an instance id. I tend to spread roles over services (scaleunits), starting and stopping when required - pattern used to control 4000 nodes.

Multiple SingleThreadExecutors for a given application...a good idea?

This question is about the fallouts of using SingleThreadExecutor (JDK 1.6). Related questions have been asked and answered in this forum before, but I believe the situation I am facing, is a bit different.
Various components of the application (let's call the components C1, C2, C3 etc.) generate (outbound) messages, mostly in response to messages (inbound) that they receive from other components. These outbound messages are kept in queues which are usually ArrayBlockingQueue instances - fairly standard practice perhaps. However, the outbound messages must be processed in the order they are added. I guess use of a SingleThreadExecutor is the obvious answer here. We end up having a 1:1 situation - one SingleThreadExecutor for one queue (which is dedicated to messages emanating from one component).
Now, the number of components (C1,C2,C3...) is unknown at a given moment. They will come into existence depending on the need of the users (and will be eventually disposed of too). We are talking about 200-300 such components at the peak load. Following the 1:1 design principle stated above, we are going to arrange for 200 SingleThreadExecutors. This is the source of my query here.
I am uncomfortable with the thought of having to create so many SingleThreadExecutors. I would rather try and use a pool of SingleThreadExecutors, if that makes sense and is plausible (any ready-made, seen-before classes/patterns?). I have read many posts on recommended use of SingleThreadExecutor here, but what about a pool of the same?
What do learned women and men here think? I would like to be directed, corrected or simply, admonished :-).
If your requirement is that the messages be processed in the order that they're posted, then you want one and only one SingleThreadExecutor. If you have multiple executors, then messages will be processed out-of-order across the set of executors.
If messages need only be processed in the order that they're received for a single producer, then it makes sense to have one executor per producer. If you try pooling executors, then you're going to have to put a lot of work into ensuring affinity between producer and executor.
Since you indicate that your producers will have defined lifetimes, one thing that you have to ensure is that you properly shut down your executors when they're done.
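One way to sketch the "one executor per producer, shut down at end of life" arrangement. The class PerProducerExecutors and its post/dispose methods are hypothetical names, and producers are identified by a String id for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: each producer gets its own single-thread executor,
// so its messages are handled strictly in the order they were posted.
class PerProducerExecutors {
    private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

    // Creates the producer's executor on first use, then queues the message on it.
    void post(String producerId, Runnable message) {
        executors.computeIfAbsent(producerId, id -> Executors.newSingleThreadExecutor())
                 .execute(message);
    }

    // Call when a producer is disposed of: already-queued messages still run,
    // then the thread is released. Returns false if they did not finish in time.
    boolean dispose(String producerId, long timeoutMillis) throws InterruptedException {
        ExecutorService executor = executors.remove(producerId);
        if (executor == null) {
            return true; // nothing was ever posted for this producer
        }
        executor.shutdown();
        return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Forgetting the dispose step is the main risk of this design: every executor that is never shut down keeps its thread alive.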
Messaging and batch jobs is something that has been solved time and time again. I suggest not attempting to solve it again. Instead, look into Quartz, which maintains thread pools, persisting tasks in a database etc. Or, maybe even better look into JMS/ActiveMQ. But, at the very least look into Quartz, if you have not already. Oh, and Spring makes working with Quartz so much easier...
I don't see any problem there. Essentially you have independent queues and each has to be drained sequentially, so one thread for each is a natural design. Anything else you can come up with is essentially the same. As an example, when Java NIO first came out, frameworks were written trying to take advantage of it and get away from the thread-per-request model. In the end some authors admitted that to provide a good programming model they were just reimplementing threading all over again.
It's impossible to say whether 300 or even 3000 threads will cause any issues without knowing more about your application. I strongly recommend that you profile your application before adding more complexity.
The first thing that you should check is that the number of concurrently running threads is not much higher than the number of cores available to run those threads. The more active threads you have, the more time is wasted managing those threads (context switches are expensive) and the less work gets done.
The easiest way to limit the number of running threads is to use a semaphore. Acquire the semaphore before starting work and release it after the work is done.
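A minimal sketch of the semaphore approach with java.util.concurrent.Semaphore. The class BoundedConcurrency, the runAll method and the 5 ms sleep standing in for real work are all assumptions; the method returns the highest number of tasks it observed running at once so the limit can be checked.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: many threads exist, but only 'permits' of them work at a time.
class BoundedConcurrency {

    // Runs 'tasks' jobs on 'threads' threads; returns the peak number running at once.
    static int runAll(int threads, int tasks, int permits) throws InterruptedException {
        Semaphore gate = new Semaphore(permits);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    gate.acquire(); // wait for a free slot before starting work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(5); // stand-in for one unit of real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                    gate.release(); // always hand the slot back
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }
}
```

Threads that are waiting in acquire() are blocked rather than runnable, so they cost memory but do not compete for CPU time.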
Unfortunately, limiting the number of running threads may not be enough. While it may help, the overhead may still be too great if the time spent per context switch is a major part of the total cost of one unit of work. In this scenario, the most efficient way is often to have a fixed number of queues. You get a queue from a global pool of queues when a component initializes, using an algorithm such as round-robin for queue selection.
If you are in one of those unfortunate cases where most obvious solutions do not work, I would start with something relatively simple: one thread pool, one concurrent queue, lock, list of queues and temporary queue for each thread in pool.
Posting work to queue is simple: add payload and identity of producer.
Processing is relatively straightforward as well. First you get the next item from the queue. Then you acquire the lock. While you hold the lock, you check whether any other thread is running a task for the same producer. If not, you register your thread by adding a temporary queue to the list of queues. Otherwise you add the task to the existing temporary queue. Finally you release the lock. Now you either run the task or poll for the next one and start over, depending on whether the current thread was registered to run tasks. After running the task, you take the lock again and see if there is more work to be done in the temporary queue. If not, remove the queue from the list. Otherwise get the next task. Finally you release the lock. Again, you choose whether to run the task or to start over.
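The sketch below is not a line-by-line implementation of the procedure just described. It swaps in a simpler, well-known variant with the same effect: the serial-executor pattern shown in the Javadoc of java.util.concurrent.Executor, with one small per-producer queue that feeds a shared pool one task at a time. ProducerRouter and its post method are invented names.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Runs its tasks one at a time, in submission order, on a shared executor.
class SerialExecutor implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor shared;
    private Runnable active;

    SerialExecutor(Executor shared) {
        this.shared = shared;
    }

    @Override
    public synchronized void execute(Runnable task) {
        // Wrap the task so that finishing it hands the next queued task to the pool.
        tasks.add(() -> {
            try {
                task.run();
            } finally {
                scheduleNext();
            }
        });
        if (active == null) {
            scheduleNext(); // nothing in flight for this producer: start now
        }
    }

    private synchronized void scheduleNext() {
        active = tasks.poll();
        if (active != null) {
            shared.execute(active);
        }
    }
}

// Hypothetical router: one SerialExecutor "lane" per producer, all sharing one pool.
class ProducerRouter {
    private final Executor shared;
    private final Map<String, SerialExecutor> lanes = new ConcurrentHashMap<>();

    ProducerRouter(Executor shared) {
        this.shared = shared;
    }

    void post(String producerId, Runnable task) {
        lanes.computeIfAbsent(producerId, id -> new SerialExecutor(shared)).execute(task);
    }
}
```

With 300 producers and a pool of, say, 8 threads, each producer's messages stay in order while at most 8 tasks run at once. Because tasks reach the shared pool lazily, the pool must not be shut down until all lanes have drained.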

Patterns/Principles for thread-safe queues and "master/worker" program in Java

I have a problem which I believe is the classic master/worker pattern, and I'm seeking advice on implementation. Here's what I currently am thinking about the problem:
There's a global "queue" of some sort, and it is a central place where "the work to be done" is kept. Presumably this queue will be managed by a kind of "master" object. Threads will be spawned to go find work to do, and when they find work to do, they'll tell the master thing (whatever that is) to "add this to the queue of work to be done".
The master, perhaps on an interval, will spawn other threads that actually perform the work to be done. Once a thread completes its work, I'd like it to notify the master that the work is finished. Then, the master can remove this work from the queue.
I've done a fair amount of thread programming in Java in the past, but it's all been prior to JDK 1.5 and consequently I am not familiar with the appropriate new APIs for handling this case. I understand that JDK7 will have fork-join, and that that might be a solution for me, but I am not able to use an early-access product in this project.
The problems, as I see them, are:
1) how to have the "threads doing the work" communicate back to the master telling them that their work is complete and that the master can now remove the work from the queue
2) how to efficiently have the master guarantee that work is only ever scheduled once. For example, let's say this queue has a million items, and it wants to tell a worker to "go do these 100 things". What's the most efficient way of guaranteeing that when it schedules work to the next worker, it gets "the next 100 things" and not "the 100 things I've already scheduled"?
3) choosing an appropriate data structure for the queue. My thinking here is that the "threads finding work to do" could potentially find the same work to do more than once, and they'd send a message to the master saying "here's work", and the master would realize that the work has already been scheduled and consequently should ignore the message. I want to ensure that I choose the right data structure such that this computation is as cheap as possible.
Traditionally, I would have done this in a database, in sort of a finite-state-machine manner, working "tasks" through from start to complete. However, in this problem, I don't want to use a database because of the high volume and volatility of the queue. In addition, I'd like to keep this as light-weight as possible. I don't want to use any app server if that can be avoided.
It is quite likely that this problem I'm describing is a common problem with a well-known name and accepted set of solutions, but I, with my lowly non-CS degree, do not know what this is called (i.e. please be gentle).
Thanks for any and all pointers.
As far as I understand your requirements, you need ExecutorService. ExecutorService has a submit(Callable task) method whose return value is a Future. Future is a blocking way to communicate back from worker to master. You could easily expand this mechanism to work in an asynchronous manner. And yes, ExecutorService implementations such as ThreadPoolExecutor also maintain a work queue, so in most cases you don't need to bother about scheduling. The java.util.concurrent package already has efficient implementations of thread-safe queues (ConcurrentLinkedQueue, which is non-blocking, and LinkedBlockingQueue, which is blocking).
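A small sketch of the submit/Future round trip. The class MasterWorker, the squareAll method and the squaring "work" are placeholders chosen for this example; only ExecutorService.submit(Callable) and Future.get() come from the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the master submits work and learns about completion via Futures.
class MasterWorker {

    static List<Integer> squareAll(List<Integer> work, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Integer item : work) {
                Callable<Integer> job = () -> item * item; // stand-in for real work
                futures.add(pool.submit(job)); // the pool's queue hands out each job once
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // blocks until that job has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Each submitted job is taken from the executor's internal queue by exactly one worker thread, which covers the "only ever scheduled once" concern without extra bookkeeping in the master.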
Check out java.util.concurrent in the Java library.
Depending on your application it might be as simple as cobbling together some blocking queue and a ThreadPoolExecutor.
Also, the book Java Concurrency in Practice by Brian Goetz might be helpful.
First, why do you want to hold the items after a worker started doing them? Normally, you would have a queue of work and a worker takes items out of this queue. This would also solve the "how can I prevent workers from getting the same item"-problem.
To your questions:
1) how to have the "threads doing the work" communicate back to the master telling them that their work is complete and that the master can now remove the work from the queue
The master could listen to the workers using the listener/observer pattern
2) how to efficiently have the master guarantee that work is only ever scheduled once. For example, let's say this queue has a million items, and it wants to tell a worker to "go do these 100 things". What's the most efficient way of guaranteeing that when it schedules work to the next worker, it gets "the next 100 things" and not "the 100 things I've already scheduled"?
See above. I would let the workers pull the items out of the queue.
3) choosing an appropriate data structure for the queue. My thinking here is that the "threads finding work to do" could potentially find the same work to do more than once, and they'd send a message to the master saying "here's work", and the master would realize that the work has already been scheduled and consequently should ignore the message. I want to ensure that I choose the right data structure such that this computation is as cheap as possible.
Implementations of a blocking queue (java.util.concurrent.BlockingQueue) have been available since Java 5.
Don't forget Jini and Javaspaces. What you're describing sounds very like the classic producer/consumer pattern that space-based architectures excel at.
A producer will write the jobs into the space. One or more consumers will take out jobs (under a transaction), work on them in parallel, and then write the results back. Since it's under a transaction, if a problem occurs the job is made available again for another consumer.
You can scale this trivially by adding more consumers. This works especially well when the consumers are separate VMs and you scale across the network.
If you are open to the idea of Spring, then check out their Spring Integration project. It gives you all the queue/thread-pool boilerplate out of the box and leaves you to focus on the business logic. Configuration is kept to a minimum using annotations.
btw, the Goetz is very good.
This doesn't sound like a master-worker problem, but a specialized client above a thread pool. Given that you have a lot of scavenging threads and not a lot of processing units, it may be worthwhile simply doing a scavenging pass and then a computing pass. If you store the work items in a Set, its uniqueness constraint will remove duplicates. The second pass can submit all of the work to an ExecutorService to perform the processing in parallel.
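The two passes could look roughly like this. TwoPass, process and the use of String.length() as the "computation" are assumptions for illustration; the found list represents whatever the scavenging threads collected, duplicates included.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: de-duplicate found work in a Set, then compute in parallel.
class TwoPass {

    static Map<String, Integer> process(List<String> found, int workers) throws Exception {
        // Pass 1 result: the Set silently drops work items that were found more than once.
        Set<String> unique = new LinkedHashSet<>(found);

        // Pass 2: submit every unique item to the pool exactly once.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            Map<String, Future<Integer>> pending = new LinkedHashMap<>();
            for (String item : unique) {
                Callable<Integer> job = () -> item.length(); // stand-in for real work
                pending.put(item, pool.submit(job));
            }
            Map<String, Integer> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Integer>> entry : pending.entrySet()) {
                results.put(entry.getKey(), entry.getValue().get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

For the Set to remove duplicates, real work items need equals() and hashCode() that identify "the same work"; String already provides both.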
A master-worker model generally assumes that the data provider has all of the work and supplies it to the master to manage. The master controls the work execution and deals with distributed computation, time-outs, failures, retries, etc. A fork-join abstraction is a recursive rather than iterative data provider. A map-reduce abstraction is a multi-step master-worker that is useful in certain scenarios.
A good example of master-worker is a trivially parallel problem, such as finding prime numbers. Another is a data load where each entry is independent (validate, transform, stage). The need to process a known working set, handle failures, etc. is what makes a master-worker model different from a thread pool. This is why a master must be in control and push the work units out, whereas a thread pool allows workers to pull work from a shared queue.
