Scenario
There is a factory that receives orders. Once received, every order item goes through a multi-step production process. Every step is done by a separate machine and every machine can only handle one item at a time. So the order comes in, the first item goes to machine1, when it's done it goes to machine2 and the next item to machine1, etc.
Technical part
Every machine is implemented as a thread and has a queue holding all the items that need this step of the process next. The run method of the machine checks in an endless while loop whether there is anything in the queue. If there is, it handles that item, sleeps for a certain amount of time, and then pushes the item to the queue of the next machine.
Questions
In my head, this all sounds pretty simple. But I constantly run into null-pointer errors and other weird exceptions. I honestly don't fully understand what's wrong, but I suspect it's a problem with multi-threading vs. sleep. At this point I have two questions:
What happens if I call a method of a sleeping thread (machine)? (Example: I call machine.addItemToQueue() while that machine is working on another item).
Follows Q1: Let's say I really can't call that method while the machine 'sleeps'. How else would I handle this? Should I take the queue outside the machine? Is this an async problem?
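A minimal sketch of the setup described above, under stated assumptions. A method call always runs on the caller's thread, so machine.addItemToQueue() executes even while the machine's own thread is sleeping; what matters is that the queue is safe for concurrent access. The original code is not shown, so the cause of the exceptions is an assumption here: an unsynchronized queue shared between threads would be consistent with them. The class names Machine and FactoryDemo, the fields, and the 5 ms work time are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical machine: one thread, one thread-safe inbox.
class Machine extends Thread {
    final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> outbox; // next machine's inbox, or the final output
    private final long workMillis;

    Machine(String name, long workMillis, BlockingQueue<String> outbox) {
        super(name);
        this.workMillis = workMillis;
        this.outbox = outbox;
        setDaemon(true);
    }

    // Runs on the caller's thread, so it works even while this machine sleeps.
    void addItemToQueue(String item) {
        inbox.add(item);
    }

    @Override
    public void run() {
        try {
            while (true) {
                String item = inbox.take(); // blocks until an item is available
                Thread.sleep(workMillis);   // simulate the production step
                outbox.add(item);           // hand over to the next step
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat interruption as "shut down"
        }
    }
}

class FactoryDemo {
    // Sends all items through machine1 and machine2 and returns them in completion order.
    static List<String> runOrder(List<String> items) {
        BlockingQueue<String> done = new LinkedBlockingQueue<>();
        Machine machine2 = new Machine("machine2", 5, done);
        Machine machine1 = new Machine("machine1", 5, machine2.inbox);
        machine1.start();
        machine2.start();
        for (String item : items) {
            machine1.addItemToQueue(item);
        }
        List<String> result = new ArrayList<>();
        try {
            for (int i = 0; i < items.size(); i++) {
                String item = done.poll(5, TimeUnit.SECONDS);
                if (item == null) break; // give up rather than wait forever
                result.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            machine1.interrupt();
            machine2.interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runOrder(Arrays.asList("a", "b", "c"))); // [a, b, c]
    }
}
```

Because take() blocks while the queue is empty, the busy "is there anything in the queue" loop is not needed in this sketch.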
Related
I am trying to figure out the basics of Vert.x. I was going through the standard documentation, where I came across a section on the context object. It says that it lets you run your code later by providing a method called runOnContext. What I don't understand is in which case I would choose to invoke a (non-blocking) block of code later. If the code is non-blocking, it will take the same amount of time whether you execute it now or later.
Can anyone please tell me in which case context.runOnContext will be helpful?
Most often it will be helpful if you call it from another thread. It will schedule a task for execution by the event loop bound to this context.
If you're already on the event loop, you may also use it when you read items from a queue: instead of processing all items as a single event, you would schedule one event per item in the queue. That would give other kinds of events (network, filesystem) a chance to be processed earlier.
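The per-item scheduling idea can be illustrated without Vert.x. The sketch below is not the Vert.x API: it swaps in a plain single-threaded ExecutorService as a stand-in for the event loop, and loop.execute(...) plays the role that context.runOnContext(...) would play in Vert.x. The class name PerItemEvents and the "other event" task are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PerItemEvents {
    // Handles one item, then reschedules itself instead of looping over the queue.
    static void processNext(ExecutorService loop, Queue<String> items,
                            List<String> log, CountDownLatch done) {
        String item = items.poll();
        if (item == null) {
            done.countDown(); // queue is empty: nothing left to schedule
            return;
        }
        log.add("processed " + item);
        loop.execute(() -> processNext(loop, items, log, done));
    }

    // Returns the order in which events ran on the single loop thread.
    static List<String> run() {
        ExecutorService loop = Executors.newSingleThreadExecutor(); // stand-in event loop
        Queue<String> items = new ArrayDeque<>(Arrays.asList("a", "b", "c"));
        List<String> log = new ArrayList<>(); // only written by the loop thread
        CountDownLatch done = new CountDownLatch(1);

        loop.execute(() -> {
            processNext(loop, items, log, done);
            // Stand-in for an unrelated event (network, filesystem) arriving meanwhile.
            loop.execute(() -> log.add("other event"));
        });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            loop.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [processed a, processed b, other event, processed c]
        System.out.println(run());
    }
}
```

Had the first task looped over all three items, "other event" would only have run after "processed c".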
I was going through the Javadoc and source code for the drainTo method in the BlockingQueue interface and its LinkedBlockingQueue implementation. My understanding of this method after looking at the source (JDK 7) is that the calling thread passes in a Collection and then acquires takeLock, which blocks other consumers. Then, up to the maxElements limit, items are removed from the queue's nodes and added to the collection.
What I can appreciate is that it saves threads from acquiring the lock again and again, but, pardon my limited knowledge, I cannot see the need for it in real-world code. Could someone please share some real-world examples where this drainTo behavior matters?
Well, I used it in real life code and it looked quite natural to me: a background database thread creates items and puts them into a queue in a loop until either the end of data is reached or a stop signal is detected. On the first item a UI updater is launched using EventQueue.invokeLater. Due to the asynchronous nature and some overhead in this invokeLater mechanism, it will take some time until the UI updater comes to the point where it queries the queue and most likely more than one item may be available.
So it will use drainTo to get all items that are available at this specific point and update a ListDataModel which produces a single event for the added interval. The next update can be triggered using another invokeLater or using a Timer. So drainTo has the semantics of “gimme all items arrived since the last call” here.
On the other hand, polling the queue for single items could lead to a situation where producer and consumer block each other for a short time, and every time the consumer asks for a new item, another item is available because the consumer has been blocked just long enough for the producer to create and put a new item. So you have to implement your own time limit to avoid blocking the UI thread too long in this case. Using drainTo once and releasing the event handling thread afterwards is much easier.
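The "all items arrived since the last call" semantics can be sketched as follows. The class name DrainToDemo and the helper takeBatch are hypothetical; the UI update itself is left out, and only the batching step is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DrainToDemo {
    // Removes everything currently queued in one call.
    // Never blocks: returns an empty list if nothing has arrived.
    static <T> List<T> takeBatch(BlockingQueue<T> queue) {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Stand-in for the background producer having queued three items.
        queue.add("row1");
        queue.add("row2");
        queue.add("row3");

        System.out.println(takeBatch(queue)); // [row1, row2, row3]
        System.out.println(takeBatch(queue)); // []
    }
}
```

In the scenario above, the UI updater would call something like takeBatch once per invokeLater or Timer tick and add the whole batch to the model in a single step.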
I have many threads performing different operations on objects, and when roughly 50% of the tasks are finished I want to serialize everything (maybe because I want to shut down my machine).
When I come back, I want to resume from the point where I left off.
How can we achieve this?
This is like saving the state of the objects in a game while playing.
Normally we save the state of an object and retrieve it later. But here we are storing the progress (count/state) of its process.
For example:
I have a thread that is creating a salary Excel sheet for 50 thousand employees.
Another thread is creating appraisal letters for the same 50 thousand employees.
Another thread is writing a "Happy New Year" e-mail to the 50 thousand employees.
So imagine multiple operations.
Now I want to shut down when about 50% of the tasks are finished, say when the salary Excel sheet has been written for 25-30 thousand employees, appraisal letters are done for 25-30 thousand, and so on.
When I come back the next day, I want to continue the process from where I left off.
This is like a resume feature.
I'm not sure if this might help, but you can achieve this if the threads communicate via in-memory queues.
To serialize the whole application, what you need to do is to disable the consumption of the queues, and when all the threads are idle you'll reach a "safe-point" where you can serialize the whole state. You'll need to keep track of all the threads you spawn, to know whether they are idle.
You might be able to do this with another technology (maybe a java agent?) that freezes the JVM and allows you to dump the whole state, but I don't know if this exists.
Well, it's not much different from saving the state of an object.
Just maintain separate queues for the different kinds of inputs. On every launch (first launch or relaunch), check those queues; if they are not empty, resume your 'stopped process' by starting a new process with the remaining data.
For example, say an app is sending messages and you quit the app with 10 messages remaining. Have a global queue which the app's senderMethod checks on every launch. In this case it will find 10 messages in the pending queue, so it will continue sending the remaining messages.
Edit:
Basically, for all resumable processes, say pr1, pr2, ..., prN, maintain queues of inputs, say q1, q2, ..., qN. Each queue should remove processed elements so that it contains only pending inputs. As soon as you suspend the system, store these queues, and restore them on relaunch. Have a common routine, say resumeOperation, which calls all resumable processes (pr1, pr2, ..., prN). It will trigger the execution of the methods with non-empty queues, which in turn replicates the resuming behavior.
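A rough sketch of storing and restoring one such pending queue, assuming the inputs are plain strings and that Java serialization to a local file is acceptable. The class name PendingQueueStore and the file name are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;

class PendingQueueStore {
    // Writes the inputs that are still pending to a file (call this on suspend).
    static void save(Deque<String> pending, File file) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(new ArrayDeque<>(pending)); // ArrayDeque is Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the pending inputs back (call this on launch); empty if nothing was saved.
    @SuppressWarnings("unchecked")
    static Deque<String> restore(File file) {
        if (!file.exists()) {
            return new ArrayDeque<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (Deque<String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File file = new File(System.getProperty("java.io.tmpdir"), "pending-messages.ser");
        Deque<String> pending = new ArrayDeque<>(Arrays.asList("msg1", "msg2", "msg3"));
        pending.poll();       // msg1 has been processed, so it leaves the queue
        save(pending, file);  // suspend

        Deque<String> resumed = restore(file); // relaunch
        System.out.println(new ArrayList<>(resumed)); // [msg2, msg3]
        file.delete();
    }
}
```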
Java provides the java.io.Serializable interface to indicate serialization support in classes.
You don't provide much information about the task, so it's difficult to give an answer.
One way to think about a task is in terms of a general algorithm which can be split into several steps. Each of these steps is in turn a task itself, so you should see a pattern here.
By cutting each algorithm down into small pieces until you cannot divide further, you get a pretty good idea of where your task can be interrupted and recovered later.
The result of a task can be:
a success: the task returns a value of the expected type
a failure: something went wrong during the computation
an interrupted computation: the work wasn't finished, but it may be resumed later, and the return value is the state of the task
(Note that the latter case could be considered a subcase of a failure; it's up to you to organize your protocol as you see fit.)
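The three outcomes listed above could be modelled along these lines. This is only a sketch under assumptions: the task is taken to be a loop over numbered items, its resumable state is just the index of the next item, and the names TaskOutcome, ResumableTask and stopRequested are hypothetical.

```java
import java.util.function.IntPredicate;

// One of the three possible results of running a task.
final class TaskOutcome {
    enum Kind { SUCCESS, FAILURE, INTERRUPTED }

    final Kind kind;
    final int nextIndex;          // first item not yet processed (the resumable state)
    final RuntimeException error; // set only for FAILURE

    private TaskOutcome(Kind kind, int nextIndex, RuntimeException error) {
        this.kind = kind;
        this.nextIndex = nextIndex;
        this.error = error;
    }

    static TaskOutcome success(int total) {
        return new TaskOutcome(Kind.SUCCESS, total, null);
    }

    static TaskOutcome failure(int failedAt, RuntimeException error) {
        return new TaskOutcome(Kind.FAILURE, failedAt, error);
    }

    static TaskOutcome interrupted(int resumeAt) {
        return new TaskOutcome(Kind.INTERRUPTED, resumeAt, null);
    }
}

class ResumableTask {
    // Processes items [from, total) and checks for a stop request before each item.
    static TaskOutcome run(int from, int total, IntPredicate stopRequested) {
        for (int i = from; i < total; i++) {
            if (stopRequested.test(i)) {
                return TaskOutcome.interrupted(i); // the state to persist and resume from
            }
            try {
                processItem(i);
            } catch (RuntimeException e) {
                return TaskOutcome.failure(i, e);
            }
        }
        return TaskOutcome.success(total);
    }

    static void processItem(int index) {
        // Placeholder for the real work on one item.
    }

    public static void main(String[] args) {
        TaskOutcome first = run(0, 10, i -> i == 4);               // stop requested at item 4
        System.out.println(first.kind + " at " + first.nextIndex); // INTERRUPTED at 4
        TaskOutcome second = run(first.nextIndex, 10, i -> false); // resume later
        System.out.println(second.kind);                           // SUCCESS
    }
}
```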
Depending on how you generate the interruption event (will it be a message passed from the main thread to the worker threads? Will it be an exception?), that event will have to bubble within the task tree, and trigger each task to evaluate if its work can be resumed or not, and then provide a serialized version of itself to the larger task containing it.
I don't think serialization is the correct approach to this problem. What you want is persistent queues, which you remove an item from when you've processed it. Every time you start the program you just start processing the queue from the beginning. There are numerous ways of implementing a persistent queue, but a database comes to mind given the scale of your operations.
I am currently working on a project in Java where I have to make an agent to interact with a server.
Every 50 ms, the server will receive the last thing I wrote to System.out and send me a new set of lines as a 'state' through System.in, which I analyze before sending my next message to System.out.
Also, if the server receives multiple outputs from me, it only regards the most recent one.
..
As for my question:
My program originally constructed a tree and then analyzed each leaf node to see which would be optimal, and then waited around for the next input, but I can recursively do a deeper tree search that would make my output 'better' (and again and again to keep returning a better result).
Using this and the fact that if the server receives multiple outputs, it only takes the most recent one, I could do each level, print my result and start the next level. But here comes my problem...
I can't be stuck in some complex algorithm while I am supposed to be receiving the next input, as I will then miss it. So I was wondering if there is a way to cancel anything else I am doing when I receive something via System.in and then go back to the beginning of the function and start the search again with the new set of input (and rinse and repeat...).
I hope this all makes sense,
Thank ye all
You absolutely need multiple threads (or multiple processes) here.
I assume that you've solved the problem of receiving input from System.in, as well as the problem of your algorithm. The next step is to package each as a Runnable, and hand each a reference to a queuing object. This will scaffold a Producer-Consumer relationship.
Whenever your listening Runnable (the Producer) gets a message, it needs to put it on your queue. After every unit of work, your algorithm (the Consumer) should look into the queue for new items. If it finds something, it should integrate it as normal. If not, it continues with its work.
Both the Producer and the Consumer need to be started in their own threads and allowed to run concurrently.
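A bare-bones sketch of that arrangement, with several assumptions: each state is treated as a single line of input, one "unit of work" is one search depth, the search itself is a placeholder string, and the names AnytimeSearch, searchUntilNewInput and the EOF sentinel are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AnytimeSearch {
    static final String EOF = "<eof>"; // hypothetical sentinel marking the end of input

    // Consumer: deepens the search one level at a time and checks the queue after each level.
    // Returns the new state if one arrived, or null if maxDepth was reached first.
    static String searchUntilNewInput(String state, BlockingQueue<String> inbox, int maxDepth) {
        for (int depth = 1; depth <= maxDepth; depth++) {
            // Placeholder for the real tree search at this depth.
            System.out.println("best move for " + state + " at depth " + depth);
            String newer = inbox.poll(); // non-blocking check for fresh input
            if (newer != null) {
                return newer;            // abandon this search and restart with the new state
            }
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();

        // Producer: reads System.in on its own thread and queues every line.
        Thread reader = new Thread(() -> {
            try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
                String line;
                while ((line = in.readLine()) != null) {
                    inbox.add(line);
                }
            } catch (IOException e) {
                // Treat a read error like end of input.
            } finally {
                inbox.add(EOF);
            }
        });
        reader.setDaemon(true);
        reader.start();

        String state = inbox.take();
        while (!EOF.equals(state)) {
            String next = searchUntilNewInput(state, inbox, 5);
            state = (next != null) ? next : inbox.take(); // wait if the search finished early
        }
    }
}
```

The finer the unit of work between queue checks, the sooner the search reacts to new input.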
I have an application that checks a resource on the internet for new mails. If there are new mails, it does some processing on them. This means that, depending on the number of mails, processing might take anywhere from a few seconds to hours.
Now the object/program that does the processing is already a singleton. So right now I already took care of there really only being 1 instance that's handling the checking and processing.
However I only have it running once now and I'd like to have it continuously running, checking for new mails more or less every 10 minutes or so to handle them in a timely manner.
I understand I can take care of this with Timer/TimerTask, or even better, I found a resource here: http://www.ibm.com/developerworks/java/library/j-schedule/index.html that uses Scheduler/SchedulerTask. But what I am afraid of is that if I set it to run every 10 minutes and a previous session is still processing data, it will put the new task in a stack, waiting to be executed once the previous one is done. For instance, the first run might take 5 hours and then, because it was busy all that time, it would launch 5*6-1=29 runs immediately one after another, checking for mails and/or doing some processing without giving the server a break.
Does anyone know how I can solve this?
P.S. the way I have my application set up right now is I'm using a Java Servlet on my tomcat server that's launched upon server start where it creates a Singleton instance of my main program, then calls some method to do the fetching/processing. And what I want is to repeat that fetching/processing every "x" amount of time (10 minutes or so), making sure that really only 1 instance is doing this and that really after each run 10 minutes or so are given to rest.
Actually, Timer + TimerTask can deal with this pretty cleanly. If you schedule something with Timer.scheduleAtFixedRate(), you will notice that the docs say it will attempt to "make up" late events to maintain the long-term period of execution. However, this can be overcome by using TimerTask.scheduledExecutionTime(). The example in its documentation lets you figure out whether the task is too tardy to run, in which case you can just return instead of doing anything. This will, in effect, "clear the queue" of pending TimerTask executions.
Of note: a Timer uses a single thread to execute its tasks, so it won't run two copies of your task side-by-side.
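The tardiness check could look roughly like this. It follows the pattern shown in the TimerTask.scheduledExecutionTime() documentation; the class name MailCheckTask, the helper tooLate, the placeholder checkAndProcessMail and the 5-second tolerance are assumptions for illustration.

```java
import java.util.Timer;
import java.util.TimerTask;

class MailCheckTask extends TimerTask {
    static final long PERIOD_MS = 10 * 60 * 1000L; // every 10 minutes
    static final long MAX_TARDINESS_MS = 5_000L;   // assumed tolerance; tune as needed

    // True if this execution starts too long after the time it was scheduled for.
    static boolean tooLate(long scheduledMs, long nowMs, long maxTardinessMs) {
        return nowMs - scheduledMs >= maxTardinessMs;
    }

    @Override
    public void run() {
        if (tooLate(scheduledExecutionTime(), System.currentTimeMillis(), MAX_TARDINESS_MS)) {
            return; // a "make up" run queued behind a long previous run: skip it
        }
        checkAndProcessMail();
    }

    void checkAndProcessMail() {
        // Placeholder for the real fetching and processing.
    }

    // Starts the periodic check; the caller keeps the Timer so it can cancel() it later.
    static Timer start() {
        Timer timer = new Timer("mail-check");
        timer.scheduleAtFixedRate(new MailCheckTask(), 0, PERIOD_MS);
        return timer;
    }
}
```

With this in place, a 5-hour run is followed by a burst of executions that each return immediately, and the next real check happens at the next regular 10-minute slot.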
As a side note, you don't have to process all 10k emails in the queue in a single run. I would suggest processing for a fixed amount of time, using TimerTask.scheduledExecutionTime() to figure out how long you have, and then returning. That keeps your process more limber, cleans up the stack between runs, and, if you are doing aggregates, ensures that you don't have to rebuild too much data if, for example, the server is restarted in the middle of the task. But this recommendation is based on generalities, since I don't know what you're doing in the task :)