Scheduling tasks, making sure task is ever being executed - java

I have an application that checks a resource on the internet for new mails. If there are new mails, it does some processing on them. This means that, depending on the number of mails, processing might take anywhere from a few seconds to hours.
Now the object/program that does the processing is already a singleton. So right now I already took care of there really only being 1 instance that's handling the checking and processing.
However I only have it running once now and I'd like to have it continuously running, checking for new mails more or less every 10 minutes or so to handle them in a timely manner.
I understand I can take care of this with Timer/TimerTask, or even better, I found a resource here: http://www.ibm.com/developerworks/java/library/j-schedule/index.html that uses Scheduler/SchedulerTask. But what I am afraid of is this: if I set it to run every 10 minutes and a previous session is still processing data, the new task will be put in a queue, waiting to be executed once the previous one is done. So, for instance, the first run could take 5 hours and then, because it was busy all that time, it would launch 5*6-1=29 runs immediately after each other, checking for mails and/or doing some processing without giving the server a break.
Does anyone know how I can solve this?
P.S. the way I have my application set up right now is I'm using a Java Servlet on my tomcat server that's launched upon server start where it creates a Singleton instance of my main program, then calls some method to do the fetching/processing. And what I want is to repeat that fetching/processing every "x" amount of time (10 minutes or so), making sure that really only 1 instance is doing this and that really after each run 10 minutes or so are given to rest.

Actually, Timer + TimerTask can deal with this pretty cleanly. If you schedule something with Timer.scheduleAtFixedRate(), you will notice that the docs say it will attempt to "make up" late events to maintain the long-term period of execution. However, this can be overcome by using TimerTask.scheduledExecutionTime(). The example in its documentation lets you figure out whether the task is too tardy to run, in which case you can just return instead of doing anything. This will, in effect, "clear the queue" of late executions.
Of note: a Timer uses a single thread to execute its tasks, so it won't spawn two copies of your task side-by-side.
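A minimal sketch of the tardiness check, following the pattern shown in the TimerTask.scheduledExecutionTime() documentation. The class name MailCheckTask, the one-minute MAX_TARDINESS_MS threshold, and the checkAndProcessMail() placeholder are assumptions made for illustration, not part of the original application.

```java
import java.util.Timer;
import java.util.TimerTask;

class MailCheckTask extends TimerTask {
    // Hypothetical threshold: a run that starts more than a minute late is skipped.
    static final long MAX_TARDINESS_MS = 60_000L;

    /** Returns true when a run scheduled at scheduledMillis is too late at nowMillis. */
    static boolean isTooLate(long scheduledMillis, long nowMillis, long maxTardinessMillis) {
        return nowMillis - scheduledMillis >= maxTardinessMillis;
    }

    @Override
    public void run() {
        if (isTooLate(scheduledExecutionTime(), System.currentTimeMillis(), MAX_TARDINESS_MS)) {
            return; // a "catch-up" run queued behind a long execution: do nothing
        }
        checkAndProcessMail();
    }

    /** Placeholder for the real fetching/processing logic (assumption). */
    void checkAndProcessMail() {
    }

    /** Schedules the task every 10 minutes on a single daemon timer thread. */
    static Timer start() {
        Timer timer = new Timer("mail-check", true);
        timer.scheduleAtFixedRate(new MailCheckTask(), 0L, 10 * 60 * 1000L);
        return timer;
    }
}
```

With this check, the runs that piled up behind a long execution return immediately. It does not by itself guarantee a full 10-minute rest after a long run; if that matters, ScheduledExecutorService.scheduleWithFixedDelay() measures its delay from the end of one run to the start of the next.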
As a side note, you don't have to process all 10k emails in the queue in a single run. I would suggest processing for a fixed amount of time, using TimerTask.scheduledExecutionTime() to figure out how long you have, and then returning. That keeps your process more limber, cleans up the stack between runs, and, if you are doing aggregates, ensures that you don't have to rebuild too much data if, for example, the server is restarted in the middle of the task. But this recommendation is based on generalities, since I don't know what you're doing in the task :)
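The time-boxed processing idea could look roughly like this. It is only a sketch: the class BudgetedBatch, the in-memory queue of pending items, and the handler callback are hypothetical stand-ins for whatever the real task does. Inside a TimerTask, the deadline could be computed as scheduledExecutionTime() plus the time budget.

```java
import java.util.Queue;
import java.util.function.Consumer;

class BudgetedBatch {
    /**
     * Handles pending items until the queue is empty or the deadline
     * (epoch milliseconds) has passed, and returns how many were handled.
     * Items left in the queue wait for the next scheduled run.
     */
    static <T> int processUntil(Queue<T> pending, long deadlineMillis, Consumer<T> handler) {
        int handled = 0;
        while (!pending.isEmpty() && System.currentTimeMillis() < deadlineMillis) {
            handler.accept(pending.poll());
            handled++;
        }
        return handled;
    }
}
```

The deadline is only checked between items, so a single slow item can still overrun the budget.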

Related

How long can I leave a threadpool without tasks before it terminates in android?

I am new to using ThreadPools to perform multithreading in my Android app. In the past, I have created new Threads to perform network requests, database queries and intensive algorithms. According to this post, new Thread(task).start() VS ThreadPoolExecutor.submit(task) in Android, using a thread pool is better.
As I was redesigning my program to use a ThreadPoolExecutor, the question that I have been struggling to answer is "What happens to my thread pool if no tasks are sent to it for a while?" For example, say that I am building an app that pulls information from a server and displays it to a user. The user can also update the displayed information by pulling an updated set of data from the server. The user can update the information at any time they please. It could be as long as several hours between updates.
This could be accomplished using new Threads; however, each time the end user refreshes, new memory must be allocated for the thread. What I am hoping to do is use a thread pool so that I can run the network calls without having to allocate memory every time. However, that is built on two assumptions. The first is that I can leave a thread pool alone for an indeterminate amount of time and still be able to use it. The second is that this approach to using a thread model is in line with good practice. Assuming the second is true, how long can I leave a thread pool without tasks to perform before it shuts down or terminates of its own accord, if it does that at all?
I believe it just stays available for the life of the application unless you explicitly call 'shutdown()' on the thread pool.
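To illustrate: as long as the application keeps a reference to it, a ThreadPoolExecutor keeps accepting tasks until shutdown() is called. What can happen while it is idle is that worker threads are released after the keep-alive time (by default only threads above the core size, or all of them if allowCoreThreadTimeOut(true) is set); the pool simply creates threads again when new work arrives. The sketch below assumes arbitrary values (2 threads, 30-second keep-alive), and the names IdlePoolDemo, newRefreshPool and runOn are made up for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class IdlePoolDemo {
    /** Builds a pool whose idle threads are released after 30 seconds. */
    static ThreadPoolExecutor newRefreshPool() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        // Let even the core threads die when idle, so a long quiet period
        // holds no threads; they are recreated on the next submit().
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    /** Submits a task and waits for its result, wrapping checked exceptions. */
    static <T> T runOn(ExecutorService pool, Callable<T> task) {
        try {
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

The pool only stops accepting work after an explicit shutdown() or shutdownNow().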

Selenium, Java, Grid How to create an unconditional pause?

I need to test time tracking on a page. I need to be able to pause and do nothing but run the clock and log time. Most of what I've seen from Googling is just Thread.sleep(300) but there are times that I actually need a test to wait five minutes or more. I don't want to exceed 5 minutes of timeout when I start the node simply because, if there's a client failure, I want the node to release the browser so another test can start. One thing I have tried is waiting a specific amount of time for an element that I know isn't there so that it periodically sends instructions to the node so it doesn't release the browser, but for some reason, it only works when I'm debugging. Otherwise it waits forever. I could make a method that uses .sleep() and periodically sends some trivial instruction to the node like getting the current URL to keep it from dropping the browser. What is the best way to pause for 5+ minutes without increasing the timeout parameter for the node?
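The sleep-and-ping idea mentioned in the question could be sketched as below. This is an assumption-laden sketch rather than a verified fix: KeepAlivePause and its parameters are made-up names, and whether a trivial command actually stops the node from releasing the browser depends on the grid configuration. The keepAlive callback is where a cheap WebDriver call such as driver.getCurrentUrl() would go.

```java
import java.util.concurrent.TimeUnit;

class KeepAlivePause {
    /**
     * Waits for roughly totalMillis, calling keepAlive every intervalMillis.
     * Returns the number of keep-alive calls made.
     */
    static int pause(long totalMillis, long intervalMillis, Runnable keepAlive) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("intervalMillis must be positive");
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(totalMillis);
        int pings = 0;
        while (true) {
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                break;
            }
            try {
                Thread.sleep(Math.min(intervalMillis, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
            keepAlive.run(); // e.g. driver::getCurrentUrl in a Selenium test (assumption)
            pings++;
        }
        return pings;
    }
}
```

A five-minute pause with a ping every 30 seconds would then be KeepAlivePause.pause(5 * 60_000L, 30_000L, driver::getCurrentUrl), where driver is the test's WebDriver instance.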

Loop a java application in ticks

I'm making a Java server application. The application would consume a lot of resources if it just ran whenever possible.
As far as I know if I added a sleep method, it would run like this:
Do task (Might take 10ms to do. Can also take longer or less)
Sleep 50ms
Do task (Might take 10ms to do. Can also take longer or less)
Sleep 50ms
So how can I make it run every 50ms (20 ticks per second)?
Thanks
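You can use a ScheduledExecutorService:
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
ScheduledExecutorService service = Executors.newScheduledThreadPool(10);
// Run the task every 50 ms, starting immediately (initial delay 0).
service.scheduleAtFixedRate(() -> {
    System.out.println("whatever");
}, 0, 50, TimeUnit.MILLISECONDS);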
The scheduleAtFixedRate() method will schedule the given task for execution at a fixed rate, regardless of the time the task took. If one execution takes longer than 50ms, the following executions may start late, but the same task will not run concurrently with itself, however many threads the pool has.
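The Javadoc for scheduleAtFixedRate() states that if an execution takes longer than its period, subsequent executions may start late but will not execute concurrently. The following sketch checks that behaviour by running a deliberately slow task and recording the highest number of simultaneous executions. The class and method names are invented for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedRateOverlapCheck {
    /**
     * Schedules a task that sleeps workMs every periodMs, waits for the given
     * number of runs, and returns the maximum number of runs seen at once.
     */
    static int maxConcurrentRuns(int runs, long periodMs, long workMs) {
        ScheduledExecutorService service = Executors.newScheduledThreadPool(4);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        try {
            service.scheduleAtFixedRate(() -> {
                int now = active.incrementAndGet();
                maxActive.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(workMs); // simulate a tick that overruns its period
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                    done.countDown();
                }
            }, 0, periodMs, TimeUnit.MILLISECONDS);
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            service.shutdownNow();
        }
        return maxActive.get();
    }
}
```

If ticks regularly overrun the period, the schedule drifts behind rather than overlapping, so the tick work itself has to fit in 50ms to hold a steady 20 ticks per second.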
Without knowing what your application does (you could've included it in your question), you could use a scheduler (Quartz, java.util.Timer). Which task are you trying to perform every 50ms?
Edit:
While the "game loop" is all well and good in games, servers rarely have them. Receiving data is a continuous action, and the state should change accordingly. This is a larger design issue in the server. With proper design you don't need to create artificial pauses.
For example a simple design would be having threads waiting to receive input from the clients, and when a message is received, it's processed, and a message is sent to all clients to inform of the changes. No busy waiting, nothing will happen unless a message arrives from a client.
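A rough sketch of that event-driven shape, with a BlockingQueue standing in for the network connection: the worker thread blocks in take() and consumes no CPU until a message arrives. Everything here (EventDrivenWorker, the STOP sentinel, the "update: " broadcast text) is a hypothetical illustration, not a real server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class EventDrivenWorker {
    // Hypothetical sentinel that tells the worker to stop.
    static final String STOP = "__stop__";

    /** Feeds the messages to a blocking worker and returns what it would broadcast. */
    static List<String> handleAll(List<String> clientMessages) {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        List<String> broadcasts = new ArrayList<>();
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = inbox.take(); // blocks until a message arrives
                    if (STOP.equals(message)) {
                        return;
                    }
                    // Update state, then inform all clients of the change.
                    broadcasts.add("update: " + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        inbox.addAll(clientMessages);
        inbox.add(STOP);
        try {
            worker.join(); // join() makes the worker's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return broadcasts;
    }
}
```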

how to serialize multi-threaded program

I have many threads performing different operations on objects, and when nearly 50% of the work is finished I want to serialize everything (perhaps because I want to shut down my machine).
When I come back, I want to start from the point where I left off.
How can we achieve this?
This is like saving the state of the objects in a game while playing.
Normally we save the state of an object and retrieve it later. But here we are storing the progress/state of a process.
For example:
I have a thread which is creating salary Excel sheets for 50 thousand employees.
Another thread is creating appraisal letters for the same 50 thousand employees.
Another thread is writing "Happy New Year" e-mails to the 50 thousand employees.
So imagine multiple operations.
Now I want to shut down when about 50% of the tasks are finished, say when salary sheets have been written for 25-30 thousand employees, appraisal letters are done for 25-30 thousand, and so on.
When I come back the next day, I want to restart the process from where I left off.
This is like a resume feature.
I'm not sure if this might help, but you can achieve this if the threads communicate via in-memory queues.
To serialize the whole application, what you need to do is disable consumption from the queues; when all the threads are idle, you'll reach a "safe-point" where you can serialize the whole state. You'll need to keep track of all the threads you spawn in order to know whether they are idle.
You might be able to do this with another technology (maybe a java agent?) that freezes the JVM and allows you to dump the whole state, but I don't know if this exists.
Well, it's not much different from saving the state of an object.
Just maintain separate queues for the different kinds of input, and on every launch (first launch or relaunch) check those queues. If they are not empty, resume your 'stopped process' by starting a new process with the remaining data.
Say, for example, an app is sending messages and you quit the app with 10 messages remaining. Have a global queue which the app's senderMethod checks on every launch. In this case it will find 10 messages in the pending queue, so it will continue sending the remaining messages.
Edit:
Basically, for all resumable processes, say pr1, pr2, ..., prN, maintain queues of inputs, say q1, q2, ..., qN. Each queue should remove processed elements so that it contains only pending inputs. As soon as you suspend the system, store these queues, and on relaunch restore them. Have a common routine, say resumeOperation, which calls all resumable processes (pr1, pr2, ..., prN). It will trigger the execution of the methods with non-empty queues, which in turn replicates the resuming behavior.
Java provides the java.io.Serializable interface to indicate serialization support in classes.
You don't provide much information about the task, so it's difficult to give an answer.
One way to think about a task is in terms of a general algorithm that can be split into several steps. Each of these steps is in turn a task itself, so you should see a pattern here.
By cutting each algorithm down into small pieces until you cannot divide it further, you get a pretty good idea of where your task can be interrupted and recovered later.
The result of a task can be:
a success: the task returns a value of the expected type
a failure: somehow, something didn't turn right while doing computation
an interrupted computation: the work wasn't finished, but it may be resumed later, and the return value is the state of the task
(Note that the latter case could be considered a subcase of a failure; it's up to you to organize your protocol as you see fit.)
Depending on how you generate the interruption event (will it be a message passed from the main thread to the worker threads? Will it be an exception?), that event will have to bubble within the task tree, and trigger each task to evaluate if its work can be resumed or not, and then provide a serialized version of itself to the larger task containing it.
I don't think serialization is the correct approach to this problem. What you want is persistent queues, which you remove an item from when you've processed it. Every time you start the program you just start processing the queue from the beginning. There are numerous ways of implementing a persistent queue, but a database comes to mind given the scale of your operations.
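As a very small stand-in for a persistent queue, the sketch below records the index of the next unprocessed item in a checkpoint file after every item, so a restarted program continues where it stopped. This is an illustration under simplifying assumptions (a fixed, ordered list of work items and a plain text file); the names ResumableJob, loadNextIndex, saveNextIndex and run are invented, and a database table would play the same role at the scale described in the question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Consumer;

class ResumableJob {
    /** Reads the index of the next unprocessed item, or 0 if no checkpoint exists. */
    static int loadNextIndex(Path checkpoint) {
        try {
            if (!Files.exists(checkpoint)) {
                return 0;
            }
            String text = new String(Files.readAllBytes(checkpoint), StandardCharsets.UTF_8).trim();
            return text.isEmpty() ? 0 : Integer.parseInt(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the checkpoint with the index of the next unprocessed item. */
    static void saveNextIndex(Path checkpoint, int next) {
        try {
            Files.write(checkpoint, Integer.toString(next).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Processes up to maxItems items from the checkpoint onward; returns the new next index. */
    static int run(List<String> items, Path checkpoint, int maxItems, Consumer<String> worker) {
        int next = loadNextIndex(checkpoint);
        int end = Math.min(items.size(), next + maxItems);
        for (int i = next; i < end; i++) {
            worker.accept(items.get(i));
            saveNextIndex(checkpoint, i + 1); // record progress after each item
        }
        return end;
    }
}
```

Because progress is saved after the work is done, a crash between the two steps repeats one item on restart, so the worker should tolerate processing an item twice.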

EJB timer performance

I am trying to decide whether to use a Java EE timer in my application. The server I am using is WebLogic 10.3.2.
The need is this: one hour after a call to an async web service from an EJB, if the async callback method has not been called, some actions need to be executed. Whether the callback method has been called, and the date the call was made, are stored in the database.
The two possibilities I see are:
Using a batch process that, every half hour, looks for all the calls that have gone more than one hour without a response and executes the needed actions.
Creating a one-hour timer after every single call to the web service and, in the @Timeout method, checking whether the answer has come; if it has not, executing the required actions.
From a pure programming point of view, the second one looks easier and cleaner, but I am worried about the performance issues I could have if, let's say, there are 100,000 timers created at a single moment.
Any thoughts?
You would be better off having a more specialized process. The real problem is the 100,000 issue. It would depend on how long your actions take.
It's easy to see that each second the EJB timer would fire up about 30 threads to process all of the currently pending jobs, since that's how it works.
Also, timers are persistent, so your EJB-managed timer table will be saving and deleting 30 rows per second (60 operations in total). This assumes 100K transactions/hour, which is roughly 28 per second.
So, that's a lot of work happening very quickly. I can easily see the system simply "falling behind" and never catching up.
A specialized process would be much lighter weight, could perhaps batch the action calls (call 5 actions per thread instead of one per thread), etc. It would be nice if you didn't have to persist the timer events, but it is what it is. You could fairly easily append the timer events to a file for safety and keep them in memory. On system restart, you can reload that file, and then roll the file (every hour create a new file, delete the older file after it's all been consumed, etc.). That would save a lot of DB traffic, but you could lose the transactional nature of the DB.
Anyway, I don't think you want to use the EJB Timer for this; I don't think it's really designed for this amount of traffic. But you can always test it and see. Make sure you test restarting your container to see how well it works with 100K pending timer jobs in its table.
It all depends on what the container uses. For example, JBoss uses the Quartz Scheduler to implement EJB timer functionality. Quartz copes pretty well with around 100,000 timer instances.
@Pau: why do you need to create a timer for every call? Instead, you can have a single timer thread, created at application startup, that runs every half hour (configurable) and looks in your database for all web service calls whose response has not been received and whose request time is more than 1 hour in the past. For the selected records, it can execute the required action in a for loop.
Admittedly, the above design may not be useful if you have time-critical activity to perform.
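The single-timer design could be sketched in plain Java as follows. The in-memory PendingCall list is a stand-in for the database query, all names (OverdueCallScanner, PendingCall, findOverdue, startScanner) are hypothetical, and inside an application server a single container-managed timer would normally replace the executor shown here.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class OverdueCallScanner {
    /** One row of the (hypothetical) table of web service calls. */
    static final class PendingCall {
        final String id;
        final Instant requestedAt;
        final boolean answered;

        PendingCall(String id, Instant requestedAt, boolean answered) {
            this.id = id;
            this.requestedAt = requestedAt;
            this.answered = answered;
        }
    }

    /** Returns the ids of unanswered calls that were requested more than timeout ago. */
    static List<String> findOverdue(List<PendingCall> calls, Instant now, Duration timeout) {
        List<String> overdue = new ArrayList<>();
        for (PendingCall call : calls) {
            boolean expired = Duration.between(call.requestedAt, now).compareTo(timeout) > 0;
            if (!call.answered && expired) {
                overdue.add(call.id);
            }
        }
        return overdue;
    }

    /** Runs the given scan every 30 minutes on one thread, never overlapping itself. */
    static ScheduledExecutorService startScanner(Runnable scan) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(scan, 0, 30, TimeUnit.MINUTES);
        return timer;
    }
}
```

With a half-hour scan and a one-hour timeout, an unanswered call is acted on between 60 and 90 minutes after it was made, which is the time-critical caveat mentioned above.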
If you have the Spring framework in your application, you may also look at its timer services: http://static.springsource.org/spring/docs/1.2.9/reference/scheduling.html
Maybe you could use some of these ideas:
Where I'm at, we've built a cron-like scheduler which is powered by a single timer. When the timer fires the system checks which crons need to run using a Quartz CronTrigger. Generally these crons have a lot of work to do, and the way we handle that is each cron spins its individual tasks off as JMS messages, then MDBs handle the messages. Currently this runs on a single Glassfish instance and as our task load increases, we should be able to scale this up with a cluster so multiple nodes are processing the jms messages. We balance the jms message processing load for each type of task by setting the max-pool-size in glassfish-ejb-jar.xml (also known as sun-ejb-jar.xml).
Building a system like this and getting all the details right isn't trivial, but it's proving really effective.
