In podcast #15, Jeff mentioned he twittered about how to run a regular event in the background as if it was a normal function - unfortunately I can't seem to find that through twitter. Now I need to do a similar thing and am going to throw the question to the masses.
My current plan is that when the first user (probably me) enters the site, it starts a background thread that waits until the allotted time (hourly on the hour) and then kicks off the event, blocking the others (I am a Windows programmer by trade so I think in terms of events and WaitForMultipleObjects) until it completes.
How did Jeff do it in Asp.Net and is his method applicable to the Java web-app world?
I think developing a custom solution for running background tasks isn't always worth it, so I recommend using the Quartz Scheduler in Java.
In your situation (need to run background tasks in a web application) you could use the ServletContextListener included in the distribution to initialize the engine at the startup of your web container.
After that you have a number of possibilities to start (trigger) your background tasks (jobs), e.g. you can use Calendars or cron-like expressions. In your situation you should most probably settle on SimpleTrigger, which lets you run jobs at fixed, regular intervals.
The jobs themselves can be described easily too in Quartz, however you haven't provided any details about what you need to run, so I can't provide a suggestion in that area.
As mentioned, Quartz is one standard solution. If you don't care about clustering or persistence of background tasks across restarts, you can use the built in ThreadPool support (in Java 5,6). If you use a ScheduledExecutorService you can put Runnables into the background thread pool that wait a specific amount of time before executing.
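A minimal sketch of the ScheduledExecutorService route for the "hourly on the hour" case in the question. The class and method names (HourlyJob, millisUntilNextHour, start) are made up for illustration, and the code uses java.time and lambdas, so it assumes Java 8 or later; on Java 5/6 the delay would have to be computed with java.util.Calendar instead.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HourlyJob {

    // Milliseconds from 'now' until the next top of the hour.
    public static long millisUntilNextHour(LocalDateTime now) {
        LocalDateTime nextHour = now.truncatedTo(ChronoUnit.HOURS).plusHours(1);
        return Duration.between(now, nextHour).toMillis();
    }

    // Starts one background thread that runs 'job' at the next full hour
    // and then every hour after that. With a single thread, runs never overlap.
    public static ScheduledExecutorService start(Runnable job) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                job,
                millisUntilNextHour(LocalDateTime.now()),
                TimeUnit.HOURS.toMillis(1),
                TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler =
                start(() -> System.out.println("hourly task ran"));
        // Demo only: stop straight away so the JVM can exit.
        // In a web app you would stop the scheduler when the context is destroyed.
        scheduler.shutdownNow();
    }
}
```

Nothing here survives a restart or coordinates across a cluster, which is the trade-off against Quartz described above.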
If you do care about clustering and/or persistence, you can use JMS queues for asynchronous execution, though you will still need some way of delaying background tasks (you can use Quartz or the ScheduledExecutorService to do this).
Jeff's mechanism was to create some sort of cached object which ASP.Net would automatically recreate at some sort of interval - It seemed to be an ASP.Net specific solution, so probably won't help you (or me) much in Java world.
See https://stackoverflow.fogbugz.com/default.asp?W13117
Atwood: Well, I originally asked on Twitter, because I just wanted something light weight. I really didn't want to like write a windows service. I felt like that was out of band code. Plus the code that actually does the work is a web page in fact, because to me that is a logical unit of work on a website is a web page. So, it really is like we are calling back into the web site, it's just like another request in the website, so I viewed it as something that should stay inline, and the little approach that we came up that was recommended to me on Twitter was to essentially to add something to the application cache with a fixed expiration, then you have a call back so when that expires it calls a certain function which does the work then you add it back in to the cache with the same expiration. So, it's a little bit, maybe "ghetto" is the right word.
My approach has always been to have the OS (i.e. Cron or the Windows task scheduler) load a specific URL at some interval, and then set up a page at that URL to check its queue and perform whatever tasks were required, but I'd be interested to hear if there's a better way.
From the transcript, it looks like FogBugz uses the windows service loading a URL approach also.
Spolsky: So we have this special page called heartbeat.asp. And that page, whenever you hit it, and anybody can hit it at anytime: doesn't hurt. But when that page runs it checks a queue of waiting tasks to see if there's anything that needs to be done. And if there's anything that needs to be done, it does one thing and then looks in that queue again and if there's anything else to be done it returns a plus, and the entire web page that it returns is just a single character with a plus in it. And if there's nothing else to be done, the queue is now empty, it returns a minus. So, anybody can call this and hit it as many times, you can load up heartbeat.asp in your web browser you hit Ctrl-R Ctrl-R Ctrl-R Ctrl-R until you start getting minuses instead of pluses. And when you've done that FogBugz will have completed all of its maintenance work that it needs to do. So that's the first part, and the second part is a very, very simple Windows service which runs, and its whole job is to call heartbeat.asp and if it gets a plus, call it again soon, and if it gets a minus call it again, but not for a while. So basically there's this Windows service that's always running, that has a very, very, very simple task of just hitting a URL, and looking to see if it gets a plus or a minus and, and then scheduling when it runs again based on whether it got a plus or a minus. And obviously you can do any kind of variation you want on this theme, like for example, uh you could actually, instead of returning just a plus or minus you could say "Okay call me back in 60 seconds" or "Call me back right away I have more work to be done." And that's how it works... so that maintenance service it just runs, you know, it's like, you know, a half page of code that runs that maintenance service, and it never has to change, and it doesn't have any of the logic in there, it just contains the tickling that causes these web pages to get called with a certain guaranteed frequency. 
And inside that web page at heartbeat.asp there's code that maintains a queue of tasks that need to be done and looks at how much time has elapsed and does, you know, late-night maintenance and every seven days delete all the older messages that have been marked as spam and all kinds of just maintenance background tasks. And uh, that's how that does that.
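For the Java readers, here is a rough sketch of the heartbeat idea as described in the transcript: each hit does at most one queued task and answers "+" if more work is waiting or "-" if the queue is empty, and the caller picks its next delay from that answer. This is not FogBugz code; the class name, the in-memory queue, and the 1 second / 60 second delays are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class Heartbeat {
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    public synchronized void enqueue(Runnable task) {
        tasks.add(task);
    }

    // One "hit" of the heartbeat page: run at most one task, then report
    // "+" if more work is waiting or "-" if the queue is now empty.
    public synchronized String beat() {
        Runnable next = tasks.poll();
        if (next != null) {
            next.run();
        }
        return tasks.isEmpty() ? "-" : "+";
    }

    // The caller's side: call back soon after a "+", much later after a "-".
    // Both delays are arbitrary values chosen for the example.
    public static long nextDelayMillis(String response) {
        return "+".equals(response) ? 1_000L : 60_000L;
    }

    public static void main(String[] args) {
        Heartbeat heartbeat = new Heartbeat();
        heartbeat.enqueue(() -> System.out.println("delete old spam"));
        heartbeat.enqueue(() -> System.out.println("nightly maintenance"));

        String response;
        do {
            response = heartbeat.beat();
            System.out.println("got " + response
                    + ", next call in " + nextDelayMillis(response) + " ms");
        } while ("+".equals(response));
    }
}
```

In a web app, beat() would sit behind a servlet or page, and the loop in main would be the small external service (or a cron job) that requests that URL.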
We use jtcron for our scheduled background tasks.
It works well, and if you understand cron it should make sense to you.
Here is how they do it on StackOverflow.com:
https://blog.stackoverflow.com/2008/07/easy-background-tasks-in-aspnet/
Related
I'm seeing strange behavior and I don't know how to gain any further insight into and am hoping someone can help.
Background: I have a query that takes a long time to return results so instead of making the user wait for the data directly upon request I execute this query via a Timer object at regular intervals and store the results in a static variable. Therefore, when the user requests the data I always just pull from the static variable, therefore making the response virtually instant. So far so good.
Issue: The behavior I'm seeing, however, is that if I make a request for the data just as the background (Timer) request has begun to query the data, my user's request waits for the data to come back before responding -- forcing the user to wait. It's as if tomcat is behaving synchronously with the threads (I know it's not -- it just looks that way).
This is in a Production environment and, for the most part, everything works great but for users there are times when the site just hangs for them and they feel it's unreliable (well, in a sense it is).
What I've done: Since the requests for the data were in a static method I thought "A ha! The threads are synchronized, which is causing the delay!" so I pulled all of my static methods out, removed the synchronization and forced each call to instantiate its own object to retrieve the data (to keep it thread safe). There isn't any synchronization on a semaphore to the static variable either.
I've also installed javamelody to try and gain some insight into what's going on but nothing new thus far. I have noticed a lot (majority) of threads are in "WAITING" state but they also have 0ms for User and CPU time so don't think that is pointing to anything(?).
Running Tomcat 5.5 (no apache layer), struts 2, Java 1.5
If anyone has any idea why a simple request to a static variable hangs for longer background processes I would really appreciate it! Or if you know how I can gain insight that would be great too.
Thanks!
One possible explanation is that the threads are actually blocking at the database level due to database locking (or something) caused by the long-running query.
The way to figure out what is going on is to find out exactly where the blocked threads are blocking. A thread dump can be produced by sending a SIGQUIT (or equivalent) to the JVM, and it includes stack traces for all Java threads. Alternatively, you can get the same information (and more) by attaching a debugger, etc. Either way, the class name and line number of the top frame of each stack should allow you to look at the source code and figure out (at least) what kind of locking or blocking is going on.
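If sending a signal or attaching a debugger is awkward in production, a similar picture can be collected from inside the JVM. The sketch below is an alternative to the SIGQUIT dump, not the same thing: it uses Thread.getAllStackTraces(), which has been available since Java 1.5, and it does not show which locks are held. The class name ThreadDump is made up for the example.

```java
import java.util.Map;

public class ThreadDump {

    // Builds a plain-text dump: one header line per live thread,
    // followed by its stack frames, top frame first.
    public static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling dump() from a diagnostic page while a user request is hanging would show the top frame of the stuck request thread, which is the line to look up in the source.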
For those who would like to know I eventually found VisualVM (http://visualvm.java.net/download.html). It's perfect. I run Tomcat from eclipse like I normally do and it appears within the VisualVM client. Right-mouse click the tomcat icon, choose Thread Dump and, boom, I've got it all.
Thanks, all, for the help and pointers towards the right direction!
I'll try my best to explain the situation at hand. I'm developing an auto-logoff feature for an authentication application that runs in an embedded system and should work cooperatively with its own internal auto-logout system. First I'll briefly explain the native auto logoff. If I set it to 60 seconds and I'm in one of the system's screens, it will monitor and reset the timer upon user interaction. After being idle for 60 seconds, the system will call the Java authentication application and it logs off. This is easily done. Once a user logs in, the application starts a thread that just waits. When the application is brought back (upon timeout), a call for the notify() method is made, releasing the thread and executing the logout process. Relatively simple, right?
Now things get ugly. This embedded system supports multiple Java applications, as it is expected. But the system built-in auto logout will only keep track of things should the user be interacting with one of the system's native screens, if he's using another Java application, it will do nothing. So, in order to counter that, I need to come up with some way to implement an internal timer that works in conjunction with the workings described above.
In other words, should the system be in "Java mode" I need to use a coded timer, anything else I use the timer described in the first paragraph. The system itself doesn't help much, since there's no support to check in which state the machine is, so I'm pretty much left with only Java to do the job.
I've given it a lot of thought today but I'm just stuck. Any fresh ideas?
BTW, it supports only Java up to 1.4 so no advanced concurrency classes for me.
Edit: I forgot to mention that I can't just code a regular timer, since I can't capture native events from the system (I can capture events from other Java applications though) and a Java timer overrides the native timer function. That's why I need some "coordinate work" between them.
I am looking for ideas on how to deal with a search related task which takes longer than usual (in human terms, more than 3 seconds).
I have to query multiple sources, sift through information for the first time and then cache it in the DB for later quick return.
The context of the project is J2EE, Spring and Hibernate (on top of SpringROO)
The possible solutions I could think of
-On the webpage let the user know that the task is running in the background, and if possible give them a queue number or waiting time. Refresh the page via a controller which basically checks if the task is done; when it's done (i.e. the search result is prepared and stored in the DB), just forward to a new controller and fetch the result from the DB
-The background tasks could be done with Spring Task executor. I am not sure if it is easy to give a measure of how long it would take. It would probably be a bad idea to let all the search terms run concurrently, so some sort of pooling will be a good idea.
-Another option to use background tasks is to use JMS. This is perhaps a solution with more control (retries etc)
-Spring batch also comes to mind
Please suggest how you would do it. I would greatly appreciate a semi-detailed+ description. The sources of info can be many and can be sequential in nature, so it can take up to 4-5 minutes for the results to form. It is also possible that such tasks run automatically in the background without user intervention (i.e. to update from the sources)
From a user perspective, I use AJAX. The default web page contains some kind of "Busy" indicator. When the AJAX request completes, the busy indicator is replaced with the result.
In the background, request handlers are already multi-threaded. So you can simply format the default result, close&flush the output, and do the processing in the current thread. You should put something in the session or DB to make sure that no one can start the same heavy process a second time.
Running task pools in a web container is possible but there are some caveats, especially how to synchronize startup/shutdown: Do you want your web server to "hang" during shutdown while some thread is busy collecting your results? Also the additional load should be considered. It might be better to use JMS and offload the strain to a second server dedicated to build the search results.
Such a system will scale much better if your searches start to become a burden. It also makes it trivial to automate the process by writing a small program which posts searches in the JMS queue.
I've solved this problem in the past doing something like this:
When the user initiates a long running task, I open a popup window that displays the task status. The task status includes a name and estimated time to complete
This task is also stored in my "app" (this can be stored in the DB, session, or application context), so the user can continue doing other things on my web app while having an easy way to navigate back to the running task.
I stored my tasks in a DB, so I could manage what happens on startup and shutdown of the web app. This requires storing the progress of the task in the DB.
The tricky part is displaying results to the user. If you use the method I've described, you'll need to store results in either the DB, session, or application context.
This system I've described is pretty heavyweight, and may be overkill for your application.
In response to the comment "so what do you use to do the background computing. I have asked this before":
I use java.util.concurrent. A lot of this depends on the nature of your application. Is the task (or steps in the task) idempotent? How critical is it that it run to completion? If you have a non-idempotent task that must run to completion, I would say you generally must record every piece of work you do, and you must do that piece of work within a transaction. For example, if one of your tasks is to email a list of people (this is definitely not idempotent) you would do the emailing in a "transaction" (I'm using the term lightly here) and store your progress after each transaction is complete.
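To make the "store your progress after each transaction" point concrete, here is a small sketch under some loud assumptions: the in-memory set stands in for a DB table, the step callback stands in for the real work (e.g. sending one email), and all names are invented for the example. Progress is recorded only after a step succeeds, so a rerun after a crash skips what was already done.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class ResumableTask {
    // Stand-in for a "completed work" table in the DB (assumption).
    private final Set<String> completed = ConcurrentHashMap.newKeySet();

    // Runs 'step' for every item not yet completed and returns how many
    // items were processed in this run. If 'step' throws, the item is not
    // recorded and will be retried on the next run.
    public int run(List<String> items, Consumer<String> step) {
        int processed = 0;
        for (String item : items) {
            if (completed.contains(item)) {
                continue; // done in an earlier run, do not repeat it
            }
            step.accept(item);   // the non-idempotent piece of work
            completed.add(item); // record progress only after success
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) throws Exception {
        ResumableTask task = new ResumableTask();
        List<String> recipients = Arrays.asList("ann", "bob", "cy");
        Callable<Integer> job =
                () -> task.run(recipients, r -> System.out.println("mail " + r));

        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println("first run processed " + pool.submit(job).get());
            System.out.println("second run processed " + pool.submit(job).get());
        } finally {
            pool.shutdown();
        }
    }
}
```

There is still a window between doing the work and recording it, which is why the answer talks about doing both inside one transaction where the work allows it.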
I want to reduce the CPU usage/ROM usage/RAM usage - generally, all system resources that my app uses - who doesn't? :)
For this reason I want to split the preferences window from the rest of the application,
and let the preferences window run as an independent program.
The preferences program should write to a property file (not a problem at all) and send an "update signal" to the main program - which means it should call the update method (that I wrote) in the Main class.
How can I call the update method in the Main program from the preferences program?
To put it another way, is there a way to build a preferences window that takes system resources only when the window appears?
Is this approach - separating programs and letting them talk to each other (somehow) - the right approach for speeding up my programs?
What you're describing sounds like Premature Optimisation. If you're writing something other than a toy application, it's important to be confident that your optimisations are actually addressing a real problem. Is your program running slowly? If so, have you run it through a profiler or otherwise identified where the poor performance is happening?
If you have identified that what you want to do will address your performance issue, I suggest you look at running the components concurrently in different threads, not different processes. Then your components can avoid blocking each other, you will be able to take advantage of multi-core processors and you do not take on the complexity and performance overhead of inter-process communication over network sockets and the like.
You can communicate back and forth using sockets. Here's a tutorial of how to do something similar..
Unfortunately, I don't think this is going to help you minimize CPU usage, RAM, etc... If anything it might increase the CPU usage, RAM usage etc, because you need to run two JVM's instead of one. Unless you have some incredibly complicated preferences window, it is not likely taking that many resources that you need to worry about it. By adding the network communication, you are just adding more complexity without adding any benefit.
Edit:
If you have read the book Filthy Rich Clients, one of the main points of the book is that Rich Effects do not need to be resource intensive. Most of the book is devoted to showing how to add cool effects to an app without taking a lot of resources. Throughout the book they are very careful to time everything to show what takes a long time and what doesn't. This is crucial when making your app less resource hungry. Write your app, see what feels slow, add timing code to those particular items that are slow, and speed up those particular parts of the code. Check with your timing code to see if it is actually faster. Rinse and repeat. Otherwise you are doing optimization that may not make any difference. Without timing your code, you don't know whether it needed to be sped up in the first place, or whether your optimization actually made it faster.
Others have mentioned loading the properties window in a separate thread. It's important to remember that Swing has only one thread called the EDT that does all of the painting of pixels to the screen. Any code that causes pixels on the screen to change should be called from the EDT and thus should not be called from a separate thread. So, if you have something that may take a while to run (perhaps a web service call or some expensive computation), you would launch a separate thread off of the EDT, and when it finishes run code on the EDT to do the UI update. There are libraries such as SwingWorker to make this easier. If you are setting a dialog to be visible, this should not be on a separate thread, but it may make sense to build the data structures in a separate thread if it is time consuming to build these data structures.
Using Swing Worker is one of many valuable ideas in Filthy Rich Clients for making UI's feel more responsive. Using the ideas in this book I have taken some fairly resource intensive UI's and made them so the UI was hardly using any resources at all.
You could create a ServerSocket in the main window and have the preferences app connect to it with a regular Socket. The protocol can be extremely simple, but... I think you should really look at the second approach: building a preferences window that takes system resources only when it appears.
To do that, you have to hold off building the window and all its resources until the user performs the Preferences action, then save your file (or pass the content to the main app) and dispose of all the resources of the preferences window by making all references to it unreachable. The garbage collector will handle the rest.
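For completeness, the ServerSocket idea can be sketched as below. The one-line "UPDATE" message, the class name, and both method names are assumptions made for the example, and the demo runs both sides in one JVM on a port picked by the OS; two real programs would need to agree on a port some other way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class UpdateSignal {

    // Main-program side: accept one connection and run 'onUpdate'
    // if the client sent the line "UPDATE".
    public static void serveOnce(ServerSocket server, Runnable onUpdate) throws IOException {
        try (Socket client = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), "UTF-8"))) {
            if ("UPDATE".equals(in.readLine())) {
                onUpdate.run();
            }
        }
    }

    // Preferences-program side: connect over loopback and send the signal.
    public static void sendUpdate(int port) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            out.println("UPDATE");
        }
    }

    public static void main(String[] args) throws Exception {
        // Port 0 asks the OS for any free port; binding to loopback keeps
        // the listener off the network.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread listener = new Thread(() -> {
                try {
                    serveOnce(server, () -> System.out.println("reloading preferences"));
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            listener.start();
            sendUpdate(server.getLocalPort());
            listener.join();
        }
    }
}
```

In the main program the onUpdate callback would be the existing update method, called after the preferences program has finished writing the property file.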
Maybe you could use some sort of directory watcher like this or maybe implement some sort of semaphore.
Honestly, I think that you should be able to solve the problem if you have some sort of menu item that the user can access. Once that the user saves the preferences, these are written to a file. The application then loads the values from the file whenever it needs them.
If your system is operating slowly, or hanging, you might consider the use of threads, or increase the number of threads.
Actually, as others have explained, you can use socket for inter-process communication.
However, that won't reduce your overall CPU / RAM usage at all. (might even slightly worsen your resources usage)
For your case, you can launch the Preferences window in a different Thread rather than a different Process.
Thread is lighter for OS to handle and poses no additional complexity for inter-process communications.
Nobody seems to have mentioned the DBUS - available to developers on a Linux system. I guess that's no good if you're trying to make a Windows/Cross Platform application, but the DBUS is a ready-made application-communication platform. It helps address issues such as:
Someone else might already be using the port you're trying. There's no way for your client application (the "Preferences" window, I guess) to know whether the thing listening on that port is your main application or just something else that happens to be there, so you'll have to do some sort of handshake and implement a conflict-resolution mechanism
It's not going to be obvious to either the future you, or anyone who comes to maintain your app why you're on the port you are. This might not seem important, but communicating on Socket 5574 just doesn't seem as neat to me as communicating on channel org.yourorganisation.someapp .
Firewalls (as I think someone's already said) can be a little over-zealous
Also, it's worth getting your hand in on DBUS - it's useful for communicating with a whole bunch of other applications such as the little popup notification thing you'll find in recent Ubuntu distributions, or certain instant messaging clients, etc.
You can read up on what I'm talking about (and maybe correct me on some of the things I've said) here: http://www.freedesktop.org/wiki/Software/dbus . It looks like they're working on making it happen on Windows too, which is nice.
I'm developing an application which requires the contents of the database to be written to an MS Excel file at the end of each day. I've written the code for copying the contents into the Excel file, but how do I proceed from here? Should threads be used to check for the completion of 24 hours, or is there some other mechanism? Please provide me some guidance.
If you need the facility to run things at set times during the day, you should consider the Quartz Scheduler. It might be overkill, but it's very capable.
For example, you can use its CronTrigger to configure a job to run on a schedule defined by a cron expression, e.g. 0 55 23 * * ? would run your job at 5 to midnight every night (see examples). Quartz cron fields are ordered seconds, minutes, hours, day-of-month, month, day-of-week.
Quartz recently got a boost to its future and fortunes by being acquired by the Terracotta folks. Hopefully it'll get some real active development now.
I agree with the others that using something like crontab would be better. However, if you can't do that, I would use the java.util.concurrent package added in Java 1.5. The class you would need is ScheduledThreadPoolExecutor. Specifically, the scheduleAtFixedRate() method.
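A minimal sketch of that, assuming the export should run at 23:55 every day. The class name, the secondsUntil helper, and the placeholder export body are made up for the example, and it uses java.time, so it needs Java 8 or later rather than 1.5.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DailyExport {

    // Seconds from 'now' until the next occurrence of 'runAt'
    // (later today, or tomorrow if that time has already passed).
    public static long secondsUntil(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next).getSeconds();
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

        Runnable export = () -> {
            try {
                System.out.println("exporting database to Excel"); // placeholder for the real job
            } catch (RuntimeException e) {
                // An exception escaping the task cancels all later runs,
                // so catch it here, log it, and let the schedule continue.
                e.printStackTrace();
            }
        };

        executor.scheduleAtFixedRate(
                export,
                secondsUntil(LocalDateTime.now(), LocalTime.of(23, 55)),
                TimeUnit.DAYS.toSeconds(1),
                TimeUnit.SECONDS);

        executor.shutdownNow(); // demo only, so the JVM exits
    }
}
```

A fixed 24-hour period will drift by an hour against wall-clock time across daylight saving changes, which is one more reason cron or Quartz may be the better fit.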
I think that from the design perspective it is better to use crontab on linux platform or task scheduler on windows platform. It will keep your java program small, and simple. While the solution with thread waiting for the specific time seems simple it will add one serious concern - you will have to monitor its health.
In addition - I would suggest carefully planning the logs your job writes each time it runs. It is important to have logs for both successful and unsuccessful runs.
It makes sense to make separate file for such logs.
One more case to be considered - what to do if the database was not available at the exact time the job ran? Is it acceptable to wait another 24 hours?