What is the purpose of Task Queue Java API? How does it work, and where should it be used?
The homepage looks pretty unambiguous:
With the Task Queue API, applications
can perform work outside of a user
request but initiated by a user
request. If an app needs to execute
some background work, it may use the
Task Queue API to organize that work
into small, discrete units, called
Tasks. The app then inserts these
Tasks into one or more Queues. App
Engine automatically detects new Tasks
and executes them when system
resources permit.
One thing GAE does is keep your request-response cycle very short-lived to increase scalability. That is why a lot of things, like database access and HTTP requests, are handled asynchronously.
However there are requests that just can't be fully handled in realtime. This is either because such requests do some long computation (so they can be done in the background) or because they are periodic tasks like cron jobs that you need to schedule and execute repeatedly.
Task Queues let you do that.
Related
I am building a web console that will monitor the progress of ongoing processing tasks for multiple files. My requirement is real-time updates for each file separately (based on user request) via WebSockets. Typically, the user will log in, select the file whose progress they want to see, and be redirected to a page with live updates from the system about that file. So I will need a different topic to be created each time, based on the user request, with my scheduled job running in the background and publishing updates on each created topic.
My question: is there any way to fulfill these requirements using Spring WebSockets and scheduled jobs, or should I switch to an event bus like the one in Vert.x?
Regards,
After a week of trials I ended up using the Vert.x event bus. It gives me exactly what I needed: it even includes a periodic timer that can be cancelled by timer ID, which is exactly what I was looking for. Add the asynchronous nature of Vert.x and its light weight, and it is a perfect fit for my requirements.
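For readers who want to see the "periodic timer cancelled by ID" idea without Vert.x, here is a minimal sketch using only the JDK's ScheduledExecutorService. It is not the Vert.x API: the class name PeriodicTimersSketch and its methods are hypothetical, chosen only to mirror the behaviour described above (one timer per topic, cancelled when the user stops watching a file).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, NOT part of Vert.x: a JDK-only analogue of
// "start a periodic timer, get an id back, cancel it by that id".
final class PeriodicTimersSketch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<Long, ScheduledFuture<?>> timers = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Runs the action every periodMillis and returns an id for later cancellation. */
    long setPeriodic(long periodMillis, Runnable action) {
        long id = nextId.incrementAndGet();
        timers.put(id, scheduler.scheduleAtFixedRate(
                action, periodMillis, periodMillis, TimeUnit.MILLISECONDS));
        return id;
    }

    /** Cancels the timer; returns false if the id is unknown or already cancelled. */
    boolean cancelTimer(long id) {
        ScheduledFuture<?> timer = timers.remove(id);
        return timer != null && timer.cancel(false);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }

    /** Demo: wait for three ticks, cancel, then check a second cancel is refused. */
    static boolean demo() {
        PeriodicTimersSketch timers = new PeriodicTimersSketch();
        CountDownLatch threeTicks = new CountDownLatch(3);
        // In the web console, the action would publish progress to one file's topic.
        long id = timers.setPeriodic(10, threeTicks::countDown);
        try {
            boolean ticked = threeTicks.await(5, TimeUnit.SECONDS);
            boolean cancelled = timers.cancelTimer(id);
            boolean cancelledAgain = timers.cancelTimer(id);
            return ticked && cancelled && !cancelledAgain;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("demo ok: " + demo());
    }
}
```

A single scheduler thread is used here for simplicity; with many topics or slow publish actions, a larger scheduled pool would probably be needed.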
As we know, the Struts execute-and-wait interceptor takes care of long-running processes: instead of letting the request time out and be destroyed, it sends a wait page and, at last, the desired response. I want to implement the same for a long-running process in Spring and Hibernate.
I recommend using Spring's DeferredResult. It works like a future for the response and suits the HTTP long polling technique.
http://docs.spring.io/spring-framework/docs/3.2.0.BUILD-SNAPSHOT/api/org/springframework/web/context/request/async/DeferredResult.html
So let's say you make a request: the controller returns the DeferredResult, and the request is kept open until the internal process (Hibernate) finishes its task. The timeout is configurable in the constructor.
Here is another example: http://www.javacodegeeks.com/2013/03/deferredresult-asynchronous-processing-in-spring-mvc.html
In order to keep the session open throughout the lifetime of a request, we tie it to the view. This is done using either Spring's OpenSessionInViewInterceptor or OpenSessionInViewFilter.
An open session in view filter will ensure that the Hibernate session is kept open all the way up to the rendering of the view.
Or
You can use a task queue in the backend for a long running process like this.
Work Queues (aka Task Queues) exist to avoid doing a resource-intensive task immediately and having to wait for it to complete. Instead, the task is scheduled to be done later. A task is encapsulated as a message and sent to a queue. A worker process running in the background pops tasks off the queue and eventually executes the job. When you run many workers, the tasks are shared between them.
This concept is especially useful in web applications where it's impossible to handle a complex task during a short HTTP request window.
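The idea can be sketched in plain Java with a BlockingQueue shared by a few worker threads. This is only an in-process illustration under stated assumptions: the class name WorkQueueSketch, the string "tasks", and the "processed:" prefix are hypothetical stand-ins, and a real deployment would usually put a message broker or a platform task queue between the web tier and the workers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-process work queue: tasks are messages, workers pop and run them.
final class WorkQueueSketch {

    /** Enqueues the tasks, lets workerCount workers share them, returns the results. */
    static List<String> runAll(List<String> tasks, int workerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> done = new CopyOnWriteArrayList<>();
        CountDownLatch remaining = new CountDownLatch(tasks.size());
        List<Thread> workers = new ArrayList<>();

        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String task = queue.take();     // blocks until a task arrives
                        done.add("processed:" + task);  // stand-in for the real job
                        remaining.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // interrupted: stop the worker
                }
            }, "worker-" + i);
            worker.setDaemon(true);
            workers.add(worker);
            worker.start();
        }

        // The "HTTP request" side only enqueues and could return right away.
        queue.addAll(tasks);

        try {
            // Only this demo waits; a web handler would not block here.
            remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.forEach(Thread::interrupt);
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println(runAll(List.of("resize-1", "resize-2", "resize-3"), 2));
    }
}
```

The order of the results depends on which worker picks up which task, so callers should not rely on it.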
I have a simple DropWizard service and I'd like a REST API to start a long running processing task - both CPU and I/O bound. REST call will not wait for task completion, notification will happen by polling/long polling/web socket.
For now, I'd prefer if I can do this in Dropwizard and keep everything in single deployable JAR. What are my options?
UPDATE: I am interested in what my options are regarding running long running tasks in Dropwizard, deployed as single jar without external dependencies. Just spawn a new thread? Assuming there are just few such requests it would probably work but there should be better options.
You probably want to use a managed resource:
https://dropwizard.io/en/stable/manual/core.html#managed-objects
to set up a thread pool. Then, your initial request could push a message onto a queue. Your thread pool can pull messages off the queue and process them asynchronously.
You could maybe provide an additional endpoint so that clients can obtain the current state of the asynchronous process.
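A rough sketch of that arrangement, written without any Dropwizard types so it stands alone: the JobRegistrySketch class, its Status enum, and its method names are all hypothetical. In a Dropwizard service, stop() would presumably be called from a managed object's stop hook, and submit() and status() from two resource methods.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical job registry: start work on a pool, look its state up later by id.
final class JobRegistrySketch {
    enum Status { QUEUED, RUNNING, DONE, FAILED }

    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<String, Status> statuses = new ConcurrentHashMap<>();

    /** For the "start" endpoint: queues the work and returns a job id immediately. */
    String submit(Runnable work) {
        String id = UUID.randomUUID().toString();
        statuses.put(id, Status.QUEUED);
        pool.execute(() -> {
            statuses.put(id, Status.RUNNING);
            try {
                work.run();
                statuses.put(id, Status.DONE);
            } catch (RuntimeException e) {
                statuses.put(id, Status.FAILED);
            }
        });
        return id;
    }

    /** For the "status" endpoint: returns null when the id is unknown. */
    Status status(String id) {
        return statuses.get(id);
    }

    /** For shutdown: stops accepting jobs and waits for running ones to finish. */
    boolean stop(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        JobRegistrySketch jobs = new JobRegistrySketch();
        String id = jobs.submit(() -> System.out.println("working..."));
        jobs.stop(5000);
        System.out.println(id + " -> " + jobs.status(id));
    }
}
```

The status map here lives in memory and grows without bound; a real service would likely expire finished entries or keep them in a store that survives restarts.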
I would like to ask what the best approach is to run a long process from a Java servlet. I have a webapp, and when the client makes a request it runs a servlet. This servlet should get some parameters from the request and then run a process. This process may take a long time, so I need to run it separately. When the process finishes, it sends an email with the results.
Thanks in advance.
Use a thread pool. Each time you receive a request, create a task and submit it to the thread pool. This will ensure too many requests don't bring the server to its knees, because you're in control of how many concurrent threads you can have, and how many tasks can wait in the thread pool's queue of waiting tasks.
See the javadoc for Executors and ThreadPoolExecutor.
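As a small illustration of that control, the sketch below builds a ThreadPoolExecutor with a fixed number of threads and a bounded waiting queue, then counts how many submissions are refused once both are full. The class name BoundedPoolSketch and the latch that keeps tasks busy are assumptions made for the demo; in a servlet, a rejection would typically be turned into an error response asking the client to retry later.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of a pool with bounded threads and a bounded waiting queue.
final class BoundedPoolSketch {

    /** Submits `requests` long-running tasks and returns how many were rejected. */
    static int countRejected(int threads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,                      // fixed number of worker threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),   // at most queueSize waiting tasks
                new ThreadPoolExecutor.AbortPolicy()); // refuse work when saturated

        CountDownLatch release = new CountDownLatch(1); // keeps every task "busy"
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();           // stand-in for the long process
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                        // pool and queue are both full
                }
            }
        } finally {
            release.countDown();                       // let accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 running + 3 queued are accepted, so 5 of the 10 submissions are rejected.
        System.out.println(countRejected(2, 3, 10));
    }
}
```

AbortPolicy is only one choice; ThreadPoolExecutor.CallerRunsPolicy would instead make the submitting thread do the work, which slows the caller down rather than failing.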
Though this sounds a bit dangerous that invocation of a servlet spawns a process (without proper throttling capabilities in place), you can spawn a process using Runtime.getRuntime().exec(). Much better would be to use ProcessBuilder to prepare the process arguments and spawn it.
Normally that kind of activity is delegated to another type of application module, such as a message-driven bean, and that seems to me the cleanest, standards-compliant solution. Although most servers won't complain if you create your own threads (which is forbidden by the standard but rarely enforced), the amount of management needed to set up your own job queue and pooled execution environment isn't really worth it in my opinion.
I see two possibilities to do this:
Create a separate thread for each task (thread pool approach). This is possible, but it may create a performance problem.
Create a second application. For instance, you can save the parameters to a DB. The second application will monitor this DB at some interval and process what it finds. Instead of a DB you can use a message queue manager like WebSphere MQ.
The second approach has an advantage: if the app is not able to process the request now for some reason, it can return to it later.
I'm very confused about GAE's concepts of Tasks, Task Queues (both push and pull), Cron Jobs and how each of these relate to Frontend vs. Backend instances.
I'm trying to achieve a situation where some HTTP requests can be serviced immediately, whereas some get queued. Queued requests might ultimately end up triggering my own code to execute (once they are consumed) or they might hit one of the GAE service APIs (LogQuery, etc.).
I can't seem to wrap my head around how to design these two scenarios and let alone do the code up. To make things worse I've read literature that suggests there's certain task/queue-related coding you want to do differently depending on whether the code is executing on a Frontend or Backend instance. Thanks in advance for any help here! Bonus points for some concrete examples!
You write the code, Tasks and Cron execute it.
A Task is a wrapper for a set of properties, the main one being the URL that should be executed. Your code (handler, servlet) should reside at that URL. Tasks sit in Task Queues, which have certain default properties for how fast, how many in parallel, etc. they execute the Tasks. So it is basically a to-do list that sequentially executes tasks, with no guarantee of when a task will start.
Cron is a service that periodically calls a URL that you provide. In a sense it's a scheduler.
Your URL (= your handler/servlet) can reside on a frontend instance (the default) or a backend instance (you must set a special property on the Task or in the Cron settings). The main difference is that frontend requests must complete within 10 minutes, while backend requests can run indefinitely.