Google App Engine: Queued Tasks vs Cron Jobs - java

I'm very confused about GAE's concepts of Tasks, Task Queues (both push and pull), Cron Jobs and how each of these relate to Frontend vs. Backend instances.
I'm trying to achieve a situation where some HTTP requests can be serviced immediately, whereas some get queued. Queued requests might ultimately end up triggering my own code to execute (once they are consumed) or they might hit one of the GAE service APIs (LogQuery, etc.).
I can't seem to wrap my head around how to design these two scenarios, let alone code them up. To make things worse, I've read literature suggesting that certain task/queue-related code should be written differently depending on whether it executes on a Frontend or Backend instance. Thanks in advance for any help here! Bonus points for some concrete examples!

You write the code; Tasks and Cron execute it.
A Task is a wrapper for a set of properties, the main one being the URL that should be executed. Your code (handler or servlet) should reside at that URL. Tasks sit in Task Queues, which have default properties governing how fast, and how many in parallel, they execute their Tasks. So a queue is basically a to-do list that works through its tasks, with no guarantee of when a given task will start.
Cron is a service that periodically calls a URL that you provide. In a sense, it's a scheduler.
Your URL (that is, your handler or servlet) can reside on a frontend instance (the default) or on a backend instance (you must set a special property on the Task or in the Cron settings). The main difference is that task requests on frontend instances must complete within 10 minutes, while backend requests can run indefinitely.

Related

Can I trigger a Dropwizard task execution?

Recently I tried to create a task with Dropwizard that would be triggered within a resource but I can't find a way to do it.
I know that there is an integration with Quartz, but that doesn't fit my needs (I don't want to schedule tasks).
Is the only option to make a POST to the task endpoint? If so, how can I make a request to /tasks/myTask?
I don't want to change the architecture to something like producer/consumer, where I create a task in the resource and enqueue it, and something else then executes the enqueued tasks.
I posted a sample of how you can use a Managed service to execute tasks.
Running async jobs in dropwizard, and polling their status
Is there a specific reason why you need to invoke the code as a task? I would extract the logic from the Task and put it in its own class. Then you can use it from multiple places, irrespective of the implementation. If it needs to be performed asynchronously, I've had success running Akka workers triggered from inside my Dropwizard services.
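As a rough illustration of the "extract the logic" suggestion, here is a minimal sketch in plain Java. All class and method names (CacheRefresher, RefreshTask, RefreshResource) are hypothetical, and the task and resource are plain stand-ins rather than real Dropwizard types; CompletableFuture is used for the asynchronous path instead of Akka only to keep the sketch dependency-free.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical business logic, extracted so it depends on neither the task nor the resource.
class CacheRefresher {
    String refresh(String region) {
        return "refreshed " + region;
    }
}

// Stand-in for an admin task (not the real Dropwizard Task class): it only delegates.
class RefreshTask {
    private final CacheRefresher refresher;

    RefreshTask(CacheRefresher refresher) {
        this.refresher = refresher;
    }

    String execute(String region) {
        return refresher.refresh(region);
    }
}

// Stand-in for a resource: it calls the same logic directly, with no HTTP POST to /tasks/...
class RefreshResource {
    private final CacheRefresher refresher;

    RefreshResource(CacheRefresher refresher) {
        this.refresher = refresher;
    }

    // Runs the shared logic on the common fork-join pool and returns immediately.
    CompletableFuture<String> trigger(String region) {
        return CompletableFuture.supplyAsync(() -> refresher.refresh(region));
    }
}
```

Both entry points share one CacheRefresher instance, so the resource never has to call the task endpoint.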

Starting long-running process from within Apache ServiceMix

I am looking for suggestions or ideas.
There is an external process (or even a browser) that needs to trigger a long-running process via a simple web service call, and that process should ideally run in the same container as the web service. We're using Apache ServiceMix. The web service itself shouldn't stay alive for the duration of the long-running process; it may just time out anyway, so we want it to return its response normally, pretty much right away.
Originally, I was thinking of using ProcessBuilder() to launch the long-running process as just another app, but doing this introduces certain OS dependencies and seems like a less than ideal practice anyway. One of the options we considered is starting another thread from the request and letting the request complete immediately with a response, while the long-running thread keeps going as long as needed. I fear resource hijacking on the container, as well as for the long-running thread's health when its launcher/parent exits and loses any reference to that long-running child.
If anyone has any good ideas for how this can be solved in an elegant way, please let me know.
Thank you very much!
I'm guessing here, as you didn't provide the version of your ServiceMix. With Camel, which is included with ServiceMix, I'd have two routes: the first one providing the web service, the second one doing the long-running process. The second route should use the seda component. This will give you the async call.

What is the best approach to run a long process from a java servlet?

I would like to ask what the best approach is to run a long process from a Java servlet. I have a webapp, and when the client makes a request it runs a servlet. This servlet should get some parameters from the request and then run a process. This process may take a long time, so I need to run it separately. When the process finishes, it sends an email with the results.
Thanks in advance.
Use a thread pool. Each time you receive a request, create a task and submit it to the thread pool. This ensures that too many requests don't bring the server to its knees, because you're in control of how many concurrent threads you can have, and how many tasks can wait in the thread pool's queue of waiting tasks.
See the javadoc for Executors and ThreadPoolExecutor.
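A minimal sketch of such a bounded pool, using ThreadPoolExecutor from java.util.concurrent. The class name BoundedWorkerPool and the limits passed to it are assumptions chosen for illustration; a servlet would call trySubmit and answer with an error (for example, HTTP 503) when it returns false.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around a fixed-size pool with a bounded waiting queue.
class BoundedWorkerPool {
    private final ThreadPoolExecutor pool;

    BoundedWorkerPool(int threads, int queueCapacity) {
        // core == max gives a fixed number of worker threads; the bounded queue
        // caps how many tasks may wait; AbortPolicy rejects anything beyond that.
        this.pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns true if the task was accepted, false if all threads are busy
    // and the waiting queue is full.
    boolean trySubmit(Runnable task) {
        try {
            pool.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    // Stops accepting tasks and waits for accepted ones to finish.
    // Returns true if everything finished within the timeout.
    boolean shutdownAndAwait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Rejecting when the queue is full is only one possible policy; ThreadPoolExecutor.CallerRunsPolicy would instead slow down the submitting request thread.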
Although having a servlet invocation spawn a process sounds a bit dangerous (without proper throttling in place), you can spawn a process using Runtime.getRuntime().exec(). Much better would be to use ProcessBuilder to prepare the process arguments and spawn it.
Normally that kind of activity is delegated to another type of application module, like a message-driven bean, and that seems to me the cleanest and most standards-compliant solution. Although most servers won't complain if you create your own threads (which is forbidden by the standard but rarely enforced), the amount of management needed to set up your own job queue and pooled execution environment isn't really worth it in my opinion.
I see two possibilities to do this:
Create a separate thread for each task (thread pool approach). This is possible, but may create a performance problem.
Create a second application. For instance, you can save the parameters to a DB. The second application monitors this DB at some interval and processes what it finds. Instead of a DB you can use a message queue manager like WebSphere MQ.
The second approach has an advantage: if the app is not able to process a request right now for some reason, it can return to it later.
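The second possibility above can be sketched roughly as follows. This is a simplified, single-process illustration under stated assumptions: an in-memory queue stands in for the shared DB table or message queue, and the names PollingWorker, save, and pollOnce are hypothetical. A request that fails stays pending and is retried on a later pass.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

class PollingWorker {
    // Stand-in for the DB table (or message queue) shared by the two applications.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    // Hypothetical handler: returns true when the request was processed successfully.
    private final Predicate<String> handler;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PollingWorker(Predicate<String> handler) {
        this.handler = handler;
    }

    // Called by the web application: store the request parameters and return.
    void save(String params) {
        pending.add(params);
    }

    // One polling pass over the requests that were pending when the pass began.
    // Failed requests are put back so that a later pass can retry them.
    // Returns the number of requests completed in this pass.
    int pollOnce() {
        int completed = 0;
        int batch = pending.size();
        for (int i = 0; i < batch; i++) {
            String request = pending.poll();
            if (request == null) {
                break;
            }
            boolean ok;
            try {
                ok = handler.test(request);
            } catch (RuntimeException e) {
                ok = false;
            }
            if (ok) {
                completed++;
            } else {
                pending.add(request);
            }
        }
        return completed;
    }

    int pendingCount() {
        return pending.size();
    }

    // Polls repeatedly, waiting intervalSeconds between the end of one pass and the next.
    void start(long intervalSeconds) {
        scheduler.scheduleWithFixedDelay(
                this::pollOnce, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

In a real deployment the queue would be durable storage, so that pending requests survive a restart of either application.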

Java patterns for long running process in a web service

I'm building a web service that executes a database process (SQL code that runs several queries, then moves data between two really large tables). I'm assuming some processes might take 2 to 10 hours to execute.
What are the best practices for executing a long-running database process from within a Java web service (it's actually REST-based, using JAX-RS and Spring)? The process would be executed by a single web service call. It is expected that this execution would be done once a week.
Thanks in advance!
It's gotta be asynchronous.
Since your web service call is an RPC, best to have the implementation validate the request, put it on a queue for processing, and immediately send back a response that has a token or URL to check on progress.
Set up a JMS queue and register a listener that takes the message off the queue and persists it.
If this is really taking 2-10 hours, I'd recommend looking at your schema and queries to see if you can speed it up. There's an index missing somewhere, I'd bet.
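The validate, enqueue, and return-a-token flow described in this answer can be sketched as follows. This is an in-memory illustration only: a single-threaded ExecutorService stands in for the JMS queue and its listener, and the names JobService, submit, and status are hypothetical. A REST endpoint would return the token from submit right away and expose status under a progress URL.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class JobService {
    enum Status { QUEUED, RUNNING, DONE, FAILED }

    private final Map<String, Status> statuses = new ConcurrentHashMap<>();
    // Stand-in for the JMS queue plus listener: jobs run one at a time, in order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Validates the request, queues it, and returns a token immediately.
    String submit(Runnable job) {
        if (job == null) {
            throw new IllegalArgumentException("job must not be null");
        }
        String token = UUID.randomUUID().toString();
        statuses.put(token, Status.QUEUED);
        worker.execute(() -> {
            statuses.put(token, Status.RUNNING);
            try {
                job.run();
                statuses.put(token, Status.DONE);
            } catch (RuntimeException e) {
                statuses.put(token, Status.FAILED);
            }
        });
        return token;
    }

    // Returns the current status, or null if the token is unknown.
    Status status(String token) {
        return statuses.get(token);
    }

    // Stops accepting jobs and waits for queued ones to finish.
    // Returns true if everything finished within the timeout.
    boolean shutdownAndAwait(long seconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

For a job that runs for hours, the statuses would need to be persisted (for example, in the database) so that progress survives a restart; the in-memory map here is only for illustration.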
Where I work, I am currently evaluating different strategies for this exact situation, only times are different.
With the times you state, you may be better served by using Publish/Subscribe message queuing (ActiveMQ).

Task Queue Java API

What is the purpose of Task Queue Java API? How does it work, and where should it be used?
The homepage looks pretty unambiguous:
With the Task Queue API, applications can perform work outside of a user request but initiated by a user request. If an app needs to execute some background work, it may use the Task Queue API to organize that work into small, discrete units, called Tasks. The app then inserts these Tasks into one or more Queues. App Engine automatically detects new Tasks and executes them when system resources permit.
One thing GAE does is keep your request-response cycle very short-lived to increase scalability. That is why a lot of things, like database access and HTTP requests, are handled asynchronously.
However, there are requests that just can't be fully handled in real time. This is either because such requests do some long computation (so they can be done in the background) or because they are periodic tasks, like cron jobs, that you need to schedule and execute repeatedly.
Task Queues let you do that.
