Java patterns for long running process in a web service - java

I'm building a web service that executes a database process (SQL code that runs several queries, then moves data between two really large tables). I'm assuming some processes might take 2 to 10 hours to execute.
What are the best practices for executing a long running database process from within a Java web service (it's actually REST-based using JAX-RS and Spring)? The process would be executed upon 1 web service call. It is expected that this execution would be done once a week.
Thanks in advance!

It's gotta be asynchronous.
Since your web service call is an RPC, best to have the implementation validate the request, put it on a queue for processing, and immediately send back a response that has a token or URL to check on progress.
Set up a JMS queue and register a listener that takes the message off the queue and persists it.
If this is really taking 2-10 hours, I'd recommend looking at your schema and queries to see if you can speed it up. There's an index missing somewhere, I'd bet.
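A minimal sketch of the accept, enqueue, and return-a-token flow described in this answer. It uses an in-memory `ExecutorService` as a stand-in for the JMS queue, so nothing survives a restart; the class and method names (`JobService`, `submit`, `status`, `shutdownAndWait`) are made up for illustration.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical job service: accept a request, queue it, hand back a token. */
final class JobService {
    enum Status { QUEUED, RUNNING, DONE, FAILED }

    // One worker thread, so long-running jobs are processed one at a time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Map<String, Status> statuses = new ConcurrentHashMap<>();

    /** Queues the job and returns a token; a real endpoint would validate the request first. */
    String submit(Runnable job) {
        String token = UUID.randomUUID().toString();
        statuses.put(token, Status.QUEUED);
        worker.submit(() -> {
            statuses.put(token, Status.RUNNING);
            try {
                job.run();
                statuses.put(token, Status.DONE);
            } catch (RuntimeException e) {
                statuses.put(token, Status.FAILED);
            }
        });
        return token; // returned immediately, before the job finishes
    }

    /** What a status endpoint would return; null for an unknown token. */
    Status status(String token) {
        return statuses.get(token);
    }

    /** Stops accepting jobs and waits up to the timeout for queued ones to finish. */
    boolean shutdownAndWait(long seconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        JobService service = new JobService();
        String token = service.submit(() -> System.out.println("moving data..."));
        System.out.println("token: " + token);
        service.shutdownAndWait(5);
        System.out.println("final status: " + service.status(token));
    }
}
```

In a JAX-RS resource, the POST handler would call something like `submit` and return the token (or a status URL built from it), and a GET handler would call `status`. For a job that runs for hours, a durable queue or a database row is the safer place to keep the state.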

Where I work, I am currently evaluating different strategies for this exact situation, only times are different.
With the times you state, you may be better served by using Publish/Subscribe message queuing (ActiveMQ).

Related

calling Django API using apache (mod_wsgi)

I have a frontend web tool which interacts with a REST API written in Django. Each API call takes a long time to process and is also CPU/GPU intensive, so I am planning to run only one call at a time and put the rest of the calls in a queue. I am not sure whether Celery with Redis would be helpful here or whether I should stick with a job queue approach on the Java side.
The tool will be used by multiple users, and each user will have their own jobs, so I need to put the jobs in a queue so that they can be processed one by one asynchronously. Would Celery be helpful here?

How are asynchronous requests handled by servlets?

I apologize in advance if this is a bad question.
I'm new to backend development and I'm trying to build an instant messaging service with GAE using java servlets.
And I assume the process for sending a message will be like this:
1. Client sends a JSON file to the servlet.
2. Servlet parses the JSON file and archives the message to the database.
So my question is:
What's going to happen if the next user attempts to send another message while the servlet is in the middle of saving the previous message to the database?
Because the arrival of user requests is not synchronized with the servlet cycle, will the new request just get lost?
Is there going to be some mechanism that queues the request or it's something that I'll have to implement myself?
I think I'm really confused about how the asynchronous request between different functions in a distributed system works.
Also, are there any readings you would recommend on backend design patterns, or just a general introduction?
Thanks a lot!
Please read the official tutorial on the subject, which covers Java web technologies, web containers, and servlets in depth:
http://docs.oracle.com/javaee/6/tutorial/doc/bnafd.html
But to answer your questions:
1. When another HTTP request comes in, the web container runs your servlet on another thread (typically taken from its thread pool), concurrently with the request already in progress.
2. The new request is not lost; it is processed concurrently.
3. The answer depends on your specific problem, performance, and SLA requirements. The simplest solution would be to parse and write each request to the database. If you are dealing with a very large number of simultaneous requests, I'd suggest starting a whole new discussion on the subject.
You need to understand exactly what a thread is. When another request is sent to the servlet, a container such as Tomcat assigns another thread to that request. Each thread runs independently of the others.
Server requests will run in parallel and your code might access/edit the same data concurrently. You should use Datastore transactions to prevent data corruption.
No, requests are independent and they run in parallel.
You could use Task Queues in your code to make updates run sequentially, but I'd strongly advise against it: first, Task Queues will double your requests; second, they will force a distributed parallel system to run sequentially, basically negating the whole purpose of App Engine.
Parallel processing is essential in server programming: it enables servers to handle a large number of requests. You should write code that takes this into account, and use Datastore transactions to prevent possible data corruption in those cases.
In a servlet's lifecycle, the init() and destroy() methods are called only once, but service() is called each time a new request hits the application, and the same servlet instance is shared by all requests, each on a different thread. Therefore one of the most basic rules for writing a servlet is not to keep global (shared, mutable) variables in a servlet class.
Your variable is readable/writeable by any other class. You have no control to ensure that they all do sensible things with it. One of them could overwrite it/incorrectly increment it, etc
There is one instance of a servlet per JVM, so many threads may try to access it concurrently. Because the variable is global and you are not providing any synchronization or access control, it will not be thread-safe. Also, if you ever run the servlet in some kind of cluster with different JVMs, the variable will not be shared between them and you will have multiple loginAttempt variables.
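To make the shared-instance point concrete, here is a plain-Java sketch with no servlet API involved. One object plays the role of the servlet and several threads play the role of concurrent requests. `SharedCounterDemo` and `handleRequest` are hypothetical names; the plain `int` field stands in for something like the loginAttempt variable mentioned above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for a servlet: one instance, many request threads. */
final class SharedCounterDemo {
    private int unsafeCount = 0;                                  // plain field: updates can be lost
    private final AtomicInteger safeCount = new AtomicInteger();  // atomic updates

    /** Plays the role of service(): called concurrently on the same instance. */
    void handleRequest() {
        unsafeCount++;                // read-modify-write, not atomic
        safeCount.incrementAndGet();  // atomic increment
    }

    int safeCount() { return safeCount.get(); }
    int unsafeCount() { return unsafeCount; }

    /** Simulates `threads` request threads, each making `perThread` calls. */
    static SharedCounterDemo run(int threads, int perThread) {
        SharedCounterDemo servlet = new SharedCounterDemo();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) servlet.handleRequest();
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return servlet;
    }

    public static void main(String[] args) {
        SharedCounterDemo d = run(8, 100_000);
        System.out.println("atomic: " + d.safeCount());    // always 800000
        System.out.println("plain:  " + d.unsafeCount());  // may be lower: increments can be lost
    }
}
```

The atomic counter only fixes the single-JVM case. As noted above, a cluster of JVMs would still have one counter per JVM, so shared state of that kind belongs in the database or another shared store.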

Restful Webservices using Java, Apache Axis2, Hibernate and MySQL and its performance

I read somewhere about the use of web services in apps. After a lot of research I was able to create a web service that accepts both JSON and JSONP as request formats and responds accordingly. I developed the web services using Java, Apache Axis2, Hibernate, and MySQL as the database. There are a few problems that I don't know how to solve:
Insert or delete operations: sometimes, if more than two users call a service that inserts or deletes a row at the same time, the queries go into sleep mode, and the next person who tries to call that service cannot. According to the server log, the error is SQL Lockout State. If I check the process list in MySQL, it shows that query in Sleep, and I have to kill it to resume.
The performance of the web service doesn't seem to be up to the mark; it takes more time than I think it should. In simple words, how can I get better performance from the services?
How can I implement a security feature such that when a user logs in, he/she is given an id and that id is validated, so that unauthorized access is prevented?
Or just guide me on the most appropriate and optimized web service methodology to use with Java.
Answer to this question is not specific to Android. Below are my investigations which might be useful for you.
For the point about MySQL connections going to sleep mode, you can do the following.
Debug the datasource used by Hibernate, try to increase the pool size & check for any issues in it.
Define a timeout period for connections. JBoss has several configurations related to this like blocking-timeout-millis, idle-timeout-minutes etc.
Declare a mechanism to validate periodically the connection resources in the pool for activeness. You can explore OracleStaleConnectionChecker for options.
Configure the minimum number of connections in the pool. This is important because when all the stale connections are discarded, the empty pool needs to be pre-filled and ready with active connections.
Coming to performance of Insert/Delete operations & SQL Lockout State, please try to re-order the sequence of the queries which you are firing to DB at every request. This may not be a deadlock situation but sequencing DB queries correctly will definitely lead to less lockout time and better performance.
This answer may be of use for you. Hibernate: Deadlock found when trying to obtain lock
The web services you have developed may require some performance optimization to bring them up to the mark. Below are the first few steps you can take to improve performance.
Avoid nested loops. Every extra level of nesting multiplies the number of iterations by the size of the list being iterated.
Remove early initialization of objects. This may lead to long unwanted GC cycles.
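As an illustration of the nested-loop advice in the list above, the sketch below computes the same result two ways: with two nested loops, and with a `HashSet` lookup that replaces the inner loop. The class name `CommonIds` and the sample data are made up. Whether this matters for the services in the question is something only profiling can tell.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CommonIds {
    /** Nested loops: up to a.size() * b.size() comparisons. */
    static List<Integer> nested(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        for (Integer x : a) {
            for (Integer y : b) {
                if (x.equals(y)) {
                    out.add(x);
                    break; // found a match for x, move on to the next element of a
                }
            }
        }
        return out;
    }

    /** One pass to build a set, one pass to probe it: about a.size() + b.size() steps. */
    static List<Integer> hashed(List<Integer> a, List<Integer> b) {
        Set<Integer> lookup = new HashSet<>(b);
        List<Integer> out = new ArrayList<>();
        for (Integer x : a) {
            if (lookup.contains(x)) out.add(x);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 3, 4);
        List<Integer> b = List.of(3, 4, 5);
        System.out.println(nested(a, b)); // [3, 4]
        System.out.println(hashed(a, b)); // [3, 4]
    }
}
```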
Apart from the above optimizations, there are several frameworks and tools at your service to evaluate code quality and performance: PMD, FindBugs, JMeter, and Java profilers, to name a few.
Shishir
You are going to have to profile your server and see where the time is spent. I really like YourKit for thread profiling. VisualVM, which comes with the JDK, can help too.
There are all sorts of reasons your web service can be slow:
Latency from client to server
Handling the HTTP request on the server
Handling the HTTP response on the client
Making the database call (sounds like you already have some kind of locking / blocking going on there)
You are going to have to get markers to tell you how long it took to go from A to B to C to D back to C back to B back to A kind of thing. We would be speculating heavily from here on what is exactly going on in your program, but we can give you the ideas / tools to figure it out.
If you use YourKit, connect it to your server process. Have nothing else connecting to your server (for instance your client is not sending requests). Try it with your client requesting, you should see your accepting threads receive the HTTP request and then delegate to either your processing thread or do the processing itself. You can use YourKit to see how much time is spent in different functions during that call time.
Try it with your client making the call.
Try it using a simple HTTP request tool like wget or maybe your IDE has a webservice test tool (for instance intellij does), or you can download a simple HTTP test tool.
By testing it in a simple tool that just outputs the response, you can eliminate any client processing issues. You can also achieve a similar test in Chrome or Firefox and use the developer tools to see time to fulfill request.
In my experience, the framework for handling the requests and delegating can introduce some performance issues. I ripped Grails out of a production environment because of its performance issues (before any Grails / Groovy flames come my way, we were operating at a much higher rate than typical web applications, and I am sure Grails has made some headway in the last couple years... alas, it was not for my need at that time)
BTW, I doubt you are operating a load where you will be critiquing the web service framework you chose to use. I have been happy with Spring MVC and DropWizard (Jersey JAX-RS), and Grails is easy to use too.
You should make a simple static content response in your webservice and see how quickly that returns vs a request that makes a database call.
Also, what kind of table are you using in MySQL? InnoDB? MyISAM? They have different locking schemes. That could be causing your MySQL issue.
The key to all of it: break the problem up into parts, measure each, and eliminate parts one by one until you can say "every time I do X it is slower" (for example, "every time I make a database call it's slower").
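If a profiler is not available, the "markers" idea from this answer can be approximated with a few lines of code. This is only a sketch: `StageTimer` and the stage names are hypothetical, and the lambdas stand in for the real parsing, database, and serialization steps. It is not thread-safe, so the assumption is one timer per request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper that records how long each stage of a request takes. */
final class StageTimer {
    // Insertion-ordered so the report lists stages in the order they first ran.
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    /** Runs the stage, adds its elapsed time to the stage's total, and returns its result. */
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    Map<String, Long> nanosByStage() { return nanosByStage; }

    /** One line per stage with the accumulated time in milliseconds. */
    String report() {
        StringBuilder sb = new StringBuilder();
        nanosByStage.forEach((stage, nanos) ->
            sb.append(String.format("%-12s %8.3f ms%n", stage, nanos / 1_000_000.0)));
        return sb.toString();
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        String body = timer.time("parse", () -> "{\"id\": 1}");
        Integer rows = timer.time("database", () -> {
            // Placeholder for the real query.
            return 42;
        });
        timer.time("serialize", () -> body + rows);
        System.out.print(timer.report());
    }
}
```

Comparing the per-stage totals for a static-content request against a request that hits the database gives the kind of A-to-B-to-C breakdown described above.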
In Java, the approach for which you will find the most support online (documentation and forums) is to develop the web service as a REST web service using Spring MVC.
You can base yourself on this resource and take it from there:
Spring MVC REST Hello World Web Service
Using Spring you can create a RESTful web service easily, and Spring does all the groundwork you need. As others have mentioned, you can consume the web service from any type of client, including Android.
A detailed guide available here:
https://spring.io/guides/gs/rest-service/
Here are my suggestions:
Make each API either read from or write to the database, not both. If an API combines reading and writing, it may cause deadlock;
Use a lightweight HTTP server. A heavyweight HTTP server may consume more resources.
Make use of threads. Having more threads can help when you are facing a large number of users.
Make more things static. You could avoid unnecessary queries.
I think mhoglan's answer is detailed enough.

How to implement asynchronous processing with J2EE application

I have an enterprise application with around 2k concurrent users every day. These users handle customer calls so application speed is of vital importance.
When a user is wrapping up a call they commit all the information they captured. This commit can take anywhere from 10-45 seconds.
I am looking into ways to take the delay away from the user.
We have a web front end running in IE; the back end is heavy Java running on a single EJB.
I wanted to make this commit process asynchronous in that once the user submits the request they don't have to wait for the commit to finish before going on to the next customer. This is what is currently implemented.
Originally I was thinking of just spawning another thread to handle the commit, but that's a no-no with EJBs.
Other options I can think of would be using JMS or SIB.
What would the best solution be? Is there another alternative I am missing?
There are actually two alternatives for cases like that.
The first one is to use JMS. It has the advantage that the server provides all the required infrastructure, so you don't have to implement much yourself.
The other is to register the request in a database and have a scheduled event process all pending requests.
Which one you select depends on your requirements. If you need to serve the requests as soon as they arrive, then you need to go with JMS. In both cases you need to persist the outcome of the request in a database and design a web service on top of it. The front end could use this (through polling) to present the result to the user.
Would have liked to leave a comment, but don't have the ability.
Another possibility:
Wrap the heavy EJBs in a queue mechanism, and expose a different bean with the same API so your client-facing communications talk to the new bean and are quick. They accept the request, add the job to the queue and return to the client immediately. You don't need to change the heavy EJBs or the client communications, just put a mediator in the way, and add a layer of wrapping.
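A rough, container-free sketch of that mediator idea. All names here (`CallWrapUp`, `HeavyCallWrapUp`, `QueuedCallWrapUp`, `drain`) are invented for illustration, and the in-memory queue stands in for whatever queue mechanism is actually used. The sketch deliberately starts no threads: the assumption is that something container-managed, such as a message-driven bean or a timer, would call `drain()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class WrapUpFacadeDemo {
    /** The API the client-facing code already uses (hypothetical). */
    interface CallWrapUp {
        void commit(String callRecord);
    }

    /** Stand-in for the existing heavy bean; it stays unchanged. */
    static final class HeavyCallWrapUp implements CallWrapUp {
        final List<String> committed = new ArrayList<>();

        @Override
        public void commit(String callRecord) {
            committed.add(callRecord); // imagine 10-45 seconds of work here
        }
    }

    /** Mediator with the same API: enqueue and return immediately. */
    static final class QueuedCallWrapUp implements CallWrapUp {
        private final Queue<String> pending = new ConcurrentLinkedQueue<>();
        private final CallWrapUp delegate;

        QueuedCallWrapUp(CallWrapUp delegate) {
            this.delegate = delegate;
        }

        @Override
        public void commit(String callRecord) {
            pending.add(callRecord); // fast path seen by the user
        }

        /** Called by a background consumer; returns how many records it processed. */
        int drain() {
            int processed = 0;
            String record;
            while ((record = pending.poll()) != null) {
                delegate.commit(record);
                processed++;
            }
            return processed;
        }

        int pendingCount() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        HeavyCallWrapUp heavy = new HeavyCallWrapUp();
        QueuedCallWrapUp fast = new QueuedCallWrapUp(heavy);
        fast.commit("call-1");
        fast.commit("call-2");
        System.out.println("pending: " + fast.pendingCount());  // pending: 2
        System.out.println("processed: " + fast.drain());       // processed: 2
        System.out.println("committed: " + heavy.committed);    // committed: [call-1, call-2]
    }
}
```

Because both classes implement the same interface, the client-facing code only needs to be handed the queued implementation instead of the heavy one.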

Asynchronous processing in Java from a servlet

I currently have a tomcat container -- servlet running on it listening for requests. I need the result of an HTTP request to be a submission to a job queue which will then be processed asynchronously. I want each "job" to be persisted in a row in a DB for tracking and for recovery in case of failure. I've been doing a lot of reading. Here are my options (note I have to use open-source stuff for everything).
1) JMS -- use ActiveMQ (but who is the consumer of the job in this case another servlet?)
2) Have my request create a row in the DB. Have a separate servlet inside my Tomcat container that always runs; it uses Quartz Scheduler or utilities provided in java.util.concurrent to continuously process the rows as jobs (using thread pooling).
I am leaning towards the latter because looking at the JMS documentation gives me a headache, and while I know it's a more robust solution, I need to implement this relatively quickly. I'm not anticipating huge amounts of load in the early days of deploying this server in any case.
A lot of people say Spring might be good for either 1 or 2. However I've never used Spring and I wouldn't even know how to start using it to solve this problem. Any pointers on how to dive in without having to re-write my entire project would be useful.
Otherwise if you could weigh in on option 1 or 2 that would also be useful.
Clarification: The asynchronous process would be to screen scrape a third-party web site, and send a message notification to the original requester. The third-party web site is a bit flaky and slow, and that's why it will be handled as an asynchronous process (with several retry attempts built in). I will also be pulling files from that site and storing them in S3.
Your Quartz Job doesn't need to be a Servlet! You can persist incoming Jobs in the DB and have Quartz started when your main Servlet starts up. The Quartz Job can be a simple POJO and check the DB for any jobs periodically.
However, I would suggest taking a look at Spring. It's not hard to learn and is easy to set up within Tomcat. You can find a lot of good information in the Spring reference documentation. It has Quartz integration, which is much easier than doing it manually.
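The "persist jobs, then have a POJO check for them periodically" approach from this answer can be sketched without Quartz or a database. Two substitutions are made here and both are assumptions: a `ScheduledExecutorService` from java.util.concurrent replaces Quartz, and in-memory maps replace the jobs table. `JobTablePoller` and its methods are hypothetical names, and jobs are picked up in no particular order.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class JobTablePoller {
    enum State { PENDING, DONE, FAILED }

    // In-memory stand-ins for the jobs table: id -> state, id -> work to do.
    private final Map<Long, State> states = new ConcurrentHashMap<>();
    private final Map<Long, Runnable> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    /** What the request-handling servlet would do: insert a PENDING row. */
    long enqueue(Runnable job) {
        long id = nextId.getAndIncrement();
        states.put(id, State.PENDING);
        pending.put(id, job);
        return id;
    }

    State state(long id) {
        return states.get(id);
    }

    /** One polling pass: run every pending job and record the outcome. */
    int pollOnce() {
        int handled = 0;
        for (Long id : pending.keySet()) {
            Runnable job = pending.remove(id);
            if (job == null) continue; // already taken by another pass
            try {
                job.run();
                states.put(id, State.DONE);
            } catch (RuntimeException e) {
                states.put(id, State.FAILED);
            }
            handled++;
        }
        return handled;
    }

    /** Starts periodic polling; the caller is responsible for shutting the scheduler down. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        JobTablePoller poller = new JobTablePoller();
        long id = poller.enqueue(() -> System.out.println("scraping..."));
        ScheduledExecutorService scheduler = poller.start(200);
        Thread.sleep(500);
        scheduler.shutdown();
        System.out.println("job " + id + ": " + poller.state(id));
    }
}
```

In a Tomcat deployment the scheduler would more likely be started and stopped from a `ServletContextListener` than from `main`, and retries for the flaky third-party site could be handled by putting a failed job back into the pending set.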
A suitable solution which will not require you to do a lot of design and programming is to create the object you will need later in the servlet, and serialize it to a byte array. Then put that in a BLOB field in the database and be done with it.
Then your processing thread can just read the contents, deserialize it and work with the resurrected object.
But, you may get better answers by describing what you need your system to actually DO :)
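For the serialize-to-BLOB suggestion in this answer, a minimal round trip with standard Java serialization might look like the sketch below. `JobBlobCodec` and `ScrapeJob` (with its two fields) are made-up examples; the actual payload depends on what the job needs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class JobBlobCodec {
    /** Hypothetical job payload; the fields are illustrative only. */
    static final class ScrapeJob implements Serializable {
        private static final long serialVersionUID = 1L;
        final String url;
        final String notifyAddress;

        ScrapeJob(String url, String notifyAddress) {
            this.url = url;
            this.notifyAddress = notifyAddress;
        }
    }

    /** Serializes the job to the byte array that would go into the BLOB column. */
    static byte[] toBytes(Serializable job) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(job);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds the object from the bytes read back from the BLOB column. */
    static Object fromBytes(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ScrapeJob original = new ScrapeJob("http://example.com/report", "user@example.com");
        byte[] blob = toBytes(original);
        ScrapeJob restored = (ScrapeJob) fromBytes(blob);
        System.out.println(restored.url + " -> " + restored.notifyAddress);
    }
}
```

With JDBC the byte array can be written with `PreparedStatement.setBytes` and read back with `ResultSet.getBytes`. Two caveats: serialized objects can break when the class changes between deployments, and only bytes your own application wrote should ever be deserialized.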
