Intercepting outgoing servlet http requests

Intercepting outgoing servlet http requests - java

I am working on a school project in which I query and receive some fairly large XML documents from a central server. This was fine in the beginning, as I was rarely making these requests (HTTP GET), but as the project progressed I came up with more things to do with this data, and now I have servlets requesting 3 or 4 XML documents, each in it's own separate GET request, which is causing upwards of 25 seconds page generation times.
It's not possible to change the way the data is served, neither the way in which it's requested as I have a fairly large code base, and it's not as decoupled as it perhaps should have been.
Is there a smart way to listen in on when my servlets execute these GET requests, intercept them, and perhaps supply them with a local, cached version instead? The data is not THAT volatile, so 'live' data is not needed.
So far, I have not been able to find information on listening on OUTgoing requests made by Tomcat...

I think a lot will depend upon your cache hit ratio. If the same 3-4 documents (or some small group of documents) are being requested on a regular basis, a local caching proxy server (like Squid) might be a possibility. Java can be configured to use a proxy server for HTTP requests.

You can probably implement this using HttpFilter. It can be used as a cache. If requested document is already in cache just return it; if not forward the HTTP request to your servlet.

I ended up using a ContextListener to load most of the data at start-up in addition to an 'expiry date' into the servlet context attributes. It makes for some slow start-ups (9 GetRequests to the central server!) but drastically reduces our page load times.

Related

Designing "ahead-of-time" caching with JCS

I have been trying to implement an "ahead-of-time" caching scheme. Essentially, I have a web app that users interact with to retrieve resources–from a cache if the resource has been requested before, and from elsewhere on a cache miss.
I would like to populate the cache ahead of time. This would look like the user providing specific parameters in the request that will kick the web app off and get it requesting resources and caching them.
The wrinkle in the problem is that if a user requests a specific resource while my web app is requesting resources and caching them, priority needs to be given to the user's request. Once the user has received the resource, the "ahead-of-time" caching should kick back in.
I'm new to servlets, and I have no idea how to make two requests talk like that. How could I "postpone" the operation of some function based off of a separate request to my web app?
I'd be happy to provide any relevant code that you all would find useful. My caching class is almost identical to the one on the JCS website.
Thanks,
Kevin

Restful Webservices using Java, Apache Axis2, Hibernate and MySQL and its performance

I read somewhere use of webservcies in apps. After a lot of research I am able to create one Webservice which will accept Json and JsonP both format as request and response accordingly. I developed the webservcies using Java, Apache Axis2, Hibernate and MySQL as database. there are few problems and I dont know how to solve ?
Insert or delete option, sometimes if at a time more than two users call that service that is insert or delete any row the queries go in sleep mode and next time someone tries to fetch that service he couldnt. Accroding to server log it says error SQL Lockout State. If I checks Processlist in MYSQL it is showing that query in Sleep, I have to kill to resume.
The performance of webservice doesnt seems to be upto mark, it takes time some more time as what i experienced it shouldn't. In simple words how to obtain better performance by the services
How to implement security feature such that if a user logins he/she can be provided an id and validation of that id so that unauthorized access can be prevented
Or just guide me what should be the most appropriate and optmized Webservice methodology that can be used using Java

Answer to this question is not specific to Android. Below are my investigations which might be useful for you.
For the point about MySQL connections going to sleep mode, you can do the following.
Debug the datasource used by Hibernate, try to increase the pool size & check for any issues in it.
Define a timeout period for connections. JBoss has several configurations related to this like blocking-timeout-millis, idle-timeout-minutes etc.
Declare a mechanism to validate periodically the connection resources in the pool for activeness. You can explore OracleStaleConnectionChecker for options.
Configure miniumn connections in the pool. This is important because when all the stale connections are discarded, empty pool needs to be pre-filled & ready with active connections.
Coming to performance of Insert/Delete operations & SQL Lockout State, please try to re-order the sequence of the queries which you are firing to DB at every request. This may not be a deadlock situation but sequencing DB queries correctly will definitely lead to less lockout time and better performance.
This answer may be of use for you. Hibernate: Deadlock found when trying to obtain lock
Web-services which you have developed may require some performance optimization to make them upto the mark. Below are first few steps you can take to bring the performance up.
Avoid nested loops. Every extra parameter in the iterated lust increase the order of the lopp exponentially.
Remove early initialization of objects. This may lead to long unwanted GC cycles.
Apart from above optimizations, there are several frameworks & tools at your service to evaluate the code quality & its performance. PMD, FindBugs, JMeter, Java profiler are few of them to name.
Shishir

You are going to have to profile your server and see where the time is spent. I really like YourKit for doing thread profile. visualvm which comes with the JDK can help also.
There are all sorts of reasons your web service can be slow:
Latency from client to server
Handling the HTTP request on the server
Handling the HTTP response on the client
Making the database call (sounds like you already have some kind of locking / blocking going on there)
You are going to have to get markers to tell you how long it took to go from A to B to C to D back to C back to B back to A kind of thing. We would be speculating heavily from here on what is exactly going on in your program, but we can give you the ideas / tools to figure it out.
If you use YourKit, connect it to your server process. Have nothing else connecting to your server (for instance your client is not sending requests). Try it with your client requesting, you should see your accepting threads receive the HTTP request and then delegate to either your processing thread or do the processing itself. You can use YourKit to see how much time is spent in different functions during that call time.
Try it with your client making the call.
Try it using a simple HTTP request tool like wget or maybe your IDE has a webservice test tool (for instance intellij does), or you can download a simple HTTP test tool.
By testing it in a simple tool that just outputs the response, you can eliminate any client processing issues. You can also achieve a similar test in Chrome or Firefox and use the developer tools to see time to fulfill request.
In my experience, the framework for handling the requests and delegating can introduce some performance issues. I ripped Grails out of a production environment because of its performance issues (before any Grails / Groovy flames come my way, we were operating at a much higher rate than typical web applications, and I am sure Grails has made some headway in the last couple years... alas, it was not for my need at that time)
BTW, I doubt you are operating a load where you will be critiquing the web service framework you chose to use. I have been happy with Spring MVC and DropWizard (Jersey JAX-RS), and Grails is easy to use too.
You should make a simple static content response in your webservice and see how quickly that returns vs a request that makes a database call.
Also, what kind of table are you using in MySQL? InnoDB? MyISAM? They have different locking schemes. That could be causing your MySQL issue.
The key to all of it, break the problem up into parts, and measure each and eliminate parts one by one till you go, everytime I do X it is slower (like everytime I make a database call its slower)

In Java the the way you will be able to find more support online via documentation/forums is to develop the web service as a REST web service using Spring MVC.
You can base yourself on this resource and take it from there:
Spring MVC REST Hello World Web Service

Using Spring you can create a RestFul webservice easily and spring does all the ground work you needed. As others had mentioned you can consume the webservice in any type of client - including Android.
A detailed guide available here:
https://spring.io/guides/gs/rest-service/

Here are my suggestions:
Make APIs only read or write database. If an API combines reading and writing, it is possible to cause deadlock;
Use a light-weight HTTP server. Powerful HTTP server is possibly consuming more.
Make use of thread. Have more threads could be helpful when you are facing a ton of users.
Make more things static. You could avoid unnecessary queries.
I think mhoglan's answer is detailed enough.

Enable resuming interrupt download in REST web service using Java

I am writing a REST web service for clients to download large data files. As part of this, I would like to implement a feature to enable resuming interrupt downloads in case an exception occurs or the connection is lost on the original request.
I did some research online and found that supporting Range/If-Range properties in the request header might be the solution, as indicated in http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html.
My question are
In the scope of REST web service, is it the most commonly used and best practice to support Range/If-Range properties in the client HTTP request header, or just pass the byte offset as a query parameter in the client GET request, e.g., hostname:port/download?token=?byteoffset=??
If taking the former approach, on the server side, is there a standard way to handle request with Range field in JAX-RS specification (I am using Java)? The straightforward way is to just open an InputStream from the file and bypass the given # of bytes.

In general, don't use parameters that have to do with meta-information on the resource (or the part of it you need), so you should be using the Range, and keep sure the server allows that.
Note that, for example, byteoffset is not a meaningful part, component or semantically interesting bit of the resource itself, but a way of overcoming partial content (also, identical for all the resources, so you have to use the headers allowed for that, and hey! they're there for that).

How to implement rate limiting based on a client token in Spring?

I am developing a simple REST API using Spring 3 + Spring MVC. Authentication will be done through OAuth 2.0 or basic auth with a client token using Spring Security. This is still under debate. All connections will be forced through an SSL connection.
I have been looking for information on how to implement rate limiting, but it does not seem like there is a lot of information out there. The implementation needs to be distributed, in that it works across multiple web servers.
Eg if there are three api servers A, B, C and clients are limited to 5 requests a second, then a client that makes 6 requests like so will find the request to C rejected with an error.
A recieves 3 requests \
B receives 2 requests | Executed in order, all requests from one client.
C receives 1 request /
It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.
The set up will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backed, and memcached is used. The web servers will be running on Jetty.
One potential solution might be to write a custom Spring Security filter that extracts the token and checks how many requests have been made with it in the last X seconds. This would allow us to do some things like different rate limits for different clients.
Any suggestions on how it can be done? Is there an existing solution or will I have to write my own solution? I haven't done a lot of web site infrastructure before.

It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.
The set up will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backed, and memcached is used. The web servers will be running on Jetty.
I think the project is request/response http(s) protocol. And you use HAProxy as fronted.
Maybe the HAProxy can load balancing with token, you can check from here.
Then the same token requests will reach same webserver, and webserver can just use memory cache to implement rate limiter.

I would avoid modifying application level code to meet this requirement if at all possible.
I had a look through the HAProxy LB documentation nothing too obvious there, but the requirement may warrant a full investigation of ACLs.
Putting HAProxy to one side, a possible architecture is to put an Apache WebServer out front and use an Apache plugin to do the rate limiting. Over-the-limit requests are refused out front and the application servers in the tier behind Apache are then separated from rate limit concerns making them simpler. You could also consider serving static content from the Web Server.
See the answer to this question How can I implement rate limiting with Apache? (requests per second)
I hope this helps.
Rob

You could put rate limits at various points in the flow (generally the higher up the better) and the general approach you have makes a lot of sense. One option for the implementation is to use 3scale to do it (http://www.3scale.net) - it does rate limits, analytics, key managed etc. and works either with a code plugin (the Java plugin is here: https://github.com/3scale/3scale_ws_api_for_java) which pushes or by putting something like Varnish (http://www.varnish-cache.org) in the pipeline and having that apply rate limits.

I was also thinking of the similar solutions a couple of day's ago. Basically, I prefer the "central-controlled" solution to save the state of the client request in the distributed environment.
In my application, I use a "session_id" to identify the request client. Then create a servlet filter or spring HandlerInterceptorAdapter to filter the request, then check the "session_id" with the central-controlled data repository, which could be memcached, redis, cassandra or zookeeper.

We use redis as leaky bucket backend
Add a controller as entrance
google cache that token as key with expired time
then filter every request

It is best if you implement ratelimit using REDIS. For more info please look this Rate limiting js Example.

Most Efficient Way of calling an external webservice in Java?

In one of our applications we need to call the Yahoo Soap Webservice to Get Weather and other related info.
I used the wsdl2java tool from axis1.4 and generated th required stubs and wrote a client. I use jsp's use bean to include the client bean and call methods defined in the client which call the yahoo webservice inturn.
Now the problem: When users make calls to the jsp the response time of the webservice differs greatly, like for one user it took less then 10 seconds and the other in the same network took more than a minute.
I was just wondering if Axis1.4 queues the requests even though the jsps are multithreaded.
And finally is there an efficient way of calling the webservice(Yahoo weather). Typically i get around 200 simultaneous requests from my users.

Why don't you schedule one thread to get the weather every minute or so, and expose that to the JSP, in stead of letting each JSP get its own weather report?
That's a lot more efficient for both you and Yahoo, and JSP's only need to lookup a local object (almost instantaneous) in stead of connecting to a web service.
EDIT
Some new requirements in the comments of this answer suggest a different way of choosing solutions.
It seems that not only weather, which not only doesn't change that often but is also the same for every user, is requested by web service but also other data like flight data.
The requirements for flight data retrieval are very much different than for weather data. So I think you should define a few types of (remote) data and choose a different solution
for each category.
As basis for the requirements I'd use something simple:
Users like their information promptly, they do not like waiting
The amount of data stored on the web server is finite
Remote web services have an EULA of sorts and are probably not happy with 200 concurrent requests of the same data by the same source (you)
Fast data access to users is best achieved by having the data locally, be it transient (kept in a bean) or persistent (a local database). That can be done by periodically requesting data from the remote source, and using the cached data in the JSP. That would also keep you in the clear with the third point.
A finite amount of data stored on the web service means that not everything can be cached. Data which differs per user, or large data sets which can vary over small periods of time, cannot readily be cached. It's not really a good idea to load data on all flights of all airports in the US every minute or so. That kind of requests would be better served by running a specific web service query when necessary.
The trick is now to identify when caching data is feasible. If it is feasible, do that, otherwise run the web service query in the background. That can be done by presenting the JSP now and starting the web service query in the background. The JSP can have an AJAX script which queries your web server whether the data is ready, and insert that data in the page when ready.

I'd use Google tools to monitor how long the call to the web service is taking.
There are several things going on here:
Map Java beans to XML request.
Send XML request to web service.
Unmarshall XML request on web service side.
Web service processes request
Web service marshalles XML response
Web service sends XML response to Java client
Unmarshall XML response and display on client.
You can't see inside the Yahoo web service, but do break out what you can see on the client side to see where the time is spent.
Check memory as well. If Axis is generating .class files, maybe your perm space is being consumed. Visual VM is available to you with the JDK. Attach it to the PID on your client to see what's going on in memory on your app server.
Maybe this would be a good place for an AJAX call. This will be a good solution if you can get the weather in the background while users are doing other things.

I would recommend local caching and data pooling. Instead of sending out 200 separate requests for similar/same locations run a background thread which pulls the weather for only the locations your users are interested in and caches them locally, this cache updates every minute or so. When users request their personal preferences, the requests hit the cache and refetch if the location is new or the data in the cache is stale. This way the user will have a more seamless experience and you will not hit Yahoo throttles and get denied service.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.