Why can the body of the HttpServletRequest not be read multiple times?

I was facing an issue where I read the request using javax.servlet.http.HttpServletRequest.getReader(), and doing so consumed the body. So when I passed the request forward in the filter chain, the downstream code saw the body as null.
I was able to resolve this by following a similar solution:
How can I read request body multiple times in Spring 'HandlerMethodArgumentResolver'?
My question is why are we not allowed to read the request multiple times?

Why can the body of the HttpServletRequest not be read multiple times?
Because that would require the servlet stack to buffer the entire request body ... in case the servlet decides to re-read it. That is going to be a performance and/or memory utilization hit for all requests. And it won't work unless there is enough memory to buffer multiple instances (for multiple simultaneous requests) of the largest anticipated request body.
Note that there is no way to get the client to resend the data, short of failing the HTTP request. Even then, the client may not be able to resend it ... because it may not have been able to buffer the data itself.
In short: rereading the request body is not supported by the servlet APIs because it doesn't scale. If a servlet wants to reread the data, it needs to buffer it itself.
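For completeness, here is a minimal sketch of that do-it-yourself buffering, in the spirit of the wrapper-based solutions in the linked question (the class name and details are illustrative, and it assumes Java 9+ for transferTo):

    import java.io.BufferedReader;
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import javax.servlet.ReadListener;
    import javax.servlet.ServletInputStream;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletRequestWrapper;

    public class CachedBodyRequest extends HttpServletRequestWrapper {
        private final byte[] body;

        public CachedBodyRequest(HttpServletRequest request) throws IOException {
            super(request);
            // Read the underlying stream exactly once and keep a copy.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            request.getInputStream().transferTo(buffer);
            this.body = buffer.toByteArray();
        }

        @Override
        public ServletInputStream getInputStream() {
            // Every caller gets a fresh stream over the cached bytes.
            ByteArrayInputStream source = new ByteArrayInputStream(body);
            return new ServletInputStream() {
                @Override public int read() { return source.read(); }
                @Override public boolean isFinished() { return source.available() == 0; }
                @Override public boolean isReady() { return true; }
                @Override public void setReadListener(ReadListener listener) { /* sync reads only */ }
            };
        }

        @Override
        public BufferedReader getReader() {
            // Simplified: a fuller version would honor getCharacterEncoding().
            return new BufferedReader(new InputStreamReader(getInputStream()));
        }
    }

A filter would then wrap the request once, e.g. chain.doFilter(new CachedBodyRequest((HttpServletRequest) req), res), and every downstream reader gets its own stream over the same buffered bytes. Note that this is exactly the buffering cost described above, so it is best applied selectively (for example, only to small bodies).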

Related

Request atomicity within microservices bounded context

Our project consists of multiple microservices. These microservices form a bounded context with no strictly defined entry point: each microservice can be called from outside and can call the other services.
The situation we need to handle in this bounded microservice context is the following:
the client (another application) makes a request to perform some logic and change data (PATCH),
the request times out,
while the request is still being processed, the client fires the same request to retry the operation,
the first operation completes successfully,
the second request is processed the same way, completes within its time limit, and the client gets the response.
So the same request was processed twice because of the first timeout.
We need to make sure a repeated request does not get processed again and that the application responds with the former response and status code.
A subsequent request is identified by carrying the same uuid.
Now, I understand it's the client that should be requesting more carefully, or that we should have a single request entry point into our microservices bounded context. But in enterprise projects the team doesn't own the whole system, so we are a bit constrained in the solutions we can propose for the problem. With this in mind, and trying not to reinvent the wheel, this is what comes to my mind:
The microservices could utilize some kind of session sharing (spring-session?) with the ability to look up a request by its id before it gets processed. In the described case, when the first request is being processed and the second arrives, the second would wait for the completion of the first and then respond with the data of the first, which timed out for the client.
What I am struggling with is how to handle the asynchronicity of replying to the second request and how to listen for the session state of the first one.
If spring-session were used (for example with Hazelcast), I'm missing some kind of concrete session-state handler that would fire when a request ends. Is there something like this to listen for?
No code written yet; this is an architectural thought experiment that I want to discuss.
If anything is unclear, please read it a second time; I'm happy to expand.
EDIT, first idea. The process would be as follows (the numbering refers to the image):
(1) the first request is fired;
(3) processing starts; (2) meanwhile, the request times out for the client;
(4) the client repeats the same request; the program knows it has received the same request before because it recognizes the request id;
the program checks the cache, sees the state of that request id is 'pending', so it WAITS (asynchronously);
the computed result of the first request is saved into the cache (the orange square);
(5) the program responds to the repeated request with the result that was computed for the first one.
The idea is that the result check and the reply to the repeated request would be done in the filter chain, so the second request never actually hits the controller while it asynchronously waits for the operation triggered by the first request to finish. (I see Hazelcast has events for when entries are added/updated/evicted from the cache; I don't know yet whether that works.) When the operation completes, the filter would just respond, somehow writing to the HttpServletResponse. The result itself would be saved into the cache in a post-handling filter.
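To make the waiting part concrete, a sketch of what such a filter could call (purely illustrative; it assumes the Hazelcast 3.x API and that the 'pending' marker and the final result are stored under separate keys):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;
    import com.hazelcast.map.listener.EntryAddedListener;

    public class RepeatedRequestAwaiter {
        private final IMap<String, String> results;

        public RepeatedRequestAwaiter(HazelcastInstance hz) {
            this.results = hz.getMap("request-results");
        }

        // Called from the filter when a repeated request arrives while the first is 'pending'.
        public String awaitResult(String requestId, long timeoutSeconds) throws Exception {
            CompletableFuture<String> done = new CompletableFuture<>();
            // Fires when the first request's result is written into the cache under this key.
            String listenerId = results.addEntryListener(
                    (EntryAddedListener<String, String>) e -> done.complete(e.getValue()),
                    requestId, true);
            try {
                // Re-check after registering, in case the result arrived in between.
                String existing = results.get(requestId);
                if (existing != null) {
                    return existing;
                }
                return done.get(timeoutSeconds, TimeUnit.SECONDS);
            } finally {
                results.removeEntryListener(listenerId);
            }
        }
    }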
Thanks for insights.
I'd consider this more of a caching paradigm. Stick your request/responses into an external cache provider (Redis or similar), indexed by uuid. A TTL will let responses automatically get cleaned up for requests that are never coming back, and the high-speed O(1) lookups should allow this to scale nicely. It will also give you an asynchronous model out of the box (not a stated goal, but always a nice option).
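A minimal sketch of that idea as a servlet filter, using the Jedis client and a client-supplied X-Request-Id header (the header name, key prefix, and TTL are assumptions, and it relies on Servlet 4.0+ where Filter's init/destroy have default implementations):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import redis.clients.jedis.JedisPooled;

    public class IdempotencyFilter implements Filter {
        private final JedisPooled redis = new JedisPooled("localhost", 6379);

        @Override
        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest request = (HttpServletRequest) req;
            HttpServletResponse response = (HttpServletResponse) res;

            // The uuid the client sends to identify retries of the same operation.
            String requestId = request.getHeader("X-Request-Id");
            if (requestId == null) {
                chain.doFilter(req, res);
                return;
            }

            String cached = redis.get("resp:" + requestId);
            if (cached != null) {
                // Repeated request: replay the stored response instead of reprocessing.
                // (Stored status code and content type are glossed over here.)
                response.getWriter().write(cached);
                return;
            }

            chain.doFilter(req, res);

            // After processing, store the response body (captured via a response
            // wrapper, omitted here) with a TTL so abandoned entries expire:
            // redis.setex("resp:" + requestId, 300, capturedBody);
        }
    }

This covers the replay-after-completion case; for the "second request arrives while the first is still running" case from the question, you would additionally store a pending marker under the uuid and wait on its completion before replaying.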

Scalable Way to Combine Same Requests Within Certain Time Threshold

I have an application, call it Service 1, that potentially makes a lot of the same requests to another application, call it Service 2. As an example, x number of people use Service 1 and that results in x requests (which are the exact same request) to Service 2. Each response is cached in Service 1.
Currently, we have a synchronized method that checks whether the same request has been made within a certain time threshold. The problem we are having is that when the server is under heavy load, that synchronized method locks up the threads, Kubernetes can't perform liveness checks, and so Kubernetes restarts the service. The reason we want to prevent duplicate requests is twofold: 1) we don't want to hammer Service 2, and 2) if we are already making the request, we don't want to make it again; we just want to wait for the result that is already coming back.
What is the fastest, most scalable solution to not making duplicate requests without locking up and taking down the server?
FWIW, my experience with rx-java specifically is very limited, so I'm not entirely confident how applicable this is to your case. This is a solution I've used several times with Scala, and I know Java itself has analogous constructs that allow the same approach.
A solution I have used in the past that has worked very well for me involves Futures. It does not eliminate duplication entirely, but it does remove duplication per requesting server. The approach involves a TTL cache in which we store the Future object that does or will contain the result of a request we want to deduplicate. It is stored under a key that determines the uniqueness of the request, such as the parameters that apply to it.
So let's say you have a method that you call to fetch the response from Service 2 and that returns it as a Future. As an example, we'll call it getPage; it has one parameter, an integer, which is the page you'd like to fetch.
When a request begins and we're about to call getPage with a page number of 2, we check the cache for a key like "getPage:2". This won't contain anything for the first request, so we call getPage(2), which returns a Future[SomeResponseObject], and we set "getPage:2" in the TTL cache to that Future object. When another request comes in that would spawn a duplicate request, the same cache check happens; this time, however, there is already a Future object in the cache. We take this Future and add a response listener to be invoked when the response is available, or in Scala, simply .map() on it.
This has a few advantages. If your request is slow or there are highly duplicative requests even in a small time frame, many requests to Service 1 are serviced by a single response from Service 2.
Secondly, once the request to Service 2 has come back, assuming you have a window in which the response is still valid, the response is available immediately and no request is necessary at all.
If your Service 2 request takes 50 ms, and your response can be considered valid for 5 seconds, all requests arriving at the same server in the first 50 ms are serviced at the 50 ms mark when the response returns, and for the remaining 4950 ms the response is already available.
As I alluded to earlier, the effectiveness here is tied to how many instances of Service 1 are running: the number of duplicate requests at any time is linear in the number of servers running.
This is a mostly lock-free way to achieve this. I say mostly because some synchronization is necessary in the TTL cache itself to make sure the request is only started once, but that has never been a performance issue in my experience.
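In plain Java (rather than rx-java or Scala), a minimal sketch of this pattern could look like the following; getPage, the 5-second validity window, and the String response type are stand-ins from the example above:

    import java.util.Map;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class DedupingClient {
        private final Map<String, CompletableFuture<String>> cache = new ConcurrentHashMap<>();
        private final ScheduledExecutorService evictor =
                Executors.newSingleThreadScheduledExecutor();

        public CompletableFuture<String> getPage(int page) {
            String key = "getPage:" + page;
            // computeIfAbsent supplies the per-key synchronization mentioned above:
            // only one caller starts the real request; everyone else gets the same Future.
            return cache.computeIfAbsent(key, k -> {
                CompletableFuture<String> future = fetchFromService2(page);
                future.whenComplete((result, error) -> {
                    if (error != null) {
                        cache.remove(k, future); // failed: allow an immediate retry
                    } else {
                        // Keep the completed result for the validity window, then evict.
                        evictor.schedule(() -> cache.remove(k, future), 5, TimeUnit.SECONDS);
                    }
                });
                return future;
            });
        }

        private CompletableFuture<String> fetchFromService2(int page) {
            // Placeholder for the real HTTP call to Service 2.
            return CompletableFuture.supplyAsync(() -> "response-for-page-" + page);
        }
    }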
As an extension of this, you can potentially use something like Redis to cache responses from Service 2 if it has longish response times, and have your getPage equivalent first check a Redis cache for the serialized response (writing an expiring value if one wasn't there). This lets you reduce requests to Service 2 further by caching a more global value, but having a second caching layer does add some complexity and potential for issues.

Having a "worker" in Java

I have a REST API created in Java with the Spark framework, but right now a lot of work is being done on the request thread that is significantly slowing down requests.
I want to solve this by creating some kind of background worker/queue that will do all the needed work off of the request thread. The response from the server contains data that the client will need (it's data that will be displayed). In these examples the client is a web browser.
Here's what the current cycle looks like
API request from client to server
Server does blocking work; Response from server after several seconds/minutes
Client receives response. It has all the data it needs in the response
Here's what I would like
API request from client to server
Server does work off-thread
Client receives response from server almost instantly, but it doesn't have the data it needs. This response will contain some ID (Integer or UUID), which can be used to check the progress of the work being done
Client regularly checks the status of the work being done, the response will contain a status (like a percentage or time estimate). Once the work is done, the response will also contain the data we need
What I dislike about this approach is that it will significantly complicate my API. If I want to get any data, I will have to make two requests. One to initiate the blocking work, and another to check the status (and get the result of the blocking work). Not only will the API become more complicated, but the backend will too.
Is this efficient, or is there a better way to implement what I want to accomplish?
Neither way is more efficient than the other, since the same amount of work takes the same time in either case. In the first case the work is done on the request thread: the client does not know about progress, and the request takes as long as the task takes to run. The client waits on the reply.
In the second case you add complexity, but you get progress status and possibly other advantages depending on the task. The client polls for the reply.
You can use async processing to perform work on non-request threads, but that probably won't make any difference if most of your requests are long-running ones. So it's up to you to decide what you want; the client will have to wait the same amount of time either way.
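For what it's worth, if you do go the polling route, a minimal sketch in Spark (the framework the question mentions) could look like this; the route paths, status strings, and doExpensiveWork are placeholders:

    import static spark.Spark.get;
    import static spark.Spark.post;

    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class JobApi {
        // In-memory job store; a real service might persist this or use a TTL cache.
        private static final Map<String, String> jobs = new ConcurrentHashMap<>();
        private static final ExecutorService worker = Executors.newFixedThreadPool(4);

        public static void main(String[] args) {
            post("/jobs", (req, res) -> {
                String id = UUID.randomUUID().toString();
                String payload = req.body(); // read on the request thread
                jobs.put(id, "PENDING");
                worker.submit(() -> jobs.put(id, "DONE:" + doExpensiveWork(payload)));
                res.status(202); // Accepted: work continues off the request thread
                return id;       // the client polls with this id
            });

            get("/jobs/:id", (req, res) ->
                    jobs.getOrDefault(req.params(":id"), "UNKNOWN"));
        }

        private static String doExpensiveWork(String input) {
            // Placeholder for the blocking work.
            return "result-" + input.hashCode();
        }
    }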

Is there a way to tell the servlet container to spawn one instance of a resource at a time?

I have a resource, say a #POST method serving the clients. It doesn't depend on any external parameters, not even the caller URL (we're leaving that to the firewall) or user authentication.
However, we don't want to handle user requests simultaneously. When request1 is being processed and the method hasn't yet returned, an incoming request2 should receive a response with status 309 (or whatever status code applies) and should not get served.
Is there a way of doing this without getting into anything on the server back-end side like multithreading?
I'm using Tomcat 8. The application will be deployed on JBoss; however, this shouldn't affect the outcome(?). I used Jersey 1.19 to code the resource.
This question is related to How to ignore multiple clicks from an impatient user?.
TIA.
Depending on what you want to achieve, yes, it is possible to reject additional requests while a service is "in use." I don't know if it's possible at the servlet level; servlets are designed to serve as many requests as possible concurrently so that, say, if one user requests something simple and another requests something difficult, the simple request can be handled while the difficult request is still processing.
The primary reason you would probably NOT want to return an HTTP error code simply because a service is in use is that the service didn't error; it was simply in use. Imagine trying to use a restroom that someone else was using and instead of "in use" the restroom said "out of order."
Another reason to think twice about a service that rejects requests while it is processing any other request is that it will not scale. Period. You will have some users have their requests accepted and others have their requests rejected, seemingly at random, and the ratio will tilt toward more rejections the more users the service has. Think of calling into the radio station to try to be the 9th caller, getting a busy tone, and then calling back again and again until you get through. This works for trying to win free tickets to a concert, but would not work well for a business you were a customer of.
That said, here are some ways I might approach handling expensive, possibly duplicate, requests.
If you're trying to avoid multiple identical/simultaneous requests from an impatient user, you most likely have a UX problem (e.g. a web button doesn't seem to respond when clicked because of processing lag). I'd implement a loading mask or something similar to prevent multiple clicks and to communicate that the user's request has been received and is processing. Loading/processing masks have the added benefit of giving users an abstract feeling of ease and confidence that the service is indeed working as expected.
If there is some reason out of your control why multiple identical requests might get triggered coming from the same source, I'd opt for a cache that returns the processed result to all requests, but only processes the first request (and retrieves the response from the cache for all other requests).
If you really really want to return errors, implement a singleton service that remembers a cache of some number of requests, detects duplicates, and handles them appropriately.
Remember that if your use case is indeed multiple clicks from a browser, you likely want to respond to the last request sent, not the first. If a user has clicked twice, the browser will register the error response first (it will come back immediately as a response to the last click). This can further undermine the UX: a single click results in a delay, but two clicks results in an error.
But before implementing a service that returns an error, consider the following: what if two different users request the same resource at the same time? Should one really get an error response? What if the quantity of requests increases during certain times? Do you really want to return errors to what amounts to a random selection of consumers of the service?
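If, despite those caveats, you still want the reject-while-busy behavior the question asks for, a minimal sketch with Jersey/JAX-RS could use a Semaphore (the path, status code, and messages are illustrative; the Semaphore is static because Jersey creates a new resource instance per request):

    import java.util.concurrent.Semaphore;
    import javax.ws.rs.POST;
    import javax.ws.rs.Path;
    import javax.ws.rs.core.Response;

    @Path("/task")
    public class SingleFlightResource {
        // One permit: at most one request is served at a time.
        private static final Semaphore permit = new Semaphore(1);

        @POST
        public Response run() {
            if (!permit.tryAcquire()) {
                // Reject the concurrent caller without blocking a thread.
                // 409 Conflict (or 503 with Retry-After) is a defensible choice;
                // note that 309 is not a registered HTTP status code.
                return Response.status(Response.Status.CONFLICT)
                        .entity("Request already in progress").build();
            }
            try {
                return Response.ok(doWork()).build();
            } finally {
                permit.release();
            }
        }

        private String doWork() {
            // Placeholder for the actual processing.
            return "done";
        }
    }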

Java Servlets: How to repeat an HTTP request?

I'd like to repeat an HTTP request automatically if a database deadlock occurs; however, FilterChain.doFilter() is defined as a unidirectional chain (so I cannot reset its state).
In cases where it's safe to do so, is it possible to repeat an HTTP request without having the client re-submit the request?
UPDATE: I just discovered a problem with this approach. Even if you repeat the request, you will need to buffer the request InputStream. This means that if a user uploads 100MB of data, you'll be forced to buffer that data regardless of whether a deadlock occurs.
I am exploring the idea of getting the client to repeat the request here: Is it appropriate to return HTTP 503 in response to a database deadlock?
Answering my own question:
Don't attempt to repeat an HTTP request. In order to do so you are going to be forced to buffer the InputStream for all requests, even if a deadlock never occurs. This opens you up to denial-of-service attacks if you are forced to accept large uploads.
I recommend this approach instead: Is it appropriate to return HTTP 503 in response to a database deadlock?
You can then break large uploads into multiple requests stitched together using AJAX. Not pretty, but it works, and on the whole the design should be easier to implement.
UPDATE: According to Brett Wooldridge:
You want a small pool of a few dozen connections at most, and you want the rest of the application threads blocked on the pool awaiting connections.
Just as Hikari recommends a small number of threads with a long queue of requests, I believe the same holds true for the web server: by limiting the number of active threads, we limit the number of InputStreams we need to buffer (the remaining requests are blocked before they send the HTTP body).
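In Tomcat 8 terms, that thread limiting is a Connector setting in server.xml; the numbers here are purely illustrative:

    <Connector port="8080" protocol="HTTP/1.1"
               maxThreads="50"
               acceptCount="200"
               connectionTimeout="20000" />

maxThreads caps how many requests are being processed (and therefore buffered) at once, while acceptCount lets the remaining connections queue at the TCP level before any request body is read.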
To further reinforce this point, Craig Ringer recommends recovering from failures on the server side where possible.
You can do a 'forward' of the original request, like below:

    RequestDispatcher rd = request.getRequestDispatcher("/originalUrl");
    rd.forward(request, response);

Here request and response are the HttpServletRequest and HttpServletResponse respectively. See
http://docs.oracle.com/javaee/5/api/index.html?javax/servlet/RequestDispatcher.html
Alternatively you can do a redirect on the response. This, however, sends a response to the browser asking it to issue a new request for the provided url, as shown below:

    response.sendRedirect("originalUrl?neededParam=abc");
