Does a servlet engine read the whole request before calling a servlet? - java

A servlet engine (e.g. Tomcat or Jetty) receives an HTTP request and calls a servlet with an HttpServletRequest object, which exposes the request body as an InputStream.
I wonder whether the engine has already read the whole request from the network, so that the InputStream is just an in-memory buffer, or whether it has read only part of the request and a call to InputStream.read actually reads from the socket.

Usually that's not the case, because the request body can be really huge. A servlet container MAY do that if the content length is known and is small enough.

It has to, at least in the case of POST, so it can form the requestParameterMap from the name-value pairs in the body of the request.

You can read this article: "Where does the HttpServletInputStream read from" (Zhihu), which I wrote myself. It is written in Chinese, so if you don't read Chinese, here is the conclusion:
picture source: https://tomcat.apache.org/tomcat-8.0-doc/config/http.html
You can also find some Tomcat background in this question: What is "Sim blocking" (seen in tomcat doc)?
In Tomcat, HttpServletRequest.getInputStream reads data from NioChannel (see the NioChannel source code on GitHub).
Several buffer objects are used along the way, but they are all small buffers. Ultimately the data is read a little at a time from the NioChannel, that is, directly from the socket.
We read the byte stream from CoyoteInputStream directly.
CoyoteInputStream copies or reads the byte stream from the InputBuffer.
InputBuffer reads data from the low-level request object, CoyoteRequest (class org.apache.coyote.Request).
CoyoteRequest reads data from Http11InputBuffer if we use the HTTP/1.1 Connector.
Http11InputBuffer reads data from NioSocketWrapper.
NioSocketWrapper reads data from NioChannel.
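Since the body arrives a little at a time, servlet code that handles large uploads would normally consume the stream in small pieces rather than load it all into memory. The following is a minimal sketch of such a loop, not container code; the class name StreamingCopy and the buffer size are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies a stream through a small fixed-size buffer.
final class StreamingCopy {

    // Reads from 'in' until end of stream and writes every byte to 'out'.
    // At most 'bufferSize' bytes are held in memory at any time.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // read() may block until the underlying source has more data
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

In a servlet this would be called along the lines of StreamingCopy.copy(request.getInputStream(), target, 8192), where target is whatever OutputStream the data should end up in.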

Related

streaming response body to a streaming request body with java 11 HttpClient

I'm trying to stream the data from an HTTP (GET) response to another HTTP (POST) request. With the old HttpURLConnection I would take the response's InputStream, read parts into a buffer and write them to the request's OutputStream.
I've already managed to do the same with HttpClient in Java 11 by creating my own Publisher that is used in the POST to write the request body. The GET request has a BodyHandler with ofByteArrayConsumer that sends the chunks to the custom Publisher which itself then sends the chunks to the subscribing HTTP POST request.
But I think this is not the correct approach, because the API looks as if this could be done directly, without implementing publishers and subscribers myself.
There is HttpResponse.BodyHandlers.ofPublisher(), which returns a Publisher<List<ByteBuffer>> that I can use for the HTTP GET request. Unfortunately, for my POST request there is HttpRequest.BodyPublishers.fromPublisher, which expects a Publisher<? extends ByteBuffer>, so it seems that fromPublisher only works for a publisher that holds a complete ByteBuffer and not for one that sends several ByteBuffers for parts of the data.
Am I missing something that would let me connect the body publisher of one request to the other?
You're not missing anything. This is simply a use case that is not supported out of the box for now. Though the mapping from ByteBuffer to List<ByteBuffer> is trivial, the inverse mapping is less so. One easy (if not optimal) way to adapt from one to the other could be to collect all the buffers in the list into a single buffer - possibly combining HttpResponse.BodyHandlers.ofPublisher() with HttpResponse.BodyHandlers.buffering() if you want to control the amount of bytes in each published List<ByteBuffer> that you receive from upstream.
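The "collect all the buffers in the list into a single buffer" idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the class ListFlattener and its method names are made up here, and each incoming list is copied into one new heap buffer, which is simple but not optimal.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.Flow;

// Hypothetical adapter from Publisher<List<ByteBuffer>> to Publisher<ByteBuffer>.
final class ListFlattener {

    // Copies the remaining bytes of every buffer in the list into one new buffer.
    static ByteBuffer merge(List<ByteBuffer> buffers) {
        int total = 0;
        for (ByteBuffer b : buffers) {
            total += b.remaining();
        }
        ByteBuffer merged = ByteBuffer.allocate(total);
        for (ByteBuffer b : buffers) {
            merged.put(b.duplicate()); // duplicate() leaves the source position untouched
        }
        merged.flip();
        return merged;
    }

    // One list in, one buffer out, so the downstream demand can be passed through as is.
    static Flow.Publisher<ByteBuffer> flatten(Flow.Publisher<List<ByteBuffer>> upstream) {
        return downstream -> upstream.subscribe(new Flow.Subscriber<List<ByteBuffer>>() {
            @Override public void onSubscribe(Flow.Subscription s) { downstream.onSubscribe(s); }
            @Override public void onNext(List<ByteBuffer> item) { downstream.onNext(merge(item)); }
            @Override public void onError(Throwable t) { downstream.onError(t); }
            @Override public void onComplete() { downstream.onComplete(); }
        });
    }
}
```

With a response obtained via HttpResponse.BodyHandlers.ofPublisher(), the POST body could then be built as HttpRequest.BodyPublishers.fromPublisher(ListFlattener.flatten(response.body())).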

HttpServletRequest.getInputStream() does not unwrap chunked HTTP request

I am in the process of sending a HTTP chunked request to an internal system. I've confirmed other factors are not at play by ensuring that I can send small messages without chunk encoding.
My process was basically to change the Transfer-Encoding header to be chunked and I've removed the Content-Length header. Additionally, I am utilising an in-house ChunkedOutputStream which has been around for quite some time.
I am able to connect, obtain an output stream and send the data. The recipient then returns a 200 response so it seems the request was received and successfully handled. The endpoint receives the HTTP Request, and streams the data straight into a table (using HttpServletRequest.getInputStream()).
On inspecting the streamed data I can see that the chunk encoding information in the stream has not been unwrapped/decoded by the Tomcat container automatically. I've been trawling the Tomcat HTTPConnector documentation and can't find anything that alludes to the chunked encoding w.r.t how a chunk encoded message should be handled within a HttpServlet. I can't see other StackOverflow questions querying this so I suspect I am missing something basic.
My question boils down to:
Should Tomcat automatically decode the chunked encoding from my request and give me a "clean" InputStream when I call HttpServletRequest.getInputStream()?
If yes, is there configuration that needs to be updated to enable this functionality? Am I sending something wrong in the headers that is causing it to return the non-decoded stream?
If no, is it common practice to wrap input stream in a ChunkedInputStream or something similar when the Transfer-Encoding header is present ?
This is solved. As expected it was basic in my case.
The legacy system I was using provided hand-rolled methods to simplify the process of opening an HTTP connection, sending headers and then using an OutputStream to send the content via a POST. I didn't realise it, because it was in a rather obscure location, but the behind-the-scenes helpers were detecting that I was not specifying a Content-Length, and so they added the TRANSFER_ENCODING=chunked header and wrapped the OutputStream in a ChunkedOutputStream. As a result I was double-encoding the contents, hence my endpoint's (seeming) inability to decode it.
Case closed.
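To see why double encoding looks like "Tomcat did not decode the stream", here is a small sketch of chunked framing. It is a simplified illustration, not the in-house ChunkedOutputStream or Tomcat's decoder: the class ChunkedDemo is hypothetical and ignores chunk extensions and trailers.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified chunked transfer coding (no extensions, no trailers).
final class ChunkedDemo {

    // Frames the body as: <hex size>CRLF<data>CRLF ... 0 CRLF CRLF
    static byte[] encode(byte[] body, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            writeAscii(out, Integer.toHexString(len) + "\r\n");
            out.write(body, off, len);
            writeAscii(out, "\r\n");
        }
        writeAscii(out, "0\r\n\r\n");
        return out.toByteArray();
    }

    // Removes exactly one layer of chunk framing.
    static byte[] decode(byte[] wire) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int eol = indexOfCrlf(wire, pos);
            String sizeLine = new String(wire, pos, eol - pos, StandardCharsets.US_ASCII);
            int len = Integer.parseInt(sizeLine, 16);
            pos = eol + 2;
            if (len == 0) {
                break; // last-chunk marker
            }
            out.write(wire, pos, len);
            pos += len + 2; // skip the data and its trailing CRLF
        }
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    private static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        throw new IllegalArgumentException("missing CRLF after offset " + from);
    }
}
```

If the body is encoded twice and the container decodes once, decode(encode(encode(body))) still yields the inner chunk-size lines and CRLFs, which is what showed up in the table.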

How to keep API Restful when GET request requires sizable JSON payload?

I'm building a Java REST API using JAX-RS, and to complete a GET request for a zip file I need a rather sizeable chunk of JSON. I'm not terribly experienced with REST, but I do know that GET requests shouldn't have a request body and a POST shouldn't be returning a resource. So I guess my question is: how do I complete a request that contains JSON (currently in the message body) and expects a zip file in the response while keeping the application RESTful? It may be worth noting that the JSON could also contain a password.
I have used POST for similar scenarios. This is a common scenario for SEARCH operations, where JSON data needs to be sent in the request. Although using POST to fetch an object does not follow REST conventions, I found it to be the most suitable of the options available.
You can send a body in GET requests, but that is not supported by all frameworks/tools/servers. This link discusses that in detail.
If you use POST for the operation, you can use HTTPS to send confidential information in the body.
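From the client side, such a call could look roughly like the sketch below, using the Java 11 HttpClient API. The URL, the class name ZipRequestSketch and the JSON content are assumptions made for illustration; nothing here is prescribed by JAX-RS.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side sketch: POST the JSON, ask for a zip in return.
final class ZipRequestSketch {

    static HttpRequest build(String json) {
        // Placeholder endpoint; HTTPS keeps the body (and any password in it) off the wire in clear text.
        return HttpRequest.newBuilder(URI.create("https://api.example.com/archives"))
                .header("Content-Type", "application/json")
                .header("Accept", "application/zip")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

The request could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofFile(Path.of("result.zip"))), which writes the response body straight to a file.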
You can think of your REST API as exposing a virtual file system (VFS) in which the zip file you mentioned is just one resource, and in which the files in a certain directory represent queries against that file system. You can then create a new query object by sending a POST request to the queries directory, specifying all the query parameters you need, such as the chunk size and the path of the zip file in the VFS.
The virtual file system I am referring to is actually a directory containing other directories and files that can represent real files on the disk or metadata records in a database.
For example, say you start with the following directory layout in the VFS:
/myvfs
    /files
        /archive.zip
    /queries
To download the archive.zip file you can send a simple GET request:
// Request:
GET /myvfs/files/archive.zip
But this will stream the entire file at once. In order to break it in parts, you can create a query in which you want to download chunks of 1MB:
// Request:
POST /myvfs/queries/archive.zip
{
    "chunk_size": 1048576
}
// Response:
{
    "query_id": 42,
    "chunks": 139
}
The new query lives at the address /myvfs/queries/archive.zip/42 and can be deleted by sending a DELETE request to that URL.
Now, you can download the zip file in parts. Note that the creation of the query does not actually create smaller files for each part, it only provides information about the offsets and the size of the chunks, information that can be persisted anywhere, from RAM to databases or plain text files.
To download the first 1MB chunk of the zip file, you can send a GET request:
GET /myvfs/queries/archive.zip/42/0
As a final note, you should also be aware that the query resource can be modeled to accommodate other scenarios, such as dynamic ranges of a certain file.
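The offset bookkeeping behind such a query is small. A minimal sketch follows, assuming zero-based chunk indexes as in the /42/0 URL above; the class ChunkPlan and the 145,000,000-byte file size used in the usage note are made-up values for illustration.

```java
// Hypothetical helper that derives chunk counts and byte ranges for a query.
final class ChunkPlan {

    // Number of chunks needed to cover the file (the last one may be shorter).
    static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize;
    }

    // Inclusive [start, end] byte offsets of the chunk with the given zero-based index.
    static long[] byteRange(long fileSize, long chunkSize, long index) {
        long start = index * chunkSize;
        if (index < 0 || start >= fileSize) {
            throw new IllegalArgumentException("no such chunk: " + index);
        }
        long end = Math.min(start + chunkSize, fileSize) - 1;
        return new long[] { start, end };
    }
}
```

For a file of 145,000,000 bytes and a chunk size of 1048576, chunkCount returns 139, and byteRange for index 0 is 0 to 1048575.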
P.S. I am aware that the answer is not as clear as it should be, and I apologize for that. I will try to come back and refine it as time permits.

The best way to send a file over a Network

I want to send a file to the browser via the REST interface.
Can you suggest the most efficient way to do it, keeping in mind the following?
There is not much traffic.
I am fetching the file from HBase, which means I get it as a byte array.
The files are not in any folder on the server. They can only be fetched from the HBase table.
The front end is PHP, and I do not know PHP.
In the REST api you can just pass the byte array to Response and it takes care of itself.
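Use code like the following:
import javax.ws.rs.GET;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Response;
@GET
@Produces("image/jpeg") // the registered MIME type is image/jpeg, not image/jpg
public Response getImage() {
    // <Fetch it from wherever you have it>
    return Response.ok(<byteArrayOfTheFile>).build();
}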
Here is a case study from a web service through which I send files:
It is always good to encode the file content and send it to the destination, where it is decoded and read.
Sending the file as an attachment leaves it open to the world because it is not encrypted. And if the network has high traffic, the chance of failure is high.

How to configure HTTPServer to use content length and not transfer encoding: chunked?

I'm using Java's HttpServer object with a web service implemented by a WebServiceProvider.
I see that, regardless of the client request, the response is chunked, and I need it to be sent with a Content-Length.
So I'm assuming the problem is in the server and not in the web service provider, right?
And how can I configure the HTTP header to use Content-Length and not chunked?
HttpServer m_server = HttpServer.create();
Endpoint ep= Endpoint.create(new ep());
HttpContext epContext = m_server.createContext("/DownloadFile");
ep.publish(epContext);
I assume you're talking about the com.sun.net.httpserver HTTPServer. I further assume that you're connecting the server to the service with a call to Endpoint.publish, using some service provider which supports HTTPServer.
The key is in the HttpExchange.sendResponseHeaders method:
If the response length parameter is greater than zero, this specifies an exact number of bytes to send and the application must send that exact amount of data. If the response length parameter is zero, then chunked transfer encoding is used and an arbitrary amount of data may be sent. The application terminates the response body by closing the OutputStream.
So, as long as the handler passes a positive value for responseLength, Content-Length is used. Of course, to do that, it has to know ahead of time how much data it is going to send, which it might well not. Whether it does or not depends entirely on the implementation of the binding, I'm afraid. I don't believe this is standardised; indeed, I don't believe that the WebServiceProvider/HTTPServer combination is standardised at all.
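The difference between the two modes can be seen in a plain handler. This is a minimal sketch using com.sun.net.httpserver directly rather than a JAX-WS binding; the class name ResponseLengthDemo and the context paths are assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: one context answers with Content-Length, the other chunked.
final class ResponseLengthDemo {

    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);

        // Positive length: the server sends Content-Length and expects exactly that many bytes.
        server.createContext("/fixed", exchange -> {
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        // Zero length: chunked transfer encoding, terminated by closing the stream.
        server.createContext("/chunked", exchange -> {
            exchange.sendResponseHeaders(200, 0);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        server.start();
        return server;
    }
}
```

Passing port 0 lets the operating system pick a free port, which can then be read back with server.getAddress().getPort().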
However, even if your provider is uncooperative, you have a recourse: write a Filter which adds buffering, and add it to the HttpContext which you are using to publish the service. I think that to do this, you would have to write an implementation of HttpExchange which buffers the data written to it, pass that down the filter chain for the handler to write its response to, then when it comes back, write the buffered content, setting the responseLength when it does so.
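A buffering filter along those lines might be sketched as below. It is an untested idea in the sense that it assumes the published handler only talks to the exchange through the public HttpExchange API; the class names BufferingFilter and BufferedExchange are made up for illustration, and the whole response is held in memory.

```java
import com.sun.net.httpserver.Filter;
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpContext;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpPrincipal;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;

// Hypothetical filter: buffers the handler's output, then sends it with a known length.
final class BufferingFilter extends Filter {

    @Override
    public void doFilter(HttpExchange exchange, Chain chain) throws IOException {
        BufferedExchange buffered = new BufferedExchange(exchange);
        chain.doFilter(buffered); // the handler writes into the in-memory buffer
        buffered.commit();        // now the size is known
    }

    @Override
    public String description() {
        return "Buffers the response so that Content-Length can be sent";
    }

    static final class BufferedExchange extends HttpExchange {
        private final HttpExchange real;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private int status = 200;

        BufferedExchange(HttpExchange real) { this.real = real; }

        // Intercepted: remember the status, ignore the length the handler asked for.
        @Override public void sendResponseHeaders(int code, long ignoredLength) { status = code; }
        @Override public OutputStream getResponseBody() { return buffer; }
        @Override public int getResponseCode() { return status; }
        @Override public void close() { /* deferred until commit() */ }

        void commit() throws IOException {
            byte[] body = buffer.toByteArray();
            if (body.length == 0) {
                real.sendResponseHeaders(status, -1); // -1 means no response body
            } else {
                real.sendResponseHeaders(status, body.length);
                real.getResponseBody().write(body);
            }
            real.close();
        }

        // Everything else is delegated unchanged.
        @Override public Headers getRequestHeaders() { return real.getRequestHeaders(); }
        @Override public Headers getResponseHeaders() { return real.getResponseHeaders(); }
        @Override public URI getRequestURI() { return real.getRequestURI(); }
        @Override public String getRequestMethod() { return real.getRequestMethod(); }
        @Override public HttpContext getHttpContext() { return real.getHttpContext(); }
        @Override public InputStream getRequestBody() { return real.getRequestBody(); }
        @Override public InetSocketAddress getRemoteAddress() { return real.getRemoteAddress(); }
        @Override public InetSocketAddress getLocalAddress() { return real.getLocalAddress(); }
        @Override public String getProtocol() { return real.getProtocol(); }
        @Override public Object getAttribute(String name) { return real.getAttribute(name); }
        @Override public void setAttribute(String name, Object value) { real.setAttribute(name, value); }
        @Override public void setStreams(InputStream in, OutputStream out) { real.setStreams(in, out); }
        @Override public HttpPrincipal getPrincipal() { return real.getPrincipal(); }
    }
}
```

It would be registered on the context before publishing, for example epContext.getFilters().add(new BufferingFilter()). Because the body is buffered completely, this only suits responses that fit comfortably in memory.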
