Receiving large amounts of data effectively in Java Servlets

I am receiving a large amount of data from an HTML form, submitted with the POST method.
req.getParameter();
returns the value as a String, but the value I am getting is so large that I need to build it with a StringBuilder.
What do I do?

There seem to be various limits on POST parameter sizes, depending on the servlet implementation. See:
Tomcat POST parameter limit (default 2 MB)
Setting Jetty POST parameter limits (default <200 KB)
The GAE servlet container is based on Jetty 6, but its concrete limits are AFAIK not documented. Google folks, does anyone have concrete numbers for the maximum POST parameter size?
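If the container's parameter handling is the bottleneck, one option is to bypass getParameter() and read the raw request body yourself. A minimal sketch of a servlet doing this (reading everything into a StringBuilder, as the question suggests; the class name is illustrative):

import java.io.BufferedReader;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LargePostServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        StringBuilder body = new StringBuilder();
        BufferedReader reader = req.getReader(); // decoded character data
        char[] buffer = new char[8192];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            body.append(buffer, 0, read);
        }
        // body now holds the full, still URL-encoded POST payload;
        // parse individual fields out of it yourself if needed.
    }
}

Note that once you consume the body this way, getParameter() will typically return nothing, because the container can no longer parse the form data itself.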

Related

Difference between APPLICATION_STREAM_JSON_VALUE and APPLICATION_NDJSON_VALUE

While working with the Spring 5 reactive APIs, I came across the deprecated MediaType APPLICATION_STREAM_JSON_VALUE which, when used, displays values from a GET REST endpoint in a stream-like fashion, i.e. values show up in the browser as they appear. But as of today the documentation states that it has been replaced by APPLICATION_NDJSON_VALUE, per the following text from the documentation:
APPLICATION_STREAM_JSON_VALUE Deprecated. as of 5.3 since it originates from the W3C Activity Streams specification which has a more specific purpose and has been since replaced with a different mime type. Use APPLICATION_NDJSON as a replacement or any other line-delimited JSON format (e.g. JSON Lines, JSON Text Sequences).
When I checked the behaviour of the MediaType APPLICATION_NDJSON_VALUE, I observed that when the GET API is consumed in the browser, instead of streaming the results in real time, they get downloaded as a file which you can view afterwards. But does that in any way impact the streaming behaviour, or is it exactly the same? Does APPLICATION_NDJSON_VALUE bring some other significance as well, or is it a pure replacement for APPLICATION_STREAM_JSON_VALUE? And if it is just a replacement, why does the browser behaviour change from streaming to downloading the results of the Flux? Or let me know if I am making a mistake while trying to replicate the exact behaviour.
But does that in any way impact the streaming behaviour or is it exactly the same?
It's exactly the same. The content type header is only telling the client what type of content it's serving, nothing more. The browser will do its best to look at that header and work out whether to display something inline or download it, but it's just a "best guess", especially in the case of reasonably new standards like newline delimited JSON. In practice you're never going to be opening this in a browser anyway (instead consuming it as an API), so it's not really that big a deal.
If you really need it not to download in the browser, you can try adding a Content-Disposition: inline header - but personally I'd just ignore the browser's behaviour and consume it with a tool more suited to the job (like curl for instance) instead.
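For reference, a minimal sketch of the kind of endpoint in question, assuming a hypothetical Event type and an interval source:

import java.time.Duration;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class EventController {

    @GetMapping(value = "/events", produces = MediaType.APPLICATION_NDJSON_VALUE)
    public Flux<Event> streamEvents() {
        // One element per second; each is serialized as a single JSON
        // object followed by a newline (the NDJSON framing).
        return Flux.interval(Duration.ofSeconds(1))
                   .map(i -> new Event(i, "event-" + i));
    }

    public static class Event {
        public long id;
        public String name;
        public Event(long id, String name) { this.id = id; this.name = name; }
    }
}

Consumed with a tool like curl -N http://localhost:8080/events, the lines arrive as they are produced, whichever of the two media types the endpoint declares.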

Tomcat ignoring post parameters request

I have an Apache Tomcat server reading requests from my webapp.
In my webapp I have a form that is submitted and posts a large number of POST parameters, around 8,000.
However, when I debug the entrypoint where the HttpServletRequest is handled, I always receive exactly 6,841 of them. The inputs in the form are created by iterating over a number of elements, meaning the last ones have exactly the same shape as the ones that do get through.
I can't show code for NDA reasons.
I have ruled out the frontend as the issue because, with a sniffer, I was able to see that the complete POST parameter list is sent.
I believe I'm on the right track: I think Tomcat is dropping the remaining POST parameters. The POST size limit is well beyond the size of the request, and we don't have a POST parameter count configured in server.xml (it defaults to 10,000 and I don't hit that amount).
All the answers I have found are about parameters not being sent at all or errors being thrown; in this case they are simply ignored by Tomcat.
Increasing the maximum number of POST parameters (not the POST size) to 20,000 fixed the issue in my case. This was done in the Tomcat server.xml configuration using the maxParameterCount attribute, as shown in the snippet after the quoted documentation:
The maxParameterCount attribute controls the maximum number of parameter and value pairs (GET plus POST) that can be parsed and stored in the request. Excessive parameters are ignored. If you want to reject such requests, configure a FailedRequestFilter.
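A sketch of what that looks like on the Connector element in server.xml (port, protocol and timeout values are illustrative):

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxParameterCount="20000"
           redirectPort="8443" />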

Java Jersey 2.4.1: Content-Length header requirement when using fixed length streaming

Jersey 2.4.1 gives us the ability to enable fixed length streaming. This is very useful when uploading large files. The new client property for enabling this is: HTTP_URL_CONNECTOR_FIX_LENGTH_STREAMING.
By default, when doing uploads, the whole entity content is buffered by the connector before the bytes are sent to their destination. This means that the client will likely run out of memory when uploading large files. Enabling fixed length streaming solves this problem.
Unfortunately, this property is not honored when the Content-Length header is not specified (or is set to 0) in the request. My question is: why? What problem are the Jersey runtimes trying to prevent with this restriction? Is the content length information necessary to stream the data?
Thanks,
Habib
Whether fixed length streaming is activated or not, the client should set the header anyway. With fixed length streaming you know the size without needing to buffer the content, but that only makes sense if you actually set the header. The server doesn't care whether the client buffered the content to determine the length or not.
In HTTP, [the Content-Length field] SHOULD be sent whenever the message's length can be determined prior to being transferred, unless this is prohibited by the rules in section 4.4.
RFC 2616, section 14.13 Content-Length
Without setting the length header, the client could start streaming indefinitely, without a buffer. I guess this is what Jersey tries to prevent, because then the server wouldn't know when the content ends (except in some cases listed in RFC 2616, section 4.4 Message Length).
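For illustration, a sketch of a client that enables the property and sets the header explicitly. This assumes the constant lives on org.glassfish.jersey.client.ClientProperties as named in the question; the URL and file path are made up:

import java.io.File;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.Entity;
import javax.ws.rs.core.HttpHeaders;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import org.glassfish.jersey.client.ClientProperties;

public class FixedLengthUpload {
    public static void main(String[] args) {
        Client client = ClientBuilder.newClient()
                .property(ClientProperties.HTTP_URL_CONNECTOR_FIX_LENGTH_STREAMING, true);
        File file = new File("/tmp/large-upload.bin");
        Response response = client.target("http://example.com/upload")
                .request()
                // The length is known up front, so the connector can
                // stream the body instead of buffering it.
                .header(HttpHeaders.CONTENT_LENGTH, file.length())
                .post(Entity.entity(file, MediaType.APPLICATION_OCTET_STREAM));
        System.out.println(response.getStatus());
        client.close();
    }
}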
I forward upload requests I receive from clients to another endpoint. I do not control the presence of the content length header in the requests I receive, and therefore may not always have a content length header to send to the endpoint.
That said, I can see that we need to protect against the malicious case you mention above, although I initially thought this would be the backend's responsibility.
Thanks for the clarification.

Access request body when isMaxSizeExceeded

I'm using Play 2.1.0 and want to implement file upload with several parameters, i.e. a multipart/form-data form with some small fields and the file itself.
If I upload the file without using the annotation
@BodyParser.Of(value = BodyParser.MultipartFormData.class, maxLength = MAX_FILE_SIZE_B)
and check the file size with uploadedFile.length > MAX_SIZE, I can access the request body and it is never null.
If I use the annotation, then when maxSizeExceeded is triggered, ctx.request().body().asMultipartFormData() is null, even though my small parameters come first in the request sent by the browser. Is this correct behaviour, and is there any way to get the small parameters even if the file is too large?
Is it true that the first approach is bad because large files will actually be uploaded to the server?
The behaviour is expected: the headers indicate the payload size, and if the payload/file size exceeds the max_size limit, the server will not receive the file and the connection will be closed. So you can't access any of the form fields. Instead, try adding those small fields as request headers, if that helps.
There is no documentation that explains this, but that is how it is handled in the HTTP layer. The following code might explain it a bit: when the payload exceeds the limit, the object is wrapped with body = null.
To answer your question: yes, the second approach is good, and it keeps your server from accepting large files unnecessarily.
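For illustration, a sketch of the annotated action from the question, using the Play 2.1 Java API (the field name "file", the limit value and the error message are assumptions):

import play.mvc.BodyParser;
import play.mvc.Controller;
import play.mvc.Http;
import play.mvc.Result;

public class Upload extends Controller {

    private static final int MAX_FILE_SIZE_B = 1024 * 1024; // illustrative 1 MB limit

    @BodyParser.Of(value = BodyParser.MultipartFormData.class, maxLength = MAX_FILE_SIZE_B)
    public static Result upload() {
        Http.MultipartFormData body = request().body().asMultipartFormData();
        if (body == null) {
            // maxLength was exceeded: the whole body, including the
            // small fields, was discarded, so nothing is readable here.
            return badRequest("Upload too large");
        }
        Http.MultipartFormData.FilePart file = body.getFile("file");
        return ok("Received: " + file.getFilename());
    }
}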

HTTP Request (POST) field size limit & Request.BinaryRead in JSPs

First off, my Java is beyond rusty and I've never done JSPs or servlets, but I'm trying to help someone else solve a problem.
A form rendered by JavaScript is posting back to a JSP.
Some of the fields in this form are over 100KB in size.
However, when the form field is retrieved on the JSP side, its value is truncated to 100KB.
Now I know that there is a similar problem in ASP Request.Form which can be gotten around by using Request.BinaryRead.
Is there an equivalent in Java?
Or alternatively is there a setting in Websphere/Apache/IBM HTTP Server that gets around the same problem?
Since the posted request must be kept in memory by the servlet container to provide the functionality required by the ServletRequest API, most servlet containers have a configurable size limit to prevent DoS attacks; otherwise a small number of bogus clients could drive the server to run out of memory.
It's a little bit strange if WebSphere is silently truncating the request instead of failing properly, but if this is the cause of your problem, you may find the configuration options here in the WebSphere documentation.
We have resolved the issue.
It turned out to have nothing to do with web server settings, and nothing was being truncated in the POST.
Prior to posting, the form field was being split by JavaScript into chunks of 102,399 bytes, and each chunk was added to the form field as a separate value, so the field ended up with an array of values.
ASP's Request.Form() appears to concatenate these values automatically, reproducing the single giant string, but Java's getParameter() does not.
Using getParameterValues() and rebuilding the string from the returned values, however, did the trick.
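A minimal sketch of that fix ("bigField" is an illustrative parameter name):

// Inside the JSP or servlet handling the POST:
String[] chunks = request.getParameterValues("bigField");
StringBuilder rebuilt = new StringBuilder();
if (chunks != null) {
    for (String chunk : chunks) {
        rebuilt.append(chunk); // re-join the chunks in submission order
    }
}
String fullValue = rebuilt.toString();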
You can use getInputStream (raw bytes) or getReader (decoded character data) to read the data from the request. Note how this interacts with reading the parameters: once you have consumed the body via one of these, the container can no longer parse it into parameters for you. If you don't want to use a servlet, have a look at using a Filter to wrap the request.
I would expect WebSphere to reject the request rather than arbitrarily truncate data. I suspect a bug elsewhere.
