HTTP Request (POST) field size limit & Request.BinaryRead in JSPs - java

First off my Java is beyond rusty and I've never done JSPs or servlets, but I'm trying to help someone else solve a problem.
A form rendered by JavaScript is posting back to a JSP.
Some of the fields in this form are over 100KB in size.
However when the form field is being retrieved on the JSP side the value of the field is being truncated to 100KB.
Now I know that there is a similar problem in ASP Request.Form which can be gotten around by using Request.BinaryRead.
Is there an equivalent in Java?
Or alternatively is there a setting in Websphere/Apache/IBM HTTP Server that gets around the same problem?

Since the servlet container must keep the posted request in memory to provide the functionality required by the ServletRequest API, most containers impose a configurable size limit to prevent denial-of-service attacks; without it, a small number of bogus clients could make the server run out of memory.
It's a little strange that WebSphere is silently truncating the request instead of failing properly, but if this is the cause of your problem, you should find the relevant configuration options in the WebSphere documentation.

We have resolved the issue.
Nothing to do with web server settings as it turned out and nothing was being truncated in the post.
Before posting, JavaScript was splitting the form field into 102,399-byte chunks and adding each chunk to the field as a separate value, so the field ended up with an array of values.
ASP's Request.Form() appears to concatenate these values automatically, reproducing the single giant string, but Java's getParameter() does not.
Using getParameterValues() and rebuilding the string from the returned values did the trick.
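A minimal sketch of that fix, joining the chunked values back into one string ("bigField" below is a placeholder for whatever the form field is actually called):

```java
// Rebuild the full value from the chunked parameter values that the
// JavaScript split produced. Call as:
//   String full = ChunkJoiner.join(request.getParameterValues("bigField"));
public class ChunkJoiner {
    public static String join(String[] chunks) {
        if (chunks == null) {
            return null; // parameter was absent from the request
        }
        StringBuilder sb = new StringBuilder();
        for (String chunk : chunks) {
            sb.append(chunk);
        }
        return sb.toString();
    }
}
```

getParameterValues() returns the values in the order the browser sent them, which is why simple concatenation reproduces the original string.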

You can use getInputStream (raw bytes) or getReader (decoded character data) to read data from the request. Note how this interacts with reading the parameters: once the container has parsed the parameters, the body stream is consumed, and vice versa. If you don't want to use a servlet, have a look at using a Filter to wrap the request.
I would expect WebSphere to reject the request rather than arbitrarily truncate data. I suspect a bug elsewhere.
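A sketch of reading a body stream in full, the way you would drain request.getInputStream() in a servlet (the helper name is mine; remember that consuming the stream prevents the container from parsing parameters afterwards):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Drain an entire request body stream into a byte array.
// In a servlet: byte[] body = BodyReader.readAll(request.getInputStream());
public class BodyReader {
    public static byte[] readAll(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        try {
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```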

Related

Encoding goes wrong in the transport of a SOAP message

Context
I have a SOAP web service which is served by a JBOSS EAP instance and is called via a SOAP UI client.
In the result returned by this web service there may be an XML string returned like this by the web service:
The same string will be rendered as follows in the SOAP UI client:
As you can observe, during the transport of this message some characters (specifically <) have been encoded to &lt;: this is normal, as the encoder wants to avoid the string being interpreted as markup when it's just an output to be returned as is.
Problem
What we have observed is that when the string is too long, the encoding goes just wrong. I've tried to analyze and understand and this is all I can get:
Towards the end of the string, some < characters are left as such and are not converted into &lt;
Very weirdly, an XML tag that is correctly formed on the server side:
<calculationPeriod>
...some stuff
</calculationPeriod>
... has its second c converted into &lt;, which completely breaks the XML:
<cal&lt;ulationPeriod>
...some stuff
</calculationPeriod>
My question
Honestly, I have no idea how to debug this issue further. All I can notice is that:
Inside the web service (a stack that I control), the response is correctly formed, with the < characters encoded as &lt; in the XML.
Once it reaches the SOAP UI client (in between there are generic JBoss calls and RMI invocations), the message gets corrupted like this.
It is important to remark that this only happens when the string is particularly long. I have one output with length 8192 characters (before encoding) that goes fine, while the other output having length 9567 characters (before encoding) goes wrong and is the subject of this question.
Apologies :)
I'm sorry not to be able to provide a reproducible test case, and for using a title which means nothing and everything.
I'm open to providing any additional information for those who may help, and to rephrasing the question once I get a clearer picture of what the problem is.
I've of course searched the web a lot, but I can't find anything similar; probably I'm not searching with the right keywords.

Access request body when isMaxSizeExceeded

I'm using play 2.1.0 and want to implement file upload with several parameters, i.e. multipart/form-data form has some small fields and file itself.
If I upload the file without using the annotation
@BodyParser.Of(value = BodyParser.MultipartFormData.class, maxLength = MAX_FILE_SIZE_B)
and check the file size with something like uploadedFile.length() > MAX_SIZE, I can access the request body and it is never null.
If I use the annotation, then when maxSizeExceeded is triggered, ctx.request().body().asMultipartFormData() is null, even though my small parameters come first in the request sent by the browser. Is this the correct behaviour, and is there any way to get the small parameters even when the file is too large?
Is it true that the first way is bad, because large files will actually be uploaded to the server?
The behavior is expected: the headers declare the payload size, and once the payload exceeds the maxLength limit the server stops receiving the file and closes the connection, so you can't access any of the form fields. As a workaround, try sending those small fields as request headers, if that helps.
There is no documentation that explains this, but that is how it is handled at the HTTP layer. The following code might explain a bit: when the payload exceeds the limit, the request object is wrapped with body = null.
To answer your question: yes, the second approach is good, as it stops your server from accepting large files unnecessarily.

How do I start reading bytes through an input stream from a specific location in the stream?

I am using the URL class in Java and I want to read bytes through the input stream from a specific byte location in the stream, instead of using the skip() method, which takes a lot of time to get to that location.
I suppose it is not possible, and here is why: when you send a GET request, the remote server does not know that you are only interested in bytes 100 to 200; it sends you the full document/file. So you have to read the skipped bytes even though you don't handle them, which is why skip() is slow.
But: I am sure you can tell the server (some support it, some don't) that you only want the file from byte 100 onwards.
Also, see this for in-depth knowledge of the skip() mechanics: How does the skip() method in InputStream work?
The nature of streams means you need to read through all the data to get to the specific place you want to start from. You will not get faster than skip(), unfortunately.
The simple answer is that you can't.
If you perform a GET that requests the entire file, you will have to use skip() to get to the part that you want. (And in fact, the slowness is most likely because the server has to send all of the data that is being skipped to the client. That is how TCP/IP works ...)
However, there is a possible alternative. The HTTP 1.1 specification supports partial fetching of documents using the Range request header. If your server supports this, you can ask it to send just the range of the document that you are interested in. However, you may need to deal with the case where the server ignores the Range header and sends the entire document anyway (a 200 response instead of 206 Partial Content).
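A sketch of that approach with HttpURLConnection, including the fallback for servers that ignore the header (the class and method names are mine):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Request only bytes first..last (inclusive) of a document.
public class RangeFetch {

    // Build an HTTP Range header value, e.g. "bytes=100-200".
    static String rangeHeader(long first, long last) {
        return "bytes=" + first + "-" + last;
    }

    public static InputStream open(URL url, long first, long last) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Range", rangeHeader(first, last));
        if (conn.getResponseCode() == HttpURLConnection.HTTP_PARTIAL) {
            // 206: the stream starts exactly at byte 'first'.
            return conn.getInputStream();
        }
        // Server ignored Range and sent the whole document (200):
        // fall back to skipping. Note skip() may skip fewer bytes
        // than requested, so production code should loop until done.
        InputStream in = conn.getInputStream();
        in.skip(first);
        return in;
    }
}
```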

Receiving large amount of data effectively in Java Servlets

I am getting a large amount of data from an HTML form, using the POST method.
req.getParameter();
returns the value as a String, but the value I am getting is so large that I think it needs a StringBuilder.
What do I do?
There seem to be various limits on POST parameter sizes, depending on the servlet implementation. See:
Tomcat POST parameter limit (default 2MB)
Setting Jetty POST parameter limits (default <200k)
The GAE servlet container is based on Jetty 6, but its concrete limits are, AFAIK, not known. Google guys, anyone got some concrete numbers on the max POST parameter size?
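In Tomcat, for instance, the limit is the maxPostSize attribute on the HTTP connector in server.xml; a sketch (the port and the 10MB value are illustrative):

```xml
<!-- server.xml: maxPostSize is in bytes; 10485760 = 10MB.
     Set it to -1 to disable the limit entirely (risky). -->
<Connector port="8080" protocol="HTTP/1.1"
           maxPostSize="10485760" />
```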

Multiple questions in Java (Validating URLs, adding applications to startup, etc.)

I have an application to build in Java, and I've got some questions to put.
Is there some way to know whether the URL of a webpage is real? The user enters the URL and I have to test whether it is real or not.
How can I know if a webpage has changed since a given date, or what the date of its last update is?
In Java, how can I make an application run on PC boot? The application must run from the moment the user turns on the computer.
I'm not sure what kind of application you want to build; I'll assume it's a desktop application. To check whether a URL exists, you can make an HTTP HEAD request and parse the result. HEAD can also be used to check whether the page has been modified. To make an application start when the PC boots, on Windows you have to add a registry entry; this process is explained here.
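A sketch of such a HEAD check (the class is mine; "exists" here only means the server answered with a non-error status):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Issue an HTTP HEAD request to test a page without downloading its body.
public class HeadCheck {

    // A 2xx or 3xx status suggests the page is there.
    static boolean looksAlive(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    public static boolean exists(String url) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("HEAD");
            // When present, the Last-Modified header gives the update date:
            //   conn.getHeaderField("Last-Modified")
            return looksAlive(conn.getResponseCode());
        } catch (IOException e) {
            return false; // malformed URL or unreachable host
        }
    }
}
```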
To check whether a url is valid you could try using a regular expression (regex for urls).
To know if a webpage has changed you can take a look at the http headers (reading http headers in java).
You can't make the program startup automatically on boot, the user must do that. However, you can write code to help the user set the program as startup app; this however depends on the operating system.
I'm not sure what you mean by "real". If you mean "valid", then you can just construct a java.net.URL from a String and catch the resulting MalformedURLException if it's not valid. If you mean that there's actually something there, you could issue an HTTP HEAD request like Geo says, or you could just retrieve the content. HTTPUnit is particularly handy for retrieving web content.
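A sketch of the "valid" interpretation, catching MalformedURLException (this checks only the syntax of the URL, not that anything is actually there):

```java
import java.net.MalformedURLException;
import java.net.URL;

// Syntactic URL validation: construct a java.net.URL and catch the failure.
public class UrlValidator {
    public static boolean isValid(String candidate) {
        try {
            new URL(candidate);
            return true;
        } catch (MalformedURLException e) {
            return false; // e.g. missing or unknown protocol
        }
    }
}
```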
HTTP headers may indicate when the content has changed, as nan suggested above. If you don't want to count on that, you can retrieve the page and store it, or even better, store a hash of the page content. See DigestOutputStream for generating a hash. On a subsequent check for changes you would simply compare the new hash with the one you stored last time.
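A sketch of the hashing idea, using MessageDigest directly rather than DigestOutputStream (the class name is mine): store the digest of the fetched page, and on the next fetch compare digests instead of whole documents.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// SHA-256 hash of page content, as a 64-character hex string.
// Equal content gives equal digests, so changed content is detected
// by a simple string comparison against the stored digest.
public class PageHash {
    public static String sha256Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            return String.format("%064x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```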
Nan is right about start on boot. What OS are you targeting?
