I've had no luck finding this issue in any search. I've also posted this question on the Apache forum, but haven't had any luck there so far.
I have an application on a JBoss application server (EAP 6.2) with an Apache 2.2.26 server in front of it acting as a reverse proxy for HTTPS. The application has a large multi-part form which has recently been intermittently dropping a single form field from the POST data (not always the same field). We've verified that all of the data is being sent, but not all of it is received by the application. The issue does not occur if we connect over HTTP directly to the JBoss server (test server). When we repeatedly send the same form data, sometimes a single form field is dropped and sometimes it is not, but it happens often enough to be easily reproduced. The issue happens in both Internet Explorer and Firefox, so it does not appear to be browser related. The amount of data being sent varies, but is usually in the 10-30 KB range. No errors appear in the Apache server log, even when logging at the debug level.
One additional bit of information is that since this is a multi-part form, the browser includes a boundary value between each form field and file when submitted. We've noted that the size of this boundary value varies and the form data that is being lost changes depending on the size of the boundary value when the same form data is being sent.
Any ideas as to what might be causing this data loss would be most appreciated.
Update: This issue is similar to an old bug reported for Apache 2.0.55 and 2.2.2, link
Update: Found out how to monitor the number of bytes received at the JBoss AJP port. The number of bytes arriving at the AJP port matches the number sent from the browser. Could this be a chunked encoding issue?
It turned out that this issue had something to do with the SSL protocol. We were using TLSv1 and recently changed to TLSv1.2. The problem went away with the protocol upgrade. So, we changed back to TLSv1 in a test environment. The problem reappeared. We then changed to TLSv1.2 and the problem went away.
No idea why the protocol change fixed the issue, but I do know that considerable changes were made in TLSv1.2. At this point I'm just glad the protocol upgrade resolved the issue.
We have implemented a Java servlet running in a JBoss container that uses CometD long-polling. This has been deployed in a few organizations without any issue, but in a recent deployment there are functional issues which appear to be related to the network setup of this organization.
Specifically, around 5% of the time, the connect requests are getting back 402 errors:
{"id":"39","error":"402::Unknown client","successful":false,"advice":{"interval":0,"reconnect":"handshake"},"channel":"/meta/connect"}
Getting this organization to address network performance is a significant challenge, so we are looking at a way to tune the implementation to reduce these issues.
Which cometd configuration parameters can be updated to improve this?
maxInterval, timeout, multiSessionInterval, etc.?
Thank you!
The "402 unknown client" error occurs because the server does not see /meta/connect heartbeat messages from the client and expires the corresponding session on the server. This is typically due to network issues.
Once the client network is restored, the client sends a /meta/connect heartbeat message, but the server no longer has the corresponding session, hence the 402.
The parameter that controls the server side expiration of sessions is maxInterval, documented here: https://docs.cometd.org/current/reference/#_java_server.
The default is 10 seconds. If you increase it, the server retains sessions in memory for a longer time, so you need to take that into account.
I need some assistance on a project I am working on. It's a library that uses Jersey 1.x (1.19.1) to POST a JSON document over HTTP and get the corresponding JSON response from a server.
I am facing a problem when the response from the server is "big". The JSON document that is posted by my client application contains several jobs that must be executed by the server, and
the JSON document sent back by the server is made of the outputs of these jobs. The jobs can be considered independent from each other. The server works in streaming mode, which means it
starts to process the jobs before it receives the entire JSON document posted by the client. And it starts to send the outputs of the jobs as soon as they are finished. So the server
starts to reply to my client application while it is still posting the request. Here is my problem: when the request gets big, so does the response (more jobs to do), and my application freezes
and at some point terminates.
I spent a lot of time trying to figure out what's happening, and here is what I found and what I inferred.
To handle HTTP communication, Jersey uses a class from the JDK (in rt.jar). I forgot the exact name and don't have access to my work right now, so let's call it HttpConnection.
In this class there is a method checkError() that is invoked and throws an IOException with only a message saying it was impossible to write to the server.
Debugging, I was able to understand that an attribute of this class named trouble was set to true because a write() method had caught an IOException earlier. checkError() throws an
IOException based on that trouble boolean flag. It's not easy to see the causing IOException because the JRE classes are compiled without debugging symbols, but
I managed to see that this IOException was a "connection reset by peer" problem.
Then I tried to understand why the server resets the connection. I used a HTTP proxy that captures the HTTP traffic between my client application and the server but this gave me no more clues,
it even seems that the proxy itself is unable to properly handle the connection with the server!
So I tried to use Wireshark to capture the traffic and see what's wrong. Here is what I found.
On client side, packets corresponding to the post of the request JSON document are sent and the server starts to reply shortly after, as explained above. The server side sends
more and more packets, and I noticed that the buffer of the TCP layer (called TCP window in Wireshark) on the client side shrinks more and more as the server sends packets,
until it becomes full (size: 0 bytes). The TCP layer on the server side then cannot send data to the TCP layer on the client side anymore, and its own buffer becomes full too. The conversation, in the end, is
only about retrying to send data, on both sides, failing again and again. Ultimately the server decides to send a reset packet. This corresponds to the causing IOException I mentioned
above I believe.
My understanding is: as long as the server does not start to stream the response everything is fine. When the server starts to send the response, the TCP buffer on client side starts to
get filled. But as the client application does not read the response yet, the content of this buffer is not consumed. When the server has sent enough data to fill this buffer it cannot
send any more data, and the buffer of its TCP layer gets full too because the server continues to push data. As a result, the client application cannot finish sending the request JSON
document. The communication is blocked on both sides and the server decides to reset the connection.
My conclusion is: the code, as currently written, does not support such full duplex communication, because the response from the server is not consumed as it is received. Indeed, walking
through the Jersey code that is executed by my library, by debugging, it is clear that the pattern is:
first: connection.getOutputStream().write()
and then: response.getInputStream().read()
In my opinion, the root cause of the problem is that the library I am working on uses Jersey in this synchronous manner, which does not fit well with the way the server works (streaming the
response while the request is still being sent to it).
I searched the Internet extensively for a solution that keeps Jersey 1.19.1, so that I could improve the library with as little impact as possible, but I failed. This is the reason why I am asking for help
here now ;)
So basically my question is: is it possible to do what I need to do while keeping Jersey client library 1.19.1, and if yes, how? If not, what HTTP client library should I use for my library (to
write a post request and read the corresponding response at the same time) and if you could give me a basic example so I can be on track quickly it would be much appreciated.
One last thing: curl works just fine, I can fully post the exact same JSON document and get the response using it, so there is no problem on server side as I suspected at the very
beginning of my investigation. And it scales fine (I tried to send huge JSON documents). Of course I made sure the HTTP header of the post is the same in the case of my library and in the
curl case.
Thanks a lot for reading and for your answers.
Best regards,
Loïc
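The deadlock described above (write the whole request first, read the response only afterwards) can be illustrated, and avoided, with a rough sketch over a plain TCP socket. This is not Jersey and not even HTTP: it only shows the pattern of sending the request on one thread while another thread drains the response, so that neither TCP buffer fills up. The names DuplexExchange, exchange and startEchoServer are hypothetical, and the echo server is just a stand-in for a server that starts answering before it has received the full request.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

final class DuplexExchange {

    /** Sends the request on a helper thread while the calling thread drains the response. */
    static byte[] exchange(String host, int port, byte[] request) throws Exception {
        try (Socket socket = new Socket(host, port)) {
            AtomicReference<Exception> writeError = new AtomicReference<>();
            Thread writer = new Thread(() -> {
                try {
                    OutputStream out = socket.getOutputStream();
                    out.write(request);
                    out.flush();
                    socket.shutdownOutput(); // end of request; the read side stays open
                } catch (Exception e) {
                    writeError.set(e);
                }
            });
            writer.start();

            // Reading concurrently keeps the client receive buffer from filling up.
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                response.write(buf, 0, n);
            }
            writer.join();
            if (writeError.get() != null) {
                throw writeError.get();
            }
            return response.toByteArray();
        }
    }

    /** Stand-in for the streaming server: echoes bytes back as soon as they arrive. */
    static ServerSocket startEchoServer() throws Exception {
        ServerSocket server = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"));
        Thread acceptor = new Thread(() -> {
            try (Socket client = server.accept()) {
                client.getInputStream().transferTo(client.getOutputStream());
            } catch (Exception ignored) {
                // the demo server simply stops on any error
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = startEchoServer()) {
            byte[] request = new byte[4 * 1024 * 1024]; // far larger than typical TCP buffers
            byte[] response = exchange("127.0.0.1", server.getLocalPort(), request);
            System.out.println("received " + response.length + " bytes");
        }
    }
}
```

With a strictly sequential write-then-read against the same echo server, a payload of this size would be expected to stall once both socket buffers are full, which matches the Wireshark observation above. Whether a given HTTP client library allows reading the response before the request body is complete has to be checked per library.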
I'm creating a small utility which receives a lot of HTTP requests. It is written in Java and uses embedded Jetty to handle requests via HTTPS.
I have a load-testing tool for it, but when it is being run for some time it starts to throw exceptions:
java.net.BindException: Address already in use: connect
(note, this is on sender's side, not in my project)
As I understand it, this means no free sockets were left in the system when another connect was called. Throughput is about 1000 requests per second, and failures start to appear somewhere after 20000 to 50000 requests.
However, when I use the same load-testing tool with another program (a kind of simple consumer, written in Scala using Netty by a colleague, which simply receives all requests and returns an empty OK response), there is no such problem with sockets (though typical speed is 1.5-2 times slower).
I wonder if this could be fixed by somehow telling Jetty to close connections immediately after the response is sent. Each new request is sent over a new connection anyway. I tried to play with Connector#setIdleTimeout (it seems to be 30000 by default) but did not succeed.
What can I do to fix this - or at least to research the matter deeper to find its cause (if I am wrong in my suggestions)?
UPD Thanks for suggestions, I think I am not allowed to post the source, but I get the idea that I should study client's code (this will make me busy for some time since it is written in scala).
I found that there really was a problem with the client: it sends requests with Connection: Keep-Alive in the header, even though it creates a new HttpURLConnection for each request and calls the disconnect() method afterwards.
To solve this on the server side, it was sufficient to send Connection: close in the response header, since I am not allowed to change the testing utility.
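A minimal sketch of the Connection: close fix described above. It does not use Jetty: it swaps in the JDK's built-in com.sun.net.httpserver.HttpServer purely so the example is self-contained, and the names CloseAfterResponse, start and rawHeaders are made up for illustration. In a servlet-based handler the same header would be set with HttpServletResponse.setHeader("Connection", "close").

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class CloseAfterResponse {

    /** Starts a local server whose handler marks every response with "Connection: close". */
    static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            // Ask the client not to reuse this connection for further requests.
            exchange.getResponseHeaders().set("Connection", "close");
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Sends one raw GET request and returns the status line plus response headers. */
    static String rawHeaders(int port) throws Exception {
        try (Socket socket = new Socket("127.0.0.1", port)) {
            socket.setSoTimeout(5000);
            OutputStream out = socket.getOutputStream();
            out.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            StringBuilder headers = new StringBuilder();
            String line;
            // The header block ends at the first empty line.
            while ((line = reader.readLine()) != null && !line.isEmpty()) {
                headers.append(line).append('\n');
            }
            return headers.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start();
        try {
            System.out.print(rawHeaders(server.getAddress().getPort()));
        } finally {
            server.stop(0);
        }
    }
}
```

The raw socket client is used only so the header can be inspected directly; it plays no part in the fix itself.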
My server version is as follows on my dev machine:
Apache/2.2.21 (Win32) mod_fcgid/2.3.6
I have been testing HttpURLConnection as my project requires easy streaming capabilities. I have read a great synopsis from #BalusC on how to use the class.
Using java.net.URLConnection to fire and handle HTTP requests
The trouble I am currently having is with setChunkedStreamingMode. Regardless of what I set it to, my stream doesn't seem to make it to the server: the data stream is empty when my server API method is called. However, if I remove it, it works fine.
I have seen another person with a similar issue:
Java/Android HttpURLConnection setChunkedStreamingMode not working with all PHP servers
But with no real resolution. I am unable to use setFixedLengthStreamingMode simply because the content (JSON) is variable in length.
This is NOT OK. I will potentially be transferring very large quantities of data and hence cannot have the data stored in memory.
My question is, how can I get setChunkedStreamingMode to play nice? Is it a server setup issue or can it be fixed in code?
EDIT
I have now tested my code on my production server and it works no problem. I would however still like to know why my Apache server on my local machine fails. Any help is still much appreciated.
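To help separate a server setup issue from a code issue, a self-contained check along these lines may be useful. It is only a sketch: it posts filler data with setChunkedStreamingMode to a throwaway local server built on the JDK's com.sun.net.httpserver.HttpServer (not Apache), which reports the Transfer-Encoding header it saw and how many body bytes arrived. The names ChunkedPostCheck, startServer and postChunked, and the /upload path, are assumptions made for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ChunkedPostCheck {

    /** Local test server that replies with "<Transfer-Encoding header>:<body byte count>". */
    static HttpServer startServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/upload", exchange -> {
            String encoding = exchange.getRequestHeaders().getFirst("Transfer-Encoding");
            long count = 0;
            byte[] buf = new byte[8192];
            try (InputStream in = exchange.getRequestBody()) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    count += n;
                }
            }
            byte[] reply = (encoding + ":" + count).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    /** Streams totalBytes of filler data in chunked mode and returns the server's reply. */
    static String postChunked(URL url, int totalBytes) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setChunkedStreamingMode(0); // 0 or less selects the default chunk length
        byte[] block = new byte[1024];
        Arrays.fill(block, (byte) 'x');
        try (OutputStream out = conn.getOutputStream()) {
            // Written block by block, so the whole body is never held in memory.
            for (int sent = 0; sent < totalBytes; sent += block.length) {
                out.write(block, 0, Math.min(block.length, totalBytes - sent));
            }
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/upload").toURL();
            System.out.println(postChunked(url, 100_000));
        } finally {
            server.stop(0);
        }
    }
}
```

If the same postChunked call delivers the full body here but an empty one to the local Apache setup, that would point towards the server side (or something in between) rather than the client code.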
Try adding this HTTP header:
urlConnection.setRequestProperty("Transfer-Encoding", "chunked");
I had a problem like this: although I had set the chunked HTTP streaming mode (urlConnection.setChunkedStreamingMode(0)), it did not work, but after adding the HTTP header above it works fine.
I had a similar issue. In my case it was the client system that had a virus scanner installed. Those scanners sometimes have identity theft modules that intercept POSTs, scan the body and then pass it on.
In my case BitDefender cached about 5MB before passing it on.
If the whole payload was smaller than that, the POST was delivered as a non-chunked, fixed-length request.
I had a similar problem using HttpURLConnection. Just add:
conn.setRequestProperty("connection", "close"); // disables Keep Alive
to your connection or disable it for all connections:
System.setProperty("http.keepAlive", "false");
From the API about disconnect():
Releases this connection so that its resources may be either reused or closed.
Unlike other Java implementations, this will not necessarily close socket connections that can be reused. You can disable all connection reuse by setting the http.keepAlive system property to false before issuing any HTTP requests.
I have a setup in which some applications communicate with each other via Tibco rendezvous. The applications communicate using certified messaging. My problem is that two of my receivers have recently started exhibiting the behavior that they will get an Error 27, Not Permitted when they want to confirm a message (the first message in a certified message exchange isn't certified, we've accounted for that).
I've been looking around the internet to find people with the same error, and I have found many, but they all get the error when trying to create the tibco transport. I can create the transport just fine, but I can't confirm any messages received over it.
Our environment uses both tibco 7.X and 8.X, sometimes intermingled. This problem appears both when the peers use the same tibco version and when they use different versions. It doesn't show up for all applications, but when it does show up for an application, it remains "broken". Discarding the ledger files for both sender and receiver does nothing; we still get the error. Both sender and receiver have proper permissions to create and write to the ledger files. We are connecting to permanently running rvds. The sender and receiver are on different machines. Communication has worked flawlessly in the past, but at some point it stopped doing so. The application is in Java, and we're using the tibrvj.jar auto-native libraries.
The error is
...
Caused by: TibrvException[error=27,message=Not permitted]
at com.tibco.tibrv.TibrvImplCmTPortC.natConfirmMsg(Native Method)
at com.tibco.tibrv.TibrvImplCmTPortC.confirmMsg(TibrvImplCmTPortC.java:304)
at com.tibco.tibrv.TibrvCmListener.confirmMsg(TibrvCmListener.java:88)
....
I know you're going to ask me "what did you do to make it start happening", and my response is "I don't know".
Any input would be appreciated.
Thanks.
It may be that TCP connections between the two RVD servers are not possible. Can you check whether you can connect from one to the other (connect from the subscriber host back to the publisher)? In my experience, CM acknowledgments are handled over TCP (please take this with a grain of salt, as I'm more an end user than a middleware support guy).
As it turns out, it was a screw-up on the application level.
Due to some old code lying around, after having updated a dependency (our messaging layer), we had moved from an application level confirmation to a container level confirmation, but we had forgotten to remove an explicit message confirmation in the application code.
To summarize: We tried to confirm the message twice, and the second time it threw this exception.
I recently encountered the same exception: the application had been working for months, then suddenly started throwing it. In my case some maintenance had been done on the Windows server the application ran on, and directories had been marked read-only. Once that was cleared the exception went away.
I discovered this after hours of troubleshooting other potential causes.
Just my two cents: this exception also occurs when you try to explicitly confirm a message on a non-CM transport.