SSL overweight in Java - java

I'm using org.apache.commons.ssl to make an SSL server in Java.
I'm facing a strange problem: I send 500 KB of data over the SSL stream and I receive 500 KB of data on the client side, but the data actually transferred over the TCP connection is 20 times bigger.
What could be the cause? A bad configuration of the SSL parameters?
I'm using a real trusted SSL certificate for my tests.
I tried to sniff and decode the SSL stream with Wireshark, but it didn't work: I wasn't able to see the decoded data. Or maybe the stream was encoded in more than one pass?
The TCP packets were 1525 bytes each. Nothing abnormal as I could see.
If somebody has an idea ...
Thanks !
Olivier

Sounds like you are only sending one byte at a time over the wire. The overhead is then the TCP/IP-packet encapsulation.

Renegotiations won't account for your 20x explosion. Are you using BufferedOutputStreams around the SSL socket's output streams in both directions, i.e. at the server and the client? If you don't use buffered output and your code writes one byte at a time, you can see a 40x explosion due to the SSL record protocol and, multiplied on top of that, another 40x explosion due to TCP segment overhead; the latter is usually mitigated by the Nagle algorithm, but some people turn that off, a little too keenly IMHO.

@EJP: you were right, I made a mistake in my code: I was wrapping a BufferedOutputStream around a SomeStuffOutputStream, instead of wrapping a SomeStuffOutputStream around a BufferedOutputStream.
The BufferedOutputStream must be at the lowest level, just above the raw socket's OutputStream.
Now it's working perfectly!
It was a misconception on my part, and I'm just beginning to understand why I saw "normal" packet sizes: because of the SSL protocol overhead. I'll be more careful next time :)
Thanks to all.
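To make the wrapping order concrete, here is a minimal sketch. CountingSink stands in for the SSL socket's output stream and ByteAtATimeStream for the SomeStuffOutputStream above; both are hypothetical classes invented for this illustration. It counts how many write calls actually reach the bottom stream, on the assumption that each such call on an SSL socket costs at least one SSL record.

```java
import java.io.BufferedOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BufferPlacementDemo {

    /** Stands in for the SSL socket's stream; counts the write calls that reach it. */
    static final class CountingSink extends OutputStream {
        int writeCalls;
        long bytes;

        @Override public void write(int b) { writeCalls++; bytes++; }

        @Override public void write(byte[] buf, int off, int len) { writeCalls++; bytes += len; }
    }

    /** Hypothetical application-level stream that passes data down one byte at a time. */
    static final class ByteAtATimeStream extends FilterOutputStream {
        ByteAtATimeStream(OutputStream out) { super(out); }

        @Override public void write(byte[] buf, int off, int len) throws IOException {
            for (int i = 0; i < len; i++) {
                out.write(buf[off + i]);
            }
        }
    }

    /** Sends the payload through one of the two wrapping orders and returns the sink's call count. */
    static int sinkWrites(boolean bufferNextToSocket, byte[] payload) {
        CountingSink sink = new CountingSink();
        OutputStream top = bufferNextToSocket
                // Correct order: the buffer sits directly above the socket stream.
                ? new ByteAtATimeStream(new BufferedOutputStream(sink, 8192))
                // Mistaken order: the buffer sits on top, so single bytes still hit the socket.
                : new BufferedOutputStream(new ByteAtATimeStream(sink), 8192);
        try {
            top.write(payload);
            top.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writeCalls;
    }
}
```

With a 500,000-byte payload, the mistaken order reaches the sink once per byte, while the correct order reaches it only a few dozen times.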

Related

Accessing external TCP stream

Theory:
Let's say I have an application A, written in Java, that uses a TCP stream for client/server communication (it's on the client end in the relationship). Now, purely as an experiment, I am trying to create an application B, written in VB.NET, that would serve as a proxy for application A's network stream, allowing app B to read and write to the stream.
Is it, at all, possible to access such a network stream from another application, also taking the language boundary into account?
Your question is pretty vague, but if you're asking about the possibility of making a proxy server, then yes, it's possible. The language doesn't matter, but the interface does (the way that the content in the stream is encoded). For instance, Java typically serializes things into a stream using big endian (most significant byte of each byte sequence sent first), whereas .NET uses little endian (least significant byte of each byte sequence sent first). Again though, as long as you're aware of how the data is actually encoded into those streams, you can write a decent proxy server. If all that your proxy server will be doing is passing along data without caring what the data is, then you can just read a byte from one stream and write it to the other. But if you're actually reading values (integers, strings, pictures, etc.), then you will be dealing with the endianness issues, because Java and VB.NET's default stream readers will read and write integers differently, etc.
There will be some complications if you want to actually edit the data instead of simply passing it along. You'll have to deal with the client's and server's reactions to strange network behavior. For instance, if Client A is a video game, and Proxy B injects a message to the server to "join the game", then you'll have to deal with the fact that the server is going to send "ok, you've joined the game". When the client receives that message, it will most likely ignore it, because it had no knowledge that the proxy tried to join the game on its behalf, and will just assume the server made a mistake.
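The byte-order point can be checked with a small sketch. It encodes an int the way Java's DataOutputStream does (big-endian) and then decodes the same four bytes as little-endian, which is the layout .NET's BinaryWriter uses. The class and method names are made up for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndiannessDemo {

    /** Encodes an int as DataOutputStream.writeInt does: most significant byte first. */
    static byte[] javaStyle(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Decodes four bytes as a little-endian int (what a naive reader on the other side would do). */
    static int readLittleEndian(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Decodes four bytes as a big-endian int, matching the Java writer. */
    static int readBigEndian(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.BIG_ENDIAN).getInt();
    }
}
```

Writing 1 and reading it back with the wrong byte order yields 16777216 (0x01000000), which is the kind of value a proxy sees if it parses the stream without knowing the encoding. A proxy that only copies bytes never runs into this.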

Netty dynamic pipeline configuration

This may be a "newb" question but here it goes anyway. We have a netty server up and running and we want it to support multiple different protocols like straight tcp, http, udp etc. I am trying to write a class that decides more dynamically which handlers/decoders/encoders we add to the pipeline on every request, so we only add the layers we need depending on what type of traffic it is. I've got straight tcp figured out because we are encoding special bytes, but I'm having a hard time coming up with a clever way to tell if it's HTTP traffic vs straight tcp based off a ChannelBuffer or byte array.
My thoughts have been along the line of reading in some bytes and looking for a string like 'GET' or 'POST', I assume a HTTPRequest would have these items somewhere.. Is what I'm trying to do worth it? Or anyone have any helpful ideas?
I think you want to have a look at the portunification example, where we do something like what you want to do. In short, it's possible to do what you want. For more information and details, please check the example at [1].
[1.a (master_deprecated)] https://github.com/netty/netty/blob/master_deprecated/example/src/main/java/io/netty/example/portunification/PortUnificationServerHandler.java
[1.b (4.1)] https://github.com/netty/netty/blob/4.1/example/src/main/java/io/netty/example/portunification/PortUnificationServerHandler.java
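The idea from the question of peeking at the first bytes can be sketched in plain Java, without any Netty types. This is not the code from the linked example; ProtocolSniffer, its method list, and the three-way result are assumptions made for illustration. The important detail is the NEED_MORE_DATA case, because the first read may deliver only part of the request line.

```java
import java.nio.charset.StandardCharsets;

final class ProtocolSniffer {

    // HTTP method tokens followed by the space that separates them from the request target.
    private static final String[] HTTP_METHODS = {
        "GET ", "POST ", "PUT ", "HEAD ", "DELETE ", "OPTIONS ", "PATCH ", "TRACE ", "CONNECT "
    };

    enum Guess { HTTP, NOT_HTTP, NEED_MORE_DATA }

    /** Classifies a connection from the bytes received so far. */
    static Guess sniff(byte[] firstBytes) {
        // The longest token above is 8 characters, so 8 bytes are always enough to decide.
        int n = Math.min(firstBytes.length, 8);
        String prefix = new String(firstBytes, 0, n, StandardCharsets.US_ASCII);
        boolean couldStillMatch = false;
        for (String method : HTTP_METHODS) {
            if (prefix.startsWith(method)) {
                return Guess.HTTP;
            }
            if (method.startsWith(prefix)) {
                couldStillMatch = true; // e.g. only "GE" has arrived so far
            }
        }
        return couldStillMatch ? Guess.NEED_MORE_DATA : Guess.NOT_HTTP;
    }
}
```

In a pipeline, a handler would call something like this on the accumulated bytes, keep waiting on NEED_MORE_DATA, and install the HTTP or raw TCP handlers once the answer is definite.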

How do I start reading bytes through an input stream from a specific location in the stream?

I am using the URL class in Java and I want to read bytes through an InputStream from a specific byte location in the stream, instead of using the skip() function, which takes a lot of time to get to that specific location.
I suppose it is not possible, and here is why: when you send a GET request, the remote server does not know that you are interested in bytes 100 to 200, so it sends you the full document/file. You still need to read those bytes even though you don't need to handle them, and that is why skip is slow.
But: I am sure that you can tell the server (some of them support it, some don't) that you want the file starting at byte 100.
Also: see this to get in-depth knowledge about skip mechanics: How does the skip() method in InputStream work?
The nature of streams means you will need to read through all the data to get to the specific place you want to start from. You will not get faster than skip(), unfortunately.
The simple answer is that you can't.
If you perform a GET that requests the entire file, you will have to use skip() to get to the part that you want. (And in fact, the slowness is most likely because the server has to send all of the data that is being skipped to the client. That is how TCP/IP works ...)
However, there is a possible alternative. The HTTP 1.1 specification supports partial fetching documents using the Range header. If your server supports this, then you can request the server to send you just the range of the document that you are interested in. However, you may need to deal with the case where the server ignores the Range header and sends the entire document anyway.
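A minimal sketch of the Range approach with HttpURLConnection follows. RangeFetch and its methods are hypothetical names; the Range header syntax and the 206 (Partial Content) status are standard HTTP/1.1. The fallback branch covers the case where the server ignores the header and answers 200 with the whole document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

final class RangeFetch {

    /** Builds the Range header value for an inclusive byte range, e.g. "bytes=100-200". */
    static String rangeHeader(long firstByte, long lastByte) {
        if (firstByte < 0 || lastByte < firstByte) {
            throw new IllegalArgumentException("invalid range " + firstByte + "-" + lastByte);
        }
        return "bytes=" + firstByte + "-" + lastByte;
    }

    /** Returns a stream positioned at firstByte, whether or not the server honours Range. */
    static InputStream openFrom(URL url, long firstByte, long lastByte) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Range", rangeHeader(firstByte, lastByte));
        int status = conn.getResponseCode();
        InputStream in = conn.getInputStream();
        if (status == HttpURLConnection.HTTP_PARTIAL) {
            return in; // 206: the body already starts at firstByte
        }
        // Any other success status: assume the full document was sent and skip manually.
        long remaining = firstByte;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    break; // the document is shorter than the requested offset
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
        return in;
    }
}
```

When the server answers 206, only the requested bytes cross the network, which is where the speed-up over skip() comes from.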

Exceptions when reading protobuf messages in Java

I am using protobuf now for some weeks, but I still keep getting exceptions when parsing protobuf messages in Java.
I use C++ to create my protobuf messages and send them with boost sockets to a server socket where the Java client is listening. The C++ code for transmitting the message is this:
    boost::asio::streambuf b;
    std::ostream os(&b);
    ZeroCopyOutputStream *raw_output = new OstreamOutputStream(&os);
    CodedOutputStream *coded_output = new CodedOutputStream(raw_output);
    coded_output->WriteVarint32(agentMessage.ByteSize());
    agentMessage.SerializeToCodedStream(coded_output);
    delete coded_output;
    delete raw_output;
    boost::system::error_code ignored_error;
    boost::asio::async_write(socket, b.data(), boost::bind(
        &MessageService::handle_write, this,
        boost::asio::placeholders::error));
As you can see, I write the length of the message with WriteVarint32, so the Java side should know how far to read by using parseDelimitedFrom:
    AgentMessage agentMessage = AgentMessageProtos.AgentMessage
        .parseDelimitedFrom(socket.getInputStream());
But it's no help, I keep getting these kinds of exceptions:
Protocol message contained an invalid tag (zero).
Message missing required fields: ...
Protocol message tag had invalid wire type.
Protocol message end-group tag did not match expected tag.
While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either than the input has been truncated or that an embedded message misreported its own length.
It is important to know that these exceptions are not thrown on every message. Only a fraction of the messages I receive are affected and most work out just fine. Still, I would like to fix this since I do not want to drop those messages.
I would be really grateful if someone could help me out or share their ideas.
Another interesting fact is the number of messages I receive. A total of 1,000 messages in 2 seconds is normal for my program, about 100,000 in 20 seconds, and so on. I artificially reduced the number of messages sent, and when only 6-8 messages are transmitted, there are no errors at all. So might this be a buffering problem on the Java client socket side?
Out of, let's say, 60,000 messages, 5 are corrupted on average.
[I'm not really a TCP expert, this may be way off]
Problem is, [Java] TCP Socket's read(byte[] buffer) will return after reading to the end of the TCP frame. If that happens to be mid-message (I mean, protobuf message), parser will choke and throw an InvalidProtocolBufferException.
Any protobuf parsing call uses CodedInputStream internally (src here), which, in case the source is an InputStream, relies on read() -- and, consequently, is subject to the TCP socket issue.
So, when you stuff big amounts of data through your socket, some messages are bound to be split in two frames -- and that's where they get corrupted.
I'm guessing that when you lower the message transfer rate (as you said, to 6-8 messages per second), each frame gets sent before the next piece of data is put into the stream, so each message always gets its very own TCP frame, i.e. none of them get split and you don't get errors. (Or maybe it's just that the errors are rare, and a low rate means you need more time to see them.)
As for the solution, your best bet would be to handle the buffer yourself: read a byte[] from the socket (probably using DataInputStream's readFully() instead of read(), because the former blocks until there's enough data to fill the buffer or EOF is encountered, so it's resistant to the mid-message frame end thing), ensure it's got enough data to be parsed into a whole message, and then feed the buffer to the parser.
Also, there's some good read on the subject in this Google Groups topic -- that's where I got the readFully() part.
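Here is a rough sketch of the "handle the buffer yourself" idea, written without the protobuf library so it stands alone. It reads the varint length prefix byte by byte, then uses DataInputStream.readFully() to block until the whole message body has arrived. DelimitedFrames and trickle() are hypothetical names; trickle() only simulates a socket that hands back one byte per read call.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class DelimitedFrames {

    /** Reads one varint32-length-prefixed frame; returns null on a clean end of stream. */
    static byte[] readFrame(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) {
            return null; // the stream ended between two frames
        }
        int length = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) { // the high bit means another length byte follows
            if (shift >= 35) {
                throw new IOException("length prefix is not a valid varint32");
            }
            b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside the length prefix");
            }
            length |= (b & 0x7F) << shift;
            shift += 7;
        }
        if (length < 0) {
            throw new IOException("frame length does not fit in a signed int");
        }
        byte[] frame = new byte[length];
        // readFully keeps reading until the array is full or throws EOFException.
        new DataInputStream(in).readFully(frame);
        return frame; // could then be handed to e.g. AgentMessage.parseFrom(frame)
    }

    /** Test helper: a stream that returns at most one byte per bulk read, like a slow socket. */
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override public synchronized int read(byte[] buf, int off, int len) {
                return super.read(buf, off, Math.min(len, 1));
            }
        };
    }
}
```

Even when every bulk read returns a single byte, each frame comes back complete, so the parser never sees half a message.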
I am not familiar with the Java API, but I wonder how Java deals with an uint32 value denoting the message length, because Java only has signed 32-bit integers. A quick look at the Java API reference told me an unsigned 32-bit value is stored within a signed 32-bit variable. So how is the case handled where an unsigned 32-bit value denotes the message length? Also, there seems to be support for varint signed integers in the Java implementation. They are called ZigZag32/64. AFAIK, the C++ version doesn't know about such encodings. So maybe the cause for your problem might be related with these things?

What is the fastest way to output a large amount of data?

I have a JAX-RS web service that calls a DB2 z/OS database and returns about 240 MB of data in a result set. I am then creating an OutputStream to send this data to the client by looping through the result set and adding a few XML tags for my output.
I am confused about which to use: PrintWriter, BufferedWriter or OutputStreamWriter. I am looking for the fastest way to deliver the data. I also don't want the JVM to hold onto this data any longer than it needs to, so I don't use up its memory.
Any help is appreciated.
You should:
Use BufferedWriter
Call .flush() frequently
Enable gzip for best compression
Start thinking about a different way of doing this. Can your data be paginated? Do you need all the data in one request?
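The points above can be sketched roughly as follows, assuming the rows arrive as an Iterator of strings (a stand-in for looping over the result set) and that the target OutputStream is supplied by the container. RowStreamer, writeRows and the flushEvery parameter are hypothetical names, and the XML escaping is deliberately minimal.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.zip.GZIPOutputStream;

final class RowStreamer {

    /** Escapes the three characters that would break element content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /**
     * Streams one row element per item, so only the current row is held in memory.
     * flushEvery must be positive. Returns the number of rows written.
     */
    static int writeRows(Iterator<String> rows, OutputStream target, boolean gzip, int flushEvery)
            throws IOException {
        GZIPOutputStream gz = gzip ? new GZIPOutputStream(target) : null;
        OutputStream sink = gzip ? gz : target;
        BufferedWriter out =
                new BufferedWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8));
        int count = 0;
        out.write("<rows>");
        while (rows.hasNext()) {
            out.write("<row>");
            out.write(escape(rows.next()));
            out.write("</row>");
            if (++count % flushEvery == 0) {
                out.flush(); // push what we have so far towards the client
            }
        }
        out.write("</rows>");
        out.flush();
        if (gz != null) {
            gz.finish(); // write the gzip trailer but leave the target open for the container
        }
        return count;
    }
}
```

When gzip is used this way, the client also needs to be told through a Content-Encoding: gzip response header; many servers can instead apply compression themselves through configuration.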
If you are sending a large binary data, you probably don't want to use xml. When xml is used, binary data is usually represented using base64 which becomes larger than the original binary and uses quite a lot of CPU for the conversion into base64.
If I were you, I'd send the binary separate from the xml. If you are using WebService, MTOM attachment could help. Otherwise you could send the reference to the binary data in the xml, and let the app. download the binary data separately.
As for the fastest way to send binary, if you are using WebLogic, just writing to the response's output stream would be OK. That output stream is most probably buffered, and whatever you do probably won't change the performance anyway.
Turning on gzip could also help depending on what you are sending (e.g. if you are sending jpeg (stuff that is already compressed) or something, it won't help a lot but if you are sending raw text then it can help a lot, etc.).
One solution (which might not work for you) is to spawn a job / thread that creates a file and then notifies the user when the file is ready to download, in this way you're not tied to the bandwidth of the client connection (and you can even compress the file properly, before the client downloads it)
Some Business Intelligence and data crunching applications do this, especially if the process takes some time to generate the data.
The maximum output speed will be limited by network bandwidth, and I am sure any Java OutputStream will be so much faster than the network that you will not notice the difference.
The choice depends on the data to send: if it is text (lines), PrintWriter is easy; if it is a byte array, use OutputStream.
To avoid holding too much data in the buffers, you should call flush() every x KB or so.
You should never use PrintWriter to output data over a network. First of all, it creates platform-dependent line breaks. Second, it silently catches all I/O exceptions, which makes it hard for you to deal with those exceptions.
And if you're sending 240 MB as XML, then you're definitely doing something wrong. Before you start worrying about which stream class to use, try to reduce the amount of data.
EDIT:
The advice about PrintWriter (and PrintStream) came from a book by Elliotte Rusty Harold. I can't remember which one, but it was a few years ago. I think that ServletResponse.getWriter() was added to the API after that book was written - so it looks like Sun didn't follow Rusty's advice. I still think it was good advice - for the reasons stated above, and because it can tempt implementation authors to violate the API contract in order to get predictable behavior.
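The point about PrintWriter silently catching I/O exceptions can be seen in a few lines. BrokenStream is a hypothetical stream that fails on every write, standing in for a dropped connection. No exception reaches the caller; the only trace of the failure is the flag returned by checkError().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

final class PrintWriterErrors {

    /** Hypothetical stream that fails on every write, like a connection that has gone away. */
    static final class BrokenStream extends OutputStream {
        @Override public void write(int b) throws IOException {
            throw new IOException("connection reset");
        }
    }

    /** Returns true if PrintWriter recorded a failure that it never threw. */
    static boolean printlnHidesFailure() {
        PrintWriter writer = new PrintWriter(new BrokenStream());
        writer.println("some row");  // buffered; nothing is thrown here
        return writer.checkError();  // flushes, then reports the swallowed IOException
    }
}
```

Code that wants to react to a broken connection has to poll checkError() itself, whereas a BufferedWriter would have thrown the IOException from write() or flush().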
