The server sends several files through a single OutputStream. When the client receives the byte[], how does it extract the individual files? Are there any marks in the bytes, for example EOF, '\r\n', or something else? It seems that neither Java's basic IO nor NIO can do that.
I think that when sending several files through a single IO channel, we need to insert some special marker to divide the bytes.
You are looking at too low a level. If you wrap your stream in a multipart or zip stream (no compression is an option), you won't have to re-implement markers yourself.
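For example, a minimal sketch using java.util.zip, assuming 'out' is the socket's OutputStream and 'files' is the list of files to send; each ZipEntry delimits one file, so no hand-rolled markers are needed:
void sendFiles(OutputStream out, List<File> files) throws IOException {
    try (ZipOutputStream zos = new ZipOutputStream(out)) {
        zos.setLevel(Deflater.NO_COMPRESSION); // the "no compression" option
        for (File f : files) {
            zos.putNextEntry(new ZipEntry(f.getName())); // marks the file boundary
            Files.copy(f.toPath(), zos);                 // java.nio.file.Files
            zos.closeEntry();
        }
    }
}
On the receiving side, ZipInputStream.getNextEntry() hands back one entry per file, which is exactly the "where does one file end" information the question asks about.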
I want to store a bunch of protobuf messages in a file, and read them later.
In Java, I can just use 'writeDelimitedTo' and 'parseDelimitedFrom' to write to and read from a file. However, I want to read it in Python, which only seems to have a 'ParseFromString' method.
Some SO questions are very similar, such as Parsing Protocol Buffers, written in Java and read in Python, but that one covers only a single message, not multiple.
The protobuf guide says that you need to handle the size of your messages yourself:
Streaming Multiple Messages
If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own. The easiest way to solve this problem is to write the size of each message before you write the message itself. When you read the messages back in, you read the size, then read the bytes into a separate buffer, then parse from that buffer. (If you want to avoid copying bytes to a separate buffer, check out the CodedInputStream class (in both C++ and Java) which can be told to limit reads to a certain number of bytes.)
https://developers.google.com/protocol-buffers/docs/techniques
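For illustration, a minimal Java sketch of that size-prefix technique (MyMessage is a hypothetical generated message class; a fixed 4-byte prefix is used instead of writeDelimitedTo's varint because it is trivial to read back in Python with struct.unpack('>i', ...)):
void writeAll(List<MyMessage> messages, OutputStream out) throws IOException {
    DataOutputStream dos = new DataOutputStream(out);
    for (MyMessage m : messages) {
        byte[] body = m.toByteArray();
        dos.writeInt(body.length); // size first, as the guide suggests (big-endian)
        dos.write(body);           // then the message bytes
    }
    dos.flush();
}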
A simple solution would be to serialize each proto in base64, one per line in your file.
That way it would be pretty easy to parse and use them in Python.
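A sketch of that approach on the Java side (MyMessage is again a hypothetical generated class; uses java.util.Base64 and java.nio.file.Files):
try (BufferedWriter w = Files.newBufferedWriter(Paths.get("messages.txt"))) {
    for (MyMessage m : messages) {
        w.write(Base64.getEncoder().encodeToString(m.toByteArray())); // one message per line
        w.newLine();
    }
}
In Python, each line can then be decoded with base64.b64decode and fed to ParseFromString.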
I have implemented a program that transfers any txt file using a UDP socket in Java. I am using PrintWriter to write and read, but with that I am not able to transfer any file other than txt (say I want to transfer a PDF). What should be done in this case? I am using the function below for the file write.
Output_File_Write = new PrintWriter("dummy.txt");
Output_File_Write.print(new String(p.getData()));
Writers / PrintWriters are for writing text files. They take (Unicode-based) character data and encode it using the default character encoding (or a specified one), and write that to the file.
A PDF document (as you get it from the network) is in a binary format, so you need to use a FileOutputStream to write the file.
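A minimal sketch, assuming 'p' is the received DatagramPacket from the question:
try (FileOutputStream out = new FileOutputStream("dummy.pdf")) {
    // use getOffset()/getLength(): the packet's backing array may be
    // larger than the data actually received
    out.write(p.getData(), p.getOffset(), p.getLength());
}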
It is also a little bit concerning that you are attempting to transfer documents using UDP. UDP provides no guarantees that the datagrams sent will all arrive, or that they will arrive in the same order as they were sent. Unless you can always fit the entire document into a single datagram, you will have to do a significant amount of work to detect that datagrams have been dropped or have arrived in the wrong order ... and take remedial action.
Using TCP would be far simpler.
AFAIK PrintWriter is meant to be used with text. Quoting from the doc:
Prints formatted representations of objects to a text-output stream. This class implements all of the print methods found in PrintStream. It does not contain methods for writing raw bytes, for which a program should use unencoded byte streams.
To be able to send binary data you would need to use an API suited to it, for example PrintStream.
I have a servlet which reads a BINARY file and sends it to a client.
byte[] binaryData = FileUtils.readFileToByteArray(path);
response.getWriter().print(new String(binaryData));
It works for NON-BINARY files. When I have a BINARY file, the received file length is bigger than the original, or the received file is not the same. How can I read and send binary data?
Not via the Writer. Writers are for text data, not binary data. Your current code is trying to interpret arbitrary binary data as text, using the system default encoding. That's a really bad idea.
You want an output stream - so use response.getOutputStream(), and write the binary data to that:
response.getOutputStream().write(FileUtils.readFileToByteArray(path));
Do not use a Writer: it will apply a character encoding, and there will not always be a 1:1 mapping between bytes and characters (as you have experienced). Instead, use the OutputStream directly.
And avoid reading the full content into memory if you don't need it all at once; serving many parallel requests will quickly consume memory. FileUtils has methods for this.
FileUtils.copyFile(path, response.getOutputStream());
I'm trying to uncompress data that was compressed using the ZLIB library written by Jean-loup Gailly back in the 1990s. I think it is a popular library (I see a lot of programs that ship the zlib32.dll file it uses), so I hope someone will be familiar enough with it to help me. I am using the compress() function directly, which from what I read uses the RFC 1951 DEFLATE format.
Here is a segment of the code I am using to read some compressed data from a stream and uncompress it:
InputStream is = new ByteArrayInputStream(buf);
//GZIPInputStream gzis = new GZIPInputStream(is);
InflaterInputStream iis = new InflaterInputStream(is);
byte[] buf2 = new byte[uncompressedDataLength];
iis.read(buf2);
The iis.read(buf2) call throws an internal exception of "Data Format Error". I tried using GZIPInputStream as well, but that throws the same exception.
The "buf" variable is of type byte[], and I have confirmed by debugging that it is the same as what my C program gets back from the ZLIB compress() function (the actual data comes from a server over TCP). "uncompressedDataLength" is the known size of the uncompressed data, which was also provided by the C program (the server).
Has anyone tried reading/writing data using this library and then reading/writing the same data on the Android using Java?
I did find a "pure Java port of ZLIB" referenced in a few places, and if I need to I can try that, but I would rather use the built-in/OS functions if possible.
The data formats deflate, zlib and gzip in play here are all related.
The base is the deflate compressed data format, defined in RFC 1951.
As it is often quite useless in its pure form, we usually use a wrapping format around it.
The gzip compressed data format (RFC 1952) is intended for compression of files. It consists of a header which has space for a file name and some attributes, a deflate data stream, and a CRC-32 check sum (4 bytes) at the end. (There is also support of multiple such files in one stream in the specification, but I think this isn't used as often.)
The zlib compressed data format, defined in RFC 1950: It consists of a smaller header (2 or 6 bytes), a deflate data stream, and an Adler-32 check sum (4 bytes) at the end. (The Adler-32 check sum is intended to be faster to calculate than the CRC-32 check sum used in gzip.) It is intended for compressed transmission of data inside some other protocols, or compressed storage inside other file formats. For example, it is used inside the PNG file format.
The zlib library supports all these formats. Java's java.util.zip is built on zlib (as part of the VM's implementation/native calls), and exposes access to these with several classes:
The Deflater and Inflater classes implement, depending on the nowrap argument to the constructor, either the zlib or the raw deflate data format (see the sketch after this list).
DeflaterOutputStream/DeflaterInputStream/InflaterInputStream/InflaterOutputStream build on a Deflater/Inflater. The documentation doesn't say clearly whether the default Inflater/Deflater implements zlib or deflate, but the source shows that it uses the default Deflater or Inflater constructor, which implements zlib.
GZIPOutputStream/GZIPInputStream implement, as the names say, the gzip format.
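For orientation, a sketch of which constructor selects which wrapper (all classes from java.util.zip; 'in' is an assumed InputStream):
Inflater zlibInflater = new Inflater();      // zlib (RFC 1950), the default
Inflater rawInflater  = new Inflater(true);  // nowrap = true: raw deflate (RFC 1951)
InflaterInputStream deflateIn = new InflaterInputStream(in, rawInflater);
GZIPInputStream gzipIn = new GZIPInputStream(in); // gzip (RFC 1952)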
I had a look at the source code of zlib's compress function, and it seems to use the zlib format. So your code should do the right thing. Make sure no data is missing, and that there is no extra data before or after the compressed block.
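One more thing worth checking in the posted snippet: a single read() call is not guaranteed to fill the whole buffer. A sketch of a read loop, reusing the question's variables:
InflaterInputStream iis = new InflaterInputStream(new ByteArrayInputStream(buf));
byte[] buf2 = new byte[uncompressedDataLength];
int off = 0;
while (off < buf2.length) {
    int n = iis.read(buf2, off, buf2.length - off);
    if (n == -1) break; // stream ended before the expected length
    off += n;
}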
Disclaimer: This is the state for Java SE, I suppose it is similar for Android, but I can't guarantee this.
The jzlib library you found (I suppose), which is a Java reimplementation of zlib, also implements all these data formats (gzip was added in the latest update). For interactive use (on the compressing side) it is preferable, since it allows some flushing actions which are not possible with java.util's classes (other than using some workaround like changing the compression level), and it also might be faster since it avoids native calls (which always have some overhead).
PS: The zip (or pkzip) file format is also related: It uses deflate internally for each file inside the archive.
I am trying to send some very large files (>200MB) through an Http output stream from a Java client to a servlet running in Tomcat.
My protocol currently packages the file contents in a byte[], which is placed in a Map<String, Object> along with some metadata (filename, etc.), each part under a "standard" key ("FILENAME" -> "Foo", "CONTENTS" -> byte[], "USERID" -> 1234, etc.). The Map is written to the URL connection output stream (urlConnection.getOutputStream()). This works well when the file contents are small (<25MB), but I am running into Tomcat memory issues (OutOfMemoryError) when the file size is very large.
I thought of sending the metadata Map first, followed by the file contents, and finally by a checksum on the file data. The receiver servlet can then read the metadata from its input stream, then read bytes until the entire file is finished, finally followed by reading the checksum.
Would it be better to send the metadata in connection headers? If so, how? If I send the metadata down the socket first, followed by the file contents, is there some kind of standard protocol for doing this?
You will almost certainly want to use a multipart POST to send the data to the server. Then on the server you can use something like commons-fileupload to process the upload.
The good thing about commons-fileupload is that it understands that the server may not have enough memory to buffer large files and will automatically stream the uploaded data to disk once it exceeds a certain size, which is quite helpful in avoiding OutOfMemoryError type problems.
Otherwise you are going to have to implement something comparable yourself. It doesn't really make much difference how you package and send your data, so long as the server can 1) parse the upload and 2) redirect data to a file so that it doesn't ever have to buffer the entire request in memory at once. As mentioned both of these come free if you use commons-fileupload, so that's definitely what I'd recommend.
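A hedged sketch of the servlet side with commons-fileupload (classes from org.apache.commons.fileupload; the size threshold and target directory are placeholders):
protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    DiskFileItemFactory factory = new DiskFileItemFactory();
    factory.setSizeThreshold(1024 * 1024); // keep at most 1 MB in memory
    ServletFileUpload upload = new ServletFileUpload(factory);
    try {
        for (FileItem item : upload.parseRequest(request)) {
            if (item.isFormField()) {
                String value = item.getString();              // e.g. FILENAME, USERID
            } else {
                item.write(new File("/tmp", item.getName())); // spooled/streamed to disk
            }
        }
    } catch (Exception e) {
        throw new ServletException(e);
    }
}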
I don't have a direct answer for you, but you might consider using FTP instead. Apache Mina provides Ftplets, essentially servlets that respond to FTP events (see http://mina.apache.org/ftpserver/ftplet.html for details).
This would allow you to push your data in any format without requiring the receiving end to accommodate the entire data in memory.