How do I write to a pre-specified binary file format in Java that I can share with another computer that will parse it (the other computer is not using Java)? The file format has longs, floats, and some bitfields. The Java program will write data to this file and then share it with the other computer. Is there a better way to do this than with a binary file format?
If it doesn't have to be binary you can use XML or JSON. If it has to be binary, use Protocol Buffers.
If it has to be binary, I would use ByteBuffer, which supports reading and writing primitives in any combination and with either byte order (endianness). You can create wrappers so that what you read and write appears as events, messages, or event types, which hides the fact that you are dealing with a binary file.
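As a rough sketch of the ByteBuffer approach: the record layout below (one long, one float, one flags byte), the flag names, and the class name RecordWriter are all made up for illustration, since the question does not give the real format. The byte order is set explicitly because the non-Java reader has to agree on it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical layout (not from the question): 8-byte long, 4-byte float, 1-byte bitfield.
final class RecordWriter {
    static final int FLAG_VALID = 1 << 0;   // hypothetical bit assignments
    static final int FLAG_URGENT = 1 << 1;
    static final int RECORD_SIZE = Long.BYTES + Float.BYTES + 1;

    // Encodes one record into a buffer that is ready to be written.
    static ByteBuffer encode(long timestamp, float value, int flags) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        buf.order(ByteOrder.LITTLE_ENDIAN); // assumption: the reader expects little-endian
        buf.putLong(timestamp);
        buf.putFloat(value);
        buf.put((byte) flags);              // bitfield packed into a single byte
        buf.flip();                         // switch the buffer from filling to draining
        return buf;
    }

    // Appends an encoded record to the file, creating the file if needed.
    static void append(Path file, ByteBuffer record) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            while (record.hasRemaining()) {
                ch.write(record);
            }
        }
    }
}
```

A call such as RecordWriter.append(path, RecordWriter.encode(ts, 1.5f, RecordWriter.FLAG_VALID)) would then add one 13-byte record. Note that ByteBuffer defaults to big-endian if order() is never called.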
Related
I have implemented a program that transfers any txt file over a UDP socket in Java. I am using PrintWriter to write and read, but with that I cannot transfer anything other than a txt file (say I want to transfer a PDF). What should be done in this case? I am using the code below to write the file.
Output_File_Write = new PrintWriter("dummy.txt");
Output_File_Write.print(new String(p.getData()));
Writers / PrintWriters are for writing text files. They take (Unicode-based) character data and encode it using the default character encoding (or a specified one), and write that to the file.
A PDF document (as you get it from the network) is in a binary format, so you need to use a FileOutputStream to write the file.
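A minimal sketch of what that could look like for the datagram in the question. The class and method names are hypothetical. It also copies only the bytes that were actually received, because getData() returns the whole backing buffer, which is usually larger than the payload.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.DatagramPacket;
import java.util.Arrays;

final class PacketFileWriter {
    // Extracts exactly the received bytes, with no character decoding involved.
    static byte[] payload(DatagramPacket p) {
        int start = p.getOffset();
        return Arrays.copyOfRange(p.getData(), start, start + p.getLength());
    }

    // Appends the payload to the named file as raw bytes.
    static void appendToFile(String fileName, DatagramPacket p) throws IOException {
        try (OutputStream out = new FileOutputStream(fileName, true)) { // true = append
            out.write(payload(p));
        }
    }
}
```

This replaces the new String(p.getData()) step, which is where the binary content gets corrupted.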
It is also a little bit concerning that you are attempting to transfer documents using UDP. UDP provides no guarantees that the datagrams sent will all arrive, or that they will arrive in the same order as they were sent. Unless you can always fit the entire document into a single datagram, you will have to do a significant amount of work to detect that datagrams have been dropped or have arrived in the wrong order ... and take remedial action.
Using TCP would be far simpler.
AFAIK PrintWriter is meant to be used with text. Quoting the documentation:
Prints formatted representations of objects to a text-output stream. This class implements all of the print methods found in PrintStream. It does not contain methods for writing raw bytes, for which a program should use unencoded byte streams.
To send binary data you need to use an API meant for it, that is, a byte stream: for example FileOutputStream, or PrintStream, which also exposes write(byte[]) methods.
I am storing large amounts of information inside of text files that are written via java. I have two questions relating to this:
Is there any efficiency boost to writing in binary or bytecode over Strings?
What would I use to write the data type into a file.
I already have a setup based around Strings, but I want to compare and at least know how to write the file in bytecode or binary.
When I read in the file, it will be translated into Strings again, but according to my research, if I write the file straight into bytecode it removes the added process on both ends of translating between Strings and code, both for writing the file and for reading it.
cHao has a good point about just using Strings anyway, but I am still interested in how to write varied data types to the file.
In other words, can I still use FileReader and BufferedReader to read and translate back to Strings, or is there something else to use? Also, for a binary writer, is it still just the FileWriter class that I use?
If you want to write it in "binary", and you want to save space, why not just zip it using the jdk? Meets all your requirements.
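One possible reading of this suggestion, combined with the question about writing varied data types: wrap a DataOutputStream around a GZIPOutputStream. This is only a sketch; the record fields (an int, a double, and a String) and the class name are invented for illustration. It writes to an in-memory buffer so it is self-contained; for a real file, a FileOutputStream would take the place of the ByteArrayOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class CompressedRecords {
    // Writes typed values in binary form, then gzip-compresses the result.
    static byte[] write(int id, double reading, String label) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(id);         // 4 bytes, big-endian
            out.writeDouble(reading); // 8 bytes
            out.writeUTF(label);      // length-prefixed modified UTF-8
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the same order and returns the label.
    static String readLabel(byte[] data) {
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(data)))) {
            in.readInt();
            in.readDouble();
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

So the answer to the reader question is that FileReader and BufferedReader do not apply here: binary data is read back with the matching input streams, and values must be read in the order they were written.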
I want to write two simple utilities:
Receives a Binary file, and converts it to a text file (ASCII format).
Receives a text file in the format of the above file and restores the original binary file.
The reason I need this is very stupid, but still a reason. I have two computers: one with internet access and one without. I write software on the one without internet. I get emails on the other one. I need to transfer binary files from one to the other (e.g. jars), but the only communication between them is a clipboard (text only).
Might be a very localized problem - but I assume it has some solution in the worlds of data encryption/compression/network transfer.
The only thing I could come up with is to go over the binary file and convert each byte into its hex representation, so for every byte I'll get two ASCII characters (i.e. two bytes). Is there anything better? (This solution doubles the amount of data and might not be possible to transfer via the clipboard.)
One limitation - I need it as a java based solution (I want to write it myself)
Google for Base64, and use Apache commons codec to have a ready to use implementation.
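For comparison with the hex idea: Base64 turns every 3 bytes into 4 ASCII characters, so the overhead is about a third instead of double. The sketch below uses java.util.Base64 from the JDK (Java 8 and later) rather than Apache Commons Codec, purely to avoid a dependency; the class name ClipboardCodec is hypothetical.

```java
import java.util.Base64;

final class ClipboardCodec {
    // Binary to text. The MIME encoder wraps output at 76 characters per line,
    // which tends to paste more reliably than one very long line.
    static String toText(byte[] binary) {
        return Base64.getMimeEncoder().encodeToString(binary);
    }

    // Text back to binary. The MIME decoder ignores line breaks and other
    // characters outside the Base64 alphabet.
    static byte[] fromText(String text) {
        return Base64.getMimeDecoder().decode(text);
    }
}
```

The two utilities then reduce to reading the file with Files.readAllBytes, calling toText, and on the other machine calling fromText and writing the bytes with Files.write.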
I have to write code to upload/download a file to and from a remote machine. But when I upload the file, newlines are not preserved and some binary characters are inserted automatically. Also, I'm not able to save the file in its actual format; I have to save it as "filename.ser". I'm using Java's serialization/deserialization mechanism.
Thanks in advance.
How exactly are you transmitting the files? If you're using implementations of InputStream and OutputStream, they work on a byte-by-byte level so you should end up with a binary-equal output.
If you're using implementations of Reader and Writer, they convert the bytes to characters according to some character mapping, and then perform the reverse process when saving. Depending on the platform encodings of the various machines (and possibly other effects if you're not specifying the charset explicitly), you could well end up with differences in the binary file.
The fact that you mention newlines makes me think that you're using Readers to send strings (and possibly that you're stitching the strings back together yourself by manually adding newlines). If you want the files to be binary equal, then send them as a stream of bytes and store that stream verbatim. If you want them to be equal as strings in a given character set, then use Readers and Writers but specify the character set explicitly. If you want them to be transmitted as strings in the platform default set (not very useful), then accept that they're not going to be binary equal as files.
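To illustrate the byte-for-byte option, here is a small sketch of a copy loop over InputStream and OutputStream. The class name is hypothetical, and the streams could be file streams or socket streams; nothing in the loop interprets the bytes as characters, so newlines and all other values pass through unchanged.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ByteCopy {
    // Copies everything from in to out verbatim and returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) { // -1 signals end of stream
            out.write(buffer, 0, n);          // write only the bytes actually read
            total += n;
        }
        return total;
    }
}
```

For the string-based option, the equivalent precaution is to pass an explicit charset (for example StandardCharsets.UTF_8) to InputStreamReader and OutputStreamWriter on both sides.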
(Also, your question really doesn't provide much information to solve it. To me, it basically reads "I wrote some code to do X, and it doesn't work. Where did I go wrong?" You seem to assume that your code is correct by not listing it, but at the same time recognise that it's not...)
When reading zipfiles (using Java ZipInputStream or any other library) from an unknown source is there any way of detecting which entries are "character data" (and if so the encoding) or "binary data". And, if binary, any way of determining any more information (MIME types, etc.)
EDIT: does the byte order mark (BOM) occur in zip entries, and if so, do we have to handle it specially?
It basically boils down to heuristics for determining the contents of files. For instance, for text files (ASCII) it should be possible to make a fairly good guess by checking the range of byte values used in the file -- although this will never be completely fool-proof.
You should try to limit the classes of file types you want to identify, e.g. is it enough to discern between "text data" and "binary data" ? If so you should be able to get a fairly high success rate for detection.
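A rough sketch of such a text-versus-binary heuristic, applied to a sample of bytes read from the start of an entry. The 5% threshold and the class name are arbitrary choices for illustration, not established values. A UTF-8 BOM check is included because those three bytes (EF BB BF) are a strong hint that the content is UTF-8 text.

```java
final class TextSniffer {
    // Guesses whether a sample looks like plain ASCII text.
    static boolean looksLikeAsciiText(byte[] sample) {
        if (sample.length == 0) {
            return true; // an empty entry is treated as text here
        }
        int suspicious = 0;
        for (byte b : sample) {
            int v = b & 0xFF;
            if (v == 0) {
                return false; // NUL bytes almost never occur in text files
            }
            boolean printable = v >= 0x20 && v < 0x7F;
            boolean whitespace = v == '\t' || v == '\n' || v == '\r';
            if (!printable && !whitespace) {
                suspicious++;
            }
        }
        // Arbitrary threshold: tolerate up to 5% unexpected bytes.
        return suspicious * 20 <= sample.length;
    }

    // Returns true if the sample starts with the UTF-8 byte order mark.
    static boolean hasUtf8Bom(byte[] sample) {
        return sample.length >= 3
                && (sample[0] & 0xFF) == 0xEF
                && (sample[1] & 0xFF) == 0xBB
                && (sample[2] & 0xFF) == 0xBF;
    }
}
```

As the answer says, this can never be fool-proof: non-ASCII text in encodings such as UTF-16 would be classified as binary by this check.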
For UNIX systems, there is always the file command which tries to identify file types based on (mostly) content.
Maybe implement a Java component that is capable of applying the rules defined in /usr/share/file/magic. I would love to have something like that. (You would basically have to be able to look at the first few bytes of each file.)
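A very small sketch of that idea, with a hand-picked table of four well-known signatures instead of a parser for the magic file. The class name and the choice of formats are illustrative only; a real component would need far more rules, including ones that look at offsets other than zero.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MagicSniffer {
    // MIME type -> leading bytes ("magic number") of the format.
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("application/pdf", new byte[] {'%', 'P', 'D', 'F'});
        SIGNATURES.put("application/zip", new byte[] {'P', 'K', 3, 4});
        SIGNATURES.put("image/png", new byte[] {(byte) 0x89, 'P', 'N', 'G'});
        SIGNATURES.put("image/gif", new byte[] {'G', 'I', 'F', '8'});
    }

    // Returns the first matching MIME type, or a generic fallback.
    static String guessType(byte[] header) {
        for (Map.Entry<String, byte[]> e : SIGNATURES.entrySet()) {
            if (startsWith(header, e.getValue())) {
                return e.getKey();
            }
        }
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The JDK also ships a limited built-in check of this kind, java.net.URLConnection.guessContentTypeFromStream, which may be enough if only a few common formats matter.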