PDF file transfer - Java

I have implemented a program that transfers any txt file over a UDP socket in Java. I am using PrintWriter to write and read. But with that approach I cannot transfer any file other than txt (say I want to transfer a PDF). What should be done in this case? I am using the code below to write the file.
Output_File_Write = new PrintWriter("dummy.txt");
Output_File_Write.print(new String(p.getData()));

Writers / PrintWriters are for writing text files. They take (Unicode-based) character data and encode it using the default character encoding (or a specified one), and write that to the file.
A PDF document (as you get it from the network) is in a binary format, so you need to use a FileOutputStream to write the file.
It is also a little bit concerning that you are attempting to transfer documents using UDP. UDP provides no guarantees that the datagrams sent will all arrive, or that they will arrive in the same order as they were sent. Unless you can always fit the entire document into a single datagram, you will have to do a significant amount of work to detect that datagrams have been dropped or have arrived in the wrong order ... and take remedial action.
Using TCP would be far simpler.
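As a minimal sketch of the FileOutputStream suggestion (the class name, the method name and the temporary file are made up for illustration, and it assumes the whole file fits in one datagram), the received bytes can be written without ever going through a String:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.DatagramPacket;
import java.nio.file.Files;
import java.nio.file.Path;

public class DatagramToFile {
    // Writes the payload of one received datagram to the stream as raw bytes.
    static void writePayload(DatagramPacket p, OutputStream out) throws IOException {
        // getData() returns the whole buffer; only getLength() bytes,
        // starting at getOffset(), belong to the received datagram.
        out.write(p.getData(), p.getOffset(), p.getLength());
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a packet filled by DatagramSocket.receive():
        // "%PDF", one non-text byte, then unused buffer space.
        byte[] buffer = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, 0, 0, 0};
        DatagramPacket p = new DatagramPacket(buffer, 5);

        Path target = Files.createTempFile("dummy", ".pdf");
        try (OutputStream out = new FileOutputStream(target.toFile())) {
            writePayload(p, out);
        }
    }
}
```

Using getLength() matters: new String(p.getData()) in the question also picks up whatever is left in the unused part of the buffer.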

AFAIK PrintWriter is meant to be used with text. Quoting the documentation:
Prints formatted representations of objects to a text-output stream. This class implements all of the print methods found in PrintStream. It does not contain methods for writing raw bytes, for which a program should use unencoded byte streams.
To send binary data you need to use an appropriate API for it, for example PrintStream, which (unlike PrintWriter) can also write raw bytes.

Related

Serialize multiple protobuf messages in Java and deserialize them in Python

I want to store a bunch of protobuf messages in a file, and read them later.
In Java, I can just use 'writeDelimitedTo' and 'parseDelimitedFrom' to write to and read from a file. However, I want to read it in Python, which only seems to have a 'ParseFromString' method.
Some SO questions are very similar, such as "Parsing Protocol Buffers, written in Java and read in Python", but that one covers only a single message, not multiple messages.
The protobuf guide says that you have to keep track of the size of each message yourself:
Streaming Multiple Messages
If you want to write multiple messages to a single file or stream, it
is up to you to keep track of where one message ends and the next
begins. The Protocol Buffer wire format is not self-delimiting, so
protocol buffer parsers cannot determine where a message ends on their
own. The easiest way to solve this problem is to write the size of
each message before you write the message itself. When you read the
messages back in, you read the size, then read the bytes into a
separate buffer, then parse from that buffer. (If you want to avoid
copying bytes to a separate buffer, check out the CodedInputStream
class (in both C++ and Java) which can be told to limit reads to a
certain number of bytes.)
https://developers.google.com/protocol-buffers/docs/techniques
A simple solution would be to serialize each message as Base64, one message per line in your file.
That way it is pretty easy to parse and use them in Python.
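The Java side of the Base64-per-line idea could look roughly like this. It is only a sketch: Base64Lines and its methods are hypothetical names, and the byte arrays stand in for what message.toByteArray() would return for each protobuf message.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class Base64Lines {
    // Encodes each serialized message as exactly one Base64 line.
    static String encodeLines(List<byte[]> messages) {
        StringBuilder sb = new StringBuilder();
        for (byte[] message : messages) {
            // The basic encoder never inserts line breaks, so one message is one line.
            sb.append(Base64.getEncoder().encodeToString(message)).append('\n');
        }
        return sb.toString();
    }

    // Reverses encodeLines; an empty line decodes to an empty message.
    static List<byte[]> decodeLines(String text) throws IOException {
        List<byte[]> messages = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new StringReader(text));
        String line;
        while ((line = reader.readLine()) != null) {
            messages.add(Base64.getDecoder().decode(line));
        }
        return messages;
    }
}
```

On the Python side each line would then be passed through base64.b64decode before calling ParseFromString. The price is a file roughly a third larger than the raw bytes.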

Server/Client errors when too many messages are sent

I have a problem in my server/client TCP multiplayer game whenever I try to send too many messages in a short time (usually more than 20 messages within 20 ms). After a while the messages start to arrive corrupted for some reason (for example with integers in place of strings, which usually gets me a NumberFormatException).
I send the information as Strings using a DataOutputStream and read it with a Scanner.
inputStream = socket.getInputStream();
outputStream = socket.getOutputStream();
in = new Scanner(inputStream);
out = new DataOutputStream (outputStream);
My questions are: should I use something different from the DataOutputStream/Scanner combination? Is there a faster combination? Should I turn the strings into bytes before sending them?
The strings I send are usually composed of both integers and strings, like "m 2 215 123" or "ep 2".
Expanding on @EJP's answer, the corruption you are experiencing is a result of an application programming error of some kind.
If you are using DataOutputStream to write the data, you should use DataInputStream to read it. And make sure that the sequence of write calls exactly matches the sequence of read calls.
If you want to read using a Scanner, then you need to format the data as text, and use a Writer to write it. (Make sure that you use the same character encoding scheme at both ends, and avoid doing nasty things like mapping binary data to text via String(byte[]) ... 'cos they tend to break.)
If you are either reading or writing the data using multiple threads that read from / write to a single stream, then you need to use some kind of locking to ensure that the messages interleave correctly / cleanly. Streams are typically NOT thread-safe.
As to whether JSON will give you better performance, you probably need to do some experiments to be sure. Among other things, it will depend on the complexity of the data and the way you chose to encode it in the non-JSON case. (But I'd expect DataOutputStream / DataInputStream to be fastest if you chose an appropriate encoding.)
the message start to arrive corrupted
No they don't. The messages don't arrive corrupted. You're just getting out of sync, because you're using a poorly defined application protocol. Use DataInputStream and DataOutputStream symmetrically, as suggested by @JimGarrison.
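To illustrate what "symmetrically" means here, the sketch below writes and reads a message like "m 2 215 123" with matching calls in matching order. The class, the method names and the field layout are assumptions made for the example, not the asker's actual protocol.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SymmetricProtocol {
    // Writer side: one string tag followed by three ints.
    static void writeMove(DataOutputStream out, int player, int x, int y) throws IOException {
        out.writeUTF("m");
        out.writeInt(player);
        out.writeInt(x);
        out.writeInt(y);
        out.flush();
    }

    // Reader side: the read calls mirror the write calls one for one.
    static String readMove(DataInputStream in) throws IOException {
        String type = in.readUTF();
        int player = in.readInt();
        int x = in.readInt();
        int y = in.readInt();
        return type + " " + player + " " + x + " " + y;
    }
}
```

In the game, the streams would wrap socket.getOutputStream() and socket.getInputStream(). If several threads share one stream, each whole message still has to be written or read under a lock.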

Read binary file as byte[] and send from servlet as char[]

I have a servlet which reads a binary file and sends it to a client.
byte[] binaryData = FileUtils.readFileToByteArray(path);
response.getWriter().print(new String(binaryData));
It works for non-binary files. With a binary file, the received file is either longer than the original or its content is not the same. How can I read and send binary data?
Thanks.
Not via the Writer. Writers are for text data, not binary data. Your current code is trying to interpret arbitrary binary data as text, using the system default encoding. That's a really bad idea.
You want an output stream - so use response.getOutputStream(), and write the binary data to that:
response.getOutputStream().write(FileUtils.readFileToByteArray(path));
Do not use a Writer; it applies a character encoding to your data, and there will not always be a 1:1 mapping between bytes and characters (as you have experienced). Instead use the OutputStream directly.
And avoid reading the full content into memory if you don't need it all at once. Serving many parallel requests will quickly consume memory. FileUtils has methods for this:
FileUtils.copyFile(path, response.getOutputStream());
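If Commons IO is not available, the same chunked copy can be sketched with plain java.io. StreamCopy and the 8 KB buffer size are choices made for this example; in the servlet, out would be response.getOutputStream() and in a FileInputStream over the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    // Copies all bytes from in to out through a fixed-size buffer and
    // returns the number of bytes copied. Memory use stays constant
    // regardless of the file size.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        return total;
    }
}
```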

Processing received data from socket

I am developing a socket application that needs to receive XML files over a socket. The size of the XML files varies from 1k to 100k. I am thinking of storing the received data in a temporary file first and then passing it to the XML parser. I am not sure whether that is a proper way to do it.
Another question: if I do it as described above, should I pass a file object or a file path to the XML parser?
Thanks in advance,
Regards
Just send it straight to the parser. That's what browsers do. Adding a temp file costs you time and space with no actual benefit.
Do you think it would work to put a BufferedReader around whatever input stream you have? It wouldn't put it into a temporary file, but it would let you hang onto that data. You can set whatever size BufferedReader you need.
Did you write your XML parser? If you didn't, what will it accept as a parameter? If you did write it, are you asking about efficiency? That is to say, which object, the path or the file, should your parser ask for to be most efficient?
You do not have to store the data from the socket in a file. Just read the whole DataInputStream into a byte array and then do whatever you need with it, e.g. create a String from the XML input to feed the parser if needed. (I am assuming TCP sockets.)
If there is preceding data, skip it so that only the actual XML data is fed to the parser.
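A rough sketch of feeding the stream straight to a parser, using the JDK's built-in DOM parser (the class and method names are made up; in the real application the InputStream would come from socket.getInputStream()):

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseFromStream {
    // Parses an XML document directly from a stream; no temporary file involved.
    static Document parse(InputStream in) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
    }
}
```

One caveat, stated as an assumption about typical parser behaviour: the parser usually keeps reading until the stream ends, so this works most simply when the sender closes its side after each document. If the connection stays open for several documents, reading a length-prefixed byte array first, as suggested above, and parsing from a ByteArrayInputStream is the safer route.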

File upload-download in its actual format

I have to write code to upload/download a file to/from a remote machine. But when I upload the file, newlines are not preserved and some binary characters are inserted automatically. I'm also not able to save the file in its actual format; I have to save it as "filename.ser". I'm using Java's serialization/deserialization mechanism.
Thanks in advance.
How exactly are you transmitting the files? If you're using implementations of InputStream and OutputStream, they work on a byte-by-byte level so you should end up with a binary-equal output.
If you're using implementations of Reader and Writer, they convert the bytes to characters according to some character mapping, and then perform the reverse process when saving. Depending on the platform encodings of the various machines (and possibly other effects if you're not specifying the charset explicitly), you could well end up with differences in the binary file.
The fact that you mention newlines makes me think that you're using Readers to send strings (and possibly that you're stitching the strings back together yourself by manually adding newlines). If you want the files to be binary equal, then send them as a stream of bytes and store that stream verbatim. If you want them to be equal as strings in a given character set, then use Readers and Writers but specify the character set explicitly. If you want them to be transmitted as strings in the platform default set (not very useful), then accept that they're not going to be binary equal as files.
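The difference between the two approaches can be seen in a small experiment. This is only an illustration with a made-up class name and UTF-8 picked as the charset: bytes that are not valid text in the chosen charset do not survive a trip through a String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetRoundTrip {
    // Decodes the bytes as UTF-8 text and encodes the text back to bytes,
    // which is what a Reader/Writer pair effectively does.
    static byte[] throughString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] text = "hello\n".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, (byte) 0x80};

        // Plain ASCII text comes back unchanged.
        System.out.println(Arrays.equals(text, throughString(text)));
        // Invalid UTF-8 bytes are replaced during decoding, so the result differs.
        System.out.println(Arrays.equals(binary, throughString(binary)));
    }
}
```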
(Also, your question really doesn't provide much information to solve it. To me, it basically reads "I wrote some code to do X, and it doesn't work. Where did I go wrong?" You seem to assume that your code is correct by not listing it, but at the same time recognise that it's not...)
