Sending large data over TCP/IP socket - java

I have a small project running a server in C# and a client in Java. The server sends images to the client.
Some images are quite big (up to 10MiB sometimes), so I split the image bytes and send it in chunks of 32768 bytes each.
My C# Server code is as follows:
using (var stream = new MemoryStream(ImageData))
{
    for (int j = 1; j <= dataSplitParameters.NumberOfChunks; j++)
    {
        byte[] chunk;
        if (j == dataSplitParameters.NumberOfChunks)
            chunk = new byte[dataSplitParameters.FinalChunkSize];
        else
            chunk = new byte[dataSplitParameters.ChunkSize];
        int result = stream.Read(chunk, 0, chunk.Length);
        string line = DateTime.Now + ", Status OK, " + ImageName + ", ImageChunk, " + j + ", " + dataSplitParameters.NumberOfChunks + ", " + chunk.Length;
        //write read params
        streamWriter.WriteLine(line);
        streamWriter.Flush();
        //write the data
        binaryWriter.Write(chunk);
        binaryWriter.Flush();
        Console.WriteLine(line);
        string deliveryReport = streamReader.ReadLine();
        Console.WriteLine(deliveryReport);
    }
}
And my Java Client code is as follows:
long dataRead = 0;
for (int j = 1; j <= numberOfChunks; j++) {
    String line = bufferedReader.readLine();
    tokens = line.split(", ");
    System.out.println(line);
    int toRead = Integer.parseInt(tokens[tokens.length - 1]);
    byte[] chunk = new byte[toRead];
    int read = inputStream.read(chunk, 0, toRead);
    //do something with the data
    dataRead += read;
    String progressReport = pageLabel + ", progress: " + dataRead + "/" + dataLength + " bytes.";
    bufferedOutputStream.write((progressReport + "\n").getBytes());
    bufferedOutputStream.flush();
    System.out.println(progressReport);
}
The problem is when I run the code, either the client crashes with an error saying it is reading bogus data, or both the client and the server hang. This is the error:
Document Page 1, progress: 49153/226604 bytes.
�9��%>�YI!��F�����h�
Exception in thread "main" java.lang.NumberFormatException: For input string: .....
What am I doing wrong?

The basic problem.
Once you wrap an InputStream in a BufferedReader, you must stop accessing the InputStream directly. That BufferedReader is buffered: it will read as much data as it wants to; it is NOT limited to reading exactly up to the next newline symbol(s) and stopping there.
The BufferedReader on the Java side has read a lot more than just the header line, so it has already consumed a whole bunch of image data, and there is no way to get it back. By making that BufferedReader, you've made the job impossible, so you can't do that.
The underlying problem.
You have a single TCP/IP connection. On this, you send some irrelevant text (the page, the progress, etc), and then you send an unknown amount of image data, and then you send another irrelevant progress update.
That's fundamentally broken. How can an image parser possibly know that halfway through sending an image, you get a status update line? Text is just binary data too, there is no magic identifier that lets a client know: This byte is part of the image data, but this byte is some text sent in-between with progress info.
The simple fix.
You'd think the simple fix is.. well, stop doing that then! Why are you sending this progress? The client is perfectly capable of knowing how many bytes it read, there is no point sending that. Just.. take your binary data. open the outputstream. send all that data. And on the client side, open the inputstream, read all that data. Don't involve strings. Don't use anything that smacks of 'works with characters' (so, BufferedReader? No. BufferedInputStream is fine).
... but now the client doesn't know the title, nor the total size!
So make a wire protocol. It can be near trivial.
This is your wire protocol:
4 bytes, big endian: SizeOfName
SizeOfName number of bytes. UTF-8 encoded document title.
4 bytes, big endian: SizeOfData
SizeOfData number of bytes. The image data.
And that's if you actually want the client to be able to render a progress bar and to know the title. If that's not needed, don't do any of that, just straight up send the bytes, and signal that the file has been completely sent by.. closing the connection.
Here's some sample java code:
try (InputStream in = ....) {
    int nameSize = readInt(in);
    byte[] nameBytes = in.readNBytes(nameSize);
    String name = new String(nameBytes, StandardCharsets.UTF_8);
    int dataSize = readInt(in);
    try (OutputStream out =
            Files.newOutputStream(Paths.get("/Users/TriSky/image.png"))) {
        byte[] buffer = new byte[65536];
        while (dataSize > 0) {
            int r = in.read(buffer);
            if (r == -1) throw new IOException("Early end-of-stream");
            out.write(buffer, 0, r);
            dataSize -= r;
        }
    }
}

public int readInt(InputStream in) throws IOException {
    byte[] b = in.readNBytes(4);
    return ByteBuffer.wrap(b).getInt();
}
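For reference, here is a rough sketch of the sending side of that protocol, written in Java (the method name sendImage and its parameters are illustrative; the OP's server is C#, where the same framing applies, but note that BinaryWriter writes integers little-endian by default, so the length fields would need to be written big-endian by hand):
// Sketch of the sending side. DataOutputStream.writeInt writes big-endian,
// which matches the readInt above.
public void sendImage(DataOutputStream out, String title, byte[] imageData) throws IOException {
    byte[] nameBytes = title.getBytes(StandardCharsets.UTF_8);
    out.writeInt(nameBytes.length); // 4 bytes, big endian: SizeOfName
    out.write(nameBytes);           // UTF-8 encoded document title
    out.writeInt(imageData.length); // 4 bytes, big endian: SizeOfData
    out.write(imageData);           // the image data
    out.flush();
}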
Closing notes
Another bug in your app is that you're using the wrong method. Java's read(bytes) method will NOT (necessarily) fully fill that byte array. All read(byte[]) will do is read at least 1 byte (unless the stream is closed, in which case it reads none and returns -1). The idea is: read will read the optimal number of bytes: exactly as many as are ready to hand over right now. How many is that? Who knows; if you ignore the return value of in.read(bytes), your code is necessarily broken, and you're doing just that. What you really want is, for example, readNBytes, which guarantees that it fully fills that byte array (or reads until the stream ends, whichever happens first).
Note that in the transfer code above, I also use the basic read, but here I don't ignore the return value.

Your Java code seems to be using a BufferedReader. It reads data into a buffer of its own, meaning it is no longer available in the underlying socket input stream - that's your first problem. You have a second problem with how inputStream.read is used - it's not guaranteed to read all the bytes you ask for, you would have to put a loop around it.
This is not a particularly easy problem to solve. When you mix binary and text data in the same stream, it is difficult to read it back. In Java, there is a class called DataInputStream that can help a little - it has a readLine method to read a line of text, and also methods to read binary data:
DataInputStream dataInput = new DataInputStream(inputStream);
for (int j = 1; j <= numberOfChunks; j++) {
    String line = dataInput.readLine();
    ...
    byte[] chunk = new byte[toRead];
    dataInput.readFully(chunk);
    ...
}
DataInputStream has limitations: the readLine method is deprecated because it assumes the text is encoded in latin-1, and does not let you use a different text encoding. If you want to go further down this road you'll want to create a class of your own to read your stream format.
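If you do go down that road, here is a rough sketch of such a helper (the class name LineReader and the byte-at-a-time approach are just for illustration): it reads up to the next '\n' using an explicit charset, and leaves the rest of the stream untouched so binary data can still be read with readFully afterwards.
import java.io.*;
import java.nio.charset.Charset;

class LineReader {
    // Read bytes up to '\n' (dropping a preceding '\r') and decode them
    // with the given charset; nothing beyond the newline is consumed.
    static String readLine(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') line.write(b);
        }
        return new String(line.toByteArray(), charset);
    }
}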
Some images are quite big (up to 10MiB sometimes), so I split the image bytes and send it in chunks of 32768 bytes each.
You know this is totally unnecessary right? There is absolutely no problem sending multiple megabytes of data into a TCP socket, and streaming all of the data in on the receiving side.

When you send an image, another approach is to open the image as a normal file, split it into chunks, and Base64-encode each chunk before sending; the client then decodes it on arrival. Image data is not normal text, so Base64 encoding turns the raw bytes into ordinary characters, something like AfHM65Hkgf7MM.
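If you do go the Base64 route, a minimal sketch using java.util.Base64 (the file path is just a placeholder):
// Base64-encode the raw image bytes so they can travel as plain text,
// then decode them back into bytes on the client.
byte[] imageBytes = Files.readAllBytes(Paths.get("image.png"));
String encoded = Base64.getEncoder().encodeToString(imageBytes);
byte[] decoded = Base64.getDecoder().decode(encoded);
Keep in mind that Base64 inflates the payload by roughly a third, so sending the raw bytes with a length prefix, as described in the answer above, is usually the better option.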

Jakarta mail IMAP Could not download photo

I don't know how to handle this.
InputStream x.available() is 0, and I could not download the source data.
This is my writePart code:
public static void writePart(Part p) throws MessagingException, IOException {
    if (p.isMimeType("multipart/*")) {
        Multipart content = (Multipart) p.getContent();
        for (int i = 0; i < content.getCount(); i++) {
            writePart(content.getBodyPart(i));
        }
    }
    else if (p.isMimeType("text/*")) {
        System.out.println(p.getContent());
    }
    else if (p.isMimeType("image/jpeg")) {
        MimeBodyPart iPart = (MimeBodyPart) p;
        String imgName = iPart.getFileName();
        InputStream x = (InputStream) iPart.getContent();
        //here!!! x.available() is zero, why?
        System.out.println("x.length = " + x.available());
        int i = 0;
        byte[] bArray = new byte[x.available()];
        while ((i = x.available()) > 0) {
            int result = x.read(bArray);
            if (result == -1)
                break;
        }
        //todo temp directory
        File file = new File("tmp/" + imgName);
        boolean b = file.getParentFile().mkdirs();
        FileOutputStream f2 = new FileOutputStream(file);
        f2.write(bArray);
    }
}
This is not the correct way to read an InputStream. The documentation for InputStream.available states:
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
The key phrase there is "without blocking". This means there may be more data available on the remote side, so it's incorrect to use this to detect the end of the stream. Instead, you should check result (the return value from InputStream.read(bArray)).
It's also incorrect to use the result from available to allocate bArray. It is not possible in general to determine the size of all data that will be produced by an InputStream. In fact, regardless of the size of the buffer, it is not guaranteed that all of the data will be returned from the InputStream in a single call to read, so there needs to be a loop that either appends each read to a dynamically-sized buffer, or processes the read data in a streaming fashion. The latter is preferable, as it doesn't require retaining all of the data in memory at once, which could cause heap space exhaustion for large attachments in this case.
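For example, a minimal sketch of the streaming approach, reusing the x and f2 streams from the question and writing only the bytes each read call actually produced:
// Copy the attachment in chunks; the return value of read tells us how many
// bytes in the buffer are valid on each pass.
byte[] buffer = new byte[8192];
int read;
while ((read = x.read(buffer)) != -1) {
    f2.write(buffer, 0, read);
}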
Implementing low-level I/O operations correctly can be tricky, so it's better to use a high-level method when available. In the case of MimeBodyPart, you can use the writeTo method to automatically write to the FileOutputStream:
iPart.writeTo(f2);
This eliminates the need to explicitly get and read from the InputStream.
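Putting that together, the image branch could look roughly like this (the names and the tmp/ path follow the question's code; try-with-resources makes sure the file is closed even if writing fails):
else if (p.isMimeType("image/jpeg")) {
    MimeBodyPart iPart = (MimeBodyPart) p;
    File file = new File("tmp/" + iPart.getFileName());
    file.getParentFile().mkdirs();
    try (FileOutputStream f2 = new FileOutputStream(file)) {
        iPart.writeTo(f2); // let the part stream itself into the file
    }
}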

FileInputStream and DataOutputStream - handling byte[] buffer

I've been working on an app to move files between two hosts, and while I got the transfer process to work (the code is still really messy, sorry for that, I'm still fixing it), I'm left wondering how exactly it handles the buffer. I'm fairly new to networking in Java, so I just don't want to end up with a "meh, I got it to work, so let's move on" attitude.
File sending code.
public void sendFile(String filepath, DataOutputStream dos) throws Exception {
    if (new File(filepath).isFile() && dos != null) {
        long size = new File(filepath).length();
        String strsize = Long.toString(size) + "\n";
        //System.out.println("File size in bytes: " + strsize);
        outToClient.writeBytes(strsize);
        FileInputStream fis = new FileInputStream(filepath);
        byte[] filebuffer = new byte[8192];
        while (fis.read(filebuffer) > 0) {
            dos.write(filebuffer);
            dos.flush();
        }
    }
}
File receiving code
public void saveFile() throws Exception {
    String size = inFromServer.readLine();
    long longsize = Long.parseLong(size);
    //System.out.println(longsize);
    String tmppath = currentpath + "\\" + tmpdownloadname;
    DataInputStream dis = new DataInputStream(clientSocket.getInputStream());
    FileOutputStream fos = new FileOutputStream(tmppath);
    byte[] filebuffer = new byte[8192];
    int read = 0;
    int remaining = (int) longsize;
    while ((read = dis.read(filebuffer, 0, Math.min(filebuffer.length, remaining))) > 0) {
        //System.out.println(Math.min(filebuffer.length, remaining));
        //System.out.println(read);
        //System.out.println(remaining);
        remaining -= read;
        fos.write(filebuffer, 0, read);
    }
}
I'd like to know how exactly the buffers on both sides are handled so that wrong bytes aren't written. (I know how the receiving code avoids that, but I'd still like to know how the byte array is handled.)
Does fis/dis always wait for the buffer to fill up completely? The receiving code always writes the full array, or the remaining length if that is less than filebuffer.length, but what about fis in the sending code?
In fact, your code could have a subtle bug, exactly because of the way you handle buffers.
When you read a buffer from the original file, the read(byte[]) method returns the number of bytes actually read. There is no guarantee that, in fact, all 8192 bytes have been read.
Suppose you have a file with 10000 bytes. Your first read operation reads 8192 bytes. Your second read operation, however, will only read 1808 bytes. The third operation will return -1.
In the first read, you write exactly the bytes that you have read, because you read a full buffer. But in the second read, your buffer actually contains 1808 correct bytes, and the remaining 6384 bytes are wrong - they are still there from the previous read.
In this case you are lucky, because this only happens in the last buffer that you write. Thus, the fact that you stop reading on your client side when you reach the pre-sent length causes you to skip those 6384 wrong bytes which you shouldn't have sent anyway.
But in fact, there is no actual guarantee that reading from the file will return 8192 bytes even if the end was not reached yet. The method's contract does not guarantee that, and it's up to the OS and underlying file system. It could, for example, send you 5000 bytes in your first read, and 5000 in your second read. In this case, you would be sending 3192 wrong bytes in the middle of the file.
Therefore, your code should actually look like:
byte[] filebuffer = new byte[8192];
int read = 0;
while ((read = fis.read(filebuffer)) > 0) {
    dos.write(filebuffer, 0, read);
    dos.flush();
}
much like the code you have on the receiving side. This guarantees that only the actual bytes read will be written.
So there is nothing actually magical about the way buffers are handled. You give the stream a buffer, you tell it how much of the buffer it's allowed to fill, but there is no guarantee it will fill all of it. It may fill less and you have to take care and use only the portion it tells you it fills.
Another grave mistake you are making, though, is to just convert the long that you received into an int in this line:
int remaining = (int)longsize;
Files may be longer than an int can hold, especially things like long videos. This is why you get that number as a long in the first place. Don't truncate it like that. Keep remaining as a long and convert to int only after you have taken the minimum (because you know the minimum will always be in the range of an int).
long remaining = longsize;
long fileBufferLen = filebuffer.length;
while ((read = dis.read(filebuffer, 0, (int) Math.min(fileBufferLen, remaining))) > 0) {
    ...
}
By the way, there is no real reason to use a DataOutputStream and DataInputStream for this. The read(byte[]), read(byte[],int,int), write(byte[]), and write(byte[],int,int) methods simply delegate to the underlying stream, so there is no reason not to use the socket's OutputStream/InputStream directly, or to wrap them in a BufferedOutputStream/BufferedInputStream. There is also no need to call flush until you have finished writing.
Also, do not forget to close at least your file input/output streams when you are done with them. You may want to keep the socket input/output streams open for continued communication, but there is no need to keep the files themselves open, it may cause problems. Use a try-with-resources to guarantee that they are closed.
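Putting those points together, a rough sketch of what the sending side might look like with try-with-resources (the names follow the question's code, and outToClient is assumed to be the stream used for the size header):
public void sendFile(String filepath, DataOutputStream dos) throws Exception {
    File file = new File(filepath);
    if (file.isFile() && dos != null) {
        // send the length header first, then the raw bytes
        outToClient.writeBytes(file.length() + "\n");
        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] filebuffer = new byte[8192];
            int read;
            while ((read = fis.read(filebuffer)) > 0) {
                dos.write(filebuffer, 0, read); // write only the bytes actually read
            }
            dos.flush(); // one flush at the end is enough
        }
    }
}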

How to tell how far along an InputStream a loop is in Java

I'm trying to make a downloader so I can automatically update my program. The following is my code so far:
public void applyUpdate(final CharSequence ver)
{
    java.io.InputStream is;
    java.io.BufferedWriter bw;
    try
    {
        String s, ver;
        alertOf(s);
        updateDialogProgressBar.setIndeterminate(true);//This is a javax.swing.JProgressBar which is configured beforehand, and is displayed to the user in an update dialog
        is = latestUpdURL.openStream();//This is a java.net.URL which is configured beforehand, and contains the path to a replacement JAR file
        bw = new java.io.BufferedWriter(new java.io.FileWriter(new java.io.File(System.getProperty("user.dir") + java.io.File.separatorChar + TITLE + ver + ".jar")));//Creates a new buffered writer which writes to a file adjacent to the JAR being run, whose name is the title of the application, then a space, then the version number of the update, then ".jar"
        updateDialogProgressBar.setValue(0);
        //updateDialogProgressBar.setMaximum(totalSize);//This is where I would input the total number of bytes in the target file
        updateDialogProgressBar.setIndeterminate(false);
        {
            for (int i, prog = 0; (i = is.read()) != -1; prog++)
            {
                bw.write(i);
                updateDialogProgressBar.setValue(prog);
            }
            bw.close();
            is.close();
        }
    }
    catch (Throwable t)
    {
        //Alert the user of a problem
    }
}
As you can see, I'm just trying to make a downloader with a progress bar, but I don't know how to tell the total size of the target file. How can I tell how many bytes are going to be downloaded before the file is done downloading?
A Stream is a flow of bytes; you can't ask it how many bytes are remaining, you just read from it until it says 'I'm done'. Now, depending on how the connection that provides the stream is established, the underlying protocol (HTTP, for example) may know in advance the total length to be sent... or it may not. For this, see URLConnection.getContentLength(). But it might well return -1 (= 'I don't know').
BTW, your code is not the proper way to read a stream of bytes and write it to a file. For one thing, you are using a Writer when you should use an OutputStream (you are converting from bytes to characters, and then back to bytes; this hurts performance and might corrupt everything if the received content is binary, or if the encodings don't match). Secondly, it's inefficient to read and write one byte at a time.
To get the length of the file once it has been written, you can do this:
new File(System.getProperty("user.dir") + java.io.File.separatorChar + TITLE + ver + ".jar").length()

Load text file to memory in Java

I have wiki.txt file and its size is 50 MB.
I need to do several things on the file and so I thought that the best way in terms of performance is to load the file to memory, is that correct?
This is the code that I written:
File file = new File("wiki.txt");
FileInputStream fileInputStream = new FileInputStream(file);
FileChannel fileChannel = fileInputStream.getChannel();
MappedByteBuffer mapByteBuffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, file.length());
System.out.println((char)mapByteBuffer.get());
I get an error on this code at mapByteBuffer.get().
I tried a few variants of the get() function, but all of them give an error, and e.getMessage() doesn't even contain a message, just null.
Another important thing to note: my text file contains English words, and what I need to do is search, i.e. check whether an expression exists in this text file.
Thank you.
I would suggest using a memory-mapped file, to read the file directly from disk instead of loading it into memory.
RandomAccessFile file = new RandomAccessFile("wiki.txt", "r");
FileChannel channel = file.getChannel();
MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()); // READ_ONLY to match the "r" mode; map the whole file
And then you can read the buffer as usual.
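For the search use case, a minimal sketch of scanning the mapped buffer, assuming plain English (ISO-8859-1/ASCII) text; "some expression" is a placeholder:
// Decode the mapped bytes and search the resulting CharBuffer directly,
// without copying everything into a String first.
CharBuffer text = StandardCharsets.ISO_8859_1.decode(buf);
boolean found = Pattern.compile(Pattern.quote("some expression")).matcher(text).find();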
My answer to the first question (is loading the file into memory the best approach?):
It depends on what you want to do with the file. If your processing doesn't involve rewinding (looking at what was read before), it's best to just read it as a stream and process it in one go, instead of loading it all into memory.
Even if you need random access across the file, you may also be interested in doing block file operations, because your solution may not scale well as the file grows.
Use RandomAccessFile if you are on Java 1.4 or above.
For random access, the operating system usually handles file buffer caching quite well, so you don't have to handle it yourself.
It is important to read the whole error, not just the message. Often the real information is in the exception's name not the text associated with it.
You will get an error if the file is empty as there is no first byte.
Note: the approach you are using assumes ASCII 7-bit characters. If you want to assume ISO-8859-1 characters you can use (char) (byteBuffer.get() & 0xFF)
However, if you have plain text you may find that using Strings is simpler and not much slower, e.g. you can read a 50 MB file as text in less than a second. I would only use a memory mapped file if that is far too long.
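For example, a minimal sketch of the plain-String approach ("some expression" is a placeholder; on Java 11+ Files.readString would do the same in one call):
// Read the whole 50 MB file into a single String and search it directly.
String text = new String(Files.readAllBytes(Paths.get("wiki.txt")), StandardCharsets.ISO_8859_1);
boolean found = text.contains("some expression");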
I would suggest using a BufferedReader. It is much faster and requires relatively few resources.
First, read the number of lines:
InputStream is = new BufferedInputStream(new FileInputStream(filename));
byte[] chars = new byte[1024];
int numberOfChars = 0;
int count = 0;
while ((numberOfChars = is.read(chars)) != -1)
{
    for (int i = 0; i < numberOfChars; ++i)
    {
        if (chars[i] == '\n' && numberOfChars - i != 1)
        {
            ++count;
        }
    }
}
count++;
is.close();
return count; // number of lines
Then read the lines:
BufferedReader in = new BufferedReader(new FileReader(fileName));
for (int i = 0; i < endLine; i++)
{
    String oneLine = in.readLine();
}
In these strings you can then search for what you need.

java: error checking php output

Hi, I have a problem I'm not able to solve.
In my Android/Java application I call a script, download.php. Basically it outputs a file that I download and save on my device. I had to add a check to all my PHP scripts that consists in sending a token to the script and verifying whether it's valid. If the token is valid I get the output (in this case a file, in the other scripts a JSON file); if it's not, I get back the string "false".
To check this condition in my other Java files I used an IOUtils method to turn the input stream into a String, check it, and then
InputStream newInputStream = new ByteArrayInputStream(mystring.getBytes("UTF-8"));
to get a valid input stream again and read it. It works with my JSON files, but not in this case. I get this error:
11-04 16:50:31.074: ERROR/AndroidRuntime(32363):
java.lang.OutOfMemoryError
when I try IOUtils.toString(inputStream, "UTF-8");
I think it's because in this case I'm trying to download a really large file.
fileOutput = new BufferedOutputStream(new FileOutputStream(file, false));
inputStream = new BufferedInputStream(conn.getInputStream());
String result = IOUtils.toString(inputStream, "UTF-8");
if (result.equals("false"))
{
    return false;
}
else
{
    Reader r = new InputStreamReader(MyMethods.stringToInputStream(result));
    int totalSize = conn.getContentLength();
    int downloadedSize = 0;
    byte[] buffer = new byte[1024];
    int bufferLength = 0;
    while ((bufferLength = inputStream.read(buffer)) > 0)
    {
        fileOutput.write(buffer, 0, bufferLength);
        downloadedSize += bufferLength;
    }
    fileOutput.flush();
    fileOutput.close();
}
Don't read the stream as a string to start with. Keep it as binary data, and start off by just reading the first 5 bytes. You can then check whether those 5 bytes are the 5 bytes used to encode "false" in UTF-8, and act accordingly if so. Otherwise, write those 5 bytes to the output file and then do the same looping/reading/writing as before. Note that to read those 5 bytes you may need to loop (however unlikely that seems). Perhaps your IOUtils class has something to say "read at least 5 bytes"? Will the real content ever be smaller than 5 bytes?
To be honest, it would be better if you could use a header in the response to indicate the different result, instead of just a body with "false" - are you in control of the PHP script?
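A rough sketch of that idea, reusing the question's fileOutput and inputStream and assuming the real file never starts with exactly the bytes of "false":
// Read the first 5 bytes, looping because read may return fewer bytes than asked for.
byte[] head = new byte[5];
int headLen = 0;
while (headLen < head.length) {
    int n = inputStream.read(head, headLen, head.length - headLen);
    if (n == -1) break; // stream ended before 5 bytes; cannot be "false"
    headLen += n;
}

if (headLen == 5 && Arrays.equals(head, "false".getBytes("UTF-8"))) {
    return false; // the server rejected the token
}

// Otherwise those bytes are the start of the real file: write them out,
// then stream the rest as before.
fileOutput.write(head, 0, headLen);
byte[] buffer = new byte[1024];
int bufferLength;
while ((bufferLength = inputStream.read(buffer)) > 0) {
    fileOutput.write(buffer, 0, bufferLength);
}
fileOutput.flush();
fileOutput.close();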
