Hi, I have a problem I'm not able to solve.
In my Android/Java application I call a script, download.php. It outputs a file that I download and save on my device. I had to add a check to all my PHP scripts: the client sends a token and the script verifies it. If the token is valid I get the normal output (a file in this case, JSON in the other scripts); if it is not, I get back the string "false".
To check this condition in my other Java files I used the IOUtils method to turn the input stream into a String, checked it, and then used
InputStream newInputStream = new ByteArrayInputStream(mystring.getBytes("UTF-8"));
to get a valid input stream again and read it. It works with my JSON files, but not in this case. I get this error:
11-04 16:50:31.074: ERROR/AndroidRuntime(32363):
java.lang.OutOfMemoryError
when I call IOUtils.toString(inputStream, "UTF-8");
I think it's because in this case I'm trying to download a really large file.
fileOutput = new BufferedOutputStream(new FileOutputStream(file,false));
inputStream = new BufferedInputStream(conn.getInputStream());
String result = IOUtils.toString(inputStream, "UTF-8");
if(result.equals("false"))
{
return false;
}
else
{
Reader r = new InputStreamReader(MyMethods.stringToInputStream(result));
int totalSize = conn.getContentLength();
int downloadedSize = 0;
byte[] buffer = new byte[1024];
int bufferLength = 0;
while ( (bufferLength = inputStream.read(buffer)) > 0 )
{
fileOutput.write(buffer, 0, bufferLength);
downloadedSize += bufferLength;
}
fileOutput.flush();
fileOutput.close();
Don't read the stream as a string to start with. Keep it as binary data, and start off by just reading the first 5 bytes. You can then check whether those 5 bytes are the 5 bytes used to encode "false" in UTF-8, and act accordingly if so. Otherwise, write those 5 bytes to the output file and then do the same looping/reading/writing as before. Note that to read those 5 bytes you may need to loop (however unlikely that seems). Perhaps your IOUtils class has something to say "read at least 5 bytes"? Will the real content ever be smaller than 5 bytes?
To be honest, it would be better if you could use a header in the response to indicate the different result, instead of just a body with "false" - are you in control of the PHP script?
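A minimal sketch of the "peek at the first bytes" idea, assuming the rejection body is exactly the five UTF-8 bytes of "false". The class and method names (TokenCheckedDownload, copyUnlessRejected) are hypothetical. As a small variation on the advice above, it reads up to six bytes rather than five, so that a real file that merely starts with "false" is not mistaken for a rejection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TokenCheckedDownload {
    private static final byte[] FALSE_MARKER = "false".getBytes(StandardCharsets.UTF_8);

    /**
     * Copies in to out unless the whole body is exactly "false".
     * Returns false on rejection, true once the body has been copied.
     */
    static boolean copyUnlessRejected(InputStream in, OutputStream out) throws IOException {
        // One byte more than the marker, to tell "false" apart from "false...".
        byte[] head = new byte[FALSE_MARKER.length + 1];
        int filled = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (filled < head.length) {
            int n = in.read(head, filled, head.length - filled);
            if (n == -1) {
                break;
            }
            filled += n;
        }
        // Exactly five bytes spelling "false", then end of stream: token rejected.
        if (filled == FALSE_MARKER.length
                && Arrays.equals(Arrays.copyOf(head, filled), FALSE_MARKER)) {
            return false;
        }
        // Real content: write the bytes already consumed, then stream the rest.
        out.write(head, 0, filled);
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        out.flush();
        return true;
    }
}
```

With the variables from the question, this would be called as copyUnlessRejected(inputStream, fileOutput). Memory use stays at one small buffer regardless of the file size, which is what avoids the OutOfMemoryError.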
I have a small project running a server in C# and a client in Java. The server sends images to the client.
Some images are quite big (up to 10MiB sometimes), so I split the image bytes and send it in chunks of 32768 bytes each.
My C# Server code is as follows:
using (var stream = new MemoryStream(ImageData))
{
for (int j = 1; j <= dataSplitParameters.NumberOfChunks; j++)
{
byte[] chunk;
if (j == dataSplitParameters.NumberOfChunks)
chunk = new byte[dataSplitParameters.FinalChunkSize];
else
chunk = new byte[dataSplitParameters.ChunkSize];
int result = stream.Read(chunk, 0, chunk.Length);
string line = DateTime.Now + ", Status OK, " + ImageName+ ", ImageChunk, " + j + ", " + dataSplitParameters.NumberOfChunks + ", " + chunk.Length;
//write read params
streamWriter.WriteLine(line);
streamWriter.Flush();
//write the data
binaryWriter.Write(chunk);
binaryWriter.Flush();
Console.WriteLine(line);
string deliveryReport = streamReader.ReadLine();
Console.WriteLine(deliveryReport);
}
}
And my Java Client code is as follows:
long dataRead = 0;
for (int j = 1; j <= numberOfChunks; j++) {
String line = bufferedReader.readLine();
tokens = line.split(", ");
System.out.println(line);
int toRead = Integer.parseInt(tokens[tokens.length - 1]);
byte[] chunk = new byte[toRead];
int read = inputStream.read(chunk, 0, toRead);
//do something with the data
dataRead += read;
String progressReport = pageLabel + ", progress: " + dataRead + "/" + dataLength + " bytes.";
bufferedOutputStream.write((progressReport + "\n").getBytes());
bufferedOutputStream.flush();
System.out.println(progressReport);
}
The problem is when I run the code, either the client crashes with an error saying it is reading bogus data, or both the client and the server hang. This is the error:
Document Page 1, progress: 49153/226604 bytes.
�9��%>�YI!��F�����h�
Exception in thread "main" java.lang.NumberFormatException: For input string: .....
What am I doing wrong?
The basic problem.
Once you wrap an InputStream in a BufferedReader, you must stop accessing the InputStream directly. The BufferedReader is buffered: it reads as much data as it wants to, and it is NOT limited to reading exactly up to the next newline and stopping there.
The BufferedReader on the Java side has read well past the end of the line, so it has already consumed a chunk of the image data, and there is no way to get that data back. By creating that BufferedReader you've made the job impossible, so you can't do that.
The underlying problem.
You have a single TCP/IP connection. On this, you send some irrelevant text (the page, the progress, etc), and then you send an unknown amount of image data, and then you send another irrelevant progress update.
That's fundamentally broken. How can an image parser possibly know that halfway through sending an image, you get a status update line? Text is just binary data too, there is no magic identifier that lets a client know: This byte is part of the image data, but this byte is some text sent in-between with progress info.
The simple fix.
You'd think the simple fix is... well, stop doing that then! Why are you sending this progress? The client is perfectly capable of knowing how many bytes it has read, so there is no point sending that. Just take your binary data, open the OutputStream, and send all that data. On the client side, open the InputStream and read all that data. Don't involve strings. Don't use anything that smacks of 'works with characters' (so, BufferedReader? No. BufferedInputStream is fine).
... but now the client doesn't know the title, nor the total size!
So make a wire protocol. It can be near trivial.
This is your wire protocol:
4 bytes, big endian: SizeOfName
SizeOfName number of bytes. UTF-8 encoded document title.
4 bytes, big endian: SizeOfData
SizeOfData number of bytes. The image data.
And that's if you actually want the client to be able to render a progress bar and to know the title. If that's not needed, don't do any of that, just straight up send the bytes, and signal that the file has been completely sent by.. closing the connection.
Here's some sample java code:
try (InputStream in = ....) {
    int nameSize = readInt(in);
    byte[] nameBytes = in.readNBytes(nameSize);
    String name = new String(nameBytes, StandardCharsets.UTF_8);
    int dataSize = readInt(in);
    try (OutputStream out =
            Files.newOutputStream(Paths.get("/Users/TriSky/image.png"))) {
        byte[] buffer = new byte[65536];
        while (dataSize > 0) {
            // Never ask for more than the bytes left in this image
            int r = in.read(buffer, 0, Math.min(buffer.length, dataSize));
            if (r == -1) throw new IOException("Early end-of-stream");
            out.write(buffer, 0, r);
            dataSize -= r;
        }
    }
}

public int readInt(InputStream in) throws IOException {
    byte[] b = in.readNBytes(4);
    if (b.length < 4) throw new EOFException("Early end-of-stream");
    return ByteBuffer.wrap(b).getInt(); // ByteBuffer is big-endian by default
}
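For completeness, here is a sketch of what the sending side of that wire protocol could look like in Java. The server in the question is written in C#, so this is only an illustration of the byte layout; the class and method names (ImageSender, send) are hypothetical. DataOutputStream.writeInt writes big-endian, which matches the protocol above. C#'s BinaryWriter writes integers little-endian, so a C# server would have to produce the four length bytes in big-endian order itself.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ImageSender {
    /**
     * Writes one document as:
     * 4-byte big-endian name size, name bytes (UTF-8),
     * 4-byte big-endian data size, data bytes.
     */
    static void send(OutputStream rawOut, String name, byte[] imageData) throws IOException {
        DataOutputStream out = new DataOutputStream(rawOut);
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        out.writeInt(nameBytes.length);   // writeInt is big-endian
        out.write(nameBytes);
        out.writeInt(imageData.length);
        out.write(imageData);             // no manual chunking needed
        out.flush();
    }
}
```

Because both lengths are sent up front, the receiver always knows how many bytes belong to the name and how many to the image, with no text mixed into the binary data.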
Closing notes
Another bug in your app is that you're using the wrong method. Java's read(byte[]) method will NOT (necessarily) fully fill that byte array. All read(byte[]) does is read at least 1 byte (unless the stream has ended, in which case it reads none and returns -1). The idea is that read returns however many bytes are available right now. How many is that? Who knows. If you ignore the return value of in.read(bytes), your code is necessarily broken, and you're doing just that. What you really want is something like readNBytes, which guarantees that it fully fills the byte array (or reads until the stream ends, whichever happens first).
Note that in the transfer code above, I also use the basic read, but here I don't ignore the return value.
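If readNBytes is not available (it was added to InputStream in newer Java versions, and older Android runtimes may lack it), the same guarantee can be had with a short loop. This is a minimal sketch; the class and method names (StreamReads, readExactly) are hypothetical.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class StreamReads {
    /** Reads exactly length bytes, or throws EOFException if the stream ends first. */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        int filled = 0;
        while (filled < length) {
            // read() returns however many bytes it has, possibly fewer than asked for.
            int n = in.read(result, filled, length - filled);
            if (n == -1) {
                throw new EOFException("Expected " + length + " bytes, got " + filled);
            }
            filled += n;
        }
        return result;
    }
}
```

Unlike readNBytes, which returns a shorter array at end of stream, this sketch treats a short stream as an error, which is usually what a length-prefixed protocol wants.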
Your Java code seems to be using a BufferedReader. It reads data into a buffer of its own, meaning it is no longer available in the underlying socket input stream - that's your first problem. You have a second problem with how inputStream.read is used - it's not guaranteed to read all the bytes you ask for, you would have to put a loop around it.
This is not a particularly easy problem to solve. When you mix binary and text data in the same stream, it is difficult to read it back. In Java, there is a class called DataInputStream that can help a little - it has a readLine method to read a line of text, and also methods to read binary data:
DataInputStream dataInput = new DataInputStream(inputStream);
for (int j = 1; j <= numberOfChunks; j++) {
    String line = dataInput.readLine();
    ...
    byte[] chunk = new byte[toRead];
    dataInput.readFully(chunk); // returns void; fills the whole array or throws EOFException
    ...
}
DataInputStream has limitations: the readLine method is deprecated because it assumes the text is encoded in latin-1, and does not let you use a different text encoding. If you want to go further down this road you'll want to create a class of your own to read your stream format.
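As a rough idea of what such a class could look like, here is a sketch that reads text lines and binary chunks from the same stream without buffering ahead. It assumes the header lines are UTF-8 and end in "\n" or "\r\n"; the class and method names (LineAndBytesReader, readLine, readBytes) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class LineAndBytesReader {
    /** Reads one line byte by byte, so nothing past the newline is consumed. Returns null at EOF. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null; // end of stream, no more lines
        }
        byte[] bytes = line.toByteArray();
        int length = bytes.length;
        if (length > 0 && bytes[length - 1] == '\r') {
            length--; // drop the CR of a CRLF line ending
        }
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    /** Reads exactly count bytes of binary data following a header line. */
    static byte[] readBytes(InputStream in, int count) throws IOException {
        byte[] chunk = new byte[count];
        new DataInputStream(in).readFully(chunk); // DataInputStream does not read ahead
        return chunk;
    }
}
```

Reading one byte at a time is slow on a raw socket stream; wrapping the socket in a single BufferedInputStream and passing that same object to both methods keeps it fast, because every read then goes through the same buffer.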
"Some images are quite big (up to 10MiB sometimes), so I split the image bytes and send it in chunks of 32768 bytes each."
You know this is totally unnecessary, right? There is absolutely no problem with sending multiple megabytes of data into a TCP socket and streaming all of it in on the receiving side.
When you send an image, open it as a normal file, split it into chunks, and Base64-encode each chunk before sending; the client then decodes each chunk. Image data is not plain text, so Base64 encoding turns its bytes into ordinary characters such as AfHM65Hkgf7MM.
I'm trying to write a function which downloads a file at a specific URL. The function produces a corrupt file unless I make the buffer an array of size 1 (as it is in the code below).
The ternary statement above the buffer initialization (which I plan to use) along with hard-coded integer values other than 1 will manufacture a corrupted file.
Note: MAX_BUFFER_SIZE is a constant, defined as 8192 (2^13) in my code.
public static void downloadFile(String webPath, String localDir, String fileName) {
try {
File localFile;
FileOutputStream writableLocalFile;
InputStream stream;
url = new URL(webPath);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
int size = connection.getContentLength(); //File size in bytes
int read = 0; //Bytes read
localFile = new File(localDir);
//Ensure that directory exists, otherwise create it.
if (!localFile.exists())
localFile.mkdirs();
//Ensure that file exists, otherwise create it.
//Note that if we define the file path as we do below initially and call mkdirs() it will create a folder with the file name (I.e. test.exe). There may be a better alternative, revisit later.
localFile = new File(localDir + fileName);
if (!localFile.exists())
localFile.createNewFile();
writableLocalFile = new FileOutputStream(localFile);
stream = connection.getInputStream();
byte[] buffer;
int remaining;
while (read != size) {
remaining = size - read; //Bytes still to be read
//remaining > MAX_BUFFER_SIZE ? MAX_BUFFER_SIZE : remaining
buffer = new byte[1]; //Adjust buffer size according to remaining data (to be read).
read += stream.read(buffer); //Read buffer-size amount of bytes from the stream.
writableLocalFile.write(buffer, 0, buffer.length); //Args: Bytes to read, offset, number of bytes
}
System.out.println("Read " + read + " bytes.");
writableLocalFile.close();
stream.close();
} catch (Throwable t) {
t.printStackTrace();
}
}
The reason I've written it this way is so I may provide a real time progress bar to the user as they are downloading. I've removed it from the code to reduce clutter.
int len = stream.read(buffer);
read += len;
writableLocalFile.write(buffer, 0, len);
You must not use buffer.length as the number of bytes read; use the return value of the read call. read may return fewer bytes than the buffer holds, and then the rest of the buffer contains junk (zero bytes or data from previous reads) after the bytes that were actually read.
Also, instead of calculating the remaining bytes and allocating a new buffer each time, just use a fixed buffer of 16k or so. The last read will be short, which is fine.
InputStream.read() may read fewer bytes than you requested, but you always append the whole buffer to the file. You need to capture the actual number of bytes read and append only those bytes to the file.
Additionally:
Watch for InputStream.read() returning -1 (EOF).
The server may return an incorrect size, so the check read != size is dangerous. I would advise not relying on the Content-Length HTTP field at all. Instead, just keep reading from the input stream until you hit EOF.
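Putting those points together, a copy loop might look roughly like the sketch below. It uses one fixed buffer, writes only the bytes actually read, and stops at EOF instead of trusting Content-Length. Since the question mentions a progress bar, it also reports a running total through a callback; the ProgressListener interface and the Downloads class are hypothetical names, not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class Downloads {
    /** Hypothetical callback, e.g. for updating a progress bar. */
    interface ProgressListener {
        void onProgress(long bytesSoFar);
    }

    /** Copies in to out until EOF and returns the number of bytes copied. */
    static long copyToEnd(InputStream in, OutputStream out, ProgressListener listener)
            throws IOException {
        byte[] buffer = new byte[16 * 1024]; // one fixed buffer, reused for every read
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);       // only the bytes actually read
            total += len;
            if (listener != null) {
                listener.onProgress(total);
            }
        }
        out.flush();
        return total;
    }
}
```

In the question's code this would replace the while (read != size) loop, called as copyToEnd(stream, writableLocalFile, listener). The value from getContentLength() can still be used as the progress bar's maximum when it is positive, but it no longer decides when the loop ends.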
I have some large base64 encoded data (stored in snappy files in the hadoop filesystem).
This data was originally gzipped text data.
I need to be able to read chunks of this encoded data, decode it, and then flush it to a GZIPOutputStream.
Any ideas on how I could do this instead of loading the whole base64 data into an array and calling Base64.decodeBase64(byte[]) ?
Am I right if I read the characters till the '\r\n' delimiter and decode it line by line?
e.g. :
for (int i = 0; i < byteData.length; i++) {
if (byteData[i] == CARRIAGE_RETURN || byteData[i] == NEWLINE) {
if (i < byteData.length - 1 && byteData[i + 1] == NEWLINE)
i += 2;
else
i += 1;
byteBuffer.put(Base64.decodeBase64(record));
byteCounter = 0;
record = new byte[8192];
} else {
record[byteCounter++] = byteData[i];
}
}
Sadly, this approach doesn't give any human readable output.
Ideally, I would like to stream read, decode, and stream out the data.
Right now, I'm trying to put in an inputstream and then copy to a gzipout
byteBuffer.get(bufferBytes);
InputStream inputStream = new ByteArrayInputStream(bufferBytes);
inputStream = new GZIPInputStream(inputStream);
IOUtils.copy(inputStream , gzipOutputStream);
And it gives me a
java.io.IOException: Corrupt GZIP trailer
Let's go step by step:
You need a GZIPInputStream to read zipped data (that and not a GZIPOutputStream; the output stream is used to compress data). Having this stream you will be able to read the uncompressed, original binary data. This requires an InputStream in the constructor.
You need an input stream capable of reading the Base64 encoded data. I suggest the handy Base64InputStream from apache-commons-codec. With the constructor you can set the line length, the line separator and set doEncode=false to decode data. This in turn requires another input stream - the raw, Base64 encoded data.
This stream depends on how you get your data; ideally the data should be available as an InputStream - problem solved. If not, you may have to use a ByteArrayInputStream (if binary), a ByteArrayInputStream over the string's bytes (if string; StringBufferInputStream is deprecated), etc.
Roughly this logic is:
InputStream fromHadoop = ...; // 3rd paragraph
Base64InputStream b64is = // 2nd paragraph
new Base64InputStream(fromHadoop, false, 80, "\n".getBytes("UTF-8"));
GZIPInputStream zis = new GZIPInputStream(b64is); // 1st paragraph
Please pay attention to the arguments of Base64InputStream (line length and end-of-line byte array), you may need to tweak them.
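If commons-codec is not an option, a similar stream chain can be sketched with the JDK alone. Note that this swaps Base64InputStream for java.util.Base64 (Java 8 and later), whose MIME decoder skips line separators, so no line length has to be configured. The class and method names (Base64GzipStreams, decodeAndGunzip) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;
import java.util.zip.GZIPInputStream;

class Base64GzipStreams {
    /**
     * Streams Base64 text -> gzip bytes -> original bytes into out,
     * without loading the whole input into memory.
     */
    static void decodeAndGunzip(InputStream base64In, OutputStream out) throws IOException {
        // The MIME decoder ignores CR/LF and other characters outside the Base64 alphabet.
        InputStream gzipBytes = Base64.getMimeDecoder().wrap(base64In);
        try (GZIPInputStream plain = new GZIPInputStream(gzipBytes)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = plain.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        out.flush();
    }
}
```

If the goal is to store the data gzipped again rather than as plain text, the GZIPInputStream layer can be dropped and the decoded bytes copied straight to a file, since the decoded data is already gzip-compressed.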
Thanks to Nikos for pointing me in the right direction.
Specifically this is what I did:
private static final byte NEWLINE = (byte) '\n';
private static final byte CARRIAGE_RETURN = (byte) '\r';
byte[] lineSeparators = new byte[] {CARRIAGE_RETURN, NEWLINE};
Base64InputStream b64is = new Base64InputStream(inputStream, false, 76, lineSeparators);
GZIPInputStream zis = new GZIPInputStream(b64is);
Isn't 76 the length of the Base64 line? I didn't try with 80, though.
I am trying to send multiple files from my server (NanoHttpd) to my client (Apache DefaultHttpClient).
My approach is to send multiple files in one NanoHttpd Response.
For this purpose I wanted to use SequenceInputStream.
I am trying to concatenate multiple files, send them via the Response (InputStream), and have my client write each one back out to a separate file.
On the server side I call this:
List<InputStream> data = new ArrayList<InputStream>(o_file_path.size());
for (String file_name : files)
{
File file = new File(file_name);
data.add(new FileInputStream(file));
}
InputStream is = new SequenceInputStream(Collections.enumeration(data));
return new NanoHTTPD.Response(HTTP_OK, "application/octet-stream", is);
Now my Question is how to receive and split the Files correctly.
I have tried it this way on my client, but it does not work:
int read = 0;
int remaining = 0;
byte[] bytes = new byte[buffer];
// Read till the end of the Stream
while ( (read != -1) && (counter < files.size()))
{
// Create a .o file for the current file
read = 0;
remaining = is.available();
// Should open each Stream
while (remaining > 0)
{
read = is.read(bytes);
remaining = remaining - read;
os.write(bytes, 0, read);
}
os.flush();
os.close();
}
This way I want to go over the whole stream (until read == -1, or until I know there are no more files) and write each stream to its own file.
I am clearly misunderstanding something fundamental, since is.available() is always 0.
Could anyone please tell me how to read properly from this SequenceInputStream, or how else to solve my problem?
Thanks in advance.
It won't work this way. SequenceInputStream merges all the input streams into one solid byte stream. There will be no separators or EOFs between the files. I suggest abandoning the idea and looking for a different approach.
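One possible different approach (not the only one, and not something NanoHttpd provides out of the box) is to add the missing separators yourself by prefixing each file with its length. The sketch below keeps payloads in memory for brevity; real code would stream each file to disk instead. The class and method names (FramedFiles, writeAll, readAll) are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class FramedFiles {
    /** Writes the number of payloads, then each payload prefixed with its length. */
    static void writeAll(OutputStream rawOut, List<byte[]> payloads) throws IOException {
        DataOutputStream out = new DataOutputStream(rawOut);
        out.writeInt(payloads.size());
        for (byte[] payload : payloads) {
            out.writeInt(payload.length);
            out.write(payload);
        }
        out.flush();
    }

    /** Reads back what writeAll produced, one payload per original file. */
    static List<byte[]> readAll(InputStream rawIn) throws IOException {
        DataInputStream in = new DataInputStream(rawIn);
        int count = in.readInt();
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int length = in.readInt();
            byte[] payload = new byte[length];
            in.readFully(payload); // blocks until the whole payload has arrived
            payloads.add(payload);
        }
        return payloads;
    }
}
```

The client no longer needs available(), which only reports what can be read without blocking and says nothing about where one file ends. Other common choices are a ZIP archive or one HTTP request per file.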
I have some working code in Python that I need to convert to Java.
I have read quite a few threads on this forum but could not find an answer. I am reading in a JPG image and converting it into a byte array. I then write this buffer to a different file. When I compare the files written by the Java and Python code, the bytes at the end do not match. Please let me know if you have a suggestion. I need to use the byte array to pack the image into a message that needs to be sent over to a remote server.
Java code (Running on Android)
Reading the file:
File queryImg = new File(ImagePath);
int imageLen = (int)queryImg.length();
byte [] imgData = new byte[imageLen];
FileInputStream fis = new FileInputStream(queryImg);
fis.read(imgData);
Writing the file:
FileOutputStream f = new FileOutputStream(new File("/sdcard/output.raw"));
f.write(imgData);
f.flush();
f.close();
Thanks!
InputStream.read is not guaranteed to read any particular number of bytes and may read less than you asked it to. It returns the actual number read so you can have a loop that keeps track of progress:
public void pump(InputStream in, OutputStream out, int size) throws IOException {
    byte[] buffer = new byte[4096]; // Or whatever constant you feel like using
    int done = 0;
    while (done < size) {
        // Ask for at most the bytes still expected, so we never read past size
        int read = in.read(buffer, 0, Math.min(buffer.length, size - done));
        if (read == -1) {
            throw new IOException("Something went horribly wrong");
        }
        out.write(buffer, 0, read);
        done += read;
    }
    // Maybe put cleanup code in here if you like, e.g. in.close, out.flush, out.close
}
I believe Apache Commons IO has classes for doing this kind of stuff so you don't need to write it yourself.
Your file length might be more than an int can hold, and then you end up with the wrong array length and do not read the entire file into the buffer.