Copying files from a remote address seems to lose information - java

I have a program that reads a file from a webpage and writes it to a file. Most of the time this works well, but occasionally the file gets corrupted. I guess this has something to do with network issues. What can I do to make my code more robust?
String filename = "myfile.txt";
File file = new File(PROFilePath+"/"+filename);
//Open the connection
URL myCon = new URL("url to a page");
URLConnection uc = myCon.openConnection();
FileOutputStream outputStream = new FileOutputStream(file);
int read = 0;
byte[] bytes = new byte[1024];
while ((read = uc.getInputStream().read(bytes)) != -1) {
outputStream.write(bytes, 0, read);
}
uc.getInputStream().close();
outputStream.close();

You are not using an explicit encoding for your copy; you are merely copying all bytes and writing them to a file, which might later be read with a different encoding. An easy way to check this is to compare the bytes of the document at the remote address with those of the copied file once you discover a "broken" file. However, the information you provide is not detailed enough to give more specific help. Is there an example document you are having trouble with? Check out this related question and answer as well as this thread for a deeper discussion of this issue.
As to your suspicion: the connection should not simply lose bytes while you are reading from the remote address. This would be a very serious bug in the implementation, as you connect via TCP (I guess the URL's protocol is HTTP), where lost packets are retransmitted automatically. And if the connection breaks, it should throw an exception instead of failing silently. I do not think that this is the source of your error.
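The byte comparison suggested above could be sketched roughly as follows. This is only an illustration, not code from the original answer: the class name ByteCompare and the method firstMismatch are hypothetical, and it assumes that both a trusted copy and the suspect copy can be opened as streams (for example a second download and the local file).

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ByteCompare {
    /**
     * Returns the offset of the first differing byte, or -1 if both
     * streams have identical content and identical length.
     * (Hypothetical helper, named for illustration only.)
     */
    public static long firstMismatch(InputStream expected, InputStream actual) throws IOException {
        try (InputStream a = new BufferedInputStream(expected);
             InputStream b = new BufferedInputStream(actual)) {
            long offset = 0;
            while (true) {
                int x = a.read();
                int y = b.read();
                if (x != y) {
                    return offset; // bytes differ here, or one stream ended early
                }
                if (x == -1) {
                    return -1; // both streams ended together without a difference
                }
                offset++;
            }
        }
    }
}
```

A mismatch at the very end of the shorter stream points to truncation, while a mismatch in the middle points to altered data (for example a text-mode transfer or an intermediary rewriting the content).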

Related

Java NIO read large file from inputstream

I want to read a large InputStream and return it as a file. So I need to split the InputStream (or read it in multiple threads). How can I do this? I'm trying to do something like this:
URL url = new URL("path");
URLConnection connection = url.openConnection();
int fileSize = connection.getContentLength();
InputStream is = connection.getInputStream();
ReadableByteChannel rbc1 = Channels.newChannel(is);
ReadableByteChannel rbc2 = Channels.newChannel(is);
FileOutputStream fos = new FileOutputStream("file.ext");
FileChannel fileChannel1 = fos.getChannel();
FileChannel fileChannel2 = fos.getChannel();
fileChannel1.transferFrom(rbc1, 0, fileSize/2);
fileChannel2.transferFrom(rbc2, fileSize/2, fileSize/2);
fos.close();
But it has no effect on performance.
You can open multiple (HTTP) connections to the same resource (URL) but use the HTTP Range header to make each stream start reading at a different offset. This can actually speed up the data transfer, especially when high latency is an issue. Do not overdo the parallelism; be aware that it puts additional load on the server.
connection1.setRequestProperty("Range", "bytes=0-" + half);
// Parentheses are required: without them, "+ 1" is string concatenation
connection2.setRequestProperty("Range", "bytes=" + (half + 1) + "-");
This can also be used to resume downloads. It needs to be supported by the server, which can announce this with Accept-Ranges: bytes but does not have to. Be prepared for the first connection to return the whole requested entity (status 200 vs. 206) instead.
You need to read the input streams from the URLConnections in separate threads as this is blocking IO (not sure if the NIO wrapping helps here).
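Generalizing the two Range headers above to any number of parts might look like the sketch below. The class RangeSplitter and its split method are hypothetical names introduced for illustration; the sketch assumes the total size is already known (for example from getContentLength()) and that the server honors range requests.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitter {
    /**
     * Splits the byte range [0, totalSize) into `parts` contiguous,
     * non-overlapping Range header values. Byte ranges are inclusive
     * on both ends, so the last byte has index totalSize - 1.
     */
    public static List<String> split(long totalSize, int parts) {
        if (totalSize <= 0 || parts <= 0 || parts > totalSize) {
            throw new IllegalArgumentException("need 0 < parts <= totalSize");
        }
        List<String> ranges = new ArrayList<>();
        long chunk = totalSize / parts;
        long start = 0;
        for (int i = 0; i < parts; i++) {
            // The last part absorbs the remainder of the division.
            long end = (i == parts - 1) ? totalSize - 1 : start + chunk - 1;
            ranges.add("bytes=" + start + "-" + end);
            start = end + 1;
        }
        return ranges;
    }
}
```

Each returned value would be passed to setRequestProperty("Range", value) on its own connection. Checking that getResponseCode() on an HttpURLConnection equals HttpURLConnection.HTTP_PARTIAL (206) tells you whether the server actually returned a partial response.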
You can use the position(long) method on each channel to set where it starts reading.
Check this.
http://tutorials.jenkov.com/java-nio/file-channel.html#filechannel-position
Besides, if you want to download a file partially:
Parallel Downloading
To download multiple parts of a file in parallel, we need to create
multiple threads. Each thread is implemented similarly to the simple
thread above, except that it needs to download only a part of the
downloaded file. To do that, HttpURLConnection or its superclass
URLConnection provides the method setRequestProperty to set the range
of bytes we want to download.
// open Http connection to URL
HttpURLConnection conn = (HttpURLConnection)mURL.openConnection();
// set the range of byte to download
String byteRange = mStartByte + "-" + mEndByte;
conn.setRequestProperty("Range", "bytes=" + byteRange);
// connect to server
conn.connect();
This should be helpful for you.
I found this answer here; you can check the complete tutorial.
http://luugiathuy.com/2011/03/download-manager-java/

Who is tampering with my data stream?

The piece of code below downloads a file from some URL and saves it to a local file. Piece of cake. What could possibly be wrong here?
protected long download(ProgressMonitor montitor) throws Exception{
long size = 0;
DataInputStream dis = new DataInputStream(is);
int read = 0;
byte[] chunk = new byte[chunkSize];
while( (read = dis.read(chunk)) != -1){
os.write(chunk, 0, read);
size += read;
if(montitor != null)
montitor.worked(read);
}
chunk = null;
dis.close();
os.flush();
os.close();
return size;
}
The reason I am posting a question here is that it works 99.999% of the time and doesn't work as expected whenever there is an antivirus or some other protection software installed on the computer running this code. I am blindly pointing a finger that way because whenever I stop (or disable) it, the code works perfectly again. The end result of such interference is that the MD5 of the downloaded file doesn't match the expected one, and a whole new saga begins.
So, the question is: is it really possible that some smart "protection" software would alter the actual stream coming from the URL without me knowing about it? And if yes, how do you deal with this? (Verified with Kaspersky and Norton products.)
EDIT-1:
Apparently I've got a hold on the problem and it's got nothing to do with antiviruses. The download takes place from an FTP server (FileZilla in particular) and we use Apache Commons FTP on the client side. What I did was go to the FTP server and terminate the connection (kick it out) in the middle of the download. I expected that is.read(..) would throw an IOException on the client side, but this never happened. Instead, is.read(..) returns -1, meaning that there is no more data coming from the stream. This is definitely unexpected and explains why I sometimes get partial files. It doesn't explain, however, why the data sometimes gets altered as well.
Yeah, this happens to me all the time. In my case it's caused by transparent HTTP proxying by Websense on my corporate network. The worst problems are caused by the block page being returned with 200 OK.
Do you get the same or similar corruption every time? E.g., do you get some HTML explaining why the request was blocked? The best you can probably do is compare the first few bytes of the downloaded data to some text in the block page, and throw an exception in this case.
Edit: based on your update, have you got the FTP client set to image/binary mode?
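Since read() may return -1 on a dropped connection and intermediaries may substitute content, one defensive option is to verify the download against a size and MD5 obtained out of band. The following is a sketch under that assumption; VerifiedCopy and its parameters are hypothetical names, not part of the code in the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class VerifiedCopy {
    /**
     * Copies in to out, then fails loudly if the byte count or the MD5
     * differs from what the caller expected. expectedMd5Hex is a hex
     * string and is compared case-insensitively.
     */
    public static long copy(InputStream in, OutputStream out,
                            long expectedLength, String expectedMd5Hex)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            md5.update(buffer, 0, read);
            total += read;
        }
        out.flush();
        // A premature -1 from read() shows up here as a short byte count.
        if (total != expectedLength) {
            throw new IOException("Truncated: got " + total + " of " + expectedLength + " bytes");
        }
        String actual = toHex(md5.digest());
        if (!actual.equalsIgnoreCase(expectedMd5Hex)) {
            throw new IOException("MD5 mismatch: expected " + expectedMd5Hex + ", got " + actual);
        }
        return total;
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

The length check catches the silent truncation described in EDIT-1, and the digest check catches altered content such as a block page or a text-mode transfer. Neither check explains the cause, but both turn a silently corrupt file into an explicit error that can trigger a retry.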

File transfer through Socket in java

I'm making a network file transfer system for transferring any kind of file over a network in Java. The files can be of any size. Therefore I've used the UTF-8 protocol for the task.
I'm providing the code I've written, but the problem is that sometimes the file gets transferred intact, with no problem at all, but sometimes a few KB of data are skipped at the receiving end, which prevents the mp3/video/image file from being opened correctly. I think the problem is with the buffer. I'm not creating any buffer, which I now think might be of some use to me.
I would really appreciate any help with the problem, so that the file gets transferred fully.
Client side : --->> File Sender
Socket clientsocket = new Socket(host,6789); // host contains the ip address of the remote server
DataOutputStream outtoserver = new DataOutputStream(clientsocket.getOutputStream());
try
{
int r=0;
FileInputStream fromFile1 = new FileInputStream(path); // "path" is the path of the file being sent.
while(r!=-1)
{
r = fromFile1.read();
outtoserver.writeUTF(r+"");
}
}
catch(Exception e)
{
System.out.println(e.toString());
}
clientsocket.close();
Server side: --->> File Receiver
ServerSocket welcome = new ServerSocket(6789);
Socket conn = welcome.accept();
try
{
String r1 = new String();
int r=0;
FileOutputStream toFile1 = new FileOutputStream(path); // "path" is the path of the file being received.
BufferedOutputStream toFile= new BufferedOutputStream(toFile1);
DataInputStream recv = new DataInputStream(conn.getInputStream());
while(r!=-1)
{
r1 = recv.readUTF();
r = Integer.parseInt(r1);
toFile.write(r);
}
}
catch(Exception e)
{
System.out.println(e.toString());
}
I don't understand why you are encoding binary data as text.
Plain sockets can send and receive streams of bytes without any problems. So, just read the file as bytes using a FileInputStream and write the bytes to the socket as-is.
(For the record, what you are doing is probably sending 3 to 5 bytes for each byte of the input file. And you are reading the input file one byte at a time without any buffering. These mistakes and others you have made are likely to have a significant impact on file transfer speed. The way to get performance is to simply read and write arrays of bytes using a buffer size of at least 1K bytes.)
I'm not sure of this, but I suspect that the reason that you are losing some data is that you are not flushing or closing outtoserver before you close the socket on the sending end.
FOLLOW UP
I also noticed that you are not flushing / closing toFile on the receiver end, and that could result in you losing data at the end of the file.
The first problem I see is that you're using DataInputStream and DataOutputStream. These are for reading/writing primitive Java types (int, long etc), you don't need them for just binary data.
Another problem is that you're not flushing your file output stream - this could be causing the lost bytes.
An explicit flush might help the situation.
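The advice in the answers above (send raw bytes, use a byte-array buffer, and flush or close both ends) could be sketched as follows. RawTransfer and its method names are hypothetical; the streams passed in are assumed to come from the socket's getOutputStream() and getInputStream().

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class RawTransfer {
    /** Sender side: streams the file's bytes unchanged to the socket's output stream. */
    public static long send(Path file, OutputStream socketOut) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file));
             OutputStream out = new BufferedOutputStream(socketOut)) {
            return pump(in, out);
        } // closing `out` flushes any buffered bytes and closes the socket stream
    }

    /** Receiver side: writes everything up to end of stream into the target file. */
    public static long receive(InputStream socketIn, Path target) throws IOException {
        try (InputStream in = new BufferedInputStream(socketIn);
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            return pump(in, out);
        } // closing `out` flushes the tail of the file to disk
    }

    private static long pump(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

The try-with-resources blocks guarantee that both buffered streams are flushed and closed even when an exception occurs, which addresses the lost bytes at the end of the file that the answers suspect.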

Problem with Sending and Receiving Files with SPP over Bluetooth

I am attempting to transfer files (MP3s about six megabytes in size) between two PCs using SPP over Bluetooth (in Java, with the BlueCove API). I can get the file transfer working fine in one direction (for instance, one file from the client to the server), but when I attempt to send any data in the opposite direction during the same session (i.e., send a file from the server to the client), the program freezes and will not advance.
For example, if I simply:
StreamConnection conn;
OutputStream outputStream;
outputStream = conn.openOutputStream();
....
outputStream.write(data); //Data here is an MP3 file converted to byte array
outputStream.flush();
The transfer works fine. But if I try:
StreamConnection conn;
OutputStream outputStream;
InputStream inputStream;
ByteArrayOutputStream out = new ByteArrayOutputStream();
outputStream = conn.openOutputStream();
inputStream = conn.openInputStream();
....
outputStream.write(data);
outputStream.flush();
int receiveData;
while ((receiveData = inputStream.read()) != -1) {
out.write(receiveData);
}
Both the client and the server freeze, and will not advance. I can see that the file transfer is actually happening at some point, because if I kill the client, the server will still write the file to the hard drive, with no issues. I can try to respond with another file, or with just an integer, and it still will not work.
Anyone have any ideas what the problem is? I know OBEX is commonly used for file transfers over Bluetooth, but it seemed overkill for what I needed to do. Am I going to have to use OBEX for this functionality?
It could be as simple as both programs being stuck in blocking receive calls, waiting for the other end to say something. Try adding a ton of log statements so you can see what "state" each program is in (i.e., so it gives you a running commentary such as "trying to receive", "got xxx data", "trying to reply", etc.), or set up debugging, wait until it gets stuck, and then stop one of them and single-step it.
You can certainly use SPP to transfer files between your applications (assuming you are sending and receiving at both ends using your application). From the code snippet it is difficult to tell what is wrong with your program.
I am guessing that you will have to close the stream as an indication to the other side that you are done sending the data. Note that even though you write the whole file in one chunk, the SPP / Bluetooth protocol layers might fragment it and the other end could receive it in fragments, so you need some protocol to indicate transfer completion.
It is hard to say without looking at the client side code, but my guess, if the two are running the same code (i.e. both writing first, and then reading), is that the outputStream needs to be closed before the reading occurs (otherwise, both will be waiting for the other to close their side in order to get out of the read loop, since read() only returns -1 when the other side closes).
If the stream should not be closed, then the condition to stop reading cannot be to wait for -1. (so, either change it to transmit the file size first, or some other mechanism).
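Transmitting the size first, as suggested above, could look roughly like this. LengthPrefixed and its methods are hypothetical names; the sketch assumes each file fits in a byte array (true for MP3s of a few megabytes) and that both ends agree on a 4-byte length prefix.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class LengthPrefixed {
    /** Writes a 4-byte big-endian length followed by the payload, then flushes. */
    public static void writeMessage(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush(); // do not close: the connection stays open for the reply
    }

    /** Reads exactly one message; it never waits for end of stream. */
    public static byte[] readMessage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0) {
            throw new IOException("Negative length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload); // throws EOFException if the peer disconnects early
        return payload;
    }
}
```

Because the reader knows how many bytes to expect, it returns as soon as the file is complete, and the same connection can then carry a reply in the opposite direction without either side closing its stream.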
Why did you decide to use ByteArrayOutputStream? Try the following code:
try {
    try {
        byte[] buf = new byte[1024];
        int n;
        outputStream = conn.openOutputStream();
        inputStream = conn.openInputStream();
        // Copy in 1 KB chunks; read() returns -1 at end of stream
        while ((n = inputStream.read(buf, 0, buf.length)) > -1)
            outputStream.write(buf, 0, n);
    } finally {
        outputStream.close();
        inputStream.close();
        log.debug("Closed input streams!");
    }
} catch (Exception e) {
    log.error(e);
    e.printStackTrace();
}
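And to convert the received data you could do something like this:
// Assumption: the received bytes were collected in a ByteArrayOutputStream (named "out" in the question).
// toByteArray() keeps the bytes intact, whereas toString().getBytes() would corrupt binary data.
byte[] currentMP3Bytes = out.toByteArray();
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(currentMP3Bytes);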

Java: IOException on write in HttpServlet

I've written a proxy of sorts in Java (and Jetty). Anyway, it works great, but sometimes
...
final OutputStream realOs = res.getOutputStream();
...
InputStream is = url.openStream();
int i;
while ((i = is.read(buffer)) != -1) {
realOs.write(buffer, 0, i);
}
fails with IOException. I've noticed that it mostly happens with large binary files, i.e. flash and Safari browser...
I'm puzzled...
This can happen if the browser is closed (or the user cancels the download) while you're still writing to the socket. The browser closes the socket, so your OutputStream no longer has anything to write to.
Unfortunately it's hard to tell for sure whether this is really the case - in which case it's not an issue - or whether there's something more insidious going on.
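One way to narrow this down is to separate failures on the read side (the upstream URL) from failures on the write side (the browser). The sketch below does that with a hypothetical ProxyCopy class and ClientAbortedException; it assumes that a failed write usually means the client went away, which, as noted above, cannot be known for certain.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ProxyCopy {
    /** Signals that writing to the client failed, typically because it disconnected. */
    public static class ClientAbortedException extends IOException {
        public ClientAbortedException(IOException cause) {
            super("Client closed the connection", cause);
        }
    }

    /** Copies upstream to client and returns the number of bytes written. */
    public static long copy(InputStream upstream, OutputStream client) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        // Failures while reading from upstream propagate unchanged.
        while ((read = upstream.read(buffer)) != -1) {
            try {
                client.write(buffer, 0, read);
            } catch (IOException e) {
                // Write-side failure: wrap it so callers can tell the two cases apart.
                throw new ClientAbortedException(e);
            }
            total += read;
        }
        return total;
    }
}
```

A servlet could then log ClientAbortedException quietly and treat any other IOException as a genuine upstream problem worth investigating.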
