Hello My Respected Seniors :)
My goal: download a resource from a given URL using multithreading in Java, i.e. download a single file in multiple pieces (much like IDM does) and, at the end of the download, combine them all into one final file.
Technologies used: Java, RandomAccessFile, multithreading, InputStream
Problem:
The file is downloaded with the exact size in KB (I've checked many times), but the final file is corrupted. For example, if I download an image, it is somewhat blurry; if I download an .exe, it downloads fine, but when I run the .exe file it says "media is damaged, retry download".
This is my main code, which starts the thread class with parameters such as the file name, the starting and ending byte range for each connection, and a JProgressBar for every thread, which each thread updates on its own.
public void InitiateDownload()
{
HttpURLConnection uc = (HttpURLConnection) url.openConnection();
uc.connect();
long fileSize = uc.getContentLengthLong();
System.out.println("File Size = "+ fileSize );
uc.disconnect();
chunkSize = (long) Math.ceil(fileSize/6);
startFrom = 0;
endRange = (startFrom + chunkSize) - 1;
Thread t1 = new MyThread(url, fileName, startFrom, endRange, progressBar_1);
t1.start();
//-----------------------------------------
startFrom += chunkSize;
endRange = endRange + chunkSize;
System.out.println("Part 2 :: Start = " + startFrom + "\tEnd To = " + endRange );
Thread t2 = new MyThread(url, fileName, startFrom, endRange, progressBar_2);
t2.start();
//-----------------------------------------
//..
//..
//..
//-----------------------------------------
startFrom += chunkSize;
long temp = endRange + chunkSize;
endRange = temp + (fileSize - temp); //add any remaining bits, that were rounded off in division
Thread t6 = new MyThread(url, fileName, startFrom, endRange, progressBar_6);
t6.start();
//-----------------------------------------
}
Here is the run() method of the MyThread class:
public void run() {
Thread.currentThread().setPriority(MAX_PRIORITY);
System.setProperty("http.proxyHost", "192.168.10.50");
System.setProperty("http.proxyPort", "8080");
HttpURLConnection uc = null;
try {
uc = (HttpURLConnection) url.openConnection();
uc.setRequestProperty("Range", "bytes="+startFrom+"-"+range);
uc.connect();
fileSize = uc.getContentLengthLong();
inStream = uc.getInputStream();
int[] buffer = new int[ (int) totalDownloadSize ];
file.seek(startFrom); //adjusted start of file
This is where I think the problem is.
run() continued:
for(int i = 0 ; i < totalDownloadSize; i++)
{
buffer[i] = inStream.read();
file.write(buffer[i]);
//Updating Progress bars
totalDownloaded = totalDownloaded + 1;
int downloaded = (int) (100 * ( totalDownloaded/ (float) totalDownloadSize)) ;
progressbar.setValue( downloaded );
}
System.err.println( Thread.currentThread().getName() + "'s download is Finished!");
uc.disconnect();
}
catch(IOException e) {
System.err.println("Exception in " + Thread.currentThread().getName() + "\t Exception = " + e );
}
finally {
try {
file.close();
if(inStream!=null)
inStream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Now the file is downloaded at its full size, but as I said, a small part of it is corrupt.
If I replace the for loop with the following while loop, the problem is completely solved.
int bytesRead = 0;
byte[] buffer = new byte[ (int) totalDownloadSize ];
file.seek(startFrom); //adjusted start of file
while( (bytesRead = inStream.read(buffer) ) != -1 ) {
file.write(buffer, 0, bytesRead);
}
But I need the for loop to measure how much of the file each thread has downloaded, so I can update each thread's respective JProgressBar.
Kindly help me out with the for loop logic, or advise me on how I can update the JProgressBars from within the while loop; I can't seem to find a way to quantify how much of the file a thread has downloaded.
I've spent a lot of hours on this and I'm extremely tired now...
You can use the while loop that works, and then keep track of the total number of bytes read, like this:
int totalRead = 0;
while ((bytesRead = inStream.read(buffer)) != -1) {
totalRead += bytesRead;
file.write(buffer, 0, bytesRead);
progressBar.setValue((int) (100.0 * totalRead / totalDownloadSize)); // convert to a 0-100 percentage
}
Just remember that for (a; b; c) { ... } is equivalent to a; while (b) { ...; c; }.
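A caveat worth adding: Swing components should only be updated on the Event Dispatch Thread, and this loop runs on a worker thread. A sketch of the same idea that pushes the update through SwingUtilities.invokeLater (assuming progressbar, file, inStream, startFrom and totalDownloadSize are the fields from the MyThread class) could look like:
long totalRead = 0;
int bytesRead;
byte[] buffer = new byte[8192];              // modest reusable buffer instead of one giant array
file.seek(startFrom);                        // adjusted start of file, as in the original
while ((bytesRead = inStream.read(buffer)) != -1) {
    file.write(buffer, 0, bytesRead);
    totalRead += bytesRead;
    final int percent = (int) (100 * totalRead / (double) totalDownloadSize);
    javax.swing.SwingUtilities.invokeLater(new Runnable() {
        public void run() {
            progressbar.setValue(percent);   // touch the JProgressBar only on the EDT
        }
    });
}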
Related
So I've been trying to read the content of a text file and write it chunk by chunk, alternately, into e.g. 2 new files.
I already tried multiple ways to do that, but it won't work (OutputStream and FileOutputStream seem to be the most suitable).
Earlier I tried to split the file into e.g. 3 parts and wrote the first part to one file, the second part to another, and so on, which worked perfectly fine with OutputStream and FileOutputStream.
But it won't work when I want to do it alternately.
To do it alternately I use the round-robin algorithm, which on its own works fine.
I would be really thankful if you could show me some examples of how to do this!
public void splitFile(String filePath, int numberOfParts, long sizeOfParts[]) throws FileNotFoundException, IOException, SQLException {
long bytes = 8;
OutputStream partsPath[] = new OutputStream[numberOfParts];
long bytePositition[] = new long[numberOfParts];
long copy_size[] = new long[numberOfParts];
for (int i = 0; i < numberOfParts; i++) {
copy_size[i] = sizeOfParts[i];
partsPath[i] = new FileOutputStream(path); //Gets Path from my Database (works)
//System.out.println(cloudsTable.getCloudsPathsFromDatabase(i) + '\\' + name + (i + 1) + fileType);
}
InputStream file = new FileInputStream(filePath);
while (true) {
boolean done = true;
for (int i = 0; i < numberOfParts; i++) {
if (copy_size[i] > 0) {
done = false;
if (copy_size[i] > bytes) {
copy_size[i] -= bytes;
bytePositition[i] += bytes;
System.out.println("file " + i + " " + bytePositition[i]);
readWrite(file, bytePositition[i], partsPath[i]);
} else {
bytePositition[i] += copy_size[i];
System.out.println("rest file " + i + " " + bytePositition[i]);
readWrite(file, bytePositition[i], partsPath[i]);
copy_size[i] = 0;
}
}
}
if (done == true) {
break;
}
}
file.close();
for (int i = 0; i < partsPath.length; i++) {
partsPath[i].close();
}
}
private void readWrite(InputStream file, long bytes, OutputStream path) throws IOException {
byte[] buf = new byte[(int) bytes];
while (file.read(buf) != -1) {
path.write(buf);
path.flush();
}
}
What the code actually does is write the content of the original file only into the first copy; the following files end up empty.
EDIT:
To clarify: the code should write the first 8 bytes to file 1, the second 8 bytes to file 2, the third 8 bytes to file 3, the fourth 8 bytes back to file 1, and so on, round robin, until file 1 is sizeOfParts[0] bytes long, file 2 is sizeOfParts[1] bytes long, and file 3 is sizeOfParts[2] bytes long.
The main problem is that the readWrite() method is only supposed to copy one 8-byte block of bytes, but has a loop that makes it copy all the remaining bytes in the input file.
In addition, the code should be enhanced to use try-finally to close the files, and to correctly handle end-of-file, in case the input file is shorter than the sum of parts.
I would eliminate the readWrite() method, and consolidate the logic to prevent duplicate code, like this:
public void splitFile(String inPath, long[] sizeOfParts) throws IOException, SQLException {
final int numberOfParts = sizeOfParts.length;
String[] outPath = new String[numberOfParts];
// Gets Paths from Database here
InputStream in = null;
OutputStream[] out = new OutputStream[numberOfParts];
try {
in = new BufferedInputStream(new FileInputStream(inPath));
for (int part = 0; part < numberOfParts; part++)
out[part] = new BufferedOutputStream(new FileOutputStream(outPath[part]));
byte[] buf = new byte[8];
long[] remain = sizeOfParts.clone();
for (boolean done = false; ! done; ) {
done = true;
for (int part = 0; part < numberOfParts; part++) {
if (remain[part] > 0) {
int len = in.read(buf, 0, (int) Math.min(remain[part], buf.length));
if (len == -1) {
done = true;
break;
}
remain[part] -= len;
System.out.println("file " + part + " " + (sizeOfParts[part] - remain[part]));
out[part].write(buf, 0, len);
done = false;
}
}
}
} finally {
if (in != null)
in.close();
for (int part = 0; part < out.length; part++)
if (out[part] != null)
out[part].close();
}
}
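For illustration, a hypothetical call (the output paths would still be filled in from the database, as in the original) might look like this:
// Hypothetical example: split input.bin into parts of 1 MB, 1 MB and 2 MB
splitFile("input.bin", new long[] { 1048576L, 1048576L, 2097152L });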
In this file-reader example, the solution focuses on just reading any file and loading it into memory.
I've been working on improving it so that it processes a CSV file while keeping the header in each thread, so each thread can output a separate, correctly formatted CSV file.
Unfortunately I'm not able to do so, since it reads from arbitrary byte locations rather than line boundaries; this means a chunk might start in the middle of a line and I'll get lines mixed up.
Is there a way to adapt this code and make it CSV-specific?
Here is the code I changed:
public static void main(String[] args) throws IOException {
long start = System.currentTimeMillis();
CSVReader reader = new CSVReader(new FileReader("file.csv"));
String[] columnsNames = reader.readNext();
reader.close();
FileInputStream fileInputStream = new FileInputStream("file.csv");
FileChannel channel = fileInputStream.getChannel();
long remaining_size = channel.size(); //get the total number of bytes in the file
long chunk_size = remaining_size / Integer.parseInt("4"); //file_size/threads
//Max allocation size allowed is ~2GB
if (chunk_size > (Integer.MAX_VALUE - 5))
{
chunk_size = (Integer.MAX_VALUE - 5);
}
//thread pool
ExecutorService executor = Executors.newFixedThreadPool(Integer.parseInt("4"));
long start_loc = 0;//file pointer
int i = 0; //loop counter
while (remaining_size >= chunk_size)
{
//launches a new thread
executor.execute(new FileRead(start_loc, toIntExact(chunk_size), channel, i, String.join(",", columnsNames)));
remaining_size = remaining_size - chunk_size;
start_loc = start_loc + chunk_size;
i++;
}
//load the last remaining piece
executor.execute(new FileRead(start_loc, toIntExact(remaining_size), channel, i, String.join(",", columnsNames)));
//Tear Down
executor.shutdown();
//Wait for all threads to finish
while (!executor.isTerminated())
{
//wait for infinity time
}
System.out.println("Finished all threads");
fileInputStream.close();
long finish = System.currentTimeMillis();
System.out.println( "Time elapsed: " + (finish - start) );
}
class FileRead implements Runnable {
private FileChannel _channel;
private long _startLocation;
private int _size;
int _sequence_number;
String _columns;
public FileRead(long loc, int size, FileChannel chnl, int sequence, String header) {
_startLocation = loc;
_size = size;
_channel = chnl;
_sequence_number = sequence;
_columns = header;
}
@Override
public void run() {
try {
System.out.println( "Reading the channel: " + _startLocation + ":" + _size );
//allocate memory
ByteBuffer buff = ByteBuffer.allocate( _size );
//Read file chunk to RAM
_channel.read( buff, _startLocation );
//chunk to String
String string_chunk = new String( buff.array(), Charset.forName( "UTF-8" ) );
string_chunk = _columns + System.getProperty( "line.separator" ) + string_chunk;
if (string_chunk.length() > 0) {
BufferedWriter out = new BufferedWriter( new FileWriter( "output_" + System.currentTimeMillis() + ".csv" ) );
try {
out.write( string_chunk ); //Replace with the string
//you are trying to write
} catch (IOException e) {
System.out.println( "Exception " );
} finally {
out.close();
}
}
System.out.println( "Done Reading the channel: " + _startLocation + ":" + _size );
} catch (Exception e) {
e.printStackTrace();
}
}
}
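One possible direction, sketched here under the assumption that the CSV is UTF-8 and has no embedded newlines inside quoted fields: snap each chunk to line boundaries before a thread parses it, so no chunk starts or ends in the middle of a record. A rough helper (names are illustrative, not from the original code) could look like:
// Illustrative sketch: find the first position after 'pos' that follows a '\n',
// using the same FileChannel the FileRead tasks already share
static long nextLineStart(FileChannel channel, long pos) throws IOException {
    ByteBuffer one = ByteBuffer.allocate(1);
    while (pos < channel.size()) {
        one.clear();
        if (channel.read(one, pos) <= 0) {
            break;
        }
        pos++;
        if (one.get(0) == '\n') {
            break;                      // pos now points just past the newline
        }
    }
    return pos;
}
Each FileRead task would then read from nextLineStart(channel, _startLocation) up to nextLineStart(channel, _startLocation + _size) instead of the raw offsets, and the very first chunk would start right after the header line.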
I'm trying to write an AsyncTask class in my Android application that analyzes the network connection speed for downloading and uploading. I'm working on the download portion now, but I'm not getting the results I expect. I'm testing on a Wi-Fi network that gets 15 Mbps down/up consistently; however, the results I'm getting from my application are barely 1 Mbps, while the speed test apk on the same device gets around 3.5 Mbps. The function works, it just seems to report about half the speed it should. Should the following code produce accurate results?
try {
String DownloadUrl = "http://ipv4.download.thinkbroadband.com:8080/5MB.zip";
String fileName = "testfile.bin";
File dir = new File (context.getFilesDir() + "/temp/");
if(dir.exists()==false) {
dir.mkdirs();
}
URL url = new URL(DownloadUrl); //you can write here any link
File file = new File(context.getFilesDir() + "/temp/" + fileName);
long startTime = System.currentTimeMillis();
Log.d("DownloadManager", "download begining: " + startTime);
Log.d("DownloadManager", "download url:" + url);
Log.d("DownloadManager", "downloaded file name:" + fileName);
/* Open a connection to that URL. */
URLConnection ucon = url.openConnection();
//Define InputStreams to read from the URLConnection.
InputStream is = ucon.getInputStream();
BufferedInputStream bis = new BufferedInputStream(is);
//Read bytes to the Buffer until there is nothing more to read(-1).
ByteArrayBuffer baf = new ByteArrayBuffer(1024);
int current = 0;
while ((current = bis.read()) != -1) {
baf.append((byte) current);
}
long endTime = System.currentTimeMillis(); //maybe
/* Convert the Bytes read to a String. */
FileOutputStream fos = new FileOutputStream(file);
fos.write(baf.toByteArray());
fos.flush();
fos.close();
File done = new File(context.getFilesDir() + "/temp/" + fileName);
Log.d("DownloadManager", "Location being searched: "+ context.getFilesDir() + "/temp/" + fileName);
double size = done.length();
if(done.exists()) {
done.delete();
}
Log.d("DownloadManager", "download ended: " + ((endTime - startTime) / 1000) + " secs");
double rate = (((size / 1024) / ((endTime - startTime) / 1000)) * 8);
rate = Math.round( rate * 100.0 ) / 100.0;
String ratevalue;
if(rate > 1000)
ratevalue = String.valueOf(rate / 1024).concat(" Mbps");
else
ratevalue = String.valueOf(rate).concat(" Kbps");
Log.d("DownloadManager", "download speed: "+ratevalue);
} catch (IOException e) {
Log.d("DownloadManager", "Error: " + e);
}
Example output
10-08 15:09:52.658: D/DownloadManager(13714): download ended: 70 secs
10-08 15:09:52.662: D/DownloadManager(13714): download speed: 585.14 Kbps
Thanks in advance for the help. If there is a better method, please let me know.
Following on from my comments, here is an example of how to read several bytes at a time from the stream:
//Define InputStreams to read from the URLConnection.
InputStream is = ucon.getInputStream();
BufferedInputStream bis = new BufferedInputStream(is);
//I usually use a ByteArrayOutputStream, as it is more common.
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int red = 0;
// This size can be changed
byte[] buf = new byte[1024];
while ((red = bis.read(buf)) != -1) {
baos.write(buf, 0, red);
}
What this does is read into a byte[] buffer and return the number of bytes read; the buffer is in turn written to the OutputStream, specifying how many bytes to write.
ByteArrayOutputStream also has a toByteArray() method that behaves similarly.
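If you still need the file on disk afterwards, you can dump the collected bytes in one go, assuming file is the target File from the question:
FileOutputStream fos = new FileOutputStream(file);
fos.write(baos.toByteArray());   // same idea as baf.toByteArray() in the original code
fos.flush();
fos.close();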
Alternatively, you can write directly to the file, if you consider the write-to-file operation to be significantly faster than the network read:
// Simply start by defining the fileoutputstream
FileOutputStream fos = new FileOutputStream(file);
int red = 0;
// This size can be changed
byte[] buf = new byte[1024];
while ((red = bis.read(buf)) != -1) {
// And directly write to it.
fos.write(buf, 0, red);
}
long endTime = System.currentTimeMillis(); //maybe
// Flush after, as this may trigger a commit to disk.
fos.flush();
fos.close();
Moreover, if you really only care about the download speed, it is not mandatory to write to the file, or anywhere else; this would be sufficient:
long size = 0;
int red;
byte[] buf = new byte[1024];
while ((red = bis.read(buf)) != -1) {
    size += red;
}
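Whichever variant you use, compute the rate with floating-point arithmetic so sub-second timings are not truncated; for example, with startTime and endTime in milliseconds as in the question:
double seconds = (endTime - startTime) / 1000.0;   // avoid integer division
double kbps = (size / 1024.0) / seconds * 8;       // kilobits per second
double mbps = kbps / 1024.0;                       // megabits per second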
I have a use case where I want to upload big gzipped text data files (~60 GB) to HDFS.
My code below is taking about 2 hours to upload these files in chunks of 500 MB. Following is the pseudo-code; I was checking whether somebody could help me reduce this time:
int fileFetchBuffer = 500000000;
System.out.println("file fetch buffer is: " + fileFetchBuffer);
int offset = 0;
int bytesRead = -1;
try {
fileStream = new FileInputStream (file);
if (fileName.endsWith(".gz")) {
stream = new GZIPInputStream(fileStream);
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
String[] fileN = fileName.split("\\.");
System.out.println("fil 0 : " + fileN[0]);
System.out.println("fil 1 : " + fileN[1]);
//logger.info("First line is: " + streamBuff.readLine());
byte[] buffer = new byte[fileFetchBuffer];
FileSystem fs = FileSystem.get(conf);
int charsLeft = fileFetchBuffer;
while (true) {
charsLeft = fileFetchBuffer;
logger.info("charsLeft outside while: " + charsLeft);
FSDataOutputStream dos = null;
while (charsLeft != 0) {
bytesRead = stream.read(buffer, 0, charsLeft);
if (bytesRead < 0) {
dos.flush();
dos.close();
break;
}
offset = offset + bytesRead;
charsLeft = charsLeft - bytesRead;
logger.info("offset in record: " + offset);
logger.info("charsLeft: " + charsLeft);
logger.info("bytesRead in record: " + bytesRead);
//prettyPrintHex(buffer);
String outFileStr = Utils.getOutputFileName(
stagingDir,
fileN[0],
outFileNum);
if (dos == null) {
Path outFile = new Path(outFileStr);
if (fs.exists(outFile)) {
fs.delete(outFile, false);
}
dos = fs.create(outFile);
}
dos.write(buffer, 0, bytesRead);
}
logger.info("done writing: " + outFileNum);
dos.flush();
dos.close();
if (bytesRead < 0) {
dos.flush();
dos.close();
break;
}
outFileNum++;
} // end of if
} else {
// Assume uncompressed file
stream = fileStream;
}
} catch(FileNotFoundException e) {
logger.error("File not found" + e);
}
You should consider using the excellent Apache Commons IO package.
It has a method
IOUtils.copy( InputStream, OutputStream )
that would tremendously reduce the time needed to copy your files.
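A minimal sketch of what that could look like for one output file, reusing file, fs and outFileStr from the question, assuming commons-io is on the classpath, and using IOUtils.copyLarge (the long-returning variant of copy, appropriate for streams over 2 GB):
import org.apache.commons.io.IOUtils;

InputStream in = new GZIPInputStream(new FileInputStream(file));
FSDataOutputStream dos = fs.create(new Path(outFileStr));
try {
    long copied = IOUtils.copyLarge(in, dos);   // buffers internally, returns bytes copied
} finally {
    IOUtils.closeQuietly(in);
    IOUtils.closeQuietly(dos);
}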
I tried with a buffered input stream and saw no real difference.
I suppose a file channel implementation could be even more efficient. Tell me if it's not fast enough.
package toto;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class Slicer {
private static final int BUFFER_SIZE = 50000;
public static void main(String[] args) {
try
{
slice( args[ 0 ], args[ 1 ], Long.parseLong( args[2]) );
}//try
catch (IOException e)
{
e.printStackTrace();
}//catch
catch( Exception ex )
{
ex.printStackTrace();
System.out.println( "Usage : toto.Slicer <big file> <chunk name radix > <chunks size>" );
}//catch
}//met
/**
* Slices a huge file into chunks.
* @param inputFileName the big file to slice.
* @param outputFileRadix the base name of the slices generated by the slicer. All slices will be numbered outputFileRadix0, outputFileRadix1, outputFileRadix2...
* @param chunkSize the size of chunks in bytes
* @return the number of slices.
*/
public static int slice( String inputFileName, String outputFileRadix, long chunkSize ) throws IOException
{
//I would add some code to pretty print the output file names,
//i.e. adding a couple of 0s before chunkNumber in the output file names
//so that they all have the same number of chars
//(use java.io.File for that: estimate the number of chunks, take the power of 10, get the number of leading 0s)
//just to get some stats
long timeStart = System.currentTimeMillis();
long timeStartSlice = timeStart;
long timeEnd = 0;
//io streams and chunk counter
int chunkNumber = 0;
FileInputStream fis = null;
FileOutputStream fos = null;
try
{
//open files
fis = new FileInputStream( inputFileName );
fos = new FileOutputStream( outputFileRadix + chunkNumber );
//declare state variables
boolean finished = false;
byte[] buffer = new byte[ BUFFER_SIZE ];
int bytesRead = 0;
long bytesInChunk = 0;
while( !finished )
{
//System.out.println( "bytes to read " +(int)Math.min( BUFFER_SIZE, chunkSize - bytesInChunk ) );
bytesRead = fis.read( buffer,0, (int)Math.min( BUFFER_SIZE, chunkSize - bytesInChunk ) );
if( bytesRead == -1 )
finished = true;
else
{
fos.write( buffer, 0, bytesRead );
bytesInChunk += bytesRead;
if( bytesInChunk == chunkSize )
{
if( fos != null )
{
fos.close();
timeEnd = System.currentTimeMillis();
System.out.println( "Chunk "+chunkNumber + " has been generated in "+ (timeEnd - timeStartSlice) +" ms");
chunkNumber ++;
bytesInChunk = 0;
timeStartSlice = timeEnd;
System.out.println( "Creating slice number " + chunkNumber );
fos = new FileOutputStream( outputFileRadix + chunkNumber );
}//if
}//if
}//else
}//while
}
catch (Exception e)
{
System.out.println( "A problem occured during slicing : " );
e.printStackTrace();
}//catch
finally
{
//whatever happens close all files
System.out.println( "Closing all files.");
if( fis != null )
fis.close();
if( fos != null )
fos.close();
}//fin
timeEnd = System.currentTimeMillis();
System.out.println( "Total slicing time : " + (timeEnd - timeStart) +" ms" );
System.out.println( "Total number of slices "+ (chunkNumber +1) );
return chunkNumber+1;
}//met
}//class
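And for the file channel idea mentioned above, a rough, untested sketch of the copy loop using FileChannel.transferTo (same variable names as in slice()) might be:
FileChannel inChannel = new FileInputStream( inputFileName ).getChannel();
try
{
    long fileSize = inChannel.size();
    for( int chunk = 0; chunk * chunkSize < fileSize; chunk++ )
    {
        FileChannel outChannel = new FileOutputStream( outputFileRadix + chunk ).getChannel();
        long position = chunk * chunkSize;
        long count = Math.min( chunkSize, fileSize - position );
        long transferred = 0;
        while( transferred < count )
            transferred += inChannel.transferTo( position + transferred, count - transferred, outChannel );
        outChannel.close();
    }//for
}//try
finally
{
    inChannel.close();
}//fin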
Greetings,
Stéphane
I'm working on downloading a file in my software. This is what I've got: it successfully downloads, and I can also get the progress, but there's still one thing I don't know how to do: measure the download speed. I would appreciate your help. Thanks.
This is the current download method code:
public void run()
{
OutputStream out = null;
URLConnection conn = null;
InputStream in = null;
try
{
URL url1 = new URL(url);
out = new BufferedOutputStream(
new FileOutputStream(sysDir+"\\"+where));
conn = url1.openConnection();
in = conn.getInputStream();
byte[] buffer = new byte[1024];
int numRead;
long numWritten = 0;
double progress1;
while ((numRead = in.read(buffer)) != -1)
{
out.write(buffer, 0, numRead);
numWritten += numRead;
this.speed= (int) (((double)
buffer.length)/8);
progress1 = (double) numWritten;
this.progress=(int) progress1;
}
}
catch (Exception ex)
{
echo("Unknown Error: " + ex);
}
finally
{
try
{
if (in != null)
{
in.close();
}
if (out != null)
{
out.close();
}
}
catch (IOException ex)
{
echo("Unknown Error: " + ex);
}
}
}
The same way you would measure anything.
System.nanoTime() returns a long you can use to measure how long something takes:
long start = System.nanoTime();
// do your read
long end = System.nanoTime();
Now you have the number of nanoseconds it took to read X bytes. Do the math and you have your download rate.
More than likely you're looking for bytes per second. Keep track of the total number of bytes you've read, checking to see if one second has elapsed. Once one second has gone by figure out the rate based on how many bytes you've read in that amount of time. Reset the total, repeat.
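A sketch of that idea dropped into the loop from the question (reusing in, out, buffer and the speed field; the one-second window length is arbitrary):
long windowStart = System.nanoTime();
long bytesInWindow = 0;
int numRead;
while ((numRead = in.read(buffer)) != -1) {
    out.write(buffer, 0, numRead);
    bytesInWindow += numRead;
    long elapsed = System.nanoTime() - windowStart;
    if (elapsed >= 1000000000L) {                       // one second has passed
        double bytesPerSecond = bytesInWindow * 1e9 / elapsed;
        this.speed = (int) (bytesPerSecond / 1024.0);   // KB/s, like the speed field in the question
        bytesInWindow = 0;
        windowStart = System.nanoTime();
    }
}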
Here is my implementation:
while (mStatus == DownloadStatus.DOWNLOADING) {
/*
* Size buffer according to how much of the file is left to
* download.
*/
byte buffer[];
// handled resume case.
if ((mSize < mDownloaded ? mSize : mSize - mDownloaded <= 0 ? mSize : mSize - mDownloaded) > MAX_BUFFER_SIZE) {
buffer = new byte[MAX_BUFFER_SIZE];
} else {
buffer = new byte[(int) (mSize - mDownloaded)];
}
// Read from server into buffer.
int read = stream.read(buffer);
if (read == -1)
break;// EOF, break while loop
// Write buffer to file.
file.write(buffer, 0, read);
mDownloaded += read;
double speedInKBps = 0.0D;
try {
long timeInSecs = (System.currentTimeMillis() - startTime) / 1000; //converting millis to seconds as 1000m in 1 second
speedInKBps = (mDownloaded / timeInSecs) / 1024D;
} catch (ArithmeticException ae) {
}
this.mListener.publishProgress(this.getProgress(), this.getTotalSize(), speedInKBps);
}
I can give you a general idea. Start a timer at the beginning of the download. Now multiply the percentage downloaded by the download size, and divide by the time elapsed. That gives you the average download speed so far. Hope I get you on the right track!
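In code form, that back-of-the-envelope average could look like this (the variable names are only illustrative):
// bytes downloaded so far = percentage * total size; average speed = bytes / elapsed time
double bytesSoFar = (percentDownloaded / 100.0) * totalSizeInBytes;
double averageBytesPerSecond = bytesSoFar / elapsedSeconds;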
You can use System.nanoTime(), as suggested by Brian.
Put long startTime = System.nanoTime(); outside your while loop, and
long estimatedTime = System.nanoTime() - startTime; will give you the elapsed time within your loop.