Reading a file twice is extremely fast on the second read

I'm currently writing a small program to frequently test my internet speed.
To test the computational overhead I changed the read source to a file on my disk. Here I noticed that the bytewise reading limits the speed at about 31 MB/s so I changed it to reading 512 KB blocks.
Now I see some really strange behavior: after reading a 1 GB file for the first time, every following read operation finishes in less than one second. But there is no way that my normal HDD reads at over 1 GB/s, and I also can't imagine that the whole file is cached in RAM.
Here's my code:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.text.SimpleDateFormat;
import java.util.Date;
public class Main {
public static void main(String[] args) {
SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy HH:mm");
try {
System.out.println("Starting test...");
InputStream in = (new FileInputStream(new File("path/to/testfile")));
long startTime = System.currentTimeMillis();
long initTime = startTime + 8 * 1000; // start measuring after 8 seconds
long stopTime = initTime + 15 * 1000; // stop after 15 seconds testing
boolean initiated = false;
boolean stopped = false;
long bytesAfterInit = 0;
long bytes = 0;
byte[] b = new byte[524288];
int bytesRead = 0;
while((bytesRead = in.read(b)) > 0) {
bytes += bytesRead;
if(!initiated && System.currentTimeMillis() > initTime) {
initiated = true;
System.out.println("initiated");
bytesAfterInit = bytes;
}
if(System.currentTimeMillis() > stopTime) {
stopped = true;
System.out.println("stopped");
break;
}
}
long endTime = System.currentTimeMillis();
in.close();
long duration = 0;
long testBytes = 0;
if(initiated && stopped) { //if initiated and stopped calculate for the test time
duration = endTime - initTime;
testBytes = bytes - bytesAfterInit;
} else { //otherwise calculate the whole process
duration = endTime - startTime;
testBytes = bytes;
}
if(duration == 0) //prevent dividing by zero
duration = 1;
String result = sdf.format(new Date()) + "\t" + (testBytes / 1024 / 1024) / (duration / 1000d) + " MB/s";
System.out.println(duration + " ms");
System.out.println(testBytes + " bytes");
System.out.println(result);
} catch (IOException e) {
e.printStackTrace();
}
}
}
Output:
Starting test...
302 ms
1010827264 bytes
09.02.2015 10:20 3192.0529801324506 MB/s
I don't see that behavior if I change the source to some file on the internet or to a much bigger file on my SSD.
How is it possible that all the bytes are read in such a short period?

Related

Java FileChannel Vs BufferedReader - Spring Batch - Reader

We process huge files (sometimes 50 GB each file). The application reads this one file and based on the business logic, it will write multiple output files (4-6).
The records in the file are of variable length and each field in a record is a delimiter separated.
We went by the understanding that reading a file using FileChannel with a ByteBuffer is always better than using BufferedReader.readLine and then splitting on the delimiter.
BufferSizes tried 10240(10KB) and even more
Commit interval - 5000, 10000 etc
Below is how we used file channel to read:
Read byte by byte. Check whether the byte read is a newline character (10), which means end of line.
Check for the delimiter bytes. Capture the bytes read in a byte array (initialized with a maximum field size of 350 bytes) until the delimiter bytes are encountered.
Convert the bytes read so far to a String using UTF-8 encoding, new String(byteArr, 0, index, "UTF-8") to be specific, where index is the number of bytes read before the delimiter.
Using this method of reading the file using FileChannel took 57 minutes to process the file.
We wanted to decrease this time, so we tried BufferedReader.readLine() followed by a split on the delimiter to see how it fares.
And, shockingly, the same file completed processing in only 7 minutes.
What's the catch here? Why is FileChannel taking more time than a BufferedReader followed by a string split?
I was always under the assumption that the readLine and split combination would have a big performance impact.
Can anyone shed light on whether I was using FileChannel in a wrong way?
Thanks in advance. Hope I have summarized the issue properly.
The below is sample code :
while (inputByteBuffer.hasRemaining() && (b = inputByteBuffer.get()) != 0){
boolean endOfField = false;
if (b == 10){
break;
}
else{
if (b == 94){//^
if (!inputByteBuffer.hasRemaining()){
inputByteBuffer.clear();
noOfBytes = inputFileChannel.read(inputByteBuffer);
inputByteBuffer.flip();
}
if (inputByteBuffer.hasRemaining()){
byte b2 = inputByteBuffer.get();
if (b2 == 124){//|
if (!inputByteBuffer.hasRemaining()){
inputByteBuffer.clear();
noOfBytes = inputFileChannel.read(inputByteBuffer);
inputByteBuffer.flip();
}
if (inputByteBuffer.hasRemaining()){
byte b3 = inputByteBuffer.get();
if (b3 == 94){//^
String field = new String(fieldBytes, 0, index, encoding);
if(fieldIndex == -1){
fields = new String[sizeFromAConfiguration];
}else{
fields[fieldIndex] = field;
}
fieldBytes = new byte[maxFieldSize];
endOfField = true;
fieldIndex++;
}
else{
fieldBytes = addFieldBytes(fieldBytes, b, index);
index++;
fieldBytes = addFieldBytes(fieldBytes, b2, index);
index++;
fieldBytes = addFieldBytes(fieldBytes, b3, index);
}
}
else{
endOfFile = true;
//fields.add(new String(fieldBytes, 0, index, encoding));
fields[fieldIndex] = new String(fieldBytes, 0, index, encoding);
fieldBytes = new byte[maxFieldSize];
endOfField = true;
}
}else{
fieldBytes = addFieldBytes(fieldBytes, b, index);
index++;
fieldBytes = addFieldBytes(fieldBytes, b2, index);
}
}else{
endOfFile = true;
fieldBytes = addFieldBytes(fieldBytes, b, index);
}
}
else{
fieldBytes = addFieldBytes(fieldBytes, b, index);
}
}
if (!inputByteBuffer.hasRemaining()){
inputByteBuffer.clear();
noOfBytes = inputFileChannel.read(inputByteBuffer);
inputByteBuffer.flip();
}
if (endOfField){
index = 0;
}
else{
index++;
}
}

You're causing a lot of overhead with the constant hasRemaining()/read() checks as well as the constant get() calls. It would probably be better to get() the entire buffer into an array and process that directly, only calling read() when you get to the end.
And to answer a question in comments, you should not allocate a new ByteBuffer per read. This is expensive. Keep using the same one. And NB do not use a DirectByteBuffer for this application. It is not appropriate: it's only appropriate when you want the data to stay south of the JVM/JNI boundary, e.g. when merely copying between channels.
But I think I would throw this away, or rather rewrite it, using BufferedReader.read(), rather than readLine() followed by string splits, and using much the same logic as you have here, except of course that you don't need to keep calling hasRemaining() and filling the buffer, which BufferedReader will do automatically for you.
You have to take care to store the result of read() into an int, and to check it for -1 after every read().
It isn't clear to me that you should be using a Reader at all actually, unless you know you have multibyte text. Possibly a simple BufferedInputStream would be more appropriate.
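The BufferedReader.read() suggestion above could look roughly like the following. This is only a minimal sketch, assuming records end in '\n' and fields are separated by the three characters ^|^ as in the question; the class name DelimitedRecordReader and the method readRecord are hypothetical, and a StringReader stands in for the real input file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original code.
class DelimitedRecordReader {

    /** Reads one record (up to '\n' or EOF) and splits it on "^|^". Returns null at EOF. */
    static List<String> readRecord(Reader in) throws IOException {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean sawAnything = false;
        int c; // must be an int so that -1 (EOF) can be told apart from a char
        while ((c = in.read()) != -1) {
            sawAnything = true;
            if (c == '\n') {
                break; // end of record
            }
            field.append((char) c);
            int n = field.length();
            // Did the last three characters complete the delimiter ^|^ ?
            if (c == '^' && n >= 3 && field.charAt(n - 2) == '|' && field.charAt(n - 3) == '^') {
                field.setLength(n - 3);   // drop the delimiter itself
                fields.add(field.toString());
                field.setLength(0);       // reuse the same builder for the next field
            }
        }
        if (!sawAnything) {
            return null; // nothing left to read
        }
        fields.add(field.toString()); // last field of the record
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // BufferedReader refills its internal buffer by itself, so no hasRemaining() checks are needed.
        try (BufferedReader in = new BufferedReader(new StringReader("a^|^b^|^c\nd^|^e\n"))) {
            List<String> record;
            while ((record = readRecord(in)) != null) {
                System.out.println(record);
            }
        }
    }
}
```

For a real file the StringReader would be replaced by a FileReader or an InputStreamReader with an explicit charset; whether this beats readLine() plus split would still need to be measured.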

While one cannot tell with certainty how a particular piece of code will behave, I would imagine the best way is to profile it, just as you did. FileChannel, while perceived to be faster, is actually not helping in your case. But this may not be because of reading from the file; it may be due to the actual processing you do with the content you read.
One article I would like to point out while dealing with files is
https://www.redgreencode.com/why-is-java-io-slow/
Also the corresponding Github codebase
Java IO benchmark
I would like to point out this code to use a combination of both worlds
fos = new FileOutputStream(outputFile);
outFileChannel = fos.getChannel();
bufferedWriter = new BufferedWriter(Channels.newWriter(outFileChannel, "UTF-8"));
Since you are reading in your case, I would consider
File inputFile = new File("C:\\input.txt");
FileInputStream fis = new FileInputStream(inputFile);
FileChannel inputChannel = fis.getChannel();
BufferedReader bufferedReader = new BufferedReader(Channels.newReader(inputChannel,"UTF-8"));
I would also tweak the chunk size; with Spring Batch it is always trial and error to find the sweet spot.
On a completely unrelated note, the reason you are not able to use BufferedReader is the doubling of characters, and I am assuming this happens more commonly with EBCDIC characters. I would simply run a loop like this to identify the troublemakers and eliminate them at the source.
import java.io.UnsupportedEncodingException;
public class EbcdicConvertor {
public static void main(String[] args) throws UnsupportedEncodingException {
int index = 0;
for (int i = -127; i < 128; i++) {
byte[] b = new byte[1];
b[0] = (byte) i;
String cp037 = new String(b, "CP037");
if (cp037.getBytes().length == 2) {
index++;
System.out.println(i + "::" + cp037);
}
}
System.out.println(index);
}
}
The above answer was written without testing my actual hypothesis. Here is an actual program to measure time. The results on a 200 MB file speak for themselves.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;
public class ReadComplexDelimitedFile {
private static long total = 0;
private static final Pattern DELIMITER_PATTERN = Pattern.compile("\\^\\|\\^");
private void readFileUsingScanner() {
String s;
try (Scanner stdin = new Scanner(new File(this.getClass().getResource("input.txt").getPath()))) {
while (stdin.hasNextLine()) {
s = stdin.nextLine();
String[] fields = DELIMITER_PATTERN.split(s, 0);
total = total + fields.length;
}
} catch (Exception e) {
System.err.println("Error");
}
}
private void readFileUsingCustomBufferedReader() {
try (BufferedReader stdin = new BufferedReader(new FileReader(new File(this.getClass().getResource("input.txt").getPath())))) {
String s;
while ((s = stdin.readLine()) != null) {
String[] fields = DELIMITER_PATTERN.split(s, 0);
total += fields.length;
}
} catch (Exception e) {
System.err.println("Error");
}
}
private void readFileUsingBufferedReader() {
try (java.io.BufferedReader stdin = new java.io.BufferedReader(new FileReader(new File(this.getClass().getResource("input.txt").getPath())))) {
String s;
while ((s = stdin.readLine()) != null) {
String[] fields = DELIMITER_PATTERN.split(s, 0);
total += fields.length;
}
} catch (Exception e) {
System.err.println("Error");
}
}
private void readFileUsingBufferedReaderFileChannel() {
try (FileInputStream fis = new FileInputStream(this.getClass().getResource("input.txt").getPath())) {
try (FileChannel inputChannel = fis.getChannel()) {
try (BufferedReader stdin = new BufferedReader(Channels.newReader(inputChannel, "UTF-8"))) {
String s;
while ((s = stdin.readLine()) != null) {
String[] fields = DELIMITER_PATTERN.split(s, 0);
total = total + fields.length;
}
}
} catch (Exception e) {
System.err.println("Error");
}
} catch (Exception e) {
System.err.println("Error");
}
}
private void readFileUsingBufferedReaderByteFileChannel() {
try (FileInputStream fis = new FileInputStream(this.getClass().getResource("input.txt").getPath())) {
try (FileChannel inputChannel = fis.getChannel()) {
try (BufferedReader stdin = new BufferedReader(Channels.newReader(inputChannel, "UTF-8"))) {
int b;
StringBuilder sb = new StringBuilder();
while ((b = stdin.read()) != -1) {
if (b == 10) {
total = total + DELIMITER_PATTERN.split(sb, 0).length;
sb = new StringBuilder();
} else {
sb.append((char) b);
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
} catch (Exception e) {
System.err.println("Error");
}
}
private void readFileUsingFileChannelStream() {
try (RandomAccessFile fis = new RandomAccessFile(new File(this.getClass().getResource("input.txt").getPath()), "r")) {
try (FileChannel inputChannel = fis.getChannel()) {
ByteBuffer byteBuffer = ByteBuffer.allocate(8192);
ByteBuffer recordBuffer = ByteBuffer.allocate(250);
int recordLength = 0;
while ((inputChannel.read(byteBuffer)) != -1) {
byte b;
byteBuffer.flip();
while (byteBuffer.hasRemaining() && (b = byteBuffer.get()) != -1) {
if (b == 10) {
recordBuffer.flip();
total = total + splitIntoFields(recordBuffer, recordLength);
recordBuffer.clear();
recordLength = 0;
} else {
++recordLength;
recordBuffer.put(b);
}
}
byteBuffer.clear();
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
private int splitIntoFields(ByteBuffer recordBuffer, int recordLength) {
byte b;
String[] fields = new String[17];
int fieldCount = -1;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < recordLength - 1; i++) {
b = recordBuffer.get(i);
if (b == 94 && recordBuffer.get(++i) == 124 && recordBuffer.get(++i) == 94) {
fields[++fieldCount] = sb.toString();
sb = new StringBuilder();
} else {
sb.append((char) b);
}
}
fields[++fieldCount] = sb.toString();
return fields.length;
}
public static void main(String args[]) {
//JVM warmup
for (int i = 0; i < 100000; i++) {
total += i;
}
// We know scanner is slow-Still warming up
ReadComplexDelimitedFile readComplexDelimitedFile = new ReadComplexDelimitedFile();
List<Long> longList = new ArrayList<>(50);
for (int i = 0; i < 50; i++) {
total = 0;
long startTime = System.nanoTime();
readComplexDelimitedFile.readFileUsingScanner();
long stopTime = System.nanoTime();
long timeDifference = stopTime - startTime;
longList.add(timeDifference);
}
System.out.println("Time taken for readFileUsingScanner");
longList.forEach(System.out::println);
// Actual performance test starts here
longList = new ArrayList<>(10);
for (int i = 0; i < 10; i++) {
total = 0;
long startTime = System.nanoTime();
readComplexDelimitedFile.readFileUsingBufferedReaderFileChannel();
long stopTime = System.nanoTime();
long timeDifference = stopTime - startTime;
longList.add(timeDifference);
}
System.out.println("Time taken for readFileUsingBufferedReaderFileChannel");
longList.forEach(System.out::println);
longList.clear();
for (int i = 0; i < 10; i++) {
total = 0;
long startTime = System.nanoTime();
readComplexDelimitedFile.readFileUsingBufferedReader();
long stopTime = System.nanoTime();
long timeDifference = stopTime - startTime;
longList.add(timeDifference);
}
System.out.println("Time taken for readFileUsingBufferedReader");
longList.forEach(System.out::println);
longList.clear();
for (int i = 0; i < 10; i++) {
total = 0;
long startTime = System.nanoTime();
readComplexDelimitedFile.readFileUsingCustomBufferedReader();
long stopTime = System.nanoTime();
long timeDifference = stopTime - startTime;
longList.add(timeDifference);
}
System.out.println("Time taken for readFileUsingCustomBufferedReader");
longList.forEach(System.out::println);
longList.clear();
for (int i = 0; i < 10; i++) {
total = 0;
long startTime = System.nanoTime();
readComplexDelimitedFile.readFileUsingBufferedReaderByteFileChannel();
long stopTime = System.nanoTime();
long timeDifference = stopTime - startTime;
longList.add(timeDifference);
}
System.out.println("Time taken for readFileUsingBufferedReaderByteFileChannel");
longList.forEach(System.out::println);
longList.clear();
for (int i = 0; i < 10; i++) {
total = 0;
long startTime = System.nanoTime();
readComplexDelimitedFile.readFileUsingFileChannelStream();
long stopTime = System.nanoTime();
long timeDifference = stopTime - startTime;
longList.add(timeDifference);
}
System.out.println("Time taken for readFileUsingFileChannelStream");
longList.forEach(System.out::println);
}
}
BufferedReader was written a long time ago, so we can rewrite the parts relevant to this example. For instance, we don't care about \r, skipLF, skipCR, or that kind of thing.
We are only going to read the file (no need for synchronized).
By extension there is no need for StringBuffer; StringBuilder can be used instead. A performance improvement is seen immediately.
Dangerous hack: remove synchronized and replace StringBuffer with StringBuilder. Don't use it without proper testing, or without knowing what you are doing.
public String readLine() throws IOException {
StringBuilder s = null;
int startChar;
bufferLoop:
for (; ; ) {
if (nextChar >= nChars)
fill();
if (nextChar >= nChars) { /* EOF */
if (s != null && s.length() > 0)
return s.toString();
else
return null;
}
boolean eol = false;
char c = 0;
int i;
/* Skip a leftover '\n', if necessary */
charLoop:
for (i = nextChar; i < nChars; i++) {
c = cb[i];
if (c == '\n') {
eol = true;
break charLoop;
}
}
startChar = nextChar;
nextChar = i;
if (eol) {
String str;
if (s == null) {
str = new String(cb, startChar, i - startChar);
} else {
s.append(cb, startChar, i - startChar);
str = s.toString();
}
nextChar++;
return str;
}
if (s == null)
s = new StringBuilder(defaultExpectedLineLength);
s.append(cb, startChar, i - startChar);
}
}
Java 8 Intel i5 12 GB RAM Windows 10
Result:
Time taken for readFileUsingBufferedReaderFileChannel::
2581635057 1849820885 1763992972 1770510738 1746444157 1733491399
1740530125 1723907177 1724280512 1732445638
Time taken for readFileUsingBufferedReader
1851027073 1775304769 1803507033 1789979554 1786974538 1802675458
1789672780 1798036307 1789847714 1785302003
Time taken for readFileUsingCustomBufferedReader
1745220476 1721039975 1715383650 1728548462 1724746005 1718177466
1738026017 1748077438 1724608192 1736294175
Time taken for readFileUsingBufferedReaderByteFileChannel
2872857919 2480237636 2917488143 2913491126 2880117231 2904614745
2911756298 2878777496 2892169722 2888091211
Time taken for readFileUsingFileChannelStream
3039447073 2896156498 2538389366 2906287280 2887612064 2929288046
2895626578 2955326255 2897535059 2884476915
Process finished with exit code 0

I tried NIO with all possible options (those provided in this post and others, to the best of my knowledge and research) and found that it came nowhere close to BufferedReader for reading a text file.
After changing BufferedReader to use StringBuilder in place of StringBuffer, I didn't see any significant improvement in performance (only a few seconds for some files, and some were actually faster with StringBuffer).
Removing the synchronized block also gave little or no improvement, and it's not worth tweaking something when we get no benefit from it.
Below is the time taken (reading, processing, and writing; the time for processing and writing is not significant, not even 20% of the total) for a file of around 50 GB:
NIO : 71.67 (Minutes)
IO (BufferedReader) : 10.84 (Minutes)
Thank you all for your time to reading and responding to this post and providing suggestions.

The main issue here is creating a new byte[] very rapidly (fieldBytes = new byte[maxFieldSize];).
Since for every iteration a new array is being created, garbage collection is being kicked off very often which triggers "stop the world" to reclaim the memory.
And also, the object creation could be expensive.
We could rather initialize the byte array once and then track the indexes to just convert the field to string with an end index.
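The idea of allocating the field array once and tracking an end index can be sketched as follows. This is an illustrative sketch only: the class ReusableFieldBuffer and its methods are hypothetical names, the ^|^ delimiter (bytes 94, 124, 94) and the 350-byte maximum field size are taken from the question, and fields longer than the maximum are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the original poster's code.
class ReusableFieldBuffer {
    private final byte[] bytes; // allocated once and reused for every field
    private int length;         // number of valid bytes currently held

    ReusableFieldBuffer(int maxFieldSize) {
        bytes = new byte[maxFieldSize];
    }

    /** Appends one byte; throws ArrayIndexOutOfBoundsException if a field exceeds maxFieldSize. */
    void append(byte b) {
        bytes[length++] = b;
    }

    /** True if the last three bytes collected are ^|^ (94, 124, 94). */
    boolean endsWithDelimiter() {
        return length >= 3 && bytes[length - 1] == 94 && bytes[length - 2] == 124 && bytes[length - 3] == 94;
    }

    /** Decodes the bytes collected so far and resets the index; the array itself is kept. */
    String takeString() {
        String s = new String(bytes, 0, length, StandardCharsets.UTF_8);
        length = 0;
        return s;
    }

    /** Splits one record (without its trailing newline) into fields using the shared buffer. */
    static List<String> splitRecord(byte[] record, ReusableFieldBuffer buf) {
        List<String> fields = new ArrayList<>();
        for (byte b : record) {
            buf.append(b);
            if (buf.endsWithDelimiter()) {
                buf.length -= 3; // drop the delimiter bytes before decoding
                fields.add(buf.takeString());
            }
        }
        fields.add(buf.takeString()); // last field of the record
        return fields;
    }

    public static void main(String[] args) {
        ReusableFieldBuffer buf = new ReusableFieldBuffer(350);
        byte[] record = "id^|^name^|^42".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitRecord(record, buf));
    }
}
```

The only per-field allocation left is the String itself; whether that removes the garbage-collection pressure described above would need to be confirmed with a profiler.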
And anyway, BufferedReader is faster than FileChannel, at least for reading ASCII files, so to keep the code simple we continued using BufferedReader.
Using BufferedReader also reduces development and testing effort, since there is no tedious logic to find delimiters and populate the object.

about linux IO performance

I wrote a program to test IO performance in Java using FileChannel. It writes data and calls force(false) immediately. My Linux server has 12 SSDs, sda~sdl, and when I test writing data to different drives, the performance varies widely. I don't know why.
code:
public static void main(String[] args) throws IOException, InterruptedException {
RandomAccessFile aFile = new RandomAccessFile(args[0], "rw");
int count = Integer.parseInt(args[1]);
int idx = count;
FileChannel channel = aFile.getChannel();
long time = 0;
long bytes = 0;
while (--idx > 0) {
String newData = "New String to write to file..." + System.currentTimeMillis();
String buff = "";
for (int i = 0 ; i<100; i++) {
buff += newData;
}
bytes += buff.length();
ByteBuffer buf = ByteBuffer.allocate(buff.length());
buf.clear();
buf.put(buff.getBytes());
buf.flip();
while(buf.hasRemaining()) {
channel.write(buf);
}
long st = System.nanoTime();
channel.force(false);
long et = System.nanoTime();
System.out.println("force time : " + (et - st));
time += (et -st);
}
System.out.println("wirte " + count + " record, " + bytes + " bytes, force avg time : " + time/count);
}
Result like this:
sda: wirte 1000000 record, 4299995700 bytes, force avg time : 273480 ns
sdb: wirte 100000 record, 429995700 bytes, force avg time : 5868387 ns
The average times vary significantly.
Here is some IO monitor data.
(Two iostat screenshots followed: one for sda, one for sdb.)

You need to start by measuring your SSDs' performance using a standard tool like fio.
Then you can test your utility again and compare it against the numbers from the fio output.
It looks like you are writing into the Linux write cache, which could explain your results :)

JAVA HttpURLConnection I/O Not working

Hello My Respected Seniors :)
My Goal: Download a URL Resource, given a URL, by using Multi-Threading in Java, i.e. download a single file into multiple pieces (much like how IDM does) & at the end of download, combine all of them to 1 final file.
Technology Using: Java, RandomAccessFile, MultiThreading, InputStreams
Problem:
The file is downloaded fine with exact KB size, I've checked many times, but the final file is corrupted. For example, If I download an Image, it will be somewhat blurry, If I download an .exe, it downloads fine but when I run the .exe file, it says "media is damaged, retry download".
This is my Main code from which I call to thread class with parameters such as fileName, starting Range and ending Range for a connection as well as a JProgressBar for every thread which will update its own respectively.
public void InitiateDownload()
{
HttpURLConnection uc = (HttpURLConnection) url.openConnection();
uc.connect();
long fileSize = uc.getContentLengthLong();
System.out.println("File Size = "+ fileSize );
uc.disconnect();
chunkSize = (long) Math.ceil(fileSize/6);
startFrom = 0;
endRange = (startFrom + chunkSize) - 1;
Thread t1 = new MyThread(url, fileName, startFrom, endRange, progressBar_1);
t1.start();
//-----------------------------------------
startFrom += chunkSize;
endRange = endRange + chunkSize;
System.out.println("Part 2 :: Start = " + startFrom + "\tEnd To = " + endRange );
Thread t2 = new MyThread(url, fileName, startFrom, endRange, progressBar_2);
t2.start();
//-----------------------------------------
//..
//..
//..
//-----------------------------------------
startFrom += chunkSize;
long temp = endRange + chunkSize;
endRange = temp + (fileSize - temp); //add any remaining bits, that were rounded off in division
Thread t6 = new MyThread(url, fileName, startFrom, endRange, progressBar_6);
t6.start();
//-----------------------------------------
}
Here is run() function of MyThread class:
public void run() {
Thread.currentThread().setPriority(MAX_PRIORITY);
System.setProperty("http.proxyHost", "192.168.10.50");
System.setProperty("http.proxyPort", "8080");
HttpURLConnection uc = null;
try {
uc = (HttpURLConnection) url.openConnection();
uc.setRequestProperty("Range", "bytes="+startFrom+"-"+range);
uc.connect();
fileSize = uc.getContentLengthLong();
inStream = uc.getInputStream();
int[] buffer = new int[ (int) totalDownloadSize ];
file.seek(startFrom); //adjusted start of file
THIS IS WHERE I THINK THE PROBLEM IS,
run() continued...
for(int i = 0 ; i < totalDownloadSize; i++)
{
buffer[i] = inStream.read();
file.write(buffer[i]);
//Updating Progress bars
totalDownloaded = totalDownloaded + 1;
int downloaded = (int) (100 * ( totalDownloaded/ (float) totalDownloadSize)) ;
progressbar.setValue( downloaded );
}
System.err.println( Thread.currentThread().getName() + "'s download is Finished!");
uc.disconnect();
}
catch(IOException e) {
System.err.println("Exception in " + Thread.currentThread().getName() + "\t Exception = " + e );
}
finally {
try {
file.close();
if(inStream!=null)
inStream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Now the file is downloaded with complete size, but as I said, a little part of it is corrupt.
Now,
If I replace the for loop with following while loop, the problem is completely solved.
int bytesRead = 0;
byte[] buffer = new byte[ (int) totalDownloadSize ];
file.seek(startFrom); //adjusted start of file
while( (bytesRead = inStream.read(buffer) ) != -1 ) {
file.write(buffer, 0, bytesRead);
}
BUT I NEED THE for LOOP TO MEASURE HOW MUCH OF THE FILE EACH THREAD HAS DOWNLOADED, AND I WANT TO UPDATE THE RESPECTIVE JPROGRESSBARs OF THE THREADS.
Kindly help me out with the for loop logic.
OR
If you can advise on how I can update the JProgressBars within the while loop. I can't seem to find a way to quantify how much of the file a thread has downloaded...
I've spent a lot of hours and I'm extremely tired now...

You can use the while loop that works, and keep track of the total number of bytes read like this:
long totalRead = 0;
while ((bytesRead = inStream.read(buffer)) != -1) {
totalRead += bytesRead;
file.write(buffer, 0, bytesRead);
// scale to a percentage; without the factor of 100 the bar would only ever show 0 or 1
progressbar.setValue((int) (100 * totalRead / (double) totalDownloadSize));
}
Just remember that for (a; b; c) { ... } is equivalent to a; while (b) { ...; c; }.
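To make the progress tracking concrete, here is a small self-contained sketch of a buffered copy loop that reports a percentage. The names ProgressCopy, copy and onPercent are hypothetical, and in-memory streams stand in for the HTTP connection and the RandomAccessFile from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.IntConsumer;

// Hypothetical helper for illustration; not part of the original code.
class ProgressCopy {

    /** Copies in to out in blocks and reports progress (0..100) whenever the percentage changes. */
    static long copy(InputStream in, OutputStream out, long expectedBytes, IntConsumer onPercent)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int lastPercent = -1;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
            int percent = expectedBytes > 0 ? (int) Math.min(100, 100 * total / expectedBytes) : 0;
            if (percent != lastPercent) { // avoid flooding the UI with identical values
                lastPercent = percent;
                onPercent.accept(percent);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out, data.length,
                p -> System.out.println(p + "%"));
        System.out.println(copied + " bytes copied");
    }
}
```

In a Swing program the callback would typically hand the value to the event dispatch thread, for example p -> SwingUtilities.invokeLater(() -> progressbar.setValue(p)), since Swing components are generally not meant to be updated from worker threads.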

Calculating network download speed

I have written the following code to calculate download speed using Java.
But it is not giving correct results. What is the problem? Is there a problem with my logic, or with my usage of the Java networking classes? I think it is the latter. Can anybody tell me what exactly the problem is?
/*Author:Jinu Joseph Daniel*/
import java.io.*;
import java.net.*;
class bwCalc {
static class CalculateBw {
public void calculateUploadBw() {}
public float calculateDownloadRate(int waitTime) throws Exception {
int bufferSize = 1;
byte[] data = new byte[bufferSize]; // buffer
BufferedInputStream in = new BufferedInputStream(new URL("https://www.google.co.in/").openStream());
int count = 0;
long startedAt = System.currentTimeMillis();
long stoppedAt;
float rate;
while (((stoppedAt = System.currentTimeMillis()) - startedAt) < waitTime) {
if ( in .read(data, 0, bufferSize) != -1) {
count++;
} else {
System.out.println("Finished");
break;
}
}
in .close();
rate = 1000 * (((float) count*bufferSize*8 / (stoppedAt - startedAt)) )/(1024*1024);//rate in Mbps
return rate;
}
public float calculateAverageDownloadRate() throws Exception{
int times[] = {100,200,300,400,500};
float bw = 0,curBw;
int i = 0, len = times.length;
while (i < len) {
curBw = calculateDownloadRate(times[i++]);
bw += curBw;
System.out.println("Current rate : "+Float.toString(curBw));
}
bw /= len;
return bw;
}
}
public static void main(String argc[]) throws Exception {
CalculateBw c = new CalculateBw();
System.out.println(Float.toString(c.calculateAverageDownloadRate()));
}
}

There are many problems with your code...
you're not checking how many bytes you are reading
testing with Google's home page is useless, since the content size is very small and most of the download time is related to network latency; you should try downloading a large file (10+ MB) - UNLESS you actually want to measure latency rather than bandwidth, in which case you can simply run ping
you also need to give it more than 500ms if you want to get any relevant result - I'd say at least 5 sec
plenty of code style issues, but those are less important

Here is code that will calculate the average download rate for you in KB and MB per second; you can scale them by 8 to get the rate in bits per second.
public static void main(String argc[]) throws Exception {
long totalDownload = 0; // total bytes downloaded
final int BUFFER_SIZE = 1024; // size of the buffer
byte[] data = new byte[BUFFER_SIZE]; // buffer
BufferedInputStream in = new BufferedInputStream(
new URL(
"http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.15/linux-headers-2.6.15-020615_2.6.15-020615_all.deb")
.openStream());
int dataRead = 0; // data read in each try
long startTime = System.nanoTime(); // starting time of download
while ((dataRead = in.read(data, 0, 1024)) > 0) {
totalDownload += dataRead; // adding data downloaded to total data
}
/* download rate in bytes per second */
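// floating-point division keeps fractions of a second and avoids dividing by zero for short downloads
double elapsedSec = (System.nanoTime() - startTime) / 1e9;
float bytesPerSec = (float) (totalDownload / elapsedSec);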
System.out.println(bytesPerSec + " Bps");
/* download rate in kilobytes per second */
float kbPerSec = bytesPerSec / (1024);
System.out.println(kbPerSec + " KBps ");
/* download rate in megabytes per second */
float mbPerSec = kbPerSec / (1024);
System.out.println(mbPerSec + " MBps ");
}

Measuring Download Speed Java

I'm working on downloading a file in my software. This is what I have so far: it downloads successfully, and I can also get the progress, but there is still one thing left that I don't know how to do: measure the download speed. I would appreciate your help. Thanks.
This is the current download method code
public void run()
{
OutputStream out = null;
URLConnection conn = null;
InputStream in = null;
try
{
URL url1 = new URL(url);
out = new BufferedOutputStream(
new FileOutputStream(sysDir+"\\"+where));
conn = url1.openConnection();
in = conn.getInputStream();
byte[] buffer = new byte[1024];
int numRead;
long numWritten = 0;
double progress1;
while ((numRead = in.read(buffer)) != -1)
{
out.write(buffer, 0, numRead);
numWritten += numRead;
this.speed= (int) (((double)
buffer.length)/8);
progress1 = (double) numWritten;
this.progress=(int) progress1;
}
}
catch (Exception ex)
{
echo("Unknown Error: " + ex);
}
finally
{
try
{
if (in != null)
{
in.close();
}
if (out != null)
{
out.close();
}
}
catch (IOException ex)
{
echo("Unknown Error: " + ex);
}
}
}

The same way you would measure anything.
System.nanoTime() returns a long you can use to measure how long something takes:
long start = System.nanoTime();
// do your read
long end = System.nanoTime();
Now you have the number of nanoseconds it took to read X bytes. Do the math and you have your download rate.
More than likely you're looking for bytes per second. Keep track of the total number of bytes you've read, checking to see if one second has elapsed. Once one second has gone by figure out the rate based on how many bytes you've read in that amount of time. Reset the total, repeat.
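The "count bytes, check whether a second has elapsed, compute, reset" approach could be sketched like this. RateMeter and its methods are hypothetical names; timestamps are passed in explicitly (in real code they would come from System.nanoTime()) so that the logic can be exercised without a network.

```java
// Hypothetical helper for illustration; not part of the original code.
class RateMeter {
    private static final long WINDOW_NANOS = 1_000_000_000L; // one second

    private long windowStart;          // timestamp at which the current window began
    private long bytesInWindow;        // bytes counted since windowStart
    private double lastBytesPerSecond; // most recently computed rate

    RateMeter(long nowNanos) {
        windowStart = nowNanos;
    }

    /** Records bytes read at time nowNanos; returns true when a new rate was computed. */
    boolean record(long bytes, long nowNanos) {
        bytesInWindow += bytes;
        long elapsed = nowNanos - windowStart;
        if (elapsed < WINDOW_NANOS) {
            return false; // less than one second so far, keep accumulating
        }
        lastBytesPerSecond = bytesInWindow * 1e9 / elapsed; // bytes per second over the actual window
        bytesInWindow = 0;      // reset the total
        windowStart = nowNanos; // and start the next window
        return true;
    }

    double bytesPerSecond() {
        return lastBytesPerSecond;
    }

    public static void main(String[] args) {
        // Simulated timestamps; in a download loop use System.nanoTime() after each read.
        RateMeter meter = new RateMeter(0L);
        meter.record(1000, 500_000_000L);
        if (meter.record(1000, 1_000_000_000L)) {
            System.out.println(meter.bytesPerSecond() + " bytes/s");
        }
    }
}
```

In the download loop from the question, record(numRead, System.nanoTime()) would be called after each in.read(buffer), and the speed field updated whenever it returns true.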

Here is my implementation:
while (mStatus == DownloadStatus.DOWNLOADING) {
/*
* Size buffer according to how much of the file is left to
* download.
*/
byte buffer[];
// handled resume case.
if ((mSize < mDownloaded ? mSize : mSize - mDownloaded <= 0 ? mSize : mSize - mDownloaded) > MAX_BUFFER_SIZE) {
buffer = new byte[MAX_BUFFER_SIZE];
} else {
buffer = new byte[(int) (mSize - mDownloaded)];
}
// Read from server into buffer.
int read = stream.read(buffer);
if (read == -1)
break;// EOF, break while loop
// Write buffer to file.
file.write(buffer, 0, read);
mDownloaded += read;
double speedInKBps = 0.0D;
// keep fractional seconds; integer division would truncate and report 0 during the first second
double timeInSecs = (System.currentTimeMillis() - startTime) / 1000.0;
if (timeInSecs > 0) {
speedInKBps = (mDownloaded / timeInSecs) / 1024D;
}
this.mListener.publishProgress(this.getProgress(), this.getTotalSize(), speedInKBps);
}

I can give you a general idea. Start a timer at the beginning of the download. Then multiply the fraction downloaded by the download size (which gives the bytes downloaded so far), and divide it by the time elapsed. That gives you the average download speed. Hope this gets you on the right track!
You can use System.nanoTime(), as suggested by Brian.
Put long startTime = System.nanoTime(); outside your while loop, and
long estimatedTime = System.nanoTime() - startTime; will give you the elapsed time within your loop.
