Throughput of arrays of different sizes varies - Java

I have arrays like
byte[] b = new byte[10];
byte[] b1 = new byte[1024*1024];
I populate them with some values. Say,
for (int i = 0; i < 10; i++) {
    b[i] = 1;
}
for (int i = 0; i < 1024 * 1024; i++) {
    b1[i] = 1;
}
Then I write each array to a RandomAccessFile and read it back from that file into the same array using
randomAccessFile.write(arrayName);
and
randomAccessFile.read(arrayName);
When I calculate the throughput of these two arrays of different sizes (10 bytes and 1 MB), using the time measured for the file write and read, the throughput appears to be far higher for the 1 MB array.
Sample Output:
Throughput of 10kb array: 0.1 Mb/sec.
Throughput of 1Mb array: 1000.0 Mb/sec.
Why does this happen? I have an Intel i7 quad-core processor. Could my hardware configuration be responsible for this? If not, what could be the possible reason?

The reason for the big difference is the overhead involved in I/O, which is incurred no matter how much data is transferred - it's like the flag fall of a taxi ride. This overhead is not restricted to Java and includes many O/S operations:
Finding the file on disk
Checking O/S permissions on the file
Opening the file for I/O
Closing the file
Updating file info in the file system
Many other tasks
Also, disk I/O is performed in pages (the size depends on the O/S, but 4K is typical), so I/O of 1 byte probably costs the same as I/O of a full page: a slightly fairer comparison would be a 4096-byte array versus a 1 MB array.
If you are using buffered I/O, that can further speed up larger I/O tasks.
Finally, what you report as "10Kb" is in fact just 10 bytes, so your calculation is possibly incorrect.
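To see the fixed cost in isolation, here is a rough, self-contained timing sketch (not the original poster's code; the temp file name and sizes are arbitrary) that writes and reads arrays of two sizes through a RandomAccessFile and prints MB/s for each:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class ThroughputDemo {
    static double measureMbPerSec(int size) throws IOException {
        byte[] data = new byte[size];                      // payload of the given size
        File f = File.createTempFile("tput", ".bin");
        long start = System.nanoTime();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.write(data);                               // write the whole array
            raf.seek(0);
            raf.readFully(data);                           // read it back
        }
        long elapsedNs = System.nanoTime() - start;
        f.delete();
        // bytes -> MB and nanoseconds -> seconds
        return (size / (1024.0 * 1024.0)) / (elapsedNs / 1_000_000_000.0);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("10 bytes: " + measureMbPerSec(10) + " MB/s");
        System.out.println("1 MiB   : " + measureMbPerSec(1024 * 1024) + " MB/s");
    }
}

The per-call overhead (open, seek, close, file-system bookkeeping) dominates the 10-byte run, which is why the computed throughput is tiny even though the disk is barely doing any work.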

Related

Java MergeSort Binary files

I have several sorted binary files which store information in a variable-length format (meaning one of the segments contains the length of the variable-length segment).
I need to merge them into one sorted file. I can do so successfully with BufferedInputStream. Nevertheless, it takes a very long time on a mechanical disk. On a machine with an SSD it's much faster, as expected.
What bothers me is that even on the SSD the CPU utilization is very low, which makes me suspect there is a way to improve the speed. I assume this happens because the CPU spends most of its time waiting on the disk. I tried increasing the buffers to hundreds of MBs, to no avail.
I have tried using a memory-mapped buffer and a file channel, but that didn't improve the runtime.
Any ideas?
Edit: Using a MappedByteBuffer failed because the merged file size is over 2 GB, which is beyond the size limit of a single MappedByteBuffer. But even before merging the smaller files into GB-sized files, I didn't notice an improvement in speed or CPU utilization.
Thanks
Perhaps you can compress the files better, or is that not an option? If the bottleneck is I/O, then reducing the amount of data is a good angle of attack.
http://www.oracle.com/technetwork/articles/java/compress-1565076.html
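As a rough illustration of that idea, a length-prefixed record can be written and read through GZIP-wrapped, buffered streams. This is only a sketch; the file name and record contents are placeholders, not the poster's actual format:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRecordDemo {
    public static void main(String[] args) throws IOException {
        byte[] record = "example payload".getBytes(StandardCharsets.UTF_8);

        // Write a length-prefixed record through a compressed, buffered stream.
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(
                new BufferedOutputStream(new FileOutputStream("merged.gz"))))) {
            out.writeInt(record.length);
            out.write(record);
        }

        // Read it back the same way; decompression happens transparently.
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(
                new BufferedInputStream(new FileInputStream("merged.gz"))))) {
            byte[] back = new byte[in.readInt()];
            in.readFully(back);
            System.out.println(new String(back, StandardCharsets.UTF_8));
        }
    }
}

Whether this helps depends on how compressible the data is; if it is already dense binary, the extra CPU may not buy much.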

How to read a large file with a dynamic buffer size, depending on the data read from the file

I have a file containing data that is meaningful only in chunks of a certain size, which is prepended to each chunk, e.g.
{chunk_1_size}
{chunk_1}
{chunk_2_size}
{chunk_2}
{chunk_3_size}
{chunk_3}
{chunk_4_size}
{chunk_4}
{chunk_5_size}
{chunk_5}
.
.
{chunk_n_size}
{chunk_n}
The file is really big (~2 GB) and the chunk size is ~20 MB (which is the buffer size I want to have).
I would like to buffer-read this file to reduce the number of calls to the actual hard disk,
but I am not sure how big the buffer should be, because the chunk size may vary.
pseudo code of what I have in mind:
while (!EOF) {
    /* the chunk size is an integer, i.e. 4 bytes */
    chunkSize = readChunkSize();
    /* according to the chunk size, read that number of bytes from the file */
    readChunk(chunkSize);
}
If, let's say, I pick an arbitrary buffer size, then I might crawl into situations like:
The first buffer contains chunkSize_1 + chunk_1 + partialChunk_2 - I have to keep track of the leftover, then get the remaining part of the chunk from the next buffer and concatenate it to the leftover to complete the chunk.
The first buffer contains chunkSize_1 + chunk_1 + partialChunkSize_2 (the chunk size is an integer, i.e. 4 bytes, so let's say I only get two of those bytes from the first buffer) - I have to keep track of partialChunkSize_2 and then get the remaining bytes from the next buffer to form an integer that actually gives me the next chunkSize.
The buffer might not even be able to hold one whole chunk at a time - I have to keep calling read until the first chunk is completely read into memory.
You don't have much control over the number of calls to the hard disk. There are several layers between you and the hard disk (OS, driver, hardware buffering) that you cannot control.
Set a reasonable buffer size in your Java code (1M) and forget about it unless and until you can prove there is a performance issue that is directly related to buffer sizes. In other words, do not fall into the trap of premature optimization.
See also https://stackoverflow.com/a/385529/18157
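One way to sidestep the partial-chunk bookkeeping entirely is to let a BufferedInputStream do the disk batching and read through a DataInputStream, whose readInt()/readFully() calls always return complete values. A minimal sketch, assuming the 4-byte size prefix is big-endian and the layout is {size}{chunk} as described above (the file name and process() are placeholders):

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

public class ChunkReader {
    public static void main(String[] args) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream("data.bin"), 1024 * 1024))) {
            while (true) {
                int chunkSize;
                try {
                    chunkSize = in.readInt();     // 4-byte big-endian size prefix
                } catch (EOFException eof) {
                    break;                        // clean end of file
                }
                byte[] chunk = new byte[chunkSize];
                in.readFully(chunk);              // blocks until the whole chunk is in memory
                process(chunk);                   // placeholder for the real work
            }
        }
    }

    private static void process(byte[] chunk) {
        // hypothetical processing step
    }
}

The 1 MB buffer only controls how the underlying reads are batched; it does not have to match the chunk size.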
You might need to do some analysis to get an idea of the average chunk size before choosing a buffer size for reading the data.
You are essentially saying: keep a buffer and read data until the chunk is complete, so that you have meaningful data.
Are you copying the file somewhere else, or sending the data to another place? For some workloads the Java NIO packages have better options than reading data into JVM buffers.
The buffer size should be large enough to read most chunks in one go.
If you plan to hold the data in memory, reading it through buffers and keeping it there is still a memory-costly operation; buffers can be freed in several ways, e.g. with basic flush operations.
Please also check the Apache Commons IO file utilities for reading/writing data.

Performance characteristics of memory mapped file

Background:
I have a Java application which does intensive IO on quite large
memory mapped files (> 500 MB). The program reads data, writes data,
and sometimes does both.
All read/write functions have similar computation complexity.
I benchmarked the IO layer of the program and noticed strange
performance characteristics of memory mapped files:
It performs 90k reads per second (read 1KB every iteration at random position)
It performs 38k writes per second (write 1KB every iteration sequentially)
It performs 43k writes per second (write 4 bytes every iteration at random position)
It performs only 9k read/write combined operation per second (read 12 bytes then write 1KB every iteration, at random position)
The program runs on 64-bit JDK 1.7 on Linux 3.4.
The machine is an ordinary Intel PC with an 8-thread CPU and 4 GB of physical memory. Only 1 GB was assigned to the JVM heap when conducting the benchmark.
If more details are needed, here is the benchmark code: https://github.com/HouzuoGuo/Aurinko2/blob/master/src/test/scala/storage/Benchmark.scala
And here is the implementation of the above read, write, read/write functions: https://github.com/HouzuoGuo/Aurinko2/blob/master/src/main/scala/aurinko2/storage/Collection.scala
So my questions are:
Given fixed file size and memory size, what factors affect memory mapped file random read performance?
Given fixed file size and memory size, what factors affect memory mapped file random write performance?
How do I explain the benchmark result of read/write combined operation? (I was expecting it to perform over 20K iterations per second).
Thank you.
Memory-mapped file performance depends on disk performance, file system type, the free memory available for the file system cache, and the read/write block size. The page size on Linux is 4K, so you should expect the best performance with 4K reads/writes. Accessing a random position causes a page fault if the page is not mapped, which triggers a new page read. Usually, you want a memory-mapped file when you want to treat the file as one big memory array (or a ByteBuffer in Java).
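For reference, a minimal sketch of mapping a file and reading 1 KB blocks at random positions (not the original benchmark; the file name, mapping size, and iteration count are arbitrary):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Random;

public class MmapDemo {
    public static void main(String[] args) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile("mapped.bin", "rw");
             FileChannel ch = raf.getChannel()) {
            // Map 64 MB; the file is grown to this size if needed.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64L * 1024 * 1024);
            Random rnd = new Random();
            byte[] block = new byte[1024];        // 1 KB per access, as in the benchmark
            for (int i = 0; i < 100_000; i++) {
                int pos = rnd.nextInt(buf.capacity() - block.length);
                buf.position(pos);
                buf.get(block);                   // touching an unmapped page faults it in here
            }
        }
    }
}

Each 1 KB read at a random offset touches one or two 4K pages, so the page-fault and cache behaviour, not the Java code, sets the ceiling.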

I want to use Java to calculate network speed, but I don't know how

I thought of using file downloads to calculate the speed, but it turned out to be unsuccessful. What I did was as follows:
I download a file and read the file size every second, while observing the network speed with a small tool at the same time. I found that the file grows by only about 300 KB per second, but the tool shows the JVM downloading at up to 4 Mb/s.
I'm out of ideas now and need your help.
When you are looking at the amount of actual data, you are usually measuring in bytes (8 bits) and excluding TCP/IP headers (which can be 54 bytes). When you are looking at the raw connection, you are measuring in bits and including the headers. If the packets are fairly small (i.e. with a significant header overhead), you can have a 4 Mb/s (b for bit) connection and see only 300 kB/s (B for byte, or octet) of actual data.
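A back-of-the-envelope check (the per-packet payload of 100 bytes is an assumed figure, purely for illustration):

public class PayloadThroughput {
    public static void main(String[] args) {
        double linkBitsPerSec = 4_000_000.0;  // 4 Mb/s raw link speed
        int headerBytes = 54;                 // approx. Ethernet + IP + TCP headers
        int payloadBytes = 100;               // assumed small payload per packet
        double bitsPerPacket = (headerBytes + payloadBytes) * 8.0;
        double packetsPerSec = linkBitsPerSec / bitsPerPacket;
        double payloadBytesPerSec = packetsPerSec * payloadBytes;
        System.out.printf("Effective payload: about %.0f kB/s%n", payloadBytesPerSec / 1000.0);
    }
}

With those numbers the payload rate comes out around 325 kB/s, which is in the same ballpark as the 300 KB/s the poster observed next to a 4 Mb/s raw figure.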

Efficiency of Code: Java Transfer File over TCP

I would like to know the difference in terms of performance between these two blocks that try to send a big file over a TCP socket.
I couldn't find much resources explaining their efficiency.
A-
byte[] buffer = new byte[1024];
int number;
while ((number = fileInputStream.read(buffer)) != -1) {
socketOutputStream.write(buffer, 0, number);
}
B-
byte[] mybytearray = new byte[filesize];
os.write(mybytearray);
Which one is better in terms of transfer delay?
Also, what is the difference if I set the buffer size to 1024 or 65536? How would that affect the performance?
The latency until the last byte of the file arrives is basically identical. However the first one is preferable, although with a much larger buffer, for the following reasons:
The data starts arriving sooner.
There is no assumption that the file size fits into an int.
There is no assumption that the entire file fits into memory, so it scales to very large files without code changes.
Your MTU (Maximum Transmission Unit) size is likely to be around 1500 bytes. This means your data will be broken up into (or combined into) units of this size no matter what you do. Any reasonable buffer size from 512 bytes up is likely to give you the same transfer speed.
How you send and receive data impacts the amount of CPU you use. Unless you have a very fast network, e.g. 10 Gb/s, your CPU will more than keep up with your network.
Writing the code in an efficient manner will ensure you don't waste CPU (which is a good thing), but it shouldn't make much difference to your transfer speed, which is limited by your bandwidth (and the latency of your network).
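For completeness, a sketch of variant A with a larger buffer (64 KB here; the path and socket are placeholders, and note that closing the socket's output stream also closes the socket):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class SendFile {
    public static void send(String path, Socket socket) throws IOException {
        byte[] buffer = new byte[64 * 1024];          // larger buffer, still constant memory
        try (InputStream in = new BufferedInputStream(new FileInputStream(path));
             OutputStream out = socket.getOutputStream()) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);              // forward exactly what was read
            }
            out.flush();
        }                                             // closing out also closes the socket
    }
}

As the answers note, any buffer from a few hundred bytes up will saturate a typical link; the larger buffer mainly reduces the number of system calls, i.e. CPU time, not wall-clock transfer time.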
