What is the fastest java file copy method with progress monitoring? - java

I'm working on a file copying application that copies files from a client machine to a network folder (UNC path). The client and the network folder are connected by a 10 Gbps link. The traditional stream/buffer approach could only use up to 250 Mbps, which is why I started using NIO methods. Both Files.copy() and transferFrom() could use up to 6 Gbps of bandwidth, which is sufficient for now. The problem is that neither method reports progress, and I need to display the copy progress in my application.
Then I found the ReadableByteChannel interface for tracking upload progress. But after implementing it, the upload speed dropped to 100 Mbps. I'm not sure whether I implemented it correctly.
OS-level copying (Ctrl+C and Ctrl+V) achieves 6 Gbps bandwidth utilization. How can I achieve the same in Java with progress monitoring?
public class AppTest {
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
File source = new File(args[0]);
File dest = new File(args[1] + File.separator + source.getName());
long startTime = System.currentTimeMillis();
try {
if (args[2].equalsIgnoreCase("s")) {
copyUsingStream(source, dest, args.length > 3 ? Integer.parseInt(args[3]) : 32 * 1024);
} else if (args[2].equalsIgnoreCase("fp")) {
copyUsingFileChannelWithProgress(source, dest);
} else if (args[2].equalsIgnoreCase("f")){
copyUsingFileChannels(source, dest);
} else if (args[2].equalsIgnoreCase("j")) {
copyUsingFilescopy(source, dest);
} else {
System.out.println("Unknown copy option.");
}
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Completed in " + (System.currentTimeMillis() - startTime));
}
private static void copyUsingStream(File source, File dest, int buf_size) throws IOException {
System.out.println("Copying using feeder code...");
System.out.println("Buffer Size : " + buf_size);
FileInputStream sourceFileIS = new FileInputStream(source);
FileOutputStream srvrFileOutStrm = new FileOutputStream(dest);
byte[] buf = new byte[buf_size];
int dataReadLen;
while ((dataReadLen = sourceFileIS.read(buf)) > 0) {
srvrFileOutStrm.write(buf, 0, dataReadLen);
}
srvrFileOutStrm.close();
sourceFileIS.close();
}
private static void copyUsingFileChannels(File source, File dest)
throws IOException {
System.out.println("Copying using filechannel...");
FileChannel inputChannel = null;
FileChannel outputChannel = null;
try {
inputChannel = new FileInputStream(source).getChannel();
outputChannel = new FileOutputStream(dest).getChannel();
outputChannel.transferFrom(inputChannel, 0, inputChannel.size());
} finally {
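// Guard against null: if opening either file failed, that channel was never assigned.
if (inputChannel != null) inputChannel.close();
if (outputChannel != null) outputChannel.close();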
}
}
private static void copyUsingFilescopy(File source, File dest) throws IOException{
Files.copy(source.toPath(), dest.toPath());
}
interface ProgressCallBack {
public void callback(CallbackByteChannel rbc, double progress);
}
static class CallbackByteChannel implements ReadableByteChannel {
ProgressCallBack delegate;
long size;
ReadableByteChannel rbc;
long sizeRead;
CallbackByteChannel(ReadableByteChannel rbc, long sizeRead, long expectedSize, ProgressCallBack delegate) {
this.delegate = delegate;
this.sizeRead = sizeRead;
this.size = expectedSize;
this.rbc = rbc;
}
@Override
public void close() throws IOException {
rbc.close();
}
public long getReadSoFar() {
return sizeRead;
}
@Override
public boolean isOpen() {
return rbc.isOpen();
}
@Override
public int read(ByteBuffer bb) throws IOException {
int n;
double progress;
if ((n = rbc.read(bb)) > 0) {
sizeRead += n;
progress = size > 0 ? (double) sizeRead / (double) size * 100.0 : -1.0;
delegate.callback(this, progress);
}
return n;
}
}
private static void copyUsingFileChannelWithProgress(File sourceFile, File destFile) throws IOException {
ProgressCallBack progressCallBack = new ProgressCallBack() {
@Override
public void callback(CallbackByteChannel rbc, double progress) {
// publish((int)progress);
}
};
FileOutputStream fos = null;
FileChannel sourceChannel = null;
sourceChannel = new FileInputStream(sourceFile).getChannel();
ReadableByteChannel rbc = new CallbackByteChannel(sourceChannel, 0, sourceFile.length(), progressCallBack);
fos = new FileOutputStream(destFile);
fos.getChannel().transferFrom(rbc, 0, sourceFile.length());
if (sourceChannel.isOpen()) {
sourceChannel.close();
}
fos.close();
}
}

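Use transferFrom() in a loop with a chunk size that is large but still smaller than the file size, and update the progress after each chunk. You trade some speed for progress indication here, so you will probably want chunks of at least 1 MB to retain throughput. Wrapping the source in a custom ReadableByteChannel likely keeps transferFrom() off its fast FileChannel-to-FileChannel path, which would explain the slowdown you saw.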
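A minimal sketch of that loop follows. The class name ChunkedCopy, the 4 MiB chunk size, and the LongConsumer callback are assumptions made for illustration, not part of the original answer; the right chunk size for a given network should be found by measuring.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.function.LongConsumer;

final class ChunkedCopy {
    // Assumed chunk size (4 MiB); tune it by measurement.
    static final long CHUNK_SIZE = 4L * 1024 * 1024;

    /** Copies source to dest chunk by chunk and reports the running byte total after each chunk. */
    static long copy(Path source, Path dest, LongConsumer onProgress) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dest, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long count = Math.min(CHUNK_SIZE, size - position);
                // Reads from the source channel's current position and writes at 'position'
                // in the destination; it may move fewer bytes than requested.
                long transferred = out.transferFrom(in, position, count);
                if (transferred <= 0) {
                    break; // source ended early, e.g. the file shrank during the copy
                }
                position += transferred;
                onProgress.accept(position);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Paths.get(args[0]);
        Path dest = Paths.get(args[1]);
        long total = Math.max(1, source.toFile().length());
        copy(source, dest, done -> System.out.println((done * 100 / total) + "%"));
    }
}
```

Both ends stay plain FileChannels here, so each chunk can still use the direct transfer path, and the callback runs once per chunk rather than once per small read.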

Related

Pipe Broken with PipeInputStream with kubernetes-client exec()

I'm using the kubernetes-client to try to copy a directory from a pod, but I'm doing something wrong with the input stream from stdout. I get a java.io.IOException: Pipe broken exception when it tries to read(). I'm pretty sure that no data flows at all. I'm half wondering whether I need to read the InputStream on a separate thread or something.
The stream is created like this:
public InputStream copyFiles(String containerId,
String folderName) {
ExecWatch exec = client.pods().withName(containerId).redirectingOutput().exec("tar -C " + folderName + " -c");
// We need to wrap the InputStream so that when the stdout is closed, then the underlying ExecWatch is closed
// also. This will cleanup any Websockets connections.
ChainedCloseInputStreamWrapper inputStreamWrapper = new ChainedCloseInputStreamWrapper(exec.getOutput(), exec);
return inputStreamWrapper;
}
And the InputStream is processed in this function
void copyVideos(final String containerId) {
TarArchiveInputStream tarStream = new TarArchiveInputStream(containerClient.copyFiles(containerId, "/videos/"));
TarArchiveEntry entry;
boolean videoWasCopied = false;
try {
while ((entry = tarStream.getNextTarEntry()) != null) {
if (entry.isDirectory()) {
continue;
}
String fileExtension = entry.getName().substring(entry.getName().lastIndexOf('.'));
testInformation.setFileExtension(fileExtension);
File videoFile = new File(testInformation.getVideoFolderPath(), testInformation.getFileName());
File parent = videoFile.getParentFile();
if (!parent.exists()) {
parent.mkdirs();
}
OutputStream outputStream = new FileOutputStream(videoFile);
IOUtils.copy(tarStream, outputStream);
outputStream.close();
videoWasCopied = true;
LOGGER.log(Level.INFO, "{0} Video file copied to: {1}/{2}", new Object[]{getId(),
testInformation.getVideoFolderPath(), testInformation.getFileName()});
}
} catch (IOException e) {
LOGGER.log(Level.WARNING, getId() + " Error while copying the video", e);
ga.trackException(e);
} finally {
if (!videoWasCopied) {
testInformation.setVideoRecorded(false);
}
}
}
The InputStream Wrapper class is just there to close the ExecWatch at the end once the InputStream is closed, it looks like this:
private static class ChainedCloseInputStreamWrapper extends InputStream {
private InputStream delegate;
private Closeable resourceToClose;
public ChainedCloseInputStreamWrapper(InputStream delegate, Closeable resourceToClose) {
this.delegate = delegate;
this.resourceToClose = resourceToClose;
}
@Override
public int read() throws IOException {
return delegate.read();
}
public int available() throws IOException {
return delegate.available();
}
public void close() throws IOException {
logger.info("Shutdown called!");
delegate.close();
// Close our dependent resource
resourceToClose.close();
}
public boolean equals(Object o) {
return delegate.equals(o);
}
public int hashCode() {
return delegate.hashCode();
}
public int read(byte[] array) throws IOException {
return delegate.read(array);
}
public int read(byte[] array,
int n,
int n2) throws IOException {
return delegate.read(array, n, n2);
}
public long skip(long n) throws IOException {
return delegate.skip(n);
}
public void mark(int n) {
delegate.mark(n);
}
public void reset() throws IOException {
delegate.reset();
}
public boolean markSupported() {
return delegate.markSupported();
}
public String toString() {
return delegate.toString();
}
}
It turns out I had the tar command wrong, so it failed and the stdout PipeInputStream was deadlocking. I managed to find a workaround for the deadlock, but the main reason for the failure was that I forgot to tell tar to actually do something! I at least needed a "." to include the current directory.

How to filter a particular directory and copy the rest of the folders in java

I am trying to copy folders and files, which is working fine, but I need help with how to filter out a single folder and copy the rest. For example, I have the directories carsfolder and truckfolder (C:\vehicle\carsfolder and C:\vehicle\truckfolder). When I use the code below, it copies both carsfolder and truckfolder, but I want to copy only carsfolder. How can I do that? Your help is highly appreciated. (Using Swing and Java 1.6.)
class CopyTask extends SwingWorker<Void, Integer>
{
private File source;
private File target;
private long totalBytes = 0;
private long copiedBytes = 0;
public CopyTask(File src, File dest)
{
this.source = src;
this.target = dest;
progressAll.setValue(0);
}
@Override
public Void doInBackground() throws Exception
{
ta.append("Retrieving info ... "); //append to TextArea
retrieveTotalBytes(source);
ta.append("Done!\n");
copyFiles(source, target);
return null;
}
@Override
public void process(List<Integer> chunks)
{
for(int i : chunks)
{
}
}
@Override
public void done()
{
setProgress(100);
}
private void retrieveTotalBytes(File sourceFile)
{
try
{
File[] files = sourceFile.listFiles();
for(File file : files)
{
if(file.isDirectory()) retrieveTotalBytes(file);
else totalBytes += file.length();
}
}
catch(Exception ee)
{
}
}
private void copyFiles(File sourceFile, File targetFile) throws IOException
{
if(sourceFile.isDirectory())
{
try{
if(!targetFile.exists()) targetFile.mkdirs();
String[] filePaths = sourceFile.list();
for(String filePath : filePaths)
{
File srcFile = new File(sourceFile, filePath);
File destFile = new File(targetFile, filePath);
copyFiles(srcFile, destFile);
}
}
catch(Exception ie)
{
}
}
else
{
try
{
ta.append("Copying " + sourceFile.getAbsolutePath() + " to " + targetFile.getAbsolutePath() );
bis = new BufferedInputStream(new FileInputStream(sourceFile));
bos = new BufferedOutputStream(new FileOutputStream(targetFile));
long fileBytes = sourceFile.length();
long soFar = 0;
int theByte;
while((theByte = bis.read()) != -1)
{
bos.write(theByte);
setProgress((int) (copiedBytes++ * 100 / totalBytes));
publish((int) (soFar++ * 100 / fileBytes));
}
bis.close();
bos.close();
publish(100);
ta.append(" Done!\n");
}
catch(Exception excep)
{
setProgress(0);
bos.flush();
bis.close();
bos.close();
}
finally{
try {
bos.flush();
}
catch (Exception e) {
}
try {
bis.close();
}
catch (Exception e) {
}
try {
bos.close();
}
catch (Exception e) {
}
}
}
}
}
Maybe you can introduce a regex, or a list of regexes, that specifies which files and directories to exclude.
For example, to exclude truckfolder, use an exclusion regex like "C:\\vehicle\\truckfolder.*" (each backslash in the path has to be escaped in the regex, and escaped once more if the regex is written as a Java string literal).
Then, in your code, before you copy anything, check that the absolute path of the source file doesn't match any of the exclusion regexes.
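One way this check might look is sketched below. The ExclusionFilter class and its method names are hypothetical. It uses Pattern.quote so that Windows backslashes need no manual escaping, and it only matches the directory itself or paths beneath it, so a sibling such as truckfolder2 is not excluded by accident.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ExclusionFilter {
    private final List<Pattern> exclusions = new ArrayList<Pattern>();

    /** Excludes the given directory and everything below it. */
    void excludeDirectory(File dir) {
        // Pattern.quote makes the path a literal, so "\v" or "\t" in a
        // Windows path is not interpreted as a regex escape.
        String prefix = Pattern.quote(dir.getAbsolutePath());
        // Match the directory itself, or the directory followed by a
        // path separator ('/' or '\') and anything after it.
        exclusions.add(Pattern.compile(prefix + "([/\\\\].*)?"));
    }

    /** Returns true if the file's absolute path matches any exclusion pattern. */
    boolean isExcluded(File file) {
        String path = file.getAbsolutePath();
        for (Pattern pattern : exclusions) {
            if (pattern.matcher(path).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

In the question's copyFiles method, a check such as `if (filter.isExcluded(sourceFile)) return;` as the first statement would skip truckfolder and its contents; retrieveTotalBytes would need the same check so that the overall progress total stays accurate.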

Where to provide try catch block

I am trying to copy files from Windows server1 to another Windows server2 and am not sure where to put the try/catch block. I want to inform the user whenever server1 or server2 shuts down while the copy is in progress, either through a popup or by displaying a message in a text area. Here is my SwingWorker code. Thanks in advance.
class CopyTask extends SwingWorker<Void, Integer>
{
private File source;
private File target;
private long totalBytes = 0;
private long copiedBytes = 0;
public CopyTask(File src, File dest)
{
this.source = src;
this.target = dest;
progressAll.setValue(0);
progressCurrent.setValue(0);
}
@Override
public Void doInBackground() throws Exception
{
ta.append("Retrieving info ... ");
retrieveTotalBytes(source);
ta.append("Done!\n");
copyFiles(source, target);
return null;
}
@Override
public void process(List<Integer> chunks)
{
for(int i : chunks)
{
progressCurrent.setValue(i);
}
}
@Override
public void done()
{
setProgress(100);
}
private void retrieveTotalBytes(File sourceFile)
{
File[] files = sourceFile.listFiles();
for(File file : files)
{
if(file.isDirectory()) retrieveTotalBytes(file);
else totalBytes += file.length();
}
}
private void copyFiles(File sourceFile, File targetFile) throws IOException
{
if(sourceFile.isDirectory())
{
if(!targetFile.exists()) targetFile.mkdirs();
String[] filePaths = sourceFile.list();
for(String filePath : filePaths)
{
File srcFile = new File(sourceFile, filePath);
File destFile = new File(targetFile, filePath);
copyFiles(srcFile, destFile);
}
}
else
{
ta.append("Copying " + sourceFile.getAbsolutePath() + " to " + targetFile.getAbsolutePath() ); //appends to textarea
bis = new BufferedInputStream(new FileInputStream(sourceFile));
bos = new BufferedOutputStream(new FileOutputStream(targetFile));
long fileBytes = sourceFile.length();
long soFar = 0;
int theByte;
while((theByte = bis.read()) != -1)
{
bos.write(theByte);
setProgress((int) (copiedBytes++ * 100 / totalBytes));
publish((int) (soFar++ * 100 / fileBytes));
}
bis.close();
bos.close();
publish(100);
}
}
Where is the line where the exception can happen? That's the first place I locate any exception.
Generally, if your modules are small, you can wrap the try around all the real code in the module and catch the exceptions at the end, especially if the exception is fatal. Then you can log the exception and return an error message/status to the user.
However, the strategy is different if the exception is not fatal. In this case you'll have to handle it right where the connection exception is thrown so you can seamlessly resume when the connection returns. Of course, this is a little more work.
EDIT - you probably want bis.close() and bos.close() inside a finally block to ensure they get closed. It may be pedantic but it seems prudent.
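As a rough illustration of that placement, the sketch below keeps the stream handling in one small method and catches the IOException one level up, where a user-facing message can be produced. The class and method names are made up for this example. It assumes Java 7 or later for try-with-resources, which closes the streams the same way the finally block suggested above would.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class SafeCopy {
    /** Copies one file; both streams are closed even if a read or write throws. */
    static void copyFile(File source, File target) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(source));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
            byte[] buffer = new byte[64 * 1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }

    /** Returns null on success, or a message that can be shown to the user. */
    static String tryCopy(File source, File target) {
        try {
            copyFile(source, target);
            return null;
        } catch (IOException e) {
            // A server becoming unreachable mid-copy surfaces here as an IOException.
            return "Copy failed for " + source + ": " + e.getMessage();
        }
    }
}
```

Inside a SwingWorker, another option is to let the exception propagate out of doInBackground() and call get() in done(): get() throws an ExecutionException wrapping the original exception, and done() runs on the event dispatch thread, so it is a safe place to show a dialog or append to the text area.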

Java: How to get upload and download speed

I wrote a program to upload and download files to an FTP server, but I cannot monitor the speed and the transfer rate.
I used the FTPClient class and its two methods retrieveFile() and storeFile().
Give this a try:
public class ReportingOutputStream extends OutputStream {
public static final String BYTES_PROP = "Bytes";
private FileOutputStream fileStream;
private long byteCount = 0L;
private long lastByteCount = 0L;
private long updateInterval = 1L << 10;
private long nextReport = updateInterval;
private PropertyChangeSupport changer = new PropertyChangeSupport(this);
public ReportingOutputStream(File f) throws IOException {
fileStream = new FileOutputStream(f);
}
public void setUpdateInterval(long bytes) {
updateInterval = bytes;
nextReport = updateInterval;
}
@Override
public void write(int b) throws IOException {
byte[] bytes = { (byte) (b & 0xFF) };
write(bytes, 0, 1);
}
@Override
public void write(byte[] b, int off, int len) throws IOException {
fileStream.write(b, off, len);
byteCount += len;
if (byteCount > nextReport) {
changer.firePropertyChange( BYTES_PROP, lastByteCount, byteCount);
lastByteCount = byteCount;
nextReport += updateInterval;
}
}
@Override
public void close() throws IOException {
if (fileStream != null) {
fileStream.close();
fileStream = null;
}
}
public void removePropertyChangeListener(String propertyName, PropertyChangeListener listener) {
changer.removePropertyChangeListener(propertyName, listener);
}
public void addPropertyChangeListener(String propertyName, PropertyChangeListener listener) {
changer.addPropertyChangeListener(propertyName, listener);
}
}
After creating the stream, add a property change listener for BYTES_PROP. By default it fires the handler for every 1 KB received. Call setUpdateInterval to change.
Since retrieveFile and storeFile deal with input and output streams, is it possible for you to write your own subclasses that can monitor the number of bytes transferred in or out over a certain time?
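A small sketch of such a subclass is shown below. RateMeter is a hypothetical name, and the rate it reports is a simple average since the stream was created. It could wrap the OutputStream passed to retrieveFile(); an InputStream counterpart for storeFile() would follow the same pattern.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class RateMeter extends FilterOutputStream {
    private final long startNanos = System.nanoTime();
    // volatile so a UI thread polling the counter sees recent values
    private volatile long byteCount;

    RateMeter(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        byteCount++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forward the whole block; FilterOutputStream's default would write byte by byte.
        out.write(b, off, len);
        byteCount += len;
    }

    /** Total bytes written through this stream so far. */
    long getByteCount() {
        return byteCount;
    }

    /** Average rate since this stream was created, in bytes per second. */
    double bytesPerSecond() {
        long elapsed = System.nanoTime() - startNanos;
        return elapsed <= 0 ? 0.0 : byteCount * 1e9 / elapsed;
    }
}
```

A timer on the UI side can poll getByteCount() and bytesPerSecond() periodically; for an instantaneous rate instead of an average, it can remember the previous count and timestamp and divide the differences.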

What is the fastest way to read a large number of small files into memory?

I need to read ~50 files on every server start and place each text file's representation into memory. Each text file will have its own string (which is the best type to use for the string holder?).
What is the fastest way to read the files into memory, and what is the best data structure/type to hold the text in so that I can manipulate it in memory (search and replace mainly)?
Thanks
A memory mapped file will be fastest... something like this:
final File file;
final FileInputStream fin;
final FileChannel channel;
final MappedByteBuffer buffer;
file = new File(fileName);
fin = new FileInputStream(file);
channel = fin.getChannel();
buffer = channel.map(MapMode.READ_ONLY, 0, file.length());
and then proceed to read from the byte buffer.
This will be significantly faster than FileInputStream or FileReader.
EDIT:
After a bit of investigation with this it turns out that, depending on your OS, you might be better off using a new BufferedInputStream(new FileInputStream(file)) instead. However reading the whole thing all at once into a char[] the size of the file sounds like the worst way.
So BufferedInputStream should give roughly consistent performance on all platforms, while the memory mapped file may be slow or fast depending on the underlying OS. As with everything that is performance critical you should test your code and see what works best.
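For the original use case (text files held in memory as strings), the two approaches might be compared with something like the sketch below. TextLoader and its method names are made up for illustration, and Files.readAllBytes stands in for the buffered-stream variant; it is not the code that was timed in this answer.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TextLoader {
    /** Maps the file into memory and decodes the mapped bytes into a String. */
    static String readMapped(Path file, Charset charset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return charset.decode(buffer).toString();
        }
    }

    /** Plain read of the whole file, used here as the non-mapped baseline. */
    static String readPlain(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }
}
```

Timing both methods over the real set of files on the target machine is the only reliable way to pick one, since the relative cost of mapping many small files depends on the operating system.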
EDIT:
Ok here are some tests (the first one is done twice to get the files into the disk cache).
I ran it on the rt.jar class files, extracted to the hard drive, this is under Windows 7 beta x64. That is 16784 files with a total of 94,706,637 bytes.
First the results...
(remember the first is repeated to get the disk cache setup)
ArrayTest
time = 83016
bytes = 118641472
ArrayTest
time = 46570
bytes = 118641472
DataInputByteAtATime
time = 74735
bytes = 118641472
DataInputReadFully
time = 8953
bytes = 118641472
MemoryMapped
time = 2320
bytes = 118641472
Here is the code...
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.util.HashSet;
import java.util.Set;
public class Main
{
public static void main(final String[] argv)
{
ArrayTest.main(argv);
ArrayTest.main(argv);
DataInputByteAtATime.main(argv);
DataInputReadFully.main(argv);
MemoryMapped.main(argv);
}
}
abstract class Test
{
public final void run(final File root)
{
final Set<File> files;
final long size;
final long start;
final long end;
final long total;
files = new HashSet<File>();
getFiles(root, files);
start = System.currentTimeMillis();
size = readFiles(files);
end = System.currentTimeMillis();
total = end - start;
System.out.println(getClass().getName());
System.out.println("time = " + total);
System.out.println("bytes = " + size);
}
private void getFiles(final File dir,
final Set<File> files)
{
final File[] childeren;
childeren = dir.listFiles();
for(final File child : childeren)
{
if(child.isFile())
{
files.add(child);
}
else
{
getFiles(child, files);
}
}
}
private long readFiles(final Set<File> files)
{
long size;
size = 0;
for(final File file : files)
{
size += readFile(file);
}
return (size);
}
protected abstract long readFile(File file);
}
class ArrayTest
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new ArrayTest();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
InputStream stream;
stream = null;
try
{
final byte[] data;
int soFar;
int sum;
stream = new BufferedInputStream(new FileInputStream(file));
data = new byte[(int)file.length()];
soFar = 0;
do
{
soFar += stream.read(data, soFar, data.length - soFar);
}
while(soFar != data.length);
sum = 0;
for(final byte b : data)
{
sum += b;
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputByteAtATime
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new DataInputByteAtATime();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
DataInputStream stream;
stream = null;
try
{
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++)
{
sum += stream.readByte();
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadFully
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new DataInputReadFully();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
DataInputStream stream;
stream = null;
try
{
final byte[] data;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
data = new byte[(int)file.length()];
stream.readFully(data);
sum = 0;
for(final byte b : data)
{
sum += b;
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadInChunks
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new DataInputReadInChunks();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
DataInputStream stream;
stream = null;
try
{
final byte[] data;
int size;
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
data = new byte[512];
size = 0;
sum = 0;
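// read() returns the number of bytes placed in data, or -1 at end of stream,
// so only the first n bytes of the 512-byte buffer are summed on each pass.
int n;
while((n = stream.read(data)) != -1)
{
size += n;
for(int i = 0; i < n; i++)
{
sum += data[i];
}
}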
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
class MemoryMapped
extends Test
{
public static void main(final String[] argv)
{
final Test test;
test = new MemoryMapped();
test.run(new File(argv[0]));
}
protected long readFile(final File file)
{
FileInputStream stream;
stream = null;
try
{
final FileChannel channel;
final MappedByteBuffer buffer;
final int fileSize;
int sum;
stream = new FileInputStream(file);
channel = stream.getChannel();
buffer = channel.map(MapMode.READ_ONLY, 0, file.length());
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++)
{
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex)
{
ex.printStackTrace();
}
finally
{
if(stream != null)
{
try
{
stream.close();
}
catch(final IOException ex)
{
ex.printStackTrace();
}
}
}
return (0);
}
}
The most efficient way is:
Determine the length of the file (File.length())
Create a char buffer with the same size (or slightly larger)
Determine the encoding of the file
Use new InputStreamReader (new FileInputStream(file), encoding) to read
Read the whole file into the buffer with a single call to read(). Note that read() might return early (not having read the whole file). In that case, call it again with an offset to read the next batch.
Create the string: new String(buffer)
If you need to search&replace once at startup, use String.replaceAll().
If you need to do it repeatedly, you may consider using StringBuilder. It has no replaceAll() but you can use it to manipulate the character array in place (-> no allocation of memory).
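The steps above can be sketched as follows. WholeFileReader is a hypothetical name, and the buffer sizing assumes an encoding such as UTF-8, ISO-8859-1, or UTF-16, where the number of decoded chars never exceeds the number of bytes in the file.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class WholeFileReader {
    /** Reads the entire file into a String using the given encoding. */
    static String read(File file, Charset encoding) throws IOException {
        // Assumption: the byte length is an upper bound on the char count.
        char[] buffer = new char[(int) file.length()];
        int filled = 0;
        try (Reader reader = new InputStreamReader(new FileInputStream(file), encoding)) {
            // read() may return early, so keep calling it with an offset until EOF.
            while (filled < buffer.length) {
                int n = reader.read(buffer, filled, buffer.length - filled);
                if (n == -1) {
                    break;
                }
                filled += n;
            }
        }
        // Multi-byte characters leave part of the buffer unused, so pass the filled length.
        return new String(buffer, 0, filled);
    }
}
```

For the replacement step, note that String.replaceAll() treats its first argument as a regular expression, while String.replace() does a literal replacement, which is often what a plain search and replace needs.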
That said:
Make your code as short and simple as possible.
Measure the performance
If it's too slow, fix it.
There is no reason to spend a lot of time making this code run fast if it takes just 0.1s to execute.
If you still have a performance problem, consider putting all the text files into a JAR, adding it to the classpath, and using Class.getResourceAsStream() to read the files. Loading things from the Java classpath is highly optimized.
It depends a lot on the internal structure of your text files and what you intend to do with them.
Are the files key-value dictionaries (i.e. "properties" files)? XML? JSON? You have standard structures for those.
If they have a formal structure you may also use JavaCC to build an object representation of the files.
Otherwise, if they are just blobs of data, well, read the files and put them in a String.
Edit: about search&replace - just use String's replaceAll function.
After searching Google for existing tests on IO speed in Java, I must say TofuBear's test case completely opened my eyes. You have to run his test on your own platform to see what is fastest for you.
After running his test, and adding a few of my own (Credit to TofuBear for posting his original code), it appears you may get even more speed by using your own custom buffer vs. using the BufferedInputStream.
To my dismay the NIO ByteBuffer did not perform well.
NOTE: The static byte[] buffer shaved off a few ms, but the static ByteBuffers actually increased the time to process! Is there anything wrong with the code?
I added a few tests:
1. ArrayTest_CustomBuffering (read data directly into my own buffer)
2. ArrayTest_CustomBuffering_StaticBuffer (read data into a static buffer that is created only once at the beginning)
3. FileChannelArrayByteBuffer (use an NIO ByteBuffer wrapping your own byte[] array)
4. FileChannelAllocateByteBuffer (use an NIO ByteBuffer with .allocate)
5. FileChannelAllocateByteBuffer_StaticBuffer (same as 4 but with a static buffer)
6. FileChannelAllocateDirectByteBuffer (use an NIO ByteBuffer with .allocateDirect)
7. FileChannelAllocateDirectByteBuffer_StaticBuffer (same as 6 but with a static buffer)
Here are my results, using Windows Vista and jdk1.6.0_13 on the extracted rt.jar:
ArrayTest
time = 2075
bytes = 2120336424
ArrayTest
time = 2044
bytes = 2120336424
ArrayTest_CustomBuffering
time = 1903
bytes = 2120336424
ArrayTest_CustomBuffering_StaticBuffer
time = 1872
bytes = 2120336424
DataInputByteAtATime
time = 2668
bytes = 2120336424
DataInputReadFully
time = 2028
bytes = 2120336424
MemoryMapped
time = 2901
bytes = 2120336424
FileChannelArrayByteBuffer
time = 2371
bytes = 2120336424
FileChannelAllocateByteBuffer
time = 2356
bytes = 2120336424
FileChannelAllocateByteBuffer_StaticBuffer
time = 2668
bytes = 2120336424
FileChannelAllocateDirectByteBuffer
time = 2512
bytes = 2120336424
FileChannelAllocateDirectByteBuffer_StaticBuffer
time = 2590
bytes = 2120336424
My hacked version of TofuBear's code:
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.util.HashSet;
import java.util.Set;
public class Main {
public static void main(final String[] argv) {
ArrayTest.mainx(argv);
ArrayTest.mainx(argv);
ArrayTest_CustomBuffering.mainx(argv);
ArrayTest_CustomBuffering_StaticBuffer.mainx(argv);
DataInputByteAtATime.mainx(argv);
DataInputReadFully.mainx(argv);
MemoryMapped.mainx(argv);
FileChannelArrayByteBuffer.mainx(argv);
FileChannelAllocateByteBuffer.mainx(argv);
FileChannelAllocateByteBuffer_StaticBuffer.mainx(argv);
FileChannelAllocateDirectByteBuffer.mainx(argv);
FileChannelAllocateDirectByteBuffer_StaticBuffer.mainx(argv);
}
}
abstract class Test {
static final int BUFF_SIZE = 20971520;
static final byte[] StaticData = new byte[BUFF_SIZE];
static final ByteBuffer StaticBuffer =ByteBuffer.allocate(BUFF_SIZE);
static final ByteBuffer StaticDirectBuffer = ByteBuffer.allocateDirect(BUFF_SIZE);
public final void run(final File root) {
final Set<File> files;
final long size;
final long start;
final long end;
final long total;
files = new HashSet<File>();
getFiles(root, files);
start = System.currentTimeMillis();
size = readFiles(files);
end = System.currentTimeMillis();
total = end - start;
System.out.println(getClass().getName());
System.out.println("time = " + total);
System.out.println("bytes = " + size);
}
private void getFiles(final File dir,final Set<File> files) {
final File[] childeren;
childeren = dir.listFiles();
for(final File child : childeren) {
if(child.isFile()) {
files.add(child);
}
else {
getFiles(child, files);
}
}
}
private long readFiles(final Set<File> files) {
long size;
size = 0;
for(final File file : files) {
size += readFile(file);
}
return (size);
}
protected abstract long readFile(File file);
}
class ArrayTest extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new ArrayTest();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
InputStream stream;
stream = null;
try {
final byte[] data;
int soFar;
int sum;
stream = new BufferedInputStream(new FileInputStream(file));
data = new byte[(int)file.length()];
soFar = 0;
do {
soFar += stream.read(data, soFar, data.length - soFar);
}
while(soFar != data.length);
sum = 0;
for(final byte b : data) {
sum += b;
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class ArrayTest_CustomBuffering extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new ArrayTest_CustomBuffering();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
InputStream stream;
stream = null;
try {
final byte[] data;
int soFar;
int sum;
stream = new FileInputStream(file);
data = new byte[(int)file.length()];
soFar = 0;
do {
soFar += stream.read(data, soFar, data.length - soFar);
}
while(soFar != data.length);
sum = 0;
for(final byte b : data) {
sum += b;
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class ArrayTest_CustomBuffering_StaticBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new ArrayTest_CustomBuffering_StaticBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
InputStream stream;
stream = null;
try {
int soFar;
int sum;
final int fileSize;
stream = new FileInputStream(file);
fileSize = (int)file.length();
soFar = 0;
do {
soFar += stream.read(StaticData, soFar, fileSize - soFar);
}
while(soFar != fileSize);
sum = 0;
for(int i=0;i<fileSize;i++) {
sum += StaticData[i];
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputByteAtATime extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new DataInputByteAtATime();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
DataInputStream stream;
stream = null;
try {
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += stream.readByte();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadFully extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new DataInputReadFully();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
DataInputStream stream;
stream = null;
try {
final byte[] data;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
data = new byte[(int)file.length()];
stream.readFully(data);
sum = 0;
for(final byte b : data) {
sum += b;
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class DataInputReadInChunks extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new DataInputReadInChunks();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
DataInputStream stream;
stream = null;
try {
final byte[] data;
int size;
final int fileSize;
int sum;
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
fileSize = (int)file.length();
data = new byte[512];
size = 0;
sum = 0;
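do {
// read() reports how many bytes of this chunk were filled, or -1 at end of file
final int count = stream.read(data);
if(count < 0) {
break;
}
// sum only the bytes of the current chunk; the total accumulates across chunks
for(int i = 0; i < count; i++) {
sum += data[i];
}
size += count;
}
while(size != fileSize);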
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class MemoryMapped extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new MemoryMapped();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final FileChannel channel;
final MappedByteBuffer buffer;
final int fileSize;
int sum;
stream = new FileInputStream(file);
channel = stream.getChannel();
buffer = channel.map(MapMode.READ_ONLY, 0, file.length());
fileSize = (int)file.length();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelArrayByteBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelArrayByteBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
final ByteBuffer buffer;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
data = new byte[(int)file.length()];
buffer = ByteBuffer.wrap(data);
channel = stream.getChannel();
fileSize = (int)file.length();
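// a single read() may return fewer bytes than requested; repeat until full or EOF
do {
nRead = channel.read(buffer);
}
while(nRead > 0 && buffer.hasRemaining());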
buffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateByteBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateByteBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
final ByteBuffer buffer;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
buffer = ByteBuffer.allocate((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
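// a single read() may return fewer bytes than requested; repeat until full or EOF
do {
nRead = channel.read(buffer);
}
while(nRead > 0 && buffer.hasRemaining());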
buffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateDirectByteBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateDirectByteBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
final ByteBuffer buffer;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
buffer = ByteBuffer.allocateDirect((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
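// a single read() may return fewer bytes than requested; repeat until full or EOF
do {
nRead = channel.read(buffer);
}
while(nRead > 0 && buffer.hasRemaining());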
buffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += buffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateByteBuffer_StaticBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateByteBuffer_StaticBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
StaticBuffer.clear();
StaticBuffer.limit((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
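// a single read() may return fewer bytes than requested; repeat until full or EOF
do {
nRead = channel.read(StaticBuffer);
}
while(nRead > 0 && StaticBuffer.hasRemaining());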
StaticBuffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += StaticBuffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
class FileChannelAllocateDirectByteBuffer_StaticBuffer extends Test {
public static void mainx(final String[] argv) {
final Test test;
test = new FileChannelAllocateDirectByteBuffer_StaticBuffer();
test.run(new File(argv[0]));
}
protected long readFile(final File file) {
FileInputStream stream;
stream = null;
try {
final byte[] data;
final FileChannel channel;
int nRead=0;
final int fileSize;
int sum;
stream = new FileInputStream(file);
//data = new byte[(int)file.length()];
StaticDirectBuffer.clear();
StaticDirectBuffer.limit((int)file.length());
channel = stream.getChannel();
fileSize = (int)file.length();
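// a single read() may return fewer bytes than requested; repeat until full or EOF
do {
nRead = channel.read(StaticDirectBuffer);
}
while(nRead > 0 && StaticDirectBuffer.hasRemaining());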
StaticDirectBuffer.rewind();
sum = 0;
for(int i = 0; i < fileSize; i++) {
sum += StaticDirectBuffer.get();
}
return (sum);
}
catch(final IOException ex) {
ex.printStackTrace();
}
finally {
if(stream != null) {
try {
stream.close();
}
catch(final IOException ex) {
ex.printStackTrace();
}
}
}
return (0);
}
}
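The benchmark classes above all repeat the same open, read, and close steps by hand. As a rough sketch only (the class name ReadFullySketch and the method readAndSum are hypothetical and not part of the original benchmark), the array-based read could be written with try-with-resources and an explicit end-of-file check like this, assuming the file is smaller than 2 GB:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the original benchmark.
final class ReadFullySketch {

    // Reads the whole file into one array and returns the sum of its bytes.
    static long readAndSum(Path file) throws IOException {
        // Assumption: the file fits in a single array (smaller than 2 GB).
        final byte[] data = new byte[(int) Files.size(file)];
        int soFar = 0;
        // try-with-resources closes the stream even if read() throws.
        try (InputStream stream = Files.newInputStream(file)) {
            while (soFar < data.length) {
                final int count = stream.read(data, soFar, data.length - soFar);
                if (count < 0) {
                    break; // file ended early: stop rather than loop forever
                }
                soFar += count;
            }
        }
        long sum = 0;
        for (int i = 0; i < soFar; i++) {
            sum += data[i];
        }
        return sum;
    }
}
```

This is not expected to be faster than the versions above; it only shows the same read with less boilerplate.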
Any conventional approach is going to be limited in speed. I'm not sure you'll see much of a difference from one approach to the next.
I would concentrate on application-level tricks that could make the entire operation faster.
For instance, if you read all the files once and stored them in a single file along with the timestamp of each original file, you could check whether any of the files have changed without actually opening them (a simple cache, in other words).
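A minimal sketch of the timestamp check, assuming the cache is kept in memory rather than in the single combined file described above (the class TimestampCacheSketch and its method changedFiles are made-up names for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: remembers the last-modified time of each file.
final class TimestampCacheSketch {

    private final Map<Path, Long> lastSeen = new HashMap<>();

    // Returns the files that are new or whose timestamp changed since the
    // previous call. Only file metadata is queried; no file is opened.
    List<Path> changedFiles(List<Path> files) throws IOException {
        final List<Path> changed = new ArrayList<>();
        for (Path file : files) {
            final long modified = Files.getLastModifiedTime(file).toMillis();
            final Long previous = lastSeen.put(file, modified);
            if (previous == null || previous != modified) {
                changed.add(file);
            }
        }
        return changed;
    }
}
```

Only the files returned by changedFiles would need to be read again; a persistent version would also have to save and reload the map between runs.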
If your problem is getting a GUI up quickly, you might open the files in a background thread after your first screen is displayed.
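One way the background loading might look, as a sketch only (BackgroundLoadSketch and loadAsync are hypothetical names, and the common fork-join pool used by CompletableFuture is an assumption, not something the question specifies):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration: reads a file off the calling thread.
final class BackgroundLoadSketch {

    // Starts reading the file on a background thread and returns immediately.
    static CompletableFuture<byte[]> loadAsync(Path file) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.readAllBytes(file);
            } catch (IOException ex) {
                // supplyAsync cannot throw checked exceptions, so wrap it
                throw new UncheckedIOException(ex);
            }
        });
    }
}
```

In a Swing application the result could then be handed back to the UI with something like loadAsync(path).thenAccept(bytes -> SwingUtilities.invokeLater(...)), so the first screen is shown before the files finish loading.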
The OS can be pretty good with files. If this is part of a batch process (no user I/O), you could start with a batch file that appends all the files into one big one before launching Java, using something like this:
echo "file1" > file.all
type "file1" >> file.all
echo "file2" >> file.all
type "file2" >> file.all
Then just open file.all. (I'm not sure how much faster this will be, but it's probably the fastest approach for the conditions I just stated.)
I guess I'm just saying that, more often than not, solving a speed issue requires expanding your viewpoint a little and completely rethinking the solution using new parameters. Modifications of an existing algorithm usually give only minor speed enhancements, at the cost of readability.
You should be able to read all the files in under a second using standard tools like Commons IO's FileUtils.readFileToString(File).
You can use writeStringToFile(File, String) to save the modified file as well.
http://commons.apache.org/io/api-release/index.html?org/apache/commons/io/FileUtils.html
BTW: 50 is not a large number of files. A typical PC can have 100K files or more.
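For readers without Commons IO on the classpath, the same read, modify, and write round trip can be sketched with the standard library alone. This swaps in java.nio.file.Files in place of FileUtils; the class ReadModifyWriteSketch, the method replaceInFile, and the UTF-8 encoding are assumptions made for illustration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration using only the JDK, not Commons IO.
final class ReadModifyWriteSketch {

    // Reads the whole file as text, replaces every occurrence of target,
    // and writes the result back to the same file.
    static void replaceInFile(Path file, String target, String replacement) throws IOException {
        // Assumption: the file is UTF-8 text and small enough to hold in memory.
        final String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        final String modified = text.replace(target, replacement);
        Files.write(file, modified.getBytes(StandardCharsets.UTF_8));
    }
}
```

For a few dozen small files, a loop calling replaceInFile on each one should be dominated by disk access rather than by the Java code.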
