I need to write (append) a huge string to a flat file using Java NIO. The encoding is ISO-8859-1.
Currently we are writing as shown below. Is there a better way to do the same?
public void writeToFile(Long limit) throws IOException {
    String fileName = "/xyz/test.txt";
    File file = new File(fileName);
    FileOutputStream fileOutputStream = new FileOutputStream(file, true);
    FileChannel fileChannel = fileOutputStream.getChannel();
    ByteBuffer byteBuffer = null;
    String messageToWrite = null;
    for (int i = 1; i < limit; i++) {
        // messageToWrite = get String data from database
        byteBuffer = ByteBuffer.wrap(messageToWrite.getBytes(Charset.forName("ISO-8859-1")));
        fileChannel.write(byteBuffer);
    }
    fileChannel.close();
}
EDIT: Tried both options. Following are the results.
@Test
public void testWritingStringToFile() {
    DiagnosticLogControlManagerImpl diagnosticLogControlManagerImpl = new DiagnosticLogControlManagerImpl();
    try {
        File file = diagnosticLogControlManagerImpl.createFile();
        long startTime = System.currentTimeMillis();
        writeToFileNIOWay(file);
        // writeToFileIOWay(file);
        long endTime = System.currentTimeMillis();
        System.out.println("Total Time is " + (endTime - startTime));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
/**
 * Appends test data to the given file using a FileChannel.
 *
 * @param file the file to append to
 * @throws IOException if the write fails
 */
public void writeToFileNIOWay(File file) throws IOException {
    FileOutputStream fileOutputStream = new FileOutputStream(file, true);
    FileChannel fileChannel = fileOutputStream.getChannel();
    ByteBuffer byteBuffer = null;
    String messageToWrite = null;
    for (int i = 1; i < 1000000; i++) {
        messageToWrite = "This is a test üüüüüüööööö";
        byteBuffer = ByteBuffer.wrap(messageToWrite.getBytes(Charset.forName("ISO-8859-1")));
        fileChannel.write(byteBuffer);
    }
    fileChannel.close();
}
/**
 * Appends test data to the given file using a BufferedOutputStream.
 *
 * @param file the file to append to
 * @throws IOException if the write fails
 */
public void writeToFileIOWay(File file) throws IOException {
    FileOutputStream fileOutputStream = new FileOutputStream(file, true);
    BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, 128 * 100);
    String messageToWrite = null;
    for (int i = 1; i < 1000000; i++) {
        messageToWrite = "This is a test üüüüüüööööö";
        bufferedOutputStream.write(messageToWrite.getBytes(Charset.forName("ISO-8859-1")));
    }
    bufferedOutputStream.flush();
    fileOutputStream.close();
}
private File createFile() throws IOException {
    File file = new File(FILE_PATH + "test_sixth_one.txt");
    file.createNewFile();
    return file;
}
Using ByteBuffer and Channel: took 4402 ms
Using BufferedOutputStream: took 563 ms
UPDATED:
Since Java 11 there is a dedicated method for writing strings, java.nio.file.Files.writeString:
Files.writeString(Paths.get(file.toURI()), "My string to save");
We can also customize the writing with:
Files.writeString(Paths.get(file.toURI()),
        "My string to save",
        StandardCharsets.UTF_8,
        StandardOpenOption.CREATE,
        StandardOpenOption.TRUNCATE_EXISTING);
ORIGINAL ANSWER:
There is a one-line solution, using Java NIO:
java.nio.file.Files.write(Paths.get(file.toURI()),
        "My string to save".getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE,
        StandardOpenOption.TRUNCATE_EXISTING);
I have not benchmarked this solution against the others, but using the built-in open-write-close implementation should be fast, and the code is quite small.
I don't think you will be able to get a strict answer without benchmarking your software. NIO may speed up the application significantly under the right conditions, but it may also make things slower.
Here are some points:
Do you really need strings? If you store and receive bytes from your database you can avoid string allocation and encoding costs altogether.
Do you really need rewind and flip? It seems like you are creating a new buffer for every string and just writing it to the channel. (If you go the NIO way, benchmark strategies that reuse the buffer instead of wrapping and discarding it; see the sketch at the end of this list. I think they will do better.)
Keep in mind that wrap and allocateDirect may produce quite different buffers. Benchmark both to grasp the trade-offs. With direct allocation, be sure to reuse the same buffer in order to achieve the best performance.
And the most important thing is: be sure to compare NIO with the BufferedOutputStream and/or BufferedWriter approaches (use an intermediate byte[] or char[] buffer with a reasonable size as well). I've seen many, many, many people discovering that NIO is no silver bullet.
If you fancy some bleeding edge... Back to IO Trails for some NIO2 :D.
And here is an interesting benchmark about file copying using different strategies. I know it is a different problem, but I think most of the facts and the author's conclusions also apply to your problem.
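As an illustration of the buffer-reuse point above, here is a rough, untested sketch that allocates one buffer up front and reuses it for every message; the buffer size and the getMessage() helper are made up for illustration and stand in for the "get String data from database" step:
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class ReusedBufferAppender {

    public void appendAll(String fileName, int limit) throws IOException {
        try (FileOutputStream out = new FileOutputStream(fileName, true);
             FileChannel channel = out.getChannel()) {
            // One buffer, allocated once and reused for every message.
            ByteBuffer buffer = ByteBuffer.allocate(64 * 1024);
            for (int i = 0; i < limit; i++) {
                byte[] bytes = getMessage(i).getBytes(StandardCharsets.ISO_8859_1);
                buffer.clear();
                buffer.put(bytes);              // assumes each message fits in the buffer
                buffer.flip();
                while (buffer.hasRemaining()) {
                    channel.write(buffer);      // write() may not drain the buffer in one call
                }
            }
        }
    }

    // Placeholder for the database lookup in the question.
    private String getMessage(int i) {
        return "message " + i;
    }
}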
Cheers,
UPDATE 1:
Since @EJP tipped me off that direct buffers wouldn't be efficient for this problem, I benchmarked it myself and ended up with a nice NIO solution using memory-mapped files. On my MacBook running OS X Lion this beats BufferedOutputStream by a solid margin, but keep in mind that this might be OS / hardware / VM specific:
public void writeToFileNIOWay2(File file) throws IOException {
    final int numberOfIterations = 1000000;
    final String messageToWrite = "This is a test üüüüüüööööö";
    final byte[] messageBytes = messageToWrite.getBytes(Charset.forName("ISO-8859-1"));
    final long appendSize = numberOfIterations * messageBytes.length;
    final RandomAccessFile raf = new RandomAccessFile(file, "rw");
    raf.seek(raf.length());
    final FileChannel fc = raf.getChannel();
    final MappedByteBuffer mbf = fc.map(FileChannel.MapMode.READ_WRITE, fc.position(), appendSize);
    fc.close(); // the mapping stays valid after the channel is closed
    for (int i = 0; i < numberOfIterations; i++) {
        mbf.put(messageBytes);
    }
}
I admit that I cheated a little by calculating the total size to append (around 26 MB) beforehand. This may not be possible in several real-world scenarios. Still, you can always use a "big enough" append size for the operation and truncate the file later.
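If you go that route, here is a tiny hypothetical follow-up to the method above; initialLength and bytesActuallyWritten are made-up names for whatever you tracked during the write:
mbf.force(); // flush the mapped region to disk
try (FileChannel fc2 = new RandomAccessFile(file, "rw").getChannel()) {
    // shrink the file back to the bytes that were actually written
    fc2.truncate(initialLength + bytesActuallyWritten);
}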
UPDATE 2 (2019):
To anyone looking for a modern (as in, Java 11+) solution to the problem, I would follow @DodgyCodeException's advice and use java.nio.file.Files.writeString:
String fileName = "/xyz/test.txt";
String messageToWrite = "My long string";
Files.writeString(Paths.get(fileName), messageToWrite, StandardCharsets.ISO_8859_1);
A BufferedWriter around a FileWriter will almost certainly be faster than any NIO scheme you can come up with. Your code certainly isn't optimal, with a new ByteBuffer per write, and then doing pointless operations on it when it is about to go out of scope, but in any case your question is founded on a misconception. NIO doesn't 'offload the memory footprint to the OS' at all, unless you're using FileChannel.transferTo/From(), which you can't in this instance.
NB don't use a PrintWriter as suggested in comments, as this swallows exceptions. PW is really only for consoles and log files where you don't care.
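For reference, a minimal sketch of that approach (not benchmarked), using the question's ISO-8859-1 encoding; an OutputStreamWriter is used here because FileWriter only gained a charset constructor in Java 11, and limit / messageToWrite are the names from the question:
try (Writer writer = new BufferedWriter(
        new OutputStreamWriter(
                new FileOutputStream("/xyz/test.txt", true),   // append, as in the question
                StandardCharsets.ISO_8859_1),
        128 * 1024)) {                                         // 128 KB buffer, tune as needed
    for (int i = 0; i < limit; i++) {
        writer.write(messageToWrite);
    }
}   // close() flushes the buffer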
Here is a short and easy way. It creates the file and writes the data relative to your project directory:
private void writeToFile(String filename, String data) {
    Path p = Paths.get(".", filename);
    try (OutputStream os = new BufferedOutputStream(
            Files.newOutputStream(p, StandardOpenOption.CREATE, StandardOpenOption.APPEND))) {
        // encode explicitly; the character count and the byte count can differ
        byte[] bytes = data.getBytes(StandardCharsets.ISO_8859_1);
        os.write(bytes);
    } catch (IOException e) {
        e.printStackTrace();
    }
}
This works for me:
// Create a BufferedWriter for writing to the file
BufferedWriter napiš = Files.newBufferedWriter(Paths.get(filePath));
napiš.write(what);
// Don't forget to flush, so that everything you wrote actually reaches the file:
napiš.flush();
Related
I am trying to decompress a lot of 40 MB+ files as I download them in parallel, using ByteBuffers and Channels. I am getting better throughput with Channels than with Streams, and we need this to be a very high-throughput system, since we have to process 40 TB of files every day and this part of the process is currently the bottleneck. The files are compressed with zstd-jni. Zstd-jni has APIs for decompressing byte buffers, but I get an error when I use them. How do I decompress one byte buffer at a time using zstd-jni?
I found these examples in their tests, but unless I am missing something the examples using ByteBuffers seem to assume the entire input file fits in one ByteBuffer:
https://github.com/luben/zstd-jni/blob/master/src/test/scala/Zstd.scala
Below is my code for compressing and decompressing files. The compression code works great, but the decompression code then fails with an error of -70.
public static long compressFile(String inFile, String outFolder, ByteBuffer inBuffer, ByteBuffer compressedBuffer, int compressionLevel) throws IOException {
    File file = new File(inFile);
    File outFile = new File(outFolder, file.getName() + ".zs");
    long numBytes = 0l;
    try (RandomAccessFile inRaFile = new RandomAccessFile(file, "r");
         RandomAccessFile outRaFile = new RandomAccessFile(outFile, "rw");
         FileChannel inChannel = inRaFile.getChannel();
         FileChannel outChannel = outRaFile.getChannel()) {
        inBuffer.clear();
        while (inChannel.read(inBuffer) > 0) {
            inBuffer.flip();
            compressedBuffer.clear();
            long compressedSize = Zstd.compressDirectByteBuffer(compressedBuffer, 0, compressedBuffer.capacity(), inBuffer, 0, inBuffer.limit(), compressionLevel);
            numBytes += compressedSize;
            compressedBuffer.position((int) compressedSize);
            compressedBuffer.flip();
            outChannel.write(compressedBuffer);
            inBuffer.clear();
        }
    }
    return numBytes;
}
public static long decompressFile(String originalFilePath, String inFolder, ByteBuffer inBuffer, ByteBuffer decompressedBuffer) throws IOException {
    File outFile = new File(originalFilePath);
    File inFile = new File(inFolder, outFile.getName() + ".zs");
    outFile = new File(inFolder, outFile.getName());
    long numBytes = 0l;
    try (RandomAccessFile inRaFile = new RandomAccessFile(inFile, "r");
         RandomAccessFile outRaFile = new RandomAccessFile(outFile, "rw");
         FileChannel inChannel = inRaFile.getChannel();
         FileChannel outChannel = outRaFile.getChannel()) {
        inBuffer.clear();
        while (inChannel.read(inBuffer) > 0) {
            inBuffer.flip();
            decompressedBuffer.clear();
            long compressedSize = Zstd.decompressDirectByteBuffer(decompressedBuffer, 0, decompressedBuffer.capacity(), inBuffer, 0, inBuffer.limit());
            System.out.println(Zstd.isError(compressedSize) + " " + compressedSize);
            numBytes += compressedSize;
            decompressedBuffer.position((int) compressedSize);
            decompressedBuffer.flip();
            outChannel.write(decompressedBuffer);
            inBuffer.clear();
        }
    }
    return numBytes;
}
Yes, the static methods you use in your example assume the whole compressed file fits in one ByteBuffer. As far as I understand your requirements, you need streaming decompression using ByteBuffers. ZstdDirectBufferDecompressingStream already provides this:
https://static.javadoc.io/com.github.luben/zstd-jni/1.3.7-1/com/github/luben/zstd/ZstdDirectBufferDecompressingStream.html
and here is an example how to use it (from the tests):
https://github.com/luben/zstd-jni/blob/master/src/test/scala/Zstd.scala#L261-L302
but you also have to subclass it and override the "refill" method.
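To illustrate the shape of that, here is a rough, untested sketch; the constructor, read, hasRemaining and refill signatures are taken from the Javadoc linked above and may differ between versions, and inChannel, outChannel and the buffer sizes stand in for whatever your code already has:
ByteBuffer source = ByteBuffer.allocateDirect(1024 * 1024);
source.flip(); // start "empty" so the stream pulls input through refill()

ZstdDirectBufferDecompressingStream zstdStream =
        new ZstdDirectBufferDecompressingStream(source) {
            @Override
            protected ByteBuffer refill(ByteBuffer toRefill) {
                // Called whenever the decompressor needs more compressed input.
                toRefill.clear();
                try {
                    inChannel.read(toRefill);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                toRefill.flip();
                return toRefill;
            }
        };

ByteBuffer target = ByteBuffer.allocateDirect(1024 * 1024);
while (zstdStream.hasRemaining()) {
    target.clear();
    zstdStream.read(target);   // decompresses the next chunk into target
    target.flip();
    outChannel.write(target);
}
zstdStream.close();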
EDIT: here is a new test I just added that has exactly the same structure as your question - moving data between channels:
https://github.com/luben/zstd-jni/blob/master/src/test/scala/Zstd.scala#L540-L586
I am trying to send chunks of files from a server to more than one client. When I try to send a file of size 700 MB, it shows an "OutOfMemory: Java heap space" error. I am using NetBeans 7.1.2.
I also tried the VM options in the properties, but the same error still happens. I think there is some problem with reading the entire file. The code below works for files up to 300 MB. Please give me some suggestions.
Thanks in advance
public class SplitFile {
    static int fileid = 0;

    public static DataUnit[] getUpdatableDataCode(File fileName) throws FileNotFoundException, IOException {
        int i = 0;
        DataUnit[] chunks = new DataUnit[UAProtocolServer.singletonServer.cloudhosts.length];
        FileInputStream fis;
        long Chunk_Size = (fileName.length()) / chunks.length;
        int cursor = 0;
        long fileSize = (long) fileName.length();
        int nChunks = 0, read = 0;
        long readLength = Chunk_Size;
        byte[] byteChunk;
        try {
            fis = new FileInputStream(fileName);
            //StupidTest.size = (int)fileName.length();
            while (fileSize > 0) {
                System.out.println("loop" + i);
                if (fileSize <= Chunk_Size) {
                    readLength = (int) fileSize;
                }
                byteChunk = new byte[(int) readLength];
                read = fis.read(byteChunk, 0, (int) readLength);
                fileSize -= read;
                // cursor += read;
                assert (read == byteChunk.length);
                long aid = fileid;
                aid = aid << 32 | nChunks;
                chunks[i] = new DataUnit(byteChunk, aid);
                // Lister.add(chunks[i]);
                nChunks++;
                ++i;
            }
            fis.close();
            fis = null;
        } catch (Exception e) {
            System.out.println("File splitting exception");
            e.printStackTrace();
        }
        return chunks;
    }
}
Reading in the whole file will definitely trigger an OutOfMemoryError as the file size grows. Tuning -Xmx1024M may be good as a temporary fix, but it's definitely not the right/scalable solution. Also, no matter how you move your variables around (like creating the buffer outside of the loop instead of inside it), you will get an OutOfMemoryError sooner or later. The only way to avoid it is to not read the complete file into memory.
If you have to stick to memory only (no temporary files), then one approach is to send each chunk off to the client as it is read, so you don't have to keep all the chunks in memory at once:
instead of:
chunks[i] = new DataUnit(byteChunk,aid);
do:
sendChunkToClient(new DataUnit(byteChunk, aid));
But the above solution has the drawback that if an error happens in between sending chunks, you may have a hard time trying to resume/recover from the error point.
Saving the chunks to temporary files like Ross Drew suggested is probably better and more reliable.
How about creating the
byteChunk = new byte[(int)readLength];
outside of the loop and just reuse it instead of creating an array of bytes over and over if it's always the same.
Alternatively
You could write incoming data to a temporary file as it comes in instead of maintaining that huge array then process it once it's all arrived.
Also
If you are using it multiple times as an int, you should probably just cast readLength to an int outside the loop as well:
int len = (int)readLength;
And Chunk_Size is a variable, right? It should begin with a lowercase letter.
I'm quite new to Java I/O, having never used it before, and have written this to download a .mp4 file from www.kissanime.com.
The download is very, very slow at the moment (approximately 70-100 kB/s) and I was wondering how I could speed it up. I don't really understand byte buffering, so any help with that would be appreciated. That may be my problem, I'm not sure.
Here's my code:
protected static boolean downloadFile(URL source, File dest) {
    try {
        URLConnection urlConn = source.openConnection();
        urlConn.setConnectTimeout(1000);
        urlConn.setReadTimeout(5000);
        InputStream in = urlConn.getInputStream();
        FileOutputStream out = new FileOutputStream(dest);
        BufferedOutputStream bout = new BufferedOutputStream(out);
        int fileSize = urlConn.getContentLength();
        byte[] b = new byte[65536];
        int bytesDownloaded = 0, len;
        while ((len = in.read(b)) != -1 && bytesDownloaded < fileSize) {
            bout.write(b, 0, len);
            bytesDownloaded += len;
            // System.out.println((double) bytesDownloaded / 1000000.0 + "mb/" + (double) fileSize / 1000000.0 + "mb");
        }
        bout.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return true;
}
Thanks. Any further information will be provided upon request.
I can't find any questions on here related to downloading media files, and I'm sorry if this is deemed to be a duplicate.
Try using IOUtils.toByteArray. It takes an InputStream and returns an array with all the bytes. In my opinion it's generally a good idea to check the common utility packages like apache-commons and guava and see if what you're trying to do hasn't already been done.
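For example, a minimal sketch assuming commons-io is on the classpath and reusing urlConn and dest from the question (fine for small downloads, though a 700 MB video would strain the heap):
// Read the whole response body into memory in one call.
byte[] data = IOUtils.toByteArray(urlConn.getInputStream());
// Then write it out; Files.write handles opening and closing the file.
Files.write(dest.toPath(), data);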
If you want to save the file from an InputStream, then use the method below from apache-commons:
FileUtils.copyInputStreamToFile()
public static void copyInputStreamToFile(InputStream source, File destination)
        throws IOException
Copies bytes from an InputStream source to a file destination. The directories up to destination will be created if they don't already exist. destination will be overwritten if it already exists. The source stream is closed.
Always use a library for file and IO related work when one is available. There are also some other utility methods you can explore:
IOUtils
FileUtils
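A minimal usage sketch of copyInputStreamToFile, reusing urlConn and dest from the question's downloadFile method:
// Streams the connection body straight into the destination file and closes the stream.
FileUtils.copyInputStreamToFile(urlConn.getInputStream(), dest);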
Turns out that it was the vast number of redirects from the link that caused the download speed to be throttled. Thanks everyone who answered.
I want to read a binary file whose size is 5.5 megabytes (an mp3 file). I tried it with FileInputStream but it took many attempts. If possible, I want to read the file with minimal waste of time.
You should try to use a BufferedInputStream around your FileInputStream. It will improve the performance significantly.
new BufferedInputStream(fileInputStream, 8192 /* default buffer size */);
Furthermore, I'd recommend using the read method that takes a byte array and fills it, instead of the plain single-byte read.
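A minimal sketch combining both suggestions (the file name and the 8192-byte buffer size are placeholders):
byte[] data;
try (InputStream in = new BufferedInputStream(new FileInputStream("a.mp3"), 8192);
     ByteArrayOutputStream out = new ByteArrayOutputStream()) {
    byte[] buffer = new byte[8192];
    int read;
    // read(byte[]) fills up to buffer.length bytes per call instead of one byte at a time
    while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
    }
    data = out.toByteArray();
}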
There are useful utilities in FileUtils for reading a file at once. This is simpler and efficient for modest files up to 100 MB.
byte[] bytes = FileUtils.readFileToByteArray(file); // handles IOException/close() etc.
Try this:
public static void main(String[] args) throws IOException {
    InputStream i = new FileInputStream("a.mp3");
    byte[] contents = new byte[i.available()];
    i.read(contents);
    i.close();
}
A more reliable version, based on helpful comments from @Paul Cager & Liv about the unreliability of available() and read():
public static void main(String[] args) throws IOException {
    File f = new File("c:\\msdia80.dll");
    InputStream i = new FileInputStream(f);
    byte[] contents = new byte[(int) f.length()];
    int read;
    int pos = 0;
    while ((read = i.read(contents, pos, contents.length - pos)) >= 1) {
        pos += read;
    }
    i.close();
}
I want to find out which of the two methods I have come up with for concatenating my text files in Java is better. If someone has some insight they can share about what goes on at the kernel level that explains the difference between these methods of writing to a FileChannel, I would greatly appreciate it.
From what I understand from documentation and other Stack Overflow conversations, the allocateDirect allocates space right on the drive, and mostly avoids using RAM. I have a concern that the ByteBuffer created with allocateDirect might have a potential to overflow or not be allocated if the File infile is large, say 1GB. I am guaranteed at this point in the development of our software that the File will be no larger than 2 GB; but there is potential in the future that it might be as big as 10 or 20GB.
I have observed that the transferFrom loop never iterates more than once... so it seems to succeed in writing the entire infile at once; but I haven't tested it with files bigger than 60 MB. I looped anyway, because the documentation specifies that there is no guarantee of how much will be written at once. With transferFrom only able to accept, on my system, an int32 as its count parameter, I won't be able to specify that more than 2 GB at a time be transferred... Again, kernel expertise would help me understand.
Thanks in advance for your help!!
Using a ByteBuffer:
boolean concatFiles(StringBuffer sb, File infile, File outfile) {
    FileChannel inChan = null, outChan = null;
    try {
        ByteBuffer buff = ByteBuffer.allocateDirect((int) (infile.length() + sb.length()));
        // write the StringBuffer so it goes in the output file first:
        buff.put(sb.toString().getBytes());
        // create the FileChannels:
        inChan = new RandomAccessFile(infile, "r").getChannel();
        outChan = new RandomAccessFile(outfile, "rw").getChannel();
        // read the infile into the buffer:
        inChan.read(buff);
        // prep the buffer:
        buff.flip();
        // write the buffer out to the file via the FileChannel:
        outChan.write(buff);
        inChan.close();
        outChan.close();
    } catch...etc
}
Using transferTo (or transferFrom):
boolean concatFiles(StringBuffer sb, File infile, File outfile) {
    FileChannel inChan = null, outChan = null;
    try {
        // write the StringBuffer so it goes in the output file first:
        PrintWriter fw = new PrintWriter(outfile);
        fw.write(sb.toString());
        fw.flush();
        fw.close();
        // create the channels appropriate for appending:
        outChan = new FileOutputStream(outfile, true).getChannel();
        inChan = new RandomAccessFile(infile, "r").getChannel();
        long startSize = outfile.length();
        long inFileSize = infile.length();
        long bytesWritten = 0;
        // set the position where we should start appending the data:
        outChan.position(startSize);
        long startByte = outChan.position();
        while (bytesWritten < inFileSize) {
            bytesWritten += outChan.transferFrom(inChan, startByte, (int) inFileSize);
            startByte = bytesWritten + 1;
        }
        inChan.close();
        outChan.close();
    } catch ... etc
}
transferTo() can be far more efficient as there is less data copying, or none if it can all be done in the kernel. And if it isn't on your platform it will still use highly tuned code.
You do need the loop; one day it will actually iterate more than once, and your code will keep working.
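For completeness, a hedged sketch (not benchmarked) of what such a loop can look like when appending infile to outfile; position handling is the part the original code gets tangled up in:
try (FileChannel inChan = new RandomAccessFile(infile, "r").getChannel();
     FileChannel outChan = new RandomAccessFile(outfile, "rw").getChannel()) {
    long writePosition = outChan.size();   // append after the existing content
    long remaining = inChan.size();
    long transferred = 0;
    while (transferred < remaining) {
        // transferFrom reads from inChan's current position and writes at the given position;
        // it may transfer fewer bytes than requested, hence the loop.
        long n = outChan.transferFrom(inChan, writePosition + transferred, remaining - transferred);
        if (n <= 0) {
            break;   // defensive: nothing was transferred, avoid spinning forever
        }
        transferred += n;
    }
}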