I need to read a huge file (15+GB) and perform some minor modifications (add some newlines so a different parser can actually work with it). You might think that there are already answers for doing this normally:
Reading a very huge file in java
How to read a large text file line by line using Java?
but my entire file is on one line.
My general approach so far is very basic:
// writer, originalFileSize and the ReaderUTF8 class are initialised elsewhere
char[] buffer = new char[X];
BufferedReader reader = new BufferedReader(new ReaderUTF8(new FileInputStream(new File("myFileName"))), X);
char[] bufferOut = new char[X+a little];
int bytesRead = -1;
int i = 0;
int offset = 0;
long totalBytesRead = 0;
int countToPrint = 0;
while((bytesRead = reader.read(buffer)) >= 0){
for(i = 0; i < bytesRead; i++){
if(buffer[i] == '}'){
bufferOut[i+offset] = '}';
offset++;
bufferOut[i+offset] = '\n';
}
else{
bufferOut[i+offset] = buffer[i];
}
}
writer.write(bufferOut, 0, bytesRead+offset);
offset = 0;
totalBytesRead += bytesRead;
countToPrint += 1;
if(countToPrint == 10){
countToPrint = 0;
System.out.println("Read "+((double)totalBytesRead / originalFileSize * 100)+" percent.");
}
}
writer.flush();
After some experimentation, I've found that a value of X larger than a million gives optimal speed - it looks like I'm getting about 2% every 10 minutes, while a value of X of ~60,000 only got 60% in 15 hours. Profiling reveals that I'm spending 96+% of my time in the read() method, so that's definitely my bottleneck. As of writing this, my 8 million X version has finished 32% of the file after 2 hours and 40 minutes, in case you want to know how it performs long-term.
Is there a better approach for dealing with such a large, one-line file? As in, is there a faster way of reading this type of file that gives me a relatively easy way of inserting the newline characters?
I am aware that different languages or programs could probably handle this gracefully, but I'm limiting this to a Java perspective.
You are making this far more complicated than it should be. By just making use of the buffering already provided by the standard classes, you should get a throughput of at least several MB per second without any hassle.
This simple test program processes 1GB in less than 2 minutes on my PC (including creating the test file):
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Random;
public class TestFileProcessing {
public static void main(String[] argv) {
try {
long time = System.currentTimeMillis();
File from = new File("C:\\Test\\Input.txt");
createTestFile(from, StandardCharsets.UTF_8, 1_000_000_000);
System.out.println("Created file in: " + (System.currentTimeMillis() - time) + "ms");
time = System.currentTimeMillis();
File to = new File("C:\\Test\\Output.txt");
doIt(from, to, StandardCharsets.UTF_8);
System.out.println("Converted file in: " + (System.currentTimeMillis() - time) + "ms");
} catch (IOException e) {
throw new RuntimeException(e.getMessage(), e);
}
}
public static void createTestFile(File file, Charset encoding, long size) throws IOException {
Random r = new Random(12345);
try (OutputStream fout = new FileOutputStream(file);
BufferedOutputStream bout = new BufferedOutputStream(fout);
Writer writer = new OutputStreamWriter(bout, encoding)) {
for (long i=0; i<size; ++i) {
int c = r.nextInt(26);
if (c == 0)
writer.write('}');
else
writer.write('a' + c);
}
}
}
public static void doIt(File from, File to, Charset encoding) throws IOException {
try (InputStream fin = new FileInputStream(from);
BufferedInputStream bin = new BufferedInputStream(fin);
Reader reader = new InputStreamReader(bin, encoding);
OutputStream fout = new FileOutputStream(to);
BufferedOutputStream bout = new BufferedOutputStream(fout);
Writer writer = new OutputStreamWriter(bout, encoding)) {
int c;
while ((c = reader.read()) >= 0) {
if (c == '}')
writer.write('\n');
writer.write(c);
}
}
}
}
As you can see, no elaborate logic or excessive buffer sizes are used. What is used is simply buffering of the streams closest to the hardware, the FileInputStream/FileOutputStream.
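If you want to keep a chunk-based loop like the one in the question, the same buffering still does the heavy lifting. Here is a minimal sketch of such a variant (my own addition, not benchmarked; it reuses the imports of the test program above and the same newline-before-'}' rule as doIt):
// Hypothetical chunked variant: block reads instead of per-character reads.
public static void doItChunked(File from, File to, Charset encoding) throws IOException {
    try (Reader reader = new InputStreamReader(
                 new BufferedInputStream(new FileInputStream(from)), encoding);
         Writer writer = new OutputStreamWriter(
                 new BufferedOutputStream(new FileOutputStream(to)), encoding)) {
        char[] in = new char[64 * 1024];
        char[] out = new char[2 * in.length]; // worst case: every char is '}'
        int read;
        while ((read = reader.read(in)) >= 0) {
            int o = 0;
            for (int i = 0; i < read; i++) {
                if (in[i] == '}') {
                    out[o++] = '\n'; // same rule as doIt above
                }
                out[o++] = in[i];
            }
            writer.write(out, 0, o);
        }
    }
}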
How do I convert a java.io.File to a byte[]?
From JDK 7 you can use Files.readAllBytes(Path).
Example:
import java.io.File;
import java.nio.file.Files;
File file;
// ...(file is initialised)...
byte[] fileContent = Files.readAllBytes(file.toPath());
It depends on what "best" means for you. Productivity-wise, don't reinvent the wheel and use Apache Commons IO, which here means FileUtils.readFileToByteArray(File input).
Since JDK 7 - one liner:
byte[] array = Files.readAllBytes(Paths.get("/path/to/file"));
No external dependencies needed.
import java.io.RandomAccessFile;
RandomAccessFile f = new RandomAccessFile(fileName, "r");
byte[] b = new byte[(int)f.length()];
f.readFully(b);
Documentation for Java 8: http://docs.oracle.com/javase/8/docs/api/java/io/RandomAccessFile.html
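Since RandomAccessFile is Closeable, on Java 7+ you can wrap the same three lines in try-with-resources so the file handle is always released; a minimal sketch (fileName as above):
import java.io.IOException;
import java.io.RandomAccessFile;

byte[] b;
try (RandomAccessFile f = new RandomAccessFile(fileName, "r")) {
    b = new byte[(int) f.length()];
    f.readFully(b);
}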
Basically you have to read it in memory. Open the file, allocate the array, and read the contents from the file into the array.
The simplest way is something similar to this:
public byte[] read(File file) throws IOException, FileTooBigException {
if (file.length() > MAX_FILE_SIZE) {
throw new FileTooBigException(file);
}
ByteArrayOutputStream ous = null;
InputStream ios = null;
try {
byte[] buffer = new byte[4096];
ous = new ByteArrayOutputStream();
ios = new FileInputStream(file);
int read = 0;
while ((read = ios.read(buffer)) != -1) {
ous.write(buffer, 0, read);
}
}finally {
try {
if (ous != null)
ous.close();
} catch (IOException e) {
}
try {
if (ios != null)
ios.close();
} catch (IOException e) {
}
}
return ous.toByteArray();
}
This has some unnecessary copying of the file content (actually the data is copied three times: from file to buffer, from buffer to ByteArrayOutputStream, from ByteArrayOutputStream to the actual resulting array).
You also need to make sure you only read into memory files up to a certain size (this is usually application-dependent) :-).
You also need to handle the IOException outside the method.
Another way is this:
public byte[] read(File file) throws IOException, FileTooBigException {
if (file.length() > MAX_FILE_SIZE) {
throw new FileTooBigException(file);
}
byte[] buffer = new byte[(int) file.length()];
InputStream ios = null;
try {
ios = new FileInputStream(file);
// a single read() is not guaranteed to fill the buffer, so loop until it is full
int offset = 0;
int read = 0;
while (offset < buffer.length
        && (read = ios.read(buffer, offset, buffer.length - offset)) != -1) {
    offset += read;
}
if (offset < buffer.length) {
    throw new IOException(
            "EOF reached while trying to read the whole file");
}
} finally {
try {
if (ios != null)
ios.close();
} catch (IOException e) {
}
}
return buffer;
}
This has no unnecessary copying.
FileTooBigException is a custom application exception.
The MAX_FILE_SIZE constant is an application parameter.
For big files you should probably think about a stream-processing algorithm or use memory mapping (see java.nio).
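For example, here is a minimal memory-mapping sketch using java.nio (my own method name; it assumes the file fits into a single mapping, i.e. it is smaller than 2 GB):
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public static byte[] readMapped(Path path) throws IOException {
    try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
        MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes); // copies the mapped region into the array
        return bytes;
    }
}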
As someone said, Apache Commons File Utils might have what you are looking for
public static byte[] readFileToByteArray(File file) throws IOException
Example use (Program.java):
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
public class Program {
public static void main(String[] args) throws IOException {
File file = new File(args[0]); // assume args[0] is the path to file
byte[] data = FileUtils.readFileToByteArray(file);
...
}
}
If you don't have Java 7 (which is when Files.readAllBytes was introduced), and agree with me that including a massive library to avoid writing a few lines of code is a bad idea:
public static byte[] readBytes(InputStream inputStream) throws IOException {
byte[] b = new byte[1024];
ByteArrayOutputStream os = new ByteArrayOutputStream();
int c;
while ((c = inputStream.read(b)) != -1) {
os.write(b, 0, c);
}
return os.toByteArray();
}
Caller is responsible for closing the stream.
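A possible call site, with try-with-resources so the stream is always closed (the file name "data.bin" is just an example):
byte[] content;
try (InputStream in = new FileInputStream("data.bin")) {
    content = readBytes(in);
}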
// Returns the contents of the file in a byte array.
public static byte[] getBytesFromFile(File file) throws IOException {
// Get the size of the file
long length = file.length();
// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
if (length > Integer.MAX_VALUE) {
// File is too large
throw new IOException("File is too large!");
}
// Create the byte array to hold the data
byte[] bytes = new byte[(int)length];
// Read in the bytes
int offset = 0;
int numRead = 0;
InputStream is = new FileInputStream(file);
try {
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
} finally {
is.close();
}
// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
return bytes;
}
You can use the NIO API to do it as well. This code works as long as the total file size (in bytes) fits in an int.
File f = new File("c:\\wscp.script");
FileInputStream fin = null;
FileChannel ch = null;
try {
fin = new FileInputStream(f);
ch = fin.getChannel();
int size = (int) ch.size();
MappedByteBuffer buf = ch.map(MapMode.READ_ONLY, 0, size);
byte[] bytes = new byte[size];
buf.get(bytes);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (fin != null) {
fin.close();
}
if (ch != null) {
ch.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
I think it's very fast since it's using a MappedByteBuffer.
Simple way to do it:
File fff = new File("/path/to/file");
FileInputStream fileInputStream = new FileInputStream(fff);
// file.length() returns a long, so it has to be cast down to an int for the array size
long byteLength = fff.length(); // byte count of the file content
byte[] filecontent = new byte[(int) byteLength];
// note: a single read() call is not guaranteed to fill the whole array
fileInputStream.read(filecontent, 0, (int) byteLength);
fileInputStream.close();
Simplest way of reading bytes from a file:
import java.io.*;
class ReadBytesFromFile {
public static void main(String args[]) throws Exception {
// getBytes from anyWhere
// I'm getting byte array from File
File file = null;
FileInputStream fileStream = new FileInputStream(file = new File("ByteArrayInputStreamClass.java"));
// Instantiate array
byte[] arr = new byte[(int) file.length()];
// read all bytes of the file stream (note: a single read() call may not fill the array)
fileStream.read(arr, 0, arr.length);
for (int X : arr) {
System.out.print((char) X);
}
}
}
Guava has Files.toByteArray() to offer you. It has several advantages:
It covers the corner case where files report a length of 0 but still have content
It's highly optimized: if you try to read in a file that is too big, you get an OutOfMemoryError before it even tries to load the file (through clever use of file.length())
You don't have to reinvent the wheel.
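A minimal usage sketch, assuming Guava is on the classpath (note that this is com.google.common.io.Files, not java.nio.file.Files):
import com.google.common.io.Files;
import java.io.File;
import java.io.IOException;

File file = new File("/path/to/file");
byte[] data = Files.toByteArray(file); // throws IOException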
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
File file = getYourFile();
Path path = file.toPath();
byte[] data = Files.readAllBytes(path);
Using the same approach as the community wiki answer, but cleaner and compiling out of the box (preferred approach if you don't want to import Apache Commons libs, e.g. on Android):
public static byte[] getFileBytes(File file) throws IOException {
ByteArrayOutputStream ous = null;
InputStream ios = null;
try {
byte[] buffer = new byte[4096];
ous = new ByteArrayOutputStream();
ios = new FileInputStream(file);
int read = 0;
while ((read = ios.read(buffer)) != -1)
ous.write(buffer, 0, read);
} finally {
try {
if (ous != null)
ous.close();
} catch (IOException e) {
// swallow, since not that important
}
try {
if (ios != null)
ios.close();
} catch (IOException e) {
// swallow, since not that important
}
}
return ous.toByteArray();
}
This is one of the simplest ways:
import java.nio.file.Files;
import java.nio.file.Paths;

String pathFile = "/path/to/file";
byte[] bytes = Files.readAllBytes(Paths.get(pathFile));
I believe this is the easiest way:
org.apache.commons.io.FileUtils.readFileToByteArray(file);
readFully reads b.length bytes from this file into the byte array, starting at the current file pointer. This method reads repeatedly from the file until the requested number of bytes are read. It blocks until the requested number of bytes are read, the end of the stream is detected, or an exception is thrown.
RandomAccessFile f = new RandomAccessFile(fileName, "r");
byte[] b = new byte[(int)f.length()];
f.readFully(b);
If you want to read bytes into a pre-allocated byte buffer, this answer may help.
Your first guess would probably be to use InputStream read(byte[]). However, this method has a flaw that makes it unreasonably hard to use: there is no guarantee that the array will actually be completely filled, even if no EOF is encountered.
Instead, take a look at DataInputStream readFully(byte[]). It is a wrapper for input streams, and does not have the above-mentioned issue. Additionally, this method throws an EOFException if EOF is encountered before the array has been filled. Much nicer.
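A minimal sketch of that approach (expectedLength and file are placeholders for whatever you already have):
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

byte[] bytes = new byte[expectedLength]; // pre-allocated buffer (placeholder size)
try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
    in.readFully(bytes); // throws EOFException if the stream ends before the array is full
}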
Not only does the following way convert a java.io.File to a byte[], I also found it to be the fastest way to read in a file, when testing many different Java file reading methods against each other:
java.nio.file.Files.readAllBytes()
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
public class ReadFile_Files_ReadAllBytes {
public static void main(String [] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt";
File file = new File(fileName);
byte [] fileBytes = Files.readAllBytes(file.toPath());
char singleChar;
for(byte b : fileBytes) {
singleChar = (char) b;
System.out.print(singleChar);
}
}
}
// The file that you want to convert into a byte[]
File file = new File("/storage/0CE2-EA3D/DCIM/Camera/VID_20190822_205931.mp4");
FileInputStream fileInputStream = new FileInputStream(file);
byte[] data = new byte[(int) file.length()];
BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream);
// note: a single read() call is not guaranteed to fill the whole array
bufferedInputStream.read(data, 0, data.length);
bufferedInputStream.close();
// Now the bytes of the file are contained in "byte[] data"
Let me add another solution without using third-party libraries. It reuses an exception-handling pattern that was proposed by Scott (link). I moved the ugly part into a separate method (I would hide it in some FileUtils class ;) )
public void someMethod() {
final byte[] buffer = read(new File("test.txt"));
}
private byte[] read(final File file) {
if (file.isDirectory())
throw new RuntimeException("Unsupported operation, file "
+ file.getAbsolutePath() + " is a directory");
if (file.length() > Integer.MAX_VALUE)
throw new RuntimeException("Unsupported operation, file "
+ file.getAbsolutePath() + " is too big");
Throwable pending = null;
FileInputStream in = null;
final byte buffer[] = new byte[(int) file.length()];
try {
in = new FileInputStream(file);
// caveat: a single read() call is not guaranteed to fill the whole buffer
in.read(buffer);
} catch (Exception e) {
pending = new RuntimeException("Exception occured on reading file "
+ file.getAbsolutePath(), e);
} finally {
if (in != null) {
try {
in.close();
} catch (Exception e) {
if (pending == null) {
pending = new RuntimeException(
"Exception occured on closing file"
+ file.getAbsolutePath(), e);
}
}
}
if (pending != null) {
throw new RuntimeException(pending);
}
}
return buffer;
}
public static byte[] readBytes(InputStream inputStream) throws IOException {
byte[] buffer = new byte[32 * 1024];
int bufferSize = 0;
for (;;) {
int read = inputStream.read(buffer, bufferSize, buffer.length - bufferSize);
if (read == -1) {
return Arrays.copyOf(buffer, bufferSize);
}
bufferSize += read;
if (bufferSize == buffer.length) {
buffer = Arrays.copyOf(buffer, bufferSize * 2);
}
}
}
Another way of reading bytes from a file (note that this goes through a String, so it is only suitable for text files in the default charset):
Reader reader = null;
try {
reader = new FileReader(file);
char buf[] = new char[8192];
int len;
StringBuilder s = new StringBuilder();
while ((len = reader.read(buf)) >= 0) {
s.append(buf, 0, len);
}
// convert once, after the whole file has been read
byte[] byteArray = s.toString().getBytes();
} catch(FileNotFoundException ex) {
} catch(IOException e) {
}
finally {
if (reader != null) {
reader.close();
}
}
Try this:
import sun.misc.IOUtils;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
try {
String path="";
InputStream inputStream=new FileInputStream(path);
byte[] data=IOUtils.readFully(inputStream,-1,false);
}
catch (IOException e) {
System.out.println(e);
}
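Note that sun.misc.IOUtils is an internal JDK class, so it may be missing or inaccessible on newer JDKs. On Java 9+ you can get the same result with the public InputStream.readAllBytes(); a minimal sketch:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

byte[] data;
try (InputStream inputStream = new FileInputStream("/path/to/file")) {
    data = inputStream.readAllBytes(); // Java 9+
}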
It can be done as simply as this (Kotlin version):
val byteArray = File(path).inputStream().readBytes()
EDIT:
I've read the docs of the readBytes method. It says:
Reads this stream completely into a byte array.
Note: It is the caller's responsibility to close this stream.
So to be able to close the stream, while keeping everything clean, use the following code:
val byteArray = File(path).inputStream().use { it.readBytes() }
Thanks to #user2768856 for pointing this out.
Try this if your target API level is lower than 26:
private static byte[] readFileToBytes(String filePath) {
File file = new File(filePath);
byte[] bytes = new byte[(int) file.length()];
// if you can use Java 7, prefer Files.readAllBytes(path)
try (FileInputStream fis = new FileInputStream(file)) {
// note: a single read() call is not guaranteed to fill the whole array
fis.read(bytes);
return bytes;
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
In JDK 8 (note that this reads the file as text and re-joins the lines with \n, so it is only suitable for text files):
Stream<String> lines = Files.lines(path);
String data = lines.collect(Collectors.joining("\n"));
lines.close();
So, I made a program that splits a .mp3 file in Java. Basically, it works fine on some files but on some, the first split file encounters an error after playing some part. The other files work completely fine though.
I think it has something to do with the fact that the file size cannot be an exact multiple of the size of my array, so there should be some remainder left over. Can anybody please identify the error in this code and correct it?
(here, splitval = no. of splits to be made, filename1= the selected file)
int splitsize=filesize/splitval;
String filecalled;
try
{
byte []b=new byte[splitsize];
FileInputStream fis = new FileInputStream(filename1);
name1=filename2.replaceAll(".mp3", "");
for(int j=1;j<=splitval;j++)
{
filecalled=name1+"_split_"+j+".mp3";
FileOutputStream fos = new FileOutputStream(filecalled);
int i=fis.read(b);
fos.write(b, 0, i);
//System.out.println("no catch");
}
JOptionPane.showMessageDialog(this, "split process successful");
}
catch(IOException e)
{
System.out.println(e.getMessage());
}
Thanks in advance!
EDIT:
I edited the code as suggested, ran it. Here:
C:\Users\dell5050\Desktop\Julien.mp3 5383930 bytes
C:\Users\dell5050\Desktop\Julien_split_1.mp3 1345984 bytes
C:\Users\dell5050\Desktop\Julien_split_2.mp3 1345984 bytes
C:\Users\dell5050\Desktop\Julien_split_3.mp3 1345984 bytes
C:\Users\dell5050\Desktop\Julien_split_4.mp3 1345978 bytes
There is a change in the last few bytes, which means the filesize % splitval issue is solved, but the first file (the one containing '_split_1') still has an error while playing some of the last part.
The second file, containing '_split_2', starts exactly where the first ended, so the split process is correct. Then what exactly is the extra empty space at the end of the first file?
Also, I noticed that the artwork and info of the original file carry over into the first file ONLY, not into the other files. Does it have something to do with that? The same thing doesn't happen with some other mp3 files.
CODE:
FileInputStream fis;
FileOutputStream fos;
int splitsize = (int)(filesize / splitval) + (int)(filesize % splitval);
byte[] b = new byte[splitsize];
System.out.println(filename1 + " " + filesize + " bytes");
try
{
fis = new FileInputStream(file);
name1 = filename2.replaceAll(".mp3", "");
for (int j = 1; j <= splitval; j++)
{
String filecalled = name1 + "_split_" + j + ".mp3";
fos = new FileOutputStream(filecalled);
int i = fis.read(b);
fos.write(b, 0, i);
fos.close();
System.out.println(filecalled + " " + i + " bytes");
}
}
catch(IOException ie)
{
System.out.println(ie.getMessage());
}
I doubt you can split an mp3 file just by copying n bytes into a file and moving on to the next one. MP3 has a specific format and you'll probably need a library to handle it.
EDIT regarding the size of the part files being all equal:
You are not writing all the bytes of the file to the split files. If you sum the sizes of all the split files and compare the total to the size of the original file, you'll find out that you're missing some bytes. This is because your loop runs from 1 to splitval and always writes exactly splitsize bytes to each part file. So the number of bytes you are missing is filesize % splitval.
To resolve this problem, simply add filesize % splitval to splitsize. This way you won't be missing any bytes: the files from 1 to splitval - 1 will have the same size and the last file will be smaller. For example, with the 5383930-byte file above and splitval = 4, splitsize becomes 5383930 / 4 + 5383930 % 4 = 1345982 + 2 = 1345984, so the first three parts are 1345984 bytes each and the last read only returns the remaining 1345978 bytes, which matches the output above.
Here is a corrected version of your code with some additions to merge the split files in order to perform an assertion using SHA1-checksum.
Disclaimer - The output files are not expected to be proper mp3 files
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import junit.framework.Assert;
import org.junit.Test;
public class SplitFile {
@Test
public void splitFile() throws IOException, NoSuchAlgorithmException {
String filename1 = "mp3/Innocence_-_Nero.mp3";
File file = new File(filename1);
FileInputStream fis = null;
FileOutputStream fos = null;
long filesize = file.length();
long filesizeActual = 0L;
int splitval = 5;
int splitsize = (int)(filesize / splitval) + (int)(filesize % splitval);
byte[] b = new byte[splitsize];
System.out.println(filename1 + " " + filesize + " bytes");
try {
fis = new FileInputStream(file);
String name1 = filename1.replaceAll(".mp3", "");
String mergeFile = name1 + "_merge.mp3";
for (int j = 1; j <= splitval; j++) {
String filecalled = name1 + "_split_" + j + ".mp3";
fos = new FileOutputStream(filecalled);
int i = fis.read(b);
fos.write(b, 0, i);
fos.close();
fos = null;
System.out.println(filecalled + " " + i + " bytes");
filesizeActual += i;
}
Assert.assertEquals(filesize, filesizeActual);
mergeFileParts(filename1, splitval);
check(filename1, mergeFile);
} finally {
if(fis != null) {
fis.close();
}
if(fos != null) {
fos.close();
}
}
}
private void mergeFileParts(String filename1, int splitval) throws IOException {
FileInputStream fis = null;
FileOutputStream fos = null;
try {
String name1 = filename1.replaceAll(".mp3", "");
String mergeFile = name1 + "_merge.mp3";
fos = new FileOutputStream(mergeFile);
for (int j = 1; j <= splitval; j++) {
String filecalled = name1 + "_split_" + j + ".mp3";
File partFile = new File(filecalled);
fis = new FileInputStream(partFile);
int partFilesize = (int) partFile.length();
byte[] b = new byte[partFilesize];
int i = fis.read(b, 0, partFilesize);
fos.write(b, 0, i);
fis.close();
fis = null;
}
} finally {
if(fis != null) {
fis.close();
}
if(fos != null) {
fos.close();
}
}
}
private void check(String expectedPath, String actualPath) throws IOException, NoSuchAlgorithmException {
System.out.println("check...");
FileInputStream fis = null;
try {
File expectedFile = new File(expectedPath);
long expectedSize = expectedFile.length();
File actualFile = new File(actualPath);
long actualSize = actualFile.length();
System.out.println("exp=" + expectedSize);
System.out.println("act=" + actualSize);
Assert.assertEquals(expectedSize, actualSize);
fis = new FileInputStream(expectedFile);
String expected = makeMessageDigest(fis);
fis.close();
fis = null;
fis = new FileInputStream(actualFile);
String actual = makeMessageDigest(fis);
fis.close();
fis = null;
System.out.println("exp=" + expected);
System.out.println("act=" + actual);
Assert.assertEquals(expected, actual);
} finally {
if(fis != null) {
fis.close();
}
}
}
public String makeMessageDigest(InputStream is) throws NoSuchAlgorithmException, IOException {
byte[] data = new byte[1024];
MessageDigest md = MessageDigest.getInstance("SHA1");
int bytesRead = 0;
while(-1 != (bytesRead = is.read(data, 0, 1024))) {
md.update(data, 0, bytesRead);
}
return toHexString(md.digest());
}
private String toHexString(byte[] digest) {
StringBuffer sha1HexString = new StringBuffer();
for(int i = 0; i < digest.length; i++) {
sha1HexString.append(String.format("%1$02x", Byte.valueOf(digest[i])));
}
return sha1HexString.toString();
}
}
Output (for my test file)
mp3/Innocence_-_Nero.mp3 5048528 bytes
mp3/Innocence_-_Nero_split_1.mp3 1009708 bytes
mp3/Innocence_-_Nero_split_2.mp3 1009708 bytes
mp3/Innocence_-_Nero_split_3.mp3 1009708 bytes
mp3/Innocence_-_Nero_split_4.mp3 1009708 bytes
mp3/Innocence_-_Nero_split_5.mp3 1009696 bytes
check...
exp=5048528
act=5048528
exp=e81cf2dc65ab84e3df328e52d63a55301232b917
act=e81cf2dc65ab84e3df328e52d63a55301232b917
I need to reassemble a 100-part zip file and extract the content. I tried simply concatenating the zip volumes together in an input stream but that does not work. Any suggestions would be appreciated.
Thanks.
Here is the code you can start from. It extracts a single file entry from the multivolume zip archive:
package org.test.zip;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
public class Main {
public static void main(String[] args) throws IOException {
ZipInputStream is = new ZipInputStream(new SequenceInputStream(Collections.enumeration(
Arrays.asList(new FileInputStream("test.zip.001"), new FileInputStream("test.zip.002"), new FileInputStream("test.zip.003")))));
try {
for(ZipEntry entry = null; (entry = is.getNextEntry()) != null; ) {
OutputStream os = new BufferedOutputStream(new FileOutputStream(entry.getName()));
try {
final int bufferSize = 1024;
byte[] buffer = new byte[bufferSize];
for(int readBytes = -1; (readBytes = is.read(buffer, 0, bufferSize)) > -1; ) {
os.write(buffer, 0, readBytes);
}
os.flush();
} finally {
os.close();
}
}
} finally {
is.close();
}
}
}
Just a note to make it more dynamic -- 100% based on mijer's code below.
private void CombineFiles (String[] files) throws FileNotFoundException, IOException {
Vector<FileInputStream> v = new Vector<FileInputStream>(files.length);
for (int x = 0; x < files.length; x++)
v.add(new FileInputStream(inputDirectory + files[x]));
Enumeration<FileInputStream> e = v.elements();
SequenceInputStream sequenceInputStream = new SequenceInputStream(e);
ZipInputStream is = new ZipInputStream(sequenceInputStream);
try {
for (ZipEntry entry = null; (entry = is.getNextEntry()) != null;) {
OutputStream os = new BufferedOutputStream(new FileOutputStream(entry.getName()));
try {
final int bufferSize = 1024;
byte[] buffer = new byte[bufferSize];
for (int readBytes = -1; (readBytes = is.read(buffer, 0, bufferSize)) > -1;) {
os.write(buffer, 0, readBytes);
}
os.flush();
} finally {
os.close();
}
}
} finally {
is.close();
}
}
Just concatenating the segment data did not work for me. In this case the segments had been created with the Linux command-line zip (InfoZip version 3.0):
> zip -s 5m data.zip -r data/
Segment files named data.z01, data.z02, ..., data.zip were created.
The first segment data.z01 contained the spanning signature 0x08074b50, as described in the Zip File Format Specification by PKWARE. The presence of these 4 bytes made Java's ZipInputStream ignore all entries in the archive. The central directory in the last segment also contained extra segment information compared to a non-split archive, but that did not cause ZipInputStream any problems.
All I had to do was skip the spanning signature. The following code will extract entries both from an archive that has been segmented with zip -s and from a zip file that has been split by the Linux split command, like this: split -d -b 5M data.zip data.zip.. The code is based on szhem's.
public class ZipCat {
private final static byte[] SPANNING_SIGNATURE = {0x50, 0x4b, 0x07, 0x08};
public static void main(String[] args) throws IOException {
List<InputStream> asList = new ArrayList<>();
byte[] buf4 = new byte[4];
PushbackInputStream pis = new PushbackInputStream(new FileInputStream(args[0]), buf4.length);
asList.add(pis);
if (pis.read(buf4) != buf4.length) {
throw new IOException(args[0] + " is too small for a zip file/segment");
}
if (!Arrays.equals(buf4, SPANNING_SIGNATURE)) {
pis.unread(buf4, 0, buf4.length);
}
for (int i = 1; i < args.length; i++) {
asList.add(new FileInputStream(args[i]));
}
try (ZipInputStream is = new ZipInputStream(new SequenceInputStream(Collections.enumeration(asList)))) {
for (ZipEntry entry = null; (entry = is.getNextEntry()) != null;) {
if (entry.isDirectory()) {
new File(entry.getName()).mkdirs();
} else {
try (OutputStream os = new BufferedOutputStream(new FileOutputStream(entry.getName()))) {
byte[] buffer = new byte[1024];
int count = -1;
while ((count = is.read(buffer)) != -1) {
os.write(buffer, 0, count);
}
}
}
}
}
}
}
After compiling and running, it shows "no pdf printer available". How do I solve this?
I have created a file at c:\print.pdf (using PHP TCPDF), and I am trying to read that file into a byte array so that I can print it silently, without showing any print popup.
I can't make it work; can anyone please show how to read a file into a byte array, to do the following:
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.print.Doc;
import javax.print.DocFlavor;
import javax.print.DocPrintJob;
import javax.print.PrintException;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.print.SimpleDoc;
public class print
{
private static Object pdfBytes;
// Byte array reader
public static byte[] getBytesFromFile(File file) throws IOException {
InputStream is = new FileInputStream(file);
long length = file.length();
if (length > Integer.MAX_VALUE) {}
byte[] bytes = new byte[(int)length];
int offset = 0;
int numRead = 0;
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
is.close();
return bytes;
}
// Convert Byte array to Object
public static Object toObject(byte[] bytes)
{
Object obj = null;
try {
ByteArrayInputStream bis = new ByteArrayInputStream(bytes);
ObjectInputStream ois = new ObjectInputStream (bis);
obj = ois.readObject();
} catch (IOException ex) {
} catch (ClassNotFoundException ex) {
}
return obj;
}
private static File fl = new File("c:\\print.pdf");
public static void main(String argc[])
{
DocFlavor flavor = DocFlavor.BYTE_ARRAY.PDF;
PrintService[] services =
PrintServiceLookup.lookupPrintServices(flavor,
null);
//Object pdfBytes = null;
try {
byte[] abc = getBytesFromFile(fl);
pdfBytes =toObject(abc);
} catch (IOException ex) {
Logger.getLogger(print.class.getName()).log(Level.SEVERE, null, ex);
}
if (services.length>0)
{
DocPrintJob printJob = services[0].createPrintJob();
Doc document = new SimpleDoc(pdfBytes,flavor,null);
try {
printJob.print(document, null);
} catch (PrintException ex) {
Logger.getLogger(print.class.getName()).log(Level.SEVERE, null, ex);
}
} else {
System.out.println("no pdf printer available");
}
}
}
I tried this and it solves my silent printing: https://gist.github.com/1094612
Here's an example on how to read a file into a byte[]:
// Returns the contents of the file in a byte array.
public static byte[] getBytesFromFile(File file) throws IOException {
InputStream is = new FileInputStream(file);
// Get the size of the file
long length = file.length();
// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
if (length > Integer.MAX_VALUE) {
// File is too large
}
// Create the byte array to hold the data
byte[] bytes = new byte[(int)length];
// Read in the bytes
int offset = 0;
int numRead = 0;
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
// Close the input stream and return bytes
is.close();
return bytes;
}
import java.io.*;
import java.util.*;
import com.lowagie.text.*;
import com.lowagie.text.pdf.*;
public class ReadPDFFile {
public static void main(String[] args) throws IOException {
try {
Document document = new Document();
document.open();
PdfReader reader = new PdfReader("file.pdf");
PdfDictionary dictionary = reader.getPageN(1);
PRIndirectReference reference = (PRIndirectReference) dictionary
.get(PdfName.CONTENTS);
PRStream stream = (PRStream) PdfReader.getPdfObject(reference);
byte[] bytes = PdfReader.getStreamBytes(stream);
PRTokeniser tokenizer = new PRTokeniser(bytes);
StringBuffer buffer = new StringBuffer();
while (tokenizer.nextToken()) {
if (tokenizer.getTokenType() == PRTokeniser.TK_STRING) {
buffer.append(tokenizer.getStringValue());
}
}
String test = buffer.toString();
System.out.println(test);
} catch (Exception e) {
}
}
}
After compile and run it shows "no pdf printer available"
From my reading of the documentation here and here, the problem is that you don't have a print service provider configured that understands how to print documents with that DocFlavor.
One solution is to find a JAR file that implements the print service provider interface (SPI) for PDF documents and add it to your classpath. A Google search will show examples. (I can't recommend any particular provider because I've never had to use one. You'll need to do some investigation / testing to find one that works for you.)
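As a starting point for that investigation, you can ask Java which flavors your installed printers actually support; a small diagnostic sketch using only the standard javax.print API:
import javax.print.DocFlavor;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;

public class ListPrinterFlavors {
    public static void main(String[] args) {
        // List every print service and whether it accepts PDF byte arrays.
        for (PrintService service : PrintServiceLookup.lookupPrintServices(null, null)) {
            System.out.println(service.getName());
            System.out.println("  supports PDF byte arrays: "
                    + service.isDocFlavorSupported(DocFlavor.BYTE_ARRAY.PDF));
        }
    }
}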