Combining methods to TAR + LZ4 in a single run in Java

I have two methods: one TARs a set of files, and the other compresses the resulting archive with LZ4. They both work fine, but I'm wondering whether combining them would be more efficient or save some time. I'm also not sure how I would combine them. Any suggestions would be useful. As you can see in my code below, I want fine-grained access to the data so I can show the user an accurate % complete.
public static boolean createTarFile(List<Path> paths, Path output, int machineId)
{
boolean success = false;
boolean failure = false;
try (OutputStream fOut = Files.newOutputStream(output, StandardOpenOption.APPEND);
BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
TarArchiveOutputStream tOut = new TarArchiveOutputStream(buffOut))
{
float index = 1;
for (Path path : paths)
{
TarArchiveEntry tarEntry = new TarArchiveEntry(
path.toFile(),
path.getFileName().toString());
tarEntry.setSize(path.toFile().length());
tOut.putArchiveEntry(tarEntry);
// copy file to TarArchiveOutputStream
Files.copy(path, tOut);
tOut.closeArchiveEntry();
tarPercentComplete = (index / (float) paths.size()) * 100;
index++;
if (abort)
{
break;
}
}
tOut.finish();
}
catch (Exception e)
{
LOG.error("Tarring file failed ", e);
failure = true;
}
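// success was never assigned in the posted snippet; deriving it from the failure flag is an assumption
success = !failure;
return success;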
}
/**
* Zip a tar file using LZ4
*
* @param fileToZip
* @param outputFileName
* @return
*/
public boolean zipFile(File fileToZip, File outputFileName)
{
boolean success = false;
boolean failure = false;
try (FileOutputStream fos = new FileOutputStream(outputFileName);
LZ4FrameOutputStream lz4fos = new LZ4FrameOutputStream(fos);)
{
try (FileInputStream fis = new FileInputStream(fileToZip))
{
byte[] buf = new byte[bufferSizeZip];
int length;
long count = 0;
while ((length = fis.read(buf)) > 0)
{
lz4fos.write(buf, 0, length);
if (count % 50 == 0)
{
zipPercentComplete = ((bufferSizeZip * count) / (float) fileToZip.length()) * 100;
}
count++;
if (abort)
{
break;
}
}
}
}
catch (Exception e)
{
LOG.error("Zipping file failed ", e);
failure = true;
}
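// success was never assigned in the posted snippet; deriving it from the failure flag is an assumption
success = !failure;
return success;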
}

Just chain them.
try (OutputStream fOut = Files.newOutputStream(output, StandardOpenOption.APPEND);
BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
LZ4FrameOutputStream lz4fos = new LZ4FrameOutputStream(buffOut);
TarArchiveOutputStream tOut = new TarArchiveOutputStream(lz4fos)) {
No need to name them all.
try (TarArchiveOutputStream tOut = new TarArchiveOutputStream(
new LZ4FrameOutputStream(
new BufferedOutputStream(
Files.newOutputStream(output, StandardOpenOption.APPEND))))) {
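Because the question also asks for a percent-complete figure, here is one way progress could be tracked once the streams are chained. This is only a sketch under stated assumptions: CountingOutputStream and ChainedProgressDemo are hypothetical names, and the JDK's GZIPOutputStream stands in for LZ4FrameOutputStream so that the example runs without extra libraries. The counter sits in front of the compressor, so it measures uncompressed bytes, which can be compared with the total input size.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: counts every byte that passes through to the wrapped stream.
class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    long getCount() {
        return count;
    }
}

class ChainedProgressDemo {
    // Plays the role of tarPercentComplete / zipPercentComplete from the question.
    static volatile float percentComplete;

    // Compresses 'input' into 'sink' in one pass and returns the uncompressed byte count.
    static long compress(byte[] input, OutputStream sink) throws IOException {
        // GZIPOutputStream is a stand-in (assumption) for LZ4FrameOutputStream.
        try (CountingOutputStream counter =
                 new CountingOutputStream(new GZIPOutputStream(sink))) {
            final int chunk = 8192;
            for (int off = 0; off < input.length; off += chunk) {
                int len = Math.min(chunk, input.length - off);
                counter.write(input, off, len);
                // Progress is based on uncompressed bytes handed to the compressor.
                percentComplete = (counter.getCount() * 100f) / input.length;
            }
            return counter.getCount();
        }
    }
}
```

In the TAR + LZ4 chain, a counter like this would presumably go between the TarArchiveOutputStream and the LZ4FrameOutputStream. The tar format adds headers and padding, so the count would run slightly higher than the sum of the file sizes; treat the percentage as approximate.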

Combining tar + lz4 in a single pass will definitely be more efficient,
because it avoids moving data to and from storage
for content which is inherently temporary.
In a shell (command line), one would do something like this:
tar cvf - DIR | lz4 > DIR.tar.lz4
which uses stdout / stdin as the intermediate interface (instead of storage file I/O).
However, as you are not using a shell in your code,
prefer the suggestions from @Andreas for a Java example.
The main idea is the same, but the implementation is definitely different.

Related

How do I concatenate sequential files in order with Java?

I have a directory that contains sequentially numbered log files and some Excel spreadsheets used for analysis. The log files are ALWAYS sequentially numbered beginning at zero, but the number of them can vary. I am trying to concatenate the log files, in the order they were created, into a single text file.
For instance, the log files foo0.log, foo1.log and foo2.log would be written to concatenatedfoo.log by appending foo1 after foo0, and foo2 after foo1.
I need to count all the files in the given directory with the extension *.log, and use the count to drive a for-loop that also generates the file names for concatenation. I'm having a hard time finding a way to count the files using a filter. None of the Java Tutorials on file operations seem to fit the situation, but I'm sure I'm missing something. Does this approach make sense? Or is there an easier way?
int numDocs = [number of *.log docs in directory];
//
for (int i = 0; i <= numberOfFiles; i++) {
fileNumber = Integer.toString(i);
try
{
FileInputStream inputStream = new FileInputStream("\\\\Path\\to\\file\\foo" + fileNumber + ".log");
BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));
try
{
BufferedWriter metadataOutputData = new BufferedWriter(new FileWriter("\\\\Path\\to\\file\\fooconcat.log").append());
metadataOutputData.close();
}
//
catch (IOException e) // catch IO exception writing final output
{
System.err.println("Exception: ");
System.out.println("Exception: "+ e.getMessage().getClass().getName());
e.printStackTrace();
}
catch (Exception e) // catch IO exception reading input file
{
System.err.println("Exception: ");
System.out.println("Exception: "+ e.getMessage().getClass().getName());
e.printStackTrace();
}
}
How about:
public static void main(String[] args){
final int BUFFERSIZE = 1024 << 8;
File baseDir = new File("C:\\path\\logs\\");
// Get the simple names of the files ("foo.log" not "/path/logs/foo.log")
String[] fileNames = baseDir.list(new FilenameFilter() {
@Override
public boolean accept(File dir, String name) {
return name.endsWith(".log");
}
});
// Sort the names
Arrays.sort(fileNames);
// Create the output file
File output = new File(baseDir.getAbsolutePath() + File.separatorChar + "MERGED.log");
try{
BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(output), BUFFERSIZE);
byte[] bytes = new byte[BUFFERSIZE];
int bytesRead;
final byte[] newLine = "\n".getBytes(); // use to separate contents
for(String s : fileNames){
// get the full path to read from
String fullName = baseDir.getAbsolutePath() + File.separatorChar + s;
BufferedInputStream in = new BufferedInputStream(new FileInputStream(fullName),BUFFERSIZE);
while((bytesRead = in.read(bytes,0,bytes.length)) != -1){
out.write(bytes, 0, bytesRead);
}
// close input file and ignore any issue with closing it
try{in.close();}catch(IOException e){}
out.write(newLine); // separator between files
}
out.close();
}catch(Exception e){
throw new RuntimeException(e);
}
}
This code DOES assume that the "sequential naming" is zero-padded so that the names sort correctly in lexicographic order, i.e. the files would be
0001.log (or blah0001.log, or 0001blah.log etc)
0002.log
....
0010.log
and not
1.log
2.log
...
10.log
The latter pattern will not sort correctly with the code I have given.
Here's some code for you.
File dir = new File("C:/My Documents/logs");
File outputFile = new File("C:/My Documents/concatenated.log");
Find the ".log" files:
File[] files = dir.listFiles(new FilenameFilter() {
@Override
public boolean accept(File parent, String name) {
// 'parent' is the directory being listed, so test the entry itself
return name.endsWith(".log") && new File(parent, name).isFile();
}
});
Sort them into the appropriate order:
Arrays.sort(files, new Comparator<File>() {
@Override
public int compare(File file1, File file2) {
return numberOf(file1).compareTo(numberOf(file2));
}
private Integer numberOf(File file) {
return Integer.parseInt(file.getName().replaceAll("[^0-9]", ""));
}
});
Concatenate them:
byte[] buffer = new byte[8192];
OutputStream out = new BufferedOutputStream(new FileOutputStream(outputFile));
try {
for (File file : files) {
InputStream in = new FileInputStream(file);
try {
int charCount;
while ((charCount = in.read(buffer)) >= 0) {
out.write(buffer, 0, charCount);
}
} finally {
in.close();
}
}
} finally {
out.flush();
out.close();
}
By having the log folder as a File object, you can code like this
for (File logFile : logFolder.listFiles()){
if (logFile.getAbsolutePath().endsWith(".log")){
numDocs++;
}
}
to find the number of log files.
I would:
- open the output file once. Just use a PrintWriter.
- in a loop:
  - create a File for each possible file;
  - if it doesn't exist, break the loop;
  - use a BufferedReader to read the lines of the file with readLine();
  - write each line to the output file.
You should be able to do this with about 12 lines of code. I would pass the IOExceptions to the caller.
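The steps above could be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the class name LogConcatenator, the method name and its parameters are made up for the example, and it assumes the files are named prefix + number + ".log" starting at zero, as in the question.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class LogConcatenator {
    // Appends prefix0.log, prefix1.log, ... to 'output' until a number is missing.
    // Returns how many files were concatenated. IOExceptions go to the caller.
    static int concatenate(File dir, String prefix, File output) throws IOException {
        int count = 0;
        // Open the output file once for the whole run.
        try (PrintWriter out = new PrintWriter(new FileWriter(output))) {
            for (int i = 0; ; i++) {
                File in = new File(dir, prefix + i + ".log");
                if (!in.exists()) {
                    break; // first gap in the numbering ends the loop
                }
                try (BufferedReader reader = new BufferedReader(new FileReader(in))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        out.println(line);
                    }
                }
                count++;
            }
            // PrintWriter swallows write errors, so surface them explicitly.
            if (out.checkError()) {
                throw new IOException("Failed writing to " + output);
            }
        }
        return count;
    }
}
```

Because the loop counts upwards and stops at the first missing number, no sorting or filtering is needed, and files such as foo10.log come after foo9.log naturally. Note that println writes the platform line separator, so line endings in the output may differ from the inputs.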
You can use SequenceInputStream to concatenate FileInputStreams.
To list all log files, File.listFiles(FileFilter) can be used.
It gives you an unsorted array of files. To sort the files into the right order, use Arrays.sort.
Code example:
static File[] logs(String dir) {
File root = new File(dir);
return root.listFiles(new FileFilter() {
@Override
public boolean accept(File pathname) {
return pathname.isFile() && pathname.getName().endsWith(".log");
}
});
}
static String cat(final File[] files) throws IOException {
Enumeration<InputStream> e = new Enumeration<InputStream>() {
int index;
@Override
public boolean hasMoreElements() {
return index < files.length;
}
@Override
public InputStream nextElement() {
index++;
try {
return new FileInputStream(files[index - 1]);
} catch (FileNotFoundException ex) {
throw new RuntimeException("File not available!", ex);
}
}
};
SequenceInputStream input = new SequenceInputStream(e);
StringBuilder sb = new StringBuilder();
int c;
while ((c = input.read()) != -1) {
sb.append((char) c);
}
return sb.toString();
}
public static void main(String[] args) throws IOException {
String dir = "<path-to-dir-with-logs>";
File[] logs = logs(dir);
for (File f : logs) {
System.out.println(f.getAbsolutePath());
}
System.out.println();
System.out.println(cat(logs));
}

Corrupt file from File.renameTo()

This is a followup to this question: here, involving iText. I create a new Pdf with a different rotation angle, then delete the old one and rename the new one to the name of the old one. I've determined that my problem actually happens (only when rotation == null, wtf) with the call to
outFile.renameTo(inFile)
Oddly enough renameTo() returns true, but the file is not equal to the original, outFile will no longer open in Adobe Reader on Windows. I tried analyzing the corrupted Pdf file in a desktop Pdf repair program, and the results I get are:
The end-of-file marker was not found.
The ‘startxref’ keyword or the xref position was not found.
The end-of-file marker was not found.
If I leave out the calls to delete() and renameTo() I am left with two files, neither of which is corrupt. I have also tried copying the file contents with a byte[], with the same results. I have tried outFile.renameTo(new File(inFile.toString())) since inFile is actually a subclass of File, with the same results. I have tried new FileDescriptor().sync(), with the same results. I have tried adding this broadcast in between every file operation, with the same results:
PdfRotateService.appContext.sendBroadcast(new Intent(Intent.ACTION_MEDIA_MOUNTED, Uri
.parse("file://")));
I have tried sleeping the Thread, with the same results. I have verified the paths are correct. No exceptions are thrown and delete() and renameTo() return true. I have also tried keeping a reference to the FileOutputStream and manually closing it in the finally block.
I am beginning to think there is a bug in the Android OS or something (but maybe I am overlooking something simple), please help! I want a rotated Pdf with the same filename as the original.
static boolean rotatePdf(LocalFile inFile, int angle)
{
PdfReader reader = null;
PdfStamper stamper = null;
LocalFile outFile = getGoodFile(inFile, ROTATE_SUFFIX);
boolean worked = true;
try
{
reader = new PdfReader(inFile.toString());
stamper = new PdfStamper(reader, new FileOutputStream(outFile));
int i = FIRST_PAGE;
int l = reader.getNumberOfPages();
for (; i <= l; ++i)
{
int desiredRot = angle;
PdfDictionary pageDict = reader.getPageN(i);
PdfNumber rotation = pageDict.getAsNumber(PdfName.ROTATE);
if (rotation != null)
{
desiredRot += rotation.intValue();
desiredRot %= 360;
}
// else
// worked = false;
pageDict.put(PdfName.ROTATE, new PdfNumber(desiredRot));
}
} catch (IOException e)
{
worked = false;
Log.w("Rotate", "Caught IOException in rotate");
e.printStackTrace();
} catch (DocumentException e)
{
worked = false;
Log.w("Rotate", "Caught DocumentException in rotate");
e.printStackTrace();
} finally
{
boolean z = closeQuietly(stamper);
boolean y = closeQuietly(reader);
if (!(y && z))
worked = false;
}
if (worked)
{
if (!inFile.delete())
worked = false;
if (!outFile.renameTo(inFile))
worked = false;
}
else
{
outFile.delete();
}
return worked;
}
static boolean closeQuietly(Object resource)
{
try
{
if (resource != null)
{
if (resource instanceof PdfReader)
((PdfReader) resource).close();
else if (resource instanceof PdfStamper)
((PdfStamper) resource).close();
else
((Closeable) resource).close();
return true;
}
} catch (Exception ex)
{
Log.w("Exception during Resource.close()", ex);
}
return false;
}
public static LocalFile getGoodFile(LocalFile inFile, String suffix)
{
@SuppressWarnings("unused")
String outString = inFile.getParent() + DIRECTORY_SEPARATOR +
removeExtension(inFile.getName()) + suffix + getExtension(inFile.getName());
LocalFile outFile = new LocalFile(inFile.getParent() + DIRECTORY_SEPARATOR +
removeExtension(inFile.getName()) + suffix + getExtension(inFile.getName()));
int n = 1;
while (outFile.isFile())
{
outFile = new LocalFile(inFile.getParent() + DIRECTORY_SEPARATOR +
removeExtension(inFile.getName()) + suffix + n + getExtension(inFile.getName()));
++n;
}
return outFile;
}

A better way to convert a directory of files into bytes

I have been messing with this for some time and it's getting better and better, but it's still a little slow for me. Can anyone help speed this up / make the design better, please?
Also, the file names must consist only of numbers and must end with the file extension ".dat".
I never added the checks because I didn't feel it was necessary.
public void preloadModels() {
try {
File directory = new File(signlink.findcachedir() + "raw", File.separator);
File[] modelFiles = directory.listFiles();
for (int modelIndex = modelFiles.length - 1;; modelIndex--) {
String modelFileName = modelFiles[modelIndex].getName();
byte[] buffer = getBytesFromInputStream(new FileInputStream(new File(directory, modelFileName)));
Model.method460(buffer, Integer.parseInt(modelFileName.replace(".dat", "")));
}
} catch (Throwable e) {
return;
}
}
public static final byte[] getBytesFromInputStream(InputStream inputStream) throws IOException {
byte[] buffer = new byte[32 * 1024];
int bufferSize = 0;
for (;;) {
int read = inputStream.read(buffer, bufferSize, buffer.length - bufferSize);
if (read == -1) {
return Arrays.copyOf(buffer, bufferSize);
}
bufferSize += read;
if (bufferSize == buffer.length) {
buffer = Arrays.copyOf(buffer, bufferSize * 2);
}
}
}
I would do the following.
public void preloadModels() throws IOException {
File directory = new File(signlink.findcachedir() + "raw");
for (File file : directory.listFiles()) {
if (!file.getName().endsWith(".dat")) continue;
byte[] buffer = getBytesFromFile(file);
Model.method460(buffer, Integer.parseInt(file.getName().replace(".dat", "")));
}
}
public static byte[] getBytesFromFile(File file) throws IOException {
byte[] buffer = new byte[(int) file.length()];
try (DataInputStream dis = new DataInputStream(new FileInputStream(file))) {
dis.readFully(buffer);
return buffer;
}
}
If this is still too slow, the limitation is most likely the speed of the hard drive.
How about using the Apache Commons IOUtils class?
IOUtils.toByteArray(InputStream input)
I think the easiest way is to add the whole directory's content to an archive. Have a look at java.util.zip. It has some bugs with file names before Java 7. There is also an Apache Commons implementation.
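As a rough illustration of the archive idea, the sketch below reads every ".dat" entry of a zip file into a byte array with java.util.zip. The class name ModelArchive, the method name and the returned map are hypothetical; it assumes the entries sit at the top level of the archive and are named like "123.dat", matching the numeric file names in the question.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class ModelArchive {
    // Reads all "<number>.dat" entries from the archive, keyed by that number.
    static Map<Integer, byte[]> readModels(File zipFile) throws IOException {
        Map<Integer, byte[]> models = new TreeMap<>();
        try (ZipInputStream zin = new ZipInputStream(
                new BufferedInputStream(new FileInputStream(zipFile)))) {
            byte[] chunk = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.endsWith(".dat")) {
                    continue; // skip anything that is not a model file
                }
                // Read until the end of the current entry.
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int n;
                while ((n = zin.read(chunk)) != -1) {
                    data.write(chunk, 0, n);
                }
                // Assumes flat, purely numeric names such as "123.dat".
                int id = Integer.parseInt(name.substring(0, name.length() - ".dat".length()));
                models.put(id, data.toByteArray());
            }
        }
        return models;
    }
}
```

Each resulting array could then be handed to Model.method460(bytes, id) as in the question. Whether a single archive is actually faster than many small files depends on the storage and the data, so it would be worth measuring.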

Android : How to read file in bytes?

I am trying to get a file's content as bytes in an Android application. I have found the file on the SD card and now want to read the selected file into bytes. I googled but had no success. Please help.
Below is the code that lists files by extension. I use it to get the files and show them in a spinner. On file selection I want to get the file in bytes.
private List<String> getListOfFiles(String path) {
File files = new File(path);
FileFilter filter = new FileFilter() {
private final List<String> exts = Arrays.asList("jpeg", "jpg", "png", "bmp", "gif","mp3");
public boolean accept(File pathname) {
String ext;
String path = pathname.getPath();
ext = path.substring(path.lastIndexOf(".") + 1);
return exts.contains(ext);
}
};
final File [] filesFound = files.listFiles(filter);
List<String> list = new ArrayList<String>();
if (filesFound != null && filesFound.length > 0) {
for (File file : filesFound) {
list.add(file.getName());
}
}
return list;
}
Here is a simple way:
File file = new File(path);
int size = (int) file.length();
byte[] bytes = new byte[size];
try {
BufferedInputStream buf = new BufferedInputStream(new FileInputStream(file));
buf.read(bytes, 0, bytes.length);
buf.close();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Add permission in manifest.xml:
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
The easiest solution today is to use Apache Commons IO:
http://commons.apache.org/proper/commons-io/javadocs/api-release/org/apache/commons/io/FileUtils.html#readFileToByteArray(java.io.File)
byte bytes[] = FileUtils.readFileToByteArray(photoFile);
The only drawback is that you have to add this dependency to your app's build.gradle:
implementation 'commons-io:commons-io:2.5'
which adds 1562 to the method count.
Since the accepted BufferedInputStream#read isn't guaranteed to read everything, rather than keeping track of the buffer sizes myself, I used this approach:
byte bytes[] = new byte[(int) file.length()];
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
DataInputStream dis = new DataInputStream(bis);
dis.readFully(bytes);
Blocks until a full read is complete, and doesn't require extra imports.
Here is a solution that guarantees entire file will be read, that requires no libraries and is efficient:
byte[] fullyReadFileToBytes(File f) throws IOException {
int size = (int) f.length();
byte[] bytes = new byte[size];
try (FileInputStream fis = new FileInputStream(f)) {
int offset = 0;
// read() may return fewer bytes than requested, so loop until the array is full
while (offset < size) {
int read = fis.read(bytes, offset, size - offset);
if (read == -1) {
// the file ended before the expected number of bytes arrived
throw new EOFException("File ended after " + offset + " of " + size + " bytes");
}
offset += read;
}
}
return bytes;
}
NOTE: it assumes the file size is less than Integer.MAX_VALUE bytes; you can add handling for that if you want.
If you want to use the openFileInput method from a Context for this, you can use the following code.
This will create a ByteArrayOutputStream and append each byte to it as it's read from the file.
/**
* <p>
* Creates a InputStream for a file using the specified Context
* and returns the Bytes read from the file.
* </p>
*
* @param context The context to use.
* @param file The file to read from.
* @return The array of bytes read from the file, or null if no file was found.
*/
public static byte[] read(Context context, String file) throws IOException {
byte[] ret = null;
if (context != null) {
try {
InputStream inputStream = context.openFileInput(file);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
int nextByte = inputStream.read();
while (nextByte != -1) {
outputStream.write(nextByte);
nextByte = inputStream.read();
}
ret = outputStream.toByteArray();
} catch (FileNotFoundException ignored) { }
}
return ret;
}
In Kotlin you can simply use:
File(path).readBytes()
You can also do it this way:
byte[] getBytes (File file)
{
FileInputStream input = null;
if (file.exists()) try
{
input = new FileInputStream (file);
int len = (int) file.length();
byte[] data = new byte[len];
int count, total = 0;
while ((count = input.read (data, total, len - total)) > 0) total += count;
return data;
}
catch (Exception ex)
{
ex.printStackTrace();
}
finally
{
if (input != null) try
{
input.close();
}
catch (Exception ex)
{
ex.printStackTrace();
}
}
return null;
}
A simple InputStream will do
byte[] fileToBytes(File file){
byte[] bytes = new byte[0];
try(FileInputStream inputStream = new FileInputStream(file)) {
bytes = new byte[inputStream.available()];
//noinspection ResultOfMethodCallIgnored
inputStream.read(bytes);
} catch (IOException e) {
e.printStackTrace();
}
return bytes;
}
The following is a working solution that reads the file line by line with the Scanner class, which is efficient for large text files.
try {
FileInputStream fiStream = new FileInputStream(inputFile_name);
Scanner sc = null;
try {
sc = new Scanner(fiStream);
while (sc.hasNextLine()) {
String line = sc.nextLine();
byte[] buf = line.getBytes();
}
} finally {
if (fiStream != null) {
fiStream.close();
}
if (sc != null) {
sc.close();
}
}
}catch (Exception e){
Log.e(TAG, "Exception: " + e.toString());
}
Reading a file as bytes is often used for binary files such as pictures, sounds, etc.
Use the method below.
public static byte[] readFileByBytes(File file) {
byte[] tempBuf = new byte[100];
int byteRead;
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
try {
BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(file));
while ((byteRead = bufferedInputStream.read(tempBuf)) != -1) {
byteArrayOutputStream.write(tempBuf, 0, byteRead);
}
bufferedInputStream.close();
return byteArrayOutputStream.toByteArray();
} catch (Exception e) {
e.printStackTrace();
return null;
}
}

Trouble with java snippets

public class GenericWorldLoader implements WorldLoader {
@Override
public LoginResult checkLogin(PlayerDetails pd) {
Player player = null;
int code = 2;
File f = new File("data/savedGames/" + NameUtils.formatNameForProtocol(pd.getName()) + ".dat.gz");
if(f.exists()) {
try {
InputStream is = new GZIPInputStream(new FileInputStream(f));
String name = Streams.readRS2String(is);
String pass = Streams.readRS2String(is);
if(!name.equals(NameUtils.formatName(pd.getName()))) {
code = 3;
}
if(!pass.equals(pd.getPassword())) {
code = 3;
}
} catch(IOException ex) {
code = 11;
}
}
if(code == 2) {
player = new Player(pd);
}
return new LoginResult(code, player);
}
@Override
public boolean savePlayer(Player player) {
try {
OutputStream os = new GZIPOutputStream(new FileOutputStream("data/savedGames/" + NameUtils.formatNameForProtocol(player.getName()) + ".dat.gz"));
IoBuffer buf = IoBuffer.allocate(1024);
buf.setAutoExpand(true);
player.serialize(buf);
buf.flip();
byte[] data = new byte[buf.limit()];
buf.get(data);
os.write(data);
os.flush();
os.close();
return true;
} catch(IOException ex) {
return false;
}
}
@Override
public boolean loadPlayer(Player player) {
try {
File f = new File("data/savedGames/" + NameUtils.formatNameForProtocol(player.getName()) + ".dat.gz");
InputStream is = new GZIPInputStream(new FileInputStream(f));
IoBuffer buf = IoBuffer.allocate(1024);
buf.setAutoExpand(true);
while(true) {
byte[] temp = new byte[1024];
int read = is.read(temp, 0, temp.length);
if(read == -1) {
break;
} else {
buf.put(temp, 0, read);
}
}
buf.flip();
player.deserialize(buf);
return true;
} catch(IOException ex) {
return false;
}
}
}
Yeah so... My problem is that this seems to save 'something' in a really complex and hard-to-read (binary) format, and I'd rather have it as a .txt file in an easily readable format. How do I convert it?
EDIT: I'm not using Apache Mina, so what should I replace
IoBuffer buf = IoBuffer.allocate(1024);
buf.setAutoExpand(true);
with?
checkLogin() obviously checks whether the specified login has matching data present and whether the password is correct.
savePlayer() method saves the player.
loadPlayer() loads it again.
The data format used is gzip, and the content is written as a stream of serialized data. If you want to make it more readable, you might want to override the toString() method of the Player class (or just use it as is, if it is already good enough) and write player.toString() into a new text file using e.g. a BufferedWriter wrapped around a FileWriter:
String playerName = NameUtils.formatNameForProtocol(player.getName());
BufferedWriter writer = new BufferedWriter(new FileWriter(playerName + ".txt"));
writer.write(player.toString());
writer.close();
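On the EDIT about replacing IoBuffer: one possible stand-in from the standard library is a ByteArrayOutputStream, which also grows automatically, combined with java.nio.ByteBuffer for reading the collected bytes back. The sketch below only shows the buffering part under that assumption. GzipFileBuffer and its methods are hypothetical names, and Player.serialize() / deserialize() would have to be adapted to these types, since their real signatures are not shown in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipFileBuffer {
    // Reads a whole gzip file into memory; the stream grows as needed,
    // playing the role of IoBuffer.allocate(1024) + setAutoExpand(true).
    static ByteBuffer load(File file) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream(1024);
        try (InputStream is = new GZIPInputStream(new FileInputStream(file))) {
            byte[] temp = new byte[1024];
            int read;
            while ((read = is.read(temp, 0, temp.length)) != -1) {
                collected.write(temp, 0, read);
            }
        }
        // wrap() leaves the buffer ready for reading, so no flip() is needed.
        return ByteBuffer.wrap(collected.toByteArray());
    }

    // Writes already-serialized bytes to a gzip file.
    static void save(File file, byte[] data) throws IOException {
        try (OutputStream os = new GZIPOutputStream(new FileOutputStream(file))) {
            os.write(data);
        }
    }
}
```

This keeps the file format unchanged. It only removes the Mina dependency from the buffering step, and it does not make the saved data any more readable by itself.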
