Modifying a text file in a ZIP archive in Java - java

My use case requires me to open a text file, say abc.txt, inside a zip archive. The file contains key-value pairs in the form
key1=value1
key2=value2
and so on, with each key-value pair on its own line.
I have to change the value of one particular key and put the text file back into a new copy of the archive. How do I do this in Java?
My attempt so far:
ZipFile zipFile = new ZipFile("test.zip");
final ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("out.zip"));
for(Enumeration e = zipFile.entries(); e.hasMoreElements(); ) {
ZipEntry entryIn = (ZipEntry) e.nextElement();
if(!entryIn.getName().equalsIgnoreCase("abc.txt")){
zos.putNextEntry(entryIn);
InputStream is = zipFile.getInputStream(entryIn);
byte [] buf = new byte[1024];
int len;
while((len = (is.read(buf))) > 0) {
zos.write(buf, 0, len);
}
}
else{
// I'm not sure what to do here
// Tried a few things and the file gets corrupt
}
zos.closeEntry();
}
zos.close();

Java 7 introduced a much simpler way to manipulate zip archives: the FileSystems API, which lets you access the contents of a zip file as a file system.
Besides offering a much more straightforward API, it lets you modify the archive in place, without writing code to copy the other (irrelevant) files in the archive (as the accepted answer does).
Here is sample code that solves the OP's use case:
import java.io.*;
import java.nio.file.*;
public static void main(String[] args) throws IOException {
modifyTextFileInZip("test.zip");
}
static void modifyTextFileInZip(String zipPath) throws IOException {
Path zipFilePath = Paths.get(zipPath);
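// the cast selects the (Path, ClassLoader) overload; a bare null is ambiguous on Java 13+
try (FileSystem fs = FileSystems.newFileSystem(zipFilePath, (ClassLoader) null)) {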
Path source = fs.getPath("/abc.txt");
Path temp = fs.getPath("/___abc___.txt");
if (Files.exists(temp)) {
throw new IOException("temp file exists, generate another name");
}
Files.move(source, temp);
streamCopy(temp, source);
Files.delete(temp);
}
}
static void streamCopy(Path src, Path dst) throws IOException {
try (BufferedReader br = new BufferedReader(
new InputStreamReader(Files.newInputStream(src)));
BufferedWriter bw = new BufferedWriter(
new OutputStreamWriter(Files.newOutputStream(dst)))) {
String line;
while ((line = br.readLine()) != null) {
line = line.replace("key1=value1", "key1=value2");
bw.write(line);
bw.newLine();
}
}
}
For more zip archive manipulation examples, see the demo/nio/zipfs/Demo.java sample, which is part of the JDK 8 Demos and Samples download.
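If the text file is small enough to hold in memory, the same zip file system can also be used without the temporary-file shuffle: read all lines, change the one key, and write the file back. The following is only a sketch under that assumption, and it also assumes UTF-8 content; the names `ZipTextEdit` and `setValue` are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ZipTextEdit {
    // Hypothetical helper: sets `key` to `newValue` in a key=value text file inside a zip.
    static void setValue(Path zip, String entry, String key, String newValue) throws IOException {
        // The cast picks the (Path, ClassLoader) overload, which exists since Java 7.
        try (FileSystem fs = FileSystems.newFileSystem(zip, (ClassLoader) null)) {
            Path file = fs.getPath(entry);
            List<String> out = new ArrayList<>();
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                // Match on the key only, so the old value does not need to be known.
                out.add(line.startsWith(key + "=") ? key + "=" + newValue : line);
            }
            // Default options are CREATE, TRUNCATE_EXISTING and WRITE: the entry is replaced.
            Files.write(file, out, StandardCharsets.UTF_8);
        }
    }
}
```

Since this changes the archive itself, a new copy as asked for in the question could be made first with Files.copy and then edited with, for example, ZipTextEdit.setValue(copy, "/abc.txt", "key1", "value2").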

You almost had it right. One possible reason the file was reported as corrupt is that you might have used
zos.putNextEntry(entryIn)
in the else branch as well. This creates a new entry in the output zip file that carries over the metadata of the existing entry, including the entry name (file name), its sizes and its CRC, among other things.
When you then write the modified text and close the entry, an error is thrown because the CRC recorded in the entry and the CRC of the data you actually wrote differ.
You may also get an error if the length of the replacement text differs from the original, i.e. if you are trying to replace
key1=value1
with
key1=val1
This boils down to the same problem: the number of bytes you write differs from the size recorded in the entry.
ZipFile zipFile = new ZipFile("test.zip");
final ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("out.zip"));
for(Enumeration e = zipFile.entries(); e.hasMoreElements(); ) {
ZipEntry entryIn = (ZipEntry) e.nextElement();
if (!entryIn.getName().equalsIgnoreCase("abc.txt")) {
zos.putNextEntry(entryIn);
InputStream is = zipFile.getInputStream(entryIn);
byte[] buf = new byte[1024];
int len;
while((len = is.read(buf)) > 0) {
zos.write(buf, 0, len);
}
}
else{
zos.putNextEntry(new ZipEntry("abc.txt"));
InputStream is = zipFile.getInputStream(entryIn);
byte[] buf = new byte[1024];
int len;
while ((len = (is.read(buf))) > 0) {
String s = new String(buf);
if (s.contains("key1=value1")) {
buf = s.replaceAll("key1=value1", "key1=val2").getBytes();
}
zos.write(buf, 0, (len < buf.length) ? len : buf.length);
}
}
zos.closeEntry();
}
zos.close();
The following expression ensures that no IndexOutOfBoundsException occurs even when the replaced data is shorter than the original:
(len < buf.length) ? len : buf.length

Just a small improvement to this part:
else{
zos.putNextEntry(new ZipEntry("abc.txt"));
InputStream is = zipFile.getInputStream(entryIn);
byte[] buf = new byte[1024];
int len;
while ((len = (is.read(buf))) > 0) {
String s = new String(buf);
if (s.contains("key1=value1")) {
buf = s.replaceAll("key1=value1", "key1=val2").getBytes();
}
zos.write(buf, 0, (len < buf.length) ? len : buf.length);
}
}
That should be:
else{
zos.putNextEntry(new ZipEntry("abc.txt"));
InputStream is = zipFile.getInputStream(entryIn);
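long size = entryIn.getSize();
if (size < 0 || size > Integer.MAX_VALUE) {
throw new IllegalStateException("...");
}
byte[] bytes = new byte[(int)size];
// a single read() may return fewer bytes than requested, so loop until the array is full
int off = 0;
while (off < bytes.length) {
int n = is.read(bytes, off, bytes.length - off);
if (n < 0) break;
off += n;
}
zos.write(new String(bytes).replaceAll("key1=value1", "key1=val2").getBytes());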
}
This captures all the occurrences.
The reason is that, with the first version, "key1" could arrive at the end of one read and "=value1" at the start of the next, so the occurrence you want to change would be missed.
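Putting the points above together, a complete copy loop might look like the following sketch. It assumes that every entry fits in memory and that the text file is UTF-8; `ZipRewrite` and `copyReplacing` are hypothetical names. Each entry is written under a fresh ZipEntry, so no size or CRC from the source archive is carried over (and, as a side effect, timestamps and other metadata are not preserved).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class ZipRewrite {
    // Hypothetical helper: copies `in` to `out`, applying a literal replacement to one entry.
    static void copyReplacing(Path in, Path out, String name, String from, String to) throws IOException {
        try (ZipFile zipFile = new ZipFile(in.toFile());
             ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(out))) {
            for (Enumeration<? extends ZipEntry> e = zipFile.entries(); e.hasMoreElements(); ) {
                ZipEntry entryIn = e.nextElement();
                byte[] data;
                try (InputStream is = zipFile.getInputStream(entryIn)) {
                    data = readFully(is);
                }
                if (entryIn.getName().equals(name)) {
                    // The whole file is in memory, so a match cannot straddle two reads.
                    String text = new String(data, StandardCharsets.UTF_8);
                    data = text.replace(from, to).getBytes(StandardCharsets.UTF_8);
                }
                // A fresh ZipEntry carries no stale size or CRC from the source archive.
                zos.putNextEntry(new ZipEntry(entryIn.getName()));
                zos.write(data);
                zos.closeEntry();
            }
        }
    }

    // Reads the stream to its end; read() returns -1 once the entry is exhausted.
    private static byte[] readFully(InputStream is) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int len;
        while ((len = is.read(buf)) != -1) {
            bos.write(buf, 0, len);
        }
        return bos.toByteArray();
    }
}
```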

Related

Extracting SFX 7-Zip

I want to extract two specific files from a .zip file. I tried the following library:
ZipFile zipFile = new ZipFile("myZip.zip");
Result:
Exception in thread "main" java.util.zip.ZipException: error in opening zip file
I also tried:
public void extract(String targetFileName) throws IOException
{
OutputStream outputStream = new FileOutputStream("targetFile.foo");
FileInputStream fileInputStream = new FileInputStream("myZip.zip");
ZipInputStream zipInputStream = new ZipInputStream(new BufferedInputStream(fileInputStream));
ZipEntry zipEntry;
while ((zipEntry = zipInputStream.getNextEntry()) != null)
{
if (zipEntry.getName().equals("targetFile.foo"))
{
byte[] buffer = new byte[8192];
int length;
while ((length = zipInputStream.read(buffer)) != -1)
{
outputStream.write(buffer, 0, length);
}
outputStream.close();
break;
}
}
}
Result:
No exception, but an empty targetFile.foo file.
Note that the .zip file is an SFX 7-Zip archive that originally had the .exe extension, so that may be the reason for the failure.
As noted in the comments, extracting an SFX 7-Zip file is not supported by java.util.zip. You can, however, do it with the Commons Compress and XZ libraries together with a quick "hack":
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;
import org.apache.commons.io.FileUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
...
protected void un7zSFXFile(File file, String password) throws IOException
{
File tempFile = new File("/tmp/" + file.getName() + ".temp");
tempFile.getParentFile().mkdirs();
try
{
// copy everything behind the SFX stub into a temporary plain .7z file
try (FileInputStream in = new FileInputStream(file);
FileOutputStream temp = new FileOutputStream(tempFile))
{
/**
* Yes, this is voodoo code:
* the first 205824 bytes are skipped, as these are basically the 7z-sfx-runnable.dll.
* commons-compress fails if this part is not cut away.
* ATTENTION: the number of bytes may vary depending on the 7z version used!
*/
in.skip(205824);
// end of voodoo code
byte[] buffer = new byte[1024];
int length;
while((length = in.read(buffer)) > 0)
{
temp.write(buffer, 0, length);
}
}
LOGGER.info("prepared exefile for un7zing");
// try-with-resources closes the archive even if extraction fails
try (SevenZFile sevenZFile = (password != null)
? new SevenZFile(tempFile, password.toCharArray())
: new SevenZFile(tempFile))
{
SevenZArchiveEntry entry;
while((entry = sevenZFile.getNextEntry()) != null)
{
if(entry.isDirectory())
{
continue;
}
File curfile = new File(file.getParentFile(), entry.getName());
File parent = curfile.getParentFile();
if(!parent.exists())
{
parent.mkdirs();
}
try (FileOutputStream out = new FileOutputStream(curfile))
{
// read() may return fewer bytes than requested, so copy in a loop
byte[] content = new byte[8192];
int n;
while((n = sevenZFile.read(content, 0, content.length)) > 0)
{
out.write(content, 0, n);
}
}
}
}
}
finally
{
// always remove the temporary copy
tempFile.delete();
}
}
Please note that this simple solution only unpacks into the directory in which the SFX file is located!
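Because the length of the SFX stub varies, one alternative to the hard-coded offset is to search the file for the six signature bytes that start a 7z archive ('7', 'z', 0xBC, 0xAF, 0x27, 0x1C). The sketch below does only that. It assumes that the first occurrence of the signature marks the start of the embedded archive, which may not hold for every SFX module; `SevenZOffset` and `findSignature` are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

class SevenZOffset {
    // The six bytes that start a 7z archive: '7' 'z' BC AF 27 1C.
    private static final int[] SIGNATURE = {0x37, 0x7A, 0xBC, 0xAF, 0x27, 0x1C};

    // Returns the offset of the first signature in the stream, or -1 if there is none.
    static long findSignature(InputStream raw) throws IOException {
        InputStream in = new BufferedInputStream(raw);
        long pos = 0;     // number of bytes consumed so far
        int matched = 0;  // number of signature bytes matched at the current position
        int b;
        while ((b = in.read()) != -1) {
            pos++;
            if (b == SIGNATURE[matched]) {
                if (++matched == SIGNATURE.length) {
                    return pos - SIGNATURE.length;
                }
            } else {
                // The first signature byte occurs nowhere else in the signature,
                // so after a mismatch only a new start at this byte is possible.
                matched = (b == SIGNATURE[0]) ? 1 : 0;
            }
        }
        return -1;
    }
}
```

The returned value could then replace the constant 205824 in the skip call, after checking that it is not -1.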

java.util.zip.ZipOutputStream - Zipping large files faster?

I am wondering how I could speed up the zipping of 40+ image files in my Android app.
Clients send images, which need to be compressed or placed into a folder before being uploaded to the server. I currently use the method below, but zipping takes about 20-30 seconds, during which the phone appears to be frozen and users tend to exit the app :(
The method I use for zipping:
private static final int BUFFER_SIZE = 2048;
public void zip(String[] files, String zipFile) throws IOException {
File zipDirectory = new File(Environment.getExternalStorageDirectory()
+ "/laborator/");
if (!zipDirectory.exists()) {
zipDirectory.mkdirs();
} else {
System.out.println("folder already exists!");
}
BufferedInputStream origin = null;
ZipOutputStream out = new ZipOutputStream(new BufferedOutputStream(
new FileOutputStream(Environment.getExternalStorageDirectory()
+ "/laborator/" + zipFile)));
try {
byte data[] = new byte[BUFFER_SIZE];
for (int i = 0; i < files.length; i++) {
FileInputStream fi = new FileInputStream(files[i]);
origin = new BufferedInputStream(fi, BUFFER_SIZE);
try {
ZipEntry entry = new ZipEntry(files[i].substring(files[i]
.lastIndexOf("/") + 1));
out.putNextEntry(entry);
int count;
while ((count = origin.read(data, 0, BUFFER_SIZE)) != -1) {
out.write(data, 0, count);
}
} finally {
origin.close();
}
}
} finally {
out.close();
System.out.println("ziping done");
sendZip();
}
}
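Since your images are JPEGs, chances are high that you won't get any decent compression within the ZIP file. So you could try putting the images into the ZIP file uncompressed, which should be considerably faster without noticeably increasing the size of the ZIP file. Be aware that ZipOutputStream requires the size and CRC-32 of a STORED entry to be set before putNextEntry is called, which this shortened snippet leaves out: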
ZipEntry entry = new ZipEntry(files[i].subs...
entry.setMethod(ZipEntry.STORED);
out.putNextEntry(entry);
Alternatively, you could use
out.setLevel(Deflater.NO_COMPRESSION);
This way there is no need to change the ZipEntry.
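For the STORED variant, the entry has to declare its size and CRC-32 up front, otherwise putNextEntry throws a ZipException. A minimal sketch, assuming the image bytes are already in memory (for large files the CRC would have to be computed in a separate pass over the file); `StoredZip` and `addStored` are hypothetical names:

```java
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class StoredZip {
    // Hypothetical helper: adds `data` under `name` without compressing it.
    static void addStored(ZipOutputStream out, String name, byte[] data) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(data);
        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        // STORED entries need their size and CRC-32 set before putNextEntry.
        entry.setSize(data.length);
        entry.setCompressedSize(data.length);
        entry.setCrc(crc.getValue());
        out.putNextEntry(entry);
        out.write(data);
        out.closeEntry();
    }
}
```

With out.setLevel(Deflater.NO_COMPRESSION) none of this bookkeeping is needed, which is why that variant is the smaller change.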

java zip to binary format and then decompress

I have a task to:
read a zip file from local disk into a binary message
transfer the binary message through EMS as a String (done by the Java API)
receive the transferred binary message as a String (done by the Java API)
decompress the binary message and then print it out
The problem I am facing is a DataFormatException while decompressing the message.
I have no idea which part went wrong.
I use this to read file into binary message:
static String readFile_Stream(String fileName) throws IOException {
File file = new File(fileName);
byte[] fileData = new byte[(int) file.length()];
FileInputStream in = new FileInputStream(file);
in.read(fileData);
String content = "";
System.out.print("Sent message: ");
for(byte b : fileData)
{
System.out.print(getBits(b));
content += getBits(b);
}
in.close();
return content;
}
static String getBits(byte b)
{
String result = "";
for(int i = 0; i < 8; i++)
result = ((b & (1 << i)) == 0 ? "0" : "1") + result;
return result;
}
I use this to decompress message:
private static byte[] toByteArray(String input)
{
byte[] byteArray = new byte[input.length()/8];
for (int i=0;i<input.length()/8;i++)
{
String read_data = input.substring(i*8, i*8+8);
short a = Short.parseShort(read_data, 2);
byteArray[i] = (byte) a;
}
return byteArray;
}
public static byte[] unzipByteArray(byte[] file) throws IOException {
byte[] byReturn = null;
Inflater oInflate = new Inflater(false);
oInflate.setInput(file);
ByteArrayOutputStream oZipStream = new ByteArrayOutputStream();
try {
while (! oInflate.finished() ){
byte[] byRead = new byte[4 * 1024];
int iBytesRead = oInflate.inflate(byRead);
if (iBytesRead == byRead.length){
oZipStream.write(byRead);
}
else {
oZipStream.write(byRead, 0, iBytesRead);
}
}
byReturn = oZipStream.toByteArray();
}
catch (DataFormatException ex){
throw new IOException("Attempting to unzip file that is not zipped.");
}
finally {
oZipStream.close();
}
return byReturn;
}
The message I got is
java.io.IOException: Attempting to unzip file that is not zipped.
at com.sourcefreak.example.test.TibcoEMSQueueReceiver.unzipByteArray(TibcoEMSQueueReceiver.java:144)
at com.sourcefreak.example.test.TibcoEMSQueueReceiver.main(TibcoEMSQueueReceiver.java:54)
I have checked that the binary message is not corrupted during transmission.
Please help me figure out the problem.
Have you tried using InflaterInputStream? Based on my experience, using Inflater directly is rather tricky. You can use this to get started:
public static byte[] unzipByteArray(byte[] file) throws IOException {
InflaterInputStream iis = new InflaterInputStream(new ByteArrayInputStream(file));
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[512];
int length = 0;
// read() returns -1 at the end of the stream
while ((length = iis.read(buffer, 0, buffer.length)) != -1) {
baos.write(buffer, 0, length);
}
iis.close();
baos.close();
return baos.toByteArray();
}
I finally figured out the problem.
The original file is a .zip file, so I should use ZipInputStream to unzip it before further processing.
public static byte[] unzipByteArray(byte[] file) throws IOException {
// create a buffer to improve copy performance later.
byte[] buffer = new byte[2048];
byte[] content ;
// open the zip file stream
InputStream theFile = new ByteArrayInputStream(file);
ZipInputStream stream = new ZipInputStream(theFile);
ByteArrayOutputStream output = new ByteArrayOutputStream();
try
{
ZipEntry entry;
while((entry = stream.getNextEntry())!=null)
{
//String s = String.format("Entry: %s len %d added %TD", entry.getName(), entry.getSize(), new Date(entry.getTime()));
//System.out.println(s);
// Once we get the entry from the stream, the stream is
// positioned ready to read the raw data, and we keep
// reading until read returns 0 or less.
//String outpath = outdir + "/" + entry.getName();
try
{
//output = new FileOutputStream(outpath);
int len = 0;
while ((len = stream.read(buffer)) > 0)
{
output.write(buffer, 0, len);
}
}
finally
{
// we must always close the output file
if(output!=null) output.close();
}
}
}
finally
{
// we must always close the zip file.
stream.close();
}
content = output.toByteArray();
return content;
}
This code works for a zip file containing a single file; with several entries, their contents would be concatenated into one array.
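The distinction that caused the DataFormatException is that a .zip file is a container with its own headers, not a bare deflate stream, so Inflater cannot read it directly. The sketch below checks for the local file header signature "PK\003\004" that a non-empty zip archive starts with, and unpacks every entry into a map instead of a single array. The names `ZipBytes`, `looksLikeZip` and `unzipAll` are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class ZipBytes {
    // True if the data starts with the local file header signature "PK\003\004".
    static boolean looksLikeZip(byte[] data) {
        return data.length >= 4
                && data[0] == 0x50 && data[1] == 0x4B && data[2] == 0x03 && data[3] == 0x04;
    }

    // Unpacks all file entries; keys are entry names in archive order.
    static Map<String, byte[]> unzipAll(byte[] zip) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            byte[] buffer = new byte[2048];
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // One output buffer per entry keeps the contents separate.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int len;
                while ((len = zis.read(buffer)) != -1) {
                    out.write(buffer, 0, len);
                }
                result.put(entry.getName(), out.toByteArray());
            }
        }
        return result;
    }
}
```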

A better way to convert a directory of files into bytes

I have been messing with this for some time and it's getting better and better, but it's still a little slow for me. Can anyone help speed this up / make the design better, please?
Also, the file names must consist only of digits and must end with the file extension ".dat".
I never added the checks because I didn't feel it was necessary.
public void preloadModels() {
try {
File directory = new File(signlink.findcachedir() + "raw", File.separator);
File[] modelFiles = directory.listFiles();
for (int modelIndex = modelFiles.length - 1;; modelIndex--) {
String modelFileName = modelFiles[modelIndex].getName();
byte[] buffer = getBytesFromInputStream(new FileInputStream(new File(directory, modelFileName)));
Model.method460(buffer, Integer.parseInt(modelFileName.replace(".dat", "")));
}
} catch (Throwable e) {
return;
}
}
public static final byte[] getBytesFromInputStream(InputStream inputStream) throws IOException {
byte[] buffer = new byte[32 * 1024];
int bufferSize = 0;
for (;;) {
int read = inputStream.read(buffer, bufferSize, buffer.length - bufferSize);
if (read == -1) {
return Arrays.copyOf(buffer, bufferSize);
}
bufferSize += read;
if (bufferSize == buffer.length) {
buffer = Arrays.copyOf(buffer, bufferSize * 2);
}
}
}
I would do the following.
public void preloadModels() throws IOException {
File directory = new File(signlink.findcachedir() + "raw");
for (File file : directory.listFiles()) {
if (!file.getName().endsWith(".dat")) continue;
byte[] buffer = getBytesFromFile(file);
Model.method460(buffer, Integer.parseInt(file.getName().replace(".dat", "")));
}
}
public static byte[] getBytesFromFile(File file) throws IOException {
byte[] buffer = new byte[(int) file.length()];
try (DataInputStream dis = new DataInputStream(new FileInputStream(file))) {
dis.readFully(buffer);
return buffer;
}
}
If this is still too slow, the limiting factor is most likely the speed of the hard drive.
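On Java 7 and later, the same idea can also be written with java.nio.file, which provides a glob filter for the ".dat" extension and Files.readAllBytes for reading a whole file. This is a sketch rather than a drop-in replacement: it returns the bytes in a map instead of calling Model.method460, and `ModelFiles` and `readAll` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

class ModelFiles {
    // Hypothetical helper: reads every "<number>.dat" file in `dir`, keyed by that number.
    static Map<Integer, byte[]> readAll(Path dir) throws IOException {
        Map<Integer, byte[]> models = new TreeMap<>();
        // The glob keeps only names ending in ".dat".
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.dat")) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                String id = name.substring(0, name.length() - ".dat".length());
                // Skip names that are not 1 to 9 digits, so parseInt cannot fail.
                if (!id.matches("\\d{1,9}")) {
                    continue;
                }
                models.put(Integer.parseInt(id), Files.readAllBytes(file));
            }
        }
        return models;
    }
}
```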
How about using the Apache Commons IOUtils class?
IOUtils.toByteArray(InputStream input)
I think the easiest way is to add the entire contents of the directory to an archive. Have a look at java.util.zip. It had some bugs with file names before Java 7. There is also an Apache Commons implementation.

How do you uncompress a split volume zip in Java?

I need to reassemble a 100-part zip file and extract the content. I tried simply concatenating the zip volumes together in an input stream but that does not work. Any suggestions would be appreciated.
Thanks.
Here is the code you can start from. It extracts a single file entry from the multivolume zip archive:
package org.test.zip;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
public class Main {
public static void main(String[] args) throws IOException {
ZipInputStream is = new ZipInputStream(new SequenceInputStream(Collections.enumeration(
Arrays.asList(new FileInputStream("test.zip.001"), new FileInputStream("test.zip.002"), new FileInputStream("test.zip.003")))));
try {
for(ZipEntry entry = null; (entry = is.getNextEntry()) != null; ) {
OutputStream os = new BufferedOutputStream(new FileOutputStream(entry.getName()));
try {
final int bufferSize = 1024;
byte[] buffer = new byte[bufferSize];
for(int readBytes = -1; (readBytes = is.read(buffer, 0, bufferSize)) > -1; ) {
os.write(buffer, 0, readBytes);
}
os.flush();
} finally {
os.close();
}
}
} finally {
is.close();
}
}
}
Just a note to make it more dynamic -- 100% based on mijer code below.
private void CombineFiles (String[] files) throws FileNotFoundException, IOException {
Vector<FileInputStream> v = new Vector<FileInputStream>(files.length);
for (int x = 0; x < files.length; x++)
v.add(new FileInputStream(inputDirectory + files[x]));
Enumeration<FileInputStream> e = v.elements();
SequenceInputStream sequenceInputStream = new SequenceInputStream(e);
ZipInputStream is = new ZipInputStream(sequenceInputStream);
try {
for (ZipEntry entry = null; (entry = is.getNextEntry()) != null;) {
OutputStream os = new BufferedOutputStream(new FileOutputStream(entry.getName()));
try {
final int bufferSize = 1024;
byte[] buffer = new byte[bufferSize];
for (int readBytes = -1; (readBytes = is.read(buffer, 0, bufferSize)) > -1;) {
os.write(buffer, 0, readBytes);
}
os.flush();
} finally {
os.close();
}
}
} finally {
is.close();
}
}
Simply concatenating the segment data did not work for me. In my case the segments had been created with the Linux command-line zip (Info-ZIP version 3.0):
> zip -s 5m data.zip -r data/
Segment files named data.z01, data.z02, ..., data.zip were created.
The first segment, data.z01, contained the spanning signature 0x08074b50, as described in the Zip File Format Specification by PKWARE. The presence of these 4 bytes made Java's ZipInputStream ignore all entries in the archive. The central directory in the last segment also contained extra segment information compared to a non-split archive, but that did not cause ZipInputStream any problems.
All I had to do was skip the spanning signature. The following code extracts entries both from an archive that has been segmented with zip -s and from a zip file that has been split with the Linux split command (split -d -b 5M data.zip data.zip.). The code is based on szhem's.
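import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PushbackInputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
public class ZipCat {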
private final static byte[] SPANNING_SIGNATURE = {0x50, 0x4b, 0x07, 0x08};
public static void main(String[] args) throws IOException {
List<InputStream> asList = new ArrayList<>();
byte[] buf4 = new byte[4];
PushbackInputStream pis = new PushbackInputStream(new FileInputStream(args[0]), buf4.length);
asList.add(pis);
if (pis.read(buf4) != buf4.length) {
throw new IOException(args[0] + " is too small for a zip file/segment");
}
if (!Arrays.equals(buf4, SPANNING_SIGNATURE)) {
pis.unread(buf4, 0, buf4.length);
}
for (int i = 1; i < args.length; i++) {
asList.add(new FileInputStream(args[i]));
}
try (ZipInputStream is = new ZipInputStream(new SequenceInputStream(Collections.enumeration(asList)))) {
for (ZipEntry entry = null; (entry = is.getNextEntry()) != null;) {
if (entry.isDirectory()) {
new File(entry.getName()).mkdirs();
} else {
try (OutputStream os = new BufferedOutputStream(new FileOutputStream(entry.getName()))) {
byte[] buffer = new byte[1024];
int count = -1;
while ((count = is.read(buffer)) != -1) {
os.write(buffer, 0, count);
}
}
}
}
}
}
}
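The signature-skipping step can also be separated from the file handling, which makes it usable with any kind of input stream. The sketch below takes the segments as streams, in order, and returns a ZipInputStream over their concatenation. It assumes a non-empty list, and `SegmentedZip` and `open` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipInputStream;

class SegmentedZip {
    // Spanning signature 0x08074b50 as it appears on disk (little-endian).
    private static final byte[] SPANNING_SIGNATURE = {0x50, 0x4b, 0x07, 0x08};

    // Hypothetical helper: joins the segments, in order, into one ZipInputStream.
    static ZipInputStream open(List<? extends InputStream> segments) throws IOException {
        List<InputStream> parts = new ArrayList<>(segments);
        PushbackInputStream first = new PushbackInputStream(parts.get(0), 4);
        byte[] head = new byte[4];
        int n = 0;
        // read() may deliver fewer than 4 bytes at once, so loop.
        while (n < head.length) {
            int r = first.read(head, n, head.length - n);
            if (r == -1) {
                break;
            }
            n += r;
        }
        // Push the bytes back unless they are exactly the spanning signature.
        if (n < head.length || !Arrays.equals(head, SPANNING_SIGNATURE)) {
            first.unread(head, 0, n);
        }
        parts.set(0, first);
        return new ZipInputStream(new SequenceInputStream(Collections.enumeration(parts)));
    }
}
```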
