Java zip character encoding - java

I'm using the following method to compress a file into a zip file:
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
public static void doZip(final File inputfis, final File outputfis) throws IOException {
FileInputStream fis = null;
FileOutputStream fos = null;
final CRC32 crc = new CRC32();
crc.reset();
try {
fis = new FileInputStream(inputfis);
fos = new FileOutputStream(outputfis);
final ZipOutputStream zos = new ZipOutputStream(fos);
zos.setLevel(6);
final ZipEntry ze = new ZipEntry(inputfis.getName());
zos.putNextEntry(ze);
final int BUFSIZ = 8192;
final byte inbuf[] = new byte[BUFSIZ];
int n;
while ((n = fis.read(inbuf)) != -1) {
zos.write(inbuf, 0, n);
crc.update(inbuf, 0, n); // update only with the bytes actually read
}
ze.setCrc(crc.getValue());
zos.finish();
zos.close();
} catch (final IOException e) {
throw e;
} finally {
if (fis != null) {
fis.close();
}
if (fos != null) {
fos.close();
}
}
}
My problem is that I have flat text files with content such as N°TICKET, and the zipped result shows some weird characters when uncompressed: N° TICKET. Characters such as é and à are not handled correctly either.
I guess it's due to the character encoding, but I don't know how to set it to ISO-8859-1 in my zip method.
(I'm running on Windows 7, Java 6.)

You are using streams, which write exactly the bytes that they are given. Writers interpret character data and convert it to the corresponding bytes, and Readers do the opposite. Java (at least in version 6) doesn't provide an easy way to mix and match operations on zipped data with writing characters.
This way will work, though. It is, however, a little clunky.
File inputFile = new File("utf-8-data.txt");
File outputFile = new File("latin-1-data.zip");
ZipEntry entry = new ZipEntry("latin-1-data.txt");
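// decode the input explicitly as UTF-8; FileReader would use the platform default encoding
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile), Charset.forName("UTF-8")));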
ZipOutputStream zipStream = new ZipOutputStream(new FileOutputStream(outputFile));
BufferedWriter writer = new BufferedWriter(
new OutputStreamWriter(zipStream, Charset.forName("ISO-8859-1"))
);
zipStream.putNextEntry(entry);
// this is the important part:
// all character data is written via the writer and not the zip output stream
String line = null;
while ((line = reader.readLine()) != null) {
writer.append(line).append('\n');
}
writer.flush(); // i've used a buffered writer, so make sure to flush to the
// underlying zip output stream
zipStream.closeEntry();
zipStream.finish();
reader.close();
writer.close();

AFAIK this is not available in Java 6.
But I do believe that http://commons.apache.org/compress/ can provide a solution.
Switching to Java 7 provides a new constructor that takes the encoding as an additional parameter.
https://blogs.oracle.com/xuemingshen/entry/non_utf_8_encoding_in
zipStream = new ZipInputStream(
new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
Charset.forName("ISO-8859-1"));
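One caveat about the Java 7 constructors: according to the Javadoc, the Charset argument is used to encode and decode entry names and comments, not the bytes stored inside each entry. The following is a minimal sketch of that behaviour, assuming Java 7 or newer; the class and method names (ZipNameCharsetSketch, zipSingleEntry, firstEntryName) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of any library.
class ZipNameCharsetSketch {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Builds an in-memory zip with one entry whose NAME is encoded with the given charset. */
    static byte[] zipSingleEntry(String entryName, byte[] content, Charset nameCharset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes, nameCharset)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content); // content bytes are stored as given, no re-encoding happens here
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Reads back the name of the first entry, decoding it with the given charset. */
    static String firstEntryName(byte[] zip, Charset nameCharset) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip), nameCharset)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }
}
```

If the garbled characters are in the file content rather than in the entry name, the content still has to be transcoded with a Reader/Writer pair before it is written into the entry.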

Try org.apache.commons.compress.archivers.zip.ZipFile instead of Java's own library; it lets you specify the encoding like this:
import org.apache.commons.compress.archivers.zip.ZipFile;
ZipFile zipFile = new ZipFile(filepath,encoding);


Java write exe file

Is it possible to write/create an exe file in Java?
I can successfully read it, but writing the exact same data that has been read to a new file seems to cause trouble, because Windows tells me the file is not supported on my PC anymore.
This is the code I'm using to read the file, where path is a String holding the actual path (the file is inside the .jar itself, which is why I'm using getResourceAsStream()):
try {
InputStream inputStream = FileIO.class.getResourceAsStream(path);
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
ArrayList<String> _final = new ArrayList<String>();
String line;
while ((line = reader.readLine()) != null) {
_final.add(line);
}
inputStream.close();
return _final.toArray(new String[_final.size()]);
}catch(Exception e) {
return null;
}
This is the code I'm using to write the file:
public static void writeFileArray(String path, String[] data) {
String filename = path;
try{
FileWriter fileWriter = new FileWriter(filename);
BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);
for(String d : data) {
bufferedWriter.write(d + "\n");
}
bufferedWriter.close();
}
catch(IOException ex){
System.out.println("FileIO failed to write file, IO exception");
}
}
It doesn't give me any errors, and the file size of the original .exe and the 'transferred' .exe stays the same, but the copy doesn't work anymore. Am I just doing it wrong? Did I forget something? Can you even do this with Java?
By the way, I'm not that experienced with reading/writing files.
Thanks for considering my request.
I'm going to guess that you're using a Reader when you should be using a raw input stream. Use BufferedInputStream instead of BufferedReader.
BufferedInputStream in = new BufferedInputStream( inputStream );
The problem is that a Reader decodes the binary data as text in your local character set instead of passing the bytes through unchanged.
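To see why a Reader/Writer pair damages binary data, here is a small sketch that pushes bytes through a String and back. It fixes the charset to UTF-8 so the result is deterministic (the platform default may differ), and the class name BinaryThroughStringSketch is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the question's code.
class BinaryThroughStringSketch {
    /** Decodes bytes as UTF-8 text and encodes them again, as a Reader/Writer pair would. */
    static byte[] roundTripAsText(byte[] data) {
        // Byte sequences that are not valid UTF-8 are replaced with U+FFFD while decoding.
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** True if the bytes come back unchanged from the text round trip. */
    static boolean survivesTextRoundTrip(byte[] data) {
        return Arrays.equals(data, roundTripAsText(data));
    }
}
```

Plain ASCII text survives the round trip, while a typical executable header such as the bytes 0x4D 0x5A 0x90 0x00 does not, because 0x90 on its own is not valid UTF-8. On top of that, readLine() drops the original line terminators, so writing "\n" after every line changes the data again.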
Edit: if you need a bigger hint, start with this. I just noticed you're using a BufferedWriter too; that won't work either.
try {
InputStream inputStream = FileIO.class.getResourceAsStream(path);
BufferedInputStream in = new BufferedInputStream( inputStream );
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] bytes = new byte[ 1024 ];
// copy raw bytes; no character decoding is involved
for( int length; ( length = in.read( bytes ) ) != -1; ) {
bos.write( bytes, 0, length );
}
in.close();
return bos;
} catch( IOException e ) {
return null;
}
When you are using Java 7 or newer, you should copy a resource to a file using
public static void copyResourceToFile(String resourcePath, String filePath) {
try(InputStream inputStream = FileIO.class.getResourceAsStream(resourcePath)) {
Files.copy(inputStream, Paths.get(filePath));
}
catch(IOException ex){
System.out.println("Copying failed. "+ex.getMessage());
}
}
This construct ensures correct closing of the resources even in the exceptional case and the JRE method ensures correct and efficient copying of the data.
It accepts additional options, e.g. to specify that the target file should be overwritten in case it already exists, you would use
public static void copyResourceToFile(String resourcePath, String filePath) {
try(InputStream inputStream = FileIO.class.getResourceAsStream(resourcePath)) {
Files.copy(inputStream, Paths.get(filePath), StandardCopyOption.REPLACE_EXISTING);
}
catch(IOException ex){
System.out.println("Copying failed. "+ex.getMessage());
}
}
You are using Readers, which are meant for text; .exe files are raw bytes!
Try using a ByteArrayInputStream and ByteArrayOutputStream.
Edit: completing with markspace's answer:
new BufferedInputStream(new ByteArrayInputStream( ... ) )

zip4j, extract a password protected file from an inputstream (blob inputstream which is a zip file)

I have a database that contains blobs, including a password-protected zip. With the standard File-based approach I would traditionally write:
File zipFile = new File("C:\\file.zip");
net.lingala.zip4j.core.ZipFile table = new net.lingala.zip4j.core.ZipFile(zipFile);
if (table.isEncrypted())
table.setPassword(password);
net.lingala.zip4j.model.FileHeader entry = table.getFileHeader("file_inside_the_zip.txt");
return table.getInputStream(entry); // Decrypted input stream!
My question is: how do I implement something like this without temporary files, obtaining an input stream purely from the blob? So far I have something like this:
InputStream zipStream = getFileFromDataBase("stuff.zip");
//This point forward I have to save zipStream as a temporary file and use the traditional code above
I faced the same problem while processing a password protected zipped file in a Hadoop File System (HDFS). HDFS doesn't know about the File object.
This is what worked for me using zip4j:
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path hdfsReadPath = new Path(zipFilePath); // like "hdfs://master/dir/sub/data/the.zip"
FSDataInputStream inStream = fs.open(hdfsReadPath);
ZipInputStream zipInputStream = new ZipInputStream(inStream, passWord.toCharArray());
LocalFileHeader zipEntry = null;
BufferedReader reader = new BufferedReader(new InputStreamReader(zipInputStream));
while ((zipEntry = zipInputStream.getNextEntry()) != null ) {
String entryName = zipEntry.getFileName();
System.out.println(entryName);
if (!zipEntry.isDirectory()) {
String line;
while ((line = reader.readLine()) != null) {
//process the line
}
}
}
reader.close();
zipInputStream.close();
I believe it is not possible via zip4j as it is very centered around files.
Have a look at this one: http://blog.alutam.com/2012/03/31/new-library-for-reading-and-writing-zip-files-in-java/
There is a way to achieve this with net.lingala.zip4j.io.inputstream.ZipInputStream
(given a byte[] zipFile and a String password):
String zipPassword = "abcabc";
ZipInputStream innerZip = new ZipInputStream(new ByteArrayInputStream(zipFile), zipPassword.toCharArray());
Then you can loop over the entries of the decrypted stream:
File zip = null;
LocalFileHeader zipEntry;
while ((zipEntry = innerZip.getNextEntry()) != null) {
zip = new File(file.getAbsolutePath(), zipEntry.getFileName());
....
}
public void extractWithZipInputStream(File zipFile, char[] password) throws IOException {
LocalFileHeader localFileHeader;
int readLen;
byte[] readBuffer = new byte[4096];
InputStream inputStream = new FileInputStream(zipFile);
try (ZipInputStream zipInputStream = new ZipInputStream(inputStream, password)) {
while ((localFileHeader = zipInputStream.getNextEntry()) != null) {
File extractedFile = new File(localFileHeader.getFileName());
try (OutputStream outputStream = new FileOutputStream(extractedFile)) {
while ((readLen = zipInputStream.read(readBuffer)) != -1) {
outputStream.write(readBuffer, 0, readLen);
}
}
}
}
}
This method needs to be adapted to your needs; for example, you may have to change the output location. I have tried it and it worked. For a better understanding, see https://github.com/srikanth-lingala/zip4j

Inline input stream processing in Java

I need some help with the following problem. I am working on a project where I need to process files.
I receive an input stream from the user, and before writing it to disk I need to perform the following steps:
1. calculate the file digest
2. check that only one zipped file is present, and unzip the data if it is zipped
3. dos2unix conversion
4. record length validation
5. encrypt and save the file to disk
I also need to abort the flow if any exception occurs during processing.
I tried to use piped output and input streams, but the constraint is that Java recommends running them in two separate threads. Once I have read from the input stream, I cannot reuse it for the other processing steps. The files can be very big, so I cannot cache all the data in a buffer.
Please share your suggestions, or let me know if there is a third-party library I can use for this.
The biggest issue is that you'll need to peek ahead in the provided InputStream to decide if you received a zipfile or not.
private boolean isZipped(InputStream is) throws IOException {
try {
return new ZipInputStream(is).getNextEntry() != null;
} catch (final ZipException ze) {
return false;
}
}
After this you need to reset the input stream to its initial position before setting up a DigestInputStream.
Then read from a ZipInputStream or from the DigestInputStream directly.
After you've done your processing, read the DigestInputStream to the end so you can obtain the digest.
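If the incoming stream is not backed by a file (so there is no channel to reposition), one alternative for the peek-and-reset step is mark/reset on a BufferedInputStream. This is only a sketch under that assumption, not what the answer's code does: it checks for the usual local file header signature "PK" 0x03 0x04, which would not match an empty archive, and the names ZipSniffSketch and looksLikeZip are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;

// Hypothetical helper class, not part of the answer's code.
class ZipSniffSketch {
    /** Returns true if the stream starts with the zip local file header signature; the stream is rewound afterwards. */
    static boolean looksLikeZip(BufferedInputStream in) throws IOException {
        byte[] magic = new byte[4];
        in.mark(magic.length); // remember the current position for up to 4 bytes
        int total = 0;
        while (total < magic.length) {
            int n = in.read(magic, total, magic.length - total);
            if (n == -1) {
                break; // stream is shorter than the signature
            }
            total += n;
        }
        in.reset(); // rewind so the caller sees the stream from the start
        return total == 4 && magic[0] == 'P' && magic[1] == 'K' && magic[2] == 3 && magic[3] == 4;
    }
}
```

The same BufferedInputStream can then be wrapped in the DigestInputStream, so nothing has to be read twice from the source.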
The code below has been validated with a wrapping CountingInputStream that keeps track of the total number of bytes read from the provided FileInputStream.
final FileInputStream fis = new FileInputStream(filename);
final CountingInputStream countIs = new CountingInputStream(fis);
final boolean isZipped = isZipped(countIs);
// make sure we reset the inputstream before calculating the digest
fis.getChannel().position(0);
final DigestInputStream dis = new DigestInputStream(countIs, MessageDigest.getInstance("SHA-256"));
// decide which inputStream to use
InputStream is = null;
ZipInputStream zis = null;
if (isZipped) {
zis = new ZipInputStream(dis);
zis.getNextEntry();
is = zis;
} else {
is = dis;
}
final File tmpFile = File.createTempFile("Encrypted_", ".tmp");
final OutputStream os = new CipherOutputStream(new FileOutputStream(tmpFile), obtainCipher());
try {
readValidateAndWriteRecords(is, os);
failIf2ndZipEntryExists(zis);
} catch (final Exception e) {
os.close();
tmpFile.delete();
throw e;
}
System.out.println("Digest: " + obtainDigest(dis));
dis.close();
System.out.println("\nValidating bytes read and calculated digest");
final DigestInputStream dis2 = new DigestInputStream(new CountingInputStream(new FileInputStream(filename)), MessageDigest.getInstance("SHA-256"));
System.out.println("Digest: " + obtainDigest(dis2));
dis2.close();
Not really relevant, but these are the helper methods:
private String obtainDigest(DigestInputStream dis) throws IOException {
final byte[] buff = new byte[1024];
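// read to the end of the stream so that every remaining byte passes through the digest
while (dis.read(buff) != -1) {
// nothing to do: reading updates the digest
}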
return DatatypeConverter.printBase64Binary(dis.getMessageDigest().digest());
}
private void readValidateAndWriteRecords(InputStream is, final OutputStream os) throws IOException {
final BufferedReader br = new BufferedReader(new InputStreamReader(is));
// dos2unix is done automatically by readLine, which strips both "\r\n" and "\n"
for (String line = br.readLine(); line != null; line = br.readLine()) {
// record length validation
if (line.length() < 1) {
throw new RuntimeException("RecordLengthValidationFailed");
}
os.write((line + "\n").getBytes());
}
}
private void failIf2ndZipEntryExists(ZipInputStream zis) throws IOException {
if (zis != null && zis.getNextEntry() != null) {
throw new RuntimeException("Zip File contains multiple entries");
}
}
==> output:
Digest: jIisvDleAttKiPkyU/hDvbzzottAMn6n7inh4RKxPOc=
CountingInputStream closed. Total number of bytes read: 1100
Validating bytes read and calculated digest
Digest: jIisvDleAttKiPkyU/hDvbzzottAMn6n7inh4RKxPOc=
CountingInputStream closed. Total number of bytes read: 1072
Fun question, I may have gone overboard with my answer :)

Prepend lines to file in Java

Is there a way to prepend a line to the File in Java, without creating a temporary file, and writing the needed content to it?
No, there is no way to do that SAFELY in Java. (Or AFAIK, any other programming language.)
No filesystem implementation in any mainstream operating system supports this kind of thing, and you won't find this feature supported in any mainstream programming languages.
Real world file systems are implemented on devices that store data as fixed sized "blocks". It is not possible to implement a file system model where you can insert bytes into the middle of a file without significantly slowing down file I/O, wasting disk space or both.
The solutions that involve an in-place rewrite of the file are inherently unsafe. If your application is killed or the power dies in the middle of the prepend / rewrite process, you are likely to lose data. I would NOT recommend using that approach in practice.
Use a temporary file and renaming. It is safer.
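A minimal sketch of the temporary-file-and-rename approach, assuming Java 7 or newer and UTF-8 for the new line; the class and method names (PrependSketch, prependLine) are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not taken from the answer.
class PrependSketch {
    /** Prepends one line by writing a sibling temp file and then moving it over the original. */
    static void prependLine(Path file, String line) throws IOException {
        // Create the temp file in the same directory so the final move stays on one file system.
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "prepend", ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
                Files.copy(file, out); // stream the original content after the new line
            }
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp); // cleans up after a failure; nothing to delete after a successful move
        }
    }
}
```

Until the final move, the original file is untouched, so a crash in the middle leaves at worst a stray temp file behind.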
There is a way, it involves rewriting the whole file though (but no temporary file). As others mentioned, no file system supports prepending content to a file. Here is some sample code that uses a RandomAccessFile to write and read content while keeping some content buffered in memory:
public static void main(final String args[]) throws Exception {
File f = File.createTempFile(Main.class.getName(), "tmp");
f.deleteOnExit();
System.out.println(f.getPath());
// put some dummy content into our file
BufferedWriter w = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(f)));
for (int i = 0; i < 1000; i++) {
w.write(UUID.randomUUID().toString());
w.write('\n');
}
w.flush();
w.close();
// prepend "some uuids" to our file
int bufLength = 4096;
byte[] appendBuf = "some uuids\n".getBytes();
byte[] writeBuf = appendBuf;
byte[] readBuf = new byte[bufLength];
int writeBytes = writeBuf.length;
RandomAccessFile rw = new RandomAccessFile(f, "rw");
int read = 0;
int write = 0;
while (true) {
// seek to read position and read content into read buffer
rw.seek(read);
int bytesRead = rw.read(readBuf, 0, readBuf.length);
// seek to write position and write content from write buffer
rw.seek(write);
rw.write(writeBuf, 0, writeBytes);
// no bytes read - end of file reached
if (bytesRead < 0) {
// end of
break;
}
// update seek positions for write and read
read += bytesRead;
write += writeBytes;
writeBytes = bytesRead;
// reuse buffer, create new one to replace (short) append buf
byte[] nextWrite = writeBuf == appendBuf ? new byte[bufLength] : writeBuf;
writeBuf = readBuf;
readBuf = nextWrite;
};
rw.close();
// now show the content of our file
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(f)));
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line);
}
}
You could store the file content in a String and prepend the desired line by using a StringBuilder-Object. You just have to put the desired line first and then append the file-content-String.
No extra temporary file needed.
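A rough sketch of this in-memory approach, assuming Java 7 or newer and a file small enough to fit in memory; the class and method names (PrependInMemorySketch, prependLine) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not taken from the answer.
class PrependInMemorySketch {
    /** Loads the whole file, puts the new line in front of it and writes everything back. */
    static void prependLine(Path file, String line, Charset charset) throws IOException {
        String original = new String(Files.readAllBytes(file), charset);
        StringBuilder sb = new StringBuilder(line.length() + 1 + original.length());
        sb.append(line).append('\n').append(original);
        // The file is overwritten in place, so an interruption here can lose data.
        Files.write(file, sb.toString().getBytes(charset));
    }
}
```

This avoids the temporary file, but it holds the entire content in memory and shares the safety problem of every in-place rewrite.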
No. There are no "intra-file shift" operations, only read and write of discrete sizes.
It would be possible to do so by reading a chunk of the file of equal length to what you want to prepend, writing the new content in place of it, reading the next chunk and replacing it with what you read before, and so on, rippling down to the end of the file.
However, don't do that, because if anything stops (out-of-memory, power outage, rogue thread calling System.exit) in the middle of that process, data will be lost. Use the temporary file instead.
private static void addPreAppnedText(File fileName) {
FileOutputStream fileOutputStream =null;
BufferedReader br = null;
FileReader fr = null;
String newFileName = fileName.getAbsolutePath() + "#";
try {
fileOutputStream = new FileOutputStream(newFileName);
fileOutputStream.write("preappendTextDataHere".getBytes());
fr = new FileReader(fileName);
br = new BufferedReader(fr);
String sCurrentLine;
while ((sCurrentLine = br.readLine()) != null) {
fileOutputStream.write(("\n"+sCurrentLine).getBytes());
}
fileOutputStream.flush();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
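if (fileOutputStream != null)
fileOutputStream.close();
if (br != null)
br.close();
if (fr != null)
fr.close();
// move the new file over the original; whether renameTo replaces an existing file is platform-dependent
new File(newFileName).renameTo(fileName);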
} catch (IOException ex) {
ex.printStackTrace();
}
}
}

GZIPInputStream reading line by line

I have a file in .gz format. The java class for reading this file is GZIPInputStream.
However, this class doesn't extend the BufferedReader class of java. As a result, I am not able to read the file line by line. I need something like this
reader = new MyGZInputStream( some constructor of GZInputStream)
reader.readLine()...
I thought of creating my own class that extends the Reader or BufferedReader class of Java and uses a GZIPInputStream as one of its fields.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.util.zip.GZIPInputStream;
public class MyGZFilReader extends Reader {
private GZIPInputStream gzipInputStream = null;
char[] buf = new char[1024];
@Override
public void close() throws IOException {
gzipInputStream.close();
}
public MyGZFilReader(String filename)
throws FileNotFoundException, IOException {
gzipInputStream = new GZIPInputStream(new FileInputStream(filename));
}
@Override
public int read(char[] cbuf, int off, int len) throws IOException {
// TODO Auto-generated method stub
return gzipInputStream.read((byte[])buf, off, len);
}
}
But, this doesn't work when I use
BufferedReader in = new BufferedReader(
new MyGZFilReader("F:/gawiki-20090614-stub-meta-history.xml.gz"));
System.out.println(in.readLine());
Can someone advise how to proceed?
The basic setup of decorators is like this:
InputStream fileStream = new FileInputStream(filename);
InputStream gzipStream = new GZIPInputStream(fileStream);
Reader decoder = new InputStreamReader(gzipStream, encoding);
BufferedReader buffered = new BufferedReader(decoder);
The key issue in this snippet is the value of encoding. This is the character encoding of the text in the file. Is it "US-ASCII", "UTF-8", "SHIFT-JIS", "ISO-8859-9", …? There are hundreds of possibilities, and the correct choice usually cannot be determined from the file itself. It must be specified through some out-of-band channel.
For example, maybe it's the platform default. In a networked environment, however, this is extremely fragile. The machine that wrote the file might sit in the neighboring cubicle, but have a different default file encoding.
Most network protocols use a header or other metadata to explicitly note the character encoding.
In this case, it appears from the file extension that the content is XML. XML includes the "encoding" attribute in the XML declaration for this purpose. Furthermore, XML should really be processed with an XML parser, not as text. Reading XML line-by-line seems like a fragile, special case.
Failing to explicitly specify the encoding is against the second commandment. Use the default encoding at your peril!
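As a sketch of the decorator chain with the encoding passed in explicitly, assuming Java 7 or newer; the class and method names (GzipLinesSketch, openGzipReader) are made up, and the caller is expected to know the right charset from some out-of-band source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not taken from the answers.
class GzipLinesSketch {
    /** Opens a line-oriented reader over a gzip file, decoding the text with the given charset. */
    static BufferedReader openGzipReader(Path gzFile, Charset charset) throws IOException {
        InputStream fileStream = Files.newInputStream(gzFile);
        try {
            return new BufferedReader(
                    new InputStreamReader(new GZIPInputStream(fileStream), charset));
        } catch (IOException e) {
            fileStream.close(); // e.g. not a gzip file: do not leak the file handle
            throw e;
        }
    }
}
```

Used in a try-with-resources statement, closing the returned reader closes the whole chain, and lines can be consumed one at a time without loading a large dump into memory.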
GZIPInputStream gzip = new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));
br.readLine();
BufferedReader in = new BufferedReader(new InputStreamReader(
new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"))));
String content;
while ((content = in.readLine()) != null)
System.out.println(content);
You can use the following method in a util class, and use it whenever necessary...
public static List<String> readLinesFromGZ(String filePath) {
List<String> lines = new ArrayList<>();
File file = new File(filePath);
try (GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(file));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));) {
String line = null;
while ((line = br.readLine()) != null) {
lines.add(line);
}
} catch (FileNotFoundException e) {
e.printStackTrace(System.err);
} catch (IOException e) {
e.printStackTrace(System.err);
}
return lines;
}
Here it is as a single statement:
try (BufferedReader br = new BufferedReader(
new InputStreamReader(
new GZIPInputStream(
new FileInputStream(
"F:/gawiki-20090614-stub-meta-history.xml.gz")))))
{br.readLine();}
