decompressing with zlib in Java (incorrect header check)


I try to program (and understand) compression/decompression.
I have a file which is compressed with zlib and I thought that I found the solution to decompress my file:
import java.util.Scanner;
import java.util.zip.*;
import java.io.*;
public class ZLibCompression
{
public static void main(String args[])throws IOException, DataFormatException {
File compressed = new File("./MyFile.hlb");
decompress(compressed, new File("./MyFile.txt"));
}
public static void decompress(File compressed, File raw)
throws IOException
{
try (InputStream inputStream = new InflaterInputStream(new FileInputStream(compressed));
OutputStream outputStream = new FileOutputStream(raw))
{
copy(inputStream, outputStream);
}
}
private static void copy(InputStream inputStream, OutputStream outputStream)
throws IOException
{
byte[] buffer = new byte[1000];
int length;
while ((length = inputStream.read(buffer)) > 0)
{
outputStream.write(buffer, 0, length);
}
}
}
But I get the following error stack trace:
Exception in thread "main" java.util.zip.ZipException: incorrect header check
at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:106)
at ZLibCompression.copy(ZLibCompression.java:46)
at ZLibCompression.decompress(ZLibCompression.java:20)
at ZLibCompression.main(ZLibCompression.java:11)
Then I checked the header of my file and it says:
{
"compression" : {
"crc32" : 2575274738,
"decompressed_size" : 9020404,
"type" : "zlib"
},
"encoded_data" : "eNrsvV2Xm0i
What is my error? I found a Python script that works fine with the same file:
#!/usr/bin/env python
import sys
import os
import json
import base64
import zlib
SETLIST_OR_BUNDLE = "MyFile.hlb"
infile = open(SETLIST_OR_BUNDLE)
data = json.load(infile)
infile.close()
keys = list(data.keys())
if 'encoded_data' in keys:
unz = zlib.decompress(base64.b64decode(data['encoded_data']))
setlist_or_bundle = json.loads(unz)
keys = list(setlist_or_bundle.keys())
if 'setlists' in keys:
setlists = setlist_or_bundle['setlists']
elif 'presets' in keys:
setlists = [setlist_or_bundle]
for setlist in setlists:
keys = list(setlist.keys())
if 'meta' in keys:
print()
print("SETLIST: %s" % (setlist['meta']['name']))
presets = setlist['presets']
#print json.dumps(presets, indent=4)
for preset in presets:
if 'meta' in list(preset.keys()):
meta = preset['meta']
preset_name = meta['name']
print(" ", preset_name)
I think it has something to do with the base64 part and I found a similar question where someone mentioned "you have to decode the Base64 string into a byte array first" - OK fine - Can anyone explain or give me a link to a tutorial?
All I need is the same functionality in Java like the Python script above has - And of course I want to learn something...

First of all, it looks like your file is not compressed as a whole. Instead, it is a JSON document that contains the actual compressed data in its encoded_data field, so you need to parse the JSON first and extract that field. The easiest way to deal with JSON-encoded data is to use a library. Check this post for some comparisons of different libraries.
Next, as you can see in your Python code, the encoded data is decoded from Base64 before it is passed to the zlib decompressor (zlib.decompress(base64.b64decode(data['encoded_data']))). That is also why InflaterInputStream reports "incorrect header check": it is reading the JSON text, not a zlib stream.
The Java equivalent for Base64-decoding a String would be:
Base64.getDecoder().decode(string);
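Putting both steps together, a minimal sketch of the Java side could look like the following. It assumes Java 9 or later (for readAllBytes). The class name EncodedDataDecoder and its helper methods are made up for illustration, and the regex is only a stand-in for a real JSON library, which is the better choice for production code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class EncodedDataDecoder {
    // Assumption: a regex stands in for a proper JSON parser in this sketch.
    private static final Pattern ENCODED_DATA =
            Pattern.compile("\"encoded_data\"\\s*:\\s*\"([^\"]*)\"");

    // Pulls the Base64 text out of the "encoded_data" field.
    static String extractEncodedData(String json) {
        Matcher m = ENCODED_DATA.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no encoded_data field found");
        }
        // JSON writers may escape '/' as "\/"; undo that before Base64 decoding.
        return m.group(1).replace("\\/", "/");
    }

    // Step 1: Base64-decode the field. Step 2: inflate the resulting zlib stream.
    static byte[] decode(String json) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(extractEncodedData(json));
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        }
    }

    // Builds a small document in the same shape as the file in the question (demo only).
    static String sampleJson(String payload) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(compressed)) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        String b64 = Base64.getEncoder().encodeToString(compressed.toByteArray());
        return "{\"compression\":{\"type\":\"zlib\"},\"encoded_data\":\"" + b64 + "\"}";
    }

    public static void main(String[] args) throws IOException {
        String json = sampleJson("{\"presets\":[]}");
        System.out.println(new String(decode(json), StandardCharsets.UTF_8));
    }
}
```

For the real file, the JSON text would be read from disk first, for example with Files.readString(Path.of("MyFile.hlb")) on Java 11 or later, and then passed to decode.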

Related

How to temporarily create a text file without any file location and send as a response in spring boot at run time?

I need to create a txt file from the available data and then send the file as a REST response.
The app is deployed in a container. I don't want to store the file anywhere in the container or in the Spring Boot resources. Is there a way to create the file at runtime in a memory buffer, without giving any file location, and then send it in the REST response?
The app is a production app, so I need a solution that is secure.

A file is a file. You're using the wrong words - in java, the concept of a stream of data, at least for this kind of job, is called an InputStream or an OutputStream.
Whatever method you have that takes a File? That's the end of the road. A File is a file. You can't fake it. But talk to the developers, or check for alternate methods, because there is absolutely no reason anything in Java that does data processing requires a File. It should be requiring an InputStream or possibly a Reader. Or perhaps there is even a method that gives you an OutputStream or Writer. All of these things are fine: they are abstractions that let you just send data to them, whether it comes from a file, a network connection, or is made up out of whole cloth, which is what you want.
Once you have one of those, it's trivial. For example:
String text = "The Text you wanted to store in a fake file";
byte[] data = text.getBytes(StandardCharsets.UTF_8);
ByteArrayInputStream in = new ByteArrayInputStream(data);
whateverSystemYouNeedToSendThisTo.send(in);
Or for example:
String text = "The Text you wanted to store in a fake file";
byte[] data = text.getBytes(StandardCharsets.UTF_8);
try (var out = whateverSystemYouNeedToSendThisTo.getOutputStream()) {
out.write(data);
}

Take a look at the function below:
Imports
import com.google.common.io.Files;
import org.springframework.http.ContentDisposition;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import java.io.*;
import java.nio.file.Paths;
Function:
@GetMapping(value = "/getFile", produces = MediaType.APPLICATION_OCTET_STREAM_VALUE)
private ResponseEntity<byte[]> getFile() throws IOException {
File tempDir = Files.createTempDir();
File file = Paths.get(tempDir.getAbsolutePath(), "fileName.txt").toFile();
String data = "Some data"; //
try (FileWriter fileWriter = new FileWriter(file)) {
fileWriter.append(data).flush();
} catch (Exception ex) {
ex.printStackTrace();
}
byte[] zippedData = toByteArray(new FileInputStream(file));
HttpHeaders httpHeaders = new HttpHeaders();
httpHeaders.setContentDisposition(ContentDisposition.builder("attachment").filename("file.txt").build());
httpHeaders.setContentType(MediaType.APPLICATION_OCTET_STREAM);
httpHeaders.setContentLength(zippedData.length);
return ResponseEntity.ok().headers(httpHeaders).body(zippedData);
}
public static byte[] toByteArray(InputStream in) throws IOException {
ByteArrayOutputStream os = new ByteArrayOutputStream();
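// Use a fixed-size buffer: in.available() may return 0, and reading into a
// zero-length array returns 0 forever instead of -1.
byte[] buffer = new byte[8192];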
int len;
// read bytes from the input stream and store them in buffer
while ((len = in.read(buffer)) != -1) {
// write bytes from the buffer into output stream
os.write(buffer, 0, len);
}
return os.toByteArray();
}

In a nutshell, you want to store data in memory. The basic building block for this is an array of bytes: byte[].
The JDK has two classes that connect the IO world with byte arrays: ByteArrayInputStream and ByteArrayOutputStream.
The rest is the same as when dealing with files.
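As a rough illustration of that idea, the sketch below writes text through an ordinary Writer into a ByteArrayOutputStream and reads it back through a ByteArrayInputStream, without touching the file system. The class name InMemoryTextFile and the method build are hypothetical, and readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class InMemoryTextFile {
    // Writes the text the same way one would write to a FileWriter, but into memory.
    static byte[] build(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] content = build("Some data");
        // The same bytes can be handed to anything that accepts an InputStream.
        try (InputStream in = new ByteArrayInputStream(content)) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

In a Spring controller, the byte[] returned by build could serve directly as the body of a ResponseEntity<byte[]>, so no temporary file is needed.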

Example 1
@GetMapping(value = "/image")
public @ResponseBody byte[] getImage() throws IOException {
InputStream in = getClass()
.getResourceAsStream("/com/baeldung/produceimage/image.jpg");
return IOUtils.toByteArray(in);
}
Example 2:
@GetMapping("/get-image-dynamic-type")
@ResponseBody
public ResponseEntity<InputStreamResource> getImageDynamicType(@RequestParam("jpg") boolean jpg) {
MediaType contentType = jpg ? MediaType.IMAGE_JPEG : MediaType.IMAGE_PNG;
InputStream in = jpg ?
getClass().getResourceAsStream("/com/baeldung/produceimage/image.jpg") :
getClass().getResourceAsStream("/com/baeldung/produceimage/image.png");
return ResponseEntity.ok()
.contentType(contentType)
.body(new InputStreamResource(in));
}
Ref: https://www.baeldung.com/spring-controller-return-image-file

Zip Archives get corrupted when uploading to Azure Blob Store using REST API

I have been really banging my head against the wall with this one, uploading text files is fine, but when I upload a zip archive into my blob store -> it gets corrupted, and cannot be opened once downloaded.
Doing a hex compare (image below) of the original versus file that has been through Azure shows some subtle replacements have happened, but I cannot find the source of the change/corruption.
I have tried forcing UTF-8/Ascii/UTF-16, but found UTF-8 is probably correct, none have resolved the issue.
I have also tried different http libraries but got the same result.
Deployment environment is forcing unirest, and cannot use the Microsoft API (Which seems to work fine).
package blobQuickstart.blobAzureApp;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Base64;
import org.junit.Test;
import kong.unirest.HttpResponse;
import kong.unirest.Unirest;
public class StackOverflowExample {
@Test
public void uploadSmallZip() throws Exception {
File testFile = new File("src/test/resources/zip/simple.zip");
String blobStore = "secretstore";
UploadedFile testUploadedFile = new UploadedFile();
testUploadedFile.setName(testFile.getName());
testUploadedFile.setFile(testFile);
String contentType = "application/zip";
String body = readFileContent(testFile);
String url = "https://" + blobStore + ".blob.core.windows.net/naratest/" + testFile.getName() + "?sv=2020-02-10&ss=b&srt=o&sp=c&se=2021-09-07T20%3A10%3A50Z&st=2021-09-07T18%3A10%3A50Z&spr=https&sig=xvQTkCQcfMTwWSP5gXeTB5vHlCh2oZXvmvL3kaXRWQg%3D";
HttpResponse<String> response = Unirest.put(url)
.header("x-ms-blob-type", "BlockBlob").header("Content-Type", contentType)
.body(body).asString();
if (!response.isSuccess()) {
System.out.println(response.getBody());
throw new Exception("Failed to Upload File! Unexpected response code: " + response.getStatus());
}
}
private static String readFileContent(File file) throws Exception {
InputStream is = new FileInputStream(file);
ByteArrayOutputStream answer = new ByteArrayOutputStream();
byte[] byteBuffer = new byte[8192];
int nbByteRead;
while ((nbByteRead = is.read(byteBuffer)) != -1)
{
answer.write(byteBuffer, 0, nbByteRead);
}
is.close();
byte[] fileContents = answer.toByteArray();
String s = Base64.getEncoder().encodeToString(fileContents);
byte[] resultBytes = Base64.getDecoder().decode(s);
String encodedContents = new String(resultBytes);
return encodedContents;
}
}
Please help!

byte[] resultBytes = Base64.getDecoder().decode(s);
String encodedContents = new String(resultBytes);
You are creating a String from a byte array that contains binary data. A String is only meant for text: byte sequences that are not valid in the charset are replaced during decoding, which is what corrupts the archive. The Base64 encode followed by an immediate decode is also pointless and just takes more memory.
If the content is in ZIP format, it is binary, so just return the byte array. Alternatively, you can encode the content, but then you should return the encoded content. As a further weakness, you are doing it all in memory, which limits the potential size of the content.
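The effect is easy to reproduce without Azure. The following sketch (the class and method names are made up for illustration) converts a byte array to a String and back, then checks whether the bytes survived. Plain ASCII text survives; bytes that are not valid UTF-8, such as 0xFF, do not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryStringRoundTrip {
    // Returns true only if bytes -> String -> bytes gives back the original data.
    static boolean survivesStringRoundTrip(byte[] data) {
        byte[] back = new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        byte[] text = "plain text".getBytes(StandardCharsets.UTF_8);
        // ZIP signature "PK\3\4" followed by bytes that are invalid in UTF-8.
        byte[] binary = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xFE, (byte) 0x80};

        System.out.println(survivesStringRoundTrip(text));   // prints true
        System.out.println(survivesStringRoundTrip(binary)); // prints false
    }
}
```

This is why readFileContent in the question should return the byte[] itself rather than a String built from it.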

Unirest file handlers will by default force a multipart body - not supported by Azure.
A Byte Array can be provided directly as per this: https://github.com/Kong/unirest-java/issues/248
Unirest.put("http://somewhere")
.body("abc".getBytes())

Can we create file/zipfile as InputStream without a temporary file in java

I need to create a test in which I pass an InputStream containing a ZIP file as input. Is there a way to create such an InputStream directly, instead of creating a local file copy, so that the stream can be used for testing?

If I understand your question correctly you want to create a ZIP file in-memory (i.e. not on disk) from hard-coded data. This is certainly possible using the following classes:
java.io.ByteArrayInputStream
java.io.ByteArrayOutputStream
java.util.zip.ZipOutputStream
java.util.zip.ZipEntry
Here's an example:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
public class Main {
public static void main(String[] args) throws IOException {
// Destination of the ZIP file (an in-memory byte array)
ByteArrayOutputStream boas = new ByteArrayOutputStream();
/*
* Write the ZIP file. This creates a single entry named "file.txt"
* with "Hello, World!" as its contents.
*/
try (ZipOutputStream zos = new ZipOutputStream(boas)) {
zos.putNextEntry(new ZipEntry("file.txt"));
zos.write("Hello, World!".getBytes());
zos.closeEntry();
}
// Create an InputStream to read the raw bytes of the ZIP file
ByteArrayInputStream bois = new ByteArrayInputStream(boas.toByteArray());
/*
* The following writes the ZIP file to disk, specifically to a file named "test.zip"
* in the working directory. The purpose of this is to allow you to run the code
* and see a tangible result (i.e. lets you inspect the resulting ZIP file). Obviously
* you would not do this in your own code since you want to avoid writing the ZIP file
* to disk.
*
* Note: Will fail if the file already exists.
*/
Files.copy(bois, Path.of("test.zip"));
}
}
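For a test, the in-memory ZIP can be fed straight back into a ZipInputStream. The sketch below does that round trip. The class InMemoryZipRoundTrip and its two helpers are illustrative names, not part of any library, and readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class InMemoryZipRoundTrip {
    // Creates a ZIP archive with a single entry, entirely in memory.
    static byte[] zip(String entryName, String content) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return out.toByteArray();
    }

    // Reads the first entry of a ZIP stream and returns "name=content".
    static String readFirstEntry(InputStream zipStream) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(zipStream)) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("empty ZIP stream");
            }
            // readAllBytes stops at the end of the current entry.
            return entry.getName() + "=" + new String(zis.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(zip("file.txt", "Hello, World!"));
        System.out.println(readFirstEntry(in)); // prints file.txt=Hello, World!
    }
}
```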

Elaborating on some of the answers above, let me describe a situation I encountered and how I solved it. I receive files that are PGP encrypted (.pgp, .asc, etc.), so I had to decrypt the file (.txt) itself before passing it into a stream reader to parse the line entries. Instead of saving the decrypted file as a temporary file in the file system before reading it in to parse the entries, I did it in memory as follows.
I had a function which pipes the InputStream into a ByteArrayOutputStream (instead of a FileOutputStream). I then retrieved the byte array, wrapped it in a ByteArrayInputStream, and printed out each entry with a reader:
InputStream encFileIn = new FileInputStream("some_encrypted_file.asc");
byte[] decryptFile = util.decryptFile(encFileIn);
BufferedReader reader = new BufferedReader(new InputStreamReader(
new ByteArrayInputStream(decryptFile), StandardCharsets.UTF_8));
reader.lines().forEach(f -> System.out.println(f));
Here are some code snippets for piping the InputStream inside my decrypt function. They may not be applicable to you, as I am working with PGP files, so you may want to try instantiating a ByteArrayInputStream directly.
public static void pipeAll(InputStream inStr, OutputStream outStr)
throws IOException
{
byte[] bs = new byte[BUFFER_SIZE];
int numRead;
while ((numRead = inStr.read(bs, 0, bs.length)) >= 0)
{
outStr.write(bs, 0, numRead);
}
}
public static byte[] readAll(InputStream inStr)
throws IOException
{
ByteArrayOutputStream buf = new ByteArrayOutputStream();
pipeAll(inStr, buf);
return buf.toByteArray();
}

How to fix java.lang.OutOfMemoryError: Java heap space error? [duplicate]

This question already has answers here:
How to deal with "java.lang.OutOfMemoryError: Java heap space" error?
(31 answers)
Closed 3 years ago.
I have a 32 MB file that I downloaded from the DocuShare server to the DocuShare temp folder. I am trying to read its content in order to create a file. I get an error when I URL-encode my Base64 content.
I do not get any exception when I run the same code in a simple Java application. But when I use the same code in a DocuShare service to get the document content, I get an exception.
HTTP Status 500 - org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: Java heap space
File file = new File(filePath);
FileInputStream fileInputStreamReader = new FileInputStream(file);
byte[] bytes = new byte[(int)file.length()];
fileInputStreamReader.read(bytes);
String encodedBase64 = java.util.Base64.getEncoder().encodeToString(bytes);
String urlEncoded = URLEncoder.encode(encodedBase64);
How to fix this error?
Do I need to increase my tomcat heap size?

There are two ways in which you can fix the issue.
You can increase the heap size, but IMO this is a bad solution, because you will hit the same issue if you get several parallel requests or when you try to process a bigger file.
You can optimize your algorithm - instead of storing several copies of your file in-memory, you can process it in a streaming fashion, thus not holding more than several KBs in memory:
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;
public class Launcher {
public static void main(String[] args) throws Exception {
final Path input = Paths.get("example");
final Path output = Paths.get("output");
try (InputStream in = Files.newInputStream(input); OutputStream out = Base64.getUrlEncoder().wrap(Files.newOutputStream(output))) {
final byte[] buffer = new byte[1024 * 8];
for (int read = in.read(buffer); read > 0; read = in.read(buffer)) {
out.write(buffer, 0, read);
}
}
}
}
PS: If you really need the URL encoder, you'll have to create a streaming version of it, but I think a URL-safe base64 would be more than enough

Base64 converts each group of 3 bytes into 4 characters. That means you can read your data in chunks whose size is a multiple of 3 and encode each chunk separately; the concatenated result is the same as encoding the whole file at once.
Try this:
File file = new File(filePath);
FileInputStream fileInputStreamReader = new FileInputStream(file);
StringBuilder sb = new StringBuilder();
Base64.Encoder encoder = java.util.Base64.getEncoder();
int bufferSize = 3 * 1024; // 3 KB per chunk; must be a multiple of 3
byte[] bytes = new byte[bufferSize];
int readSize = 0;
while ((readSize = fileInputStreamReader.read(bytes)) == bufferSize) {
sb.append(encoder.encodeToString(bytes));
}
if (readSize > 0) {
bytes = Arrays.copyOf(bytes, readSize);
sb.append(encoder.encodeToString(bytes) );
}
String encodedBase64 = sb.toString();
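One caveat with the loop above is that read may return fewer bytes than requested before the end of the stream, which would end the loop early. A slightly more defensive sketch of the same chunking idea is shown below. It assumes Java 9 or later for readNBytes, and the class name ChunkedBase64 is made up for illustration. Note that the encoded result is still collected in memory, so this only avoids holding the raw file and its encoded form at the same time.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

class ChunkedBase64 {
    // Encodes the stream chunk by chunk. The chunk size must be a multiple of 3
    // so that padding can only appear at the very end of the output.
    static String encode(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize % 3 != 0) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 3");
        }
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        byte[] chunk = new byte[chunkSize];
        int n;
        // readNBytes fills the whole buffer unless the end of the stream is reached,
        // so only the final chunk can be shorter than chunkSize.
        while ((n = in.readNBytes(chunk, 0, chunk.length)) > 0) {
            sb.append(encoder.encodeToString(n == chunk.length ? chunk : Arrays.copyOf(chunk, n)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes();
        String chunked = encode(new ByteArrayInputStream(data), 3);
        String whole = Base64.getEncoder().encodeToString(data);
        System.out.println(chunked.equals(whole)); // prints true
    }
}
```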

If you load large files completely into memory, you will sooner or later run into OOM errors, depending on the size of the file. If your goal is just Base64 encoding, consider using the Apache Commons Codec Base64 streams:
https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64InputStream.html

computing checksum for an input stream

I need to compute checksum for an inputstream(or a file) to check if the file contents are changed. I have this below code that generates a different value for each execution though I'm using the same stream. Can someone help me to do this right?
public class CreateChecksum {
public static void main(String args[]) {
String test = "Hello world";
ByteArrayInputStream bis = new ByteArrayInputStream(test.getBytes());
System.out.println("MD5 checksum for file using Java : " + checkSum(bis));
System.out.println("MD5 checksum for file using Java : " + checkSum(bis));
}
public static String checkSum(InputStream fis){
String checksum = null;
try {
MessageDigest md = MessageDigest.getInstance("MD5");
//Using MessageDigest update() method to provide input
byte[] buffer = new byte[8192];
int numOfBytesRead;
while( (numOfBytesRead = fis.read(buffer)) > 0){
md.update(buffer, 0, numOfBytesRead);
}
byte[] hash = md.digest();
checksum = new BigInteger(1, hash).toString(16); //don't use this, truncates leading zero
} catch (Exception ex) {
}
return checksum;
}
}

You're using the same stream object for both calls - after you've called checkSum once, the stream will not have any more data to read, so the second call will be creating a hash of an empty stream. The simplest approach would be to create a new stream each time:
String test = "Hello world";
byte[] bytes = test.getBytes(StandardCharsets.UTF_8);
System.out.println("MD5 checksum for file using Java : "
+ checkSum(new ByteArrayInputStream(bytes)));
System.out.println("MD5 checksum for file using Java : "
+ checkSum(new ByteArrayInputStream(bytes)));
Note that your exception handling in checkSum really needs fixing, along with your hex conversion...
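A possible cleaned-up version of checkSum is sketched below. It lets exceptions propagate instead of swallowing them and pads the hex string to 32 digits so that leading zeros are kept. The class name Md5Checksum is an assumption for this example; the method name follows the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Checksum {
    // Reads the stream to its end and returns the MD5 digest as 32 hex digits.
    static String checkSum(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            md.update(buffer, 0, n);
        }
        // %032x left-pads with zeros, so a hash starting with 0 is not shortened.
        return String.format("%032x", new BigInteger(1, md.digest()));
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = "Hello world".getBytes(StandardCharsets.UTF_8);
        // A fresh stream per call, so both calls see the same data.
        String first = checkSum(new ByteArrayInputStream(bytes));
        String second = checkSum(new ByteArrayInputStream(bytes));
        System.out.println(first.equals(second)); // prints true
    }
}
```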

Check out the DigestUtils class in Apache Commons Codec (org.apache.commons.codec.digest.DigestUtils).

Changes to a file are relatively easy to monitor: File.lastModified() changes each time a file is changed (and closed). There is even a built-in API to get notified of selected changes to the file system: http://docs.oracle.com/javase/tutorial/essential/io/notification.html
The hashCode of an InputStream is not suitable for detecting changes (there is no definition of how an InputStream should calculate its hashCode; quite likely it is using Object.hashCode, meaning the hashCode depends on nothing but object identity).
Building an MD5 the way you are trying to works, but it requires reading the entire file every time. That is quite a performance killer if the file is large and/or you are watching multiple files.

You are confusing two related, but different responsibilities.
First you have a Stream which provides stuff to be read. Then you have a checksum on that stream; however, your implementation is a static method call, effectively divorcing it from a class, meaning that nobody has the responsibility for maintaining the checksum.
Try reworking your solution like so
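// InputStream is an abstract class, so it is extended rather than implemented.
public class ChecksumInputStream extends InputStream {
private final InputStream in;
public ChecksumInputStream(InputStream source) {
this.in = source;
}
@Override
public int read() throws IOException {
int value = in.read();
// -1 signals end of stream and must not be fed into the checksum
if (value != -1) {
updateChecksum(value);
}
return value;
}
// updateChecksum(int) is left for you to implement;
// repeat the same pattern for all the other read methods.
}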
Note that now you only do one read, with the checksum calculator decorating the original input stream.
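The JDK already ships a decorator of this kind, java.security.DigestInputStream, which updates a MessageDigest as bytes are read through it. The following is only a sketch of how it might be used here. The class DigestWhileReading and the method readAndDigest are hypothetical names, and transferTo assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class DigestWhileReading {
    // Copies source to sink in a single pass and returns the MD5 of everything read.
    static byte[] readAndDigest(InputStream source, OutputStream sink)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (DigestInputStream in = new DigestInputStream(source, md)) {
            in.transferTo(sink); // every byte read also updates the digest
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = "Hello world".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] digest = readAndDigest(new ByteArrayInputStream(bytes), copy);
        byte[] expected = MessageDigest.getInstance("MD5").digest(bytes);
        System.out.println(Arrays.equals(digest, expected)); // prints true
    }
}
```

The consumer still reads the data only once; the checksum is available after the stream has been fully read.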

The issue is that after you have read the input stream once, its position has reached the end, so a second read returns no data. The quick way to resolve your issue is to create a new stream before the second call:
ByteArrayInputStream bis = new ByteArrayInputStream(test.getBytes());
System.out.println("MD5 checksum for file using Java : " + checkSum(bis));
bis = new ByteArrayInputStream(test.getBytes());
System.out.println("MD5 checksum for file using Java : " + checkSum(bis));
