Porting Java's DEFLATE algorithm to C# - java

We have a requirement to decompress some data created by a Java system using the DEFLATE algorithm. This we have no control over.
While we don't know the exact variant, we are able to decompress data sent to us using the following Java code:
public static String inflateBase64(String base64)
{
try (Reader reader = new InputStreamReader(
new InflaterInputStream(
new ByteArrayInputStream(
Base64.getDecoder().decode(base64)))))
{
StringWriter sw = new StringWriter();
char[] chars = new char[1024];
for (int len; (len = reader.read(chars)) > 0; )
sw.write(chars, 0, len);
return sw.toString();
}
catch (IOException e)
{
System.err.println(e.getMessage());
return "";
}
}
Unfortunately, our ecosystem is C# based. We're shelling out to the Java program at the moment using the Process object but this is clearly sub-optimal from a performance point of view so we'd like to port the above code to C# if at all possible.
Some sample input and output:
>java -cp . Deflate -c "Pack my box with five dozen liquor jugs."
eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y
>java -cp . Deflate -d eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y
Pack my box with five dozen liquor jugs.
>
We're told the Java system conforms to RFC 1951 so we've looked at quite a few libraries but none of them seem to decompress the data correctly (if at all). One example is DotNetZip:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Ionic.Zlib;
namespace Decomp
{
class Program
{
static void Main(string[] args)
{
// Deflate
String start = "Pack my box with five dozen liquor jugs.";
var x = DeflateStream.CompressString(start);
var res1 = Convert.ToBase64String(x, 0, x.Length);
// Inflate
//String source = "eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y"; // *** FAILS ***
String source = "C0hMzlbIrVRIyq9QKM8syVBIyyxLVUjJr0rNU8jJLCzNL1LIKk0v1gMA";
var part1 = Convert.FromBase64String(source);
var res2 = DeflateStream.UncompressString(part1);
}
}
}
This implements RFC 1951 according to the documentation, but does not decompress the Java-produced string correctly (presumably due to subtle differences between implementations).
From a development point of view we could do with understanding the exact variant we need to write. Is there any header information or online tools we could use to provide an initial steer? It feels like we're shooting in the dark a little bit here.

https://www.nuget.org/packages/ICSharpCode.SharpZipLib.dll/
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;
using System;
using System.IO;
using System.Text;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
string input = "Pack my box with five dozen liquor jugs.";
string encoded = Encode(input);
string decoded = Decode(encoded);
Console.WriteLine($"Input: {input}");
Console.WriteLine($"Encoded: {encoded}");
Console.WriteLine($"Decoded: {decoded}");
Console.ReadKey(true);
}
static string Encode(string text)
{
byte[] bytes = Encoding.UTF8.GetBytes(text);
using (MemoryStream inms = new MemoryStream(bytes))
{
using (MemoryStream outms = new MemoryStream())
{
using (DeflaterOutputStream dos = new DeflaterOutputStream(outms))
{
inms.CopyTo(dos);
dos.Finish();
byte[] encoded = outms.ToArray();
return Convert.ToBase64String(encoded);
}
}
}
}
static string Decode(string base64)
{
byte[] bytes = Convert.FromBase64String(base64);
using (MemoryStream ms = new MemoryStream(bytes))
{
using (InflaterInputStream iis = new InflaterInputStream(ms))
{
using (StreamReader sr = new StreamReader(iis))
{
return sr.ReadToEnd();
}
}
}
}
}
}
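For the "initial steer" the question asks about, the first two bytes of the Base64-decoded payload are often enough to tell the common container formats apart. The sketch below is in Java, with hypothetical class and method names. It checks for a zlib header as defined in RFC 1950 and for the gzip magic number from RFC 1952, and treats anything else as possibly raw RFC 1951 data. It is a heuristic, not a validator. The failing sample starts with 0x78 0x9C, which is a zlib header. That would be consistent with a raw-DEFLATE decoder rejecting it, while the string produced by DotNetZip's DeflateStream has no such header.

```java
import java.util.Base64;

final class DeflateFormatSniffer {
    /** Returns "zlib", "gzip", or "raw-or-unknown" based on the first two bytes. */
    static String sniff(String base64) {
        byte[] data = Base64.getDecoder().decode(base64);
        if (data.length < 2) {
            return "raw-or-unknown";
        }
        int b0 = data[0] & 0xFF;
        int b1 = data[1] & 0xFF;
        // gzip (RFC 1952): magic bytes 0x1F 0x8B.
        if (b0 == 0x1F && b1 == 0x8B) {
            return "gzip";
        }
        // zlib (RFC 1950): compression method 8 in the low nibble, window-size
        // nibble at most 7, and the two header bytes form a multiple of 31.
        boolean methodIsDeflate = (b0 & 0x0F) == 8;
        boolean windowOk = (b0 >> 4) <= 7;
        boolean checkOk = ((b0 << 8) | b1) % 31 == 0;
        if (methodIsDeflate && windowOk && checkOk) {
            return "zlib";
        }
        return "raw-or-unknown";
    }

    public static void main(String[] args) {
        // Sample produced by the Java system in the question.
        System.out.println(sniff("eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y"));
        // Sample produced by DotNetZip's DeflateStream in the question.
        System.out.println(sniff("C0hMzlbIrVRIyq9QKM8syVBIyyxLVUjJr0rNU8jJLCzNL1LIKk0v1gMA"));
    }
}
```

If the payload reports as zlib, a zlib-aware decoder is the natural match on the C# side, such as the SharpZipLib InflaterInputStream used above or DotNetZip's ZlibStream rather than its DeflateStream.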

Related

How to alter / refactor method to use FileInputStream for a local file along with InputStream for a URL using Java 1.8?

Using Java 1.8, I created a class which obtains a zip file from an external HTTP URL:
e.g.
https://raw.githubusercontent.com/mlampros/DataSets/master/fastText_data.zip
and converts it into a String based MD5 hash:
6aa2fe666f83953a089a2caa8b13b80e
My utility class:
public class HashUtils {
public static String makeHashFromUrl(String fileUrl) {
try {
MessageDigest md = MessageDigest.getInstance("MD5");
InputStream is = new URL(fileUrl).openStream();
try {
is = new DigestInputStream(is, md);
// Up to 8K per read
byte[] ignoredBuffer = new byte[8 * 1024];
while (is.read(ignoredBuffer) > 0) { }
} finally {
is.close();
}
byte[] digest = md.digest();
StringBuilder sb = new StringBuilder();
for (int i = 0; i < digest.length; i++) {
sb.append(Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(1));
}
return sb.toString();
} catch (Exception ex) {
throw new RuntimeException(ex);
}
}
}
While this works for an external URL pointing to a zip file (or a file of any other type), I need to be able to use the same code (algorithm) for files that reside on a local filesystem.
e.g.
Inside $CATALINA_HOME/temp/fastText_data.zip
Would need to use this instead:
InputStream fis = new FileInputStream(filename);
How could I do this using the same method (don't want to violate DRY - Don't Repeat Yourself)?
Of course, creating a brand new method containing the same code but using the InputStream fis = new FileInputStream(filename); instead of InputStream is = new URL(fileUrl).openStream(); would be violating the DRY principle?
What would be a good way to refactor this out? Two public methods with a refactored private method containing the same lines of code?
Make three methods: A private method that expects an InputStream argument which is given to your current logic, and two very short public methods which each call the private method with an InputStream they create.
public static String makeHashFromUrl(String url) throws IOException {
try (InputStream stream = new URL(url).openStream()) {
return makeHashFromStream(stream);
}
}
public static String makeHashFromFile(File file) throws IOException {
try (InputStream stream = new BufferedInputStream(new FileInputStream(file))) {
return makeHashFromStream(stream);
}
}
private static String makeHashFromStream(InputStream is) {
try {
MessageDigest md = MessageDigest.getInstance("MD5");
try {
is = new DigestInputStream(is, md);
// etc.
}
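A self-contained sketch of the three-method layout described above, with the elided part filled in from the question's original loop. The class name HashUtilsSketch is hypothetical, and the shared method is package-private here (rather than private) only so that it can be exercised directly with an in-memory stream.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashUtilsSketch {
    static String makeHashFromUrl(String url) throws IOException {
        try (InputStream stream = new URL(url).openStream()) {
            return makeHashFromStream(stream);
        }
    }

    static String makeHashFromFile(File file) throws IOException {
        try (InputStream stream = new BufferedInputStream(new FileInputStream(file))) {
            return makeHashFromStream(stream);
        }
    }

    // Shared logic. The caller owns the stream and is responsible for closing it.
    static String makeHashFromStream(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        DigestInputStream digestIn = new DigestInputStream(in, md);
        byte[] ignoredBuffer = new byte[8 * 1024];
        while (digestIn.read(ignoredBuffer) != -1) {
            // Reading drives the digest; the data itself is discarded.
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }
}
```

Because the shared method only sees an InputStream, any other source (a classpath resource, a byte array in a unit test) can reuse it without a fourth copy of the hashing code.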

Convert MD5 - base64 From JAVA to PHP

I have a problem converting this Java code, which generates an MD5 hash encoded as Base64, to PHP.
I've tried for more than 5 hours without success.
This is the java code:
public static void main(String[] args) {
// TODO code application logic here
try {
String string = "customString";
String format = "20190101000000";
StringBuilder sb = new StringBuilder();
sb.append(format);
sb.append(string);
String sb2 = sb.toString();
byte[] bytes = sb2.getBytes();
byte[] bArr = new byte[16];
MessageDigest instance2 = MessageDigest.getInstance("MD5");
instance2.update(bytes, 0, bytes.length);
instance2.digest(bArr, 0, 16);
PrintStream printStream6 = System.out;
String a2 = Base64.getEncoder().encodeToString(bArr);
if (a2.length() >= 20) {
a2 = a2.substring(0, 19).trim();
}
StringBuilder sb8 = new StringBuilder();
sb8.append("MD5 16: ");
sb8.append(a2);
printStream6.println(sb8.toString());
} catch (Exception e) {
System.out.println(e);
}
}
And this is my php
<?php
$string = 'customString';
$format = '20190101000000';
$res = $format . $string;
$md5 = md5($res, true);
echo $md5;
echo '------------------';
$base = base64_encode($md5);
echo $base;
echo '------------------';
$result = substr($base, 0, 19);
echo $result;
echo '------------------';
The Java result is 1B2M2Y8AsgTpgAmY7Ph and php is iSKxA+7Y1mMnHhwf0yb
Check the charset encodings. Java Strings are UTF-16 internally, and sb2.getBytes() with no argument encodes them using the platform default charset (e.g. ISO-8859-1), so the bytes being hashed can differ between machines.
You have to provide the charset in Java to get deterministic behavior:
sb2.getBytes(java.nio.charset.Charset.forName("UTF-8"));
Or, the other way round: if the goal is not simply to make both produce the same output, but to implement a PHP solution compatible with your existing Java solution, convert the PHP UTF-8 string to the correct charset before calling md5(...), for example with iconv.
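A quick way to test the charset theory is to compare the bytes directly. The sketch below (hypothetical class name Md5Base64Check) encodes the question's input in two charsets and also hashes an empty byte array for comparison. The Java value quoted in the question, 1B2M2Y8AsgTpgAmY7Ph, matches the first 19 characters of the Base64-encoded MD5 of empty input. That suggests the run which produced it hashed no data at all. This is an observation, not a diagnosis of why.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

final class Md5Base64Check {
    // Base64 of the raw 16-byte MD5 digest of the given bytes.
    static String md5Base64(byte[] input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(input);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        String text = "20190101000000" + "customString";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        // The input is pure ASCII, so both charsets yield the same bytes.
        System.out.println("same bytes in both charsets: " + Arrays.equals(utf8, latin1));
        System.out.println("MD5 of input:       " + md5Base64(utf8).substring(0, 19));
        System.out.println("MD5 of empty input: " + md5Base64(new byte[0]).substring(0, 19));
    }
}
```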

Reading binary data in Java

So for a project I am working on, I need to be reading binary data from .FRX files into my Java project. Java's standard byte reader however, keeps returning the wrong bytes for me, which I believe could be a result of Java's modified UTF8-encoding. If I use C#'s binary reading methods, I get the output that I require. An obvious (but proving to be difficult) solution is using C# and a DLL to wrap into the Java project, and I was just wondering if anyone has any simpler alternatives in Java, perhaps an alternative standard byte-reader which can be implemented in Java relatively easily.
Any help is greatly appreciated!
Question update
Here is my C# program, which returns the output I am looking for.
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
public class GetFromFRX
{
public string getFromFRX(string filename, int pos)
{
StringBuilder buffer = new StringBuilder();
using (BinaryReader b = new BinaryReader(File.Open("frmResidency.frx", FileMode.Open)))
{
try
{
b.BaseStream.Seek(pos, SeekOrigin.Begin);
int length = b.ReadInt32();
for (int i = 0; i < length; i++)
{
buffer.Append(b.ReadChar());
}
}
catch (Exception e)
{
return "Error obtaining resource\n" + e.Message;
}
}
return buffer.ToString();
}
}
And here is some slightly differently formatted Java code:
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
public class JavaReader {
public static void main(String[] args) throws Exception {
InputStream i = null;
BufferedInputStream b = null;
try{
// open file
i = new FileInputStream("frmResidency.frx");
// input stream => buffed input stream
b = new BufferedInputStream(i);
int numByte = b.available();
byte[] buf = new byte[numByte];
b.read(buf, 2, 3);
for (byte d : buf) {
System.out.println((char)d+":" + d);
}
}catch(Exception e){
e.printStackTrace();
}finally{
if(i!=null)
i.close();
if(b!=null)
b.close();
}
}
}
In your Java code:
You are using available() in a way which is specifically warned against in the Javadoc.
You aren't checking the result returned by the read() method.
You are reading into the buffer at offset 2 and then checking the entire buffer.
You are reading bytes where your C# code reads characters.
You aren't reading the length word.
You aren't using methods like DataInputStream.readInt() which correspond to your C# code.
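The points above can be sketched as follows. One caveat when mapping the C# code: BinaryReader.ReadInt32 reads the length in little-endian byte order, whereas DataInputStream.readInt() reads big-endian, so this sketch decodes the length word explicitly with a ByteBuffer. The class name FrxReaderSketch is hypothetical, and the code assumes that each character in the file occupies exactly one byte (decoded here as ISO-8859-1). That assumption may not hold for every .FRX file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class FrxReaderSketch {
    /** Reads a 4-byte little-endian length at pos, then that many bytes as text. */
    static String readLengthPrefixed(String filename, long pos) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(filename, "r")) {
            file.seek(pos);

            // Read the length word; readFully fails loudly on a short read.
            byte[] lengthBytes = new byte[4];
            file.readFully(lengthBytes);
            int length = ByteBuffer.wrap(lengthBytes)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .getInt();

            // Guard against a corrupt length before allocating.
            long remaining = file.length() - file.getFilePointer();
            if (length < 0 || length > remaining) {
                throw new IOException("Implausible length " + length + " at offset " + pos);
            }

            byte[] data = new byte[length];
            file.readFully(data);
            // Assumption: one byte per character.
            return new String(data, StandardCharsets.ISO_8859_1);
        }
    }
}
```

A call such as FrxReaderSketch.readLengthPrefixed("frmResidency.frx", pos) would then mirror getFromFRX in the C# version, with the same offset.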

PHP's hash() in Java

I am currently trying to get Java to generate the same hash for a string as PHP's hash algorithm does.
I have come close enough:
hash('sha512', 'password');
outputs:
b109f3bbbc244eb82441917ed06d618b9008dd09b3befd1b5e07394c706a8bb980b1d7785e5976ec049b46df5f1326af5a2ea6d103fd07c95385ffab0cacbc86
Java code:
public static void main(String[] args) {
hash("password");
}
private static String hash(String salted) {
byte[] digest;
try {
MessageDigest mda = MessageDigest.getInstance("SHA-512");
digest = mda.digest(salted.getBytes("UTF-8"));
} catch (Exception e) {
digest = new byte[]{};
}
String str = "";
for (byte aDigest : digest) {
str += String.format("%02x", 0xFF & aDigest);
}
return str;
}
This outputs the same.
My problem is when I use the third argument of PHP's hash function. On PHP's site it's described as follows:
raw_output
When set to TRUE, outputs raw binary data. FALSE outputs lowercase hexits.
I am not quite sure how to implement this extra parameter. I think mainly my question would be, how do I convert a String object into a binary String object? Currently, running it with PHP generates the following: http://sandbox.onlinephpfunctions.com/code/a1bd9b399b3ac0c4db611fe748998f18738d19e3
This should reproduce the outcome from your link:
String strBinary = null;
try {
strBinary = new String(digest, "UTF-8");
} catch (UnsupportedEncodingException e) {
}
and you'll need this import at the top of your file (the snippet above does not use java.nio.charset.Charset):
import java.io.UnsupportedEncodingException;
I hope I understood your issue correctly.
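An alternative sketch, with hypothetical class and method names, that keeps the digest as a byte array. This is arguably the closest Java counterpart to PHP's raw_output, since a PHP "binary string" is just a sequence of bytes. Decoding arbitrary digest bytes as UTF-8 is lossy, because byte sequences that are not valid UTF-8 get replaced. If a String is unavoidable, ISO-8859-1 maps every byte to exactly one character and back.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class Sha512Sketch {
    // Roughly hash('sha512', $s, true) in PHP: the 64 raw digest bytes.
    static byte[] rawHash(String s) {
        try {
            return MessageDigest.getInstance("SHA-512")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    // Roughly hash('sha512', $s) in PHP: lowercase hex digits.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = rawHash("password");
        System.out.println(hex(raw));

        // Byte-preserving round trip through a String.
        String latin1 = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(raw, latin1.getBytes(StandardCharsets.ISO_8859_1)));

        // The same round trip through UTF-8 does not preserve the bytes.
        String utf8 = new String(raw, StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(raw, utf8.getBytes(StandardCharsets.UTF_8)));
    }
}
```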

What's the difference between javascript deflate and java.util.zip.Deflater

I wrote some JavaScript code that compresses a string with deflate and then encodes it as Base64:
function base64 (str) {
return new Buffer(str).toString("base64");
}
function deflate (str) {
return RawDeflate.deflate(str);
}
function encode (str) {
return base64(deflate(str));
}
var str = "hello, world";
console.log("Test Encode");
console.log(encode(str));
I converted "hello, world" to 2f8d48710d6e4229b032397b2492f0c2
and I want to decompress this string (2f8d48710d6e4229b032397b2492f0c2) in Java.
I put the str in a file, then:
public static String decompress1951(final String theFilePath) {
byte[] buffer = null;
try {
String ret = "";
System.out.println("can come to ret");
InputStream in = new InflaterInputStream(new Base64InputStream(new FileInputStream(theFilePath)), new Inflater(true));
System.out.println("can come to in");
while (in.available() != 0) {
buffer = new byte[20480];
int len = in.read(buffer, 0, 20480); // line 64: the exception is thrown here
if (len <=0) {
break;
}
ret = ret + new String(buffer, 0, len);
}
in.close();
return ret;
} catch (IOException e) {
System.out.println("Has IOException");
System.out.println(e.getMessage());
e.printStackTrace();
}
return "";
}
But I have an exception:
java.util.zip.ZipException: invalid stored block lengths
at java.util.zip.InflaterInputStream.read(Unknown Source)
at com.cnzz.mobile.datacollector.DecompressDeflate.decompress1951(DecompressDeflate.java:64)
at com.cnzz.mobile.datacollector.DecompressDeflate.main(DecompressDeflate.java:128)
The Java code above works perfectly. As noted in the comment, you somehow got the encoded value wrong. The encoded value I got using the JavaScript code is y0jNycnXUSjPL8pJAQA=
When you copy this value to a file and call decompress1951, you do in fact get back hello, world as required. I don't know what to say about the JavaScript part, as the code you use seems to match the examples on the distribution web pages. I notice there is both the original library and a fork, so maybe there is some confusion there? In any case, there is a jsfiddle which I think can be taken as a working version if you want to look at it.
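To check an encoded value without going through a file, a small in-memory round trip can help. The sketch below (hypothetical class name RawDeflateSketch) uses the JDK's Deflater and Inflater with nowrap set to true, i.e. raw DEFLATE without the zlib header, matching the new Inflater(true) in the question. Note that the value in the question, 2f8d48710d6e4229b032397b2492f0c2, is 32 hexadecimal characters, so it does not look like Base64-encoded DEFLATE output at all.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class RawDeflateSketch {
    // Raw DEFLATE (no zlib header or checksum), then Base64.
    static String encode(String text) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return Base64.getEncoder().encodeToString(out.toByteArray());
        } finally {
            deflater.end();
        }
    }

    // Base64 decode, then raw inflate.
    static String decode(String base64) throws DataFormatException {
        Inflater inflater = new Inflater(true);
        try {
            inflater.setInput(Base64.getDecoder().decode(base64));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("Truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws DataFormatException {
        System.out.println(decode("y0jNycnXUSjPL8pJAQA="));
        System.out.println(decode(encode("hello, world")));
    }
}
```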
