Issue: length changes when converting between String and byte array? - java

I thought the length would always stay the same when converting between byte[] and String, but the example below shows this is incorrect.
byte[] b1 = {55, -71, -35, -35, 83, -115, 107, -80, -62, 86, 98, 125, -68, -12, 14, -92, -122, -65, -117, -26, 80, -102, 75, 49, -120, -10, 18, -8, 82, -21, 49, 80, 125, 94, -35, -66, 91, 79, 77, -29, -48, -85, 29, -48, -118, -13, -84, -77, 93, -101, -7, 46, -44, -25, -42, 72, -33, -81, -120, -40, 40, 65, 58, -74, -34, 99, -8, -118, 83, 110, -94, 69, 21, -27, 114, 43, -23, 7, 120, -15, 21, 110, 108, 98, -99, 7, 107, 63, -48, 32, 123, 35, -36, -35, 7, -75, 40, -3, 33, 92, -79, 119, 22, -63, 27, 123, -98, 92, -93, 30, 51, 55, 106, -109, 99, 123, 25, -111, -53, 66, 117, 121, -20, 6, -10, -34, -76, -120, -56, 123, 48, -9, -116, -81, -47, 67, 80, 14, -58, -17, -92, -75, 119, 27, 125, -115, -31, 114, -96, 126, -87, 98, -108, -21, -113, 36, 104, -69, -74, 41, -68, 115, 103, 106, -39, 10, 0, 7, -66, 84, -94, 46, -1, -62, -115, 104, -104, 53, 86, -117, 15, -100, 46, 7, 57, -84, 40, 118, -12, 93, -6, -31, 28, 81, -72, 123, 54, -76, 123, 111, 54, 121, 126, -19, -32, 99, 109, -68, -103, 29, 75, 57, 115, 33, 110, -23, -116, 11, 112, 117, 67, -100, 21, 94, -16, 94, 24, 47, -90, -48, 30, 15, 24, 98, -114, -96, 37, -47, 32, 74, 110, 58, 35, 77, 62, -74, 94, 59, 63, -35, -59, 10, 43, 65, -63, 59, -65, 58, 69, 88, -91, -58, -103, 88, 6, -105, 92, -9, -19, 26, 5, -42, -38, -82, -56, 42, -45, 30, 103, -113, -64, -82, 29, 6, 40, 102, 44, 59, 51, -69, -70, 90, -126, 40, -105, 103, 92, 124, 120, 43, -53, 73, -109, 103, -62, -64, -68, -81, -61, -68, -73, -6, -112, 85, 119, -92, -85, -31, -37, 32, -2, 100, 34, 41, -128, 73, -92, -94, 71, 98, 0, 126, -98, -51, -8, -72, -97, 66, -71, -14, -74, -39, 56, 71, 46, -94, 40, 32, -84, -17, -128, 60, 25, 75, -104, 25, 49, -14, -103, -89, 97, -61, 89, -109, 118, 114, 123, -38, 101, 98, 7, 70, 9, 42, 98, -94, 73, -70, 72, 43, 52, -89, -20, -22, -58, -109, -88, 36, 118, 71, -34, -85, -24, -46, -120, -118, 5, -118, -53, -5, -87, -116, -38, 101, 74, -111, -2, 12, 48, -105, -110, 6, -114, 31, 70, -42, -118, -61, 82, 83, -37, 27, -56, 91, 113, -23, -40, 
-121, 35, 79, 3, 79, 58, -54, -11, -41, -48, -109, -54, 96, 80, 77, -69, -88, -75, -126, -64, 54, 33, 7, 121, 16, -49, 26, 68, 94, 107, -79, -17, -67, -59, 57, -8, -36, 99, 29, -2, 36, -91, 70, 56, 76, 88, 40, 85, -16, 120, -101, -21, 83, 103, -91, 28, 14, 17, 73, -102, -121, 69, -102, 18, -115, -92, -5, -50, -20};
System.out.println("resultBytes length = " + b1.length);
String s = new String(b1, "utf-8");
System.out.println("cipherText length = " + s.length());
byte[] b2 = s.getBytes("utf-8");
System.out.println("newResultBytes length = " + b2.length);
By running this, I got output:
length of b1 = 496
length of s = 470
length of b2 = 877
Why are they so different?

In UTF-8, a single character may be encoded as more than 1 byte.
Example:
Character -> Codepoints -> UTF-8 Encoding
ä -> 00E4 -> C3 A4
So 2 bytes in the input can end up as 1 character in the output.
In Unicode, some characters can also be decomposed into several characters. To stay with my example, the character ä can be decomposed into
¨a
These are now 2 characters with the following encodings:
Character -> Codepoints -> UTF-8 Encoding
¨a -> 00A8 0061 -> C2 A8 61
With some languages (Asian scripts in particular) such decomposition comes up more often than in this example.
So for this example (and only if decomposition takes place, which is not guaranteed for every language) you would get the following output from your program:
length of b1 = 2
length of s = 1
length of b2 = 3
I think that can explain your findings.
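One more effect is worth checking for this particular input, independently of the multi-byte encoding described above. The array looks like cipher output rather than text, so parts of it are probably not valid UTF-8. The String(byte[], Charset) constructor replaces each malformed sequence with U+FFFD, and U+FFFD re-encodes as three bytes (EF BF BD), which would account for both a shorter string and a longer second array. A minimal sketch, with class and method names made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
    /** Returns {input byte count, decoded char count, re-encoded byte count}. */
    static int[] lengths(byte[] input) {
        String s = new String(input, StandardCharsets.UTF_8);
        byte[] back = s.getBytes(StandardCharsets.UTF_8);
        return new int[] { input.length, s.length(), back.length };
    }

    public static void main(String[] args) {
        // Valid UTF-8: the two bytes C3 A4 decode to the single character "ä".
        byte[] valid = { (byte) 0xC3, (byte) 0xA4 };
        // First two bytes of b1 from the question: '7' followed by a lone
        // continuation byte (0xB9), which is malformed UTF-8.
        byte[] malformed = { 55, -71 };

        System.out.println(Arrays.toString(lengths(valid)));     // [2, 1, 2]
        System.out.println(Arrays.toString(lengths(malformed))); // [2, 2, 4]
    }
}
```

If the goal is to carry arbitrary bytes in a String, an encoding such as Base64 (java.util.Base64) or the ISO-8859-1 charset round-trips every byte value without loss.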

Related

How to consume Avro Serialized messages from AWS MSK via Apache Beam

PCollection<KafkaRecord<String, byte[]>> kafkaRecordPCollection =
pipeline.apply(
KafkaIO.<String, byte[]>read()
.withBootstrapServers("bootstrap-server")
.withTopic("topic")
.withConsumerConfigUpdates(props)
.withKeyDeserializer(StringDeserializer.class)
.withValueDeserializer(ByteArrayDeserializer.class));
When I create KafkaIO as shown above, I am able to get data in the following form.
KV{Struct{OC_NO=ABCDE3,PO_NO=11XA435A,S_ID=024,__dbz__physicalTableIdentifier=cdc.po.PO}, [3, 0, -93, -44, 46, 43, 44, -90, 68, 71, -96, 24, -105, 51, 107, -22, -27, 86, 0, 2, 12, 65, 66, 67, 68, 69, 51, 0, 2, 97, 22, 49, 46, 56, 46, 49, 46, 70, 105, 110, 97, 108, 20, 112, 111, 115, 116, 103, 114, 101, 115, 113, 108, 20, 99, 100, 99, 95, 115, 101, 114, 118, 101, 114, -6, -96, -63, -16, -28, 96, 0, 10, 102, 97, 108, 115, 101, 18, 75, 65, 70, 75, 65, 45, 80, 79, 67, 2, 66, 91, 34, 49, 50, 53, 53, 51, 52, 49, 56, 48, 56, 52, 53, 54, 34, 44, 34, 49, 50, 53, 53, 51, 56, 57, 55, 54, 52, 52, 56, 56, 34, 93, 26, 112, 117, 114, 99, 104, 97, 115, 101, 111, 114, 100, 101, 114, 28, 80, 85, 82, 67, 72, 65, 83, 69, 95, 79, 82, 68, 69, 82, 2, -66, -95, -57, -90, 15, 2, -112, -18, -4, -80, -119, 73, 0, 2, 117, 2, -116, -89, -63, -16, -28, 96, 0]}
But I need to deserialize the byte[] into a POJO class generated from an Avro schema, and when I try to use that class with withValueDeserializer() I get errors. Is there a specific way to do it?
I have also created a custom MyClassKafkaAvroDeserializer.
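import io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer;
import io.confluent.kafka.serializers.KafkaAvroDeserializerConfig;
import java.util.Map;
import lombok.extern.slf4j.Slf4j;
import org.apache.kafka.common.serialization.Deserializer;

@Slf4j
public class MyClassKafkaAvroDeserializer extends
        AbstractKafkaAvroDeserializer implements Deserializer<Envelope> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        configure(new KafkaAvroDeserializerConfig(configs));
    }

    @Override
    public Envelope deserialize(String s, byte[] bytes) {
        // Delegates to the inherited deserialize(byte[]) and casts the result.
        return (Envelope) this.deserialize(bytes);
    }

    @Override
    public void close() {
    }
}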
PCollection<KafkaRecord<String, Envelope>> kafkaRecordPCollection =
pipeline.apply(
KafkaIO.<String, Envelope>read()
.withBootstrapServers("bootstrap-server")
.withTopic("topic")
.withConsumerConfigUpdates(props)
.withKeyDeserializer(StringDeserializer.class)
.withValueDeserializer(MyClassKafkaAvroDeserializer.class)
);
Which gives me an error like below
Exception in thread "main" org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
I think you also need to provide a coder in this case, like:
KafkaIO.<String, Envelope>read()
...
.withValueDeserializerAndCoder(
MyClassKafkaAvroDeserializer.class, AvroCoder.of(Envelope.class))
If you're having trouble with deserializers, you can always follow your byte-producing Read with a Map that does the deserialization.
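As a way to see where "Unknown magic byte!" comes from: AbstractKafkaAvroDeserializer expects the Confluent wire format, in which every payload starts with a 0 byte followed by a 4-byte big-endian schema id. The value printed in the question starts with 3, so it was possibly written by a different serializer; that is a guess from the dump, not something the question confirms. The sketch below only inspects the header with the standard library; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

class WireFormatCheck {
    // Leading byte of the Confluent wire format.
    static final byte CONFLUENT_MAGIC_BYTE = 0x0;

    /** True if the payload has a 0 magic byte and room for the 4-byte schema id. */
    static boolean looksLikeConfluentFrame(byte[] payload) {
        return payload != null && payload.length >= 5 && payload[0] == CONFLUENT_MAGIC_BYTE;
    }

    /** Reads the big-endian schema id that follows the magic byte. */
    static int schemaId(byte[] payload) {
        if (!looksLikeConfluentFrame(payload)) {
            throw new IllegalArgumentException("Unknown magic byte: " + (payload == null ? "null" : payload[0]));
        }
        return ByteBuffer.wrap(payload, 1, 4).getInt();
    }

    public static void main(String[] args) {
        byte[] fromQuestion = { 3, 0, -93, -44, 46, 43 }; // first bytes of the dumped value
        byte[] confluentStyle = { 0, 0, 0, 0, 42, 12 };   // made-up frame with schema id 42

        System.out.println(looksLikeConfluentFrame(fromQuestion)); // false
        System.out.println(schemaId(confluentStyle));              // 42
    }
}
```

Running a check like this on the raw byte[] values (read with ByteArrayDeserializer as in the first snippet) shows quickly whether the topic's framing matches what the deserializer expects.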

converting a base64 decoded string to list in python does not produce same result as in Java array output

I have a base64 encoded string, for example pTWlzYwVk74RHlbhrHtYxjlmTpa1KY3LVj3X8o3PHUURfY07Qnk5wFPHP7SHDvoJSaM24DybXt20+ou3evsEmLNQfzsF2A1lfSsG2dIKf5Gmhb1qXVN7C6z1mJIRTWt99ei9A1Ozyc7et2DpKpX0SGIaKPcmf2TomYvt1Q+YWTaabUoue9BgI2VHb3L2f/UdRo5ja6beSeA=. When I decode this string in Java I get the correct result, but in Python I get a different result. Here is the code I used.
Python code:
import base64
decoded = base64.b64decode("pTWlzYwVk74RHlbhrHtYxjlmTpa1KY3LVj3X8o3PHUURfY07Qnk5wFPHP7SHDvoJSaM24DybXt20+ou3evsEmLNQfzsF2A1lfSsG2dIKf5Gmhb1qXVN7C6z1mJIRTWt99ei9A1Ozyc7et2DpKpX0SGIaKPcmf2TomYvt1Q+YWTaabUoue9BgI2VHb3L2f/UdRo5ja6beSeA=")
mylist = list(decoded)
print("mylist", mylist)
output becomes
mylist [165, 53, 165, 205, 140, 21, 147, 190, 17, 30, 86, 225, 172, 123, 88, 198, 57, 102, 78, 150, 181, 41, 141, 203, 86, 61, 215, 242, 141, 207, 29, 69, 17, 125, 141, 59, 66, 121, 57, 192, 83, 199, 63, 180, 135, 14, 250, 9, 73, 163, 54, 224, 60, 155, 94, 221, 180, 250, 139, 183, 122, 251, 4, 152, 179, 80, 127, 59, 5, 216, 13, 101, 125, 43, 6, 217, 210, 10, 127, 145, 166, 133, 189, 106, 93, 83, 123, 11, 172, 245, 152, 146, 17, 77, 107, 125, 245, 232, 189, 3, 83, 179, 201, 206, 222, 183, 96, 233, 42, 149, 244, 72, 98, 26, 40, 247, 38, 127, 100, 232, 153, 139, 237, 213, 15, 152, 89, 54, 154, 109, 74, 46, 123, 208, 96, 35, 101, 71, 111, 114, 246, 127, 245, 29, 70, 142, 99, 107, 166, 222, 73, 224]
and the Java side I have used like this
String str = "pTWlzYwVk74RHlbhrHtYxjlmTpa1KY3LVj3X8o3PHUURfY07Qnk5wFPHP7SHDvoJSaM24DybXt20+ou3evsEmLNQfzsF2A1lfSsG2dIKf5Gmhb1qXVN7C6z1mJIRTWt99ei9A1Ozyc7et2DpKpX0SGIaKPcmf2TomYvt1Q+YWTaabUoue9BgI2VHb3L2f/UdRo5ja6beSeA=";
byte[] decoded = Base64.getDecoder().decode(str.getBytes(StandardCharsets.UTF_8));
System.out.println("myarray: "+Arrays.toString(decoded));
and output becomes
[-91, 53, -91, -51, -116, 21, -109, -66, 17, 30, 86, -31, -84, 123, 88, -58, 57, 102, 78, -106, -75, 41, -115, -53, 86, 61, -41, -14, -115, -49, 29, 69, 17, 125, -115, 59, 66, 121, 57, -64, 83, -57, 63, -76, -121, 14, -6, 9, 73, -93, 54, -32, 60, -101, 94, -35, -76, -6, -117, -73, 122, -5, 4, -104, -77, 80, 127, 59, 5, -40, 13, 101, 125, 43, 6, -39, -46, 10, 127, -111, -90, -123, -67, 106, 93, 83, 123, 11, -84, -11, -104, -110, 17, 77, 107, 125, -11, -24, -67, 3, 83, -77, -55, -50, -34, -73, 96, -23, 42, -107, -12, 72, 98, 26, 40, -9, 38, 127, 100, -24, -103, -117, -19, -43, 15, -104, 89, 54, -102, 109, 74, 46, 123, -48, 96, 35, 101, 71, 111, 114, -10, 127, -11, 29, 70, -114, 99, 107, -90, -34, 73, -32]
Comparing the two, the positive values are the same in both outputs, but where Java prints a negative value Python prints a different number. Why is that? What am I doing wrong? I saw a similar issue here on Stack Overflow, but unfortunately it does not resolve my problem.
Those results are the same.
In Java, byte is a signed type, so values above 127 are shown as negative numbers.
In Python, the elements of a bytes object are unsigned (0 to 255).
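To make the two outputs directly comparable, each Java byte can be masked with 0xFF, which yields the unsigned value Python prints. A small sketch, using only the first four characters of the question's string; the class and method names are illustrative:

```java
import java.util.Arrays;
import java.util.Base64;

class UnsignedBytes {
    /** Converts signed Java bytes to their unsigned 0..255 values. */
    static int[] toUnsigned(byte[] bytes) {
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xFF; // same as Byte.toUnsignedInt(bytes[i])
        }
        return out;
    }

    public static void main(String[] args) {
        // "pTWl" is the first 4-character group of the base64 string in the question.
        byte[] decoded = Base64.getDecoder().decode("pTWl");
        System.out.println(Arrays.toString(decoded));             // [-91, 53, -91]
        System.out.println(Arrays.toString(toUnsigned(decoded))); // [165, 53, 165]
    }
}
```

The underlying bits are identical in both languages; only the printed interpretation differs.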

Input too large for RSA Cipher

I have generated the following Public Key in Android.
fun createCipher(): Cipher
{
val posKey = posPublicKey
posKey.publicKey.modulus
var spec = RSAPublicKeySpec(BigInteger(posPublicKey.publicKey.modulus), BigInteger(posPublicKey.publicKey.exponent))
var fact = KeyFactory.getInstance(KeyProperties.KEY_ALGORITHM_RSA)
publicKey = fact.generatePublic(spec)
var cipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding")
cipher.init(Cipher.ENCRYPT_MODE, publicKey)
return cipher
}
The RSA key size that I have selected is 4096 bits
The exponent is 3 bytes and the modulus is 512 bytes
The Modulus Byte Array is as follows:
[-32, -28, -121, 32, 109, -82, 43, 115, 43, -117, 20, 35, 122, 33, -2, 23, 23, -22, 75, 0, 91, 10, -89, 48, 33, -89, 57, 1, 57, -13, 9, -127, 90, 121, -96, -94, 106, -16, -105, -112, -74, 30, -12, 74, -74, 104, -26, 15, 99, -22, -55, -75, 14, -45, 56, 20, 85, 90, 83, -50, 68, -114, 5, -10, -109, 79, 44, 81, 68, -98, -45, -51, -97, 71, 90, -13, -78, 118, -21, -47, 66, 104, -83, 56, -72, -27, 45, 16, 70, 32, -76, -125, -11, 108, 126, -61, -126, 16, 6, -49, -106, -114, 18, 49, 121, -39, -109, 115, 111, -128, 83, 8, -110, -10, -4, 51, -67, 49, 66, 103, -76, -88, -110, 122, 56, 29, -101, 22, 3, 117, -104, -54, -64, -71, 23, 58, 87, 37, 96, 25, -114, 38, 1, -126, 33, -91, -4, 89, -28, 10, 95, -104, -24, -38, 17, 47, -122, 24, -89, 123, 100, 12, -10, -57, -44, 45, 25, 39, -80, -101, -6, -99, -95, -5, 70, 32, 37, -57, -52, -47, -66, 85, 10, -48, 75, 4, -114, 104, -7, -112, -128, 4, 114, 77, -40, 96, 66, 83, -54, 10, 111, 102, -39, -63, 2, -75, 38, 36, 24, 13, -51, 96, 89, -60, -40, 99, 65, 123, 52, -114, 122, 75, 32, -121, 80, -76, -11, -1, -31, -118, -51, -21, 13, 109, 111, -102, 120, -56, 62, -19, -79, 86, -41, -81, 67, -80, -63, 37, 35, 47, 109, -32, 47, -128, 95, -48, -53, -1, -125, -19, -9, -10, 15, -116, -50, 53, -86, -102, -24, 107, 122, -43, -125, 51, 14, 101, 67, 57, 116, 97, -40, -98, -82, -118, -83, 120, -107, -14, 19, -49, -27, 10, 25, 40, 43, -27, 31, 59, -57, 58, 33, -98, 1, -45, -118, 76, -21, -13, -123, 67, 42, -37, -96, -32, 33, 124, 1, 44, -99, 74, 18, 32, 10, -107, -121, 86, -115, -70, -107, 109, 17, -92, 109, -47, 60, -49, -91, 7, -125, 47, 78, 86, 81, -2, -35, 17, 124, 94, -26, -80, -84, 120, 110, 38, -55, -90, -11, 107, 73, 71, 44, 69, -58, 56, -59, 2, 94, 27, 88, 29, -57, 95, -99, 5, 102, -66, 118, -82, 126, 20, -104, -95, 47, -2, 77, -33, 89, -66, -92, 121, -5, 78, 68, -1, -82, -95, -121, 117, -29, 70, 11, -72, 54, -99, -13, -87, 9, 77, -113, 51, -124, -56, -8, 126, -114, -31, 90, -125, -11, 41, -85, 74, 3, 90, -95, 85, 121, 61, 14, 116, 51, -40, 
-57, -124, -69, -51, -76, -119, -80, 95, 95, 17, -34, 80, -36, 66, -51, 14, -69, -113, 35, -109, -115, -16, -3, -118, 114, -20, -81, 57, -65, 40, -8, -67, -85, 110, 50, -128, 44, -78, 93, -44, -93, 89, -76, 13, 98, -38, -55, -120, 11, 127, 84, -2, 101, 57, -121, -111, 91, -102, -118, 85, -124, -90, 91, -84, 28, 120, -28, -105, 88, -73, 6, 89, 33, 8, 9, 30, 9, -6, 17, 25]
The Exponent Byte Array is as Follows:
[1, 0, 1]
The test Key is as follows:
val stringKey = "8D-F7-5B-15-0F-2A-E5-3E-FD-44-5A-63-50-AC-62-D6-06-2D-59-5C-F1-C3-9A-DB-45-25-0D-7A-72-AE-DF-87"
val stringIV = "FA-94-FD-74-2E-AC-2C-90-79-98-AF-A3-D7-12-5D-A2"
var aeskey = AesKeyBuilder()
val key = (stringKey.replace("-", "")).toByteArray(Charsets.US_ASCII)
val iv = (stringIV.replace("-", "")).toByteArray(Charsets.US_ASCII)
aeskey.key = key
aeskey.iv = iv
val encryptedKey = cipher.doFinal(aeskey.key)
The item that I am trying to encrypt with this public key is 64 bytes.
However, I get the following error:
Process: com.touchsides.rewards.debug, PID: 19470
com.android.org.bouncycastle.crypto.DataLengthException: input too large for RSA cipher.
at com.android.org.bouncycastle.crypto.engines.RSACoreEngine.convertInput(RSACoreEngine.java:115)
at com.android.org.bouncycastle.crypto.engines.RSABlindedEngine.processBlock(RSABlindedEngine.java:95)
at com.android.org.bouncycastle.crypto.encodings.OAEPEncoding.encodeBlock(OAEPEncoding.java:199)
at com.android.org.bouncycastle.crypto.encodings.OAEPEncoding.processBlock(OAEPEncoding.java:131)
I believe the modulus size is large enough to allow for the encryption of this byte array.
You get the com.android.org.bouncycastle.crypto.DataLengthException error because the RSA key was created with a negative modulus, so it cannot be used to encrypt even a single byte.
The BigInteger(byte[]) constructor treats the array as a two's-complement value, so the most significant bit of the first byte acts as the sign bit. To get a positive number you have to use the other constructor, whose first argument is a sign indicator:
BigInteger(1, posPublicKey.publicKey.modulus), BigInteger(1, posPublicKey.publicKey.exponent)
Another approach, often seen in C#, is to prepend a zero byte to the byte array (so the sign bit is always zero):
BigInteger(byteArrayOf(0) + posPublicKey.publicKey.modulus), BigInteger(byteArrayOf(0) + posPublicKey.publicKey.exponent)
And finally, if you already have your BigInteger created the wrong way, it's possible to convert it:
var modulus = BigInteger(posPublicKey.publicKey.modulus)
if (modulus.compareTo(BigInteger.ZERO) < 0)
modulus = modulus.add(BigInteger.ONE.shiftLeft(4096))
Please also look at this answer and answers of this question.
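The sign behaviour is easy to reproduce in plain Java. The sketch below uses only the first four bytes of the modulus from the question, so the numbers are illustrative rather than a real key; the class and method names are made up.

```java
import java.math.BigInteger;

class ModulusSign {
    /** Interprets the bytes as an unsigned (non-negative) magnitude. */
    static BigInteger unsigned(byte[] magnitude) {
        return new BigInteger(1, magnitude);
    }

    public static void main(String[] args) {
        byte[] modulusStart = { -32, -28, -121, 32 }; // first bytes of the posted modulus

        BigInteger signed = new BigInteger(modulusStart); // two's-complement reading
        BigInteger positive = unsigned(modulusStart);     // signum = 1 reading

        System.out.println(signed.signum());   // -1
        System.out.println(positive.signum()); // 1

        // The two readings differ by exactly 2^(8 * number of bytes), which is why
        // adding 1 << 4096 repairs a negative 512-byte modulus.
        BigInteger gap = BigInteger.ONE.shiftLeft(8 * modulusStart.length);
        System.out.println(positive.subtract(signed).equals(gap)); // true
    }
}
```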

lzw decompression algorithm in java for given compressed text

I have following compressed text given as a byte Array:
byte[] compressed = [97, 2, 10, 28, 72, -80, -96, -63, -125, 8, 19, 42, 92, -56, -80, -95, -61, -121, 16, 35, 74, -100, 72, -79, -94, -59, -117, 24, 51, 106, -36, -56, -79, -93, -57, -113, 32, 67, -118, 28, 73, -78, -92, -55, -109, 40, 83, -86, 92, -55, -78, -91, -53, -105, 48, 99, -54, -100, 73, -77, -90, -51, -101, 56, 115, -22, -36, -55, -77, -89, -49, -97, 64, -125, 10, 29, 74, -76, -88, -47, -93, 72, -109, 42, 93, -54, -76, -87, -45, -89, 80, -93, 74, -99, 74, -75, -86, -43, -85, 88, -77, 106, -35, -54, -75, -85, -41, -81, 96, -61, -118, 29, 75, -74, -84, -39, -77, 104, -45, -86, 93, -53, -74, -83, -37, -73, 112, -29, -54, -99, 75, -73, -82, -35, -69, 120, -13, -22, -35, -53, -73, -81, -33, -65, -128, 3, 11, 30, 76, -72, -80, -31, -61, -120, 19, 43, 94, -52, -72, -79, -29, -57, -112, 35, 75, -98, 76, -71, -78, -27, -53, -104, 51, 107, -34, -52, -71, -77, -25, -49, -96, 67, -117, 30, 77, -70, -76, -23, -45, -88, 83, -85, 94, -51, -70, -75, -21, -41, -80, 99, -53, -98, 77, -69, -74, -19, -37, -72, 115, -21, -34, -51, -69, -73, -17, -33, -64, -125, 11, 31, 78, -68, -72, -15, -29, -56, -109, 43, 95, -50, -68, -71, -13, -25, -48, -93, 75, -97, 78, -67, -70, -11, -21, -40, -77, 107, -33, -50, -67, -69, -9, -17, -32, -61, -117, 31, 79, -66, -68, -7, -13, -24, -45, -85, 95, -49, -66, -67, -5, -9, -16, -29, -53, -97, 79, -65, -66, -3, -5, -8, -13, -21, -33, -49, -65, -65, -1, -1, 0, -61, 4, 20, 56, -112, 96, 65, -125, 7, 17, 38, 84, -72, -112, 97, 67, -121, 15, 33, 70, -108, 56, -111, 98, 69, -117, 23, 49, 102, -44, -72, -111, 99, 71, -113, 31, 65, -122, 20, 57, -110, 100, 73, -109, 39, 81, -90, 84, -71, -110, 101, 75, -105, 47, 97, -58, -108, 57, -109, 102, 77, -101, 55, 113, -26, -44, -71, -109, 103, 79, -97, 63, -127, 6, 21, 58, -108, 104, 81, -93, 71, -111, 38, 85, -70, -108, 105, 83, -89, 79, -95, 70, -107, 58, -107, 106, 85, -85, 87, -79, 102, -43, -70, -107, 107, 87, -81, 95, -63, -122, 21, 59, -106, 108, 89, -77, 103, -47, -90, 85, -69, -106, 109, 91, -73, 111, 
91, 2];
and a given codeword length of 9 bits.
The resulting string should be aaaaaaaaaaaaaaaaaaaaaaaaa..., consisting only of the letter a, with a total length of 39270.
I want to write a decompress function which decompresses the compressed byte array and returns the resulting string of a's.
I tried the usual LZW implementations, but they did not work very well. The 9-bit codeword length and the negative values in the byte array are giving me a headache.
My understanding is that I have to convert the whole byte array into one binary string and then read a value every 9 bits (the codeword length)?
Does anyone have a hint or a suggestion for how this can be done? Thanks, I really appreciate your support.
Edit:
Here is the code I have tried so far, which works with
byte[] compressed = [68, 0, 97, 0, 115, 0, 32, 0, 105, 0, 115, 0, 116, 0, 32, 0, 101, 0, 105, 0, 110, 0, 32, 0, 107, 0, 117, 0, 114, 0, 122, 0, 101, 0, 114, 0, 32, 0, 84, 0, 101, 0, 120, 0, 116, 0]
and a given codeword length of 16 bits.
The resulting string is Das ist ein kurzer Text.
public static List<String> convertByteArrayToBinaryStringList(byte[]compressedData,int codeWordLength){
StringBuilder sb=new StringBuilder();
List<String> binaryCompressedValues=new ArrayList<String>();
for(int i=0;i<compressedData.length;i++){
sb.append(byteToBinaryString(compressedData[i]));
}
char[]binaryCharArray=sb.toString().toCharArray();
int j=0;
while(j<binaryCharArray.length){
StringBuilder binStringBuilder=new StringBuilder();
for(int i=j;i<j+codeWordLength;i++){
if(j+codeWordLength>binaryCharArray.length){
System.out.println("End reached!");
binStringBuilder.append("0");
}else{
binStringBuilder.append(binaryCharArray[i]);
}
}
j+=codeWordLength;
binaryCompressedValues.add(binStringBuilder.toString());
}
return binaryCompressedValues;
}
public String incrementDictSize(String currentDictSize) {
String incrementedString = Integer.toBinaryString(Integer.valueOf(currentDictSize, 2) + 1);
int lengthDistance = currentDictSize.length() - incrementedString.length();
String padding = "";
if (lengthDistance > 0) {
for (int i = 0; i < lengthDistance; i++) {
padding += "0";
}
}
return padding + incrementedString;
}
public byte[]uncompress(byte[]compressedData){
int codeWordLength=16;
List<String> binaryCompressedValues=new ArrayList<String>();
binaryCompressedValues=convertByteArrayToBinaryStringList(compressedData,codeWordLength);
//int dictSize = 256;
String dictSize="1111111100000000";
Map<String, String> dictionary=new HashMap<String, String>();
String s="00000000";
String padString="00000000";
for(int i=0;i< 256;i++){
s=String.format("%8s",Integer.toBinaryString(i)).replace(' ','0');
dictionary.put(s+padString,""+(char)i);
System.out.println("dictionary.get("+i+") "+s+padString+" "+dictionary.get(s+padString));
}
//String w = "" + (char)(byte)compressedData[0];
String w="";
StringBuffer result=new StringBuffer(w);
for(String k:binaryCompressedValues){
String entry;
if(dictionary.containsKey(k)){
entry=dictionary.get(k);
dictionary.put(incrementDictSize(currentDictSize),w+entry.charAt(0));
result.append(entry);
w=entry;
}else{
entry=w+w.charAt(0);
result.append(entry);
dictionary.put(incrementDictSize(currentDictSize),w+w.charAt(0));
w=entry;
}
}
return result.toString().getBytes();
}
Example code for reading n-bit-codes LSB first.
Beware dirty details like alignment on code size change.
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** An <code>NBitsInputStream</code> reads its <code>InputStream</code>
 * as codes of n bits, least significant bits first.
 */
// extend with set/increaseCodeLength() as needed
class NBitsInputStream extends FilterInputStream {
    int buffer, validBits;
    int codeLength, codeMask;

    protected NBitsInputStream(InputStream in, int n) {
        super(in);
        codeLength = n;
        codeMask = (1 << n) - 1;
    }

    /** Reads a code of n bits, least significant bits first.
     * @return the code, or -1 if -1 is read. */
    @Override
    public int read() throws IOException {
        // Pull in whole bytes until at least one full code is buffered.
        while (validBits < codeLength) {
            int high = super.read();
            if (high < 0) // EOF
                return high;
            buffer |= high << validBits;
            validBits += 8;
        }
        int code = buffer & codeMask;
        validBits -= codeLength;
        buffer >>= codeLength;
        return code;
    }
}
(Tried
for (int code ; 0 <= (code = codes.read()) ; ) {
String entry = (String) dictionary.get(code);
w += (null != entry ? entry : w).charAt(0);
dictionary.put(++currentDictSize, w);
result.append(w = entry);
}
without unexpected observations.)
Note that your dictionary will contain not just the longest strings, but each of their prefixes, for up to about 2130641537 characters using 16 bit codes (in addition to a result of almost that length).
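To check the LSB-first reading order against the data in the question without any stream plumbing, the same bit arithmetic can be applied directly to a byte[]. This is only a sketch of the unpacking step, not a full LZW decoder; the class and method names are made up, and masking with 0xFF is what takes care of the negative byte values.

```java
import java.util.ArrayList;
import java.util.List;

class CodeUnpacker {
    /** Splits data into codes of codeLength bits, least significant bits first. */
    static List<Integer> unpack(byte[] data, int codeLength) {
        List<Integer> codes = new ArrayList<>();
        int mask = (1 << codeLength) - 1;
        long buffer = 0;   // bits not yet consumed, lowest bit is the oldest
        int validBits = 0; // number of meaningful bits in buffer
        for (byte b : data) {
            buffer |= (long) (b & 0xFF) << validBits; // & 0xFF removes the sign
            validBits += 8;
            while (validBits >= codeLength) {
                codes.add((int) (buffer & mask));
                buffer >>>= codeLength;
                validBits -= codeLength;
            }
        }
        return codes; // leftover bits that do not fill a whole code are dropped
    }

    public static void main(String[] args) {
        // First four bytes of the 9-bit example from the question.
        System.out.println(unpack(new byte[] { 97, 2, 10, 28 }, 9));  // [97, 257, 258]
        // First four bytes of the 16-bit example ("Da").
        System.out.println(unpack(new byte[] { 68, 0, 97, 0 }, 16));  // [68, 97]
    }
}
```

In the 9-bit data the code after 97 comes out as 257 rather than 256, which suggests that new dictionary entries start at 257 in this format; that is an observation from this sketch, not something the question states.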

DataFormatException: incorrect header check in java.util.zip.InflaterInputStream

I use javafx WebView in my project, and it began to crash on one website.
Through debugging I found that when the page receives part of its JS code, the server uses the header "Content-Encoding:deflate", ignoring my request headers.
The main problem is in the inflate method of InflaterInputStream.
java.util.zip.ZipException: incorrect header check
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
I reproduced this error in simple method:
public byte[] makeTestRequest(String url) throws Exception {
InputStreamResponseListener listener = new InputStreamResponseListener();
Request request = httpClient
.newRequest(url);
request.send(listener);
Response response = listener.get(5, TimeUnit.SECONDS);
byte[] uncompressedData = new byte[65536];
if (response.getStatus() == 200) {
try(InputStream responseContent = listener.getInputStream()){
InputStream stream = new InflaterInputStream(responseContent);
int len, offset = 0;
while ((len = stream.read(uncompressedData , offset, uncompressedData.length-offset))>0) {
offset += len;
}
stream.close();
} catch (Exception e) {
e.printStackTrace();
}
}
return uncompressedData;
}
The error occurs in this method:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/util/zip/Inflater.java#247
At this moment the Inflater's buffer contains the following data:
[-52, 93, 105, 115, 19, -57, -70, -2, 43, 42, 127, -54, -87, -63, -55, 116, -49, -98, 111, 22, 96, 98, -57, -40, -114, 23, -74, -101, -5, -63, 24, -127, 29, -37, -40, 120, 97, 75, -91, -54, 24, 8, -71, 5, 9, 107, 78, 40, 3, 33, 108, 39, 124, 57, 39, 62, 24, -127, -15, 34, -2, -62, -24, 31, -35, 126, -33, 30, 75, -45, -45, 61, 51, 26, 105, 20, 83, -87, 20, -42, 72, -74, -34, -18, -89, -33, 125, -23, -17, 59, -26, -58, -58, -89, 74, -117, 71, 74, -13, 11, -109, -77, 103, 59, -66, 116, 40, -47, 109, 87, -73, -68, 125, 29, -89, -25, 103, 103, 106, -49, 117, -10, 122, 108, 124, 113, 118, 126, -95, -10, -56, -78, 108, -41, -13, 8, -75, -10, 117, 44, -52, -51, -50, 47, 46, 116, 124, -7, 63, -33, 119, 76, -98, -22, -8, -110, 24, -122, 109, -17, 99, 127, 122, -66, 116, 118, -79, 7, 30, -20, -21, -104, -102, 60, -53, 126, -24, 88, 40, -99, -103, 97, 15, 59, -40, -17, -80, 95, 25, -104, 63, 85, -102, 103, 79, 53, -83, 83, -45, 14, -112, -2, 18, 123, 126, 118, 108, -90, -60, 30, -7, 47, -85, 87, -85, 43, -2, 127, -3, -118, -65, -11, 121, -63, 95, -83, 94, -59, -97, 55, 11, -2, 29, -1, -91, -65, -38, -15, -61, -66, -32, -69, -22, 127, 26, -120, -112, -1, -80, -6, 79, -42, 126, -97, -102, 54, -11, -62, -76, -102, -87, -76, -114, 107, -102, 123, -48, -103, 10, -3, -31, -25, 126, -39, -33, 97, -1, 109, 84, -81, 48, 90, 31, 51, 90, -33, -79, 127, 95, 86, 111, 85, 127, -87, -34, -16, -53, -43, 101, 124, -2, -112, 125, -7, -102, -65, -59, 62, 117, -69, -10, -11, 102, 10, -7, -29, -22, 111, -87, 47, -33, 51, 28, 51, 59, -7, 93, 90, 118, -14, -1, -12, 55, -128, -8, -6, 87, 59, -60, -10, -78, -94, 60, 114, -48, -42, 15, -57, -94, -4, 79, 32, -124, -3, -5, -100, -67, 126, -29, -81, 85, -105, 25, 73, -1, -57, 8, -72, -30, 111, -78, 47, 47, 23, -4, 109, -10, 112, -123, -47, 20, -94, -62, -16, 44, 43, 76, -123, -95, -96, 98, -66, 116, -122, 29, 89, 120, -41, 54, -115, 40, 81, 68, -45, 102, -12, -119, -109, 33, -94, -18, -78, 111, -127, -81, 44, -121, 72, -5, -61, -81, 84, -81, -80, -89, 12, 
60, -10, -22, 25, -37, -93, -113, -80, 51, -80, 35, -2, -70, -65, 86, -93, -57, 72, 1, -108, 36, 125, 79, 125, 85, -90, 103, -89, -82, 74, 90, -58, 56, -79, -10, -89, 44, -29, 87, -1, 13, 35, -70, 2, -80, 50, -46, -73, 16, -8, -73, 8, 55, -20, -7, -57, 78, 98, -121, 118, -106, -70, 70, -104, 6, -102, 74, -125, -50, -74, 82, 100, -71, 87, -64, -77, 64, -125, -1, 1, -72, -104, -67, 126, 84, 103]
To my surprise, the page displays successfully in Chrome and Safari, but does not work in javafx WebView.
Is it a bug in the inflate method, or am I doing something wrong?
It didn't solve the problem with javafx WebView, but using the Inflater constructor with the nowrap option got rid of the "incorrect header check" error:
Inflater inf = new Inflater(true);
InputStream stream = new InflaterInputStream(responseContent, inf);
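The difference the nowrap flag makes can be reproduced without any HTTP involved. The sketch below compresses a short string as raw deflate data (no zlib header) and then inflates it both ways; it assumes the server in question sends raw deflate, which the fix above suggests but does not prove. The class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;
import java.util.zip.ZipException;

class RawDeflateDemo {
    /** Compresses data as raw deflate, i.e. without the zlib header and checksum. */
    static byte[] rawDeflate(byte[] data) throws IOException {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out, deflater)) {
            dos.write(data);
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    /** Inflates compressed data; nowrap = true expects raw deflate, false expects zlib. */
    static byte[] inflate(byte[] compressed, boolean nowrap) throws IOException {
        Inflater inflater = new Inflater(nowrap);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed), inflater)) {
            byte[] chunk = new byte[4096];
            int len;
            while ((len = in.read(chunk)) > 0) {
                out.write(chunk, 0, len);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = rawDeflate("hello hello hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(inflate(raw, true), StandardCharsets.UTF_8));
        try {
            inflate(raw, false); // a zlib header is expected but not present
        } catch (ZipException e) {
            System.out.println("default Inflater failed: " + e.getMessage());
        }
    }
}
```

Unlike the fixed 65536-byte array in the question, collecting the output in a ByteArrayOutputStream avoids returning trailing zero bytes or truncating larger responses.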
