GZIP outputs and sizes between C#/dot Net and Java


I am testing the feasibility of compressing some messaging between Java and C#.
The messages range from small strings (40 bytes) to larger strings (4 KB).
I have found differences between the output of the Java GZIP implementation and that of the .NET GZIP implementation.
I'm guessing that .NET writes a larger header, which causes the large overhead.
I prefer the Java implementation as it works better on small strings, and would like .NET to achieve similar results.
Output, Java version 1.6.0_10
Text:EncodeDecode
Bytes:(12 bytes)RW5jb2RlRGVjb2Rl <- Base64
Compressed:(29)H4sIAAAAAAAAAHPNS85PSXVJBZEAd9jYdgwAAAA=
Decompressed:(12)RW5jb2RlRGVjb2Rl
Converted:EncodeDecode
Text:EncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecode
Bytes:(120)RW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2Rl
Compressed:(33)H4sIAAAAAAAAAHPNS85PSXVJBZGudGQDAOcKnrd4AAAA
Decompressed:(120)RW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2Rl
Converted:EncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecode
Output, dot Net 2.0.50727
Text:EncodeDecode
Bytes:(12)RW5jb2RlRGVjb2Rl
Compressed:(128)H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/Ik6X02qWP83x7/8Dd9jYdgwAAAA=
Decompressed:(12)RW5jb2RlRGVjb2Rl
Text:EncodeDecode
Text:EncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecode
Bytes:(120)RW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2Rl
Compressed:(131)H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcjKasqgcplVmVdZhZAzO2dvPfee++999577733ujudTif33/8/XGZkAWz2zkrayZ4hgKrIHz9+fB8/Ik6X02qWP83x7w/z9/8H5wqet3gAAAA=
Decompressed:(120)RW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2RlRW5jb2RlRGVjb2Rl
Text:EncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecodeEncodeDecode
How can I achieve the smaller encoding on the .NET side?
Note: the Java implementation can decode the .NET output, and
the .NET implementation can decode the Java output.
Java Code
@Test
public void testEncodeDecode()
{
final String strTitle = "EncodeDecode";
try
{
debug( "Text:" + strTitle );
byte[] ba = strTitle.getBytes( "UTF-8" );
debug( "Bytes:" + toString( ba ) );
byte[] eba = encode_GZIP( ba );
debug( "Encoded:" + toString( eba ) );
byte[] ba2 = decode_GZIP( eba );
debug( "Decoded:" + toString( ba2 ) );
debug( "Converted:" + new String( ba2, "UTF-8" ) );
}
catch( Exception ex ) { fail( ex ); }
}
@Test
public void testEncodeDecode2()
{
final String strTitle = "EncodeDecode";
try
{
StringBuilder sb = new StringBuilder();
for( int i = 0 ; i < 10 ; i++ ) sb.append( strTitle );
debug( "Text:" + sb.toString() );
byte[] ba = sb.toString().getBytes( ENCODING );
debug( "Bytes:" + toString( ba ) );
byte[] eba = encode_GZIP( ba );
debug( "Encoded:" + toString( eba ) );
byte[] ba2 = decode_GZIP( eba );
debug( "Decoded:" + toString( ba2 ) );
debug( "Converted:" + new String( ba2, ENCODING ) );
}
catch( Exception ex ) { fail( ex ); }
}
private String toString( byte[] ba )
{
return "("+ba.length+")"+Base64.byteArrayToBase64( ba );
}
protected static byte[] encode_GZIP( byte[] baData ) throws IOException
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ByteArrayInputStream bais = new ByteArrayInputStream( baData );
GZIPOutputStream zos = new GZIPOutputStream( baos );
byte[] baBuf = new byte[ 1024 ];
int nSize;
while( -1 != ( nSize = bais.read( baBuf ) ) )
{
zos.write( baBuf, 0, nSize );
zos.flush();
}
Utilities.closeQuietly( zos );
Utilities.closeQuietly( bais );
return baos.toByteArray();
}
protected static byte[] decode_GZIP( byte[] baData ) throws IOException
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ByteArrayInputStream bais = new ByteArrayInputStream( baData );
GZIPInputStream zis = new GZIPInputStream( bais );
byte[] baBuf = new byte[ 1024 ];
int nSize;
while( -1 != ( nSize = zis.read( baBuf ) ) )
{
baos.write( baBuf, 0, nSize );
baos.flush();
}
Utilities.closeQuietly( zis );
Utilities.closeQuietly( bais );
return baos.toByteArray();
}
private void debug( Object o ) { System.out.println( o ); }
private void fail( Exception ex )
{
ex.printStackTrace();
Assert.fail( ex.getMessage() );
}
dot Net Code
[Test]
public void TestJava6()
{
string strData = "EncodeDecode";
Console.WriteLine("Text:" + strData);
byte[] baData = Encoding.UTF8.GetBytes(strData);
Console.WriteLine("Bytes:" + toString(baData));
byte[] ebaData2 = encode_GZIP(baData);
Console.WriteLine("Encoded:" + toString(ebaData2));
byte[] baData2 = decode_GZIP(ebaData2);
Console.WriteLine("Decoded:" + toString(baData2));
Console.WriteLine("Text:" + Encoding.UTF8.GetString(baData2));
}
[Test]
public void TestJava7()
{
string strData = "EncodeDecode";
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 10; i++) sb.Append(strData);
Console.WriteLine("Text:" + sb.ToString());
byte[] baData = Encoding.UTF8.GetBytes(sb.ToString());
Console.WriteLine("Bytes:" + toString(baData));
byte[] ebaData2 = encode_GZIP(baData);
Console.WriteLine("Encoded:" + toString(ebaData2));
byte[] baData2 = decode_GZIP(ebaData2);
Console.WriteLine("Decoded:" + toString(baData2));
Console.WriteLine("Text:" + Encoding.UTF8.GetString(baData2));
}
public string toString(byte[] ba)
{
return "(" + ba.Length + ")" + Convert.ToBase64String(ba);
}
protected static byte[] decode_GZIP(byte[] ba)
{
MemoryStream writer = new MemoryStream();
using (GZipStream zis = new GZipStream(new MemoryStream(ba), CompressionMode.Decompress))
{
Utilities.CopyStream(zis, writer);
}
return writer.ToArray();
}
protected static byte[] encode_GZIP(byte[] ba)
{
using (MemoryStream reader = new MemoryStream(ba))
{
MemoryStream writer = new MemoryStream();
using (GZipStream zos = new GZipStream(writer, CompressionMode.Compress))
{
Utilities.CopyStream(reader, zos);
}
return writer.ToArray();
}
}

This is one of several bugs in the .NET gzip code. That code should be avoided; use DotNetZip instead. See the answer to "Why does my C# gzip produce a larger file than Fiddler or PHP?".
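The question's guess about a larger header can be checked directly from the Base64 output it quotes. The sketch below is only an illustration: the class name GzipHeaderPeek is hypothetical, and it assumes the standard layouts from RFC 1952 (gzip header) and RFC 1951 (DEFLATE block header). It decodes the first 16 Base64 characters of each 12-byte sample and reports the header flags and the type of the first compressed block.

```java
import java.util.Base64;

// Hypothetical helper for inspecting the start of a gzip stream.
class GzipHeaderPeek {
    static byte[] decode(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    // FLG is byte 3. When it is 0 there are no optional fields,
    // so the gzip header is exactly 10 bytes long.
    static int flags(byte[] gz) {
        return gz[3] & 0xFF;
    }

    // XFL is byte 8 (compression hints written by the encoder).
    static int extraFlags(byte[] gz) {
        return gz[8] & 0xFF;
    }

    // Assuming a 10-byte header, byte 10 starts the DEFLATE data.
    // Bit 0 is BFINAL; bits 1-2 are BTYPE:
    // 0 = stored, 1 = fixed Huffman, 2 = dynamic Huffman.
    static int blockType(byte[] gz) {
        return (gz[10] >> 1) & 0x3;
    }

    static String describe(String label, byte[] gz) {
        return label + ": FLG=" + flags(gz)
                + " XFL=" + extraFlags(gz)
                + " BTYPE=" + blockType(gz);
    }

    public static void main(String[] args) {
        // Prefixes copied from the "Compressed:" lines in the question.
        System.out.println(describe("Java", decode("H4sIAAAAAAAAAHPN")));
        System.out.println(describe(".NET", decode("H4sIAAAAAAAEAO29")));
    }
}
```

For these two samples FLG is 0 in both, so both headers are the same 10 bytes, and both end with the usual 8-byte trailer. That leaves 11 bytes of compressed data in the Java output (29 - 18) against 110 bytes in the .NET output (128 - 18). The block type differs as well: fixed Huffman for Java, dynamic Huffman for .NET, which suggests the overhead comes from the Huffman table the .NET encoder writes rather than from the header.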

Related

How to write an array of bytes in zip and then read it from there

Sorry for my English.
I need to compress an array of bytes (I do it through zip), but I do not use files, channels, or buffers.
After that I need to unzip that array into another array.
I did something like this, but it doesn't work:
public class Main {
public static void main(String[] args) {
byte[] b = "Help me please".getBytes();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try {
baos.write(b);
} catch (IOException e) {
e.printStackTrace();
}
try (ZipOutputStream zos = new ZipOutputStream(baos)){
ZipEntry out = new ZipEntry("1");
zos.putNextEntry(out);
zos.closeEntry();
}
catch (IOException e){
e.printStackTrace();
}
byte[] a = baos.toByteArray(); //compressed array
ByteArrayInputStream bais = new ByteArrayInputStream(a);
try(ZipInputStream zis = new ZipInputStream(bais)){
System.out.println('1');
byte[]c = zis.readAllBytes();
zis.closeEntry();
System.out.println(c.equals(b));
}
catch (IOException e){
e.printStackTrace();
}
}
}

The following worked for me. Note that I open the Zip file stream first, then I open the entry, then I write the bytes. It has to go in that order or it doesn't work.
public class ZipFileTest {
public static void main( String[] args ) throws IOException {
byte[] b = "Help me please".getBytes( "UTF-8" );
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try( ZipOutputStream zos = new ZipOutputStream( baos ) ) {
ZipEntry out = new ZipEntry( "1 First" );
zos.putNextEntry( out );
zos.write( b, 0, b.length );
zos.closeEntry();
}
byte[] a = baos.toByteArray(); //compressed array
ByteArrayInputStream bais = new ByteArrayInputStream( a );
try( ZipInputStream zis = new ZipInputStream( bais ) ) {
for( ZipEntry zipe; (zipe = zis.getNextEntry()) != null; ) {
byte[] data = new byte[1024];
int length = zis.read( data, 0, data.length );
System.out.println( "Entry: " + zipe.toString() );
System.out.println( "Data: " + new String( data, 0, length, "UTF-8" ) );
zis.closeEntry();
}
}
}
}
Output:
run:
Entry: 1 First
Data: Help me please
BUILD SUCCESSFUL (total time: 0 seconds)
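A single call to read is not guaranteed to return a whole entry, so for larger payloads it may be safer to loop until read returns -1. The following is a minimal sketch of the same round trip with that loop; the class and method names (ZipRoundTrip, zip, unzipFirstEntry) are made up for illustration. It also compares the arrays with Arrays.equals, since calling equals on a byte[] only compares references.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: zips one byte array in memory and reads it back.
class ZipRoundTrip {
    static byte[] zip(String entryName, byte[] data) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Order matters: open the zip stream, open the entry, then write.
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(data, 0, data.length);
            zos.closeEntry();
        } catch (IOException e) {
            // Not expected for in-memory streams; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    static byte[] unzipFirstEntry(byte[] zipped) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            if (zis.getNextEntry() == null) {
                throw new IOException("archive has no entries");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            // Keep reading until the end of the current entry.
            while ((n = zis.read(buf, 0, buf.length)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "Help me please".getBytes(StandardCharsets.UTF_8);
        byte[] restored = unzipFirstEntry(zip("1 First", original));
        System.out.println(Arrays.equals(original, restored));
    }
}
```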

Retrieve data via UART

I am using UART to get input to my system from a PIC development board, and I use the following code to read the values from the board.
public SerialReader( InputStream in ) {
this.in = in;
}
public void run() {
byte[] buffer = new byte[ 1024 ];
int len = -1;
try {
/*len = this.in.read(buffer);
int x = buffer[0] & 0xff;
System.out.println(buffer[0]);
System.out.println(x);
System.out.println((char)x);*/
while( ( len = this.in.read( buffer ) ) > -1 ) {
System.out.print( new String( buffer, 0, len ) );
System.out.println(len);
/*String s = new String(buffer); //buffer.toString(); 1
System.out.println(s);
for (int i=0; i<=buffer.length; i++)
System.out.println(buffer[i]);
//System.out.print( new String( buffer, 0, len ) );
*/ }
} catch( IOException e ) {
e.printStackTrace();
}
}
}
Output: ààà
Expected Output: #C01=0155,INT=16:11,OUT=05:11
How do I retrieve the expected output?

Java PHP FastCGI SocketException: Connection Reset on reading data

I'm trying to write a php-cgi connection for the Java webserver I'm developing but it's not really working.
I'm currently trying to write a fastcgi client using this php client as an example https://github.com/adoy/PHP-FastCGI-Client/blob/master/src/Adoy/FastCGI/Client.php
I somehow managed to get php-cgi to parse php files from my request. However, only about 1 in 4 requests sorta kinda succeeds and then fails when I'm trying to read more data:
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at com.bit_stab.fastcgi.client.Packet.<init>(Packet.java:26)
at com.bit_stab.fastcgi.client.Client.readResponse(Client.java:51)
at com.bit_stab.webdragonplugin.php.PHPPlugin.runPhpCgi(PHPPlugin.java:98)
at com.bit_stab.webdragonplugin.php.PHPPlugin.main(PHPPlugin.java:42)
The rest just fail:
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at com.bit_stab.fastcgi.client.Packet.<init>(Packet.java:37)
at com.bit_stab.fastcgi.client.Client.readResponse(Client.java:46)
at com.bit_stab.webdragonplugin.php.PHPPlugin.runPhpCgi(PHPPlugin.java:98)
at com.bit_stab.webdragonplugin.php.PHPPlugin.main(PHPPlugin.java:42)
I'm currently running php-cgi from the command prompt with php-cgi -b 127.0.0.1:8091, and I'm using this code to test:
public static void main(String[] args)
{
try
{
HashMap<String, String> map = new HashMap<String, String>();
map.put( "DOCUMENT_ROOT" , "D:/Programma's/Eclipse/Workspaces/Java/HTTPWebServer/test root" );
map.put( "SCRIPT_FILENAME" , "D:/Programma's/Eclipse/Workspaces/Java/HTTPWebServer/test root/index.php" );
map.put( "SCRIPT_NAME" , "/index.php" );
map.put( "DOCUMENT_URI" , "/index.php" );
map.put( "REQUEST_METHOD" , "GET" );
map.put( "SERVER_PROTOCOL" , "HTTP/1.1" );
map.put( "REDIRECT_STATUS" , "200" );
map.put( "PHP_SELF" , "/index.php" );
map.put( "HOME" , "D:/Programma's/Eclipse/Workspaces/Java/HTTPWebServer/test root" );
map.put( "FCGI_ROLE" , "RESPONDER" );
map.put( "HTTP_CONNECTION" , "keep-alive" );
Client c = new Client( "127.0.0.1" , 8090 );
c.asyncRequest( map , "GET /index.php HTTP/1.1\r\nConnection: Keep-Alive\r\n\r\n" );
c.readResponse();
}
catch( Exception e )
{
// TODO Auto-generated catch block
e.printStackTrace();
}
}
This is Client.java
package com.bit_stab.fastcgi.client;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.Map.Entry;
public class Client
{
private Socket socket;
private short reqId = 0b0; //TODO singleton requestID counter
public Client( String host, int port ) throws UnknownHostException, IOException
{
socket = new Socket( host, port );
}
public short asyncRequest( Map<String, String> params, String content ) throws IOException
{
ByteArrayOutputStream paramBytes = new ByteArrayOutputStream();
for ( Entry<String, String> param: params.entrySet() )
paramBytes.write( nvpair( param.getKey() , param.getValue() ) );
Packet beginRequest = new Packet( (byte) 1, reqId, new byte[] { 0, 1, 1, 0, 0, 0, 0, 0 } );
Packet requestParams = new Packet( (byte) 4, reqId, paramBytes.toByteArray() );
Packet requestContent = new Packet( (byte) 5, reqId, content.getBytes() );
OutputStream stream = socket.getOutputStream();
stream.write( beginRequest.getBytes() );
stream.write( requestParams.getBytes() );
stream.write( requestContent.getBytes() );
return reqId++;
}
public void readResponse() throws IOException
{
InputStream stream = socket.getInputStream();
Packet response = new Packet( stream );
System.out.println( new String( response.getContent() ) );
Packet p;
while ( ( p = new Packet( stream ) ).getType() != 3 )
System.out.println( new String( p.getContent() ) );
}
public byte[] nvpair( String name, String value )
{
try
{
int nl = name.length();
int vl = value.length();
ByteArrayOutputStream bytes = new ByteArrayOutputStream( nl + vl + 10 );
if ( nl < 256 )
bytes.write( (byte) nl );
else
bytes.write( new byte[] { b( nl >> 24 ), b( nl >> 16 ), b( nl >> 8 ), b( nl ) } );
if ( vl < 256 )
bytes.write( (byte) vl );
else
bytes.write( new byte[] { b( vl >> 24 ), b( vl >> 16 ), b( vl >> 8 ), b( vl ) } );
bytes.write( name.getBytes( "UTF-8" ) );
bytes.write( value.getBytes( "UTF-8" ) );
return bytes.toByteArray();
}
catch( IOException e )
{
e.printStackTrace();
}
return null;
}
public byte b( int i )
{
return (byte) i;
}
}
and this is Packet.java
package com.bit_stab.fastcgi.client;
import java.io.IOException;
import java.io.InputStream;
public class Packet
{
private byte version = 1;
private byte type;
private short requestId;
private byte paddingLength = 0;
private byte reserved = 0;
private byte[] content;
public Packet( byte type, short requestId, byte... content )
{
this.type = type;
this.requestId = requestId;
this.content = content;
}
public Packet( InputStream stream ) throws IOException
{
byte[] head = new byte[8];
stream.read( head );
this.version = head[0];
this.type = head[1];
this.requestId = (short)( ( ( head[2] & 0xFF ) << 8 ) | ( head[3] & 0xFF ) );
int contentLength = ( ( ( head[4] & 0xFF ) << 8 ) | ( head[5] & 0xFF ) );
this.paddingLength = head[6];
this.reserved = head[7];
this.content = new byte[contentLength];
stream.read( content );
stream.skip( paddingLength & 0xFF );
}
public byte getType()
{
return type;
}
public short getId()
{
return requestId;
}
public byte[] getContent()
{
return content;
}
public byte[] getBytes()
{
byte[] b = new byte[8 + content.length];
b[0] = version;
b[1] = type;
b[2] = (byte) ( requestId >> 8 );
b[3] = (byte) requestId;
b[4] = (byte) ( content.length >> 8 );
b[5] = (byte) content.length;
b[6] = paddingLength;
b[7] = reserved;
for ( int i = 0; i < content.length; i++ )
b[i + 8] = content[i];
return b;
}
}
I'm using Java 8 and an unedited PHP 5.6.1 from windows.php.net
What's going wrong here and how can I fix it?

I found out what it was: I was sending content without a content length, and php-cgi didn't like that.
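One way to read this fix is sketched below. It assumes the record layout and name-value encoding from the FastCGI 1.0 specification; the class FcgiRecords and its methods are hypothetical and are not part of the asker's Client or Packet classes. The sketch sends CONTENT_LENGTH among the params and closes both the FCGI_PARAMS and FCGI_STDIN streams with an empty record. It also uses request id 1, since id 0 is reserved for management records, and sets the high bit on 4-byte lengths, which apply to anything longer than 127 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building FastCGI records in memory.
class FcgiRecords {
    static final int FCGI_PARAMS = 4;
    static final int FCGI_STDIN = 5;

    // Lengths up to 127 take one byte; longer ones take four bytes
    // with the high bit of the first byte set.
    static void writeLength(ByteArrayOutputStream out, int len) {
        if (len < 128) {
            out.write(len);
        } else {
            out.write((len >> 24) | 0x80);
            out.write(len >> 16);
            out.write(len >> 8);
            out.write(len);
        }
    }

    static byte[] nvPair(String name, String value) {
        byte[] n = name.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLength(out, n.length);
        writeLength(out, v.length);
        out.write(n, 0, n.length);
        out.write(v, 0, v.length);
        return out.toByteArray();
    }

    // 8-byte header followed by the content (at most 65535 bytes, no padding).
    static byte[] record(int type, int requestId, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(1);                    // version
        out.write(type);
        out.write(requestId >> 8);
        out.write(requestId);
        out.write(content.length >> 8);
        out.write(content.length);
        out.write(0);                    // padding length
        out.write(0);                    // reserved
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = "a=1".getBytes(StandardCharsets.UTF_8); // made-up request body
        byte[] params = nvPair("CONTENT_LENGTH", String.valueOf(body.length));
        ByteArrayOutputStream request = new ByteArrayOutputStream();
        for (byte[] r : new byte[][] {
                record(FCGI_PARAMS, 1, params),
                record(FCGI_PARAMS, 1, new byte[0]),   // end of params
                record(FCGI_STDIN, 1, body),
                record(FCGI_STDIN, 1, new byte[0]) }) { // end of stdin
            request.write(r, 0, r.length);
        }
        System.out.println(request.size() + " bytes after the begin-request record");
    }
}
```

On the reading side, InputStream.read may return fewer bytes than requested, so wrapping the socket stream in a DataInputStream and calling readFully for the header and the content is usually more robust than a single read.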

MD5 being returned is the same even after digesting the files - Java

I wrote the following Java method to read all the entries of a ZipInputStream file and process its MD5 based on FILE CONTENT ONLY. Inside my class Tczip I have:
public String digest( ZipInputStream entry ) throws IOException{
byte[] digest = null;
MessageDigest md5 = null;
String mdEnc = "";
ZipEntry current;
try {
md5 = MessageDigest.getInstance( "MD5" );
if( entry != null ) {
while(( current = entry.getNextEntry() ) != null ) {
if( current.isDirectory() ) {
digest = this.encodeUTF8( current.getName() );
md5.update( digest );
}
else{
int size = ( int )current.getSize();
if(size > 0){
digest = new byte[ size ];
entry.read( digest, 0, size );
md5.update( digest );
}
}
}
digest = md5.digest();
mdEnc = new BigInteger( 1, md5.digest() ).toString( 16 );
entry.close();
}
}
catch ( NoSuchAlgorithmException e ) {
// TODO Auto-generated catch block
e.printStackTrace();
}
catch (IllegalArgumentException ex){
System.out.println("There is an illegal encoding.");
//
// The fix for Korean/Chinese/Japanese encodings goes here
//
Charset encoding = Charset.forName("utf-8");
ZipInputStream zipinputstream =
new ZipInputStream(new FileInputStream( this.filename ), encoding);
digest = new byte[ 1024 ];
current = zipinputstream.getNextEntry();
while (current != null) { //for each entry to be extracted
String entryName = current.getName();
System.out.println("Processing: " + entryName);
int n;
FileOutputStream fileoutputstream =
new FileOutputStream( this.filename );
while (( n = zipinputstream.read( digest, 0, 1024 )) > -1) {
fileoutputstream.write(digest, 0, n);
}
fileoutputstream.close();
zipinputstream.closeEntry();
current = zipinputstream.getNextEntry();
}//while
zipinputstream.close();
}
return mdEnc;
}
public byte[] encodeUTF8( String name ) {
final Charset UTF8_CHARSET = Charset.forName( "UTF-8" );
return name.getBytes( UTF8_CHARSET );
}
Then the program would go over a root directory (aka C:\workspace\path\to\source\code ), iterating over all the directories, looking for .zip files to be processed. These files go into File[] files:
public void showFiles( File[] files ){
for( File file : files ){
if( file.isDirectory() ) {
showFiles( file.listFiles( this.filter ) );
}
else {
try {
String path = file.getCanonicalPath();
String relative = path.replace("tc10.0.0.2012080100_A", "tc10.0.0.2012080600_C" );
File b = new File(relative);
if( b.exists() ) {
System.out.println( "Processing :" + file.getName() );
this.zip_a = new Tczip( path );
this.zip_b = new Tczip( relative );
String md5_a = this.zip_a.digest();
String md5_b = this.zip_b.digest();
System.out.println("MD5 A: " + md5_a);
System.out.println("MD5 B: " + md5_b);
if( md5_a.equals( md5_b )){
System.out.println( "They Match" );
}
else {
System.out.println( "They don't Match" );
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
So I wanted to process the MD5 of all those zip files, and compare if they do match: TWO EQUAL(IN CONTENT) ZIP FILES ARE EXPECTED TO HAVE THE SAME MD5. If the file content is not the same, then the MD5 would be different.
However, when I execute the program, I have:
Processing :web.zip
MD5 A: d41d8cd98f00b204e9800998ecf8427e
MD5 B: d41d8cd98f00b204e9800998ecf8427e
They Match
Processing :weldmgmt_icons.zip
MD5 A: d41d8cd98f00b204e9800998ecf8427e
MD5 B: d41d8cd98f00b204e9800998ecf8427e
They Match
Processing :weldmgmt_install.zip
MD5 A: d41d8cd98f00b204e9800998ecf8427e
MD5 B: d41d8cd98f00b204e9800998ecf8427e
They Match
Processing :weldmgmt_template.zip
MD5 A: d41d8cd98f00b204e9800998ecf8427e
MD5 B: d41d8cd98f00b204e9800998ecf8427e
They Match
Why do they all have the same MD5? I expect two matching files to have the same MD5, but not all of the files. Any suggestions? What am I doing wrong?

I believe the implementation fails in the following lines of code:
while(( current = entry.getNextEntry() ) != null ) {
if( current.isDirectory() ) {
digest = this.encodeUTF8( current.getName() );
md5.update( digest );
}
else{
int size = ( int )current.getSize();
if(size > 0){
digest = new byte[ size ];
entry.read( digest, 0, size );
md5.update( digest );
}
}
}
Looking at the API, a call to entry.getNextEntry() returns the next file to be processed. However, you are discarding that value if it is not a directory. So it would stand to reason that the hash is the same, because you are processing the same file in your entry.read line every time.
UPDATE
To fix this you should be able to do something along the lines of entry = entry.getNextEntry();
Or, to make it less confusing for others, do something like: currentEntry = entry.getNextEntry();
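It may also help to know that d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty input. That is what a second call to md5.digest() returns, because the first call resets the digest, and the question's code calls it twice in a row. The sketch below is an illustration only, with hypothetical names (ZipContentDigest, md5OfEntries, md5OfZipBytes, zipOf). It reads each entry until read returns -1 instead of relying on getSize(), and it calls digest() exactly once.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: MD5 over directory names and file contents of a zip.
class ZipContentDigest {
    static String md5OfEntries(InputStream zipData)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (ZipInputStream zis = new ZipInputStream(zipData)) {
            byte[] buf = new byte[8192];
            ZipEntry current;
            while ((current = zis.getNextEntry()) != null) {
                if (current.isDirectory()) {
                    md5.update(current.getName().getBytes(StandardCharsets.UTF_8));
                    continue;
                }
                int n;
                // Read the whole entry; getSize() may be -1 while streaming.
                while ((n = zis.read(buf, 0, buf.length)) != -1) {
                    md5.update(buf, 0, n);
                }
            }
        }
        // digest() resets the MessageDigest, so call it exactly once.
        return String.format("%032x", new BigInteger(1, md5.digest()));
    }

    // In-memory convenience wrapper that rethrows failures unchecked.
    static String md5OfZipBytes(byte[] zipBytes) {
        try {
            return md5OfEntries(new ByteArrayInputStream(zipBytes));
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a one-entry zip in memory, for demonstration only.
    static byte[] zipOf(String name, String content) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry(name));
            byte[] data = content.getBytes(StandardCharsets.UTF_8);
            zos.write(data, 0, data.length);
            zos.closeEntry();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(md5OfZipBytes(zipOf("a.txt", "first")));
        System.out.println(md5OfZipBytes(zipOf("a.txt", "second")));
    }
}
```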

Java Strings breaking file operations

I have a glitch in my program that causes a question mark (\u003f) to appear in the sixth (index = 5) slot in strings when I encrypt them. Normally, this is reversed upon decryption. However, it is not reversed if I save the string to a file first. I have determined that when I save a string containing Unicode characters to a file, I will not be able to determine the correct length for the file. I have managed to reproduce the glitch in the following function...
public static void testFileIO(String[] args)
{
System.out.println("TESTING FILE IO FUNCTIONS...");
try
{
String filename = "test.txt";
String testString = "UB\u4781ERBLAH\u037f\u8746";
System.out.println("Output: " + testString);
FileWriter fw = new FileWriter(filename);
fw.write(testString);
fw.close();
FileReader fr = new FileReader(filename);
int length;
for(length = 0; fr.read() != -1; length++);
if(length != testString.length())
System.out.println("Failure on file length measurement.");
fr.close();
fr = new FileReader(filename);
char[] buffer = new char[length];
fr.read(buffer);
String result = new String(buffer);
fr.close();
System.out.println("Input: " + result);
if(result.equals(testString)) System.out.println("SUCCESS!");
else System.out.println("FAILURE.");
}
catch (Throwable e)
{
e.printStackTrace();
System.out.println("FAILURE.");
return;
}
}
As an additional note, a failure in file length measurement is also caught.
Here is the Crypto class that I use to encrypt and decrypt Strings...
abstract public class Crypto
{
/**
* Encrypt the plaintext with a bitwise xor cipher
* @param plainText The plaintext to encrypt
* @param key The key for the bitwise xor cipher
* @return Ciphertext yielded by given plaintext and key
*/
public static String encrypt(String plainText, String key)
{
char[] data = plainText.toCharArray();
Random rand = new Random();
rand.setSeed(key.hashCode());
char[] pass = new char[data.length];
for(int i = 0; i < pass.length; i++)
{
pass[i] = (char)rand.nextInt();
}
for(int i = 0; i < data.length; i++)
{
data[i] ^= pass[i % pass.length];
}
return new String(data);
}
/**
* Decrypt an encrypted message using the same key as for encryption
* @param cipherText The cipherText message to be deciphered
* @param key The seed for the random generator to get the right keys
* @return The plaintext message corresponding to 'cipherText'
*/
public static String decrypt(String cipherText, String key)
{
char[] data = cipherText.toCharArray();
Random rand = new Random();
rand.setSeed(key.hashCode());
char[] pass = new char[data.length];// = key.getBytes("ASCII");
for(int i = 0; i < pass.length; i++)
{
pass[i] = (char)rand.nextInt();
}
for(int i = 0; i < data.length; i++)
{
data[i] ^= pass[i % pass.length];
}
return new String(data);
}
}

The code is correct but almost never works. As a rule of thumb, avoid FileReader and FileWriter and build your own readers/writers using InputStreamReader and OutputStreamWriter, which allow you to specify the encoding to use (and hence how to protect 16-bit Unicode characters when you write 8-bit data).
I use a helper class for this because I need it all the time:
private static final String FILE = "file";
private static final String CHARSET = "charset";
public static BufferedReader createReader( File file, Encoding charset ) throws IOException {
JavaUtils.notNull( FILE, file );
JavaUtils.notNull( CHARSET, charset );
FileInputStream stream = null;
try {
stream = new FileInputStream( file );
return createReader( stream, charset );
} catch( IOException e ) {
IOUtils.closeQuietly( stream );
throw e;
} catch( RuntimeException e ) {
IOUtils.closeQuietly( stream );
throw e;
}
}
public static BufferedReader createReader( InputStream stream, Encoding charset ) throws IOException {
JavaUtils.notNull( "stream", stream );
JavaUtils.notNull( "charset", charset );
try {
return new BufferedReader( new InputStreamReader( stream, charset.encoding() ) );
} catch( UnsupportedEncodingException e ) {
IOUtils.closeQuietly( stream );
throw new UnknownEncodingException( charset, e );
} catch( RuntimeException e ) {
IOUtils.closeQuietly( stream );
throw e;
}
}
public static BufferedWriter createWriter( File file, Encoding charset ) throws IOException {
JavaUtils.notNull( FILE, file );
JavaUtils.notNull( CHARSET, charset );
FileOutputStream stream = null;
try {
stream = new FileOutputStream( file );
return new BufferedWriter( new OutputStreamWriter( stream, charset.encoding() ) );
} catch( UnsupportedEncodingException e ) {
IOUtils.closeQuietly( stream );
throw new UnknownEncodingException( charset, e );
} catch( IOException e ) {
IOUtils.closeQuietly( stream );
throw e;
} catch( RuntimeException e ) {
IOUtils.closeQuietly( stream );
throw e;
}
}
The type Encoding is an interface which I implement using one or more enums:
public interface Encoding {
String encoding();
Charset charset();
}
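For comparison, here is a shorter sketch of the same idea without the helper types, using StandardCharsets.UTF_8 directly. The class Utf8FileRoundTrip and its methods are hypothetical names. It writes the test string from the question to a temporary file and reads it back with an explicit charset.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes and reads text with an explicit charset.
class Utf8FileRoundTrip {
    static void write(Path file, String text) throws IOException {
        try (Writer w = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8))) {
            w.write(text);
        }
    }

    static String read(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new BufferedReader(new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.UTF_8))) {
            char[] buf = new char[1024];
            int n;
            // Read until end of stream instead of pre-measuring the length.
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Writes to a temporary file, reads it back, and cleans up.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("roundtrip", ".txt");
            try {
                write(tmp, text);
                return read(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String testString = "UB\u4781ERBLAH\u037f\u8746";
        System.out.println(testString.equals(roundTrip(testString)) ? "SUCCESS!" : "FAILURE.");
    }
}
```

An explicit charset only helps when the string is valid text. The XOR cipher in the question can produce arbitrary char values, including unpaired surrogates, which an encoder cannot represent and replaces on output; that may be where the question mark comes from. Storing the ciphertext as bytes, or as Base64 text, would avoid the problem.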
