We have a Client/Server application which communicates over RMI. The server sends HashMaps to the client. All works well, however when sending large HashMaps, transfer times can be slow.
Is there any way to compress the HashMaps before sending, then decompress them on the client? I do not want to create any files on disk whatsoever (everything must stay in RAM).
Thanks
You can use a DeflaterOutputStream wrapped around a ByteArrayOutputStream; however, you will end up with a byte[], so your RMI call should return a byte[].
Small Serializable objects won't compress well; however, if you have many Serializable objects, they can compress very well. So can large amounts of text.
The simplest thing to do is to try it. If there are repeated strings or even portions of strings, this will help compression.
public static void main(String... args) throws IOException {
Map<String, String> map = new HashMap<String, String>();
for(int i=0;i<1000;i++)
map.put(""+Math.random(), ""+Math.random());
byte[] bytes1 = toBytes(map);
byte[] bytes2 = toCompressedBytes(map);
System.out.println("HashMap with "+map.size()+" entries, Uncompressed length="+bytes1.length+", compressed length="+bytes2.length);
}
public static byte[] toCompressedBytes(Object o) throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(new DeflaterOutputStream(baos));
oos.writeObject(o);
oos.close();
return baos.toByteArray();
}
public static byte[] toBytes(Object o) throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.writeObject(o);
oos.close();
return baos.toByteArray();
}
public static Object fromCompressedBytes(byte[] bytes) throws IOException, ClassNotFoundException {
ObjectInputStream ois = new ObjectInputStream(new InflaterInputStream(new ByteArrayInputStream(bytes)));
return ois.readObject();
}
Prints
HashMap with 1000 entries, Uncompressed length=42596, compressed length=19479
Don't do anything to the HashMap. Instead, write a custom socket factory that compresses the data using a DeflaterOutputStream.
Many years ago I used to serialize objects into a byte array and then zip it. Zip is still supported by Java :) so try this method.
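A rough sketch of what such a socket factory might look like follows. The class names are hypothetical, and it assumes Java 7 or later so that DeflaterOutputStream can be created with syncFlush=true; without that, flush() may hold compressed data back and an RMI call could stall waiting for it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.rmi.server.RMISocketFactory;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical names; a sketch, not a hardened implementation.
class CompressedSocketFactory extends RMISocketFactory {

    // Socket whose streams transparently deflate/inflate the traffic.
    static class CompressedSocket extends Socket {
        private InputStream in;
        private OutputStream out;

        CompressedSocket() {
            super();
        }

        CompressedSocket(String host, int port) throws IOException {
            super(host, port);
        }

        @Override
        public synchronized InputStream getInputStream() throws IOException {
            if (in == null) {
                in = new InflaterInputStream(super.getInputStream());
            }
            return in; // always the same wrapper, so inflater state is kept
        }

        @Override
        public synchronized OutputStream getOutputStream() throws IOException {
            if (out == null) {
                // syncFlush = true: flush() pushes all pending compressed bytes
                out = new DeflaterOutputStream(super.getOutputStream(), true);
            }
            return out;
        }
    }

    // ServerSocket that hands out CompressedSocket instances from accept().
    static class CompressedServerSocket extends ServerSocket {
        CompressedServerSocket(int port) throws IOException {
            super(port);
        }

        @Override
        public Socket accept() throws IOException {
            Socket socket = new CompressedSocket();
            implAccept(socket);
            return socket;
        }
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return new CompressedSocket(host, port);
    }

    @Override
    public ServerSocket createServerSocket(int port) throws IOException {
        return new CompressedServerSocket(port);
    }
}
```

Both JVMs would need to install it, for example with RMISocketFactory.setSocketFactory(new CompressedSocketFactory()), before exporting or looking up any remote object; otherwise one side sends compressed bytes that the other cannot read.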
You may try a custom serialization mechanism for the elements inside the hashmap.
What kind of information are you sending? What do the objects inside look like?
Even with the default mechanism, marking all the unneeded attributes as transient will help.
Additionally, you may try serializing the data yourself into a ZipOutputStream before sending it, but I would leave that as a last resort, since binary content won't compress much.
EDIT
Since you're using only strings, you can create a wrapper whose custom serialization is a compressed array (pretty much as in Peter Lawrey's answer). Using custom serialization lets you encapsulate the serialization process and have it work somewhat "transparently" for RMI (RMI serialization would never know you're using a compressed version).
Here's a demo:
import java.io.*;
import java.util.*;
import java.util.zip.*;
public class MapDemo implements Serializable {
private Map<String,String> map = new HashMap<String,String>();
// only for demo/comparison purposes; by default you would always use compression
private boolean useCompression;
public MapDemo( Map<String,String> map , boolean compressed ) {
this.map = map;
this.useCompression = compressed;
}
// This is the custom serialization using compression
private void writeObject(ObjectOutputStream out) throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
OutputStream os = useCompression ? new DeflaterOutputStream( baos ) : baos;
ObjectOutputStream oos = new ObjectOutputStream( os );
oos.writeObject( this.map );
oos.close();
out.write( baos.toByteArray() );
}
}
class Main {
public static void main( String [] args ) throws IOException {
Map<String,String> regular = new HashMap<String,String>();
Map<String,String> compressed = new HashMap<String,String>();
Random r = new Random();
for( int i = 0 ; i < 100000 ; i++ ) {
String key = ""+r.nextInt(1000000);
String value = ""+r.nextInt(1000000) ;
// put the same info
compressed.put( key , value );
regular.put( key , value );
}
save( new MapDemo( compressed, true ) , "map.compressed");
save( new MapDemo( regular, false ) , "map.regular");
}
private static void save( Object o, String toFile ) throws IOException {
// This is similar to what RMI serialization would do behind scenes
ObjectOutputStream oos = new ObjectOutputStream( new FileOutputStream(toFile));
oos.writeObject( o );
oos.close();
}
}
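The demo above only shows the writing side. A minimal sketch of a wrapper with both directions could look like the following; CompressedMap is a hypothetical name, and the length prefix is an assumption added here so that readObject knows how many bytes to consume.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical wrapper, not part of the original demo.
class CompressedMap implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: the default mechanism never writes the map uncompressed
    private transient Map<String, String> map;

    CompressedMap(Map<String, String> map) {
        this.map = map;
    }

    Map<String, String> getMap() {
        return map;
    }

    // Called by Java serialization (and therefore by RMI) when writing.
    private void writeObject(ObjectOutputStream out) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new DeflaterOutputStream(baos))) {
            oos.writeObject(map);
        }
        byte[] compressed = baos.toByteArray();
        out.writeInt(compressed.length); // length prefix for the reader
        out.write(compressed);
    }

    // Called by Java serialization when reading; mirrors writeObject.
    @SuppressWarnings("unchecked")
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        byte[] compressed = new byte[in.readInt()];
        in.readFully(compressed);
        try (ObjectInputStream ois = new ObjectInputStream(
                 new InflaterInputStream(new ByteArrayInputStream(compressed)))) {
            map = (Map<String, String>) ois.readObject();
        }
    }
}
```

The remote method would then take or return a CompressedMap instead of a raw Map, and callers would use getMap() on the receiving side.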
Related
I asked this once before and my post was deleted for not providing the code that uses the helper class. This time I have created a full test suite which shows the exact problem.
I am of the opinion that Java's ZipInputStream breaks the Liskov Substitution Principle (LSP) with regards to the InputStream abstract class. For ZipInputStream to be a subtype of InputStream, then objects of type InputStream in a program may be replaced with objects of type ZipInputStream without altering any of the desirable properties of that program (correctness, task performed, etc.).
The way in which LSP is violated here is for the read methods.
InputStream.read(byte[], int, int) states that it returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
The problem with ZipInputStream is that it has modified the meaning of a -1 return value. It states:
the actual number of bytes read, or -1 if the end of the entry is reached
(there is actually a hint to a similar problem with the available method in the Android documentation http://developer.android.com/reference/java/util/zip/ZipInputStream.html)
Now for the code that demonstrates the problem. (This is a cut down version of what I was actually trying to do so please excuse any poor style, multithreading problems, or the fact that the stream is advanced etc.).
Class that accepts any InputStream to generate a SHA1 of the stream:
public class StreamChecker {
private byte[] lastHash = null;
public boolean isDifferent(final InputStream inputStream) throws IOException {
final byte[] hash = generateHash(inputStream);
final byte[] temp = lastHash;
lastHash = hash;
return !Arrays.equals(temp, hash);
}
private byte[] generateHash(final InputStream inputStream) throws IOException {
return DigestUtils.sha1(inputStream);
}
}
Unit tests:
public class StreamCheckerTest {
@Test
public void testByteArrayInputStreamIsSame() throws IOException {
final StreamChecker checker = new StreamChecker();
final byte[] bytes = "abcdef".getBytes();
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertFalse(checker.isDifferent(stream));
}
// Passes
}
@Test
public void testByteArrayInputStreamWithDifferentDataIsDifferent() throws IOException {
final StreamChecker checker = new StreamChecker();
byte[] bytes = "abcdef".getBytes();
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
bytes = "123456".getBytes();
try (final ByteArrayInputStream stream = new ByteArrayInputStream(bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
// Passes
}
@Test
public void testZipInputStreamIsSame() throws IOException {
final StreamChecker checker = new StreamChecker();
final byte[] bytes = "abcdef".getBytes();
try (final ZipInputStream stream = createZipStream("test", bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
try (final ZipInputStream stream = createZipStream("test", bytes)) {
Assert.assertFalse(checker.isDifferent(stream));
}
// Passes
}
@Test
public void testZipInputStreamWithDifferentEntryDataIsDifferent() throws IOException {
final StreamChecker checker = new StreamChecker();
byte[] bytes = "abcdef".getBytes();
try (final ZipInputStream stream = createZipStream("test", bytes)) {
Assert.assertTrue(checker.isDifferent(stream));
}
bytes = "123456".getBytes();
try (final ZipInputStream stream = createZipStream("test", bytes)) {
// Fails here
Assert.assertTrue(checker.isDifferent(stream));
}
}
private ZipInputStream createZipStream(final String entryName,
final byte[] bytes) throws IOException {
try (final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
final ZipOutputStream stream = new ZipOutputStream(outputStream)) {
stream.putNextEntry(new ZipEntry(entryName));
stream.write(bytes);
return new ZipInputStream(new ByteArrayInputStream(
outputStream.toByteArray()));
}
}
}
So back to the problem... LSP is violated since you can read to the end of the stream for an InputStream but not for a ZipInputStream and of course this will break the correctness property of any method that tries to use it in such a way.
Is there any way that this can be achieved or is ZipInputStream fundamentally flawed?
I see no LSP violation. The documentation for ZipInputStream.read(byte[], int, int) says 'Reads from the current ZIP entry into an array of bytes'.
At any one time, the ZipInputStream is really the input stream of the entry, not the whole ZIP file. And it's hard to see what else ZipInputStream.read() could possibly do at end of entry other than return -1.
this will break the correctness property of any method that tries to use it in such a way
Hard to see how the method would ever know.
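To illustrate the per-entry behaviour described in this answer, here is a small sketch (ZipEntries and readAll are hypothetical names) that positions the stream with getNextEntry() and treats -1 as the end of the current entry only. Note that ZipInputStream.read returns -1 straight away while no entry is open, which appears to be what happens in the question's helper, since it never calls getNextEntry().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration only.
class ZipEntries {

    // Reads every entry of a ZIP stream into memory, keyed by entry name.
    static Map<String, byte[]> readAll(InputStream source) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(source)) {
            ZipEntry entry;
            // getNextEntry() positions the stream at the start of the next entry
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                // -1 here means "end of this entry", not "end of the archive"
                while ((n = zip.read(buffer, 0, buffer.length)) != -1) {
                    data.write(buffer, 0, n);
                }
                entries.put(entry.getName(), data.toByteArray());
            }
        }
        return entries;
    }
}
```

A generic consumer such as the StreamChecker in the question could be handed the ZipInputStream after one getNextEntry() call and would then see exactly that entry's bytes.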
I have serialized a HashTable<String,Object> object using an ObjectOutputStream. When serializing the object, I get no exception, but upon deserialization, the following exception occurs:
Exception in thread "main" java.io.InvalidClassException: java.lang.Long; local class
incompatible: stream classdesc serialVersionUID = 4290774032661291999, local class
serialVersionUID = 4290774380558885855
I no longer get the error when I remove all of the keys in the HashTable that have a value that is not a String (all of the key / value pairs I removed had a primitive type as their value).
What could be causing this error?
UPDATE - Here's the code
public static String serialize(Quiz quiz) throws IOException{
HashMap<String,Object> quizData = new HashMap<String,Object>();
quizData.put("version", 0); //int
quizData.put("name", quiz.getName()); //String
quizData.put("desc", quiz.getDesc()); //String
quizData.put("timelimitType", quiz.getTimelimitType()); //String
quizData.put("timelimit", quiz.getTimelimit()); //long
ArrayList<String> serializedQuestionsData = new ArrayList<String>();
for (Question question : quiz.getQuestions())
serializedQuestionsData.add(Question.serialize(question));
quizData.put("questions", serializedQuestionsData.toArray(new String[0])); //String[]
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos;
try { oos = new ObjectOutputStream(baos); } catch (IOException error){ throw error; }
try { oos.writeObject(quizData); } catch (IOException error){ throw error; }
return baos.toString();
}
@SuppressWarnings("unchecked")
public static Quiz deserialize(String serializedQuizData) throws IOException, ClassNotFoundException{
ByteArrayInputStream bais = new ByteArrayInputStream(serializedQuizData.getBytes());
ObjectInputStream ois;
try { ois = new ObjectInputStream(bais); } catch (IOException error){ throw error; }
HashMap<String,Object> quizData;
// Exception occurs on the following line!!
try { quizData = (HashMap<String,Object>) ois.readObject(); } catch (ClassNotFoundException error){ throw error; }
Quiz quiz;
if ((int) quizData.get("version") == 0){
quiz = new Quiz((String) quizData.get("name"),
(String) quizData.get("desc"),
(String) quizData.get("timelimitType"),
(long) quizData.get("timelimit"));
for (String serializedQuestionData : (String[]) quizData.get("questions"))
quiz.addQuestion(Question.deserialize(serializedQuestionData));
} else {
throw new UnsupportedOperationException("Unsupported version: \"" + quizData.get("version") + "\"");
}
return quiz;
}
The problem is that you're transforming a byte array output stream to a String using toString(). The toString() method simply uses the platform default encoding to transform the bytes (which do not represent characters at all but are purely binary data) into a String. This is thus a lossy operation, because your platform default encoding doesn't have a valid character for every possible byte.
You shouldn't use String to hold binary data. A String contains characters. If you really need a String, then encode the byte array using a Hexadecimal or Base64 encoder. Otherwise, simply use a byte array to hold your binary data:
public static byte[] serialize(Quiz quiz) throws IOException{
...
ByteArrayOutputStream baos = new ByteArrayOutputStream();
...
return baos.toByteArray();
}
@SuppressWarnings("unchecked")
public static Quiz deserialize(byte[] serializedQuizData) throws IOException, ClassNotFoundException{
ByteArrayInputStream bais = new ByteArrayInputStream(serializedQuizData);
...
return quiz;
}
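If a String really is required, the Base64 route mentioned above could be sketched as follows. This assumes Java 8 or later for java.util.Base64, and Base64Serialization is a hypothetical helper name rather than anything from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

// Hypothetical helper for illustration only.
class Base64Serialization {

    // Serializes the value and encodes the raw bytes as Base64 text.
    static String toBase64(Serializable value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(value);
        }
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    // Decodes the Base64 text back to bytes and deserializes them.
    static Object fromBase64(String text)
            throws IOException, ClassNotFoundException {
        byte[] bytes = Base64.getDecoder().decode(text);
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}
```

Because every byte maps to Base64 characters and back without loss, values such as Integer and Long survive the trip through the String.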
The only explanation I can think of is that something is corrupting your object stream between writing it and reading it. The serialVersionUID of the "local class" (4290774380558885855) is standard across all Java implementations that try to be compatible with Java (tm). The source code for java.lang.Long says that this serial version ID has not changed since Java 1.0.2.
If you need further help, you will need to provide an SSCCE that covers both creation and reading of the serialized object.
I have a collection of objects which I need to store in byte format, and afterwards I have to convert the bytes back into a collection of objects. I need the answer in Java.
For example, I have an array of objects (of any type); I have to convert this array to a byte array in Java and then back again.
If possible, please suggest which collection to use and the methods which support this.
Assuming that Foo implements Serializable, just do
List<Foo> list = createItSomehow();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
try {
oos.writeObject(list);
} finally {
oos.close();
}
byte[] bytes = baos.toByteArray();
// ...
And the other way round:
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
ObjectInputStream ois = new ObjectInputStream(bais);
List<Foo> list = null;
try {
list = (List<Foo>) ois.readObject();
} finally {
ois.close();
}
// ...
Instead of ByteArrayOutputStream and ByteArrayInputStream you can of course also supply FileOutputStream and FileInputStream respectively to write/read it to/from file.
See also:
The Java Tutorials - Essential Classes - Basic I/O - Object Streams
I have a class which is serializable and which converts a very large data object to a blob to save it to the database. In the same class there is a decode method to convert the blob back to the actual object. Following is the code for encoding and decoding the object.
private byte[] encode(ScheduledReport schedSTDReport)
{
byte[] bytes = null;
try
{
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(bos);
oos.writeObject(schedSTDReport);
oos.flush();
oos.close();
bos.close();
//byte [] data = bos.toByteArray();
//ByteArrayOutputStream baos = new ByteArrayOutputStream();
//GZIPOutputStream out = new GZIPOutputStream(baos);
//XMLEncoder encoder = new XMLEncoder(out);
//encoder.writeObject(schedSTDReport);
//encoder.close();
bytes = bos.toByteArray();
//GZIPOutputStream out = new GZIPOutputStream(bos);
//out.write(bytes);
//bytes = bos.toByteArray();
}
catch (Exception e)
{
_log.error("Exception caught while encoding/zipping Scheduled STDReport", e);
}
decode(bytes);
return bytes;
}
/*
* Decode the report definition blob back to the
* ScheduledReport object.
*/
private ScheduledReport decode(byte[] bytes)
{
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
ScheduledReport sSTDR = null;
try
{
ObjectInputStream ois = new ObjectInputStream(bais);
//GZIPInputStream in = new GZIPInputStream(bais);
//XMLDecoder decoder = new XMLDecoder(in);
sSTDR = (ScheduledReport)ois.readObject();//decoder.readObject();
//decoder.close();
}
catch (Exception e)
{
_log.error("IOException caught while decoding/unzipping Scheduled STDReport", e);
}
return sSTDR;
}
The problem here is that whenever I change something else in this class (that is, any other method), a new class version is created, and so the new version of the class is unable to decode the originally encoded blob. The object which I am passing to encode is also a serializable object, but the problem still exists. Any ideas? Thanks.
Yup, Java binary serialization is pretty brittle :(
You can add a static serialVersionUID field to the class so that you can control the version numbers... this should prevent problems due to adding methods. You'll still run into potential issues when fields are added though. See the JavaDocs for Serializable for some more details.
You might want to consider using another serialization format such as Protocol Buffers to give you more control though.
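As a small illustration of the serialVersionUID suggestion, the sketch below pins the version by hand. VersionedReport is a hypothetical stand-in for the question's ScheduledReport, and the value 1L is arbitrary.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for the question's ScheduledReport class.
class VersionedReport implements Serializable {
    // Fixed by hand, so adding or changing methods no longer alters
    // the version that is written to and expected from the stream.
    private static final long serialVersionUID = 1L;

    private final String name;

    VersionedReport(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Reports the UID the serialization runtime will use for this class.
    static long declaredUid() {
        return ObjectStreamClass.lookup(VersionedReport.class).getSerialVersionUID();
    }
}
```

For blobs that are already stored, the declared value would have to match the UID quoted in the InvalidClassException message (or the one printed by the JDK's serialver tool for the old class), otherwise those blobs still fail to decode.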
You can implement java.io.Externalizable so that you are able to control what is serialized and expected in deserialization.
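A minimal sketch of the Externalizable approach might look like this. ExternalReport, its two fields, and the FORMAT_VERSION marker are all assumptions made for illustration; the real ScheduledReport fields are not shown in the question.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hypothetical class; the fields are invented for the example.
class ExternalReport implements Externalizable {
    private static final long serialVersionUID = 1L;
    private static final int FORMAT_VERSION = 1; // our own format marker

    private String name;
    private long intervalMillis;

    // Externalizable requires a public no-argument constructor.
    public ExternalReport() {
    }

    ExternalReport(String name, long intervalMillis) {
        this.name = name;
        this.intervalMillis = intervalMillis;
    }

    String getName() {
        return name;
    }

    long getIntervalMillis() {
        return intervalMillis;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(FORMAT_VERSION);
        out.writeUTF(name); // assumes name is never null
        out.writeLong(intervalMillis);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int version = in.readInt();
        if (version != FORMAT_VERSION) {
            throw new IOException("Unsupported format version: " + version);
        }
        name = in.readUTF();
        intervalMillis = in.readLong();
    }
}
```

Since the class decides exactly which values are written and in what order, a later version can branch on the format marker in readExternal to keep reading old blobs.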
I'd like to encode a Java map of strings as a single Base64-encoded string. The encoded string will be transmitted to a remote endpoint and may be manipulated by a malicious person. So the worst thing that should happen is invalid key/value tuples; it should not introduce any other security risks.
Example:
Map<String,String> map = ...
String encoded = Base64.encode(map);
// somewhere else
Map<String,String> map = Base64.decode(encoded);
Yes, it must be Base64. Not like that or that or any other of these. Is there an existing lightweight solution (a single utility class preferred) out there? Or do I have to create my own?
Anything better than this?
// marshalling
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.writeObject(map);
oos.close();
String encoded = new String(Base64.encodeBase64(baos.toByteArray()));
// unmarshalling
byte[] decoded = Base64.decodeBase64(encoded.getBytes());
ByteArrayInputStream bais = new ByteArrayInputStream(decoded);
ObjectInputStream ois = new ObjectInputStream(bais);
map = (Map<String,String>) ois.readObject();
ois.close();
Thanks,
My primary requirements are: the encoded string should be as short as possible and contain only Latin characters or characters from the Base64 alphabet (not my call). There are no other requirements.
Use Google Gson to convert Map to JSON. Use GZIPOutputStream to compress the JSON string. Use Apache Commons Codec Base64 or Base64OutputStream to encode the compressed bytes to a Base64 string.
Kickoff example:
public static void main(String[] args) throws IOException {
Map<String, String> map = new HashMap<String, String>();
map.put("key1", "value1");
map.put("key2", "value2");
map.put("key3", "value3");
String serialized = serialize(map);
Map<String, String> deserialized = deserialize(serialized, new TypeToken<Map<String, String>>() {}.getType());
System.out.println(deserialized);
}
public static String serialize(Object object) throws IOException {
ByteArrayOutputStream byteaOut = new ByteArrayOutputStream();
GZIPOutputStream gzipOut = null;
try {
gzipOut = new GZIPOutputStream(new Base64OutputStream(byteaOut));
gzipOut.write(new Gson().toJson(object).getBytes("UTF-8"));
} finally {
if (gzipOut != null) try { gzipOut.close(); } catch (IOException logOrIgnore) {}
}
return new String(byteaOut.toByteArray());
}
public static <T> T deserialize(String string, Type type) throws IOException {
ByteArrayOutputStream byteaOut = new ByteArrayOutputStream();
GZIPInputStream gzipIn = null;
try {
gzipIn = new GZIPInputStream(new Base64InputStream(new ByteArrayInputStream(string.getBytes("UTF-8"))));
for (int data; (data = gzipIn.read()) > -1;) {
byteaOut.write(data);
}
} finally {
if (gzipIn != null) try { gzipIn.close(); } catch (IOException logOrIgnore) {}
}
return new Gson().fromJson(new String(byteaOut.toByteArray()), type);
}
Another possible way would be to use JSON, for which there is a very lightweight library.
The encoding would then look like this:
JSONObject jso = new JSONObject( map );
String encoded = new String(Base64.encodeBase64( jso.toString( 4 ).getBytes()));
Your solution works. The only other approach would be to serialize the map yourself (iterate over the keys and values). That would mean you'd have to make sure you handle all the cases correctly (for example, if you transmit the values as key=value, you must find a way to allow = in the key/value and you must separate the pairs somehow which means you must also allow this separation character in the name, etc).
All in all, it's hard to get right, easy to get wrong and would take a whole lot more code and headache. Plus don't forget that you'd have to write a lot of error handling code in the parser (receiver side).
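For comparison, a hand-rolled version of the key=value idea could be sketched like this. It assumes Java 8 or later for java.util.Base64 and uses URL-encoding to escape = and & inside keys and values; MapCodec is a hypothetical name, and null keys or values are not handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; a sketch of the "serialize it yourself" approach.
class MapCodec {

    // Joins escaped pairs as key=value&key=value, then Base64-encodes them.
    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(escape(e.getKey())).append('=').append(escape(e.getValue()));
        }
        byte[] ascii = sb.toString().getBytes(StandardCharsets.US_ASCII);
        return Base64.getEncoder().encodeToString(ascii);
    }

    // Throws IllegalArgumentException when the input is malformed.
    static Map<String, String> decode(String encoded) {
        String joined = new String(Base64.getDecoder().decode(encoded),
                                   StandardCharsets.US_ASCII);
        Map<String, String> map = new LinkedHashMap<>();
        if (joined.isEmpty()) {
            return map;
        }
        for (String pair : joined.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            map.put(unescape(kv[0]), unescape(kv[1]));
        }
        return map;
    }

    // URL-encoding turns '=' and '&' into %3D and %26, so they are safe separators.
    private static String escape(String s) {
        try {
            return URLEncoder.encode(s, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    private static String unescape(String s) {
        try {
            return URLDecoder.decode(s, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

With this format a tampered string can at worst produce wrong pairs or an IllegalArgumentException, because no object deserialization is involved. It does confirm the point above, though: even a simple format needs explicit escaping and error handling.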