Compress and serialize String on the fly - java

I need to store a binary object (java class having several collections inside) in the key-value storage.
The size limit for the value is 4K.
I created an XStream-based serializer and deserializer, so when I am done filling my class members I can serialize the object to a String or to a file.
In the worst case the serialized String/file size is ~30K. I manage to achieve a good compression rate, so after compression my file is ~2K, which fits the bill.
My question: is there any useful Java API/library/technique that can:
compress a String and serialize the compressed object.
decompress previously compressed object and create a regular String from it
I am looking for one-liners that do not require intermediate storage of serialized object to file for later compression.
Appreciate your help!

Try a GZIPOutputStream for zipping the String:
ByteArrayOutputStream out = new ByteArrayOutputStream();
Writer writer = new OutputStreamWriter(new GZIPOutputStream(out), StandardCharsets.UTF_8);
writer.write(string);
writer.close(); // closing finishes the GZIP stream; without it the zipped data is incomplete
byte[] zipped = out.toByteArray();
And to unzip again:
ByteArrayInputStream in = new ByteArrayInputStream(zipped);
// readAllBytes() (Java 9+) returns the whole text; readLine() would stop at the first line break
string = new String(new GZIPInputStream(in).readAllBytes(), StandardCharsets.UTF_8);
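If the key-value store only accepts String values (an assumption, the question does not say), the zipped bytes can be Base64-encoded so that both directions become one-line calls. The following is a minimal sketch; the class name GzipStrings and its method names are hypothetical, and Base64 adds roughly a third to the size, so a ~2K payload would still be under the 4K limit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; the names are illustrative, not part of any library.
final class GzipStrings {

    // Gzips the UTF-8 bytes of the text and returns them Base64-encoded.
    static String compress(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // try-with-resources closes the GZIP stream, which writes the trailer
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Reverses compress(): Base64-decodes, gunzips, and decodes the bytes as UTF-8.
    static String decompress(String encoded) throws IOException {
        byte[] zipped = Base64.getDecoder().decode(encoded);
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(zipped))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
        }
    }
}
```

Usage would then be along the lines of String value = GzipStrings.compress(xml); when storing and GzipStrings.decompress(value) when reading back.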

Related

Is it possible to save pdf document to byte array (aspose.pdf for java)

I need to save a pdf document, generated by aspose.pdf for java library to memory (without using temporary file)
I was looking at the documentation and didn't find the save method with the appropriate signature. (I was looking for some kind of outputstream, or at least byte array).
Is it possible? If it is, how can I manage that?
Thanks
Aspose.Pdf for Java supports saving output to both a file and a stream. Please check the following code snippet; it should help you accomplish the task.
byte[] input = getBytesFromFile(new File("C:/data/HelloWorld.pdf"));
ByteArrayOutputStream output = new ByteArrayOutputStream();
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(new ByteArrayInputStream(input));
pdfDocument.save(output);
//If you want to read the result into a Document object again, in Java you need to get the
//data bytes and wrap into an input stream.
InputStream inputStream=new ByteArrayInputStream(output.toByteArray());
I am Tilal Ahmad, developer evangelist at Aspose.
I did a similar thing.
Here is a method to write data to a byte array:
public byte[] toBytes() throws IOException {
// collect the output in memory
ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
DataOutputStream outStream = new DataOutputStream(byteOutStream);
if (data != null) {
outStream.write(data); // data is assumed to be a byte[] field of the enclosing class
}
outStream.flush();
return byteOutStream.toByteArray();
}
Then you do something like
yourObject.toBytes();

jena read inputstream from gzipped file

I have the following code to read a dataset into a Jena model using an InputStream. However, I would like my program to be able to read compressed (gzipped) files as well (using filePath).
Dataset dataset = TDBFactory.createDataset(tdbPath);
Model model = dataset.getDefaultModel();
InputStream str = FileManager.get().open(filePath);
model.read(str,null, "N-TRIPLES");
You need to wrap the stream in a GZIPInputStream to read it:
Dataset dataset = TDBFactory.createDataset(tdbPath);
Model model = dataset.getDefaultModel();
InputStream str = FileManager.get().open(filePath);
if (useGZIP) {
str = new GZIPInputStream(str);
}
model.read(str,null, "N-TRIPLES");
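The useGZIP flag is not defined in the snippet above. One possible way to derive it, sketched here with a hypothetical MaybeGzip helper that is not part of Jena, is to peek at the first two bytes and check for the GZIP magic number (0x1f 0x8b):

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper; the names are illustrative only.
final class MaybeGzip {

    // Returns a decompressing stream if the input starts with the GZIP
    // magic number, otherwise a stream over the unchanged input.
    static InputStream open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);            // remember the start so the peek can be undone
        int first = in.read();
        int second = in.read();
        in.reset();            // rewind to the marked position
        boolean gzipped = first == 0x1f && second == 0x8b;
        return gzipped ? new GZIPInputStream(in) : in;
    }
}
```

With this, the stream from FileManager could be passed through MaybeGzip.open(...) before model.read(...), assuming the files are either plain or gzipped N-Triples.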
If you use the newer RDFDataMgr APIs then GZip compression should be handled entirely transparently:
RDFDataMgr.read(model, filePath, Lang.NTRIPLES);

How do I get a FileInputStream from FileItem in java?

I am trying to avoid FileItem's getInputStream() because it yields the wrong encoding; I need a FileInputStream instead. Is there any way to get a FileInputStream without using this method? Or can I turn my FileItem into a File?
if (this.strEncoding != null && !this.strEncoding.isEmpty()) {
br = new BufferedReader(new InputStreamReader(clsFile.getInputStream(), this.strEncoding));
}
else {
// br = ?????
}
You can try
FileItem#getString(encoding)
Returns the contents of the file item as a String, using the specified encoding.
You can use the write method here.
File file = new File("/path/to/file");
fileItem.write(file);
An InputStream is binary data, that is, bytes. It must be converted to text by specifying the encoding of those bytes.
Java uses Unicode internally to represent text in all scripts. For text it uses String/char/Reader/Writer.
For binary data it uses byte[], InputStream and OutputStream.
So you could use a bridging class, like InputStreamReader:
String encoding = "UTF-8"; // Or "Windows-1252" ...
BufferedReader in = new BufferedReader(
new InputStreamReader(fileItem.getInputStream(),
encoding));
Or if you read the bytes:
String s = new String(bytes, encoding);
The encoding is often an optional parameter (in which case there is an overloaded method without the encoding argument).
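To illustrate why the encoding matters, here is a small sketch (the DecodeDemo class is hypothetical) that decodes the same two bytes with two different charsets:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class DecodeDemo {

    // Converts raw bytes to text using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent)
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());      // prints 1
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length()); // prints 2
    }
}
```

Decoded as UTF-8 the bytes form a single character; decoded as ISO-8859-1 each byte becomes its own character, which is the typical "wrong encoding" symptom.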

Java: Hashmap with contents compiled

I am looking to implement a HashMap with its contents in the bytecode. This would be similar to serializing the content and then reading it in. But in my experience serialization only works by saving to a file and then reading it back in, and I want this implementation to be faster than that.
"But in my experience serialization only works by saving to a file and then reading it back in, and I want this implementation to be faster than that."
Serialization works with streams. Specifically, ObjectOutputStream can wrap any OutputStream. If you want to perform in-memory serialization, you could use ByteArrayOutputStream here.
Similarly on the input side.
You can save your HashMap as a byte array using the Java serialization mechanism:
Map map = new HashMap();
map.put(1, 1);
ByteArrayOutputStream bout = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(bout);
oos.writeObject(map);
oos.close();
byte[] bytes = bout.toByteArray();
// restore from bytes
ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes));
map = (Map) ois.readObject();
System.out.println(map);
output
{1=1}
Note that both keys and values in the Map must be Serializable, otherwise it won't work.
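A minimal sketch of what happens when that requirement is not met; the SerializableCheck class and its method name are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

// Hypothetical helper; the names are illustrative only.
final class SerializableCheck {

    // Returns true if the whole object graph can be written with Java
    // serialization, false if some element is not Serializable.
    static boolean canSerialize(Object value) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }
}
```

A HashMap of Integer keys and values passes this check, while a map holding a plain new Object() as a value does not, because writeObject throws NotSerializableException for it.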

Getting XML data from the byteArray of a zipFile

I'm writing a simple program that retrieves XML data from an object, and parses it dynamically, based on user criteria. I am having trouble getting the XML data from the object, due to the format it is available in.
The object containing the XML returns the data as a byteArray of a zipFile, like so.
MyObject data = getData();
byte[] byteArray = data.getPayload();
//The above returns the byteArray of a zipFile
I checked this by converting the byteArray to a String:
String str = new String(byteArray);
//The above returns a string with strange characters in it.
Then I wrote the data to a file.
FileOutputStream fos = new FileOutputStream("new.txt");
fos.write(byteArray);
I renamed new.txt as new.zip. When I opened it using WinRAR, out popped the XML.
My problem is that I don't know how to do this conversion in Java using streams, without first writing the data to a zip file and then reading it back. Writing data to disk will make the software way too slow.
Any ideas/code snippets/info you could give me would be really appreciated!! Thanks
Also, if you need a better explanation from me, I'd be happy to elaborate.
As another option, I am wondering whether an XMLReader would work with a ZipInputStream as InputSource.
ByteArrayInputStream bis = new ByteArrayInputStream(byteArray);
ZipInputStream zis = new ZipInputStream(bis);
InputSource inputSource = new InputSource(zis);
A zip archive can contain several files. You have to position the zip stream on the first entry before parsing the content:
ByteArrayInputStream bis = new ByteArrayInputStream(byteArray);
ZipInputStream zis = new ZipInputStream(bis);
ZipEntry entry = zis.getNextEntry();
// ZipInputStream yields the entry's uncompressed bytes, so bound it by getSize(), not getCompressedSize()
InputSource inputSource = new InputSource(new BoundedInputStream(zis, entry.getSize()));
The BoundedInputStream class is taken from Apache Commons IO (http://commons.apache.org/io)
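As an alternative without Commons IO, the first entry can be read fully into memory and then parsed, since a ZipInputStream reports end-of-stream at the end of the current entry. This is a sketch only, assuming the XML is the first entry of the archive; the ZippedXml class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper; the names are illustrative only.
final class ZippedXml {

    // Parses the first entry of an in-memory zip archive as XML.
    static Document parseFirstEntry(byte[] zipBytes)
            throws IOException, SAXException, ParserConfigurationException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry = zis.getNextEntry(); // position the stream on the first entry
            if (entry == null) {
                throw new IOException("zip archive contains no entries");
            }
            // reads the uncompressed bytes of the current entry only (Java 9+)
            byte[] xml = zis.readAllBytes();
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
        }
    }
}
```

With the names from the question this would be called as ZippedXml.parseFirstEntry(data.getPayload()), and nothing is written to disk.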
