converting outputStream to byte array [duplicate] - java

This question already has answers here:
How to convert outputStream to a byte array?
(5 answers)
Closed 5 years ago.
How can I get the bytes of an outputStream, or how can I convert an outputStream to a byte array?

From a theoretical perspective (i.e., irrespective of whether it makes sense in practice as a use case), this is an interesting question that essentially requires the implementation of a method like
public abstract byte[] convert(OutputStream out);
The Java OutputStream class, as its name implies, only supports writing: its overloaded write() methods take either an int (of which only the low-order byte is written) or a byte array, whose contents are sent to an output (e.g., a file).
For example, the following code saves the bytes already present in the data array, to the output.txt file:
byte[] data = ... // Get some data
OutputStream fos = new FileOutputStream("path/to/output.txt");
fos.write(data);
In order to get all the data that a given OutputStream will be outputting and put it into a byte array (i.e., into a byte[] object), the class from which the corresponding OutputStream object was instantiated, should keep storing all the bytes processed via its write() methods and provide a special method, such as toByteArray(), that would return them all, upon invocation.
This is exactly what the ByteArrayOutputStream class does, making the convert() method trivial (and unnecessary):
public byte[] convert(ByteArrayOutputStream out) {
    return out.toByteArray();
}
For any other type of OutputStream, not inherently supporting a similar conversion to a byte[] object, there is no way to make the conversion, before the OutputStream is drained, i.e. before the desired calls to its write() methods have been completed.
If such an assumption (of the writes to have been completed) can be made, and if the original OutputStream object can be replaced, then one option is to wrap it inside a delegate class that would essentially "grab" the bytes that would be supplied via its write() methods. For example:
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class DrainableOutputStream extends FilterOutputStream {
    // Keeps a copy of every byte written through this stream.
    private final ByteArrayOutputStream buffer;
    public DrainableOutputStream(OutputStream out) {
        super(out);
        this.buffer = new ByteArrayOutputStream();
    }
    @Override
    public void write(byte[] b) throws IOException {
        this.buffer.write(b);
        // Forward directly to the wrapped stream; super.write(b) would call
        // write(byte[], int, int) below and buffer the same bytes again.
        this.out.write(b);
    }
    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        this.buffer.write(b, off, len);
        // Same reason: super.write(b, off, len) loops over write(int).
        this.out.write(b, off, len);
    }
    @Override
    public void write(int b) throws IOException {
        this.buffer.write(b);
        this.out.write(b);
    }
    public byte[] toByteArray() {
        return this.buffer.toByteArray();
    }
}
The writes to the internal "buffer" (ByteArrayOutputStream) precede the writes to the original stream, which is reached via this.out, the protected field inherited from FilterOutputStream. This makes sure that the bytes will be buffered, even if there is an exception while writing to the original stream. The array overloads deliberately avoid super: FilterOutputStream implements write(byte[]) in terms of write(byte[], int, int), and that one in terms of write(int), so calling them through super would dispatch back to the overrides and buffer the same bytes more than once.
To reduce the overhead, the calls to the original stream in the above class can be omitted, e.g., if only the "conversion" to a byte array is desired. Even the ByteArrayOutputStream or OutputStream classes can be used as parent classes, with a bit more work and some assumptions (e.g., about the reset() method).
In any case, enough memory has to be available for the draining to take place and for the toByteArray() method to work.
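As a usage sketch of the wrapping idea described above: the names CaptureDemo and CapturingStream are made up for illustration, and the in-memory target merely stands in for whatever OutputStream is really being written to. The wrapper forwards to the protected out field, so each byte is copied exactly once.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class CaptureDemo {
    // Hypothetical wrapper: copies every byte before forwarding it.
    static class CapturingStream extends FilterOutputStream {
        private final ByteArrayOutputStream copy = new ByteArrayOutputStream();

        CapturingStream(OutputStream target) {
            super(target);
        }

        @Override
        public void write(int b) throws IOException {
            copy.write(b);
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            copy.write(b, off, len);
            out.write(b, off, len);
        }

        byte[] toByteArray() {
            return copy.toByteArray();
        }
    }

    // Writes data plus a trailing '!' through the wrapper and returns the captured copy.
    static byte[] capture(byte[] data) throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream(); // stand-in for a real stream
        try (CapturingStream cs = new CapturingStream(target)) {
            cs.write(data); // inherited write(byte[]) routes to write(byte[], int, int)
            cs.write('!');
            return cs.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] captured = capture("test bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(captured, StandardCharsets.UTF_8)); // prints "test bytes!"
    }
}
```

The copy is only complete once the caller has finished writing, which is the "drained" assumption mentioned earlier.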

An example, in response to @Obicere's comment:
ByteArrayOutputStream btOs = new ByteArrayOutputStream();
btOs.write("test bytes".getBytes());
String restoredString = new String(btOs.toByteArray());
System.out.println(restoredString);

Related

Java write a byte array with given ObjectOutputStream

I have a serializable class with custom writeObject() and readObject() methods.
When an object serializes, it needs to write two byte arrays, one after another. When something deserializes it, it needs to read those two arrays.
This is my code:
private void writeObject (final ObjectOutputStream out) throws IOException {
    ..
    out.writeByte(this.signature.getV()); //one byte
    out.writeObject(this.signature.getR()); //an array of bytes
    out.writeObject(this.signature.getS()); //an array of bytes
    out.close();
}
private void readObject (final ObjectInputStream in) throws IOException, ClassNotFoundException {
    ..
    v = in.readByte();
    r = (byte[])in.readObject();
    s = (byte[])in.readObject();
    this.signature = new Sign.SignatureData(v, r, s); //creating a new object because
                                                      //sign.signaturedata
                                                      // is not serializable
    in.close();
}
When the object is being deserialized (readObject method) it throws an EOFException and all three variables are null/undefined.
Relating to the question title, I saw a class called ByteArrayOutputStream, but to use it, it has to be enclosed in an ObjectOutputStream, which I cannot do, as I am given an OutputStream and must write with it.
1. How does one properly write a byte array using ObjectOutputStream and properly read it using ObjectInputStream?
2. Why does the code above throw an EOFException without reading even one variable?
EDIT: I need to clarify: readObject() and writeObject() are called by the JVM itself while deserializing and serializing the object.
The second thing is, the SignatureData is a subclass to Sign, that comes from a third-party library - and that's why it's not serializable.
The third thing is, the problem probably lies in the reading and writing byte arrays by ObjectInput/ObjectOutput streams, not in the Sign.SignatureData class.
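A minimal sketch of one way to write and read the two arrays, assuming the three values can be held in plain transient fields. The class SignedRecord, its fields, and the roundTrip() helper are hypothetical stand-ins for the poster's class. The sketch leaves the streams open inside writeObject() and readObject(), since they belong to the serialization machinery that called those methods; whether the close() calls are the actual cause of the reported EOFException cannot be confirmed from the excerpt.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

class SignedRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    // Hypothetical stand-ins for the parts of the non-serializable signature.
    private transient byte v;
    private transient byte[] r;
    private transient byte[] s;

    SignedRecord(byte v, byte[] r, byte[] s) {
        this.v = v;
        this.r = r.clone();
        this.s = s.clone();
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes the non-transient fields (none here)
        out.writeByte(v);
        out.writeObject(r);       // a byte[] is itself serializable
        out.writeObject(s);
        // No out.close(): the caller owns the stream.
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        v = in.readByte();        // read in exactly the order written
        r = (byte[]) in.readObject();
        s = (byte[]) in.readObject();
        // No in.close() either.
    }

    boolean sameSignature(SignedRecord other) {
        return v == other.v && Arrays.equals(r, other.r) && Arrays.equals(s, other.s);
    }

    // Serializes to memory and reads the object back.
    static SignedRecord roundTrip(SignedRecord rec) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(rec);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (SignedRecord) ois.readObject();
        }
    }
}
```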

Unserialize an array of bytes taking account of its useful length

I have an array of bytes whose length equals XXX. It contains a serialized object which I want to unserialise (i.e., I want to create a copy of this object from these stored bytes).
But I have a constraint: the useful length of my byte array. I want to take it into account when unserialising (i.e., the serialized object can be shorter than the array's size).
I hope my two little methods make this easier to understand (the first serialises, the second unserialises):
byte[] toBytes() throws IOException {
    byte[] array_bytes;
    ByteArrayOutputStream byte_array_output_stream = new ByteArrayOutputStream();
    ObjectOutput object_output = new ObjectOutputStream(byte_array_output_stream);
    object_output.writeObject(this);
    object_output.close();
    array_bytes = byte_array_output_stream.toByteArray();
    return array_bytes;
}
And the current unserialisation method (which is "wrong" for the moment because it ignores the useful length):
static Message fromBytes(byte[] bytes, int length) throws IOException, ClassNotFoundException, ClassCastException {
    Message message;
    ByteArrayInputStream byte_array_input_stream = new ByteArrayInputStream(bytes);
    ObjectInput object_input = new ObjectInputStream(byte_array_input_stream);
    message = (Message) object_input.readObject();
    object_input.close();
    return message;
}
As you can see, readObject doesn't take a length, but I need one: that's a problem, and perhaps I should NOT use this method.
Thus, my question is: with or without readObject, how can I take into account the useful length (i.e., the "payload") of my byte array?
I assume that your Message class implements Serializable.
In this case, when you write your message, it gets serialized automatically by the Java runtime, as explained in the documentation of the Serializable interface.
I cannot be sure how or why you might find part of the generated byte array as not useful, since it is all part of the serialized instance.
However, I might suggest that you follow the Externalizable interface way:
your Message class will implement Externalizable. Then you have the option of controlling how exactly your class gets serialized and de-serialized in writeExternal(ObjectOutput out) and readExternal(ObjectInput in) methods respectively, where you can write the length you want in the stream, read it back, and/or keep only the required amount of bytes.
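The Externalizable route could look roughly like the sketch below. The Message class shown is a hypothetical stand-in with a single payload field, and PayloadDemo is an illustrative holder class. The sketch also uses the ByteArrayInputStream(byte[], int, int) constructor so that the stream only sees the first length bytes of a possibly larger array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class PayloadDemo {
    // Hypothetical stand-in for the question's Message class.
    public static class Message implements Externalizable {
        private byte[] payload = new byte[0];

        public Message() { }              // Externalizable needs a public no-arg constructor

        public Message(byte[] payload) {
            this.payload = payload.clone();
        }

        public byte[] payload() {
            return payload.clone();
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(payload.length); // explicit length prefix
            out.write(payload);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            payload = new byte[in.readInt()];
            in.readFully(payload);        // read exactly that many bytes
        }
    }

    static byte[] toBytes(Message message) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(message);
        }
        return bos.toByteArray();
    }

    // Only bytes[0 .. length) are visible to the object stream.
    static Message fromBytes(byte[] bytes, int length) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes, 0, length))) {
            return (Message) ois.readObject();
        }
    }
}
```

A caller holding an oversized array would pass the number of meaningful bytes as length, e.g. fromBytes(padded, usefulLength).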

What is the difference between write and writeInt?

When writing to a file using an OutputStream, what is the difference between using writeInt():
public static void makeFile(String name) throws Exception{
    try (
        OutputStream ostr = new FileOutputStream(name); ) {
        //Uses writeInt() method
        ostr.writeInt(1);
        ostr.close();
    }
}
and using write():
public static void makeFile(String name) throws Exception{
    try (
        OutputStream ostr = new FileOutputStream(name); ) {
        // Uses the write() method with an int as input
        ostr.write(1);
        ostr.close();
    }
}
What do both methods mean?
writeInt is not a member of OutputStream, so the first version won't compile. Assuming you use DataOutputStream or similar, writeInt will write the four bytes of the 32-bit integer in big-endian order. write will just write a single byte (the least significant byte of the int).
Arguably it wasn't a great idea to mix these two different ideas in the same interface. DataOutputStream should not have extended OutputStream, but too late to fix that now.
writeInt(int) comes from DataOutput interface. (ObjectOutputStream implements ObjectOutput interface, and ObjectOutput interface extends DataOutput interface.) As you can see from the JavaDoc documentation for DataOutput writeInt method, it writes four bytes in big endian order to the underlying stream.
write(int) comes from OutputStream class, which is extended by ObjectOutputStream. This method writes the low order byte of the int argument (the "right most" eight bits). Again, you can see this in the JavaDoc documentation.
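The difference can be seen by writing the same value both ways into memory. This is a small illustrative sketch; the class name WriteVsWriteInt and the sample value 0x01020304 are arbitrary choices.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

class WriteVsWriteInt {
    // write(int): only the low-order byte reaches the stream.
    static byte[] viaWrite(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.write(value);
        }
        return sink.toByteArray();
    }

    // writeInt(int): all four bytes, most significant first.
    static byte[] viaWriteInt(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeInt(value);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(viaWrite(0x01020304)));    // [4]
        System.out.println(Arrays.toString(viaWriteInt(0x01020304))); // [1, 2, 3, 4]
    }
}
```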

Could ByteBuffer implement DataOutput/DataInput?

Is there some subtle reason why java.nio.ByteBuffer does not implement java.io.DataOutput or java.io.DataInput, or did the authors just not choose to do this? It would seem straightforward to map the calls (e.g. putInt() -> writeInt()).
The basic problem I (and some others, apparently) have is older classes which know how to serialize/deserialize themselves using the generic interfaces: DataInput/DataOutput. I would like to reuse my custom serialization without writing a custom proxy for ByteBuffer.
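Just go through a byte array: write to a ByteArrayOutputStream and then wrap() the resulting array in a ByteBuffer (or put() it into an existing one); for reading, copy the buffer's contents into an array and hand that to a ByteArrayInputStream. The problem with having a ByteBuffer directly emulate a data input/output stream has to do with not knowing the sizes in advance. What if there's an overrun?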
What is needed is a ByteBufferOutputStream in which you can wrap / expose the required behaviors. Examples of this exist; the Apache avro serialization scheme has such a thing. It's not too hard to roll your own. Why is there not one by default? Well, it's not a perfect world...
ByteArrayOutputStream backing = new ByteArrayOutputStream();
DataOutputStream foo = new DataOutputStream(backing); // declared as the class: the DataOutput interface has no close()
// do your serialization out to foo
foo.close();
ByteBuffer buffer = ByteBuffer.wrap(backing.toByteArray());
// now you've got a bytebuffer...
A better way that works with direct buffers too:
class ByteBufferOutputStream extends OutputStream
{
    private final ByteBuffer buffer;
    public ByteBufferOutputStream(ByteBuffer buffer)
    {
        this.buffer = buffer;
    }
    @Override
    public void write(int b) throws IOException
    {
        buffer.put((byte) b);
    }
}
Note that this requires calling buffer.flip() after you are done writing to it, before you can read from it.
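Put together with a DataOutputStream, the approach might be used as in the sketch below. ByteBufferDemo, the serialize() helper, and the 64-byte capacity are illustrative assumptions, and the bulk write override is an optional addition that avoids one put() per byte. Writing past the buffer's capacity would throw the unchecked BufferOverflowException, which is the overrun problem mentioned above.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

class ByteBufferDemo {
    static class ByteBufferOutputStream extends OutputStream {
        private final ByteBuffer buffer;

        ByteBufferOutputStream(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        @Override
        public void write(int b) {
            buffer.put((byte) b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            buffer.put(b, off, len); // bulk put instead of byte-by-byte
        }
    }

    // Writes two fields through the DataOutput API into a direct buffer.
    static ByteBuffer serialize(int id, long timestamp) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64); // capacity is an arbitrary choice
        try (DataOutputStream out = new DataOutputStream(new ByteBufferOutputStream(buffer))) {
            out.writeInt(id);
            out.writeLong(timestamp);
        }
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buffer = serialize(7, 99L);
        // ByteBuffer defaults to big-endian, matching DataOutputStream.
        System.out.println(buffer.getInt() + " " + buffer.getLong()); // prints "7 99"
    }
}
```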

Java serialization - incompatible serialVersionUID

I understand the theory behind incompatible serialVersionUIDs (i.e. you can discriminate different compilation versions of the same class) but I am seeing an issue that I don't understand and doesn't fall into the obvious error causes (different compiled version of the same class).
I am testing a serialization/deserialization process. All code is running on one machine, in the same VM, and both serialization and deserialization methods are using the same version of the compiled class. Serialization works fine. The class being serialized is quite complex, contains a number of other classes (java types and UDTs), and contains reference cycles. I haven't declared my own UID in any class.
This is the code:
public class Test {
    public static void main(String[] args) throws Exception {
        ContextNode context = WorkflowBuilder.getSimpleSequentialContextNode();
        String contextString = BinarySerialization.serializeToString(context);
        ContextNode contextD = BinarySerialization.deserializeFromString(ContextNode.class, contextString);
    }
}
public class BinarySerialization {
    public static synchronized String serializeToString(Object obj) throws Exception {
        ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(byteStream);
        oos.writeObject(obj);
        oos.close();
        return byteStream.toString();
    }
    public static synchronized <T> T deserializeFromString(Class<T> type, String byteString) throws Exception {
        T object = null;
        ByteArrayInputStream byteStream = new ByteArrayInputStream(byteString.getBytes());
        ObjectInputStream in = new ObjectInputStream(byteStream);
        object = (T)in.readObject();
        in.close();
        return object;
    }
}
I am getting an InvalidClassException (local class incompatible: stream classdesc serialVersionUID = -7189235121689378989, local class serialVersionUID = -7189235121689362093) when deserializing.
What is the underlying issue? And how should I fix it?
Thanks
Edit
I should state the purpose of this. The serialized data will both need to be stored in a sqlite database and sent across the wire to other clients. If String is the wrong format for passing around the serialized data, what should I be using instead that will let me store and pass the data about? Thanks again.
First rule: never use String or char[] or Reader or Writer when handling binary data.
You're handling binary data and try to put it into a String. Don't do that, that's an inherently broken operation.
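Next: byteStream.toString() decodes the bytes using the platform's default charset, which is lossy for arbitrary binary data, so its return value doesn't reliably represent the actual content of the ByteArrayOutputStream, and calling getBytes() on the resulting String can't restore it. That would explain why the serialVersionUID read from the stream differs from the local one in only a few digits. You'll want to use .toByteArray() and pass the byte[] around (remember: treat binary data as binary data and not as a String).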
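A minimal sketch of the byte[]-based version follows; the class and method names are illustrative, not from the question. The byte[] can be stored as is (SQLite has a BLOB type) or sent over a socket. If some channel really requires text, Base64 is one lossless encoding, shown here as an optional extra via java.util.Base64.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

class BinarySerializationSketch {
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(byteStream)) {
            oos.writeObject(obj);
        }
        return byteStream.toByteArray(); // raw bytes, no charset involved
    }

    static <T> T deserialize(Class<T> type, byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return type.cast(in.readObject());
        }
    }

    // Optional: lossless text form for channels that only accept strings.
    static String toText(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) throws Exception {
        String text = toText(serialize("hello"));
        System.out.println(deserialize(String.class, fromText(text))); // prints "hello"
    }
}
```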
