Context
I'm doing my student project and building a testing tool for regression testing.
Main idea: capture all constructor/method/function invocations at runtime using AOP and record all the data into a database. Later, retrieve the data, run the constructors/methods/functions in the same order, and compare the return values.
Problem
I'm trying to serialize objects (and arrays of objects) into a byte array, record it into PostgreSQL as a blob, and later (in another runtime) retrieve that blob and deserialize it back into an object. But when I deserialize the data in another runtime it changes: for example, instead of a boolean, I retrieve an int. If I do exactly the same operations in the same runtime (serialize - insert into the database - SELECT from the database - deserialize), everything seems to work correctly.
Here is how I record data:
private void writeInvocationRecords(InvocationData invocationData, boolean isConstructor) {
    final List<InvocationData> invocationRecords = isConstructor ? constructorInvocationRecords : methodInvocationRecords;
    final String recordsFileName = isConstructor ? "constructor_invocation_records.json" : "method_invocation_records.json";
    byte[] inputArgsBytes = null;
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutputStream out = null;
    try {
        out = new ObjectOutputStream(bos);
        out.writeObject(invocationData.inputArgs);
        out.flush();
        inputArgsBytes = bos.toByteArray();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            bos.close();
        } catch (IOException ex) {
            // ignore close exception
        }
    }
    byte[] returnValueBytes = null;
    ByteArrayOutputStream rvBos = new ByteArrayOutputStream();
    ObjectOutputStream rvOut = null;
    try {
        rvOut = new ObjectOutputStream(rvBos);
        rvOut.writeObject(invocationData.returnValue);
        rvOut.flush();
        returnValueBytes = rvBos.toByteArray();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            rvBos.close();
        } catch (IOException ex) {
            // ignore close exception
        }
    }
    invocationRecords.add(invocationData);
    if (invocationRecords.size() >= (isConstructor ? CONSTRUCTORS_CACHE_SIZE : METHODS_CACHE_SIZE)) {
        List<InvocationData> tempRecords = new ArrayList<InvocationData>(invocationRecords);
        invocationRecords.clear();
        try {
            for (InvocationData record : tempRecords) {
                SerialBlob blob = new javax.sql.rowset.serial.SerialBlob(inputArgsBytes);
                SerialBlob rvBlob = new javax.sql.rowset.serial.SerialBlob(returnValueBytes);
                psInsert.setString(1, record.className);
                psInsert.setString(2, record.methodName);
                psInsert.setArray(3, conn.createArrayOf("text", record.inputArgsTypes));
                psInsert.setBinaryStream(4, blob.getBinaryStream());
                psInsert.setString(5, record.returnValueType);
                psInsert.setBinaryStream(6, rvBlob.getBinaryStream());
                psInsert.setLong(7, record.invocationTimeStamp);
                psInsert.setLong(8, record.invocationTime);
                psInsert.setLong(9, record.orderId);
                psInsert.setLong(10, record.threadId);
                psInsert.setString(11, record.threadName);
                psInsert.setInt(12, record.objectHashCode);
                psInsert.setBoolean(13, isConstructor);
                psInsert.executeUpdate();
            }
            conn.commit();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Here is how I retrieve data:
List<InvocationData> constructorsData = new LinkedList<InvocationData>();
List<InvocationData> methodsData = new LinkedList<InvocationData>();
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery(SQL_SELECT);
while (rs.next()) {
    Object returnValue = new Object();
    byte[] returnValueByteArray = new byte[rs.getBinaryStream(7).available()];
    returnValueByteArray = rs.getBytes(7);
    final String returnType = rs.getString(6);
    ByteArrayInputStream rvBis = new ByteArrayInputStream(returnValueByteArray);
    ObjectInputStream rvIn = null;
    try {
        rvIn = new ObjectInputStream(rvBis);
        switch (returnType) {
            case "boolean":
                returnValue = rvIn.readBoolean();
                break;
            case "double":
                returnValue = rvIn.readDouble();
                break;
            case "int":
                returnValue = rvIn.readInt();
                break;
            case "long":
                returnValue = rvIn.readLong();
                break;
            case "char":
                returnValue = rvIn.readChar();
                break;
            case "float":
                returnValue = rvIn.readFloat();
                break;
            case "short":
                returnValue = rvIn.readShort();
                break;
            default:
                returnValue = rvIn.readObject();
                break;
        }
        rvIn.close();
        rvBis.close();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
    } finally {
        try {
            if (rvIn != null) {
                rvIn.close();
            }
        } catch (IOException ex) {
            // ignore close exception
        }
    }
    Object[] inputArguments = new Object[0];
    byte[] inputArgsByteArray = new byte[rs.getBinaryStream(5).available()];
    rs.getBinaryStream(5).read(inputArgsByteArray);
    ByteArrayInputStream bis = new ByteArrayInputStream(inputArgsByteArray);
    ObjectInput in = null;
    try {
        in = new ObjectInputStream(bis);
        inputArguments = (Object[]) in.readObject();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
    } finally {
        try {
            if (in != null) {
                in.close();
            }
        } catch (IOException ex) {
            // ignore close exception
        }
    }
    InvocationData invocationData = new InvocationData(
        rs.getString(2),
        rs.getString(3),
        (String[]) rs.getArray(4).getArray(),
        inputArguments,
        rs.getString(6),
        returnValue,
        rs.getLong(8),
        rs.getLong(9),
        rs.getLong(10),
        rs.getLong(11),
        rs.getString(12),
        rs.getInt(13)
    );
    if (rs.getBoolean(14)) {
        constructorsData.add(invocationData);
    } else {
        methodsData.add(invocationData);
    }
}
st.close();
rs.close();
conn.close();
An explosion of errors and misguided ideas inherent in this question:
Your read and write code is broken.
available() doesn't work. Well, it does what the javadoc says it does, and if you read the javadoc, and read it very carefully, you should come to the correct conclusion that what that is, is utterly useless. If you ever call available(), you've messed up. You're doing so here. More generally your read and write code doesn't work. For example, .read(byteArr) also doesn't do what you think it does. See below.
The entire principle behind what you're attempting to do doesn't work
You can't 'save the state' of arbitrary objects, and if you want to push the idea then, even where you can, certainly not in the way you're doing it; in general this is advanced java that involves hacking the JDK itself to get at it. Think of an InputStream that represents data flowing over a network connection. What do you imagine the 'serialization' of this InputStream object should look like? If you consider serialization as 'just represent the underlying data in memory', then what you'd get is a number that represents the OS 'pipe handle', and possibly some IP, port, and sequence numbers. This is a tiny amount of data, and all of it is completely useless - it says nothing meaningful about that connection and cannot be used to reconstitute it, at all. Even within the 'scope' of a single session (i.e. where you serialize and then deserialize almost immediately afterwards) this fails, because a network is a stream: once you grab a byte (or send a byte), it's gone. The only useful serialization strategy, especially for the notion of 'let's replay everything that happened as a test', involves actually 'recording' all the bytes that were picked up, as they happen, on the fly. This is not something you can do as a 'moment in time' concept; it's continuous. You need a system that records all the things (every inputstream, every outputstream, every time System.currentTimeMillis() is invoked, every time a random number is generated, etc), and then uses the results of all that recording when your API is asked to 'save' an arbitrary state.
Serialization instead is a thing that objects need to opt into, and where they may have to write custom code to properly deal with it. Not all objects can even be serialized (an InputStream representing a network pipe, as above, is one example of an object that cannot be serialized), and for some, serializing them requires some fancy footwork, and the only hope you have is that the authors of the code that powers this object put in that effort. If they didn't, there is nothing you can do.
The serialization framework of java awkwardly captures both of these notions. It does mean that your code, even if you fix the bugs in it, will fail on most objects that can exist in a JVM. Your testing tool can only be used to test the most simplistic code.
If you're okay with that, read on. But if not, you need to completely rethink what you're going to do with this.
ObjectOutputStream sucks
This is not just my opinion, the openjdk team itself is broadly in agreement (they probably wouldn't quite put it like that, of course). The data emitted by OOS is a weird, inefficient, and underspecced binary blob. You can't analyse this data in any feasible way other than spending a few years reverse engineering the protocol, or just deserializing it (which requires having all the classes, and a JVM - this can be an acceptable burden, depends on your use case).
Contrast to e.g. Jackson which serializes data into JSON, which you can parse with your eyeballs, or in any language, and even without the relevant class files. You can construct 'serialized JSON' yourself without the benefit of first having an object (for testing purposes this sounds like a good idea, no? You need to test this testing framework too!).
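For example, a minimal sketch with Jackson's ObjectMapper (the Person class here is made up purely for illustration):

import com.fasterxml.jackson.databind.ObjectMapper;

public class JacksonSketch {
    // hypothetical example class, not from the question
    public static class Person {
        public String name;
        public int age;
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        Person p = new Person();
        p.name = "Alice";
        p.age = 30;
        String json = mapper.writeValueAsString(p); // {"name":"Alice","age":30}
        System.out.println(json);                   // readable with your eyeballs, or from any language
        Person back = mapper.readValue(json, Person.class);
        System.out.println(back.name);
    }
}

You could even hand-write such a JSON string in a test, without ever constructing the object first.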
How do I fix this code?
If you understand all the caveats above and somehow still conclude that this project, as written and continuing to use the ObjectOutputStream API is still what you want to do (I really, really doubt that's the right call):
Use the newer APIs. available() does not return the size of that blob. read(someByteArray) is not guaranteed to fill the entire byte array. Just read the javadoc, it spells it out.
There is no way to determine the size of an inputstream by asking that inputstream. You may be able to ask the DB itself (usually, LENGTH(theBlobColumn) works great in a SELECT query).
If you somehow (e.g. using LENGTH(tbc)) know the full size, you can wrap the stream in a DataInputStream and use its readFully method (or, on Java 9+, InputStream's readNBytes), which will actually read all the bytes, vs. read, which reads at least 1 byte but is not guaranteed to read all of it. The idea is: read grabs the smallest chunk that is available right now. Imagine a network pipe where bytes are dribbling into the network card's buffer, one byte a second. If so far 250 bytes have dribbled in and you call .read(some500SizeByteArr), then you get 250 bytes (250 of the 500 bytes are filled in, and 250 is returned). If you call .readFully(some500SizeByteArr), the code will wait about another 250 seconds and then fill in all 500 bytes. That's the difference, and that explains why read works the way it does. Said differently: if you do not check what read() is returning, your code is definitely broken.
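As a sketch of the known-length case (the column names and alias here are invented, not from your schema; rs is the ResultSet from your retrieval code, and java.io.DataInputStream is assumed to be imported):

// the query is assumed to also select the blob length, e.g.
//   SELECT ..., input_args, LENGTH(input_args) AS input_args_len FROM invocation_records
int blobLength = rs.getInt("input_args_len");
byte[] inputArgsByteArray = new byte[blobLength];
try (DataInputStream dataIn = new DataInputStream(rs.getBinaryStream("input_args"))) {
    dataIn.readFully(inputArgsByteArray); // blocks until all blobLength bytes are read, or throws EOFException
}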
If you do not know how much data there is, your only option involves a while loop, or to call a helper method that does that. You need to make a temporary byte array, then in a loop keep calling read until it returns -1. For every loop, take the bytes in that array from 0 to (whatever the read call returned), and send these bytes someplace else. For example, a ByteArrayOutputStream.
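A minimal sketch of that loop, assuming in is the InputStream you got from the ResultSet (e.g. rs.getBinaryStream(5)):

ByteArrayOutputStream buffer = new ByteArrayOutputStream();
byte[] chunk = new byte[8192];
int bytesRead;
while ((bytesRead = in.read(chunk)) != -1) {
    buffer.write(chunk, 0, bytesRead); // only the bytes actually read in this iteration
}
byte[] allBytes = buffer.toByteArray();

On Java 9+, in.readAllBytes() does exactly this for you.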
Class matching
when I deserialize data in another runtime it changes and, for example, instead of boolean, I retrieve int
The java serialization system isn't magically changing your stuff on you. Well, put a pin in that. Most likely the class file available in the first run (where you saved the blob in the db) was different from what it looked like in your second run. Voila, problem.
More generally this is a problem in serialization. If you serialize, say, class Person {Date dob; String name;}, and then in a later version of the software you realize that using a j.u.Date to store a date of birth is a very silly idea, as Date is an unfortunately named class (it represents an instant in time and not a date at all), so you replace it with a LocalDate instead, thus ending up with class Person{LocalDate dob; String name;}, then how do you deal with the problem that you now want to deserialize a BLOB that was made back when the Person.class file still had the broken Date dob; field?
The answer is: You can't. Java's baked in serialization mechanism will flat out throw an exception here, it will not try to do this. This is the serialVersionUID system: Classes have an ID and changing anything about them (such as that field) changes this ID; the ID is stored in the serialized data. If the IDs don't match, deserialization cannot be done. You can force the ID (make a field called serialVersionUID - you can search the web for how to do that), but then you'd still get an error, java's deserializer will attempt to deserialize a Date object into a LocalDate dob; field and will of course fail.
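For reference, pinning the ID is just a field declaration (a sketch, reusing the hypothetical Person class from above); note it does not make the incompatible Date-to-LocalDate change work:

class Person implements java.io.Serializable {
    // forces the stream ID to stay stable across edits to this class;
    // it does NOT reconcile incompatible field changes such as Date -> LocalDate
    private static final long serialVersionUID = 1L;
    java.time.LocalDate dob;
    String name;
}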
Classes can write their own code to solve this problem. This is non-trivial and is irrelevant to you, as you're building a framework and presumably can't pop in and write code for your testing framework's userbase's custom class files.
I told you to put a pin in 'the serialization mechanism isn't going to magically change types on you'. Put in sufficient effort with overriding serialVersionUID and such, and you can end up there. But that'd be because you wrote code that confuses types, e.g. in your readObject implementation (again, search the web for java's serialization mechanism, readObject/writeObject - or just start reading the javadoc of java.io.Serializable, that's a good starting-off point).
Style issues
You create objects for no purpose, and you seem to have some trouble with the distinction between a variable/reference and an object. You aren't using try-with-resources. The way your SELECT calls are made suggests you have an SQL injection security issue. e.printStackTrace() as the only line in a catch block is always incorrect.
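To illustrate the last two points, here is a rough sketch of try-with-resources combined with a parameterized query (the table and column names are invented, not taken from your schema; conn is your existing Connection):

String sql = "SELECT class_name, method_name, return_value FROM invocation_records WHERE class_name = ?";
try (PreparedStatement ps = conn.prepareStatement(sql)) {
    ps.setString(1, className); // parameters are bound, never concatenated into the SQL string
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            // ... read columns here ...
        }
    }
} // ps and rs are closed automatically, even if an exception is thrown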
I have this input stream that checks whether I have a certain CAD file open or not. I am doing this by using an input stream to run a tasklist command with the name I want to check. I currently have a boolean that returns true if the specific CAD file isn't open. If the CAD file is open, it returns false. However, I want it to loop until the CAD file is open, because as of right now I have to keep running it manually for it to work. I also need to be able to check this boolean from a separate class. I have it in my main right now so I could test it. My code looks like this...
public class AutoCadCheck {

    public static void main(String[] argv) throws Exception {
        String notOpen = "INFO: No tasks are running which match the specified criteria";
        StringBuilder textBuilder = new StringBuilder();
        String command = "tasklist /fi \"windowtitle eq Autodesk AutoCAD 2017 - [123-4567.dwg]";
        int i;
        InputStream myStream = Runtime.getRuntime().exec(command).getInputStream();
        while ((i = myStream.read()) != -1) {
            textBuilder.append((char) i);
        }
        String output = textBuilder.toString();
        boolean logical = output.contains(notOpen);
        if (logical) {
            System.out.println("DWG Not Open");
        } else {
            System.out.print(output);
        }
        myStream.close();
    }
}
My other class is going to have an 'if statement' that checks whether my boolean "logical" is false, and if so, print something. I have tried every possible method I could think of, but I cannot get it to function the way I want it to. Every other thing I found involving looping an inputstream didn't really apply to my situation. So hopefully someone can help me out in achieving what I want to do.
I would start by moving everything out of main and into a different class; this makes retrieving values and calling specific functions easier. Then create an object of that class in main, and add a get method for the boolean variable. Now for the loop: once the object is created in main, write a conditional loop in main that keeps calling the check function until its condition is met, i.e. until the file is open. After that condition is met, it can fall through to another loop driven by a different condition, such as user input.
public class AutoCadCheck {

    public static void main(String[] argv) throws Exception {
        AutoCadFile file = new AutoCadFile();
        // loop 1
        // Some conditional so the program will
        // continue to run after the file has been found.
        // while () {
        //     // loop 2
        //     // check to see if the file is open or not
        //     while (logical) {
        //     }
        // }
    }
}
Other class
import java.io.IOException;
import java.io.InputStream;

public class AutoCadFile {

    private String notOpen;
    private StringBuilder textBuilder;
    private String command;
    private int i;
    private InputStream myStream;
    private String output;
    private boolean logical;

    public AutoCadFile() {
        notOpen = "INFO: No tasks are running which match the specified criteria";
        textBuilder = new StringBuilder();
        command = "tasklist /fi \"windowtitle eq Autodesk AutoCAD 2017 - [123-4567.dwg]";
        try {
            myStream = Runtime.getRuntime().exec(command).getInputStream();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void checkForFileOpen() {
        try {
            while ((i = myStream.read()) != -1) {
                textBuilder.append((char) i);
            }
            // compute these only after the tasklist output has been read,
            // not in the constructor where textBuilder is still empty
            output = textBuilder.toString();
            logical = output.contains(notOpen);
            if (logical) {
                System.out.println("DWG Not Open");
            } else {
                System.out.print(output);
            }
            myStream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public boolean getFileBoolean() {
        return logical;
    }
}
My other class is going to have an if statement that checks whether my boolean logical is false ...
Well, logical is a local variable within a method. So no code in another class is going to be able to see it.
There are two common approaches to this kind of thing:
Make the variable (i.e. logical) a field of the relevant class. (Preferably NOT a static field because that leads to other problems.)
Put your code into a method that returns the value you are assigning to logical as a result.
From a design perspective the second approach is preferable ... because it reduces coupling relative to the first. But if your application is tiny, that hardly matters.
I can see a couple of other significant problems with your code.
When you use exec(String), you are relying on the exec method to split the command string into a command name and arguments. Unfortunately, exec does not understand the (OS / shell / whatever specific) rules for quoting, etcetera in commands. So it will make a mess of your quoted string. You need to do the splitting yourself; i.e. something like this:
String[] command = new String[] {
    "tasklist",
    "/fi",
    "windowtitle eq Autodesk AutoCAD 2017 - [123-4567.dwg]"
};
Your code potentially leaks an input stream. You should use try-with-resources to avoid that; e.g.
try (InputStream myStream = Runtime.getRuntime().exec(command).getInputStream()) {
    // do stuff
} // the stream is closed automatically ... always
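Putting those two fixes together with the 'return the value from a method' approach, a rough sketch might look like this (isCadFileOpen is a name I made up; the window title is the one from your question):

import java.io.IOException;
import java.io.InputStream;

public class AutoCadCheck {

    // true if tasklist's output does NOT contain the "no tasks" message
    public static boolean isCadFileOpen() throws IOException {
        String notOpen = "INFO: No tasks are running which match the specified criteria";
        String[] command = {
            "tasklist",
            "/fi",
            "windowtitle eq Autodesk AutoCAD 2017 - [123-4567.dwg]"
        };
        StringBuilder textBuilder = new StringBuilder();
        try (InputStream myStream = Runtime.getRuntime().exec(command).getInputStream()) {
            int i;
            while ((i = myStream.read()) != -1) {
                textBuilder.append((char) i);
            }
        }
        return !textBuilder.toString().contains(notOpen);
    }

    public static void main(String[] argv) throws Exception {
        // poll until the drawing is open; any other class can call isCadFileOpen() the same way
        while (!isCadFileOpen()) {
            Thread.sleep(1000); // wait a second between checks
        }
        System.out.println("DWG is now open");
    }
}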
I'm trying to get an ObjectInputStream that will allow me to read data from it and, if it's not of the right type, put the data back onto the stream (using mark and reset) for some other code to deal with. I've tried wrapping the InputStream retrieved from the Socket (s in the following example) in a BufferedInputStream before wrapping it in an ObjectInputStream, as I believed that to be the solution; however, when calling ois.markSupported(), false is still returned. Below is that attempt:
ois = new ObjectInputStream(new BufferedInputStream(s.getInputStream()));
Any help greatly appreciated!
I would build a higher-level abstraction on top of the stream. Something like this (pseudo-code, not finalized):
import java.io.IOException;
import java.io.ObjectInputStream;

public class Buffer {

    private final ObjectInputStream in;
    private Object current;

    public Buffer(ObjectInputStream in) {
        this.in = in;
    }

    public Object peek() throws IOException, ClassNotFoundException {
        if (current == null) {
            current = in.readObject();
        }
        return current;
    }

    public void next() throws IOException, ClassNotFoundException {
        current = in.readObject();
    }
}
You would use peek() repeatedly to get the current object, and if it suits you, call next() to go to the next one.
Of course, you need to deal with exceptions, the end of the stream, closing it properly, etc. But you should get the idea.
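Usage might look something like this (MyMessage, handle and passToOtherCode are placeholders, and the checked exceptions from readObject still need handling somewhere):

Buffer buffer = new Buffer(ois);
Object candidate = buffer.peek();      // look at the next object without consuming it
if (candidate instanceof MyMessage) {
    handle((MyMessage) candidate);     // this code wanted the object, so consume it...
    buffer.next();                     // ...and advance to the one after it
} else {
    passToOtherCode(buffer);           // other code still sees the same object via peek()
}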
Or, if you can just read everything into memory, then do it and create a Queue with the objects from the stream, then pass that Queue around and use peek() and poll().
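A sketch of that second option, assuming end-of-stream shows up as an EOFException and the enclosing method declares IOException and ClassNotFoundException:

Queue<Object> queue = new ArrayDeque<>();
try {
    while (true) {
        queue.add(ois.readObject()); // drain the whole stream into memory
    }
} catch (EOFException endOfStream) {
    // reached the end of the stream; everything read so far is in the queue
}
// later, anywhere the queue has been passed:
Object next = queue.peek();  // inspect without removing
Object taken = queue.poll(); // remove once some code accepts it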
I hope I didn't just find a bug in Java! I am running JDK 7u11 (mostly because that is the sanctioned JVM allowed by my employer) and I am noticing a very odd issue.
Namely, I am chunking data into a LinkedHashSet and writing it to a file using an ObjectOutputStream daisy-chained through a GZIPOutputStream (mentioning this just in case it matters).
Now, when I get to the other side of the program and readObject(), I notice that the size always reads 68, which is the first size. The underlying set can contain many more or fewer than 68 entries, but the .size() method always returns 68. More troubling, when I try to manually iterate the underlying Set, it also stops at 68.
while (...) {
    oos.writeInt(p_rid);
    oos.writeObject(wptSet);
    wptSet.clear();
    // wptSet = new LinkedHashSet<>(); // **This somehow causes the heapsize to increase dramatically, but it does solve the problem**
}
And when reading it
Set<Coordinate> coordinates = (Set<Coordinate>) ois.readObject();
the coordinates.size() always returns 68. Now, I could work around this by also writing the size with .writeInt(), but I can still only iterate through 68 members!
Notice the wptSet = new LinkedHashSet<>() line actually solves the issue. The main problem with that is it makes my heapsize skyrocket when looking at the program in JVisualVM.
Update:
I actually just found a viable workaround that fixes the memory leak of re-instantiating wptSet: System.gc(). Calling that after each call to .clear() actually keeps the memory leak away.
Either way, I shouldn't have to do this and shipping out the LinkedHashSet should not exhibit this behavior.
Alright, I think I understand what you are asking.
Here is an example to reproduce...
import java.util.*;
import java.io.*;

class Example {
    public static void main(String[] args) throws Exception {
        Set<Object> theSet = new LinkedHashSet<>();
        final int size = 3;
        for (int i = 0; i < size; ++i) {
            theSet.add(i);
        }

        ByteArrayOutputStream bytesOut = new ByteArrayOutputStream();
        ObjectOutputStream objectsOut = new ObjectOutputStream(bytesOut);
        for (int i = 0; i < size; ++i) {
            objectsOut.writeObject(theSet);
            theSet.remove(i); // mutate theSet for each write
        }

        ObjectInputStream objectsIn = new ObjectInputStream(
                new ByteArrayInputStream(bytesOut.toByteArray()));
        for (;;) {
            try {
                System.out.println(((Set<?>) objectsIn.readObject()).size());
            } catch (EOFException e) {
                break;
            }
        }
    }
}
The output is
3
3
3
What is going on here is that ObjectOutputStream detects that you are writing the same object every time. The first time theSet is written it is serialized in full; every subsequent write only emits a back reference ("shared reference") to the already-written object, so the same object is deserialized each time. This is explained in the documentation:
Multiple references to a single object are encoded using a reference sharing mechanism so that graphs of objects can be restored to the same shape as when the original was written.
In this case you should use writeUnshared(Object) which will bypass this mechanism, instead of writeObject(Object).
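In the example above that is a one-line change; the set is then serialized as a fresh object on every iteration, so reading it back prints 3, 2, 1 instead of 3, 3, 3 (sketch of just the write loop):

for (int i = 0; i < size; ++i) {
    objectsOut.writeUnshared(theSet); // the set itself is written anew each time
    theSet.remove(i);
}

Note that writeUnshared only applies to the top-level object; anything it references that was already written is still back-referenced, which is fine here. Alternatively, calling objectsOut.reset() before each write clears the whole back-reference table.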
I have the following design in a project
Multiple crawlers
a list ImageList for found images (Observable); this gets updated by threaded processes (thus parallel)
two observers which listen to the list (Downloader and ImagesWindow); caveat: these can be notified multiple times, because the list gets updated by threads
I always wanted to get only the newest entries from ImageList so I implemented it with a counter:
public class ImageList extends Observable {

    private final ConcurrentMap<Integer, Image> images = new ConcurrentHashMap<Integer, Image>();
    private final AtomicInteger counter = new AtomicInteger(0);

    /* There is some more code within here, but its not that important
       important is that stuff gets added to the list and the list shall
       inform all listeners about the change
       The observers then check which is the newest ID in the list (often +1
       but I guess I will reduce the inform frequency somehow)
       and call (in synchronized method):

           int lastIndex = list.getCurrentLastIndex();
           getImagesFromTo(myNextValue, lastIndex);
           myNextValue = lastIndex + 1;
    */

    public synchronized void addToFinished(Image job) throws InterruptedException {
        int currentCounter = counter.incrementAndGet();
        images.put(currentCounter, job);
        this.setChanged();
        this.notifyObservers();
    }

    public synchronized int getCurrentLastIndex() {
        return counter.get();
    }

    public ArrayList<Image> getImagesFromTo(int starting, int ending) {
        ArrayList<Image> newImages = new ArrayList<Image>();
        Image image;
        for (int i = starting; i <= ending; i++) {
            image = images.get(i);
            if (image != null) {
                newImages.add(image);
            }
        }
        return newImages;
    }
}
The observers (Downloader here) use this method like this:
@Override
public void update(Observable o, Object arg) {
    System.out.println("Updated downloader");
    if (o instanceof ImageList) {
        ImageList list = (ImageList) o;
        downloadNewImages(list);
    }
}

private synchronized void downloadNewImages(ImageList list) {
    int last = list.getCurrentLastIndex();
    for (Image image : list.getImagesFromTo(readImageFrom, last)) {
        // code gets stuck after this line
        if (filter.isOk(image)) {
            // and before this line
            // [here was a line, but it also fails if I remove it]
        }
    }
    // set the index to the new index
    readImageFrom = last + 1;
}
However, sometimes the loop gets stuck and a second call seems to be allowed on the method. Then this is what happens:
Downloader retrieves images 70 to 70
Downloader retrieves images 70 to 71
Downloader retrieves images 70 to 72
…
Downloader retrieves images 70 to n
So a second call is allowed to enter the method, but the counter readImageFrom never gets updated.
When I remove both calls to the other functions within the loop, the code begins to work. I know they are not synchronized, but do they have to be if the "parent" is already synchronized?
filter.isOk() is implemented like this (the other functions just return true or false; the code fails when I have hasRightColor included, I guess because it is a bit slower to calculate):
public boolean isOk(Image image) {
    return hasRightDimensions(image) && hasRightColor(image);
}
How can this happen? Eclipse does not show any thrown exception (which of course would cause the method to be exited).
Maybe there is also a totally different approach for getting only the newest content of a list from multiple observers (where each observer might be notified several times because the program runs in parallel)?
Okay, the error was some wicked NullPointerException in filter.isOk() which was not displayed to me (who knows why).
I was not able to see it in my IDE because I had switched from this.image to passing image as a parameter, but forgot to remove the private image field in the header and to change the parameters of the last of the three functions.
So Eclipse neither said anything about a missing image nor about an unused this.image.
Finally.