Java memory leak and garbage collection

This question has baffled me and my colleagues. The program I wrote had a memory leak. The variable platform was reassigned to a new object on each iteration, but for some reason the old Platform objects were not collected by the GC, and after many iterations the heap overflowed.
Some of you may recognise this as a PSO algorithm. For those who don't, a little context: this function has to be evaluated thousands of times, and BasicPlatform is an extremely data-intensive object, so keeping multiple instances alive eventually exhausts memory.
Buggy code:
public class Fitness implements FitnessFunction{
protected Platform platform;
public Fitness(){
}
public Fitness(Platform platform) {
this.platform = platform;
}
@Override
public double fitness(Particle p) {
try {
platform = new BasicPlatform("testData.csv");
} catch (Exception e) {
e.printStackTrace();
}
platform.startSimulation();
double prof = platform.getFitness();
v.clear();
if(prof != 0)
return -prof;
return 0;
}
}
I was confused as to why there was a leak, since surely there shouldn't be one. Then my friend showed me this solution, which he had used before in a similar situation:
public class TradingRuleFitness implements FitnessFunction{
protected Platform platform;
public TradingRuleFitness(){
}
public TradingRuleFitness(Platform platform) {
this.platform = platform;
}
@Override
public double fitness(Particle p) {
Vector<Platform> v = new Vector<Platform>();
try {
//platform = new BasicPlatform("testData.csv");
v.add(new BasicPlatform("testData.csv"));
} catch (Exception e) {
e.printStackTrace();
}
double prof = v.get(0).getFitness();
v.clear();
if(prof != 0)
return -prof;
return 0;
}
}
The code is nearly the same, but this time, instead of reassigning the platform variable, we create a new object inside a vector and clear the vector once I have finished with it. This method seems to force the GC into cleaning up.
The question is: why does the vector method work when the original, which technically should work, does not? And are there any cleaner solutions?
P.S. I have removed unnecessary bits of code, as the question is about object creation and removal.

In the first case, if your Fitness object is retained, your Platform object is retained too, because it is a field.
In the second case, your Platform is held only in a local variable and is discarded when fitness returns.
It doesn't have to be a Vector; it could be a plain local variable.
Try removing the field in both examples and it should work either way.
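A minimal sketch of the plain local-variable version follows. The `Platform` interface and `StubPlatform` class here are hypothetical stand-ins, since the real `Platform` and `BasicPlatform` are not shown in the question:

```java
// Hypothetical stand-in for the question's Platform type.
interface Platform {
    void startSimulation();
    double getFitness();
}

// Hypothetical stand-in for BasicPlatform; the real one loads "testData.csv".
class StubPlatform implements Platform {
    private double fitness;

    @Override
    public void startSimulation() {
        fitness = 1.5; // placeholder result for illustration only
    }

    @Override
    public double getFitness() {
        return fitness;
    }
}

class LocalFitness {
    // No Platform field: nothing created here outlives a single call.
    public double fitness() {
        Platform platform = new StubPlatform(); // local variable only
        platform.startSimulation();
        double prof = platform.getFitness();
        // Once this method returns, the platform object is unreachable
        // and therefore eligible for garbage collection.
        return prof != 0 ? -prof : 0;
    }
}
```

With no field holding the last instance, at most one platform object per call is reachable, however long the `LocalFitness` object itself lives.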

It seems that in the first example you also removed the "unnecessary bits of code" for the vector itself :)
It is hard to give a correct answer like this...

Related

reading a reference to an object and reading the object’s fields under JMM

This post was raised after reading: https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/#pitfall-semi-sync
class Box {
int x;
public Box(int v) {
x = v;
}
}
class RacyBoxy {
Box box;
public synchronized void set(Box v) {
box = v;
}
public Box get() {
return box;
}
}
and test:
@JCStressTest
@State
public class SynchronizedPublish {
RacyBoxy boxie = new RacyBoxy();
@Actor
void actor() {
boxie.set(new Box(42)); // set is synchronized
}
@Actor
void observer(IntResult1 r) {
Box t = boxie.get(); // get is not synchronized
if (t != null) {
r.r1 = t.x;
} else {
r.r1 = -1;
}
}
}
The author says that it is possible that r.r1 == 0, and I agree with that. But I am confused by the explanation:
The actual failure comes from the fact that reading a reference to an object and reading the object’s fields are distinct under the memory model.
I agree that
reading a reference to an object and reading the object’s fields are distinct under the memory model
but I don't see how it influences the result.
Please help me understand it.
P.S. In case someone is confused about @Actor: it just means "run in a thread".
I think it addresses a common misconception among people who read code with sequential consistency in mind. The fact that a reference to an instance is visible in one thread does not imply that the effects of its constructor are visible there. In other words: reading a reference to an instance is a different operation from reading one of the instance's fields. Many people assume that once they can observe an instance, its constructor must have run to completion, but because the read is not synchronized, this is not true for the above example.
I'll just slightly augment the accepted answer here: without some barriers there are absolutely no guarantees that once you see a reference (think of other threads getting hold of it), all the fields set in the constructor are initialized. I actually answered something like this some time ago for one of your questions, if I'm not mistaken.
There are two barriers inserted after a constructor that writes final fields, LoadStore and StoreStore; if you think about their names, you will notice that no store after the constructor (such as the one publishing the reference) can be re-ordered with an operation inside it:
Load -> Store (no Store can be re-ordered with a previous Load)
Store -> Store (no Store can be re-ordered with a previous Store)
Also note that it would be impossible for you to break this under the current x86 memory model, as it is a (too?) strong memory model; these barriers are free on x86. They are not emitted at all, because the hardware does not re-order these operations.
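As a rough sketch of two common ways to close this race (the class names below are hypothetical and do not come from the linked post): either make the field final, so the final-field guarantee applies, or publish the reference through a volatile field, so the write in set() happens-before any read in get() that observes it.

```java
// Variant 1: final field. Provided 'this' does not escape the constructor,
// any thread that sees a FinalBox reference also sees x as set there.
class FinalBox {
    final int x;

    FinalBox(int v) {
        x = v;
    }
}

// A box with an ordinary (non-final) field, as in the question.
class PlainBox {
    int x;

    PlainBox(int v) {
        x = v;
    }
}

// Variant 2: volatile reference. The volatile write in set() happens-before
// a volatile read in get() that returns the same reference, so the earlier
// write to PlainBox.x is visible to the reader as well.
class VolatileBoxy {
    private volatile PlainBox box;

    void set(PlainBox v) {
        box = v;
    }

    PlainBox get() {
        return box;
    }
}
```

Making both set() and get() synchronized on the same object would be a third option; the point in each case is that the reader participates in the synchronization rather than only the writer.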

Dynamically generate a single function (without subfunctions), representing a binary expression tree, at run time with Byte Buddy

Introduction
I want to compare some libraries for generating code at run time. So far I have only scratched the surface of Javassist and Byte Buddy.
As a proof of concept I am trying to solve a small problem, which is a starting point for a more complex one.
Basically, I have a binary expression tree which I want to convert into a single line of code and load into my Java run time. For simplicity, I only have add nodes and constants as leaves so far.
Javassist Reference
I already have a way of doing this in Javassist (which at least works for a single node with two leaves). The code looks like this:
public class JavassistNodeFactory{
public DynamicNode generateDynamicNode(INode root){
DynamicNode dynamicNode = null;
try {
CtClass cc = createClass();
interceptMethod(root, cc);
compileClass(cc);
dynamicNode = instantiate(cc);
}catch (Exception e){
System.out.println("Error compiling class with javassist: "+ e.getMessage());
e.printStackTrace();
}
return dynamicNode;
}
private DynamicNode instantiate(CtClass cc) throws CannotCompileException, IllegalAccessException, InstantiationException {
Class<?> clazz = cc.toClass();
return (DynamicNode) clazz.newInstance();
}
private void compileClass(CtClass cc) throws NotFoundException, IOException, CannotCompileException {
cc.writeFile();
}
private void interceptMethod(INode root, CtClass cc) throws NotFoundException, CannotCompileException {
CtMethod calculateMethod = cc.getSuperclass().getDeclaredMethod("calculateValue",null);
calculateMethod.setBody("return "+ nodeToString(root)+ ";");
}
private CtClass createClass() throws CannotCompileException, NotFoundException {
ClassPool pool = ClassPool.getDefault();
CtClass cc = pool.makeClass(
"DN"+ UUID.randomUUID().toString().replace("-","")
);
cc.setSuperclass(pool.get("org.jamesii.mlrules.util.runtimeCompiling.DynamicNode"));
return cc;
}
private static String nodeToString(INode node){
if (node.getName().equals("")){
return ((ValueNode)node).getValue().toString();
}else{
List<? extends INode> children = node.getChildren();
assert(children.size()==2);
return ("("+nodeToString(children.get(0))+node.getName()+nodeToString(children.get(1))+")");
}
}
}
The DynamicNode class looks like this:
public class DynamicNode implements INode {
@Override
public <N extends INode> N calc() {
Double value = calculateValue();
return (N) new ValueNode<Double>(value);
}
@Override
public List<? extends INode> getChildren() {
return null;
}
@Override
public String getName() {
return null;
}
private Double calculateValue() {
return null;
}
}
The important part is the nodeToString() function, where I generate an arithmetic formula, represented by the returned string, from a given root node. The ValueNode is a leaf of the tree with a constant value, which is returned as a String.
Other nodes (only add nodes in my case) call the function recursively for each child, put brackets around the expression, and print the operator (returned by the getName() function) between the two children (in short: "(leftChild+rightChild)").
The body of the calculateValue() function is replaced in the interceptMethod() function by Javassist, so that it returns the result of the generated formula.
Byte Buddy Attempt
I have played around with Byte Buddy to achieve a similar solution. But as I looked deeper into the concepts and the documentation, I felt more and more like this is not a problem Byte Buddy was designed for. The majority of examples and questions seem to concentrate on the function delegation to other functions (which actually exist already at compile time, and are only connected to at run time). This is not really convenient in my case, since I have no way of knowing the actual tree I want to convert, at compile time. It is probably possible to use the underlying ASM library, but I would like to avoid handling byte code by myself (and possible successors of mine).
I have a (obviously not working) basic implementation, but I am stuck at the point where I have to provide an Implementation for the intercept() function of the Byte Buddy library. My last state looks like this:
public class ByteBuddyNodeFactory{
@Override
public DynamicNode generateDynamicNode(INode root) {
DynamicNode dynamicNode = null;
try {
Class<?> dynamicType = new ByteBuddy()
.subclass(DynamicNode.class)
.name("DN"+ UUID.randomUUID().toString().replace("-",""))
//this is the point where I have problems
//I don't know how to generate the Implementation for the intercept() function
//An attempt is in the nodeToImplementation() function
.method(ElementMatchers.named("calculateValue")).intercept(nodeToImplementation(root))
.make()
.load(Object.class.getClassLoader())
.getLoaded();
dynamicNode = (DynamicNode) dynamicType.newInstance();
} catch (Exception e) {
System.out.println("Error compiling testclass with bytebuddy: " + e.getMessage());
e.printStackTrace();
}
return dynamicNode;
}
private Implementation.Composable nodeToImplementation(INode node){
if (node.getName().equals("")){
return (Implementation.Composable)FixedValue.value(((ValueNode)node).getValue());
}else{
List<? extends INode> children = node.getChildren();
assert(children.size()==2);
switch (node.getName()){
case ("+"):
//This is the point where I am completely lost
//This return is just the last thing I tried and may be not even correct Java code
// But hopefully at least my intention gets clearer
return (MethodCall.invoke((Method sdjk)-> {
return (nodeToImplementation(children.get(0)).andThen(node.getName().andThen(nodeToImplementation(children.get(1)))));
}));
default:
throw new NotImplementedException();
}
}
}
}
My idea was to concatenate subfunctions together and therefore tried to work with the Composable Implementation. I tried to return a MethodDelegation but as I mentioned I got the feeling that this wouldn't be the right approach. After that I tried MethodCall but I soon realized that I have exactly no idea how to make things work with this one either^^
Question
Is it possible in Byte Buddy to generate a function from a tree structure as dynamically as in Javassist, without calling as many sub functions as I have nodes?
How would I do this, if possible?
And if it is not possible: is it possible with other byte code manipulation libraries like cglib?
I would prefer to stay an abstraction level above byte code, since the study of the underlying concepts should be irrelevant for my problem.
What you are trying to do is not easily possible with Byte Buddy's high-level APIs. Instead, if you want to use Byte Buddy, you should assemble the method from StackManipulations. Stack manipulations still contain Java byte code, but these bits should be trivial enough to be easy to implement.
The reason that Byte Buddy does not aim for this scenario is that you can normally find a better abstraction for your code than assembling code snippets. Why can't your nodes contain the actual implementation, which is then called from your instrumented method? The JIT compiler typically optimizes this code to the same result as your manually inlined code. Additionally, you preserve debuggability and reduce the complexity of your code.
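To illustrate the "let the nodes carry the implementation" idea, here is a minimal sketch in plain Java with no code generation at all. The `ExprNode`, `ConstNode` and `BinaryNode` types are hypothetical and simpler than the question's `INode`:

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical minimal node type; each node knows how to evaluate itself.
interface ExprNode {
    double eval();
}

// Leaf holding a constant value.
class ConstNode implements ExprNode {
    private final double value;

    ConstNode(double value) {
        this.value = value;
    }

    @Override
    public double eval() {
        return value;
    }
}

// Inner node that combines its two children with a binary operator.
class BinaryNode implements ExprNode {
    private final ExprNode left;
    private final ExprNode right;
    private final DoubleBinaryOperator op;

    BinaryNode(ExprNode left, ExprNode right, DoubleBinaryOperator op) {
        this.left = left;
        this.right = right;
        this.op = op;
    }

    @Override
    public double eval() {
        return op.applyAsDouble(left.eval(), right.eval());
    }
}
```

A tree such as `new BinaryNode(new ConstNode(1), new ConstNode(2), Double::sum)` is then evaluated by calling `eval()` on the root. Whether the JIT inlines this as well as generated code would is something to measure for the actual workload.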

Java: How/Should I optimize a method with multiple IF statements?

The problem is less generic than the subject suggests. Here I have the Builder pattern for the user's convenience and a method with multiple IFs. However, each IF statement is a condition on one of the object's non-final fields. There are no assignments to these fields within the body of the method under consideration, and no setters are provided by the class's API. Example:
public class MyFormatter {
public static class Builder {
private final boolean notOptional; // don't mind this one, just the Builder pattern
private boolean optionalA, optionalB, optionalC; // these would matter further
public Builder optionalA() { optionalA = true; return this; }
public Builder optionalB() { optionalB = true; return this; }
public Builder optionalC() { optionalC = true; return this; }
public Builder(boolean notOptional) {
this.notOptional = notOptional;
}
public MyFormatter build() {
MyFormatter formatter = new MyFormatter(notOptional);
formatter.optionalA = optionalA;
formatter.optionalB = optionalB;
formatter.optionalC = optionalC;
return formatter;
}
}
private final boolean notOptional;
private boolean optionalA, optionalB, optionalC; // Not final
private MyFormatter(boolean notOptional) {
this.notOptional = notOptional;
}
protected String publish(String msg) {
StringBuilder sb = new StringBuilder();
// Here we go: a lot of IFs, though conditions "effectively never" change
if (optionalA) {
sb.append("something");
}
if (optionalB) {
sb.append("something else");
}
if (optionalC) {
sb.append("and something else");
}
return sb.toString();
}
}
OK, now the questions are how much the JIT compiler can do to optimize this code, and whether there is anything I can do to optimize it (some lazy initialization etc.).
P.S. (Harder question) Imagine this code being translated into JavaScript (by GWT), i.e. no JVM would be involved in executing or optimizing this method. What can a programmer do in this case to improve the performance?
It is absolutely crucial for dev to see the real-time dynamics and each millisecond matter a lot.
That's it. Unless your devs can read many thousand messages per second, you're fine. The cost of
if (optionalA) {
sb.append("something");
}
consists of two parts: the conditional branch and the appending.
A mispredicted branch takes 10-20 cycles, i.e., up to 20 / 3 nanoseconds on a 3 GHz CPU. A correctly predicted branch is essentially free, and because the boolean is constant and the code is hot, you can assume that it is predicted correctly.
Depending on the length of "something", the appending may be more costly, but no details are given, so there's nothing to optimize.
I don't think the JIT will find anything to optimize here. You could size your StringBuilder to gain a bit.
All in all, it is premature optimization.
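For what the StringBuilder sizing hint could look like, here is a rough sketch. `PresizedFormatter` is a hypothetical simplification of MyFormatter (final flags, no builder, no msg parameter); since the flags never change, the capacity is computed once in the constructor:

```java
class PresizedFormatter {
    private static final String A = "something";
    private static final String B = "something else";
    private static final String C = "and something else";

    private final boolean optionalA;
    private final boolean optionalB;
    private final boolean optionalC;
    // Exact output length, computed once because the flags are fixed.
    private final int capacity;

    PresizedFormatter(boolean optionalA, boolean optionalB, boolean optionalC) {
        this.optionalA = optionalA;
        this.optionalB = optionalB;
        this.optionalC = optionalC;
        this.capacity = (optionalA ? A.length() : 0)
                + (optionalB ? B.length() : 0)
                + (optionalC ? C.length() : 0);
    }

    String publish() {
        // Presizing avoids growing the internal buffer while appending.
        StringBuilder sb = new StringBuilder(capacity);
        if (optionalA) {
            sb.append(A);
        }
        if (optionalB) {
            sb.append(B);
        }
        if (optionalC) {
            sb.append(C);
        }
        return sb.toString();
    }
}
```

Any gain from this is likely to be small, in line with the answer's point that the method is not worth optimizing without measurements.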
Imagine this code being translated in JavaScript (by GWT)
Modern browsers have an advanced JIT just like Java does. Because JavaScript is dynamically typed, it can't be as fast, but it comes pretty close.
Measure before optimizing, so you don't spend your time where the CPU does not.

Security - Array is stored directly - String [][] [duplicate]

There is a Sonar Violation:
Sonar Violation: Security - Array is stored directly
public void setMyArray(String[] myArray) {
this.myArray = myArray;
}
Solution:
public void setMyArray(String[] newMyArray) {
if(newMyArray == null) {
this.myArray = new String[0];
} else {
this.myArray = Arrays.copyOf(newMyArray, newMyArray.length);
}
}
But I wonder why?
It's complaining that the array you're storing is the same array that is held by the caller. That is, if the caller subsequently modifies this array, the array stored in the object (and hence the object itself) will change.
The solution is to make a copy within the object when it gets passed. This is called defensive copying. A subsequent modification of the collection won't affect the array stored within the object.
It's also good practice to normally do this when returning a collection (e.g. in a corresponding getMyArray() call). Otherwise the receiver could perform a modification and affect the stored instance.
Note that this obviously applies to all mutable collections (and in fact all mutable objects) - not just arrays. Note also that this has a performance impact which needs to be assessed alongside other concerns.
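The aliasing effect described above can be seen in a small sketch. The `Holder` class and its method names are hypothetical and only serve to contrast the two setters:

```java
import java.util.Arrays;

class Holder {
    private String[] direct = new String[0];
    private String[] copied = new String[0];

    // Stores the caller's array itself: later changes by the caller leak in.
    void setDirect(String[] values) {
        this.direct = values;
    }

    // Defensive copy on the way in.
    void setCopied(String[] values) {
        this.copied = (values == null) ? new String[0] : Arrays.copyOf(values, values.length);
    }

    // Defensive copy on the way out.
    String[] getCopied() {
        return copied.clone();
    }

    String firstDirect() {
        return direct[0];
    }

    String firstCopied() {
        return copied[0];
    }
}
```

If a caller passes the same array to both setters and then overwrites its first element, `firstDirect()` reflects the change while `firstCopied()` still returns the original value; likewise, writing into the array returned by `getCopied()` leaves the stored array untouched.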
It's called defensive copying. A nice article on the topic is "Whose object is it, anyway?" by Brian Goetz, which discusses the difference between value and reference semantics for getters and setters.
Basically, the risk with reference semantics (without a copy) is that you erroneously think you own the array, and when you modify it, you also modify other structures that have aliases to the array. You can find plenty of information about defensive copying and problems related to object aliasing online.
I had the same issue:
Security - Array is stored directly: The user-supplied array 'palomitas' is stored directly.
My original method:
public void setCheck(boolean[] palomitas) {
this.check=palomitas;
}
The fixed version:
public void setCheck(boolean[] palomitas) {
if(palomitas == null) {
this.check = new boolean[0];
} else {
this.check = Arrays.copyOf(palomitas, palomitas.length);
}
}
Other Example:
Security - Array is stored directly The user-supplied array
private String[] arrString;
public ListaJorgeAdapter(String[] stringArg) {
arrString = stringArg;
}
Fixed:
public ListaJorgeAdapter(String[] stringArg) {
if(stringArg == null) {
this.arrString = new String[0];
} else {
this.arrString = Arrays.copyOf(stringArg, stringArg.length);
}
}
To eliminate them, you have to clone the array before storing or returning it, as shown in the following class implementation, so that no one can modify or get the original data of your class, only a copy of it.
public byte[] getArrString() {
return arrString.clone();
}
/**
* @param arrString the arrString to set
*/
public void setArrString(byte[] arrString) {
this.arrString = arrString.clone();
}
I used it like this and now I am not getting any Sonar violation.
It's easier than all of this. You only need to rename the method parameter to something else to avoid the Sonar violation.
http://osdir.com/ml/java-sonar-general/2012-01/msg00223.html
public void setInventoryClassId(String[] newInventoryClassId)
{
if(newInventoryClassId == null)
{
this.inventoryClassId = new String[0];
}
else
{
this.inventoryClassId = Arrays.copyOf(newInventoryClassId, newInventoryClassId.length);
}
}
Going the defensive-implementation way can save you a lot of time.
Guava offers another nice solution for reaching this goal: immutable collections
http://code.google.com/p/guava-libraries/wiki/ImmutableCollectionsExplained
There are certain cases where storing the array directly is a deliberate design decision and not an oversight. In these cases, you need to modify the Sonar rules to exclude it so that it doesn't show up as an issue in the report.

Saving on Instance Variables

Our server recently has been going down a lot and I was tasked to improve the memory usage of a set of classes that was identified to be the culprit.
I have code which initializes an instance of an object and goes like this:
boolean var1;
boolean var2;
.
.
.
boolean var100;
void setup() {
var1 = map.hasFlag("var1");
var2 = map.hasFlag("var2");
.
.
.
if (map.hasFlag("some flag")) {
doSomething();
}
if (var1) {
// increment something
}
if (var2) {
// increment something
}
}
The setup code takes about 1300 lines. My question is whether this method can be made more efficient, given that it uses so many instance variables.
The instance variables by the way are used in a "main" method handleRow() where for example:
handleRow(){
if (var1) {
doSomething();
}
.
.
.
if (var100) {
doSomething();
}
}
One solution I am thinking is to change the implementation by removing the instance variables in the setup method and just calling it directly from the map when I need it:
handleRow(){
if (map.hasFlag("var1")) {
doSomething();
}
.
.
.
if (map.hasFlag("var100")) {
doSomething();
}
}
That's one solution I am considering, but I would like to hear the community's input. :)
If these are really all boolean variables, consider using a BitSet instead. You may find that reduces the memory footprint by a factor of 8 or possibly even 32 depending on padding.
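A minimal sketch of the BitSet idea, assuming hypothetical index constants in place of the individual boolean fields (`RowFlags`, `VAR1`, `VAR2` and `FLAG_COUNT` are illustrative names, not from the question):

```java
import java.util.BitSet;

class RowFlags {
    // Hypothetical indices; the real code would define one per former boolean field.
    static final int VAR1 = 0;
    static final int VAR2 = 1;
    static final int FLAG_COUNT = 100;

    // One bit per flag instead of one boolean field per flag.
    private final BitSet flags = new BitSet(FLAG_COUNT);

    void set(int index, boolean value) {
        flags.set(index, value);
    }

    boolean has(int index) {
        return flags.get(index);
    }
}
```

In setup() this would read `flags.set(RowFlags.VAR1, map.hasFlag("var1"))`, and in handleRow() `if (flags.has(RowFlags.VAR1)) { ... }`. The BitSet object has its own overhead, so the actual saving depends on the JVM's object layout and is worth measuring.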
Even if every boolean took 16 bytes including overhead (which is a bit much, IMHO), 100 boolean variables would take only 1.6 kB of memory. I do not think this is the source of the problem.
Replacing these flags with calls into the map will negatively impact performance, so your change will probably make things worse.
Before you go redesigning your code (a command pattern looks like a good candidate) you should look further into where the memory leak is that you are asked to solve.
Look for maps that the classes keep adding to, collections that are static variables etc. Once you find out where the reason for the memory growth lies you can decide which part of your classes to refactor.
You could save memory at the cost of time (but if your memory use is a real problem, then it's probably a net gain in time) by storing the values in a bitset.
If the class is immutable (once you create it, you never change it), then you can perhaps gain by using a variant of the Flyweight pattern. Here you keep a store of in-use objects in a weak hash map and create your objects in a factory. If you create an object that is identical to an existing object, your factory returns the existing object instead. The saving in memory can be negligible or massive, depending on how many repeated objects there are.
If the class is not immutable, but there is such repetition, you can still use the Flyweight pattern, but you will have to do a sort of copy-on-write, where altering an object makes it change from using a shared internal representation to one of its own (or a new one from the flyweight store). This is yet more complicated and yet more expensive in terms of time, but again, if it's appropriate, the savings can be great.
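The immutable Flyweight variant could be sketched roughly as follows. `Settings` is a hypothetical immutable class with two flags; its equals() and hashCode() define what "identical" means for the factory:

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Objects;
import java.util.WeakHashMap;

final class Settings {
    // WeakHashMap holds its keys weakly. The value is wrapped in a
    // WeakReference so that it does not keep its own key strongly reachable.
    private static final Map<Settings, WeakReference<Settings>> CACHE = new WeakHashMap<>();

    private final boolean a;
    private final boolean b;

    private Settings(boolean a, boolean b) {
        this.a = a;
        this.b = b;
    }

    // Factory: returns an existing equal instance if one is still in use.
    static synchronized Settings of(boolean a, boolean b) {
        Settings candidate = new Settings(a, b);
        WeakReference<Settings> ref = CACHE.get(candidate);
        Settings cached = (ref == null) ? null : ref.get();
        if (cached != null) {
            return cached;
        }
        CACHE.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Settings)) {
            return false;
        }
        Settings other = (Settings) o;
        return a == other.a && b == other.b;
    }

    @Override
    public int hashCode() {
        return Objects.hash(a, b);
    }
}
```

Two calls to `Settings.of(true, false)` return the same instance as long as the first one is still referenced somewhere; once no caller holds it, the entry can be collected. The sketch still allocates a short-lived candidate per call, which a real implementation might avoid with a separate key type.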
You can use the command pattern:
import java.util.HashMap;
import java.util.Map;

public enum Command {
    SAMPLE_FLAG1("FLAG1") {
        @Override
        public void call() {
            // Do your increment here
        }
    },
    SAMPLE_FLAG2("FLAG2") {
        @Override
        public void call() {
            // Do your increment here
        }
    };

    // Lookup table from flag name to command, filled once when the enum is loaded
    private static final Map<String, Command> commands = new HashMap<String, Command>();
    static {
        for (Command cmd : Command.values()) {
            commands.put(cmd.name, cmd);
        }
    }

    private final String name;

    private Command(String name) {
        this.name = name;
    }

    // Returns null if no command is registered under the given flag name
    public static Command fromString(String cmd) {
        return commands.get(cmd);
    }

    public abstract void call();
}
and then:
for (String flag : flagMap.keySet()) {
    Command.fromString(flag).call();
}
