Rete object not freeing Values after reset - java

I am using Jess together with a FixedThreadPool to create several Rete engines that evaluate the performance of a system in parallel. Each Rete engine runs independently of the others; it takes as input a Java object containing the design of the system and outputs another Java object containing its performance metrics.
Before evaluating each system, I reset the Rete engines to their original state. However, as my program runs, memory usage keeps growing, with more and more jess.Value objects being retained.
This is the class that I use to interface Jess with Java:
public class Variant {
private final Object value;
private final String type;
public Variant(Object value) {
this.value = cast2JavaObject(value);
this.type = (this.value instanceof List) ? "multislot" : "slot";
}
public Object getValue() {
return value;
}
public String getType() {
return type;
}
private Object cast2JavaObject(Object value) {
try {
if (value instanceof Value) {
return castJessValue((Value) value);
} else {
return value;
}
} catch (Exception e) {
System.out.println(e.getMessage());
e.printStackTrace();
return null;
}
}
private synchronized Object castJessValue(Value value) throws Exception {
if (value.type() == RU.LIST) {
List list = new ArrayList<Object>();
list.addAll(Arrays.asList((Object[]) RU.valueToObject(Object.class, value, null)));
return list;
} else {
return RU.valueToObject(Object.class, value, null);
}
}
public Value toJessValue() throws Exception {
Object val;
if (value instanceof List) {
val = ((List) value).toArray();
} else {
val = value;
}
return RU.objectToValue(val.getClass(), val);
}
}
Is it possible that the Object contained within the Variant is pointing to the contents of a jess.Value and therefore they are not being collected by the GC when I call rete.reset()?

I think this would be possible if the object passed to the constructor (be it a jess.Value or a plain POJO) references one or more jess.Value objects. Neither your cast2JavaObject nor RU.valueToObject is recursive or introspective, so nested values are not unwrapped.
However, what if there were jess.Value objects contained? They are merely wrappers around Java objects, and even if they were unwrapped, the heap would still fill up with the bare objects alone, just more slowly.
If you use store/fetch I'd also call clearStorage in addition to reset.
I suggest an experiment to narrow the OOM problem down. Rather than reset, recreate the Rete object. If the problem persists, I daresay it is in some other nook or cranny of your application.


Java: alternative to null for special meaning

In Java, every variable of an Object-derived type can be an instance of that type OR null, as far as I know.
Is there an alternative? e.g. not an instance but also not null?
I need to represent a special state.
e.g. use it as a parameter in a search-function that can represent a regular value, a null for "empty" or a wildcard for "anything".
Integer n;
n = null; // empty
n = new Integer (11); // regular value
n = ???? // wildcard
search (some_list, n);
The type in the sample is Integer. But it should be generic. So no Integer.MAX would be of help.
I want to do it without any added "flag-variables" - if possible.
There's no way to assign anything that's not either a null value or an instance of the correct type to a reference variable.
But there are ways to simulate that.
Take a look at Optional, it provides an object that can be either absent (very roughly equivalent to null) or present (and have an actual value).
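As a small illustration of the two states Optional models, here is a minimal sketch. It assumes Java 8 or later; the class name OptionalDemo and the describe helper are made up for this example.

```java
import java.util.Optional;

class OptionalDemo {
    // Hypothetical helper: turns the two Optional states into a description.
    static String describe(Optional<Integer> n) {
        return n.map(v -> "value " + v).orElse("empty");
    }

    public static void main(String[] args) {
        System.out.println(describe(Optional.of(11)));   // prints "value 11"
        System.out.println(describe(Optional.empty()));  // prints "empty"
    }
}
```

Optional only distinguishes present from absent, so it has no room for a third "wildcard" state.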
You could do something similar, but with 3 states by creating your own class, let's call it SearchValue:
public class SearchValue<T> {
private final T value;
private final boolean missing;
private final boolean wildcard;
private SearchValue(T value) {
this.value = value;
this.missing = false;
this.wildcard = false;
}
private SearchValue(boolean isMissing) {
this.value = null;
this.missing = isMissing;
this.wildcard = !isMissing;
}
public static <T> SearchValue<T> of(T value) {
return new SearchValue<>(value);
}
public static <T> SearchValue<T> missing() {
return new SearchValue<>(true);
}
public static <T> SearchValue<T> wildcard() {
return new SearchValue<>(false);
}
public T getValue() {
if (value == null) {
throw new IllegalStateException("no value specified");
}
return value;
}
public boolean isValue() {
return value != null;
}
public boolean isMissing() {
return missing;
}
public boolean isWildcard() {
return wildcard;
}
}
Any SearchValue instance will return true on exactly one of isValue, isMissing or isWildcard (and only return successfully from getValue() when isValue() returns true).
Note that this can definitely be optimized (by reducing the flags to one field and/or making sure that there's only ever one missing or wildcard instance, since they are interchangeable), but the general principle should be clear.
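One way that optimization could look is sketched below. The names CompactSearchValue and Kind are invented for this example, and rejecting null in of() is an extra assumption so that every instance is in exactly one state.

```java
import java.util.Objects;

final class CompactSearchValue<T> {
    // Hypothetical state tag that replaces the two boolean flags.
    private enum Kind { VALUE, MISSING, WILDCARD }

    // One shared instance per special state; they carry no value, so sharing is safe.
    private static final CompactSearchValue<?> MISSING =
            new CompactSearchValue<Object>(Kind.MISSING, null);
    private static final CompactSearchValue<?> WILDCARD =
            new CompactSearchValue<Object>(Kind.WILDCARD, null);

    private final Kind kind;
    private final T value;

    private CompactSearchValue(Kind kind, T value) {
        this.kind = kind;
        this.value = value;
    }

    static <T> CompactSearchValue<T> of(T value) {
        // Assumption: a regular value is never null; use missing() for "empty".
        return new CompactSearchValue<T>(Kind.VALUE, Objects.requireNonNull(value, "value"));
    }

    @SuppressWarnings("unchecked")
    static <T> CompactSearchValue<T> missing() {
        return (CompactSearchValue<T>) MISSING;
    }

    @SuppressWarnings("unchecked")
    static <T> CompactSearchValue<T> wildcard() {
        return (CompactSearchValue<T>) WILDCARD;
    }

    T getValue() {
        if (kind != Kind.VALUE) {
            throw new IllegalStateException("no value specified");
        }
        return value;
    }

    boolean isValue() { return kind == Kind.VALUE; }
    boolean isMissing() { return kind == Kind.MISSING; }
    boolean isWildcard() { return kind == Kind.WILDCARD; }
}
```

Because missing() and wildcard() always return the same instances, callers may also compare them by identity.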
No, a variable is either null, or points to an object of an appropriate type, there is no other possibility. However, you can achieve your goal by encapsulating your search term, e.g.
public class SearchTerm<T> {
private final T value;
public static final SearchTerm WILDCARD = new SearchTerm<Object>(new Object());
public SearchTerm(T value) {
this.value = value;
}
public T getValue() {
return this.value;
}
}
To check what type of SearchTerm an instance represents:
void doSearch(SearchTerm<String> searchTerm) {
if (searchTerm == SearchTerm.WILDCARD) {
// do a wildcard search
} else if (searchTerm.getValue() == null) {
// do whatever type of search this represents
} else {
// search for items that match this term
String searchTermValue = searchTerm.getValue();
}
}

Lazy initialization / memoization without volatile

It appears the Java Memory Model does not define "refreshing" and "flushing" of the local cache, instead people only call it that way for simplicity, but actually the "happens-before" relationship implies refreshing and flushing somehow (would be great if you can explain that, but not directly part of the question).
This is getting me really confused combined with the fact that the section about the Java Memory Model in the JLS is not written in a way which makes it easy to understand.
Therefore could you please tell me if the assumptions I made in the following code are correct and if it is therefore guaranteed to run correctly?
It is partially based on the code provided in the Wikipedia article on Double-checked locking, however there the author used a wrapper class (FinalWrapper), but the reason for this is not entirely obvious to me. Maybe to support null values?
public class Memoized<T> {
private T value;
private volatile boolean _volatile;
private final Supplier<T> supplier;
public Memoized(Supplier<T> supplier) {
this.supplier = supplier;
}
public T get() {
/* Apparently have to use local variable here, otherwise return might use older value
* see https://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html
*/
T tempValue = value;
if (tempValue == null) {
// Refresh
if (_volatile);
tempValue = value;
if (tempValue == null) {
// Entering refreshes, or have to use `if (_volatile)` again?
synchronized (this) {
tempValue = value;
if (tempValue == null) {
value = tempValue = supplier.get();
}
/*
* Exit should flush changes
* "Flushing" does not actually exist, maybe have to use
* `_volatile = true` instead to establish happens-before?
*/
}
}
}
return tempValue;
}
}
Also I have read that the constructor call can be inlined and reordered resulting in a reference to an uninitialized object (see this comment on a blog). Is it then safe to directly assign the result of the supplier or does this have to be done in two steps?
value = tempValue = supplier.get();
Two steps:
tempValue = supplier.get();
// Reorder barrier, maybe not needed?
if (_volatile);
value = tempValue;
Edit: The title of this question is a little bit misleading, the goal was to have reduced usage of a volatile field. If the initialized value is already in the cache of a thread, then value is directly accessed without the need to look in the main memory again.
You can reduce usage of volatile if you have only a few singletons. Note: you have to repeat this code for each singleton.
enum LazyX {
;
static volatile Supplier<X> xSupplier; // set somewhere before use
static class Holder {
static final X x = xSupplier.get();
}
public static X get() {
return Holder.x;
}
}
If you know the Supplier, this becomes simpler
enum LazyXpensive {
;
// called only once in a thread safe manner
static final Xpensive x = new Xpensive();
// after class initialisation, this is a non volatile read
public static Xpensive get() {
return x;
}
}
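The same idea is often written with a nested holder class instead of an enum. The sketch below is self-contained; Expensive, its creation counter, and LazyExpensive are hypothetical names used only to make the laziness observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

class Expensive {
    // Counts constructions so the single, lazy initialization can be observed.
    static final AtomicInteger created = new AtomicInteger();

    Expensive() {
        created.incrementAndGet();
    }
}

class LazyExpensive {
    private LazyExpensive() {}

    // The JVM initializes Holder on the first access to Holder.INSTANCE,
    // and class initialization is guaranteed to be thread safe.
    private static class Holder {
        static final Expensive INSTANCE = new Expensive();
    }

    // After Holder is initialized, this is a plain read of a static final field.
    static Expensive get() {
        return Holder.INSTANCE;
    }
}
```

Nothing is constructed until get() is called for the first time, and every later call returns the same instance.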
You can avoid making the field volatile by using Unsafe
import sun.misc.Unsafe;
import java.lang.reflect.Field;
import java.util.function.Supplier;
public class LazyHolder<T> {
static final Unsafe unsafe = getUnsafe();
static final long valueOffset = getValueOffset();
Supplier<T> supplier;
T value;
public T get() {
T value = this.value;
if (value != null) return value;
return getOrCreate();
}
private T getOrCreate() {
T value;
value = (T) unsafe.getObjectVolatile(this, valueOffset);
if (value != null) return value;
synchronized (this) {
value = this.value;
if (value != null) return value;
this.value = supplier.get();
supplier = null;
return this.value;
}
}
public static Unsafe getUnsafe() {
try {
Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
theUnsafe.setAccessible(true);
return (Unsafe) theUnsafe.get(null);
} catch (NoSuchFieldException | IllegalAccessException e) {
throw new AssertionError(e);
}
}
private static long getValueOffset() {
try {
return unsafe.objectFieldOffset(LazyHolder.class.getDeclaredField("value"));
} catch (NoSuchFieldException e) {
throw new AssertionError(e);
}
}
}
However, having the extra look up is a micro optimisation. If you are willing to take a synchronisation hit once per thread, you can avoid using volatile at all.
Your code is not thread safe, which can easily be shown by stripping off all irrelevant parts:
public class Memoized<T> {
private T value;
// irrelevant parts omitted
public T get() {
T tempValue = value;
if (tempValue == null) {
// irrelevant parts omitted
}
return tempValue;
}
}
So value has no volatile modifier and you’re reading it within the get() method without synchronization and when non-null, proceed using it without any synchronization.
This code path alone is already making the code broken, regardless of what you are doing when assigning value, as all thread safe constructs require both ends, reading and writing sides, to use a compatible synchronization mechanism.
The fact that you are using esoteric constructs like if (_volatile); becomes irrelevant then, as the code is already broken.
The reason why the Wikipedia example uses a wrapper with a final field is that immutable objects using only final fields are immune to data races and hence, the only construct that is safe when reading its reference without a synchronization action.
Note that since lambda expressions fall into the same category, you can use them to simplify the example for your use case:
public class Memoized<T> {
private boolean initialized;
private Supplier<T> supplier;
public Memoized(Supplier<T> supplier) {
this.supplier = () -> {
synchronized(this) {
if(!initialized) {
T value = supplier.get();
this.supplier = () -> value;
initialized = true;
}
}
return this.supplier.get();
};
}
public T get() {
return supplier.get();
}
}
Here, supplier.get() within Memoized.get() may read an updated value of supplier without synchronization action, in which case it will read the correct value, because it is implicitly final. If the method reads an outdated value for the supplier reference, it will end up at the synchronized(this) block which uses the initialized flag to determine whether the evaluation of the original supplier is necessary.
Since the initialized field will only be accessed within the synchronized(this) block, it will always evaluate to the correct value. This block will be executed at most once for every thread, whereas only the first one will evaluate get() on the original supplier. Afterwards, each thread will use the () -> value supplier, returning the value without needing any synchronization actions.
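To see the "only the first one will evaluate the original supplier" behaviour in action, the class can be exercised from several threads. This is just a sketch: Memoized is repeated from above so the example compiles on its own, while MemoizedDemo and countSupplierCalls are names invented for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class Memoized<T> {
    private boolean initialized;
    private Supplier<T> supplier;

    Memoized(Supplier<T> supplier) {
        this.supplier = () -> {
            synchronized (this) {
                if (!initialized) {
                    T value = supplier.get();       // the original supplier (constructor parameter)
                    this.supplier = () -> value;    // replace with a constant supplier
                    initialized = true;
                }
            }
            return this.supplier.get();
        };
    }

    T get() {
        return supplier.get();
    }
}

class MemoizedDemo {
    // Starts `threads` threads that call get() at the same time and
    // returns how many times the original supplier was evaluated.
    static int countSupplierCalls(int threads) {
        AtomicInteger calls = new AtomicInteger();
        Memoized<String> memo = new Memoized<>(() -> {
            calls.incrementAndGet();
            return "computed";
        });
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();                  // line all threads up first
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                memo.get();
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }
}
```

However the threads interleave, the counter ends at 1, because the initialized flag is only ever read and written inside the synchronized block.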

ConcurrentHashMap dilemma in Java

ConcurrentHashMap provides a way to atomically check for and add an element if it is not present, via the putIfAbsent method, as shown in the example below:
xmlObject = new XMLObject(xmlId);
mapOfXMLs.putIfAbsent(xmlId, xmlObject);
However, my dilemma is that I have to create the xmlObject in advance. Is there a way to postpone the object creation until after the key-presence check?
I want all three of the following to happen atomically:
1. Check whether the key is present.
2. Create the object if the key is not present.
3. Add the object to the map.
I know I can achieve this with a synchronized block, but if I am using a synchronized block, why use a ConcurrentHashMap at all?
The Guava Caches offer such a functionality ( http://code.google.com/p/guava-libraries/wiki/CachesExplained ) though it's somewhat hidden.
If you can already use Java 8, then you can use computeIfAbsent. But I guess if you could use it, you would not have asked....
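For reference, a minimal sketch of the Java 8 route. XmlObject here is a stand-in for the asker's class with a String id, and XmlRegistry is a made-up wrapper; only ConcurrentHashMap.computeIfAbsent is the real API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class XmlObject {
    // Counts constructions so the example can show that no extra object is built.
    static final AtomicInteger constructed = new AtomicInteger();
    final String id;

    XmlObject(String id) {
        this.id = id;
        constructed.incrementAndGet();
    }
}

class XmlRegistry {
    private final ConcurrentHashMap<String, XmlObject> mapOfXMLs = new ConcurrentHashMap<>();

    XmlObject get(String xmlId) {
        // ConcurrentHashMap performs the check, the creation and the insertion
        // atomically; the mapping function runs at most once per absent key.
        return mapOfXMLs.computeIfAbsent(xmlId, XmlObject::new);
    }
}
```

The mapping function should be short and must not modify the same map, since other updates to that part of the map may block while it runs.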
The standard, almost perfect pattern is this:
Foo foo = map.get(key);
if(foo == null) {
map.putIfAbsent(key, new Foo());
foo = map.get(key);
}
It does sometimes result in an extra object, but extremely infrequently, so from a performance standpoint is certainly fine. It only wouldn't be fine if constructing your object inserted into a database or charged a user or some such.
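A variation on this pattern uses the return value of putIfAbsent instead of a second get. The sketch below is self-contained; Foo and FooCache are placeholder names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class Foo {}

class FooCache {
    private final ConcurrentMap<String, Foo> map = new ConcurrentHashMap<>();

    Foo getOrCreate(String key) {
        Foo foo = map.get(key);
        if (foo == null) {
            // May be wasted if another thread inserts first; that is the
            // "extra object" case described above.
            Foo created = new Foo();
            // putIfAbsent returns the previous value, or null if ours was stored.
            Foo previous = map.putIfAbsent(key, created);
            foo = (previous != null) ? previous : created;
        }
        return foo;
    }
}
```

Either way, all callers end up with the instance that actually made it into the map.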
I've encountered this scenario a couple of times, and they allowed for the value to be created lazily. It may not apply to your use case, but if it does, this is basically what I did:
static abstract class Lazy<T> {
private volatile T value;
protected abstract T initialValue();
public T get() {
T tmp = value;
if (tmp == null) {
synchronized (this) {
tmp = value;
if (tmp == null)
value = tmp = initialValue();
}
}
return tmp;
}
}
static ConcurrentHashMap<Integer, Lazy<XmlObject>> map = new ConcurrentHashMap<>();
and then populating the map:
final int id = 1;
map.putIfAbsent(id, new Lazy<XmlObject>() {
@Override
protected XmlObject initialValue() {
return new XmlObject(id);
}
});
System.out.println(map.get(id).get());
You can of course create a specialized LazyXmlObject for convenience:
static class LazyXmlObject extends Lazy<XmlObject> {
private final int id;
public LazyXmlObject(int id) {
super();
this.id = id;
}
@Override
protected XmlObject initialValue() {
return new XmlObject(id);
}
}
and the usage would be:
final int id = 1;
map.putIfAbsent(id, new LazyXmlObject(id));
System.out.println(map.get(id).get());

Efficiently access fields by name

A little background as to what I'm trying to achieve:
I'm parsing JSON (over 15GB) and must keep it in memory, so wrappers and extra data are not welcome. Because of the framework and the interfaces used within it, I must provide a way to access fields by name. By replacing some String with Enum, Integer with int, Double with double, etc., I'm able to shave about 90% off the memory footprint (in comparison with Jackson).
I'm looking to efficiently access the fields at runtime in Java by their name. I'm aware of reflection, but for my case its performance is simply unacceptable, so I don't want to use it.
If it makes the problem easier to solve I'm not too bothered about setting the fields values. I also know at compile time the names of supported fields.
I don't want to store everything in a map i.e. Map<String,Object> due to the memory footprint of boxed object, but I don't mind returning them in a boxed form.
I'm sure this problem was encountered by others and I'm interested in any clever solutions - cleverer than tons of if ... else ... statements.
Let's say the interface to implement is:
public interface Accessor {
Object get(String fieldName);
}
The Object returned by get can be of any type including enum. A naive implementation would be:
public class TestObject implements Accessor {
public enum MyEnum {ONE, TWO, THREE};
private final MyEnum myEnum;
private final int myInt;
private final double myDouble;
private final String myString;
public TestObject(MyEnum myEnum, int myInt, double myDouble, String myString) {
this.myEnum = myEnum;
this.myInt = myInt;
this.myDouble = myDouble;
this.myString = myString;
}
@Override
public Object get(String fieldName) {
if ("myEnum".equals(fieldName)) {
return myEnum;
} else if ("myInt".equals(fieldName)) {
return myInt;
} else if ("myDouble".equals(fieldName)) {
return myDouble;
} else if ("myString".equals(fieldName)) {
return myString;
} else {
throw new UnsupportedOperationException(); // Or could simply return null
}
}
}
What you want is a mapping from a fieldName to a value, the type of which is determined by the fieldName. You know the set of field names up-front, so this is an ideal task for an Enum.
If you don't like the idea of hard-coding each field as an enum, then the variation would be an enum-per-type (MY_FIELD1 becomes MY_ENUM), with a mapping from fieldName to this EnumType.
In the code below I'm making assumptions about the relationship between fieldName and TestObject. Specifically it looks like TestObject is presenting various types of the same value (surely where reasonable), as opposed to a separate value for each field name?
So, to the code:
Rewrite:
@Override
public Object get(String fieldName) {
MyField field = MyField.mapNameToField(fieldName);
if (field == null)
throw new UnsupportedOperationException(); // Or could simply return null
return field.getValue(this);
}
Given (something like):
enum MyField {
MY_FIELD1("myField1") {
public Object getValue(TestObject obj) { return obj.myEnum; }
},
MY_FIELD2("myField2") {
public Object getValue(TestObject obj) { return obj.myInt; }
},
...
;
public abstract Object getValue(TestObject obj);
// Declared before the static block: reading a static field by simple name
// ahead of its declaration is an illegal forward reference.
private static final Map<String, MyField> map = new HashMap<String, MyField>();
static {
for (MyField value : values()) {
map.put(value.getName(), value);
}
}
private final String name;
private MyField(String fieldName) { name = fieldName; }
public String getName() { return name; }
public static MyField mapNameToField(String name) { return map.get(name); }
}
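Put together for the TestObject from the question, the approach might look like the sketch below. It assumes the enum is nested inside TestObject so that it can read the private fields, and that the lookup keys are the real field names; both are choices made for this example.

```java
import java.util.HashMap;
import java.util.Map;

interface Accessor {
    Object get(String fieldName);
}

class TestObject implements Accessor {
    enum MyEnum { ONE, TWO, THREE }

    // One constant per supported field, each knowing how to read its own value.
    private enum MyField {
        MY_ENUM("myEnum") {
            Object getValue(TestObject obj) { return obj.myEnum; }
        },
        MY_INT("myInt") {
            Object getValue(TestObject obj) { return obj.myInt; }    // boxed on return
        },
        MY_DOUBLE("myDouble") {
            Object getValue(TestObject obj) { return obj.myDouble; } // boxed on return
        },
        MY_STRING("myString") {
            Object getValue(TestObject obj) { return obj.myString; }
        };

        // Built once; shared by every TestObject, so no per-instance overhead.
        private static final Map<String, MyField> BY_NAME = new HashMap<>();
        static {
            for (MyField field : values()) {
                BY_NAME.put(field.fieldName, field);
            }
        }

        private final String fieldName;

        MyField(String fieldName) {
            this.fieldName = fieldName;
        }

        abstract Object getValue(TestObject obj);

        static MyField byName(String name) {
            return BY_NAME.get(name);
        }
    }

    private final MyEnum myEnum;
    private final int myInt;
    private final double myDouble;
    private final String myString;

    TestObject(MyEnum myEnum, int myInt, double myDouble, String myString) {
        this.myEnum = myEnum;
        this.myInt = myInt;
        this.myDouble = myDouble;
        this.myString = myString;
    }

    @Override
    public Object get(String fieldName) {
        MyField field = MyField.byName(fieldName);
        if (field == null) {
            throw new UnsupportedOperationException("unknown field: " + fieldName);
        }
        return field.getValue(this);
    }
}
```

Each lookup costs one hash lookup plus one virtual call, and the primitive fields are only boxed when they are returned.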
I've never used this, but looks promising:
http://labs.carrotsearch.com/download/hppc/0.4.1/api/
"High Performance Primitive Collections (HPPC) library provides typical data structures (lists, stacks, maps) template-generated for all Java primitive types (byte, int, etc.) to conserve memory and boost performance."
In particular, the Object{Type}OpenHashMap classes might be what you're looking for:
ObjectByteOpenHashMap
ObjectCharOpenHashMap
ObjectDoubleOpenHashMap
ObjectFloatOpenHashMap
ObjectIntOpenHashMap
ObjectLongOpenHashMap
ObjectShortOpenHashMap
I imagine you would have all 7 of these defined as fields (or whatever subset of them you like), and you would probe each one in turn to see if the key was present for that type of primitive value. E.g.,
if (byteMap.containsKey(key)) {
return byteMap.lget(); // last value saved in a call to containsKey()
} else if (charMap.containsKey(key)) {
return charMap.lget();
} else {
// and so on...
}
Notice they have their own special lget() method call to optimize the containsKey() / get() usage pattern so typical with maps.

WeakMultiton: ensuring there's only one object for a specific database row

In my application I need to ensure that for an entity representing a data row in a database
I have at most one java object representing it.
Ensuring that they are equals() is not enough, since I could get caught by coherency problems.
So basically I need a multiton; moreover, I need not to keep this object in memory when it is not necessary, so I will be using weak references.
I have devised this solution:
package com.example;
public class DbEntity {
// a DbEntity holds a strong reference to its key, so as long as someone holds a
// reference to it the key won't be evicted from the WeakHashMap
private String key;
public void setKey(String key) {
this.key = key;
}
public String getKey() {
return key;
}
//other stuff that makes this object actually useful.
}
package com.example;
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;
import java.util.concurrent.locks.ReentrantLock;
public class WeakMultiton {
private ReentrantLock mapLock = new ReentrantLock();
private WeakHashMap<String, WeakReference<DbEntity>> entityMap = new WeakHashMap<String, WeakReference<DbEntity>>();
private void fill(String key, DbEntity object) throws Exception {
// do slow stuff, typically fetch data from DB and fill the object.
}
public DbEntity get(String key) throws Exception {
DbEntity result = null;
WeakReference<DbEntity> resultRef = entityMap.get(key);
if (resultRef != null){
result = resultRef.get();
}
if (result == null){
mapLock.lock();
try {
resultRef = entityMap.get(key);
if (resultRef != null){
result = resultRef.get();
}
if (result == null){
result = new DbEntity();
synchronized (result) {
// A DbEntity holds a strong reference to its key, so the key won't be evicted from the map
// as long as result is reachable.
entityMap.put(key, new WeakReference<DbEntity>(result));
// I unlock the map, but result is still locked.
// Keeping the map locked while querying the DB would serialize database calls!
// If someone tries to get the same DbEntity the method will wait to return until I get out of this synchronized block.
mapLock.unlock();
fill(key, result);
// I need the key to be exactly this String, not just an equal one!!
result.setKey(key);
}
}
} finally {
// I have to check since I could have already released the lock.
if (mapLock.isHeldByCurrentThread()){
mapLock.unlock();
}
}
}
// I synchronize on result since some other thread could have instantiated it but still being busy initializing it.
// A performance penalty, but still better than synchronizing on the whole map.
synchronized (result) {
return result;
}
}
}
WeakMultiton will be instantiated only in the database wrapper (single point of access to the database) and its get(String key) will of course be the only way to retrieve a DbEntity.
Now, to the best of my knowledge this should work, but since this stuff is pretty new to me, I fear I could be overlooking something about the synchronization or the weak references!
Can you spot any flaw or suggest improvements?
I found out about guava's MapMaker and wrote this generic AbstractWeakMultiton:
package com.example;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import com.google.common.collect.MapMaker;
public abstract class AbstractWeakMultiton<K,V, E extends Exception> {
private ReentrantLock mapLock = new ReentrantLock();
private Map<K, V> entityMap = new MapMaker().concurrencyLevel(1).weakValues().<K,V>makeMap();
protected abstract void fill(K key, V value) throws E;
protected abstract V instantiate(K key);
protected abstract boolean isNullObject(V value);
public V get(K key) throws E {
V result = null;
result = entityMap.get(key);
if (result == null){
mapLock.lock();
try {
result = entityMap.get(key);
if (result == null){
result = this.instantiate(key);
synchronized (result) {
entityMap.put(key, result);
// I unlock the map, but result is still locked.
// Keeping the map locked while querying the DB would serialize database calls!
// If someone tries to get the same object the method will wait to return until I get out of this synchronized block.
mapLock.unlock();
fill(key, result);
}
}
} finally {
// I have to check since the exception could have been thrown after I had already released the lock.
if (mapLock.isHeldByCurrentThread()){
mapLock.unlock();
}
}
}
// I synchronize on result since some other thread could have instantiated it but still being busy initializing it.
// A performance penalty, but still better than synchronizing on the whole map.
synchronized (result) {
// I couldn't have a null result because I needed to synchronize on it,
// so now I check whether it's a mock object and return null in case.
return isNullObject(result)?null:result;
}
}
}
It has the following advantages over my earlier attempt:
It does not depend on the fact that values hold a strong reference to the key
It does not need to do the awkward double checking for expired weak references
It is reusable
On the other hand, it depends on the rather beefy Guava library, while the first solution used just classes from the runtime environment. I can live with that.
I'm obviously still looking for further improvements and error spotting, and basically everything that answers the most important question: will it work?
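For comparison, a JDK-only variant is also conceivable. The following is only a sketch under several assumptions: Java 8 or later, a loader that does not throw checked exceptions and never returns null, and hypothetical names (WeakInterner, loader). It is not a drop-in replacement for the class above.

```java
import java.lang.ref.WeakReference;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class WeakInterner<K, V> {
    private final ConcurrentHashMap<K, WeakReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical: creates and fills the value

    WeakInterner(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // Keeps a strong reference to the result so it cannot be collected
        // between the map update and the return.
        Object[] holder = new Object[1];
        map.compute(key, (k, ref) -> {
            V current = (ref == null) ? null : ref.get();
            if (current != null) {
                holder[0] = current;
                return ref;                          // still alive: keep the mapping
            }
            V loaded = Objects.requireNonNull(loader.apply(k), "loader returned null");
            holder[0] = loaded;
            return new WeakReference<V>(loaded);     // absent or collected: replace
        });
        @SuppressWarnings("unchecked")
        V result = (V) holder[0];
        return result;
    }
}
```

Two caveats: the loader runs inside compute, so other updates that hit the same part of the map wait while it runs, and entries whose values have been collected stay in the map until the same key is requested again (cleaning them up would need a ReferenceQueue).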
