Many readers but not when writer available with HashMap java - java

I've crawled through many questions in this area, but my question still remains. I'm looking for a detailed answer as well (if you'd be so kind), so that I and the community can understand this more clearly.
This is my question. I have this map.
private static volatile Map<Integer, Type> types;
and have static getter as,
static Type getType(final int id)
{
if (types == null)
{
synchronized (CLASSNAME.class)
{
if (types == null)
{
types = new HashMap<Integer, Type>();
// ... add items to the map
}
}
}
return types.get(id);
}
The problem with this code is that the first thread assigns types before the map is filled, so it is no longer null. While the first thread is still adding values, a second thread can read from the map and see incomplete or corrupted data.
I see that this can be avoided by synchronizing the whole method, but then multiple concurrent readers are not possible. The map is constructed once and never modified afterwards, so concurrent readers are essential.
We could also use Collections.synchronizedMap, but if I'm correct it does not allow concurrent readers either. I tried ConcurrentHashMap, but it doesn't solve this either, maybe due to its independent partition locking behavior.
Simply put, what I need is no reading until the map is fully created, and after that multiple concurrent reads should be possible.
Does anyone have a solution?
Thanks.

There is a simple solution to your problem. Use a temporary variable, so that the reference types stays null as long as the map is not completely populated. If you change the code in that way, it is thread-safe and quite efficient, provided types stays declared volatile as in your question.
static Type getType(final int id) {
if (types == null) {
synchronized (CLASSNAME.class) {
if (types == null) {
HashMap<Integer, Type> temp = new HashMap<>();
// populate temp
types = temp;
}
}
}
return types.get(id);
}
Thread-safe, lazy and efficient initialization is a frequently required feature. Unfortunately, it's not directly supported by Java, neither by the programming language nor by the standard library. Instead, there are different patterns, and your implementation is known as Double-checked locking.
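Although there is no dedicated keyword for it, the JVM's class-initialization rules can be used to get lazy, thread-safe initialization without explicit locking. The following is a minimal sketch of the initialization-on-demand holder idiom; TypeRegistry, the small Type stand-in and the sample entries are hypothetical and not taken from the question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the question's Type class.
final class Type {
    final String name;

    Type(String name) {
        this.name = name;
    }
}

final class TypeRegistry {
    private TypeRegistry() {}

    // The JVM initializes Holder on the first access to Holder.TYPES,
    // exactly once, and publishes the result safely to all threads.
    private static final class Holder {
        static final Map<Integer, Type> TYPES = load();

        private static Map<Integer, Type> load() {
            Map<Integer, Type> m = new HashMap<>();
            m.put(1, new Type("first"));   // sample data, assumed
            m.put(2, new Type("second"));  // sample data, assumed
            return Collections.unmodifiableMap(m);
        }
    }

    // No locking needed: the map is never modified after initialization.
    static Type getType(int id) {
        return Holder.TYPES.get(id);
    }
}
```

This only fits when there is one map per class and populating it needs no arguments; double-checked locking remains the more flexible pattern.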
A short excursion to C++: C++11 has support for lazy, thread-safe initialization both in the language and in the library. If there is only one global type mapping, you can write the following in C++:
auto populated_map() -> std::map<int, type> // explicit return type; plain auto deduction needs C++14
{
std::map<int, type> result;
// ... populate map
return result;
}
auto get_type(int id) -> const type&
{
static const std::map<int, type> map = populated_map();
return map.find(id)->second;
}
If you need lazy initialization per object, you can use the library support around std::once_flag and std::call_once:
class types
{
private:
std::once_flag _flag;
std::map<int, type> _map;
public:
auto get_type(int id) -> const type&
{
std::call_once(_flag, [this] { _map = populated_map(); });
return _map.find(id)->second;
}
};

Take a look at the memoization pattern. There are specific implementations available in Java 8, but if you aren't adopting that soon, look at Guava's MapMaker, specifically:
private final ConcurrentMap<Integer, Type> types = new MapMaker()
        .makeComputingMap(new Function<Integer, Type>() {
            public Type apply(Integer key) {
                return loadForType(key);
            }
        });
In this case, no single thread is responsible for populating the whole map (although it may turn out that one thread does). The idea is that when a thread enters, it checks whether a value for the given Integer is available. If not, it runs the function once; if it is, it returns the value without blocking.
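In Java 8 and later, similar memoizing behaviour is available from the standard library through ConcurrentHashMap.computeIfAbsent. Below is a minimal sketch; TypeCache and loadForType are hypothetical stand-ins (the loader just builds a string), and the counter exists only to make the behaviour observable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class TypeCache {
    private final ConcurrentMap<Integer, String> types = new ConcurrentHashMap<>();

    // Counts loader invocations, only to make the behaviour observable.
    final AtomicInteger loads = new AtomicInteger();

    // Hypothetical expensive loader; a real one might hit a database.
    private String loadForType(int id) {
        loads.incrementAndGet();
        return "type-" + id;
    }

    String getType(int id) {
        // On ConcurrentHashMap the mapping function runs at most once per
        // absent key; later callers get the cached value.
        return types.computeIfAbsent(id, this::loadForType);
    }
}
```

Unlike the double-checked locking variant, this fills the map one key at a time on demand rather than all at once.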

Related

Omitting an instance field at run time in Java

Java's assert mechanism lets you write assertions that have essentially no run-time cost (aside from a bigger class file) when assertions are disabled. But this may not cover all situations.
For instance, many of Java's collections feature "fail-fast" iterators that attempt to detect when you're using them in a thread-unsafe way. But this requires both the collection and the iterator itself to maintain extra state that would not be needed if these checks weren't there.
Suppose someone wanted to do something similar, but allow the checks to be disabled and if they are disabled, it saves a few bytes in the iterator and likewise a few more bytes in the ArrayList, or whatever.
Alternatively, suppose we're doing some sort of object pooling that we want to be able to turn on and off at runtime; when it's off, it should just use Java's garbage collection and take no room for reference counts, like this (note that the code as written is very broken):
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
static {
if(useRefCounts)
int refCount; // want instance field, not local variable
}
void incrementRefCount(){
if(useRefCounts) refCount++; // only use field if it exists;
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount(){
// rely on Java's garbage collector if ref counting is disabled.
return useRefCounts && --refCount == 0;
}
}
The trouble with the above code is that the static block makes no sense. But is there some trick using low-powered magic to make something along these lines work? (If high-powered magic is allowed, the nuclear option is to generate two versions of MyClass and arrange to put the correct one on the class path at start time.)
NOTE: You might not need to do this at all. The JIT is very good at inlining constants known at runtime, especially booleans, and at optimising away code that isn't used.
The int field is not ideal; however, if you are using a 64-bit JVM, the object size might not change.
On the OpenJDK/Oracle JVM (64-bit), the object header is 12 bytes by default. Object alignment is 8 bytes, so an object without fields uses 16 bytes. The int field adds 4 bytes, giving 16 bytes, which is already aligned, so the size stays at 16 bytes.
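The arithmetic can be sketched as follows. The 12-byte header and 8-byte alignment are the figures quoted above (the 12-byte header applies when compressed class pointers are enabled); alignUp is a hypothetical helper written for this illustration, not a JVM API.

```java
final class ObjectSizeEstimate {
    static final int HEADER_BYTES = 12;    // figure quoted in the text
    static final int ALIGNMENT_BYTES = 8;  // default object alignment

    private ObjectSizeEstimate() {}

    // Rounds size up to the next multiple of ALIGNMENT_BYTES.
    static int alignUp(int size) {
        return (size + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Estimated size of an object with no instance fields.
    static int sizeWithoutField() {
        return alignUp(HEADER_BYTES);
    }

    // Estimated size of an object with one int instance field.
    static int sizeWithIntField() {
        return alignUp(HEADER_BYTES + Integer.BYTES);
    }
}
```

Both methods return 16 under these assumptions, which is why dropping a single int field may save nothing. A second int field would push the estimate to 24.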
To answer the question, you need two classes (unless you use generated code or hacks)
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
public static MyClass create() {
return useRefCounts ? new MyClassPlus() : new MyClass();
}
void incrementRefCount() {
}
boolean decrementAndTestRefCount() {
return false;
}
}
class MyClassPlus extends MyClass {
int refCount; // want instance field, not local variable
void incrementRefCount() {
refCount++; // only use field if it exists;
}
boolean decrementAndTestRefCount() {
return --refCount == 0;
}
}
If you accept a slightly higher overhead in the case you’re using your ref count, you may resort to external storage, i.e.
class MyClass {
static final WeakHashMap<MyClass,Integer> REF_COUNTS
= my.global.Utils.useRefCounts()? new WeakHashMap<>(): null;
void incrementRefCount() {
if(REF_COUNTS != null) REF_COUNTS.merge(this, 1, Integer::sum);
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount() {
return REF_COUNTS != null
&& REF_COUNTS.compute(this, (me, i) -> --i == 0? null: i) == null;
}
}
There is a behavioral difference for the case that someone invokes decrementAndTestRefCount() more often than incrementRefCount(). While your original code silently runs into a negative ref count, this code will throw a NullPointerException. I prefer failing with an exception in this case…
The code above will leave you with the overhead of a single static field in case you’re not using the feature. Most JVMs should have no problems eliminating the conditionals regarding the state of a static final variable.
Note further that the code allows MyClass instances to get garbage collected while having a non-zero ref count, just like when it was an instance field, but also actively removes the mapping when the count reaches the initial state of zero again, to minimize the work needed for cleanup.

A method of getting the value of fields without using reflection

I've been given a class with some 200 fields whose values are read using reflection. It basically looks like this:
for (Field f : this.getClass().getFields())
{
try
{
Object o = f.get(this);
if (f.getType() == String.class)
{
//do things with the string
}
}
catch (Exception ex)
{
logger.error("Cannot get value for field. {}", ex.getMessage());
}
}
This works very well for such an unwieldy number of fields, which I suppose is the point of reflection. I've been asked to refactor it because it's slow (is it?).
So far the only method I can come up with is an ungodly amount of hard coding. Is there another quick method?
First you should verify with a profiler that it indeed is slow. Reflection is slower than accessing variables normally, but that doesn't necessarily mean that it's the source of slowness.
Provided that you're using setters to modify those values, you can refactor the class to update a Map<String,Object> whenever a setter is called. This provides faster access to the fields than reflection, but may not be possible depending on your use case.
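A minimal sketch of that setter-backed map, assuming a hypothetical Customer bean with only two of the fields; the class and field names are made up for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bean standing in for the 200-field class.
class Customer {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private String name;
    private String city;

    public void setName(String name) {
        this.name = name;
        values.put("name", name);   // mirror every write into the map
    }

    public void setCity(String city) {
        this.city = city;
        values.put("city", city);   // mirror every write into the map
    }

    public String getName() { return name; }

    public String getCity() { return city; }

    // Read-only view for code that previously iterated via reflection.
    public Map<String, Object> fieldValues() {
        return Collections.unmodifiableMap(values);
    }
}
```

Code that used to loop over getFields() can then iterate over fieldValues().entrySet() instead. The sketch is not thread-safe, and fields that were never set do not appear in the map.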
Most of the time is spent obtaining the Field objects (and possibly filtering them). The actual lookup can be pretty fast. I use ClassValue to cache this information and speed it up.
import java.lang.reflect.Field;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public enum StringFields {
    INSTANCE;

    // Computed once per class, then served from the ClassValue cache.
    final ClassValue<List<Field>> fieldsCache = new ClassValue<List<Field>>() {
        @Override
        protected List<Field> computeValue(Class<?> type) {
            return Collections.unmodifiableList(
                    Stream.of(type.getFields())
                            .filter(f -> f.getType() == String.class)
                            .peek(f -> f.setAccessible(true)) // turn off security check
                            .collect(Collectors.toList()));
        }
    };

    public static List<Field> getAllStringFields(Class<?> type) {
        return INSTANCE.fieldsCache.get(type);
    }
}
So far the only method I can come up with is an ungodly amount of hard coding, is there another quick method?
You can use reflection to get the getters of those fields and generate code which reads out those getters.
The code generation can then be part of a build step.
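One way such a build-step generator might look is sketched below. GetterReaderGenerator, SampleBean and the generated method name readStrings are hypothetical; the sketch only emits source text for public, no-argument String getters and leaves writing the file and compiling it to the build.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical bean used to demonstrate the generator.
class SampleBean {
    public String getName() { return "n"; }
    public String getCity() { return "c"; }
    public int getAge() { return 1; }
}

final class GetterReaderGenerator {
    private GetterReaderGenerator() {}

    // Emits Java source for a method that calls every public, no-arg
    // String getter of beanType directly (no reflection at run time).
    static String generate(Class<?> beanType) {
        StringBuilder src = new StringBuilder();
        src.append("static void readStrings(").append(beanType.getSimpleName())
           .append(" bean, java.util.function.Consumer<String> sink) {\n");
        Method[] methods = beanType.getMethods();
        // getMethods() has no defined order; sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            if (m.getName().startsWith("get")
                    && m.getParameterCount() == 0
                    && m.getReturnType() == String.class
                    && !Modifier.isStatic(m.getModifiers())) {
                src.append("    sink.accept(bean.").append(m.getName()).append("());\n");
            }
        }
        return src.append("}\n").toString();
    }
}
```

Reflection is then paid once at build time, and the generated method reads the values with ordinary calls.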

Multi threading with a ConcurrentHashMap

I'm trying to create a method with a ConcurrentHashMap with the following behavior.
Read no lock
Write lock
prior to writing,
read to see if record exist,
if it still doesn't exist, save to database and add record to map.
if record exist from previous write, just return record.
My thoughts.
private Object lock1 = new Object();
private ConcurrentHashMap<String, Object> productMap;
private Object getProductMap(String name) {
if (productMap.isEmpty()) {
productMap = new ConcurrentHashMap<>();
}
if (productMap.containsKey(name)) {
return productMap.get(name);
}
synchronized (lock1) {
if (productMap.containsKey(name)) {
return productMap.get(name);
} else {
Product product = new Product(name);
session.save(product);
productMap.putIfAbsent(name, product);
}
}
}
Could someone help me to understand if this is a correct approach?
There are several bugs here.
If productMap isn't guaranteed to be initialized, you will get an NPE on the first statement of this method.
The method isn't guaranteed to return anything if the map is empty.
The method doesn't return on all paths.
The method is both poorly named and unnecessary; you're trying to emulate putIfAbsent which half accomplishes your goal.
You also don't need to do any synchronization; ConcurrentHashMap is thread safe for your purposes.
If I were to rewrite this, I'd do a few things differently:
Eagerly instantiate the ConcurrentHashMap
Bind it to ConcurrentMap instead of the concrete class (so ConcurrentMap<String, Product> productMap = new ConcurrentHashMap<>();)
Rename the method to putIfMissing and delegate to putIfAbsent, with some logic to return the same record I want to add if the result is null. The above absolutely depends on Product having a well-defined equals and hashCode method, such that new Product(name) will produce objects with the same values for equals and hashCode if provided the same name.
Use an Optional to avoid any NPEs with the result of putIfAbsent, and to provide easier to digest code.
A snippet of the above:
public Product putIfMissing(String key) {
    Product product = new Product(key);
    Optional<Product> existing =
            Optional.ofNullable(productMap.putIfAbsent(key, product));
    if (!existing.isPresent()) {
        session.save(product); // persist only a newly added record
    }
    return existing.orElse(product);
}

Is java dynamic synchronization a good idea or allowed?

Basically, what is needed is to synchronize requests to each of the records.
Some code I can think of looks like this:
//member variable
ConcurrentHashMap<Long, Object> lockMap = new ConcurrentHashMap<Long, Object>();
//one method
private void maintainLockObjects(long id){
lockMap.putIfAbsent(id, new Object());
}
//the request method
void bar(long id) {
maintainLockObjects(id);
synchronized(lockMap.get(id)){
//logic here
}
}
Have a look at ClassLoader.getClassLoadingLock:
Returns the lock object for class loading operations. For backward compatibility, the default implementation of this method behaves as follows. If this ClassLoader object is registered as parallel capable, the method returns a dedicated object associated with the specified class name. Otherwise, the method returns this ClassLoader object.
Its implementation code may look familiar to you:
protected Object getClassLoadingLock(String className) {
Object lock = this;
if (parallelLockMap != null) {
Object newLock = new Object();
lock = parallelLockMap.putIfAbsent(className, newLock);
if (lock == null) {
lock = newLock;
}
}
return lock;
}
The first null check is only there for the backward compatibility mentioned above. Besides that, the only difference between this heavily used code and your approach is that this code avoids calling get afterwards, as putIfAbsent already returns the existing object if there is one.
So the simple answer is: it works, and this pattern has also proven itself within a really crucial part of Oracle's JRE implementation.
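Applied to the per-record locks from the question, the same idea might look like the sketch below. RecordLocks, RecordService and the hit counter are hypothetical names used only to show the locking.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RecordLocks {
    private final ConcurrentMap<Long, Object> lockMap = new ConcurrentHashMap<>();

    // Returns the single lock object for id, creating it on first use.
    Object lockFor(long id) {
        Object newLock = new Object();
        Object existing = lockMap.putIfAbsent(id, newLock);
        return existing != null ? existing : newLock;
    }
}

final class RecordService {
    private final RecordLocks locks = new RecordLocks();
    // Guarded by this for structural safety, by the per-id lock for updates.
    private final Map<Long, Integer> hits = new HashMap<>();

    void bar(long id) {
        synchronized (locks.lockFor(id)) {
            // Only one thread per id gets here; other ids proceed in parallel.
            int next = hits(id) + 1;
            synchronized (this) {
                hits.put(id, next);
            }
        }
    }

    synchronized int hits(long id) {
        Integer count = hits.get(id);
        return count == null ? 0 : count;
    }
}
```

Lock objects are never removed in this sketch, so the map grows with the number of distinct ids. That may or may not be acceptable depending on how many records there are.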

On using Enum based Singleton to cache large objects (Java)

Is there any better way to cache up some very large objects, that can only be created once, and therefore need to be cached ? Currently, I have the following:
public enum LargeObjectCache {
INSTANCE;
private Map<String, LargeObject> map = new HashMap<...>();
public LargeObject get(String s) {
if (!map.containsKey(s)) {
map.put(s, new LargeObject(s));
}
return map.get(s);
}
}
There are several classes that can use the LargeObjects, which is why I decided to use a singleton for the cache, instead of passing LargeObjects to every class that uses it.
Also, the map doesn't contain many keys (one or two, but the key can vary in different runs of the program) so, is there another, more efficient map to use in this case ?
You may need thread safety to ensure you don't end up with two instances for the same name.
It doesn't matter much for small maps, but you can avoid one lookup, which can make it faster.
public LargeObject get(String s) {
synchronized(map) {
LargeObject ret = map.get(s);
if (ret == null)
map.put(s, ret = new LargeObject(s));
return ret;
}
}
As it has been pointed out, you need to address thread-safety. Simply using Collections.synchronizedMap() doesn't make it completely correct, as the code entails compound operations. Synchronizing the entire block is one solution. However, using ConcurrentHashMap will result in a much more concurrent and scalable behavior if it is critical.
public enum LargeObjectCache {
INSTANCE;
private final ConcurrentMap<String, LargeObject> map = new ConcurrentHashMap<...>();
public LargeObject get(String s) {
LargeObject value = map.get(s);
if (value == null) {
value = new LargeObject(s);
LargeObject old = map.putIfAbsent(s, value);
if (old != null) {
value = old;
}
}
return value;
}
}
You'll need to use it exactly in this form to have the correct and the most efficient behavior.
If you must ensure only one thread gets to even instantiate the value for a given key, then it becomes necessary to turn to something like the computing map in Google Collections or the memoizer example in Brian Goetz's book "Java Concurrency in Practice".
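A minimal sketch of such a memoizer, written in the spirit of the future-based approach and not copied from the book; the Memoizer class and its loader parameter are assumptions made for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

final class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    Memoizer(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        Future<V> future = cache.get(key);
        if (future == null) {
            FutureTask<V> task = new FutureTask<>(() -> loader.apply(key));
            future = cache.putIfAbsent(key, task);
            if (future == null) {   // this thread won the race
                future = task;
                task.run();         // compute in the calling thread
            }
        }
        try {
            return future.get();    // other threads wait for the winner
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

With this, LargeObjectCache.get could delegate to a Memoizer built with LargeObject::new, so a second thread asking for the same key waits for the first construction instead of starting its own. A failed computation stays cached in this sketch. On Java 8 and later, ConcurrentHashMap.computeIfAbsent gives a similar at-most-once guarantee with less code.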
