Is there any way to implement a type of reference whose value can be exchanged with another atomically?
In Java we have AtomicReference which can be swapped with a local variable but not with another AtomicReference.
You can do:
AtomicReference<String> r1 = new AtomicReference<>("hello");
AtomicReference<String> r2 = new AtomicReference<>("world");
and swap them with a combination of two operations:
r1.set(r2.getAndSet(r1.get()));
But this leaves them in an inconsistent state in between, where both contain "hello". Also even if you could swap them atomically, you still could not read them (as a pair) atomically.
What I would like to be able to do is:
PairableAtomicReference r1 = new PairableAtomicReference("hello");
PairableAtomicReference r2 = new PairableAtomicReference("world");
AtomicRefPair rp = new AtomicRefPair(r1, r2);
then
Object[] oldVal, newVal;
do {
    oldVal = rp.get();
    newVal = new Object[] { oldVal[1], oldVal[0] };
} while (!rp.compareAndSet(oldVal, newVal));
to swap the values, and in another thread:
AtomicRefPair otherRP = new AtomicRefPair(r1, r2);
System.out.println(Arrays.toString(otherRP.get()));
and be certain that the output will be either [hello, world] or [world, hello].
Notes:
r1 and r2 are paired for this operation, but it's possible that another thread will independently pair, say r1 and another r3 (unfortunately that means I cannot use this solution.)
There will be hundreds of thousands of these references, so a global ReentrantLock would be a major bottleneck.
rp and otherRP are not necessarily shared between threads, so simply locking them will not work. They could be interned, but the intern pool would need its own synchronization which would be another bottleneck.
I have only made groups of 2 references here, but the ability to group 3 or more would be a bonus.
Is it possible to implement a lock-free version of AtomicRefPair? I have a hunch that it isn't, but if not then maybe there is an article somewhere that explains why?
Related: How do I atomically swap 2 ints in C#?
Have an immutable class holding the pair. That is your atom. Swapping the pair means replacing the atom.
Update: your question isn't very clear, but in general, for a concurrent system consisting of multiple variables, one might want to:
take a snapshot of the system state; the snapshot doesn't change once taken;
atomically update the system state by changing multiple variables at once, possibly requiring that there was no other update between my update and a previous snapshot (on which my calculation was based).
You can model your system directly in snapshots, if that doesn't consume too many resources.
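For the two-reference case in the question, here is a minimal sketch of that idea, assuming both values are only ever reached through one shared holder (the names Pair and holder are mine, not from any library):

import java.util.concurrent.atomic.AtomicReference;

// The immutable pair is the "atom": readers always see a consistent snapshot,
// and a swap installs a whole new Pair instance via compare-and-set.
final class Pair<T> {
    final T first, second;
    Pair(T first, T second) { this.first = first; this.second = second; }
}

class PairSwapDemo {
    public static void main(String[] args) {
        AtomicReference<Pair<String>> holder =
                new AtomicReference<>(new Pair<>("hello", "world"));

        // Swap by replacing the atom; retry if another thread got in between.
        Pair<String> oldVal, newVal;
        do {
            oldVal = holder.get();
            newVal = new Pair<>(oldVal.second, oldVal.first);
        } while (!holder.compareAndSet(oldVal, newVal));

        Pair<String> snapshot = holder.get(); // atomic read of both values
        System.out.println("[" + snapshot.first + ", " + snapshot.second + "]");
    }
}

Note that this only works while every reader and writer goes through the same holder; it doesn't by itself cover the question's case where r1 may independently be paired with some r3.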
I don't know if there's a nice solution, but the following ugly one could work:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public final class MyReference<T> extends ReentrantLock implements Comparable<MyReference<T>> {

    private static final AtomicInteger counter = new AtomicInteger();

    private T value;
    private final int id;

    public MyReference() {
        id = counter.incrementAndGet();
    }

    public MyReference(T initialValue) {
        this();
        value = initialValue;
    }

    public void swap(MyReference<T> other) {
        // Always acquire the two locks in ascending id order.
        if (id < other.id) {
            lock();
            other.lock();
        } else {
            other.lock();
            lock();
        }
        try {
            final T tmp = value;
            value = other.value;
            other.value = tmp;
        } finally {
            unlock();
            other.unlock();
        }
    }

    public static <T> List<T> consistentGet(List<MyReference<T>> references) {
        // Lock every reference in ascending id order, then read them all.
        final List<MyReference<T>> sortedReferences = new ArrayList<>(references);
        Collections.sort(sortedReferences);
        for (MyReference<T> r : sortedReferences) r.lock();
        try {
            final List<T> result = new ArrayList<>(references.size());
            for (MyReference<T> r : references) result.add(r.value);
            return result;
        } finally {
            for (MyReference<T> r : sortedReferences) r.unlock();
        }
    }

    @Override
    public int compareTo(MyReference<T> o) {
        return Integer.compare(id, o.id);
    }
}
Use MyReference instead of AtomicReference.
It uses a lot of locks, but none of them is global.
It acquires locks in a fixed order, so it's deadlock-free.
It uses only the standard JDK concurrency classes, so no extra libraries are needed.
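A quick sketch of how this could be used, mirroring the question's example (assuming a constructor that takes the initial value, as above):

import java.util.Arrays;
import java.util.List;

public class MyReferenceDemo {
    public static void main(String[] args) {
        MyReference<String> r1 = new MyReference<>("hello");
        MyReference<String> r2 = new MyReference<>("world");

        r1.swap(r2); // atomic with respect to other swaps and consistentGet

        List<String> snapshot = MyReference.consistentGet(Arrays.asList(r1, r2));
        System.out.println(snapshot); // [world, hello]
    }
}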
Recently I've been trying to reimplement my data parser using streams in Java, but I can't figure out how to do one specific thing:
Consider an object A with a timestamp.
Consider an object B which is made up of various A objects.
Consider some metric which tells us the time range for an object B.
What I have now is a stateful method that goes through a list of A objects; if an A fits into the last B, it goes there, otherwise the method creates a new B instance and starts putting A objects in that.
I would like to do this the streams way:
take the whole list of A objects as a stream. Now I need to figure out the function which will create "chunks" and accumulate them into B objects. How do I do that?
Thanks
EDIT:
A and B are complex, but I will try to post here some simplified version.
import java.util.ArrayList;
import java.util.List;

class A {
    private final long time;

    A(long time) {
        this.time = time;
    }

    long getTime() {
        return time;
    }
}

class B {
    // not important; built from a "full" TemporaryB (the result of the accumulation)
    B(TemporaryB tb) {
        // copy whatever is needed from tb
    }
}

class TemporaryB {
    private final long startingTime;
    private int counter;

    public TemporaryB(A a) {
        this.startingTime = a.getTime();
    }

    boolean fits(A a) {
        return a.getTime() - startingTime < THRESHOLD; // THRESHOLD: max time range of one B
    }

    void add(A a) {
        counter++;
    }
}

class Accumulator {
    private final List<B> accumulatedB = new ArrayList<>();
    private TemporaryB temporaryB;

    public void addA(A a) {
        if (temporaryB == null) {
            temporaryB = new TemporaryB(a);
        } else if (temporaryB.fits(a)) {
            temporaryB.add(a);
        } else {
            accumulatedB.add(new B(temporaryB));
            temporaryB = new TemporaryB(a);
        }
    }
}
OK, so this is a very simplified version of how I do it now. I don't like it; it's ugly.
In general such a problem is poorly suited to the Stream API, as you may need non-local knowledge, which makes parallel processing harder. Imagine that you have new A(1), new A(2), new A(3) and so on up to new A(1000), with the threshold set to 10, so you basically need to combine the input into batches of 10 elements. Here we have the same problem as discussed in this answer: when we split the task into subtasks, the suffix part may not know exactly how many elements are in the prefix part, so it cannot even start combining data into batches until the whole prefix is processed. Your problem is essentially serial.
On the other hand, there's a solution provided by the new headTail method in my StreamEx library. This method parallelizes badly, but with it you can define almost any operation in just a few lines.
Here's how to solve your problem with headTail:
static StreamEx<TemporaryB> combine(StreamEx<A> input, TemporaryB tb) {
    return input.headTail((head, tail) ->
            tb == null ? combine(tail, new TemporaryB(head)) :
            tb.fits(head) ? combine(tail, tb.add(head)) :
            combine(tail, new TemporaryB(head)).prepend(tb),
        () -> StreamEx.ofNullable(tb));
}
Here I modified your TemporaryB method this way:
TemporaryB add(A a) {
    counter++;
    return this;
}
Sample (assuming Threshold = 1000):
List<A> input = Arrays.asList(new A(1), new A(10), new A(1000), new A(1001),
        new A(1002), new A(1003), new A(2000), new A(2002), new A(2003), new A(2004));
Stream<B> streamOfB = combine(StreamEx.of(input), null).map(B::new);
streamOfB.forEach(System.out::println);
Output (I wrote a simple B.toString()):
B [counter=2, startingTime=1]
B [counter=3, startingTime=1001]
B [counter=2, startingTime=2002]
So here you actually have a lazy Stream of B.
Explanation:
StreamEx.headTail takes two lambdas. The first is called at most once, when the input stream is non-empty; it receives the first stream element (head) and a stream containing all other elements (tail). The second is called at most once, when the input stream is empty, and receives no parameters. Both should produce an output stream which is used instead. So what we have here:
return input.headTail((head, tail) ->
tb == null is the starting case: create a new TemporaryB from the head and call self with the tail:
tb == null ? combine(tail, new TemporaryB(head)) :
Does tb.fits(head)? OK, just add the head into the existing tb and call self with the tail:
tb.fits(head) ? combine(tail, tb.add(head)) :
Otherwise, again create a new TemporaryB(head), but also prepend the output stream with the current tb (actually emitting a new element into the target stream):
combine(tail, new TemporaryB(head)).prepend(tb),
Input stream exhausted? OK, return the last gathered tb, if any:
() -> StreamEx.ofNullable(tb));
Note that the headTail implementation guarantees that this solution, while it looks recursive, consumes no more than a constant amount of stack and heap. You can check it on thousands of input elements if you have doubts:
Stream<B> streamOfB = combine(LongStreamEx.range(100000).mapToObj(A::new), null).map(B::new);
streamOfB.forEach(System.out::println);
I'm currently looking at the heap dump of this silly little test class (taken at the very end of the main method):
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

public class WeakRefTest {
    static final class RefObj1 { int i; }
    static final class RefObj2 { int j; }

    public static void main(String[] args) {
        Set<WeakReference<?>> objects = new HashSet<>();
        RefObj1 obj1 = new RefObj1();
        RefObj2 obj2 = new RefObj2();
        for (int i = 0; i < 1000; i++) {
            objects.add(new WeakReference<RefObj1>(obj1));
            objects.add(new WeakReference<RefObj2>(obj2));
        }
    }
}
Now I'm trying to figure out how to count the number of references to a specific class in objects. If this were a SQL database, it'd be easy:
select objects.className as referent, count(*) as cnt
from java.lang.ref.WeakReference ref
inner join heapObjects objects on ref.referent = objects.objectId
group by objects.className;
Result:
referent            | cnt
====================|=====
WeakRefTest$RefObj1 | 1000
WeakRefTest$RefObj2 | 1000
After some research, I figured out I could construct an Eclipse MAT OQL query that gives me the classes involved:
select DISTINCT OBJECTS classof(ref.referent) from java.lang.ref.WeakReference ref
Alas, this doesn't include their count and OQL doesn't seem to support a GROUP BY clause. Any ideas how to get this information?
Edited to add: In reality, none of the objects added to the Set (nor the Set implementation itself, obviously) are under my control. So sorry, modifying RefObj1 and RefObj2 isn't allowed.
Edit2: I found this related question about using OQL in jvisualvm but it turns out that OQL is actually Javascript unleashed at a heap dump. I'd be fine with something like that, too. But playing around with it hasn't produced results for me, yet. I'll update the question if that changes.
Open the histogram view (there is a toolbar button for this, which looks like a bar graph).
In the first row of the histogram view where it says "Regex", type WeakReference to filter the view.
Find the java.lang.ref.WeakReference line, right-click, and choose "Show Objects By Class" -> "By Outgoing References".
The resulting view should summarize the objects being referred to, grouped by class, as you require. The Objects column should indicate the number of instances for each class.
You could just write a method in the object that returns the information and call that from Eclipse...
Since you cannot modify the objects, the next best thing would be to write a utility function in some class that you can modify, and call that from the Eclipse debugger. I don't know Eclipse well enough to help you do it without inserting something into the source code, sorry.
I would use a weak HashSet, i.e. a Set backed by a WeakHashMap. You can just use set.size() to get the number of references still alive.
static final class RefObj1 { int i; }
static final class RefObj2 { int j; }

public static void main(String[] args) {
    Set<Object> objects = Collections.newSetFromMap(new WeakHashMap<>());
    RefObj1 obj1 = new RefObj1();
    RefObj2 obj2 = new RefObj2();
    for (int i = 0; i < 1000; i++) {
        objects.add(obj1);
        objects.add(obj2);
    }
    obj1 = null;
    System.gc();
    System.out.println("Objects left is " + objects.size());
}
I would expect this to print 0, 1 or 2 depending on how the objects are cleaned up.
I have the following code:
Note: I simplified the code as much as possible for readability.
If I forgot any critical pieces, let me know.
public class User {
    private Relations relations;

    public User() {
        relations = new Relations(this);
    }

    public Relations getRelations() {
        return relations;
    }
}
public class Relations {
    private User user;

    public Relations(User user) {
        this.user = user;
    }

    public synchronized void setRelation(User user2) {
        Relations relations2 = user2.getRelations();
        synchronized (relations2) {
            storeRelation(user2);
            if (!relations2.hasRelation(user))
                relations2.setRelation(user);
        }
    }

    public synchronized boolean hasRelation(User user2) {
        ... // Checks if this relation is present in some kind of collection
    }

    /* Store this relation, unless it is already present */
    private void storeRelation(User user2) {
        ... // Stores this relation in some kind of collection
    }
}
This implementation should make sure that for all Relations x, y with:
x.user = u_x
y.user = u_y
the following invariant holds:
x.hasRelation( u_y ) <=> y.hasRelation( u_x )
I believe that holds for the code stated above?
Note: It does of course not hold during the execution of setRelation(..),
but at that moment the locks for both relations involved are
held by the executing thread so no other thread can read the
hasRelation(..) of one of the relations involved.
Assuming that this holds, I believe there is still a potential deadlock risk.
Is that correct? And if it is, how can I solve it?
I think I would need to obtain both locks in setRelation(..) atomically somehow.
You are correct on both points. Your invariant does hold (assuming that I understand correctly what your method names mean, and assuming that by if(!relations.hasRelation(user)) relations2.setRelation(user2); you meant to write if(!relations2.hasRelation(user)) relations2.setRelation(user);). But you do have the risk of a deadlock: if one thread needs to obtain a lock on x and then on y, while another thread needs to obtain a lock on y and then on x, then there's a risk that each thread will succeed in getting its first lock, thereby preventing the other from getting its second lock.
One solution is to enforce a strict universal ordering for getting locks on Relations instances. What you do is, you add a constant integer field lockOrder:
private final int lockOrder;
and a static integer field currentLockOrder:
private static int currentLockOrder = 0;
and every time you create a Relations instance, you set its lockOrder to the current value of currentLockOrder and increment it:
public Relations()
{
    synchronized(Relations.class) // a lock on currentLockOrder
    {
        lockOrder = currentLockOrder;
        ++currentLockOrder;
    }
}
such that every instance of Relations will have a distinct, immutable value for lockOrder. Your setRelation method would then obtain locks in the specified order:
public void setRelation(final User thatUser)
{
    final Relations that = thatUser.getRelations();
    synchronized(lockOrder < that.lockOrder ? this : that)
    {
        synchronized(lockOrder < that.lockOrder ? that : this)
        {
            storeRelation(thatUser);
            if(! that.hasRelation(user))
                that.storeRelation(user);
        }
    }
}
thereby ensuring that if two threads both need to get locks on both x and y, then either they'll both first get locks on x, or they'll both first get locks on y. Either way, no deadlock will occur.
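Incidentally, the id handout could use an AtomicInteger instead of the synchronized block; a two-line sketch of that variant (same semantics, no class-level lock):

private static final AtomicInteger currentLockOrder = new AtomicInteger();
private final int lockOrder = currentLockOrder.getAndIncrement();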
Note, by the way, that I changed setRelation to storeRelation. setRelation would work, but why add that complexity?
Also, there's still one thing I don't get: how come x.setRelation(u_y) calls x.storeRelation(u_y) unconditionally, but calls y.setRelation(u_x) (or y.storeRelation(u_x)) only if y doesn't already have the relationship? It doesn't make sense. It seems like either both checks are needed, or neither check is. (Without seeing the implementation of Relations.storeRelation(...), I can't guess which of those is the case.)
The problem: maintain a bidirectional many-to-one relationship among Java objects.
Something like the Google/Commons Collections bidi maps, but I want to allow duplicate values on the forward side, and have sets of the forward keys as the reverse-side values.
Used something like this:
// maintaining disjoint areas on a gameboard. Location is a space on the
// gameboard; Regions refer to disjoint collections of Locations.
MagicalManyToOneMap<Location, Region> forward = // the game universe
Map<Region, Set<Location>> inverse = forward.getInverse(); // live, not a copy

Location parkplace = Game.chooseSomeLocation(...);
Region mine = forward.get(parkplace); // assume !null; should be O(log n)
Region other = Game.getSomeOtherRegion(...);

// moving a Location from one Region to another:
forward.put(parkplace, other);
// or equivalently:
inverse.get(other).add(parkplace); // should also be O(log n) or so

// expected consistency:
assert ! inverse.get(mine).contains(parkplace);
assert forward.get(parkplace) == other;

// and this should be fast, not iterate every possible location just to filter for mine:
for (Location l : mine) { /* do something clever */ }
The simple Java approaches are: 1. To maintain only one side of the relationship, either as a Map<Location, Region> or a Map<Region, Set<Location>>, and collect the inverse relationship by iteration when needed; or 2. To make a wrapper that maintains both sides' Maps, and intercept all mutating calls to keep both sides in sync.
Approach 1 is O(n) instead of O(log n), which is becoming a problem. I started in on 2 and was in the weeds straightaway. (Do you know how many different ways there are to alter a Map entry?)
This is almost trivial in the sql world (Location table gets an indexed RegionID column). Is there something obvious I'm missing that makes it trivial for normal objects?
I might misunderstand your model, but if your Location and Region have correct equals() and hashCode() implemented, then the Location -> Region side is just a classical Map implementation (multiple distinct keys can point to the same object value). The Region -> Set-of-Location side is a Multimap (available in Google Collections). You could compose your own class with the proper add/remove methods to manipulate both submaps.
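A minimal sketch of that composition, using Guava's HashMultimap for the inverse side (the class name ManyToOne is mine; treat this as a starting point, not a full implementation):

import com.google.common.collect.HashMultimap;
import com.google.common.collect.SetMultimap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Keeps forward (L -> R) and inverse (R -> Set<L>) in sync by funnelling
// every mutation through put().
class ManyToOne<L, R> {
    private final Map<L, R> forward = new HashMap<>();
    private final SetMultimap<R, L> inverse = HashMultimap.create();

    public void put(L l, R r) {
        R old = forward.put(l, r);
        if (old != null) {
            inverse.remove(old, l); // undo the previous pairing
        }
        inverse.put(r, l);
    }

    public R get(L l) { return forward.get(l); }

    public Set<L> getInverse(R r) { return inverse.get(r); }
}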
Maybe an overkill, but you could also use an in-memory SQL database (HSQLDB, etc.). It allows you to create indexes on many columns.
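For instance, a small sketch with HSQLDB's in-memory mode (table and column names are made up for illustration; the hsqldb jar must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class InMemoryIndexDemo {
    public static void main(String[] args) throws Exception {
        // "mem:" keeps the whole database in memory, no files involved.
        try (Connection con = DriverManager.getConnection("jdbc:hsqldb:mem:game", "SA", "");
             Statement st = con.createStatement()) {
            st.execute("CREATE TABLE location (id INT PRIMARY KEY, region_id INT)");
            // The index makes the inverse lookup (Region -> Locations) fast.
            st.execute("CREATE INDEX idx_region ON location (region_id)");
        }
    }
}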
I think you could achieve what you need with the following two classes. While it does involve two maps, they are not exposed to the outside world, so there shouldn't be a way for them to get out of sync. As for storing the same "fact" twice: I don't think you'll get around that in any efficient implementation, whether the fact is stored twice explicitly as it is here, or implicitly as it would be when your database creates an index to make joins more efficient on your two tables. You can add new things to the MagicSet and it will update both mappings, or you can add things to the MagicMapper, which will then update the inverse map automatically. The girlfriend is calling me to bed now, so I cannot run this through a compiler; it should be enough to get you started. What puzzle are you trying to solve?
public class MagicSet<L, R> {
    private final Map<L, R> forward;
    private final R r;
    private final Set<L> set;

    public MagicSet(Map<L, R> forward, R r) {
        this.forward = forward;
        this.r = r;
        this.set = new HashSet<L>();
    }

    public void add(L l) {
        set.add(l);
        forward.put(l, r);
    }

    public void remove(L l) {
        set.remove(l);
        forward.remove(l);
    }

    public int size() {
        return set.size();
    }

    public boolean contains(L l) {
        return set.contains(l);
    }

    // Caution: do not use the remove method from this iterator. If this class were
    // going to be reused often, you would want to return a wrapped iterator that
    // handled remove properly. In fact, if you did that, I think you could then
    // extend AbstractSet and MagicSet would fully implement java.util.Set.
    public Iterator<L> iterator() {
        return set.iterator();
    }
}
// Note that it doesn't implement Map, though it could with some extra work.
// I don't get the impression you need that, though.
public class MagicMapper<L, R> {
    private final Map<L, R> forward;
    private final Map<R, MagicSet<L, R>> inverse;

    public MagicMapper() {
        forward = new HashMap<L, R>();
        inverse = new HashMap<R, MagicSet<L, R>>();
    }

    public R getForward(L key) {
        return forward.get(key);
    }

    // This assumes you want a null if you look up a key that has no mapping;
    // otherwise you'd return a blank MagicSet.
    public MagicSet<L, R> getBackward(R key) {
        return inverse.get(key);
    }

    public void put(L l, R r) {
        R oldVal = forward.get(l);
        // if the L had already belonged to an R, we need to undo that mapping
        MagicSet<L, R> oldSet = inverse.get(oldVal);
        if (oldSet != null) { oldSet.remove(l); }
        // now get the set the R belongs to, and add l to it
        MagicSet<L, R> newSet = inverse.get(r);
        if (newSet == null) {
            newSet = new MagicSet<L, R>(forward, r);
            inverse.put(r, newSet);
        }
        newSet.add(l); // magically updates the "forward" map
    }
}
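A quick usage sketch (with String stand-ins for Location and Region, since the real classes aren't shown here):

MagicMapper<String, String> map = new MagicMapper<>();
map.put("parkplace", "mine");
map.put("boardwalk", "mine");
map.put("parkplace", "other"); // moves parkplace; the "mine" set is updated too

System.out.println(map.getForward("parkplace"));                   // other
System.out.println(map.getBackward("mine").contains("parkplace")); // false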
Is there any better way to cache some very large objects that can only be created once, and therefore need to be cached? Currently, I have the following:
public enum LargeObjectCache {
    INSTANCE;

    private Map<String, LargeObject> map = new HashMap<...>();

    public LargeObject get(String s) {
        if (!map.containsKey(s)) {
            map.put(s, new LargeObject(s));
        }
        return map.get(s);
    }
}
There are several classes that can use the LargeObjects, which is why I decided to use a singleton for the cache, instead of passing LargeObjects to every class that uses it.
Also, the map doesn't contain many keys (one or two, though the keys can vary between runs of the program), so is there another, more efficient map to use in this case?
You may need thread-safety to ensure you don't have two instances with the same name.
It doesn't matter much for small maps, but you can avoid one lookup, which can make it faster.
public LargeObject get(String s) {
    synchronized (map) {
        LargeObject ret = map.get(s);
        if (ret == null)
            map.put(s, ret = new LargeObject(s));
        return ret;
    }
}
As has been pointed out, you need to address thread-safety. Simply using Collections.synchronizedMap() doesn't make it completely correct, as the code entails compound operations. Synchronizing the entire block is one solution. However, using ConcurrentHashMap will result in much more concurrent and scalable behavior, if that is critical.
public enum LargeObjectCache {
    INSTANCE;

    private final ConcurrentMap<String, LargeObject> map = new ConcurrentHashMap<...>();

    public LargeObject get(String s) {
        LargeObject value = map.get(s);
        if (value == null) {
            value = new LargeObject(s);
            LargeObject old = map.putIfAbsent(s, value);
            if (old != null) {
                value = old;
            }
        }
        return value;
    }
}
You'll need to use it exactly in this form to have the correct and the most efficient behavior.
If you must ensure only one thread gets to even instantiate the value for a given key, then it becomes necessary to turn to something like the computing map in Google Collections or the memoizer example in Brian Goetz's book "Java Concurrency in Practice".
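For reference, a condensed sketch of that memoizer pattern (adapted from the book's idea, not its verbatim code; LargeObject is the class from the question):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// At most one thread runs the expensive constructor per key: losers of the
// putIfAbsent race wait on the winner's FutureTask instead of instantiating.
public final class LargeObjectMemoizer {
    private final ConcurrentMap<String, Future<LargeObject>> cache =
            new ConcurrentHashMap<>();

    public LargeObject get(String s) throws InterruptedException {
        Future<LargeObject> f = cache.get(s);
        if (f == null) {
            FutureTask<LargeObject> task = new FutureTask<>(() -> new LargeObject(s));
            f = cache.putIfAbsent(s, task);
            if (f == null) { // this thread won the race; compute the value
                f = task;
                task.run();
            }
        }
        try {
            return f.get(); // blocks until the winning thread finishes
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }
}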