Java - Multithreading and the CPU cache

I know that a thread keeps the values of variables in the cache of the CPU it is running on, so the CPU does not have to wait as long when it needs the values stored in those variables.
But, for example, if I have this class:
public class MyObject {
    int a = 2;
}
and the thread does something like this:
MyObject obj = new MyObject();
obj.a = 3;
My question is:
What will be saved in the CPU cache?
The whole MyObject structure, or just the reference?
I think the whole structure (that makes more sense), but I prefer to ask because I would like to be sure about that.
I'm a beginner with multithreading and I'm sure a CPU cache works in a more complex way, but at the moment I just need basic information.

In your example, only one thread is acting. For that thread the cache is transparent: there is no way to tell whether a value is in the cache, in main memory, or both. Values are first placed in the cache, and then at some unknown later moment they are evicted.
"I would like to be sure about that" - why? Your program's behaviour does not depend on this.

This question has two sides:
What the CPU is doing: The CPU is designed to keep everything that is needed very often in the cache. If you change a value, it will keep the change in the cache until it needs to write it to main memory (this actually depends on the CPU's strategy: write-back vs. write-through). The "need" to write to main memory is either controlled programmatically or arises when the CPU decides it needs the space for other data. To answer one part of your question: for the CPU everything is data, both the value you set in Java and the internal object data structures. To access your field, the object's address is needed first, so that is very probably in the cache too :)
The second side is what a Java programmer should and should not expect: this is defined very precisely in the Java Memory Model. Just start here: http://en.wikipedia.org/wiki/Java_Memory_Model
So for your lines:
MyObject obj = new MyObject();
obj.a = 3;
There is no guarantee that another thread running after this code sees the new value. It may not even see your new object reference and may observe null instead. You need a synchronized block or a volatile variable.
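As a minimal sketch of the volatile option (the Holder class and its methods are invented for illustration), publishing the object through a volatile field gives the visibility guarantee described above:
public class Holder {
    // the volatile write to "obj" establishes a happens-before edge:
    // a thread that reads a non-null obj is guaranteed to see a == 3
    static volatile MyObject obj;

    static void writer() {
        MyObject o = new MyObject();
        o.a = 3;
        obj = o; // publish the fully initialized object
    }

    static void reader() {
        MyObject o = obj;
        if (o != null) {
            System.out.println(o.a); // prints 3
        }
    }
}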


How to resolve memory leak in multi threading environment?

I have an interesting case of a memory leak in a multithreading environment.
I have the following logic:
public void update(String key, CustomObj newResource)
{
    // fetch the old resource for the given key from the ConcurrentHashMap
    // update the map with newResource for the given key
    // close the old resource to prevent the memory leak
}
public CustomObj read(String key)
{
    // return the resource for the given key
}
Now if I have two threads:
Thread#1: calling update method to update the resource for key K
Thread#2: calling read method to read the resource for the same key K.
Note: CustomObj belongs to a third-party library, so I can't add a finalize method to it to close it.
Even synchronizing the read and update methods won't help, because the update thread can close the resource while a read thread is still using it.
Could you please tell me how to maintain thread safety without a memory leak in this scenario?
You should never use finalize() for reasons too broad to discuss here.
If several threads can work with one object at the same time, then you can use "reference counting" to track when the resource should be closed.
Every thread/function/etc. that currently works with the object increments its "user count" by one when it acquires access to the object. When it stops working with it, it decrements the "user count" by one. The thread that decrements the count to zero closes the object. You can take advantage of the various atomic primitives provided by the Java standard library to create a lock-free solution.
As this is an object from a third-party library, you'll need to create some kind of wrapper to track the references.
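A minimal sketch of such a wrapper, assuming CustomObj exposes a close() method (the class and method names here are made up for illustration):
import java.util.concurrent.atomic.AtomicInteger;

class RefCountedResource {
    private final CustomObj resource;
    // starts at 1: the map entry itself holds one reference
    private final AtomicInteger refCount = new AtomicInteger(1);

    RefCountedResource(CustomObj resource) {
        this.resource = resource;
    }

    // called by read(): returns null if the resource was already released
    CustomObj acquire() {
        while (true) {
            int count = refCount.get();
            if (count == 0) {
                return null; // too late, already closed
            }
            if (refCount.compareAndSet(count, count + 1)) {
                return resource;
            }
        }
    }

    // called by readers when finished, and by update() when the entry is replaced
    void release() {
        if (refCount.decrementAndGet() == 0) {
            resource.close(); // the last user closes the resource
        }
    }
}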
PS: It's usually not a good idea to use objects with shared state across threads - it begs for trouble: synchronization issues, data races, performance lost to synchronization, etc.

Java multi-threading accessing same variable

I have a Java program which creates 2 threads. Inside these 2 threads, each tries to update the global variable abc to a different value, say the integers 1 and 3.
Let's say they execute the code at the same time (in the same millisecond), for example:
public class MyThread implements Runnable {
    public void run() {
        while (true) {
            if (currentTime == specificTime) {
                abc = 1; // another thread updates abc to 3
            }
        }
    }
}
In this case, how can we determine the result of the variable abc? I am very curious how the operating system schedules the execution.
(I know synchronization should be used, but I just want to know how the system naturally handles this kind of conflict.)
The operating system has little involvement in this: at the time your threads are running, the memory allocated to abc is under the control of the JVM running your program, so it's your program that is in control.
When two threads access the same memory location, the last writer wins. Which particular thread gets to be the last writer, however, is non-deterministic, unless you use synchronization.
Moreover, without special care when accessing the shared data, one thread may not even see the results of the other thread's write to the abc location.
To avoid synchronization issues, you should use synchronization or one of the java.util.concurrent.atomic classes.
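As a sketch of the atomic option (the class name and main method here are invented; abc is the variable from the question):
import java.util.concurrent.atomic.AtomicInteger;

public class AbcExample {
    // all threads read and write abc through the same atomic variable
    static final AtomicInteger abc = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> abc.set(1));
        Thread t2 = new Thread(() -> abc.set(3));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // the final value is still whichever write happened last (1 or 3),
        // but every thread observes the writes without visibility problems
        System.out.println(abc.get());
    }
}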
From Java's perspective the situation is fairly simple if abc is neither volatile nor accessed with appropriate synchronisation.
Let's assume that abc is 0 originally. After your two threads have updated it to 1 and 3 respectively, abc could be observed in three states: 0, 1 or 3. Which value you get is not deterministic, and the result may vary from one run to another.
It depends on the operating system, the runtime environment, etc.
Some environments will actually stop you from doing this; that is what thread safety is about.
Otherwise the results are totally unpredictable, which is why doing this is so dangerous.
It mainly depends on which thread updated it last; that determines what the value will be. One thread will get CPU cycles before the other and perform its write first.
Also, I don't think operating systems go as far as to schedule individual thread operations; in most operating systems the program is responsible for its threads, and without explicit calls like synchronize, or a thread pool model, the order of execution is pretty hard to predict. It's a very environment-dependent thing.
From the system's perspective the result will depend on many software, hardware and run-time factors that cannot be known in advance. From this perspective there is neither a conflict nor a problem.
From the programmer's perspective the result is not deterministic and is therefore a problem/conflict. The conflict needs to be resolved at design time.
In this case, how can we determine the result of the variable abc? I am very curious how the operating system schedules the execution.
The result will not be deterministic, as the value will be whichever was written last. You cannot make any guarantee about the result. The execution is scheduled like any other. Since you request no synchronization in your code, the JVM will not enforce any ordering for you.
I know synchronization should be used, but I just want to know how the system will naturally handle this kind of conflict.
Simply said: it won't, because for the system there is no conflict. Problems only occur for you, the programmer, since you will eventually run into a data race and non-deterministic behavior. It is completely up to you to handle this.
Just add the volatile modifier to your variable; then updates to it will be propagated to all threads, and a thread reading it will get its actual value. volatile means the value will always be up to date for all threads accessing it.
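A minimal sketch of that (the class name is made up):
public class SharedState {
    // writes to abc become visible to every other thread
    static volatile int abc = 0;

    static void writer() { abc = 1; }    // one thread writes 1, another writes 3
    static int reader()  { return abc; } // always sees the most recent write
}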

Thread safety when only one thread is writing

I know that if two threads are writing to the same place I need to make sure they do it in a safe way so they cause no problems, but what if just one thread reads and does all the writing while another only reads?
In my case I'm using a thread in a small game for the first time, to keep the updating separate from the rendering. The class that does all the rendering will never write to anything it reads, so I am no longer sure whether I need to guard every read and write of everything the two threads share.
I will take the right steps to make sure the renderer does not try to read anything that no longer exists, but when calling getters on things like the player and the entities, should I be treating them in the same way? Or would making values like the x, y coordinates and booleans like "alive" volatile do the trick?
My understanding of this has become very murky and could do with some enlightenment.
Edit: The shared data will be anything that needs to be drawn and moved, stored in lists of objects.
For example, the player and other entities.
With the given information it is not possible to specify an exact solution, but it is clear that you need some kind of method to synchronize between the threads. The issue is that as long as the write operations are not atomic, you could be reading data at the very moment it is being updated. This means that you might, for instance, get an old y-coordinate with a new x-coordinate.
Basically, you only do not need to worry about synchronization if both threads are only reading the information or, even better, if all the data structures are immutable (so neither thread can modify the objects). The best way to proceed is to first think about which operations need to be atomic, and then create a solution that makes those operations atomic.
Don't forget: get it working, get it right, get it optimized (in that order).
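A hedged sketch of the immutable-snapshot idea (the Position and Player classes and field names are invented): the update thread publishes a whole new Position object, so the render thread always sees an x and y that belong together.
final class Position {
    final int x;
    final int y;
    Position(int x, int y) { this.x = x; this.y = y; }
}

class Player {
    // the update thread replaces the whole snapshot with one reference write
    private volatile Position position = new Position(0, 0);

    void move(int newX, int newY) {
        position = new Position(newX, newY); // update thread
    }

    Position snapshot() {
        return position; // render thread: a consistent x/y pair
    }
}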
You could have problems in this case if the lists' sizes are variable and you don't synchronize access to them. Consider this:
The read-only thread reads mySharedList's size and sees that it is 15; at that moment its CPU time ends and the read-write thread is given the CPU.
The read-write thread deletes an element from the list; now its size is 14.
The read-only thread is again granted CPU time. It tries to read the last element using the (now obsolete) size it read before being interrupted, and you get an exception.
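A minimal sketch of guarding both operations with the same lock, so that the size check and the element access happen atomically (mySharedList is the list from the steps above; Entity is a placeholder for whatever the list holds):
import java.util.ArrayList;
import java.util.List;

class SharedListAccess {
    private final List<Entity> mySharedList = new ArrayList<>();

    // read-write thread: removes under the list's lock
    void removeLast() {
        synchronized (mySharedList) {
            if (!mySharedList.isEmpty()) {
                mySharedList.remove(mySharedList.size() - 1);
            }
        }
    }

    // read-only thread: the size check and the get happen under the same lock
    Entity readLast() {
        synchronized (mySharedList) {
            if (mySharedList.isEmpty()) {
                return null;
            }
            return mySharedList.get(mySharedList.size() - 1);
        }
    }
}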

Determining queue size

I have to implement a queue to which objects will be added and removed by two different threads at different times, based on some factor. My problem is that the requirement says the queue (the whole queue and the data it holds) should not take more than 200 KB. If that size is reached, the thread should wait for space to become available before pushing more data. The objects pushed may vary in size. I can create a Java queue, but the size of the queue will return the total number of objects pushed instead of the total memory used. How do I determine the total size of the data my queue refers to?
Consider the pushed object to be:
class A {
    int x;
    byte[] buf; // array size varies per object
}
There is no out of the box functionality for this in Java. (In part, because there is no easy way to know if the objects added to the collection are referenced elsewhere and therefore if adding them takes up additional memory.)
For your use case, you would probably be best off just subclassing a queue. Override the add method to add the size of the object to a counter (obviously you will have to make this calculation thread-safe) and to throw an IllegalStateException if there is no room. Similarly, decrement your counter in an overridden remove method.
The method of determining how much space to add to the counter could vary. Farlan suggested using this, and that looks like it would work. But since you say you are dealing with a byte array, the size of the data you are adding might already be known to you. You will also have to consider whether you want to account for any overhead: the object takes some space, as does the reference inside the queue itself, plus the queue object. You could figure out exact values for that, but since it seems like your requirement is just to prevent running out of memory, you could probably use rough estimates as long as you are consistent.
The details of what queue class you want to subclass may depend on how much contention you think there will be between the threads. But it sounds like you have a handle on the sync issues.
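A hedged sketch of the idea, using wait/notify to make a producer block until enough bytes are free (the class and method names are invented; sizeOf() is a rough estimate of the kind described above):
import java.util.ArrayDeque;
import java.util.Deque;

class BoundedByteQueue {
    private static final int MAX_BYTES = 200 * 1024;
    private final Deque<A> queue = new ArrayDeque<>();
    private int usedBytes = 0;

    // rough estimate: payload plus a fixed per-object overhead
    private static int sizeOf(A a) {
        return a.buf.length + 16;
    }

    public synchronized void put(A a) throws InterruptedException {
        int size = sizeOf(a);
        while (usedBytes + size > MAX_BYTES) {
            wait(); // block the producer until a consumer frees space
        }
        queue.addLast(a);
        usedBytes += size;
        notifyAll(); // wake a consumer waiting for data
    }

    public synchronized A take() throws InterruptedException {
        while (queue.isEmpty()) {
            wait();
        }
        A a = queue.removeFirst();
        usedBytes -= sizeOf(a);
        notifyAll(); // wake a producer waiting for space
        return a;
    }
}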

String and concurrency in Java

This may be a related question: Java assignment issues - Is this atomic?
I have the same class as the OP, acting on a mutable String reference, but the set rarely happens (basically this String is part of a server configuration that is only reloaded when forced to).
public class Test {
    private String s;

    public void setS(String str) {
        s = str;
    }

    public String getS() {
        return s;
    }
}
Multiple threads will be pounding this variable to read its value. What is the best method to make it 'safe' while not incurring the performance degradation of declaring it volatile?
I am currently heading in the direction of a ReadWriteLock, but as far as I understand, read-write locks do not make it safe from thread caching unless some synchronisation happens, which means I've come full circle back to: I may as well just use the volatile keyword?
Is my understanding correct? Is there nothing that can manually 'notify' other threads about an update to a variable in main memory, so that they can refresh their local cache just once in a blue moon?
volatile on this seems overkill given that the server application is designed to run for months without restart. By that time, it would have served a few million reads. I'm thinking I might as well just make the String static final and not allow it to change without a complete application and JVM restart.
Reads and writes of references are atomic. The problems you can run into are performing a combined read and write (an update), or guaranteeing that after a write all threads see the change on their next read. However, only you can say what your requirements are.
When you use volatile, it requires a cache-coherent copy to be read or written. This doesn't mean a copy has to be made to/from main memory, as the caches communicate among themselves, even between sockets. There is a performance impact, but it doesn't mean the caches are not used.
Even if the access did go all the way to main memory, you could still do millions of accesses per second.
Why a mutable String? Why not a Config class with a simple static String? When the config is updated, you change this static reference, which is an atomic operation and won't be a problem for the reading threads. You then have no synchronization and no locking penalties.
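A minimal sketch of that approach (the Config class and field names are made up; the volatile modifier is added here so that the new reference is also guaranteed to be visible to readers, the write itself being atomic either way):
public class Config {
    // reference writes are atomic; volatile additionally guarantees visibility
    private static volatile String serverName = "default";

    public static String getServerName() {
        return serverName;
    }

    // called rarely, on a forced configuration reload
    public static void reload(String newName) {
        serverName = newName;
    }
}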
In order to notify the clients of this server you can use the observer pattern: whoever is interested in the server-update information can register for your event, and the server delivers the notification. This shouldn't become a bottleneck, as you mentioned the reload does not happen often.
Now, to make this thread-safe, you can have a separate thread handle the update of the server state. In your get, you check the state: if it is 'Updating', you wait for the update to complete, say by going to sleep. Once your update thread is done, it should change the state from 'Updating' to 'Updated'. When you come out of sleep, check the state again: if it is still 'Updating', go back to sleep, otherwise start servicing the request.
This approach will add an extra if to your code, but it will let you reload the configuration without forcing an application restart.
Also, this shouldn't be a bottleneck, as server updates are not frequent.
Hope this makes some sense.
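A rough sketch of the described state-flag approach (all names are invented; this only mirrors the scheme above and is not a complete solution):
import java.util.concurrent.atomic.AtomicBoolean;

class ServerConfig {
    private volatile String value = "initial";
    private final AtomicBoolean updating = new AtomicBoolean(false);

    // the dedicated update thread
    void reload(String newValue) {
        updating.set(true);   // state: 'Updating'
        value = newValue;     // in reality this may be a longer reload
        updating.set(false);  // state: 'Updated'
    }

    // readers: sleep while an update is in progress, then service the request
    String get() throws InterruptedException {
        while (updating.get()) {
            Thread.sleep(1);  // "go to sleep" until the update finishes
        }
        return value;
    }
}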
In order to avoid the volatile keyword, you could add a "memory barrier" method to your Test class that is only called very rarely, for example
public synchronized void sync() {
}
This will force the thread to re-read the field value from main memory.
Also, you would have to change the setter to
public synchronized void setS(String str) {
    s = str;
}
The synchronized keyword will force the setting thread to write directly to main memory.
See here for a detailed explanation of synchronization and memory barriers.
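A short usage sketch of how the two methods combine (the variable names here are invented; the reading thread calls sync() so that both threads synchronize on the same object):
Test config = new Test();

// writing thread
config.setS("new value");       // synchronized write, releases the lock on config

// reading thread, done only rarely
config.sync();                  // acquires and releases the same lock: the "memory barrier"
String current = config.getS(); // sees the value written before the reader's sync()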
