This may be a related question: Java assignment issues - Is this atomic?
I have the same class as the OP, which acts on a mutable string reference, but sets rarely happen (basically, this string is part of a server configuration that only reloads when forced to).
public class Test {
    private String s;

    public void setS(String str) {
        s = str;
    }

    public String getS() {
        return s;
    }
}
Multiple threads will be pounding this variable to read its value. What is the best way to make it 'safe' without incurring the performance degradation of declaring it volatile?
I am currently heading in the direction of ReadWriteLock, but as far as I understand, read/write locks do not protect against thread-local caching unless some synchronization happens. Which means I've come full circle back to just using the volatile keyword?
Is my understanding correct? Is there nothing that can manually 'notify' other threads about an update to a variable in main memory, so that they can refresh their local caches just once in a blue moon?
volatile seems like overkill here, given that the server application is designed to run for months without a restart; by then it will have served a few million reads. I'm thinking I might as well declare the String static final and not allow it to mutate without a complete application and JVM restart.
Reads and writes to references are atomic. The problems you can incur are attempting to perform a read followed by a write (an update), or guaranteeing that after a write, all threads see the change on the next read. However, only you can say what your requirements are.
When you use volatile, it requires that a cache-coherent copy be read or written. This doesn't require a copy be made to/from main memory, as the caches communicate among themselves, even between sockets. There is a performance impact, but it doesn't mean the caches are not used.
Even if the access did go all the way to main memory, you could still do millions of accesses per second.
Why a mutable String? Why not a Config class with a simple static String? When the config is updated, you change this static reference, which is an atomic operation and won't be a problem for reading threads. You then have no synchronization and no locking penalties.
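A minimal sketch of that idea (class and field names are mine, not from the original post). Note that the reference is marked volatile here so that a reload is promptly visible to reader threads; a plain reference write is atomic but not guaranteed to be visible without it:

```java
class Config {
    // The whole config is replaced by swapping this one reference.
    // volatile guarantees readers see the new value after a reload.
    private static volatile String serverName = "default";

    public static String getServerName() {
        return serverName;
    }

    // Called rarely, e.g. on a forced configuration reload.
    public static void reload(String newName) {
        serverName = newName;
    }
}
```

Readers just call `Config.getServerName()`; there is no lock on the hot path.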
In order to notify the clients of this server, you can use the observer pattern: whoever is interested in server updates registers for your event, and the server delivers the notification. This shouldn't become a bottleneck since, as you mentioned, reloads are not frequent.
Now, to make this thread-safe, have a separate thread handle the update of the server state. In your getter, check the state: if it is 'Updating', wait (say, sleep) for it to complete. Once the update thread is done, it changes the state from 'Updating' to 'Updated'. When you wake up, check the state again: if it is still 'Updating', go back to sleep; otherwise, start servicing the request.
This approach adds an extra if to your code, but it enables you to reload the cache without forcing an application restart.
Also, this shouldn't be a bottleneck, since server updates are not frequent.
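A minimal sketch of the state-flag idea described above (class, state, and method names are my own). The sketch busy-waits with yield for brevity; a real implementation would block properly with wait/notify or a lock:

```java
class ServerState {
    private enum State { UPDATED, UPDATING }

    // volatile so readers see state transitions made by the update thread
    private volatile State state = State.UPDATED;
    private volatile String config = "initial";

    // Run by the dedicated update thread.
    public void update(String newConfig) {
        state = State.UPDATING;
        config = newConfig;       // in reality, a longer reload happens here
        state = State.UPDATED;
    }

    // Reader threads: wait while an update is in progress, then serve.
    public String read() {
        while (state == State.UPDATING) {
            Thread.yield();       // "go to sleep" as described above (sketch)
        }
        return config;
    }
}
```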
Hope this makes some sense.
In order to avoid the volatile keyword, you could add a "memory barrier" method to your Test class that is called only very rarely, for example:
public synchronized void sync() {
}
This will force the calling thread to re-read the field value from main memory (the reading thread must call sync() before getS() for this to help).
Also, you would have to change the setter to
public synchronized void setS(String str) {
    s = str;
}
The synchronized keyword will force the setting thread to write directly to main memory.
See here for a detailed explanation of synchronization and memory barriers.
I have a pretty basic method:

handleOrder(IOrder order) {
    //do stuff
}

I was having issues in that new quotes would update the order, so I wanted to synchronize on the order parameter. So my code would look like:
handleOrder(IOrder order) {
    synchronized (order) {
        //do stuff
    }
}
Now, however, IntelliJ is complaining that:
Synchronization on method parameter 'order'
Inspection info: Reports synchronization on a local variable or parameter. It is very difficult to guarantee correctness when such synchronization is used. It may be possible to improve code like this by controlling access through e.g. a synchronized wrapper class, or by synchronizing on a field.
Is this something I actually need to be concerned about?
Yes, because this type of synchronization is generally an indication that the code cannot easily be reviewed to ensure that deadlocks don't take place.
When you synchronize on a field, you're combining the synchronization code with the instance being used in a way that permits you to have most, if not all of the competing methods in the same file. This makes it easier to review the file for deadlocks and errors in the synchronization approach. The same idea applies when using a synchronized wrapper class.
When you synchronize on a passed instance (a local variable or parameter), you need to review all of the code of the entire application for other synchronization efforts on the same instance to get the same level of assurance that a mistake was not made. In addition, this will have to be done frequently, as there is little assurance that, after the next commit, a developer will have done the same code scan to make sure that their synchronization didn't impact code that lives in some remote directory (or even in a remote JAR file that doesn't have source code on their machine).
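To illustrate the alternative the inspection suggests, here is a sketch (class and field names are mine) that locks on a private field instead of the parameter, so all locking for this handler can be reviewed in one file:

```java
interface IOrder {}

class OrderHandler {
    // dedicated lock object: every synchronized block for this handler
    // lives in this class, which makes deadlock review local
    private final Object lock = new Object();
    private int handled = 0;

    public void handleOrder(IOrder order) {
        synchronized (lock) {   // lock on the field, not on the parameter
            handled++;          // "do stuff" with the order
        }
    }

    public int handledCount() {
        synchronized (lock) {
            return handled;
        }
    }
}
```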
After reading a little bit about the java memory model and synchronization, a few questions came up:
Even if Thread 1 synchronizes the writes, then although the effect of the writes will be flushed to main memory, Thread 2 will still not see them because the read came from level 1 cache. So synchronizing writes only prevents collisions on writes. (Java thread-safe write-only hashmap)
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads. (https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html)
A third website (I can't find it again, sorry) said that every change to any object - it doesn't care where the reference comes from - will be flushed to memory when the method leaves the synchronized block and establishes a happens-before situation.
My questions are:
What is really flushed back to memory by exiting the synchronized block? (Some websites also say that only the object whose lock has been acquired will be flushed back.)
What does the happens-before relationship mean in this case? And what will be re-read from memory on entering the block, and what won't?
How does a lock achieve this functionality (from https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Lock.html):
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock, as described in section 17.4 of The Java™ Language Specification:
A successful lock operation has the same memory synchronization effects as a successful Lock action.
A successful unlock operation has the same memory synchronization effects as a successful Unlock action.
Unsuccessful locking and unlocking operations, and reentrant locking/unlocking operations, do not require any memory synchronization effects.
If my assumption that everything is re-read and flushed is correct, is this achieved by using synchronized blocks inside the lock and unlock functions (which are mostly necessary anyway)? And if it's wrong, how can this functionality be achieved?
Thank you in advance!
The happens-before relationship is the fundamental thing you have to understand, as the formal specification operates in terms of these relationships. Terms like "flushing" are technical details that may help you understand them, or misguide you in the worst case.
If a thread performs action A within a synchronized(object1) { … }, followed by a thread performing action B within a synchronized(object1) { … }, assuming that object1 refers to the same object, there is a happens-before-relationship between A and B and these actions are safe regarding accessing shared mutable data (assuming, no one else modifies this data).
But this is a directed relationship, i.e. B can safely access the data modified by A. But when seeing two synchronized(object1) { … } blocks, being sure that object1 is the same object, you still need to know whether A was executed before B or B was executed before A, to know the direction of the happens-before-relationship. For ordinary object oriented code, this usually works naturally, as each action will operate on whatever previous state of the object it finds.
Speaking of flushing: leaving a synchronized block causes a flush of all written data, and entering a synchronized block causes a re-read of all mutable data, but without the mutual exclusion guarantee of a synchronized on the same instance, there is no control over which happens before the other. Even worse, you cannot use the shared data to detect the situation, as without blocking the other thread, it can still inconsistently modify the data you're operating on.
Since synchronizing on different objects can’t establish a valid happens-before relationship, the JVM’s optimizer is not required to maintain the global flush effect. Most notably, today’s JVMs will remove synchronization, if Escape Analysis has proven that the object is never seen by other threads.
So you can use synchronizing on an object to guard access to data stored somewhere else, i.e. not in that object, but it still requires consistent synchronizing on the same object instance for all access to the same shared data, which complicates the program logic compared to simply synchronizing on the same object containing the guarded data.
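A small sketch of that point (names are mine): a dedicated lock object guards a map stored elsewhere. This is only safe because every access path consistently uses the same lock instance:

```java
import java.util.HashMap;
import java.util.Map;

class GuardedRegistry {
    // the lock object guards data stored outside of itself
    private static final Object LOCK = new Object();
    private static final Map<String, String> data = new HashMap<>();

    public static void put(String key, String value) {
        synchronized (LOCK) {   // establishes happens-before with later get()
            data.put(key, value);
        }
    }

    public static String get(String key) {
        synchronized (LOCK) {   // must use the SAME lock as put()
            return data.get(key);
        }
    }
}
```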
volatile variables, like those used by Locks internally, also have a global flush effect, if threads are reading and writing the same volatile variable and use its value to form a correct program logic. This is trickier than with synchronized blocks, as there is no mutual exclusion of code execution; or, well, you could see it as a mutual exclusion limited to a single read, write, or CAS operation.
There is no flush per se; it's just easier to think that way (and easier to draw, too). That's why there are lots of resources online that refer to a flush to main memory (meaning RAM), but in reality it does not happen that often. What really happens is that the load and/or store buffers are drained to the L1 cache (L2 in the case of IBM), and it's up to the cache coherence protocol to sync data from there. To put it differently, caches are smart enough to talk to each other (via a bus) and not fetch data from main memory all the time.
This is a complicated subject (disclaimer: even though I try to do a lot of reading on this, and a lot of tests when I have time, I absolutely do not understand it in its full glory). It involves potential compiler/CPU/etc. re-orderings (program order is never guaranteed to be respected), flushes of the buffers, memory barriers, release/acquire semantics... I don't think your question is answerable without a PhD-level report; that's why the JLS defines a higher-level concept: "happens-before".
Understanding at least a small portion of the above, you will see that your questions (at least the first two) make very little sense.
What is really flushed back to memory by exiting the synchronized block
Probably nothing at all -- caches "talk" to each other to sync data. I can only think of two other cases: when you read some data for the first time, and when a thread dies -- all written data will be flushed to main memory (but I'm not sure).
What does happens-before-relaitonship mean in this case? And what will be re-read from memory on entering the block, what not?
Really, the same sentence as above.
How does a lock achieve this functionality
Usually by introducing memory barriers; just like volatiles do.
I am new to Java.
I am practicing by writing small programs.
In one of the programs I have an object that holds some configuration.
This configuration can be changed at runtime.
I am saving the configuration to file by serializing the object.
It seems to me that I must take a lock on the object before serializing it, to make sure it doesn't change during serialization:
synchronized (myObject) {
    output.writeObject(myObject);
}
However I've read that one should try to avoid IO operation (such as writing to file) in synchronized block or under any other form of lock. It does make sense, since IO operation might take relatively long time, keeping other threads waiting/blocked.
I wonder, whether there is a way to avoid Serialization under lock...
Any suggestions will be welcome.
A couple of solutions to this problem are:
Only serialize immutable objects
Create a copy of the object and serialize the copy
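A sketch of the second option (class and field names are mine): take a snapshot of the mutable state under a short lock, then serialize the copy outside the lock, so the slow I/O never blocks writers:

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

class ConfigSaver {
    private final Object lock = new Object();
    private String host = "localhost";
    private int port = 8080;

    public void set(String host, int port) {
        synchronized (lock) {
            this.host = host;
            this.port = port;
        }
    }

    public void save(OutputStream out) throws IOException {
        String snapshotHost;
        int snapshotPort;
        synchronized (lock) {       // brief lock: copy the fields only
            snapshotHost = host;
            snapshotPort = port;
        }
        // slow I/O happens outside the lock, on the immutable snapshot
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(snapshotHost + ":" + snapshotPort);
        }
    }
}
```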
But what happens in the interval between setting the object and starting to flush it? You could still end up writing a different object than intended.
A possible solution could be to lock the object for writing only after having modified it, and to unlock it only once it has been flushed. Locking and unlocking could be done by acquiring and releasing a binary semaphore.
So, acquire() a permit before writing to the object variable and release() it after having serialized. This blocks only active modifier threads and effectively allows further concurrent execution, avoiding the I/O blocking in your example.
One problem is that there could be a context switch away from the writer thread after it has serialized the object but just before it releases the lock. But if you are OK with letting the modifier thread(s) wait a bit longer, this is no worry.
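A minimal sketch of that binary-semaphore scheme (names are mine): modifiers acquire the permit before changing the state, and the writer holds it across the serialization, so only modifier threads are blocked during the I/O:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.Semaphore;

class SemaphoreSaver {
    private final Semaphore permit = new Semaphore(1); // binary semaphore
    private String config = "v1";

    public void modify(String newValue) {
        permit.acquireUninterruptibly(); // blocks while a save is in progress
        try {
            config = newValue;
        } finally {
            permit.release();
        }
    }

    public byte[] save() {
        permit.acquireUninterruptibly(); // block modifiers, not readers
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(config);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            permit.release();            // release only after the flush
        }
    }
}
```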
Hope this helps!
You need to execute the serialization process inside the lock, as your use case requires that no one be able to modify the object during the write.
1. A basic solution is to reduce the execution time by using the transient keyword to reduce the amount of data serialized. Additionally, customized readObject() and writeObject() methods may be beneficial in some cases.
2. If possible, modify your logic to split your lock, so that multiple threads can read but not modify.
3. You can use the pattern used in collections where iteration may take a long time: clone the original object before iterating.
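To illustrate point 1, here is a sketch (field names are mine): transient fields are skipped during serialization, which shrinks the data written and therefore the time spent inside the lock. Note that transient fields come back as their default value (null here) after deserialization:

```java
import java.io.Serializable;

class AppConfig implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name = "server";

    // large, derivable data: excluded from serialization via transient
    private transient byte[] cachedIndex = new byte[1024];

    public String getName() { return name; }
    public byte[] getCachedIndex() { return cachedIndex; }
}
```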
As far as I know, static variables and methods are shared across different sessions. Does this sort of behavior cause performance degradation, for example when different sessions are reading a static variable or calling a static method at the same time?
There's usually no performance penalty involved in multiple threads reading the same variable or calling the same method at the same time, as long as no thread is writing to that variable.
And if one thread can write a variable that another thread is reading, then you have a concurrency control issue that you need to handle carefully.
Note, however, that there may be an exception to the above on specific kinds of hardware, when a variable that one thread writes is adjacent in memory to a variable that other threads read. In this case they may be in the same "cache line" -- the unit of memory that is read from RAM and cached -- and there may be contention between the readers and writers, as the hardware can't tell that they aren't accessing the same location.
The googlable term for this is "false sharing".
Simply "using static variables across sessions" does not inherently have performance implications. There is, however, a cousin concern that you need to look at, instead.
The fields that you're reading from/writing to from multiple user sessions will be accessed concurrently. This means that you will need to make your objects thread-safe (that's going to be necessary if you are writing to these static fields). This is what can have direct performance implications.
I have a set of counters which will only ever be updated in a single thread.
If I read these values from another thread and I don't use volatile/atomic/synchronized, how out of date can these values be?
I ask as I am wondering if I can avoid using volatile/atomic/synchronized here.
I currently believe that I can't make any assumptions about time to update (so I am forced to use at least volatile). Just want to make sure I am not missing something here.
I ask as I am wondering if I can avoid using volatile/atomic/synchronized here.
In practice, the CPU cache is probably going to be synchronized to main memory anyway on a regular basis (how often depends on many parameters), so it sounds like you would be able to see some new values from time to time.
But that is missing the point: the actual problem is that if you don't use a proper synchronization pattern, the compiler is free to "optimise" your code and remove the update part.
For example:
class Broken {
    boolean stop = false;

    void broken() throws Exception {
        while (!stop) {
            Thread.sleep(100);
        }
    }
}
The compiler is authorised to rewrite that code as:
void broken() throws Exception {
    while (true) {
        Thread.sleep(100);
    }
}
because there is no obligation to check if the non-volatile stop might change while you are executing the broken method. Mark the stop variable as volatile and that optimisation is not allowed any more.
Bottom line: if you need to share state you need synchronization.
How stale a value can get is left entirely to the discretion of the implementation -- the spec doesn't provide any guarantees. You will be writing code that depends on the implementation details of a particular JVM and which can be broken by changes to memory models or to how the JIT reorders code. The spec seems to be written with the intent of giving the implementers as much rope as they want, as long as they observe the constraints imposed by volatile, final, synchronized, etc.
It looks like the only way I can avoid synchronizing these variables is to do the following (similar to what Zan Lynx suggested in the comments):
1. Figure out the maximum age I am prepared to accept. I will make this the "update interval".
2. Each "update interval", copy the unsynchronized counter variables to synchronized variables. This needs to be done on the write thread.
3. Read thread(s) can only read from these synchronized variables.
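The scheme above can be sketched like this (class and field names are mine): the single writer thread increments plain fields at full speed and only periodically publishes a snapshot into a volatile field, which reader threads consume:

```java
class Counters {
    // hot path: written by the single writer thread only, no synchronization
    private long requests;
    private long errors;

    // published snapshot: safe for any thread to read
    private volatile long[] snapshot = {0, 0};

    // writer thread calls this on the hot path
    public void record(boolean error) {
        requests++;
        if (error) errors++;
    }

    // writer thread calls this once per "update interval"
    public void publish() {
        snapshot = new long[] { requests, errors };
    }

    // reader threads see values at most one update interval stale
    public long[] read() {
        return snapshot;
    }
}
```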
Of course, this optimization may only be a marginal improvement and would probably not be worth it considering the extra complexity it would create.
Java 8 has a new class called LongAdder which helps with the performance problem of using volatile on a frequently written field. But until then...
If you do not use volatile on your counter then the results are unpredictable. If you do use volatile then there are performance problems since each write must guarantee cache/memory coherency. This is a huge performance problem when there are many threads writing frequently.
For statistics and counters that are not critical to the application, I give users the option of volatile/atomic or none with none the default. So far, most use none.
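For reference, the LongAdder approach mentioned above looks like this (class name is mine): writes are striped across internal cells, avoiding the cache-coherency traffic that a single volatile counter causes under heavy write contention:

```java
import java.util.concurrent.atomic.LongAdder;

class AdderCounter {
    private final LongAdder hits = new LongAdder();

    public void hit() {
        hits.increment();   // cheap even with many writer threads
    }

    public long total() {
        return hits.sum();  // aggregates the per-thread cells on read
    }
}
```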