I am new to Java.
I am practicing by writing small programs.
In one of the programs I have an object that holds some configuration.
This configuration can be changed at runtime.
I am saving the configuration to file by serializing the object.
It seems to me that I must take a lock on the object before serializing it, to make sure it doesn't change during serialization:
synchronized (myObject) {
    output.writeObject(myObject);
}
However, I've read that one should try to avoid I/O operations (such as writing to a file) inside a synchronized block or under any other form of lock. That makes sense, since an I/O operation might take a relatively long time, keeping other threads waiting or blocked.
I wonder whether there is a way to avoid serializing under a lock...
Any suggestions will be welcome.
A couple of solutions to this problem are:
Only serialize immutable objects
Create a copy of the object and serialize the copy
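The second option can be sketched like this; SnapshotSerializer and its map-based config are hypothetical stand-ins for your own class. The critical section only covers the in-memory copy, while the slow serialization happens with no lock held:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class SnapshotSerializer {
    // Hypothetical stand-in for your configuration object.
    private final Map<String, String> config = new HashMap<>();
    private final Object lock = new Object();

    public void set(String key, String value) {
        synchronized (lock) {
            config.put(key, value);
        }
    }

    // Copy under the lock (fast), serialize outside it (slow I/O).
    public byte[] serializeSnapshot() {
        Map<String, String> snapshot;
        synchronized (lock) {
            snapshot = new HashMap<>(config);  // short critical section
        }
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(snapshot);         // no lock held during I/O
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Modifier threads are blocked only for the duration of the copy, not for the whole write to disk.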
But what happens in the interval between setting the object and starting to flush it? You could still end up writing a different object than the one you intended.
A possible solution could be to lock the object for writing only after having modified it, and to unlock it only once it has been flushed. Locking and unlocking could be done by acquiring and releasing a binary semaphore.
So, acquire() a permit before writing to the object and release() it after having serialized. This blocks only active modifier threads and still allows other work to proceed concurrently, and it avoids blocking readers on the I/O in your example.
One caveat is that there could be a second context switch away from the writer thread, after it has serialized the object but just before it releases the permit. But if you are OK with letting the modifier thread(s) wait a bit longer in that case, this is no worry.
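As a rough sketch of this idea (GuardedConfig and its single String field are hypothetical names; a real config would have more state):

```java
import java.util.concurrent.Semaphore;

public class GuardedConfig {
    // Binary semaphore: one permit shared by modifiers and the serializer.
    private final Semaphore writePermit = new Semaphore(1);
    private volatile String value = "initial";

    // Modifier threads acquire the permit before writing.
    public void update(String newValue) {
        writePermit.acquireUninterruptibly();
        try {
            value = newValue;
        } finally {
            writePermit.release();
        }
    }

    // The serializer holds the permit for the duration of the (slow) write,
    // blocking only modifiers; pure readers are never blocked.
    public String serialize() {
        writePermit.acquireUninterruptibly();
        try {
            return value;  // stand-in for output.writeObject(...)
        } finally {
            writePermit.release();
        }
    }
}
```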
Hope this helps!
You need to execute the serialization process inside a lock, since your use case requires that no one can modify the object while it is being written.
1. A basic solution is to reduce the execution time of the critical section: use the transient keyword to reduce the amount of data serialized. Additionally, customized readObject() and writeObject() methods may be beneficial in some cases.
2. If possible, modify your logic so that you can split your lock, so that multiple threads can read but not modify.
3. You can use the pattern used in collections where iteration may take a long time: clone the original object before iterating (here, before serializing).
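A sketch of point 1; Session and its cachedImage field are made-up names. Fields marked transient are skipped by serialization, and a custom readObject() hook can rebuild them afterwards:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String user;
    // transient: excluded from serialization, shrinking the payload
    // and the time spent inside the lock.
    private transient byte[] cachedImage = new byte[1024];

    public Session(String user) {
        this.user = user;
    }

    public String getUser() {
        return user;
    }

    public int cacheSize() {
        return cachedImage.length;
    }

    // Custom hook: rebuild the transient cache after deserialization.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        cachedImage = new byte[1024];
    }

    // Round-trip helper for demonstration purposes.
    public static Session roundTrip(Session s) {
        try {
            ByteArrayOutputStream b = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(b);
            out.writeObject(s);
            out.flush();
            return (Session) new ObjectInputStream(
                    new ByteArrayInputStream(b.toByteArray())).readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```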
After reading a little bit about the java memory model and synchronization, a few questions came up:
Even if Thread 1 synchronizes the writes, then although the effect of the writes will be flushed to main memory, Thread 2 will still not see them because the read came from level 1 cache. So synchronizing writes only prevents collisions on writes. (Java thread-safe write-only hashmap)
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads. (https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html)
A third website (I can't find it again, sorry) said that every change to any object - it doesn't care where the reference comes from - will be flushed to memory when the method leaves the synchronized block and establishes a happens-before situation.
My questions are:
What is really flushed back to memory by exiting the synchronized block? (Some websites also say that only the object whose lock has been acquired will be flushed back.)
What does the happens-before relationship mean in this case? And what will be re-read from memory on entering the block, and what not?
How does a lock achieve this functionality (from https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Lock.html):
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock, as described in section 17.4 of The Java™ Language Specification:
A successful lock operation has the same memory synchronization effects as a successful Lock action.
A successful unlock operation has the same memory synchronization effects as a successful Unlock action.
Unsuccessful locking and unlocking operations, and reentrant locking/unlocking operations, do not require any memory synchronization effects.
If my assumption that everything is re-read and flushed is correct, is this achieved by using a synchronized block inside the lock and unlock functions (which are mostly necessary anyway)? And if it's wrong, how can this functionality be achieved?
Thank you in advance!
The happens-before relationship is the fundamental thing you have to understand, as the formal specification operates in terms of it. Terms like "flushing" are technical details that may help you understand it, or misguide you in the worst case.
If a thread performs action A within a synchronized(object1) { … }, followed by a thread performing action B within a synchronized(object1) { … }, assuming that object1 refers to the same object, there is a happens-before-relationship between A and B and these actions are safe regarding accessing shared mutable data (assuming, no one else modifies this data).
But this is a directed relationship, i.e. B can safely access the data modified by A. But when seeing two synchronized(object1) { … } blocks, being sure that object1 is the same object, you still need to know whether A was executed before B or B was executed before A, to know the direction of the happens-before-relationship. For ordinary object oriented code, this usually works naturally, as each action will operate on whatever previous state of the object it finds.
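A minimal illustration of this directed relationship (HandOff is a made-up name): if produce() completes before consume() starts, the monitor handover makes the plain payload field safely visible to the second thread.

```java
public class HandOff {
    private int payload;     // plain fields, guarded by "this"
    private boolean ready;

    // Action A: runs under the monitor of this object.
    public synchronized void produce(int value) {
        payload = value;
        ready = true;
    }

    // Action B: acquires the same monitor A released, so if it runs
    // after A, A happens-before B and payload is visible here.
    public synchronized int consume() {
        return ready ? payload : -1;
    }
}
```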
Speaking of flushing, leaving a synchronized block causes flushing of all written data and entering a synchronized block causes rereading of all mutable data, but without the mutual exclusion guarantee of synchronizing on the same instance, there is no control over which happens before the other. Even worse, you cannot use the shared data to detect the situation, as without blocking the other thread, it can still inconsistently modify the data you're operating on.
Since synchronizing on different objects can’t establish a valid happens-before relationship, the JVM’s optimizer is not required to maintain the global flush effect. Most notably, today’s JVMs will remove synchronization, if Escape Analysis has proven that the object is never seen by other threads.
So you can use synchronizing on an object to guard access to data stored somewhere else, i.e. not in that object, but it still requires consistent synchronizing on the same object instance for all access to the same shared data, which complicates the program logic compared to simply synchronizing on the object containing the guarded data.
volatile variables, as used by Locks internally, also have a global flush effect, if threads are reading and writing the same volatile variable and use the value to form a correct program logic. This is trickier than with synchronized blocks, as there is no mutual exclusion of code execution; or, well, you could see it as a mutual exclusion limited to a single read, write, or CAS operation.
There is no flush per se; it's just easier to think that way (easier to draw, too). That's why there are lots of resources online that refer to a flush to main memory (RAM, presumably), but in reality it does not happen that often. What really happens is that the load and/or store buffers are drained to the L1 cache (L2 in the case of IBM), and it's up to the cache coherence protocol to sync data from there. To put it differently, caches are smart enough to talk to each other (via a bus) and not fetch data from main memory all the time.
This is a complicated subject (disclaimer: even though I do a lot of reading on this, and a lot of tests when I have time, I absolutely do not understand it in full glory). It's about potential compiler/CPU/etc. reorderings (program order is never guaranteed), about flushes of the buffers, about memory barriers, about release/acquire semantics... I don't think your question is answerable without a PhD thesis; that's why there are higher-level notions in the JLS, called "happens-before".
Understanding at least a small portion of the above, you would see that your questions (at least the first two) make very little sense as posed.
What is really flushed back to memory by exiting the synchronized block
Probably nothing at all - caches "talk" to each other to sync data. I can only think of two other cases: when you read some data for the first time, and when a thread dies - then all written data will be flushed to main memory (but I'm not sure).
What does the happens-before relationship mean in this case? And what will be re-read from memory on entering the block, and what not?
Really, the same sentence as above.
How does a lock achieve this functionality
Usually by introducing memory barriers; just like volatiles do.
I am learning multithreading, and I have a little question.
When I am sharing some variable between threads (an ArrayList, or something else like a double or float), should it be locked by the same object for both reads and writes? I mean, when one thread is setting the variable's value, can another read it at the same time without any problems? Or should it be locked by the same object, forcing the reading thread to wait until the value has been changed by the other thread?
All access to shared state must be guarded by the same lock, both reads and writes. A read operation must wait for the write operation to release the lock.
As a special case, if all you would do inside your synchronized blocks amounts to exactly one read or write operation, then you may dispense with the synchronized block and mark the variable as volatile.
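For example, a simple stop flag (Flag is a made-up name) needs no synchronized block, because every access is a single read or a single write:

```java
public class Flag {
    // volatile gives cross-thread visibility; it is sufficient here
    // because no access combines a read with a dependent write.
    private volatile boolean running = true;

    public void stop() {
        running = false;
    }

    public boolean isRunning() {
        return running;
    }
}
```

By contrast, something like `count++` is a read followed by a write, so volatile alone would not be enough there.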
Short: It depends.
Longer:
There are many "correct answers" for different scenarios (and that makes programming fun).
Does the value being read have to be the latest?
Does the value being written have to be seen by all readers?
Should I take care of race conditions if two threads write?
Will there be any issue if an old/previous value is read?
What is the correct behaviour?
Does it really need to be correct? (yes, sometimes you don't care, and that's fine)
tl;dr
For example, not all threaded programming needs to be "always correct":
sometimes you trade off correctness for performance (e.g. a log or a progress counter)
sometimes reading an old value is just fine
sometimes you only need eventual correctness (e.g. in map-reduce, nothing is consistent until all work is done)
in some cases, correctness is mandatory at every moment (e.g. your bank account balance)
for write-once, read-only data it doesn't matter
sometimes threads work in groups, with more complex cases
sometimes many small, independent locks run faster, but sometimes one flat global lock is faster
and many, many other possible cases
Here is my suggestion: if you are learning, you should ask "why do I need a lock?", "why can a lock help in DIFFERENT cases?" (not just the given sample from a textbook), and "will it fail, or what could happen, if a lock is missing?"
If all threads are reading, you do not need to synchronize.
If one or more threads are reading and one or more are writing, you will need to synchronize somehow. If the collection is small you can use synchronized. You can either add a synchronized block around the accesses to the collection, synchronize the methods that access the collection, or use a thread-safe collection (for example, Vector).
If you have a large collection and you want to allow shared reading but exclusive writing you need to use a ReadWriteLock. See here for the JavaDoc and an exact description of what you want with examples:
ReentrantReadWriteLock
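A minimal sketch of the read-write-lock pattern (SharedList is a made-up wrapper): many readers may hold the read lock at once, while a writer gets exclusive access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedList {
    private final List<String> items = new ArrayList<>();
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    public void add(String item) {
        rw.writeLock().lock();     // exclusive: blocks readers and writers
        try {
            items.add(item);
        } finally {
            rw.writeLock().unlock();
        }
    }

    public int size() {
        rw.readLock().lock();      // shared: many readers at once
        try {
            return items.size();
        } finally {
            rw.readLock().unlock();
        }
    }
}
```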
Note that this question is pretty common and there are plenty of similar examples on this site.
I know that if two threads are writing to the same place I need to make sure they do it in a safe way and cause no problems, but what if just one thread does all the writing while another just reads?
In my case I'm using a thread in a small game for the first time, to keep the updating apart from the rendering. The class that does all the rendering will never write to anything it reads, so I am not sure anymore if I need to handle every read and write for everything they both share.
I will take the right steps to make sure the renderer does not try to read anything that is no longer there, but when calling things like the player's and entities' getters, should I be treating them in the same way? Or would marking values like the x, y coords and booleans like "alive" as volatile do the trick?
My understanding has become very murky on this, and I could do with some enlightening.
Edit: The shared data will be anything that needs to be drawn and moved and stored in lists of objects.
For example, the player and other entities.
With the given information it is not possible to specify an exact solution, but it is clear that you need some kind of method to synchronize between the threads. The issue is that as long as the write operations are not atomic, you could be reading data at the moment it is being updated. This means that you could, for instance, get an old y-coordinate with a new x-coordinate.
Basically you only do not need to worry about synchronization if both threads are only reading the information or - even better - if all the data structures are immutable (so both threads can not modify the objects). The best way to proceed is to think about which operations need to be atomic first, and then create a solution to make the operations atomic.
Don't forget: get it working, get it right, get it optimized (in that order).
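For the x/y example above, one way to make the update atomic is to change both coordinates under one lock (Entity and moveTo are made-up names):

```java
public class Entity {
    private double x, y;   // guarded by "this"

    // Both coordinates change under one lock, so a reader can never
    // observe a new x paired with an old y.
    public synchronized void moveTo(double newX, double newY) {
        x = newX;
        y = newY;
    }

    // Returning both values from one synchronized method gives the
    // renderer a consistent snapshot.
    public synchronized double[] position() {
        return new double[] { x, y };
    }
}
```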
You could have problems in this case if the lists' sizes are variable and you don't synchronize access to them. Consider this:
the read-only thread reads mySharedList's size and sees it is 15; at that moment its CPU time ends and the read-write thread is given the CPU
the read-write thread deletes an element from the list; now its size is 14
the read-only thread is again granted CPU time; it tries to read the last element using the (now obsolete) size it read before being interrupted, and you get an exception
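One way to avoid that scenario is to hold the list's own lock across the whole compound size-then-get access (SafeScan is a made-up wrapper, sketched with Collections.synchronizedList):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SafeScan {
    private final List<Integer> shared =
            Collections.synchronizedList(new ArrayList<>());

    public void add(int value) {
        shared.add(value);        // individually synchronized
    }

    public void removeAt(int index) {
        shared.remove(index);     // individually synchronized
    }

    // The compound action (size, then get) must hold the list's own lock,
    // otherwise another thread may shrink the list in between.
    public int last() {
        synchronized (shared) {
            return shared.get(shared.size() - 1);
        }
    }
}
```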
Quick question about "best practices" in Java. Suppose you have a database object, with the primary data structure for the database as a map. Further, suppose you wanted to synchronize any getting/setting info for the map. Is it better to synchronize every method that accesses/modifies the map, or do you want to create sync blocks around the map every time it's modified/accessed?
Depends on the scope of your units of work that need to be atomic. If you have a process that performs multiple operations that represent a single change of state, then you want to synchronize that entire process on the Map object. If you are synchronizing each individual operation, multiple threads can still interleave with each other on reads and writes. It would be like using a database cursor in read-uncommitted mode. You might make a decision based on some other threads half-complete work, seeing an incomplete/incorrect data state.
(And of course insert obligatory suggestion to use classes from java.util.concurrent.locks instead of the synchronized keyword :) )
In the general case, it is better to synchronize on a private final Object inside non-private methods than to have synchronized methods. The rationale for this is that you do not want a rogue caller to synchronize on your instance and thereby acquire your lock. For private methods you have complete control over how they can be called.
Personally, I avoid synchronized methods and encapsulate the method in a synchronized() block instead. This gives me tighter control and prevents outside sources from stealing my monitor. I cannot think of cases where you would want to provide an outside source access to your monitor, but if you did you could instead pass them your lock object just the same. But like I said, I would avoid that.
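That pattern can be sketched like this (Counter is a made-up example); the monitor is a private object that no caller can reach:

```java
public class Counter {
    // Private lock: outside code cannot synchronize on it and stall us.
    private final Object lock = new Object();
    private long count;

    public void increment() {
        synchronized (lock) {
            count++;
        }
    }

    public long get() {
        synchronized (lock) {
            return count;
        }
    }
}
```

Had the methods been declared synchronized instead, any caller holding a reference to the Counter could block them via synchronized (counter) { ... }.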
In order to avoid race conditions, we can synchronize the write and read methods on the shared variables, locking those variables against other threads.
My question is whether there are other (better) ways to avoid race conditions, since locks make the program slower.
What I found are:
using Atomic classes, if there is only one shared variable;
using an immutable container for multiple shared variables and declaring the container reference volatile (I found this method in the book "Java Concurrency in Practice").
I'm not sure whether they perform faster than the synchronized way. Are there any other, better methods?
thanks
Avoid state.
Make your application as stateless as it is possible.
Each thread (sequence of actions) should take a context in the beginning and use this context passing it from method to method as a parameter.
When this technique does not solve all your problems, use the Event-Driven mechanism (+Messaging Queue).
When your code has to share something with other components it throws event (message) to some kind of bus (topic, queue, whatever).
Components can register listeners to listen for events and react appropriately.
In this case there are no race conditions (except inserting events to the queue). If you are using ready-to-use queue and not coding it yourself it should be efficient enough.
Also, take a look at the Actors model.
Atomics are indeed more efficient than classic locks due to their non-blocking behavior i.e. a thread waiting to access the memory location will not be context switched, which saves a lot of time.
Probably the best guideline when synchronization is needed is to see how you can reduce the critical section size as much as possible. General ideas include:
Use read-write locks instead of full locks when only a part of the threads need to write.
Find ways to restructure code in order to reduce the size of critical sections.
Use atomics when updating a single variable.
Note that some algorithms and data structures that traditionally need locks have lock-free versions (they are more complicated however).
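For the single-variable case, a sketch with AtomicLong (Hits is a made-up name); the update is a CAS loop under the hood, so no thread is ever blocked:

```java
import java.util.concurrent.atomic.AtomicLong;

public class Hits {
    // Lock-free counter: incrementAndGet is a compare-and-swap loop.
    private final AtomicLong hits = new AtomicLong();

    public long record() {
        return hits.incrementAndGet();
    }

    public long total() {
        return hits.get();
    }
}
```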
Well, first off, the Atomic classes don't take monitor locks: they rely on volatile semantics plus low-level compare-and-swap operations, which is roughly what you would otherwise have to build by hand.
Second, immutability works great for multithreading; you no longer need monitor locks and such, but that's because you can only read your immutables, you cannot modify them.
You can't get rid of synchronized/volatile if you want to avoid race conditions in a multithreaded Java program (i.e. if multiple threads can read AND WRITE the same data). Your best bet, if you want better performance, is to avoid at least some of the built-in thread-safe classes, which do a more generic sort of locking, and make your own implementation that is more tied to your context and thus might allow more granular synchronization and lock acquisition.
Check out this implementation of BlockingCache done by the Ehcache guys;
http://www.massapi.com/source/ehcache-2.4.3/src/net/sf/ehcache/constructs/blocking/BlockingCache.java.html
One of the alternatives is to make shared objects immutable. Check out this post for more details.
You can perform up to 50 million lock/unlocks per second. If you want this to be more efficient I suggest using more coarse-grained locking, i.e. don't lock every little thing, but have locks for larger objects. Once you have many more locks than threads, you are less likely to have contention, and having more locks may just add overhead.