Right to implement synchronized blocks for Netty server?


I have been building a game server using Netty that may have thousands of concurrent connections. I know that on the server side those connections are spread across several worker threads rather than a single one, so it is not safe to let them freely access shared data, e.g. to find, remove or add objects in common lists and maps. I am considering adding synchronized blocks to all code that accesses shared data. (For heavy tasks such as querying the database I plan to use an ExecutorService / threads, so synchronisation won't be a big problem for those tasks.)
I am still unsure whether this is a good / common solution, or whether there are better ways (than using synchronized blocks) to do this in a Netty server.
Can someone give me some advice, please? Many thanks in advance.

Synchronized blocks (or their equivalent, ReentrantLocks) are the only reliable way to access shared mutable data. In an asynchronous environment like Netty, however, the code inside a synchronized block must not call wait(), since that takes the current thread out of service. Usually the synchronized block belongs to an intermediate object in producer-consumer communication, and the consumer calls wait() when there is no data from the producer. To avoid blocking, the consumer, when data is not ready, places itself (or another object of type Runnable) into the intermediate object, and the producer, when sending data, submits that Runnable to a thread pool (instead of calling notify()).
Synchronized blocks are a low-level facility and should not be mixed with business logic. Instead, a communication framework with an adequate interface should be developed. An example of such a framework is my df4j library.
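The non-blocking handoff described above might look roughly like the sketch below. The Mailbox class and its post/receive methods are names invented here for illustration (they are not part of Netty or df4j), and the sketch assumes at most one waiting consumer and non-null messages.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical intermediate object between a producer and a consumer.
// The lock is held only briefly and nobody ever calls wait().
class Mailbox<T> {
    private final Queue<T> messages = new ArrayDeque<>();
    private Consumer<T> parked;            // consumer waiting for data, if any
    private final Executor executor;

    Mailbox(Executor executor) {
        this.executor = executor;
    }

    // Producer side: hand the message to the parked consumer via the pool, or store it.
    void post(T message) {
        Consumer<T> consumer = takeConsumerOrStore(message);
        if (consumer != null) {
            executor.execute(() -> consumer.accept(message)); // runs outside the lock
        }
    }

    // Consumer side: take a stored message, or park the callback instead of blocking.
    void receive(Consumer<T> consumer) {
        T message = takeMessageOrPark(consumer);
        if (message != null) {
            executor.execute(() -> consumer.accept(message));
        }
    }

    private synchronized Consumer<T> takeConsumerOrStore(T message) {
        Consumer<T> consumer = parked;
        if (consumer == null) {
            messages.add(message);
        } else {
            parked = null;
        }
        return consumer;
    }

    private synchronized T takeMessageOrPark(Consumer<T> consumer) {
        T message = messages.poll();
        if (message == null) {
            if (parked != null) {
                throw new IllegalStateException("sketch supports one waiting consumer");
            }
            parked = consumer;
        }
        return message;
    }
}

class MailboxDemo {
    static String roundTrip(String text) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Mailbox<String> mailbox = new Mailbox<>(pool);
            CompletableFuture<String> seen = new CompletableFuture<>();
            mailbox.receive(seen::complete); // no data yet, so the callback is parked
            mailbox.post(text);              // producer submits the parked callback to the pool
            return seen.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

The business code only sees post and receive; the synchronized methods stay private, which is one way to keep the low-level locking out of the application logic.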

Related

Java RMI and advanced multithreading

I am implementing something like a database where data manipulation statements (inserts, updates and deletes) get evaluated. Some statements can execute concurrently and others cannot (I compute that). I like the ease of use and convenience of RMI, however I need to have a deeper understanding of the RMI service implementation w.r.t multithreading. For example,
Can the multithreading be controlled in any way?
Is a thread created for each remote call (on server side) or are thread pools used?
More generally, using RMI, how can I ensure that some rmi calls wait for other calls to terminate?
Is there another non-RMI approach, with the same convenience and efficiency that would work better for this?
If I want multi-threading should I just create threads myself on the server side code? The concern is that if the RMI Service creates multiple threads than I would be adding additional unnecessary threads.
If, for example, a thread is created on each call, then I can use the java join method to order the statement execution. On the other hand, if thread pools are used then the join method won't work (since the threads don't terminate).

Overview
There seem to be a few questions within this post, so I will attempt to walk you through each portion in some detail.
Question 1 - Can the multi-threading be controlled in any way?
Yes! Your implementation of the multi-threading can be whatever you want it to be. An RMI implementation is only the communication between separate JVMs, with enough abstraction to feel like they exist on one JVM; it is only the communication layer and thus has no effect on your multi-threading.
Question 2 - Is a thread created for each remote call (on the server side) or are thread-pools used?
See the RMI documentation. The short answer to whether each call runs on its own thread is: not necessarily.
A method dispatched by the RMI runtime to a remote object implementation may or may not execute in a separate thread. The RMI runtime makes no guarantees with respect to mapping remote object invocations to threads. Since remote method invocation on the same remote object may execute concurrently, a remote object implementation needs to make sure its implementation is thread-safe.
Whether RMI uses thread pools depends on the implementation, but as a developer using RMI this should be of no concern, as it is encapsulated in the RMI connection layer.
Question 3 - Using RMI, how can I ensure that some RMI calls wait for other calls to terminate?
This is a rather vague question, but I think what you're asking is how to properly block when synchronizing in RMI. This comes down to the design of your application. Let's take the scenario where you are trying to access the database and you must synchronize DB access. If the client attempts to invoke access through RMI, it will invoke the remote server's method that holds all the synchronization, and thus wait for a lock if it must. Therefore, the client will be waiting for its turn to access the DB via the server. So, with your current scenario, you want the synchronization of the DB to live on the server side.
Question 4 - Is there another non-RMI approach, with the same convenience and efficiency that would work better for this?
Absolutely. Below is a brief list of communication mechanisms that could be used.
1) RESTful
2) RMI
3) Sockets
4) gRPC
My recommendation is to use a RESTful approach, as it is the most straightforward and has plenty of implementations and documentation on the internet. Efficiency seems to be quite a high concern for you, but your operations are only manipulating a DB in a standard manner. Therefore, I believe a RESTful implementation would provide more than enough efficiency.
Think of it like this: you have N clients, a load balancer, and M servers. There is no constant connection between clients and servers, which reduces complexity and computation. As the number of clients N grows, the load balancer creates more server instances and allocates the load appropriately. Note that the requests between clients and servers are actually quite small, as they carry only a payload and a request type. Additionally, servers will receive the requests and compute the operations as normal and in parallel. The optimization can be done on the server side via thread pools or frameworks such as Spring.

What you are asking for is a way to coordinate the execution of tasks according to their dependencies. The fact that your tasks make RMI calls is insignificant. We can imagine purely computational tasks which do not access remote machines and still depend on each other, for example when values computed in one task are passed as parameters to other tasks.
The coordination of dependent tasks is the central problem of asynchronous programming. JDK support for asynchronous programming is not complete, but it is sufficient for your problem. You need two things: CompletableFuture and an Executor. Note that an RMI call blocks the thread it runs on, so using an Executor with a limited number of threads can lead to a specific kind of deadlock, named "thread starvation", where the computation cannot move on because all available threads are blocked. So use an executor with an unlimited number of threads; the simplest is one which creates a new thread for each task:
Executor newThreadExecutor = (Runnable r)->new Thread(r).start();
Then, for each RMI call (or any other task), declare the task method. If the task does not depend on other tasks, then the method should have no parameters. If the task depends on the result(s) produced by other task(s), then declare one or two parameters (a greater number of parameters is not supported by CompletableFuture directly). Suppose we have:
String m0a() {return "ok";} // no args
Integer m0b() {return 1;} // no args
Double m2(String arg, Integer arg2) {return arg2/2.0;} // 2 args
Suppose we want to compute the following result:
String r0a = m0a();
Integer r0b = m0b();
Double r2 = m2(r0a, r0b);
but asynchronously, so that the calls to m0a and m0b are executed in parallel, and the call to m2 starts as soon as both m0a and m0b have finished.
Then, wrap each task method in an instance of CompletableFuture. Depending on the signature of the task method, different methods of CompletableFuture are used:
CompletableFuture<String> t0a = CompletableFuture.supplyAsync(this::m0a, newThreadExecutor);
CompletableFuture<Integer> t0b = CompletableFuture.supplyAsync(this::m0b, newThreadExecutor);
CompletableFuture<Double> t2 = t0a.thenCombineAsync(t0b, this::m2, newThreadExecutor);
The tasks start executing right after declaration; no call to a special start method is required.
To get the final result from the last task t2, method get() of interface Future can be used:
Double res = t2.get();
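Putting the fragments above together, a self-contained version could look like the following. The class name DependentTasks and the run helper are introduced here only so the snippet compiles on its own, the task methods are made static for the same reason, and join() is used in place of get() merely to avoid checked exceptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

class DependentTasks {
    // One new thread per task, so blocked tasks can never starve each other.
    static final Executor NEW_THREAD_EXECUTOR = r -> new Thread(r).start();

    static String m0a() { return "ok"; }                                 // no args
    static Integer m0b() { return 1; }                                   // no args
    static Double m2(String arg, Integer arg2) { return arg2 / 2.0; }    // 2 args

    static Double run() {
        // t0a and t0b start immediately and run in parallel.
        CompletableFuture<String> t0a =
                CompletableFuture.supplyAsync(DependentTasks::m0a, NEW_THREAD_EXECUTOR);
        CompletableFuture<Integer> t0b =
                CompletableFuture.supplyAsync(DependentTasks::m0b, NEW_THREAD_EXECUTOR);
        // t2 starts only after both of its inputs have completed.
        CompletableFuture<Double> t2 =
                t0a.thenCombineAsync(t0b, DependentTasks::m2, NEW_THREAD_EXECUTOR);
        return t2.join(); // wait for the final result
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0.5
    }
}
```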

Best practices with Akka in Scala and third-party Java libraries

I need to use the memcached Java API in my Scala/Akka code. This API gives you both synchronous and asynchronous methods. The asynchronous ones return java.util.concurrent.Future. There is an existing question about dealing with Java Futures in Scala: How do I wrap a java.util.concurrent.Future in an Akka Future?. However, in my case I have two options:
Using the synchronous API, wrapping the blocking code in a Future and marking it as blocking:
Future {
blocking {
cache.get(key) //synchronous blocking call
}
}
Using the asynchronous Java API and polling the Java Future every n ms to check whether it has completed (as described in one of the answers to the linked question).
Which one is better? I am leaning towards the first option because polling can dramatically impact response times. Shouldn't the blocking { } block prevent blocking the whole pool?

I always go with the first option, but I do it in a slightly different way. I don't use the blocking feature. (Actually, I have not thought about it yet.) Instead I provide a custom execution context to the Future that wraps the synchronous blocking call. So it looks basically like this:
val ecForBlockingMemcachedStuff = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(100)) // whatever number you think is appropriate
// i create a separate ec for each blocking client/resource/api i use
Future {
cache.get(key) //synchronous blocking call
}(ecForBlockingMemcachedStuff) // or mark the execution context implicit. I like to mention it explicitly.
So all the blocking calls will use a dedicated execution context (= thread pool). It is thus separated from your main execution context, which is responsible for the non-blocking stuff.
This approach is also explained in an online training video for Play/Akka provided by Typesafe. There is a video in lesson 4 about how to handle blocking calls. It is explained by Nilanjan Raychaudhuri (hope I spelled it correctly), who is a well-known author of Scala books.
Update: I had a discussion with Nilanjan on Twitter. He explained the difference between the approach with blocking and a custom ExecutionContext. The blocking feature just creates a special ExecutionContext. It provides a naive answer to the question of how many threads you will need: it spawns a new thread every time all the other existing threads in the pool are busy. So it is actually an uncontrolled ExecutionContext. It could create lots of threads and lead to problems like an out-of-memory error. So the solution with the custom execution context is actually better, because it makes this problem obvious. Nilanjan also added that you need to consider circuit breaking for the case where this pool gets overloaded with requests.
TLDR: Yeah, blocking calls suck. Use a custom/dedicated ExecutionContext for blocking calls. Also consider circuit breaking.
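For readers working from Java rather than Scala, the same idea (a dedicated, bounded pool reserved for blocking calls) can be sketched with CompletableFuture. This is only an analogue of the Scala code above: blockingGet is a hypothetical stand-in for the synchronous cache call, and the pool size of 4 is an arbitrary placeholder.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingCalls {
    // Dedicated pool: blocking calls queue up here instead of occupying the main pool.
    private final ExecutorService blockingPool = Executors.newFixedThreadPool(4);

    // Hypothetical stand-in for a synchronous, blocking client call such as cache.get(key).
    static String blockingGet(String key) {
        try {
            Thread.sleep(20); // simulate network latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    // The caller gets a future at once; the blocking happens on the dedicated pool.
    CompletableFuture<String> get(String key) {
        return CompletableFuture.supplyAsync(() -> blockingGet(key), blockingPool);
    }

    void shutdown() {
        blockingPool.shutdown();
    }

    static String demo(String key) {
        BlockingCalls calls = new BlockingCalls();
        try {
            return calls.get(key).join(); // joined here only to show the result
        } finally {
            calls.shutdown();
        }
    }
}
```

Because the pool is fixed in size, at most four blocking calls run at once and the rest wait in the executor's queue, which is the bound the answer argues for.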

The Akka documentation provides a few suggestions on how to deal with blocking calls:
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason is typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottlenecks or run out of memory or threads when the application runs under increased load.
The non-exhaustive list of adequate solutions to the “blocking problem” includes the following suggestions:
Do the blocking call within an actor (or a set of actors managed by a router), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.
The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware.

Common practices to avoid timeouts / starvation in Java?

I have a web service that writes files to disk and other data to a database. The entire operation takes 1-2 seconds for each write.
The service can, but that is unlikely, be called by several clients at the same time. Let's assume that 20 clients call the web service at the same time; the write operations must be synchronized. In that case, some clients can get a timeout exception because they have to wait too many seconds.
Are there any good practices for this kind of situation? As it is now, the methods are synchronized (and that can cause the starvation/timeouts).
Should I let all threads into the write method by removing the synchronized keyword and have them put their tasks into a task queue to avoid a timeout? Is that the correct way to get around this?

Removing the synchronized and putting the work into a task queue will not help by itself (because that's effectively what the synchronized is doing for you). However, if you respond to the web request as soon as you put the task on the queue, you will reduce your response time, but at the cost of some reliability: the user will get a confirmation that the work is done before it has really been done (and the system could crash before the work is done).
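The trade-off described here can be sketched with a single-threaded executor acting as the task queue: requests return as soon as the work is enqueued, while the writes still run one at a time. WriteQueue and its methods are illustrative names, and the in-memory list stands in for the real file and database writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class WriteQueue {
    // One worker thread: tasks run strictly one after another, in submission order.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final List<String> written = new ArrayList<>(); // touched only by the writer thread

    // Returns immediately. The caller may answer the client right away (accepting the
    // reliability risk) or wait on the Future to confirm only after the write is done.
    Future<?> enqueue(String record) {
        return writer.submit(() -> {
            written.add(record); // stand-in for the slow disk/database write
        });
    }

    // Stops accepting work, waits for the queued writes, and returns what was written.
    List<String> drain() {
        writer.shutdown();
        try {
            if (!writer.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writes did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return written;
    }

    static List<String> demo() {
        WriteQueue queue = new WriteQueue();
        queue.enqueue("a");
        queue.enqueue("b");
        queue.enqueue("c");
        return queue.drain();
    }
}
```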

Francis Upton's practice is indeed an accepted practice.
Another one is making the synchronization more fine-grained. Instead of synchronizing all read/write methods of a class, you can synchronize access to exactly those invariants that need to be protected.
Better still is to get rid of explicit synchronization altogether. This is possible using the java.util.concurrent package. This package introduces new collections that use non-blocking algorithms (implemented in Java using compare-and-swap atomic instructions) or fine-grained internal locking instead of one global lock. These collections, such as ConcurrentHashMap, enable much better throughput when scaling.
You can read more about it in this article.
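As a small illustration of the last point, the sketch below updates a shared ConcurrentHashMap from several threads without any synchronized block in the calling code. The class and method names are made up for this example; merge is the map's own atomic read-modify-write operation (available since Java 8).

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCounts {
    // Several threads bump the same counter; returns the final count.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("writes", 1, Integer::sum); // atomic per-key update
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.getOrDefault("writes", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(8, 1000)); // prints 8000
    }
}
```

With a plain HashMap and a separate get followed by put, the same loop could lose updates; the atomic merge is what makes the unsynchronized calling code safe.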

In this type of implementation (slow service under increasing load) you want to make as much as possible async, including the timeout processing (if server-based) and the required I/O. Don't hold up your client response threads waiting for either of these time-consuming operations, to preserve the server's responsiveness to new requests, but instead fire off the required operations (maybe to a dynamic thread pool) and let callbacks process the results, whether timeout, complete I/O, or errors.
Send the appropriate response depending on what happens first, but be prepared to roll back I/O if you send an error/timeout message and then a completed I/O arrives (due to a race condition between I/O and timer). This implies transactional semantics are required in the server.
This is an area that gets increasingly complex as your load grows, but good design early on should allow you to scale with it. Ideally the client-servicing threads should not block at all.

How to integrate LMAX within a real financial application

I am also thinking of integrating the disruptor pattern in our application. I am a bit unsure about a few things before I start using the disruptor.
I have 3 producers: mainly a FIX thread which de-serialises the requests, another thread which continuously modifies the order price as the market moves, and one more thread which is responsible for de-serialising the requests sent from a GUI application. All three threads currently write to a blocking queue (hence we see a lot of contention on the queue).
The disruptor talks about a Single writer principle and from what I have read that approach scales the best. Is there any way we could make the above three threads obey the single writer principle?
Also, in a typical request/response application, and especially in our case, we have contention on an in-memory cache, as we need to lock the cache when we update it with the response, whilst a request might be happening for the same order. How do we handle this through the disruptor, i.e. how do I tie up a response to a particular request? Can I eliminate the lock on the cache, and if yes, how?
Any suggestions/pointers would be highly appreciated. We are currently using Java 1.6

I'm new to the Disruptor and am trying to understand as many use cases as possible. I have tried to answer your questions.
Yes, the Disruptor can be used to sequence calls from multiple producers. I understand that all 3 threads try to update the state of a shared object, and that a single consumer takes the necessary action on the shared object. Internally you can have the single consumer delegate calls to the appropriate single-threaded handler based on responsibility.
The Disruptor does exactly this. It sequences the calls such that the state is accessed by only one thread at a time. If there's a specific order in which the event handlers are to be invoked, set up the memory barrier. The latest version of the Disruptor has a DSL that lets you set up the order easily.
The cache can be abstracted and accessed through the Disruptor. At any time, only a reader or a writer would get access to the cache, since all calls to the cache are sequential.
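The single-consumer idea can be illustrated without the Disruptor itself. The sketch below is not the Disruptor API: it substitutes a plain single-threaded executor from the JDK for the ring buffer, and OrderCache and its methods are hypothetical names. It also uses lambdas and CompletableFuture, so it needs Java 8 rather than the Java 1.6 mentioned in the question. What it shows is that when only one thread ever touches the cache, the cache itself needs no lock, and the future returned for each call ties a response to its request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OrderCache {
    // The only thread that ever reads or writes the map below.
    private final ExecutorService owner = Executors.newSingleThreadExecutor();
    private final Map<String, Double> prices = new HashMap<>(); // plain map, no lock

    // Any producer thread may call this; the update itself runs on the owner thread.
    CompletableFuture<Void> update(String orderId, double price) {
        return CompletableFuture.runAsync(() -> prices.put(orderId, price), owner);
    }

    // The returned future is the response belonging to this particular request.
    CompletableFuture<Double> lookup(String orderId) {
        return CompletableFuture.supplyAsync(() -> prices.get(orderId), owner);
    }

    void shutdown() {
        owner.shutdown();
    }

    static double demo() {
        OrderCache cache = new OrderCache();
        try {
            Thread[] producers = new Thread[3]; // three producers, as in the question
            for (int p = 0; p < producers.length; p++) {
                final String orderId = "order-" + p;
                producers[p] = new Thread(() -> {
                    for (int i = 1; i <= 100; i++) {
                        cache.update(orderId, i);
                    }
                });
                producers[p].start();
            }
            for (Thread producer : producers) {
                producer.join();
            }
            cache.update("order-0", 42.5);          // queued after all earlier updates
            return cache.lookup("order-0").join();  // runs after the update above
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            cache.shutdown();
        }
    }
}
```

The executor's internal queue is still a contended structure, which is precisely the part the Disruptor's ring buffer is designed to replace; the sketch only demonstrates the single-writer ownership of the cache.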

Is there a use case for creating threads without synchronization and locks?

Since thread execution happens in a pool, and is not guaranteed to queue in any particular order, then why would you ever create threads without the protection of synchronization and locks? In order to protect data attached to an object's state (what I understand to be the primary purpose of using threads), locking appears to be the only choice. Eventually you'll end up with race conditions and "corrupted" data if you don't synchronize. So if you're not interested in protecting that data, then why use threads at all?

If there's no shared mutable data, there's no need for synchronization or locks.

Delegation, just as one example. Consider a webserver that gets connect requests. It can delegate to a worker thread a particular request. The main thread can pass all the data it wants to the worker thread, as long as that data is immutable, and not have to worry at all about concurrent data access.
(For that matter, both main thread and worker thread can send all the immutable data to each other they want, it just requires a messaging queue of some sort, so the queue may need synchronization but not the data itself. But you don't need a message queue to get data to a worker thread, just construct the data before the thread starts, and as long as the data is immutable at that point, you don't need any synchronization or locks or concurrency management of any sort, other than the ability to run a thread.)
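A minimal sketch of that handoff, assuming a hypothetical Request class whose fields are final and whose list is defensively copied, could look like this. The data is fully constructed before the worker starts, so the worker reads it without any locking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable request: final fields, defensive copy, no setters.
final class Request {
    final String path;
    final List<String> params;

    Request(String path, List<String> params) {
        this.path = path;
        this.params = Collections.unmodifiableList(new ArrayList<>(params));
    }
}

class ImmutableHandoff {
    // Hands the request to a worker thread and returns the worker's answer.
    static String handle(Request request) {
        String[] response = new String[1]; // written by the worker, read after join()
        Thread worker = new Thread(() ->
                response[0] = request.path + "?" + String.join("&", request.params));
        worker.start(); // everything built before start() is visible to the worker
        try {
            worker.join(); // join() makes the worker's write visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return response[0];
    }

    public static void main(String[] args) {
        Request request = new Request("/orders", Arrays.asList("id=7", "full=true"));
        System.out.println(handle(request)); // prints /orders?id=7&full=true
    }
}
```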

Synchronization and locks protect shared state from conflicting concurrent updates. If there is no shared state to protect, you can run multiple threads without locking and synchronization. This might be the case in a web server with multiple independent worker threads serving incoming requests. Another way to avoid synchronization and locking is to have your threads only operate on immutable shared state: if a thread can't alter any data that another thread is operating on, concurrent unsynchronized access is fine.
Or you might be using an Actor-based system to handle concurrency. Actors communicate by message passing only, there is no shared state for them to worry about. So here you can have many threads running many Actors without locks. Erlang uses this approach, and there is a Scala Actors library that allows you to program this way on the JVM. In addition there are Actors-based libraries for Java.

In order to protect data attached to an object's state (what I understand to be the primary purpose of using threads), locking appears to be the only choice. ... So if you're not interested in protecting that data, then why use threads at all?
The highlighted bit of your question is incorrect, and since it is the root cause of your "doubts" about threads, it needs to be addressed explicitly.
In fact, the primary purpose of using threads is to allow tasks to proceed in parallel, where possible. On a multiprocessor the parallelism will (all things being equal) speed up your computations. But there are other benefits that apply on a uniprocessor as well. The most obvious one is that threads allow an application to do work while waiting for some IO operation to complete.
Threads don't actually protect object state in any meaningful way. The protection you are attributing to threads comes from:
declaring members with the right access,
hiding state behind getters / setters,
correct use of synchronization,
use of the Java security framework, and/or
sending requests to other servers / services.
You can do all of these independently of threading.

java.util.concurrent.atomic provides for some minimal operations that can be performed in a lock-free and yet thread-safe way. If you can arrange your concurrency entirely around such classes and operations, your performance can be vastly enhanced (as you avoid all the overhead connected with locking). Granted, it's unusual to be working on such a simplifiable problem (more often some locking will be needed), but, if and when you do find yourself in such a situation, well, then, that's exactly the use case you're asking about!-)
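As an illustration, the sketch below keeps a running count and maximum using only AtomicLong, with no locks. AtomicStats is a name made up for this example; incrementAndGet and compareAndSet are the standard AtomicLong operations, and the retry loop is the usual compare-and-swap pattern.

```java
import java.util.concurrent.atomic.AtomicLong;

class AtomicStats {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void record(long value) {
        count.incrementAndGet(); // atomic increment, no lock
        long seen;
        do {
            seen = max.get();
            // Retry only if our value is larger and another thread changed max meanwhile.
        } while (value > seen && !max.compareAndSet(seen, value));
    }

    long count() { return count.get(); }
    long max() { return max.get(); }

    // Each thread records perThread distinct values; returns {count, max}.
    static long[] demo(int threads, int perThread) {
        AtomicStats stats = new AtomicStats();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long base = (long) t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    stats.record(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { stats.count(), stats.max() };
    }
}
```

Note that count and max are each updated atomically but not together, so a reader may briefly see a count that already includes a value not yet reflected in max; keeping two values consistent with each other is where locking usually comes back in.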

There are other kinds of protection for shared data. Maybe you have atomic sections, monitors, software transactional memory, or lock-free data structures. All these ideas support parallel execution without explicit locking. You can Google any of these terms and learn something interesting. If your primary interest is Java, look up Tim Harris's work.

Threads allow multiple units of work to progress concurrently. Synchronisation is simply there to protect shared resources from unsafe access; if it is not needed, you don't use it.
Processing on a thread is delayed when it accesses certain resources, such as IO, and it may be desirable to keep the CPU processing other units of work while those are delayed.
As in the example in the other answer, listening for service requests may well be a unit of work that is kept independent of responding to a request, as the latter may block due to resource contention, say disk access or other IO.
