Whenever I read articles about concurrency with Kotlin coroutines or Go goroutines, people show the same example:
create 100,000 threads in Java or C#, and oops, the process runs out of memory.
Yes, but who actually uses the Thread class directly in Java or C#?
In Java and C#, there are thread pools behind CompletableFuture and Task.
When we create 100,000 Tasks or CompletableFutures, we can do that easily with an ExecutorService/ForkJoinPool or the .NET default ThreadPool. They reuse threads, and if no thread is available, tasks wait in the queue.
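For example, something like this rough Kotlin sketch (the pool size and task count are arbitrary):

import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

fun main() {
    // 100_000 tasks on an 8-thread pool: threads are reused and
    // tasks that find no free thread simply wait in the queue.
    val pool = Executors.newFixedThreadPool(8)
    repeat(100_000) { i ->
        pool.submit {
            if (i % 10_000 == 0) println("task $i on ${Thread.currentThread().name}")
        }
    }
    pool.shutdown()
    pool.awaitTermination(1, TimeUnit.MINUTES)
}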
My questions:
Yes, structured concurrency is good for cancellation. But Kotlin coroutines run on thread pools just like CompletableFuture, only with a natural, sequential syntax instead of Java-style callbacks. Is syntax the only difference between Kotlin coroutines and C# Task or Java CompletableFuture?
Kotlin runs on the JVM, and as far as I know the JVM doesn't support green threads. Yet people talk as if Kotlin used green threads. How is that possible on the JVM? And why are coroutines called lightweight threads? By that logic we could call CompletableFuture and Task lightweight threads too, right?
Yes, Go has a scheduler and goroutines are user-level threads. When we create a goroutine it goes into a local run queue, and a dedicated OS thread takes goroutines from that queue one by one and executes them. There are no kernel-level context switches; they all run on the same OS thread until one blocks. Goroutines are cheap, so yes, we can call goroutines lightweight threads.
Maybe I'm completely wrong about coroutines; please correct me.
Making things simple:
Thread - easy to use, because it is sequential. Reading from the network and writing to a file is as simple as: writeToDisk(readFromNetwork()). On the other hand, it is expensive to run each task in a separate thread.
Executors/CompletableFuture - more performant, it makes better use of both CPU and memory. On the other hand, it requires using callbacks and the code quickly becomes hard to read and maintain.
Coroutines - both sequential and performant.
I ignore other features of coroutines like structured concurrency, because this was not your main concern.
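To make the contrast concrete, here is a minimal Kotlin sketch, assuming kotlinx-coroutines is on the classpath; readFromNetwork and writeToDisk are fake suspending stand-ins for real IO:

import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Stand-ins for real IO; delay() suspends the coroutine without blocking a thread.
suspend fun readFromNetwork(): String { delay(100); return "payload" }
suspend fun writeToDisk(data: String) { delay(100); println("wrote $data") }

fun main() = runBlocking {
    // Reads top to bottom like blocking code, yet the underlying thread is
    // released at every suspension point and can serve other coroutines.
    writeToDisk(readFromNetwork())
}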
Related
Is there any way to do asynchronous IO in Java without blocking any threads (including background threads)? Coming from C#, my understanding of async IO is that when you call
await ReadAsync()
The calling thread (part of a thread pool) steps into the ReadAsync function, at some point calls an asynchronous read function in the OS kernel, and then returns itself to the thread pool to pick up other Tasks. Once the read completes, the thread pool is notified and another thread picks up the rest of the Task.
In Java, on the other hand, the documentation and this answer seem to suggest that asynchronous IO functions are simply called by a background thread that then blocks. This seems less performant. Is there any way to achieve true, non-blocking IO in Java?
AsynchronousFileChannel.open() returns instances of different implementations depending on the running environment. On Windows it should return an instance of WindowsAsynchronousFileChannelImpl, which uses an I/O completion port and avoids blocking threads on IO operations. The threads of the thread pool are only used to dispatch results; they do not block unless the end-user programmer blocks them.
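To illustrate, a minimal Kotlin sketch against the raw AsynchronousFileChannel API (data.txt is a placeholder path):

import java.nio.ByteBuffer
import java.nio.channels.AsynchronousFileChannel
import java.nio.channels.CompletionHandler
import java.nio.file.Paths
import java.nio.file.StandardOpenOption

fun main() {
    val channel = AsynchronousFileChannel.open(Paths.get("data.txt"), StandardOpenOption.READ)
    val buffer = ByteBuffer.allocate(4096)
    // read() returns immediately; the handler fires when the OS signals
    // completion (on Windows via an I/O completion port).
    channel.read(buffer, 0L, buffer, object : CompletionHandler<Int, ByteBuffer> {
        override fun completed(bytesRead: Int, buf: ByteBuffer) {
            println("read $bytesRead bytes")
            channel.close()
        }
        override fun failed(exc: Throwable, buf: ByteBuffer) {
            exc.printStackTrace()
            channel.close()
        }
    })
    Thread.sleep(1000) // crude way to keep the demo JVM alive until the callback runs
}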
RxIo is built on top of the AsynchronousFileChannel and provides AsyncFiles, the asynchronous equivalent of the synchronous Files class. Taking advantage of the continuation-passing style of CompletableFuture (the equivalent of a .NET Task), you may read a file's content without blocking:
AsyncFiles
    .readAll(path)
    .thenAccept(body -> { /* invoked on completion */ })
    .exceptionally(excep -> { /* invoked on error */ return null; });
You may run the unit tests of RxIo and place a breakpoint at open() and inspect the implementation of WindowsAsynchronousFileChannelImpl.
Until some time ago there were problems with asynchronous file I/O on Linux. There was an aio interface, but it was only asynchronous for O_DIRECT, which is quite inconvenient for standard use cases. So the standard JDK implementation of AsynchronousFileChannel for Linux internally uses thread pooling and simple blocking I/O which is not really asynchronous I/O.
Things have changed a bit since Linux introduced the io_uring interface. It is now possible to use real non-blocking file I/O, not just for O_DIRECT but for buffered I/O too, and it does a lot more to reduce syscall overhead and increase performance. Read more about io_uring.
At the moment there is no built-in support for io_uring in Java. There have been rumors that support may appear for the sake of better Project Loom support, but those are just rumors.
There are third-party libraries, such as jasyncfio, that add asynchronous file I/O support via io_uring for Java.
I keep studying and trying the reactive style of coding using Reactor and RxJava. I understand that reactive coding makes better use of the CPU compared to single-threaded execution.
Is there any concrete comparison between reactive programming vs imperative programming in web based applications?
How much performance gain and throughput do I achieve by using reactive programming over non-reactive programming?
Also what are the advantages and disadvantages of Reactive Programming?
Is there any statistical benchmark?
Well, reactive programming means you do all your IO-bound tasks, such as network calls, asynchronously. For instance, if your application calls an external REST API or a database, you can make that invocation asynchronously. If you do so, your current thread does not block, and you can serve lots of requests by spawning only one or a few threads. With a blocking approach, you need one thread to handle each and every request. You may refer to my multi-part blog post (part one, part two and part three) for further details.
Other than that, you can achieve the same with callbacks, but nested callbacks quickly lead to callback hell: one callback inside another produces very complex code that is hard to maintain. RxJava, by contrast, lets you write asynchronous code that is much simpler, composable and readable. It also provides lots of powerful operators, such as map, zip, etc., which make your code simpler while boosting performance through parallel execution of tasks that do not depend on each other.
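For example, something along these lines with RxJava 3 in Kotlin (userById and ordersFor are hypothetical remote calls):

import io.reactivex.rxjava3.core.Single
import io.reactivex.rxjava3.schedulers.Schedulers

// Hypothetical stand-ins for two independent remote calls.
fun userById(id: Int): Single<String> =
    Single.fromCallable { "user-$id" }.subscribeOn(Schedulers.io())

fun ordersFor(id: Int): Single<Int> =
    Single.fromCallable { 3 }.subscribeOn(Schedulers.io())

fun main() {
    // zip runs both calls concurrently and combines the results,
    // with no callbacks nested inside callbacks.
    Single.zip(userById(42), ordersFor(42)) { user, orders -> "$user has $orders orders" }
        .blockingSubscribe(::println)
}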
RxJava is not just another Observer implementation with a set of operators; it also gives you good error-handling and retry mechanisms, which are really handy.
That said, I have not benchmarked RxJava against an imperative approach, so I cannot comment statistically. But I am fairly sure RxJava should perform well compared with blocking mechanisms.
Update
Since I gathered more experience over time, I thought of adding more points to my answer.
Based on the article, ReactiveX is a library for composing asynchronous and event-based programs by using observable sequences. I recommend going through that introductory article first.
These are some properties of reactive systems: Event Driven, Scalable, Resilient, Responsive
When it comes to RxJava, it offers two main facilities to a programmer. First, it offers a nice composable API with a rich set of operators such as zip, concat, map, etc. This yields simpler and more readable code, and when it comes to code, readability and simplicity are of the utmost importance. Second, it provides excellent abstractions that make concurrency declarative.
A popular misconception is that Rx is multithreaded by default. In fact, Rx is single-threaded by default. If you want to do things asynchronously, you have to tell it explicitly, using the subscribeOn and observeOn operators and passing the relevant schedulers. RxJava gives you thread pools for asynchronous tasks. There are many schedulers, such as IO, Computation and so forth. The IO scheduler, as the name suggests, is best suited for IO-intensive tasks such as network calls, while the Computation scheduler is good for more CPU-intensive tasks. You can also hook your own Executor services up with RxJava. The built-in schedulers mainly help you get rid of maintaining your own Executor services and make your code simpler.
Finally, a word on subscribeOn and observeOn.
In the Rx world, there are generally two things you want to control the concurrency model for:
The invocation of the subscription
The observing of notifications
SubscribeOn: specify the Scheduler on which an Observable will operate.
ObserveOn: specify the Scheduler on which an observer will observe this Observable
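A small Kotlin sketch of both operators, again assuming RxJava 3 (fetchFromNetwork is a stand-in for a real blocking call):

import io.reactivex.rxjava3.core.Observable
import io.reactivex.rxjava3.schedulers.Schedulers

fun fetchFromNetwork(): String = "response body" // stand-in for a blocking network call

fun main() {
    Observable.fromCallable { fetchFromNetwork() }
        .subscribeOn(Schedulers.io())             // subscription side runs on the IO pool
        .map { body -> body.length }              // still on the IO thread
        .observeOn(Schedulers.computation())      // notifications hop to the computation pool
        .subscribe { len -> println("length=$len on ${Thread.currentThread().name}") }

    Thread.sleep(500) // keep the demo JVM alive until the async pipeline finishes
}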
Disadvantages
More memory intensive, since it usually has to store streams of data over time.
Might feel unconventional to learn at the start (everything needs to be a stream).
Most complexities have to be dealt with at the time of declaration of new services.
Lack of good and simple resources to learn.
Often confused to be equivalent to Functional Reactive Programming.
Apart from what is already mentioned in other answers regarding non-blocking features, another great feature of reactive programming is the use of backpressure. It is normally used in situations where your publisher emits more information than your consumer can process.
With this mechanism you can control the flow of traffic between the two and avoid nasty out-of-memory problems.
You can see some practical examples of reactive programming here: https://github.com/politrons/reactive
And about back pressure here: https://github.com/politrons/Akka/blob/master/src/main/scala/stream/BackPressure.scala
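A minimal RxJava 3 sketch of the idea in Kotlin (a Flowable, unlike an Observable, honours backpressure):

import io.reactivex.rxjava3.core.Flowable
import io.reactivex.rxjava3.schedulers.Schedulers

fun main() {
    // The consumer side requests items at its own pace; the buffer strategy
    // decides what happens when the producer gets ahead of the consumer.
    Flowable.range(1, 1_000_000)
        .onBackpressureBuffer()
        .observeOn(Schedulers.computation())
        .subscribe { n -> if (n % 100_000 == 0) println("processed $n") }

    Thread.sleep(2000) // keep the demo JVM alive
}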
By the way, the main disadvantage of reactive programming is the learning curve, because you are changing the programming paradigm. But nowadays many major companies respect and follow the reactive manifesto.
Reactive Programming is a style of micro-architecture involving intelligent routing and consumption of events.
The point of reactive is that you can do more with less; specifically, you can process higher loads with fewer threads.
Reactive types are not intended to allow you to process your requests or data faster. Their strength lies in their capacity to serve more requests concurrently and to handle operations with latency, such as requesting data from a remote server, more efficiently.
They allow you to provide a better quality of service and a predictable capacity planning by dealing natively with time and latency without consuming more resources.
From
https://blog.redelastic.com/what-is-reactive-programming-bc9fa7f4a7fc
https://spring.io/blog/2016/06/07/notes-on-reactive-programming-part-i-the-reactive-landscape
https://spring.io/blog/2016/07/28/reactive-programming-with-spring-5-0-m1
Advantages
Cleaner code, more concise
Easier to read (once you get the hang of it)
Easier to scale (pipe any operation)
Better error handling
Event-driven inspired -> plays well with streams (Kafka, RabbitMQ, etc.)
Backpressure (client can control flow)
Disadvantages
Can become more memory intensive in some cases
Somewhat steep learning curve
Reactive programming is a kind of imperative programming.
Reactive programming is a kind of parallel programming.
You can achieve a performance gain over single-threaded execution only if you manage to create parallel branches. Whether they are executed by multiple threads or by reactive constructs (which are in fact asynchronous procedures) does not matter.
The single advantage of reactive programming over multithreaded programming is lower memory consumption (each thread requires 0.5 to 1 megabyte). The disadvantage is that it is less easy to program.
UPDATE (Aug 2020). Parallel programming comes in two flavours: multithreaded programming, where the main unit of activity is the thread, and asynchronous programming, where the main unit of activity is the asynchronous procedure (including actors, which are repeatable asynchronous procedures). In multithreaded programming, various means of communication are used: unbounded queues, bounded (blocking) queues, binary and counting semaphores, countdown latches and so on. Moreover, there is always the possibility of creating your own means of communication. In asynchronous programming, until recently, only two kinds of communicators were used: the future, for non-repeatable asynchronous procedures, and the unbounded queue, for actors.

The unbounded queue causes problems when the producer works faster than the consumer. To cope with this problem, a new communication protocol was invented: the reactive stream, a combination of an unbounded queue and a counting (asynchronous) semaphore that makes the queue bounded. This is the direct analogue of the blocking queue in multithreaded programming, and programming with reactive streams was proudly called Reactive Programming (imagine if, in multithreaded programming, programming with blocking queues were called Blocking Programming). But again, no means to create their own communication tools were provided to the asynchronous programmer, and the asynchronous semaphore cannot be used on its own, only as part of a reactive stream. That said, the theory of asynchronous programming, including the theory of reactive programming, lags far behind the theory of multithreaded programming.
A fancy addition to reactive streams is mapping/filtering functions that allow you to write linear pipelines like
publisher
    .map(value -> mappingFunction(value))
    .filter(value -> filterFunction(value))
    .flatMap(value -> ...)
etc.
But this is not an exclusive feature of reactive programming, and it only allows you to create linear pipelines, while in multithreaded programming it is easy to create computational graphs of arbitrary topology.
So I have an existing Spring library that performs some blocking tasks (exposed as services) that I intend to wrap in Scala Futures to showcase multiprocessor capabilities. The intention is to get people interested in the Scala/Akka tech stack.
Here is my problem.
Let's say I get two services from the existing Spring library. These services perform different blocking tasks (IO, DB operations).
How do I make sure that these tasks (service calls) are carried out across multiple cores?
For example, how do I make use of custom execution contexts?
Do I need one per service call?
How do execution contexts / thread pools relate to multi-core operation?
I appreciate any help in understanding this.
You cannot ensure that tasks will be executed on different cores. The workflow for the sample program would be as follows.
Write a program that does two things on two different threads (Futures, Java threads, Actors, you name it).
The JVM sees that you want two threads, so it starts two JVM threads and submits them to the OS process dispatcher (or the other way round; it doesn't matter).
OS decides on which core to execute each thread. Usually, it will try to put threads on different cores to maximize the overall efficiency but it is not guaranteed; you might have a situation that your 10 JVM threads will be executed on one core, although this is extreme.
The rule of thumb for writing concurrent and seemingly parallel applications is: "Here, take my, e.g., 10 threads and TRY to split them among the cores."
There are some tricks, like tuning CPU affinity (low-level and very risky) or spawning a plethora of threads to make sure they are parallelized (a lot of overhead and work for the GC). However, in general, the OS is usually not that overloaded, and if you create two actors, e.g. one for the DB and one for network IO, they should work well in parallel.
UPDATE:
The global ExecutionContext manages the thread pool. However, you can define your own and submit runnables to it with myThreadPool.submit(runnable: Runnable). Have a look at the links provided in the comment.
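The same idea in plain JVM terms, sketched here in Kotlin (in Scala you would typically wrap such a pool in an ExecutionContext via ExecutionContext.fromExecutorService):

import java.util.concurrent.Executors

fun main() {
    // One dedicated pool per kind of blocking work; the OS still decides
    // which core each thread actually runs on.
    val dbPool = Executors.newFixedThreadPool(2)
    val ioPool = Executors.newFixedThreadPool(2)

    dbPool.submit { println("db call on ${Thread.currentThread().name}") }
    ioPool.submit { println("network call on ${Thread.currentThread().name}") }

    dbPool.shutdown()
    ioPool.shutdown()
}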
As far as I know, one of the most common JVM concurrency APIs, futures (at least as implemented in Scala), relies on user code to relinquish a thread when it is potentially going to sit idle waiting. In Scala this is commonly referred to as "avoiding blocking", and the developer has to implement it everywhere it makes sense.
Not quite efficient.
Is there something inherent to the JVM that prevents it from switching a thread's context to new tasks when the thread is idle, the way operating-system process schedulers do?
Is there something inherent to the JVM that prevents it from switching a thread's context to new tasks when the thread is idle, the way operating-system process schedulers do?
Mostly the fact that such a switch has to be done cooperatively. Every single blocking method must be wrapped or re-implemented in a way that allows the task to be resumed once it is done; after all, there is no native thread waiting for completion of the blocking action anymore.
While this can be done in principle for JVM-internal blocking methods, consider arbitrary native code executed via JNI: the JVM wouldn't know how to stack-switch those native threads; they are stuck in native code, after all.
You might want to have a look at Quasar. As I understand it, they implemented such wrappers or equivalents for some JDK-internal methods, such as sleep, park/unpark and channel-based IO, among a bunch of others, which allows their fibers (and thus futures running on those fibers) to perform exactly that kind of user-mode context switching while they wait for completion.
Edit: JNI alone is already sufficient to limit user-mode task switching to an opportunistic optimization that may have to fall back to spinning up additional native threads when native code blocks a thread.
But it is not the only issue. For example, on Linux, truly asynchronous file IO operations need filesystem and kernel support (see this SO question on AIO), which not all of them provide. Where it is not provided, it has to be emulated using additional blocking IO threads, thus re-introducing all the overhead we wanted to avoid in the first place. We might as well just block on the thread pool itself and spin up additional threads; at least we would avoid inter-thread communication that way.
Memory-mapped files can also block a thread and force the OS-scheduler to suspend the thread due to page faults and I'm not aware of means to cooperate with the virtual memory system to avoid that.
Not to mention that all blocking calls on the VM would have to be re-implemented using asynchronous equivalents provided by the OS. Miss even one and you will have a blocked thread; and if you have a blocked thread, your thread pools will need an auto-grow feature and we are back to square one.
Last but not least, there may be cases where blocking, one-thread-per-filedescriptor IO may be desirable. The pervasive changes required to guarantee user-mode switching might break those.
So, all in all, user-mode switching is possible, sometimes. But the JVM cannot make hard guarantees about it, so it has to implement all the native thread handling anyway, and the programmer will have to code at least somewhat cooperatively, keeping in mind the assumptions of the thread pools executing those futures. Some of these cases could be eliminated, but not all of them.
The ForkJoinTask documentation explicitly calls out that "Subdividable tasks should also not perform blocking I/O". Its primary aim is "computational tasks calculating pure functions or operating on purely isolated objects". My questions are:
Why design ForkJoinTask to restrict blocking IO tasks?
What are the gotchas if I do implement a blocking IO task?
How come both the Spring and Play frameworks are full of examples using fork-join executors for DB calls?
In my scenario, a single request does two types of work: encryption, which pushes a CPU core to 100% for 200 ms, and a few database calls. Any static partitioning, such as 6 threads for encryption and 2 threads for blocking IO, will not provide optimal usage of the CPU. Hence, a fork-join executor with a certain level of over-provisioning of threads over the total CPU count, coupled with work stealing, would ensure better usage of CPU resources.
Is my assumption and understanding of the fork-join executor correct? If not, please point me to the gap.
Why design ForkJoinTask to restrict blocking IO tasks?
Underlying the fork-join pool is a shared set of threads. If IO work blocks some of those threads, fewer threads are left for CPU-intensive work, and other non-blocking work will starve.
What are the gotchas if I do implement a blocking IO task?
Typically, a ForkJoinPool allocates about as many threads as there are processors. So if you do have to block threads on IO, make sure you allocate enough threads for your other tasks.
You can also isolate your IO work on dedicated threads that are not shared with the fork-join pool. But when you make a blocking IO call, that thread blocks and cannot be scheduled for other tasks until it unblocks.
How come both the Spring and Play frameworks are full of examples using fork-join executors for DB calls?
Play is no different: they use dedicated pools for IO tasks, so other tasks won't suffer.
The framework does not restrict any type of processing; blocking and the like are just not recommended. I wrote a critique of this framework years ago; here is the point on the recommendations. It was written for the Java 7 version but is still applicable to Java 8.
Blocking is not fatal; Spring and Play block and they work just fine. You need to be careful when using Java 8, since there is a default common fork/join pool, and tying up threads there may have consequences for other users. You could always define your own fork/join pool, with the additional overhead, but at least you wouldn't interfere with others using the common pool.
Your scenario doesn't look bad: you're not waiting for replies from the internet. Give it a try. If you run into difficulty with stalling threads, look into the ForkJoinPool.ManagedBlocker interface. Using that interface tells the fork/join pool that you are making blocking calls, and the framework will create compensation threads.
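For illustration, a rough Kotlin sketch of a ManagedBlocker wrapping a blocking queue take; QueueBlocker and takeManaged are made-up names:

import java.util.concurrent.ForkJoinPool
import java.util.concurrent.LinkedBlockingQueue

// Wraps a blocking take() so the fork/join pool knows we are about to block
// and can spawn a compensation thread to keep up its target parallelism.
class QueueBlocker<T : Any>(private val queue: LinkedBlockingQueue<T>) : ForkJoinPool.ManagedBlocker {
    var item: T? = null
        private set

    override fun block(): Boolean {
        if (item == null) item = queue.take() // the actual blocking call
        return true                           // true = no further blocking needed
    }

    override fun isReleasable(): Boolean {
        if (item == null) item = queue.poll() // non-blocking attempt first
        return item != null
    }
}

fun <T : Any> takeManaged(queue: LinkedBlockingQueue<T>): T {
    val blocker = QueueBlocker(queue)
    ForkJoinPool.managedBlock(blocker)        // may create a compensation thread
    return blocker.item!!
}

takeManaged would then be called from inside a ForkJoinTask instead of calling queue.take() directly.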