As part of a study I am doing, I am exploring the supposed simplicity of using languages like Scala & Clojure to achieve concurrency on the JVM.
By simplicity, I am hoping to prove that these languages provide easier concurrency constructs than what Java 7 provides.
Therefore, I am hoping to find some good references that explain the complexities of Java's concurrency model.
Outside of pointing me in the direction of Google (which I have already searched with limited success), I would appreciate if those in-the-know could provide me with some good references to get me started off in this area.
Thanks
Java does not support lambda expressions. Creating an inline callback (eg, for the completion of an asynchronous call) requires 5 lines of boilerplate for an anonymous type.
This strongly discourages people from using callbacks. This is probably why Java 7 still does not have an interface for a callback that takes a value (as opposed to Runnable and Callbable), whereas C# has had one since 2005.
Therefore, the JDK does not have any real support for asynchronous operations.
The key to an asynchronous operation is the ability to kick off a long-running request, and have it run a callback when it finishes, without consuming a thread for the duration of the request. In Java, you can only do this by making a separate thread call get() on a Future<V>. This limits the concurrency of an application using the standard API to the number of threads you can sanely support.
To solve this problem, Google's Guava framework for better Java code introduces a ListenableFuture<V> interface which does have completion callbacks.
Languages like Scala fix this problem by supporting lambda expressions (which compile to anonymous classes) and adding their own Promise / Future types.
While higher level languages are easier to use multiple cores, what is often forgotten is why you want to use multiple cores which is to make the program faster e.g. increase its throughput.
When you consider options which increase concurrency, you need to test whether these options actually improve performance in some way. (Because very often they don't)
e.g. STM (Software Transactional Memory) makes it easier to write multi-threaded applications without having to worry about concurrency issues. The problem is that for trivial examples, it would be faster to not use STM and only use one thread.
Using multiple threads adds complexity and makes your application more fragile, so there has to be a good reason to do it otherwise you should stick to the simplest solution possible.
For more discussion
http://vanillajava.blogspot.co.uk/2011/11/why-concurency-examples-are-confusing.html
Related
Does anybody know about a tool that allows the explicit switching of threads at certain points in the code?
I am testing Software Transactional Memory for my bachelor thesis and for these tests, I need specific execution orders of threads (e.g. thread 1 reads 2 variables, after that switch to thread 2 and write to a variable, etc.). The problem is, the software library implementing the STM prohibits normal java synchronizaton methods in the code, so I cannot use sychronized blocks, locks or semaphores.
I was hoping someone knows about a tool like Concurrit (https://code.google.com/archive/p/concurrit/), only for Java...
No such tool exists. The only way you could achieve something like this would be to (drastically) modify the JVM itself, replacing the existing thread scheduling mechanisms. That would be an impractically large project ... by itself.
Opinion: the Concurrit DSL is not designed to be a practical programming language, and adding the mechanisms that it provides to a practical programming language most likely would make it it non-performant. Naturally, there is unlikely to be much enthusiasm for implementing such a tool for a performant1 language such as Java.
1 - Relatively speaking.
How can scala make writing multi-threaded programs easier than in java? What can scala do (that java can't) to facilitate taking advantage of multiple processors?
The rules for concurrency are
1 avoid it if you can
2 share nothing if you can
3 share immutable objects if you can
4 be very careful (and lucky)
For rule 2 Scala helps by providing a nicely integrated message passing library out of the box in the form of the actors.
For rule 3 Scala helps to make everything immutable by default.
For rule 4 Scala's flexible syntax allows the creation of internal DSL's making it easier and less wordy to express what you need consicely. i.e. less place for surprises (if done well)
If one takes Akka as a foundation for concurrent (and distributed) computing, one might argue the only difference is the usual stuff that distinguishes Scala from Java, since Akka has both Java and Scala bindings for all its facilities.
There is nothing Scala does that Java does not. That would be silly. Scala runs on the same JVM that Java does.
What Scala does do is make it easier to write, easier to reason about and easier to debug a multi-thread program.
The good bits of Scala for concurrency are its focus on immutable objects, its message-passing and its Actors.
This gives you thread-safe read-only data, easy ways to pass that data to other threads, and easy use of a thread pool.
May I know is there any C++ equivalent class, to Java java.util.concurrent.ArrayBlockingQueue
http://download.java.net/jdk7/docs/api/java/util/concurrent/ArrayBlockingQueue.html
Check out tbb::concurrent_bounded_queue from the Intel Threading Building Blocks (TBB).
(Disclaimer: I haven't actually had a chance to use it in a project yet, but I've been following TBB).
The current version of C++ doesn't include anything equivalent (it doesn't include any thread support at all). The next version of C++ (C++0x) doesn't include a direct equivalent either.
Instead, it has both lower level constructs from which you could create a thread safe blocking queue (e.g. a normal container along with mutexes, condition variables, etc., to synchronize access to it).
It also has a much higher level set of constructs: a promise, a future, a packaged_task, and so on. These completely hide the relatively low level details like queuing between the threads. Instead, you basically just ask for something to be done, and sometime later you can get a result. All the details in between are handled internally.
If you want something right now, you might consider the Boost Interprocess library. This includes (among other things) a Message Queue class. If memory serves, it supports both blocking and non-blocking variants.
Intel's Threading Building Blocks has a couple different concurrent queues, one of which might be similar.
concurrent_queue might be the one you are looking for. It comes with Parallel Patterns library from Microsoft.
This is my C++ implementation of ArrayBlockingQueue trying to be as close and conformant as possible to Java implementation. except iterator thread safety rest is perfectly compliant. I dont consider generally there is a need to iterate the whole queue ar run time.
https://github.com/anandkulkarnisg/ArrayBlockingQueue
The Examples should demonstrate how to use the blocking queue. It is internally implemented as a circular buffer based queue using raw array [ for good performance ].
Standard C++ has no equivalent, as it has no concept of concurrency; without concurrency, such a structure is both useless and dangerous, as operating on it could potentially block forever if there are no other threads.
It would be easy to implement, however, but the implementation details would depend on the threading library you're using.
As a side note, the upcoming C++1x standard will add some basic threading features to the standard library.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
Also, if not python or java, then would you more generally pick a statically-typed language or a dynamic-type language?
I would choose the JVM over python, primarily because multi-threading in Python is impeded by the Global Interpreter Lock. However, Java is unlikely to be your best when running on the JVM. Clojure or Scala (using actors) are both likely to be better suited to multi-threaded problems.
If you do choose Java you should consider making use of the java.util.concurrent libraries and avoid multi-threading primitives such as synchronized.
For concurrency, I would use Java. By use Java, I actually mean Scala, which borrows a lot from Erlang's concurrency constructs, but is (probably) more accessible to a Java developer who has never used either before.
Python threads suffer from having to wait for the Global Interpreter Lock, making true concurrency (within a single process) unachievable for CPU-bound programs. As I understand, Stackless Python solves some (though not all) of CPython's concurrency deficiencies, but as I have not used it, I can't really advise on it.
Definetely Stackless Python! That a Python variant especially made for concurrency.
But in the end it depends on your target platform and what you are trying to achieve.
If not Java/Python I would go for a functional language since taking side effects into account is one of the complexities of writing concurrent software. (As far as your question goes : this one happens to be statical typed, but compiler infered most of the time)
Personally I would pick F#, since I've seen lots of nice examples of writing concurrent software with ease using it.
As an introduction : this man is equally fun as inspiring, even a must have seen if you are not interested in F# what so ever.
I don't think the argument is about language choice or static or dynamic typing - it's between two models of concurrency - shared memory and message passing. Which model makes more sense in your situation & does your chosen language allow you to make a choice or are you forced to adopt one model over the other?
Why not have a look at Erlang (which has dynamic typing) and message passing, the Actor model, and read why Joe Armstrong doesn't like shared memory. There's also a interesting discussion about java concurrency using locks and threads here on SO.
I don't know about Python, but Java, along with the inbuilt locks and threads model, has a mesasge passing framework called Kilim.
I would use Java, via Jython. Java has strong thread capabilities, and it can be written using the Python syntax with Jython, so you got the best of the two worlds.
Python itself is not really good with concurrency, and is slower than Java anyway.
But if you have concurrency issues and free hands, I'd have a look at Erlang because it has been design for such problems. Of course, you must consider Erlang only if you have:
time to master a (very) new technology
control on a reasonable part of the production chain, since Erland need some adaptations in your toolbox to fit
Neither. Concurrent programming is notoriously hard to get correct. There is the option of using a process oriented programming language like occam-pi which is based of the idea of communicating sequential processes and the pi calculus. This allows compile time checking for deadlock and many other problems that arise during concurrent systems development. If you do not like occam-pi, which I cant blame you if you dont, you could try Go the new language from google which also implements a version of CSP.
For some tasks, Python is too slow. Your single thread Java program could be faster than the concurrent version of Python on a multi-core computer...
I'd like to use Java or Scala, F# or simply go to C++(MPI and OpenMPI).
The Java environment (JVM + libraries) is better for concurrency than (C)Python, but Java the language sucks. I would probably go with another language on the JVM - Jython has already been mentioned, and Clojure and Scala both have excellent support for concurrency.
Clojure is particularly good - it has support for high performance persistent data structures, agents and software transactional memory. It is a dynamic language but you can give it type hints to get performance as good as Java.
Watch this video on InfoQ by Richard Hickey (creator of Clojure) on the problems with traditional approaches to concurrency, and how Clojure handles it.
I'd look at Objective-C and the Foundation Framework. Asynchronous, concurrent programming is well provided for.
This of course depends on your access to Apple's Developer Tools or GnuStep, but if you have access to either one it's a good route to take with concurrent programming.
The answer is that it depends. For example are you trying to take advantage of multiple cores or cpus on a single machine or are you wanting to distribute your task across many machines? How important is speed vs. ease of implementation?
As mentioned before, Python has the Global Interpreter Lock but you could use the multiprocessing module. Note that while Stackless is very cool, it won't utilise multiple cores on its own. Python is usually considered easier to work with than Java. If speed is a priority Java is usually faster.
The java.util.concurrent library in Java makes writing concurrent applications on a single machine simpler but you'll still need to synchronise around any shared state. While Java isn't necessarily the best language for concurrency, there are a lot of tools, libraries, documentation and best practices out there to help.
Using message passing and immutability instead of threads and shared state is considered the better approach to programming concurrent applications. Functional languages that discourage mutability and side effects are often preferred as a result. If distributing your concurrent applications across multiple machines is a requirement, it is worth looking at runtimes designed for this e.g. Erlang or Scala Actors.
When Java is providing the capabilities for concurrent programming, what are the major advantages in using Clojure (instead of Java)?
Clojure is designed for concurrency.
Clojure provides concurrency primitives at a higher level of abstraction than Java. Some of these are:
A Software Transactional Memory system for dealing with synchronous and coordinated changes to shared references. You can change several references as an atomic operation and you don't have to worry about what the other threads in your program are doing. Within your transaction you will always have a consistent view of the world.
An agent system for asynchronous change. This resembles message passing in Erlang.
Thread local changes to variables. These variables have a root binding which are shared by every thread in your program. However, when you re-bind a variable it will only be visible in that thread.
All these concurrency primitives are built on top of Clojures immutable data structures (i.e., lists, maps, vectors etc.). When you enter the world of mutable Java objects all of the primitives break down and you are back to locks and condition variables (which also can be used in clojure, when necessary).
Without being an expert on Clojure I would say that the main advantage is that Clojure hides a lot of the details of concurrent programming and as we all know the devil is in the details, so I consider that a good thing.
You may want to check this excellent presentation from Rick Hickey (creator of Clojure) on concurrency in Clojure. EDIT: Apparently JAOO has removed the old presentations. I haven't been able to locate a new source for this yet.
Because Clojure is based on the functional-programming paradigm, which is to say that it achieves safety in concurrency by following a few simple rules:
immutable state
functions have no side effects
Programs written thus pretty much have horizontal scalability built-in, whereas a lock-based concurrency mechanism (as with Java) is prone to bugs involving race conditions, deadlocks etc.
Because the world has advanced in the past 10 years and the Java language (!= the JVM) is finding it hard to keep up. More modern languages for the JVM are based on new ideas and improved concepts which makes many tedious tasks much more simple and safe.
One of the cool things about having immutable types is that most of the built-in functions are already multi-threaded. A simple 'reduce' will span multiple cores/processors, without any extra work from you.
So, sure you can be multi-threaded with Java, but it involves locks and whatnot. Clojure is multi-threaded without any extra effort.
Yes, Java provides all necessary capabilities for concurrent programs.
An analogy: C provides all necessary capabilities for memory-safe programs, even with lots of string handling. But in C memory safety is the programmer's problem.
As it happens, analyzing concurrency is quite hard. It's better to use inherently safe mechanisms rather than trying to anticipate all possible concurrency hazards.
If you attempt to make a shared-memory mutable-data-structure concurrent program safe by adding interlocks you are walking on a tightrope. Plus, it's largely untestable.
One good compromise might be to write concurrent Java code using Clojure's functional style.
In addition to Clojure's approach to concurrency via immutable data, vars, refs (and software transactional memory), atoms and agents... it's a Lisp, which is worth learning. You get Lisp macros, destructuring, first class functions and closures, the REPL, and dynamic typing - plus literals for lists, vectors, maps, and sets - all on top of interoperability with Java libraries (and there's a CLR version being developed too.)
It's not exactly the same as Scheme or Common Lisp, but learning it will help you if you ever want to work through the Structure and Interpretation of Computer Programs or grok what Paul Graham's talking about in his essays, and you can relate to this comic from XKCD. ;-)
This video presentation makes a very strong case, centred around efficient persistent data structures implemented as tries.
Java programming language evolution is quite slow mainly because of Sun's concern about backward compatibility.
Why don't you want just directly use JVM languages like Clojure and Scala?