Java 8 stream().map().reduce() is really map reduce - java

I saw this code somewhere using stream().map().reduce().
Does this map() function really work in parallel? If yes, what is the maximum number of threads it can start for the map() operation?
What if I use parallelStream() instead of just stream() for the particular use case below?
Can anyone give me a good example of where NOT to use parallelStream()?
The code below just extracts tName from each tCode and returns a comma-separated String.
String ts = atList.stream().map(tCode -> {
    return CacheUtil.getTCache().getTInfo(tCode).getTName();
}).reduce((tName1, tName2) -> {
    return tName1 + ", " + tName2;
}).get();

This stream().map().reduce() pipeline is not parallel; a single thread acts on the stream.
You have to call parallel(), or in other cases parallelStream() (which one depends on the API, but it's the same thing). With parallel(), by default you get one thread fewer than the number of available processors, because the calling thread is used too alongside the ForkJoinPool#commonPool workers; thus there will usually be 2, 4, 8 threads, etc. in total. To check how many processors you have, use:
Runtime.getRuntime().availableProcessors()
You can use a custom pool and get as many threads as you want, as shown here.
Also notice that the entire pipeline is run in parallel, not just the map operation.
There isn't a golden rule about when to use parallel streams and when not to; the best way is to measure. But there are obvious cases, like a stream of 10 elements - that is far too little to gain any real benefit from parallelization.
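As a side note, the reduce-based concatenation from the question can also be written with Collectors.joining, which is the more idiomatic way to build a comma-separated string and stays correct when the stream is parallel. A minimal sketch, with the cache lookup replaced by a hypothetical stand-in:

```java
import java.util.List;
import java.util.stream.Collectors;

public class JoinNames {
    public static void main(String[] args) {
        List<String> atList = List.of("T1", "T2", "T3"); // hypothetical codes
        String ts = atList.parallelStream()
                // stand-in for CacheUtil.getTCache().getTInfo(tCode).getTName()
                .map(tCode -> "name-" + tCode)
                // joining preserves encounter order even for a parallel stream
                .collect(Collectors.joining(", "));
        System.out.println(ts); // name-T1, name-T2, name-T3
    }
}
```

Unlike reduce with string concatenation, joining accumulates into a StringBuilder, so it avoids building a new String per element.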

All parallel streams use the common fork-join thread pool, and if you submit a long-running task, you effectively block all threads in the pool. Consequently, you block all other tasks that are using parallel streams.
There are only two options to make sure that this never happens. The first is to ensure that all tasks submitted to the common fork-join pool will not get stuck and will finish in a reasonable time. But that's easier said than done, especially in complex applications. The other option is to not use parallel streams and wait until Oracle allows us to specify the thread pool to be used for parallel streams.
Use case
Let's say you have a collection (a List) that is loaded with values at application startup, and no new values are added to it at any later point. In that scenario you can use a parallel stream without any concerns.
Don't worry, the stream is efficient and safe.


ParallelStream pool size

I have a parallel stream with a few database queries inside like this:
private void processParallel() {
    List<Result> results = objects.parallelStream().peek(object -> {
        doSomething(object);
    }).collect(Collectors.toList());
}

private void doSomething(Object object) {
    CompletableFuture<String> param =
        CompletableFuture.supplyAsync(() -> objectService.getParam(object.getField()),
                executor)
            .thenApply(value -> Optional.ofNullable(value)
                .map(Object::getParam)
                .orElse(null));
}
I need to specify the pool size, but setting the property "java.util.concurrent.ForkJoinPool.common.parallelism" to "20" is not working, probably because the stream is blocking. Is there any way to limit the maximum number of threads?
Since parallel streams use the Fork/Join framework under the hood, to limit the number of threads employed by the stream you can wrap the stream with a Callable and submit it to a ForkJoinPool having the required level of parallelism.
The threads occupied by the parallel stream would then be taken from the newly created ForkJoinPool to which the callable task was submitted (not from the common pool), as described here.
The downside of this approach is that you're relying on an implementation detail of the Stream API.
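A minimal sketch of that wrapping technique; the pool size and the squaring work here are placeholders, not the asker's actual database queries:

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;

public class BoundedStream {
    public static void main(String[] args) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(4); // at most 4 worker threads
        List<Integer> data = List.of(1, 2, 3, 4, 5);
        // A parallel pipeline submitted from inside the pool runs its
        // parallel parts on that pool's workers, not on the common pool.
        int sum = pool.submit(() ->
                data.parallelStream().mapToInt(i -> i * i).sum()
        ).get();
        pool.shutdown();
        System.out.println(sum); // 55
    }
}
```

Remember to shut the pool down (or reuse it); each new ForkJoinPool carries its own threads.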
And also, as @Louis Wasserman has pointed out in the comments,
you probably need another way of limiting the number of threads used in the stream. Because you're performing database queries, each thread requires a database connection to do its job, hence the number of threads should not be greater than the number of connections the data source can provide at that moment. And if you have multiple processes that can fire these asynchronous tasks simultaneously (for instance in a web application), it doesn't seem like a good idea to develop your own solution. If that's the case, you might consider using a framework like Spring WebFlux.

Thread usage of Java Parallel Stream Reduce [duplicate]

In JDK 8, how many threads are spawned when I'm using parallelStream? For instance, in this code:
list.parallelStream().forEach(/** Do Something */);
If this list has 100000 items, how many threads will be spawned?
Also, does each thread get the same number of items to work on, or is it randomly allotted?
Oracle's implementation[1] of parallel streams uses the current thread and, in addition to that, if needed, the threads that compose the default fork-join pool ForkJoinPool.commonPool(), which has a default size equal to one less than the number of cores of your CPU.
That default size of the common pool can be changed with this property:
-Djava.util.concurrent.ForkJoinPool.common.parallelism=8
Alternatively, you can use your own pool:
ForkJoinPool myPool = new ForkJoinPool(8);
myPool.submit(() ->
list.parallelStream().forEach(/* Do Something */);
).get();
Regarding the order, jobs will be executed as soon as a thread is available, in no specific order.
As correctly pointed out by @Holger, this is an implementation-specific detail (with just one vague reference at the bottom of a document). Both approaches will work on Oracle's JVM but are definitely not guaranteed to work on JVMs from other vendors: the property might not exist in a non-Oracle implementation, and streams might not even use a ForkJoinPool internally, which would render the alternative (based on the behavior of ForkJoinTask.fork) completely useless (see here for details on this).
While @uraimo is correct, the answer depends on exactly what "Do Something" does. The parallel streams API uses the CountedCompleter class, which has some interesting problems. Since the F/J framework does not use a separate object to hold results, long chains may result in an OutOfMemoryError. Those long chains can also sometimes cause a StackOverflowError. The answer to those problems is the Paraquential technique, as I pointed out in this article.
Another problem is excessive thread creation when using nested parallel forEach.

ParallelStream for Files

The new Stream API in Java 8 is really nice, especially its parallel processing capabilities. However, I don't see how to apply parallel processing outside of Collection's parallelStream() method.
For example, if I am creating a Stream from a File, I use the following:
Stream<String> lines = Files.lines(Paths.get("test.csv"));
However, there is no counterpart parallelStream() method like there is on Collection. It seems like one thread could be grabbing the next line while several threads parse and process the lines.
Could this be done with StreamSupport.stream()?
There's a much simpler answer: Any stream can be turned parallel by calling .parallel():
Stream<String> lines = Files.lines(Paths.get("test.csv"))
.parallel();
The .parallelStream() method on Collection is just a convenience.
Note that, unless you're doing a lot of processing per line, the sequential nature of IO from the file will probably dominate and you may not get as much parallelism as you hope.
Yes - it turns out you can create a parallel stream from the sequential stream with StreamSupport.stream(). Following the pattern in my question, it would look like this:
StreamSupport.stream(Files.lines(Paths.get("test.csv")).spliterator(), true);
The 'true' makes it parallel. In testing, it expanded the work from a single core to all cores on my machine. It read the lines in order; however, the processing of the lines did not complete in order, which is fine for my purposes.

How to keep a fixed size pool of ListenableFutures?

I am reading a large file of URLs and making requests to a service. The requests are executed by a client that returns a ListenableFuture. Now I want to keep a pool of ListenableFutures, e.g. have at most N futures being executed concurrently.
The problem I see is that I have no control over the ExecutorService the ListenableFutures are executed in because of the third-party library. Otherwise I would just create a FixedSizePool and create my own Callables.
1) A naïve implementation would be to spawn N futures and then use allAsList, which would satisfy the fixed-size criterion but would make everything wait for the slowest request.
Out-of-order processing is OK.
2) A slightly better naïve option would be to combine the first idea with a rate limiter, setting N and the rate so that the number of concurrent requests approximates the desired pool size. But I am not actually looking for a way to throttle the calls, e.g. using RateLimiter.
3) A last option would be to spawn N futures and have a callback that spawns a new one. This satisfies the fixed-size criterion and minimizes idle time, but I don't know how to detect the end of my program, i.e. when to close the file.
4) A non-ListenableFuture approach would be to just .get() each result directly and deal with the embarrassingly parallel tasks by creating a simple thread pool.
To know when the job queue is empty, i.e. when to close the file, I am thinking of using a CountDownLatch, which should work with many of these options.
Hmm. How do you feel about just using a java.util.concurrent.Semaphore?
Semaphore gate = new Semaphore(10);
Runnable release = gate::release; // java 8 syntax.
Iterator<URL> work = ...;
while (work.hasNext()) {
    gate.acquire(); // blocks until a permit is available
    ListenableFuture f = ThirdPartyLibrary.doWork(work.next());
    f.addListener( release, MoreExecutors.sameThreadExecutor() );
}
You could add other listeners, perhaps via Futures.addCallback(ListenableFuture, FutureCallback), to do something with the results, as long as you're careful to release() on both success and error.
It might work.
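The semaphore-gating idea can be tested without the third-party client. This sketch bounds plain ExecutorService tasks the same way and records the highest number of tasks in flight; all names here are made up for the demo:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSubmit {
    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newCachedThreadPool();
        Semaphore gate = new Semaphore(3); // at most 3 tasks in flight
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(20);
        for (int i = 0; i < 20; i++) {
            gate.acquire(); // blocks when 3 tasks are already running
            exec.submit(() -> {
                int now = inFlight.incrementAndGet();
                maxSeen.accumulateAndGet(now, Math::max);
                try { Thread.sleep(20); } catch (InterruptedException e) { }
                inFlight.decrementAndGet();
                gate.release(); // free a slot for the next submission
                done.countDown();
            });
        }
        done.await(); // the latch answers the "how do I detect the end" question
        exec.shutdown();
        System.out.println(maxSeen.get() <= 3); // true: never more than 3 concurrent
    }
}
```

The CountDownLatch here plays the role the asker suggested for detecting completion.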
Your option 3 sounds reasonable. If you want to cleanly detect when all requests have completed, one simple approach is to create a new SettableFuture to represent completion.
When your callback tries to take the next request from the queue, and finds it to be empty, you can call set() on the future to notify anything that's waiting for all requests to complete. Propagating exceptions from the individual request futures is left as an exercise for the reader.
Use a fixed-size pool for the embarrassingly parallel tasks and .get() each future's result immediately.
This simplifies the code and allows each worker to have modifiable context.
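That option from the question, sketched with a fixed-size pool and an immediate get() per future; the fetch function is a placeholder for the real request:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FixedPoolFetch {
    // placeholder for the real HTTP call
    static String fetch(String url) {
        return "body-of-" + url;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // at most 4 concurrent requests
        List<String> urls = List.of("a", "b", "c");
        List<Future<String>> futures = new ArrayList<>();
        for (String u : urls) futures.add(pool.submit(() -> fetch(u)));
        List<String> results = new ArrayList<>();
        for (Future<String> f : futures) results.add(f.get()); // blocks per task
        pool.shutdown();
        System.out.println(results); // [body-of-a, body-of-b, body-of-c]
    }
}
```

The pool size, not the number of submitted tasks, bounds concurrency, which is the fixed-size criterion from the question.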

Java ExecutorService - sometimes slower than sequential processing?

I'm writing a simple utility which accepts a collection of Callable tasks, and runs them in parallel. The hope is that the total time taken is little over the time taken by the longest task. The utility also adds some error handling logic - if any task fails, and the failure is something that can be treated as "retry-able" (e.g. a timeout, or a user-specified exception), then we run the task directly.
I've implemented this utility around an ExecutorService. There are two parts:
submit() all the Callable tasks to the ExecutorService, storing the Future objects.
in a for-loop, get() the result of each Future. In case of exceptions, do the "retry-able" logic.
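The two steps above can be sketched like this; the retry condition is simplified to any ExecutionException, and the flaky task is contrived so the retry succeeds:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryingRunner {
    // Step 1: submit everything; Step 2: get() each result, retrying failures inline.
    static <T> List<T> runAll(List<Callable<T>> tasks, ExecutorService pool) throws Exception {
        List<Future<T>> futures = new ArrayList<>();
        for (Callable<T> t : tasks) futures.add(pool.submit(t));
        List<T> results = new ArrayList<>();
        for (int i = 0; i < futures.size(); i++) {
            try {
                results.add(futures.get(i).get());
            } catch (ExecutionException e) {
                results.add(tasks.get(i).call()); // "retry-able": run directly on this thread
            }
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger attempts = new AtomicInteger();
        List<Callable<Integer>> tasks = List.of(
            () -> 1,
            () -> { // fails on the first attempt, succeeds on the retry
                if (attempts.getAndIncrement() == 0) throw new TimeoutException("flaky");
                return 2;
            },
            () -> 3);
        System.out.println(runAll(tasks, pool)); // [1, 2, 3]
        pool.shutdown();
    }
}
```

Note that the inline retry runs on the calling thread, so one retried slow task delays the get() of every later future.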
I wrote some unit tests to ensure that using this utility is faster than running the tasks in sequence. For each test, I'd generate a certain number of Callables, each essentially performing a Thread.sleep() for a random amount of time within a bound. I experimented with different timeouts, different numbers of tasks, etc., and the utility seemed to outperform sequential execution.
But when I added it to the actual system that needs this kind of utility, the results were very variable - sometimes the parallel execution was faster, sometimes slower, and sometimes faster but still taking much more time than the longest individual task.
Am I just doing it all wrong? I know ExecutorService has invokeAll(), but that swallows the underlying exceptions. I also tried using a CompletionService to fetch task results in the order in which they completed, but it exhibited more or less the same behavior. I'm now reading up on latches and barriers - is this the right direction for solving this problem?
I wrote some unit tests to ensure that using this utility is faster than running the tasks in sequence. For each test, I'd generate a certain number of Callable's, each essentially performing a Thread.sleep() for a random amount of time within a bound
Yeah, this is certainly not a fair test, since it uses neither CPU nor IO. I would certainly hope that parallel sleeps run faster than serial ones. :-)
But when I added it to the actual system which needs this kind of utility, I saw results that were very variable
Right. Whether a threaded application runs faster than a serial one depends on a number of factors. In particular, IO-bound applications will not improve in performance, since they are bound by the IO channel and really cannot do concurrent operations because of it. The more processing the application needs, the larger the win from making it multi-threaded.
Am I just doing it all wrong?
Hard to know without more details. You might consider playing around with the number of threads running concurrently. If you have a ton of jobs to process, you should not be using Executors.newCachedThreadPool() and should instead tune Executors.newFixedThreadPool(...) depending on the number of CPUs your architecture has.
You may also want to see if you can isolate the IO operations to a few threads and the processing to others - for example, one input thread reading from a file and one output thread (or a couple) writing to the database. So multiple, differently sized pools may work better for different types of tasks than a single thread pool.
tried using a CompletionService to fetch task results in the order in which they completed
If you are retrying operations, using a CompletionService is exactly the way to go. As jobs finish and throw exceptions (or return failure), they can be restarted and put back into the thread pool immediately. I don't see any reason why your performance problems would be caused by this.
Multi-threaded programming doesn't come for free. It has an overhead, and that overhead can easily exceed the performance gain; it usually makes your code more complex too.
Additional threads give access to more CPU power (assuming you have spare CPUs), but in general they won't make your HDD spin faster, give you more network bandwidth, or speed up anything that is not CPU-bound.
Multiple threads can help give you a greater share of an external resource.
