Suppose I have the following method:
public static Stream<CompletableFuture<String>> findPricesStream(String product) {}
This method looks for the cheapest price of a given product and returns a stream of CompletableFutures.
Now I want to react to each value in the stream as soon as it becomes available. To do that,
I use the thenAccept method, and the implementation could be as follows:
public static void reactToEarliestResultWithRaw() {
    long start = System.nanoTime();
    CompletableFuture[] priceFuture = findPricesStream(WHATEVER_PRODUCT_NAME)
            .map(completableFuture -> completableFuture.thenAccept(
                    s -> System.out.println(s + " (done in " + (System.nanoTime() - start) / 1_000_000 + " msecs)")))
            .toArray(CompletableFuture[]::new);
    CompletableFuture.allOf(priceFuture).join();
    System.out.println("All shops have now responded in " + (System.nanoTime() - start) / 1_000_000 + " msecs");
}
With this implementation, I got the desired output:
LetsSaveBig price is 151.227 (done in 5476 msecs)
BuyItAll price is 211.66 (done in 5747 msecs)
MyFavoriteShop price is 206.30200000000002 (done in 6968 msecs)
BestPrice price is 131.917 (done in 8110 msecs)
All shops have now responded in 8110 msecs
Now I would like to take a further step to make the code more readable,
so I chained another map that joins all of the CompletableFutures:
public static void reactToEarliestResultWithoutRaw() {
    long start = System.nanoTime();
    List<Void> completeFutures = findPricesStream(WHATEVER_PRODUCT_NAME)
            .map(completableFuture -> completableFuture.thenAccept(
                    s -> System.out.println(s + " (done in " + (System.nanoTime() - start) / 1_000_000 + " msecs)")))
            .map(CompletableFuture::join)
            .toList();
    int size = completeFutures.size();
    if (isComplete(size)) {
        System.out.println("All shops have now responded in " + (System.nanoTime() - start) / 1_000_000 + " msecs");
    }
}

private static boolean isComplete(int size) {
    return size == shops.size();
}
I got the output
BestPrice price is 123.17400000000002 (done in 2060 msecs)
LetsSaveBig price is 109.67200000000001 (done in 6025 msecs)
MyFavoriteShop price is 131.21099999999998 (done in 13860 msecs)
BuyItAll price is 164.392 (done in 18434 msecs)
All shops have now responded in 18434 msecs
The result surprised me!
I expected the elapsed times of both versions to be roughly the same, but there is a huge difference.
Am I misunderstanding how to use join here?
Reference
The implementation comes from the book Modern Java in Action: Lambdas, streams, functional and reactive programming (2nd Edition);
I have modified it a bit for the experiment.
The "surprising" results are due to how findPricesStream is implemented: it returns shops.stream().map(shop -> CompletableFuture.supplyAsync(...)). No CompletableFuture is constructed until a terminal operation is applied to the returned stream, which happens in your own method when you call .toList().
The terminal operation toList() does this:
1. For the first shop, it constructs a CompletableFuture, which starts running.
2. The CompletableFuture is joined, i.e. the main thread waits until it is finished.
3. Then the next CompletableFuture is constructed for the next shop, and so on.
So the prices are calculated sequentially. To make the calculations run in parallel, create the list first (so that all futures are started) and then join them:
List<CompletableFuture<Void>> futures = findPricesStream(WHATEVER_PRODUCT_NAME)
        .map(completableFuture -> completableFuture.thenAccept(
                s -> System.out.println(s + " (done in " + (System.nanoTime() - start) / 1_000_000 + " msecs)")))
        .toList();
List<Void> results = futures.stream()
        .map(CompletableFuture::join)
        .toList();
Related
I'm trying to locate the first (any) member of a list that matches a given predicate, like so:
Item item = items.parallelStream()
.map(i -> i.doSomethingExpensive())
.filter(predicate)
.findAny()
.orElse(null);
I would expect that once findAny() gets a match, it would return immediately, but that doesn't appear to be the case. Instead it seems to wait for the map method to finish on most of the elements before returning. How can I return the first result immediately and cancel the other parallel operations? Is there a better way to do this than streams, such as using CompletableFuture?
Here's a simple example to show the behavior:
private static void log(String msg) {
SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss.SSS");
System.out.println(sdf.format(new Date()) + " " + msg);
}
Random random = new Random();
List<Integer> nums = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14);
Optional<Integer> num = nums.parallelStream()
.map(n -> {
long delay = Math.abs(random.nextLong()) % 10000;
log("Waiting on " + n + " for " + delay + " ms");
try { Thread.sleep(delay); }
catch (InterruptedException e) { System.err.println("Interruption error"); }
return n * n;
})
.filter(n -> n < 30)
.peek(n -> log("Found match: " + n))
.findAny();
log("First match: " + num);
Log output:
14:52:27.061 Waiting on 9 for 2271 ms
14:52:27.061 Waiting on 2 for 1124 ms
14:52:27.061 Waiting on 13 for 547 ms
14:52:27.061 Waiting on 4 for 517 ms
14:52:27.061 Waiting on 1 for 1210 ms
14:52:27.061 Waiting on 6 for 2646 ms
14:52:27.061 Waiting on 0 for 4393 ms
14:52:27.061 Waiting on 12 for 5520 ms
14:52:27.581 Found match: 16
14:52:27.582 Waiting on 3 for 5365 ms
14:52:28.188 Found match: 4
14:52:28.275 Found match: 1
14:52:31.457 Found match: 0
14:52:32.950 Found match: 9
14:52:32.951 First match: Optional[0]
Once a match is found (in this case 16), findAny() does not return immediately, but instead blocks until the remaining threads finish. In this case, the caller is waiting an extra 5 seconds before returning after a match has already been found.
Instead it seems to wait for the map method to finish on most of the elements before returning.
This is not correct.
When speaking of the elements which are already being processed, it will wait for the completion of all of them, as the Stream API allows concurrent processing of data structures which are not intrinsically thread safe. It must ensure that all potential concurrent access has been finished before returning from the terminal operation.
When talking about the entire stream, it’s simply not fair to test a stream of only 14 elements on an 8 core machine. Of course, there will be at least 8 concurrent operations started, that’s what it is all about. You are adding fuel to the flames by using findFirst() instead of findAny(), as that doesn’t mean returning the first found element in processing order, but the first element in encounter order, i.e. exactly zero in your example, so threads processing other chunks than the first can’t assume that their result is the correct answer and are even more willing to help processing other candidates than with findAny().
When you use
List<Integer> nums = IntStream.range(0, 200).boxed().collect(Collectors.toList());
Optional<Integer> num = nums.parallelStream()
.map(n -> {
long delay = ThreadLocalRandom.current().nextInt(10_000);
log("Waiting on " + n + " for " + delay + " ms");
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(delay));
return n * n;
})
.filter(n -> n < 40_000)
.peek(n -> log("Found match: " + n))
.findAny();
log("First match: " + num);
You will get a similar number of tasks running to completion, despite the far bigger number of stream elements.
Note that CompletableFuture also doesn’t support interruption, so the only built-in feature that comes to my mind for returning any result and canceling the other jobs is the old ExecutorService.invokeAny.
To build the mapping and filtering feature on top of it, we can use the following helper function:
static <T,R> Callable<R> mapAndfilter(T t, Function<T,R> f, Predicate<? super R> p) {
return () -> {
R r = f.apply(t);
if(!p.test(r)) throw new NoSuchElementException();
return r;
};
}
Unfortunately, there’s only the option of completing with a value or exceptionally, therefore we have to use an exception for non-matching elements.
Then we can use it like this:
ExecutorService es = ForkJoinPool.commonPool();
Integer result = es.invokeAny(IntStream.range(0, 100)
.mapToObj(i -> mapAndfilter(i,
n -> {
long delay = ThreadLocalRandom.current().nextInt(10_000);
log("Waiting on " + n + " for " + delay + " ms");
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(delay));
return n * n;
},
n -> n < 10_000))
.collect(Collectors.toList()));
log("result: "+result);
and it will not only cancel the pending tasks, it will return without waiting for them to finish.
Of course, this implies that the source data, the jobs operating upon, must be either immutable or thread safe.
You can use this code to illustrate how parallelStream works:
final List<String> list = Arrays.asList("first", "second", "third", "4th", "5th", "7th", "8th", "9th", "10th", "11th", "12th", "13th");
String result = list.parallelStream()
.map(s -> {
System.out.println("map: " + s);
return s;
})
.filter(s -> {
System.out.println("filter: " + s);
return s.equals("8th");
})
.findFirst()
.orElse(null);
System.out.println("result=" + result);
There are two options to achieve what you're looking for and stop the expensive operation early:
Don't use streams at all; use a simple for or enhanced for loop.
Filter first, then map with the expensive operation (see the sketch below).
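For the second option, here is a minimal sketch; cheapPredicate is a hypothetical predicate that can be evaluated on the raw item before the expensive mapping, which only works when such an equivalent cheap test exists:

// cheapPredicate (hypothetical) tests the raw item, so the expensive call
// runs only once, for the element that already matched.
Item item = items.parallelStream()
        .filter(cheapPredicate)
        .findAny()
        .map(i -> i.doSomethingExpensive())
        .orElse(null);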
There are several things at play here. The first thing is that parallelStream() uses the common ForkJoinPool by default, which makes the calling thread participate as well. This means that if one of the slow tasks is currently running on the calling thread, it has to finish before the caller gets the control back.
You can see this by modifying the code a little to log the thread names and to log when the waiting has finished:
private static void log(String msg) {
SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss.SSS");
System.out.println(sdf.format(new Date()) + " [" + Thread.currentThread().getName() + "] " + " " + msg);
}
public static void main(String[] args) {
Random random = new Random();
List<Integer> nums = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14);
Optional<Integer> num = nums.parallelStream()
.map(n -> {
long delay = Math.abs(random.nextLong()) % 10000;
log("Waiting on " + n + " for " + delay + " ms");
try {
Thread.sleep(delay);
} catch (InterruptedException e) {
System.err.println("Interruption error");
}
log("finished waiting");
return n * n;
})
.filter(n -> n < 30)
.peek(n -> log("Found match: " + n))
.findAny();
log("First match: " + num);
}
Sample output:
13:56:52.954 [main] Waiting on 9 for 9936 ms
13:56:52.956 [ForkJoinPool.commonPool-worker-1] Waiting on 4 for 7436 ms
13:56:52.970 [ForkJoinPool.commonPool-worker-2] Waiting on 1 for 6523 ms
13:56:52.983 [ForkJoinPool.commonPool-worker-3] Waiting on 6 for 7488 ms
13:56:59.494 [ForkJoinPool.commonPool-worker-2] finished waiting
13:56:59.496 [ForkJoinPool.commonPool-worker-2] Found match: 1
13:57:00.392 [ForkJoinPool.commonPool-worker-1] finished waiting
13:57:00.392 [ForkJoinPool.commonPool-worker-1] Found match: 16
13:57:00.471 [ForkJoinPool.commonPool-worker-3] finished waiting
13:57:02.892 [main] finished waiting
13:57:02.894 [main] First match: Optional[1]
Here, as you can see, two matches are found, but the main thread is still busy, so it cannot return the match yet.
This does not always explain all cases though:
13:58:52.116 [main] Waiting on 9 for 5256 ms
13:58:52.143 [ForkJoinPool.commonPool-worker-1] Waiting on 4 for 4220 ms
13:58:52.148 [ForkJoinPool.commonPool-worker-2] Waiting on 1 for 2136 ms
13:58:52.158 [ForkJoinPool.commonPool-worker-3] Waiting on 6 for 7262 ms
13:58:54.294 [ForkJoinPool.commonPool-worker-2] finished waiting
13:58:54.295 [ForkJoinPool.commonPool-worker-2] Found match: 1
13:58:56.364 [ForkJoinPool.commonPool-worker-1] finished waiting
13:58:56.364 [ForkJoinPool.commonPool-worker-1] Found match: 16
13:58:57.399 [main] finished waiting
13:58:59.422 [ForkJoinPool.commonPool-worker-3] finished waiting
13:58:59.424 [main] First match: Optional[1]
This might be explained by the way the fork-join pool merges the results. It seems some improvements are possible.
As an alternative, you could indeed do this using CompletableFuture:
// you should probably also pass your own executor to supplyAsync()
List<CompletableFuture<Integer>> futures = nums.stream().map(n -> CompletableFuture.supplyAsync(() -> {
long delay = Math.abs(random.nextLong()) % 10000;
log("Waiting on " + n + " for " + delay + " ms");
try {
Thread.sleep(delay);
} catch (InterruptedException e) {
System.err.println("Interruption error");
}
log("finished waiting");
return n * n;
})).collect(Collectors.toList());
CompletableFuture<Integer> result = CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
.thenApply(unused -> futures.stream().map(CompletableFuture::join).filter(n -> n < 30).findAny().orElse(null));
// shortcircuiting
futures.forEach(f -> f.thenAccept(r -> {
if (r < 30) {
log("Found match: " + r);
result.complete(r);
}
}));
// cancelling remaining tasks
result.whenComplete((r, t) -> futures.forEach(f -> f.cancel(true)));
log("First match: " + result.join());
Output:
14:57:39.815 [ForkJoinPool.commonPool-worker-1] Waiting on 0 for 7964 ms
14:57:39.815 [ForkJoinPool.commonPool-worker-3] Waiting on 2 for 5743 ms
14:57:39.817 [ForkJoinPool.commonPool-worker-2] Waiting on 1 for 9179 ms
14:57:45.562 [ForkJoinPool.commonPool-worker-3] finished waiting
14:57:45.563 [ForkJoinPool.commonPool-worker-3] Found match: 4
14:57:45.564 [ForkJoinPool.commonPool-worker-3] Waiting on 3 for 7320 ms
14:57:45.566 [main] First match: 4
Note that the cancel(true) does not actually cancel the ongoing tasks (no interruption will occur for example), but it prevents further tasks from being run (you can even see that it might not be immediate since worker 3 still started to execute the next one).
You should also use your own executor, with the appropriate size based on whether it is more CPU or I/O intensive. As you can see, the default uses the common pool and thus it does not use all cores.
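For example, a sketch of the same pipeline with a dedicated pool instead of the common pool; the pool size of 20 is an arbitrary assumption, size it for your CPU- or I/O-bound workload:

ExecutorService executor = Executors.newFixedThreadPool(20); // arbitrary size, tune per workload
List<CompletableFuture<Integer>> futures = nums.stream()
        .map(n -> CompletableFuture.supplyAsync(() -> n * n, executor)) // same task as above, custom pool
        .collect(Collectors.toList());
// ... then the same allOf()/short-circuiting/cancellation logic as above ...
executor.shutdown(); // lets already-submitted tasks finish, then releases the threads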
The allOf() is mainly needed in case no match is found. If you can guarantee that there is at least one match, you could simply use a new CompletableFuture<>() instead.
Finally, as a simple approach I repeated the filter check, but it's easy to move that logic inside the main logic, return null or a marker, and then test for that in both places.
See also How to make a future that gets completed when any of the given CompletableFutures is completed with a result that matches a certain predicate?
This was asked during an online interview process a while back; the question goes like this:
A computer system uses a preemptive process-scheduling methodology called "less than average first", which works as follows:
A process whose remaining execution time is less than the average of the remaining execution times of all processes is executed.
If multiple processes satisfy the first condition, the one that arrived earliest is executed. If no process satisfies the first condition, the process with the smallest remaining execution time is chosen.
Given the arrival times and the total execution times of each process, find the total time for which each process remains idle before its execution is completed.
Example: Each line contains arrival time and the remaining execution time:
1 4
2 2
3 1
Output:
Total time for which each process remains idle before its execution is completed.
4
Explanation:
At time = 1, only one process exists so it will be executed.
At time = 2, the avg. remaining execution time is (3+2)/2 = 2.5, and the remaining execution time of the 2nd process is less than the average. Hence the 2nd process will be executed.
Hence the final processes execution sequence for each time unit is:
1 2 2 3 1 1 1
Method signature is:
int process(int input[][]) {
}
I tried to understand this question by reading it many times, but I am not able to. Can you please help me with how to solve it?
You can implement a method getNextProcess to determine the next process to run, and then build a scheduler that actually processes the input and adds up the waiting times. This way you can simulate the whole processing and have the result at the end.
Determining the next process to run (implementing the rules as stated in the question):
static Optional<Process> getNextProcess(List<Process> processes) {
if (processes.isEmpty()) {
return Optional.empty();
}
// calculate average
double avg = processes.stream()
.mapToInt(p -> p.getRemainingTime())
.average().getAsDouble();
// return the first process with 'remaining time < avg';
// at the same time, track the process with the smallest remaining time
Process nextProcess = null;
int minRemainingTime = Integer.MAX_VALUE;
for (Process p : processes) {
if (p.getRemainingTime() < avg) {
return Optional.of(p);
}
if (p.getRemainingTime() < minRemainingTime) {
nextProcess = p;
minRemainingTime = p.getRemainingTime();
}
}
return Optional.ofNullable(nextProcess);
}
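Both methods assume a small mutable Process holder class, which the question does not define; a minimal sketch could look like this:

class Process {
    private final int id;           // 1-based index from the input order
    private final int arrivalTime;
    private int remainingTime;      // mutable: decremented while the process runs

    Process(int id, int arrivalTime, int remainingTime) {
        this.id = id;
        this.arrivalTime = arrivalTime;
        this.remainingTime = remainingTime;
    }

    int getId() { return id; }
    int getArrivalTime() { return arrivalTime; }
    int getRemainingTime() { return remainingTime; }
    void setRemainingTime(int remainingTime) { this.remainingTime = remainingTime; }
}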
This is what the simulator/scheduler can look like:
static int process(int[][] input) {
// transform the 2d-array into a custom object for convenience
List<Process> incomingProcesses = new ArrayList<>();
for (int i = 0; i < input.length; i++) {
incomingProcesses.add(new Process(i + 1, input[i][0], input[i][1]));
}
// simulate scheduling
int time = 1;
int totalWaitingTime = 0;
List<Process> currentProcesses = new ArrayList<>();
while (!incomingProcesses.isEmpty() || !currentProcesses.isEmpty()) {
// handle new processes that arrive at this time step
final int finalTime = time;
List<Process> newProcesses = incomingProcesses.stream()
.filter(p -> p.getArrivalTime() == finalTime)
.collect(Collectors.toList());
currentProcesses.addAll(newProcesses);
incomingProcesses.removeAll(newProcesses);
// remove processes with no remaining time
currentProcesses.removeIf(p -> p.getRemainingTime() <= 0);
// increase total waiting time (reduction follows later)
totalWaitingTime += currentProcesses.size();
// get next process and if found,
// reduce its remaining time and decrease total waiting time
Optional<Process> nextProcess = getNextProcess(currentProcesses);
if (nextProcess.isPresent()) {
Process p = nextProcess.get();
System.out.println("Process " + p.getId());
p.setRemainingTime(p.getRemainingTime() - 1);
totalWaitingTime -= 1; // reduction since this process actually run
}
time++; // move to next time step
}
return totalWaitingTime;
}
Let's test that:
public static void main(String[] args) {
int result = process(new int[][]{
{1, 4},
{2, 2},
{3, 1}
});
System.out.println("The result is: " + result);
}
Output:
Process 1
Process 2
Process 2
Process 3
Process 1
Process 1
Process 1
The result is: 4
I have a function that iterates over a list using parallelStream, calls an API with each item as a parameter, and then stores the results in a HashMap.
try {
return answerList.parallelStream()
.map(answer -> getReplyForAnswerCombination(answer))
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
} catch (final NullPointerException e) {
log.error("Error in generating final results.", e);
return null;
}
When I run it on laptop 1, it takes 1 hour.
But on laptop 2, it takes 5 hours.
Doing some basic research, I found that parallel streams use the default ForkJoinPool.commonPool, which by default has one fewer thread than you have processors.
Laptop1 and laptop2 have different processors.
Is there a way to find out how many tasks can run in parallel on laptop 1 and laptop 2?
Can I use the suggestion given here to safely increase the parallelism on laptop 2?
long start = System.currentTimeMillis();
IntStream s = IntStream.range(0, 20);
System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20");
s.parallel().forEach(i -> {
try { Thread.sleep(100); } catch (Exception ignore) {}
System.out.print((System.currentTimeMillis() - start) + " ");
});
Project Loom
If you want maximum performance on threaded code that blocks (as opposed to CPU-bound code), then use virtual threads (fibers) provided in Project Loom. Preliminary builds are available now, based on early-access Java 16.
Virtual threads
Virtual threads can be dramatically faster because a virtual thread is “parked” while blocked, set aside, so another virtual thread can make progress. This is so efficient for blocking tasks that threads can number in the millions.
Drop the streams approach. Merely send off each input to a virtual thread.
Full example code
Let's define classes for Answer and Reply, our inputs & outputs. We will use record, a new feature coming to Java 16, as an abbreviated way to define an immutable data-driven class. The compiler implicitly creates default implementations of constructor, getters, equals & hashCode, and toString.
public record Answer (String text)
{
}
…and:
public record Reply (String text)
{
}
Define our task to be submitted to an executor service. We write a class named ReplierTask that implements Runnable (has a run method).
Within the run method, we sleep the current thread to simulate waiting for a call to a database, file system, and/or remote service.
package work.basil.example;
import java.time.Duration;
import java.time.Instant;
import java.util.UUID;
import java.util.concurrent.ConcurrentMap;
public class ReplierTask implements Runnable
{
private Answer answer;
ConcurrentMap < Answer, Reply > map;
public ReplierTask ( Answer answer , ConcurrentMap < Answer, Reply > map )
{
this.answer = answer;
this.map = map;
}
private Reply getReplyForAnswerCombination ( Answer answer )
{
// Simulating a call to some service to produce a `Reply` object.
try { Thread.sleep( Duration.ofSeconds( 1 ) ); } catch ( InterruptedException e ) { e.printStackTrace(); } // Simulate blocking to wait for call to service or db or such.
return new Reply( UUID.randomUUID().toString() );
}
// `Runnable` interface
@Override
public void run ( )
{
System.out.println( "`run` method at " + Instant.now() + " for answer: " + this.answer );
Reply reply = this.getReplyForAnswerCombination( this.answer );
this.map.put( this.answer , reply );
}
}
Lastly, some code to do the work. We make a class named Mapper that contains a main method.
We simulate some input by populating an array of Answer objects. We create an empty ConcurrentMap in which to collect the results. And we assign each Answer object to a new thread where we call for a new Reply object and store the Answer/Reply pair as an entry in the map.
package work.basil.example;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
public class Mapper
{
public static void main ( String[] args )
{
System.out.println("Runtime.version(): " + Runtime.version() );
System.out.println("availableProcessors: " + Runtime.getRuntime().availableProcessors());
System.out.println("maxMemory: " + Runtime.getRuntime().maxMemory() + " | maxMemory/(1024*1024) -> megs: " +Runtime.getRuntime().maxMemory()/(1024*1024) );
Mapper app = new Mapper();
app.demo();
}
private void demo ( )
{
// Simulate our inputs, a list of `Answer` objects.
int limit = 10_000;
List < Answer > answers = new ArrayList <>( limit );
for ( int i = 0 ; i < limit ; i++ )
{
answers.add( new Answer( String.valueOf( i ) ) );
}
// Do the work.
Instant start = Instant.now();
System.out.println( "Starting work at: " + start + " on count of tasks: " + limit );
ConcurrentMap < Answer, Reply > results = new ConcurrentHashMap <>();
try
(
ExecutorService executorService = Executors.newVirtualThreadExecutor() ;
// Executors.newFixedThreadPool( 5 )
// Executors.newFixedThreadPool( 10 )
// Executors.newFixedThreadPool( 1_000 )
// Executors.newVirtualThreadExecutor()
)
{
for ( Answer answer : answers )
{
ReplierTask task = new ReplierTask( answer , results );
executorService.submit( task );
}
}
// At this point the flow-of-control blocks until all submitted tasks are done.
// The executor service is automatically closed by this point as well.
Duration elapsed = Duration.between( start , Instant.now() );
System.out.println( "results.size() = " + results.size() + ". Elapsed: " + elapsed );
}
}
We can swap out the Executors.newVirtualThreadExecutor() for a pool of platform threads, to compare against virtual threads. Let's try a pool of 5, 10, and 1,000 platform threads on an Intel Mac mini with macOS Mojave sporting 6 real cores, no hyper-threading, 32 gigs of memory, and OpenJDK special build version 16-loom+9-316 assigned a maxMemory of 8 gigs.
10,000 tasks at 1 second each    Total elapsed time
5 platform threads               half-hour — PT33M29.755792S
10 platform threads              quarter-hour — PT16M43.318973S
1,000 platform threads           10 seconds — PT10.487689S
10,000 platform threads          Error…unable to create native thread: possibly out of memory or process/resource limits reached
virtual threads                  under 3 seconds — PT2.645964S
Caveats
Caveat: Project Loom is experimental and subject to change, not intended for production use yet. The team is asking for folks to give feedback now.
Caveat: CPU-bound tasks such as encoding video should stick with platform/kernel threads rather than virtual threads. Most common code doing blocking operations such as I/O, like accessing files, logging, hitting a database, or making network calls, will likely see massive performance boosts with virtual threads.
Caveat: You must have enough memory available for many or even all of your tasks to be running simultaneously. If not enough memory will be available, you must take additional steps to throttle the virtual threads.
The setting java.util.concurrent.ForkJoinPool.common.parallelism will have an effect on the threads available to use for operations which make use of the ForkJoinPool, such as Stream.parallel(). However: whether your task uses more threads depends on the number of items in the stream, and whether it takes less time to run depends on the nature of each task and your available processors.
This test program shows the effect of changing this system property with a trivial task:
public static void main(String[] args) {
ConcurrentHashMap<String,String> threads = new ConcurrentHashMap<>();
int max = Integer.parseInt(args[0]);
boolean parallel = args.length < 2 || !"single".equals(args[1]);
int [] arr = IntStream.range(0, max).toArray();
long start = System.nanoTime();
IntStream stream = Arrays.stream(arr);
if (parallel)
stream = stream.parallel();
stream.forEach(i -> {
threads.put("hc="+Thread.currentThread().hashCode()+" tn="+Thread.currentThread().getName(), "value");
});
long end = System.nanoTime();
System.out.println("parallelism: "+System.getProperty("java.util.concurrent.ForkJoinPool.common.parallelism"));
System.out.println("Threads: "+threads.keySet());
System.out.println("Array size: "+arr.length+" threads used: "+threads.size()+" ms="+TimeUnit.NANOSECONDS.toMillis(end-start));
}
Adding more threads won't necessarily speed things up. Here are some examples from test runs counting the threads used. They may help you decide on the best approach for your own task contained in getReplyForAnswerCombination().
java -cp example.jar -Djava.util.concurrent.ForkJoinPool.common.parallelism=1000 App 100000
Array size: 100000 threads used: 37
java -cp example.jar -Djava.util.concurrent.ForkJoinPool.common.parallelism=50 App 100000
Array size: 100000 threads used: 20
java -cp example.jar App 100000 single
Array size: 100000 threads used: 1
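To compare the two laptops directly, you can also just print the number of processors and the common pool's parallelism; both are standard JDK APIs:

// Reported by the JDK itself, no system property required:
System.out.println("processors:  " + Runtime.getRuntime().availableProcessors());
System.out.println("parallelism: " + ForkJoinPool.commonPool().getParallelism());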
I suggest you look at the thread pooling (with or without Loom) in Basil Bourque's answer; the JDK source code of the ForkJoinPool constructor also has some details on this system property.
private ForkJoinPool(byte forCommonPoolOnly)
I'm messing around with RxJava and I want to stream a thousand consecutive integers. Then I want to asynchronously split them into odd and even streams, and then print them asynchronously.
However, I'm getting nothing printed out, or at best very partial output. What am I missing? Did I schedule incorrectly? Or is the console having multithreading issues in Eclipse?
public static void main(String[] args) {
List<Integer> values = IntStream.range(0,1000).mapToObj(i -> Integer.valueOf(i)).collect(Collectors.toList());
Observable<Integer> ints = Observable.from(values).subscribeOn(Schedulers.computation());
Observable<Integer> evens = ints.filter(i -> Math.abs(i) % 2 == 0);
Observable<Integer> odds = ints.filter(i -> Math.abs(i) % 2 != 0);
evens.subscribe(i -> System.out.println(i + " IS EVEN " + Thread.currentThread().getName()));
odds.subscribe(i -> System.out.println(i + " IS ODD " + Thread.currentThread().getName()));
}
You are starting the processing pipeline using Schedulers.computation, which runs daemon threads. Thus, when your main thread finishes, those threads are terminated before your observable has been processed.
So if you would like to see your results printed, you can have your main thread wait for the results (e.g. via Thread.sleep or a latch) or subscribe on the calling thread by removing subscribeOn. There is also the option of creating your own scheduler that runs non-daemon threads.
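For example, here is a minimal sketch that waits with a CountDownLatch (assuming RxJava 1.x, as in the question; the 10-second timeout is an arbitrary choice):

CountDownLatch done = new CountDownLatch(2); // one count per subscription
evens.subscribe(
        i -> System.out.println(i + " IS EVEN " + Thread.currentThread().getName()),
        Throwable::printStackTrace,
        done::countDown); // onCompleted
odds.subscribe(
        i -> System.out.println(i + " IS ODD " + Thread.currentThread().getName()),
        Throwable::printStackTrace,
        done::countDown); // onCompleted
done.await(10, TimeUnit.SECONDS); // main must declare throws InterruptedException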
class Foo{
int len;
}
public class Main {
public static void main(String[] args) throws Exception{
System.out.println(Stream.of("alpha", "beta", "gamma", "delta").parallel().reduce(
new Foo(),
(f, s) -> { f.len += s.length(); return f; },
(f1, f2) -> {
Foo f = new Foo();
/* check self-reduction
if (f1 == f2) {
System.out.println("equal");
f.len = f1.len;
return f;
}
*/
f.len = f1.len + f2.len;
return f;
}
).len);
}
}
The code tries to count the total length of several strings.
This piece of code prints 19 only if:
1. I use a sequential stream (by removing the parallel() call), or
2. I use Integer instead of Foo, which is simply a wrapper around an int.
Otherwise the console will print 20 or 36 instead. To debug this issue, I added the "check self-reduction" code, which does change the output: "equal" always gets printed twice, and the console sometimes prints 8 and sometimes 10.
My understanding is that reduce() is a Java implementation of parallel foldr/foldl. The third argument of reduce(), the combiner, is used to merge the results of parallel executions of the reduction. Is that right? If so, why would the result of a reduction ever need to be combined with itself? Further, how do I fix this code so that it gives the correct output and still runs in parallel?
EDIT:
Please ignore the fact that I did not use a method reference to simplify the code; my ultimate goal is to zip by adding more fields to Foo.
Your code is horribly broken. You are using a reducer function which fails the requirement that the accumulator/combiner functions be associative, stateless, and non-interfering. And a mutable Foo is not an identity for the reduction. All of these can lead to incorrect results when executed in parallel.
You're also making it far harder than you need to! Try this:
int totalLen =
Stream.of(... stuff ...)
.parallel()
.mapToInt(String::length)
.sum();
or
int totalLen =
Stream.of(... stuff ...)
.parallel()
.mapToInt(String::length)
.reduce(0, Integer::sum);
Further, you're trying to use reduce which reduces over values (which is why it works with Integer), but you're trying to use mutable state containers for your reduction result. If you want to reduce into a mutable state container (like a List or StringBuilder), use collect() instead, which is designed for mutation.
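For illustration, here is a sketch of the same sum using collect() with a mutable container, a one-element int array standing in for Foo; each thread mutates only its own container, so parallel execution is safe:

int[] total = Stream.of("alpha", "beta", "gamma", "delta")
        .parallel()
        .collect(
                () -> new int[1],                 // supplier: a fresh container per thread
                (acc, s) -> acc[0] += s.length(), // accumulator: adds into this thread's container
                (a, b) -> a[0] += b[0]);          // combiner: merges two partial containers
System.out.println(total[0]); // prints 19 for these four strings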
I think the problem is that the "identity" Foo is being reused too much.
Here's a modification where each Foo is given its own ID number so that we can track it:
class Foo {
private static int currId = 0;
private static Object lock = new Object();
int id;
int len;
public Foo() {
synchronized(lock) {
id = currId++;
}
}
}
public class Main {
public static void main(String[] args) throws Exception{
System.out.println(Stream.of("alpha", "beta", "gamma", "delta").parallel().reduce(
new Foo(),
(f, s) -> {
System.out.println("Adding to #" + f.id + ": " +
f.len + " + " + s.length() + " => " + (f.len+s.length()));
f.len += s.length(); return f; },
(f1, f2) -> {
Foo f = new Foo();
f.len = f1.len + f2.len;
System.out.println("Creating new #" + f.id + " from #" + f1.id + " and #" + f2.id + ": " +
f1.len + " + " + f2.len + " => " + (f1.len+f2.len));
return f;
}
).len);
}
}
The output I get is:
Adding to #0: 0 + 5 => 5
Adding to #0: 0 + 4 => 4
Adding to #0: 5 + 5 => 10
Adding to #0: 9 + 5 => 14
Creating new #2 from #0 and #0: 19 + 19 => 38
Creating new #1 from #0 and #0: 14 + 14 => 28
Creating new #3 from #2 and #1: 38 + 28 => 66
66
It's not consistent every time. The thing I notice is that each time you say f.len += s.length(), it adds to the same Foo, which means that the first new Foo() is being executed only once, and lengths keep getting added into it, so that the same input strings' lengths are counted multiple times. Since there are apparently multiple parallel threads accessing it at the same time, the results above are a little strange and change from run to run.