Java, returning value from thread using Future

Consider the following illustrative example. Class B performs a numerical computation, here the factorial, and is meant to run in a separate thread:
public class B implements Callable <Integer> {
private int n;
public B(int n_) {n = n_;}
public Integer call() {return f();}
public Integer f() {
if (n == 1) return 1;
else {
int fn = 1;
for (int i = n; i > 1; i--) fn *= i;
return fn;
}
}
}
The next class, A, uses the factorial to evaluate the remainder r = x^n / n!:
public class A {
public double rem (double x, int n){
B b = new B(n);
ExecutorService es = Executors.newFixedThreadPool(5);
Future <Integer> nf = es.submit(b); //Factorial
es.submit(()->
{
double r = 1; //Remainder x^n/n!
for (int i = 1; i <= n; i++) r = r * x;
try { r = r / nf.get();}
catch (Exception e) {e.printStackTrace();}
return r;
});
return 0;
}
}
How can I ensure that rem() returns the value only after the task passed to submit() has finished? Unfortunately, this does not work:
public static void main(String[] args) {
A a = new A();
double r = a.rem(0.5, 10);
}
Is it necessary to run A in another thread and modify A so that:
public class A implements Callable <Double> {
private int n;
private double x;
public A(double x_, int n_) {x = x_; n = n_;}
public Double call() {return rem(x, n);}
....
}
and run A.rem() in a separate thread?
public static void main(String[] args) {
A a = new A(0.5, 10);
ExecutorService es = Executors.newFixedThreadPool(5);
Future <Double> nf = es.submit(a); //Factorial
double r = nf.get();
}
Is there any simpler solution avoiding two different threads?
Could I ask for a short sample code?

Calling Future.get() inside a task submitted to a thread pool is dangerous: the worker thread is blocked and cannot run other tasks. This may lead to thread starvation, a specific kind of deadlock.
The correct approach is to build an acyclic graph in which each node is an asynchronous function call of type CompletableFuture that runs only after all of its arguments have been computed. Only the final result is extracted with Future.get(), called on the main thread.
This is an example of such a graph, close to what you wanted to implement: first, the functions factorial and power run in parallel. As soon as both complete, the function that computes the remainder is called.
public static long fact(int n) {
long res = 1;
for (int i = n; i > 1; i--) res *= i;
return res;
}
public static double pow(double base, int pow) {
double r = 1;
for (int i = 0; i < pow; i++) r *= base;
return r;
}
public static double rem(double val1, long val2) {
return val1/val2;
}
public static void main(String[] args) throws ExecutionException, InterruptedException {
ExecutorService es = Executors.newFixedThreadPool(5);
double base = 0.5;
int n = 10;
CompletableFuture<Double> f1 = CompletableFuture.supplyAsync(() -> pow(base, n), es);
CompletableFuture<Long> f2 = CompletableFuture.supplyAsync(() -> fact(n), es);
CompletableFuture<Double> f3 = f1.thenCombineAsync(f2, (v1,v2)->rem(v1,v2), es);
double r1 = f3.get();
System.out.println("r1="+r1);
// compare with the result of synchronous execution:
double r2 = rem(pow(base, n), fact(n));
System.out.println("r2="+r2);
}

A Callable implements the method call(), which computes the value on the pool thread. The Future returned by submit() provides the method get(), which returns that value once it is available. Have a look at the following link:
https://blogs.oracle.com/corejavatechtips/using-callable-to-return-results-from-runnables
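To make the Callable/Future split concrete, here is a minimal sketch of the remainder computation with a single pool thread. The class name RemainderWithFuture and the use of long for the factorial are my own choices for illustration, not part of the original code. The factorial runs on the pool thread, the calling thread computes the power in the meantime, and get() is called only on the calling thread, so no pool thread ever blocks on another task.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RemainderWithFuture {

    // Computes x^n / n!, with the factorial evaluated on a pool thread.
    static double rem(double x, int n) {
        ExecutorService es = Executors.newSingleThreadExecutor();
        try {
            // Callable.call() produces the value; submit() wraps it in a Future.
            Callable<Long> factorial = () -> {
                long f = 1;
                for (int i = 2; i <= n; i++) f *= i;
                return f;
            };
            Future<Long> nf = es.submit(factorial);

            // The calling thread computes the power while the factorial runs.
            double power = 1;
            for (int i = 0; i < n; i++) power *= x;

            // Future.get() blocks the caller until call() has returned.
            return power / nf.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("factorial task failed", e.getCause());
        } finally {
            es.shutdown(); // let the pool thread exit once the task is done
        }
    }

    public static void main(String[] args) {
        System.out.println("r=" + rem(0.5, 10));
    }
}
```

Because rem() itself waits for the result, the caller in main needs no second thread and no Callable wrapper around A.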

Related

Running single java thread is faster than main?

I'm doing a few concurrency experiments in Java.
I have this prime calculation method, which is just for mimicking a semi-expensive operation:
static boolean isprime(int n){
if (n == 1)
return false;
boolean flag = false;
for (int i = 2; i <= n / 2; ++i) {
if (n % i == 0) {
flag = true;
break;
}
}
return ! flag;
}
And then I have this main method, which simply tests every number from 0 to N for primality and stores the results in an array of booleans:
public class Main {
public static void main(String[] args) {
final int N = 100_000;
int T = 1;
boolean[] bool = new boolean[N];
ExecutorService es = Executors.newFixedThreadPool(T);
final int partition = N / T;
long start = System.nanoTime();
for (int j = 0; j < N; j++ ){
boolean res = isprime(j);
bool[j] = res;
}
System.out.println(System.nanoTime()-start);
}
This gives me results like: 893888901 ns, 848995600 ns
I also have this driver code, where I use an ExecutorService with one thread to do the same:
public class Main {
public static void main(String[] args) {
final int N = 100_000;
int T = 1;
boolean[] bool = new boolean[N];
ExecutorService es = Executors.newFixedThreadPool(T);
final int partition = N / T;
long start = System.nanoTime();
for (int i = 0; i < T; i++ ){
final int current = i;
es.execute(new Runnable() {
@Override
public void run() {
for (int j = current*partition; j < current*partition+partition; j++ ){
boolean res = isprime(j);
bool[j] = res;
}
}
});
}
es.shutdown();
try {
es.awaitTermination(1, TimeUnit.MILLISECONDS);
} catch (Exception e){
System.out.println("what?");
}
System.out.println(System.nanoTime()-start);
}
This gives results like: 9523201 ns, 15485300 ns.
Now the second example is, as you can see, much faster than the first, and I can't really understand why. Shouldn't the ExecutorService with one thread be slower than the main thread, since it's basically doing the work sequentially plus the overhead of waking the thread?
I was expecting the ExecutorService to be faster once I started adding multiple threads, but this is a little counterintuitive.
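It's the timeout at the bottom of your code: awaitTermination waits at most one millisecond and then returns whether or not the task has finished, so the timer stops while the work is still running. If you set the timeout higher, you arrive at pretty similar execution times.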
es.awaitTermination(1000, TimeUnit.MILLISECONDS);
The execution times you mention for the first main are much higher than the millisecond you allow the second main to wait for the threads to finish.
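One way to catch this kind of mistake is to check the boolean that awaitTermination returns: it is false when the timeout elapsed before the pool terminated. The sketch below is only an illustration; the class and method names are made up, and a sleeping task stands in for the prime computation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AwaitTerminationDemo {

    // Submits one task that sleeps for taskMillis, then waits at most
    // timeoutMillis and returns what awaitTermination reported.
    static boolean finishedWithin(long taskMillis, long timeoutMillis) {
        ExecutorService es = Executors.newFixedThreadPool(1);
        es.execute(() -> {
            try {
                Thread.sleep(taskMillis); // stand-in for the real work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        es.shutdown();
        try {
            // true: all tasks completed; false: the timeout elapsed first
            return es.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            es.shutdownNow(); // interrupt the task if it is still running
        }
    }

    public static void main(String[] args) {
        System.out.println("1 ms timeout:    " + finishedWithin(500, 1));
        System.out.println("5000 ms timeout: " + finishedWithin(10, 5000));
    }
}
```

In the benchmark from the question, printing a warning when the result is false would have shown immediately that the measured time did not cover the whole computation.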

Using ThreadPool for parallelisation of Matrix Multiplication

I am trying to parallelize a matrix multiplication.
I have achieved parallelization by calculating each cell of matrix C in a separate thread. (I hope I have done this correctly.)
My question is whether using a thread pool is the best way to create threads. (Sorry, I am unfamiliar with this and someone suggested doing it this way.)
Also, will I see a big difference in running time between a sequential version of the program and this one?
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class ParallelMatrix {
public final static int N = 2000; //Random size of matrix
public static void main(String[] args) throws InterruptedException {
long startTime = System.currentTimeMillis();
//Create and multiply matrix of random size N.
double [][] a = new double [N][N];
double [][] b = new double [N][N];
double [][] c = new double [N][N];
int i,j;
for(i = 0; i < N ; i++) {
for(j = 0; j < N ; j++){
a[i][j] = i + j;
b[i][j] = i * j;
}
}
ExecutorService pool = Executors.newFixedThreadPool(1);
for(i = 0; i < N; i++) {
for(j = 0; j < N; j++) {
pool.submit(new Multi(N,i,j,a,b,c));
}
}
pool.shutdown();
pool.awaitTermination(1, TimeUnit.DAYS);
long endTime = System.currentTimeMillis();
System.out.println("Calculation completed in " +
(endTime - startTime) + " milliseconds");
}
static class Multi implements Runnable {
final int N;
final double [][] a;
final double [][] b;
final double [][] c;
final int i;
final int j;
public Multi(int N, int i, int j, double[][] a, double[][] b, double[][] c){
this.N=N;
this.i=i;
this.j=j;
this.a=a;
this.b=b;
this.c=c;
}
@Override
public void run() {
for(int k = 0; k < N; k++)
c[i][j] += a[i][k] * b[k][j];
}
}
}
You have to balance scheduling overhead, operation duration and the number of available cores. For a start, size your thread pool according to the number of cores available: newFixedThreadPool(Runtime.getRuntime().availableProcessors()).
To minimize scheduling overhead you want to slice the operation into just as many independent tasks (of ideally equal execution time) as you have processors.
Generally, the smaller the operation you do in a slice, the more scheduling overhead you have. What you have now (N squared tasks) has excessive overhead: you create and submit 2000 times 2000 Multi runnables, each of which does very little work.
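As a rough sketch of that advice, the version below splits the result matrix into contiguous blocks of rows, one block per available processor, instead of one task per cell. The class name, the multiply signature and the row-block slicing are my assumptions for illustration; other ways of slicing may suit your data better.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SlicedMatrixMultiply {

    // Multiplies a (n x inner) by b (inner x m) using one task per row block.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, inner = b.length, m = b[0].length;
        double[][] c = new double[n][m];
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int rowsPerTask = Math.max(1, (n + workers - 1) / workers); // ceiling division
            List<Future<?>> futures = new ArrayList<>();
            for (int start = 0; start < n; start += rowsPerTask) {
                final int from = start;
                final int to = Math.min(n, start + rowsPerTask);
                // Each task writes only rows [from, to) of c, so tasks never overlap.
                futures.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        for (int k = 0; k < inner; k++) {
                            double aik = a[i][k];
                            for (int j = 0; j < m; j++) c[i][j] += aik * b[k][j];
                        }
                    }
                }));
            }
            for (Future<?> f : futures) f.get(); // wait for every block
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while multiplying", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a row block failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] c = multiply(new double[][]{{1, 2}, {3, 4}},
                                new double[][]{{5, 6}, {7, 8}});
        System.out.println(java.util.Arrays.deepToString(c));
    }
}
```

With N = 2000 this submits at most as many tasks as there are processors, rather than four million.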

Write a function when given one integer x ; returns e with the following approximation

The question:
Write a function named exp that, when given one integer x, returns e with the following approximation (the formula was given as an image). In this function, you should use the previous two functions to calculate factorial and power.
This is my code:
import java.util.Scanner;
public class ass7_q3 {
public static int power(int x, int y)
{
int result = 1;
for(int i = 1; i <= y; i++)
{
result = result * x;
}
return result;
}
public static int factorial(int n)
{
int fact = 1;
for(int i = 1; i<= n; i++)
fact = fact * i;
return fact;
}
public static int exp( int x)
{
int result;
result = (power(x,x) / factorial(x) );
return result;
}
public static void main(String[] args) {
Scanner read = new Scanner(System.in);
int sum = 0;
int x;
x = read.nextInt();
for(int i=0; i<=10; i++)
{
sum = sum + exp(x);
}
System.out.println(sum);
}
}
However, when I run this code, it always gives me the wrong answer.
What can I do?
You should start by working with doubles instead of integers. You can't expect to approximate a real number using only integer calculations.
For example, power(x,x) / factorial(x) always yields an integer, since both methods return an int.
5/2 returns 2. You need to cast an operand to double: ((double)5)/2 returns 2.5.
You should also think about the range of int (the maximum in Java is 2,147,483,647).
If your input is greater than 12, an int cannot hold the factorial, and the factorial method returns a wrong value, because
factorial(12)=479,001,600 but
factorial(13)=6,227,020,800
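Both pitfalls are easy to reproduce. The following sketch (class and method names are made up for illustration) shows integer division truncating and an int factorial silently wrapping around at 13, while a long still holds the correct value.

```java
class IntPitfalls {

    // Same loop as in the question: overflows silently for n >= 13.
    static int factorialInt(int n) {
        int fact = 1;
        for (int i = 1; i <= n; i++) fact = fact * i;
        return fact;
    }

    // long is wide enough up to 20!.
    static long factorialLong(int n) {
        long fact = 1;
        for (int i = 1; i <= n; i++) fact = fact * i;
        return fact;
    }

    public static void main(String[] args) {
        System.out.println(5 / 2);           // integer division: 2
        System.out.println((double) 5 / 2);  // floating-point division: 2.5
        System.out.println(factorialInt(12));  // 479001600, still correct
        System.out.println(factorialInt(13));  // 1932053504, wrapped around
        System.out.println(factorialLong(13)); // 6227020800, correct
    }
}
```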

Issue with Cumulative Sum

Here is an easy one that I am having trouble with. The problem requires me to write a method called fractionSum which accepts an integer parameter and returns a double of the sum of the first n terms.
For instance, if the parameter is 5, the program should add the fractions 1+(1/2)+(1/3)+(1/4)+(1/5). In other words, it's a form of the Riemann sum.
For some reason, the for loop does not accumulate the sum.
Here's the code:
public class Exercise01 {
public static final int UPPER_LIMIT = 5;
public static void main(String[] args) {
System.out.print(fractionSum(UPPER_LIMIT));
}
public static double fractionSum(int n) {
if (n<1) {
throw new IllegalArgumentException("Out of range.");
}
double total = 1;
for (int i = 2; i <= n; i++) {
total += (1/i);
}
return total;
}
}
You need to cast to double.
Try it this way:
public class Exercise01 {
public static final int UPPER_LIMIT = 5;
public static void main(String[] args) {
System.out.print(fractionSum(UPPER_LIMIT));
}
public static double fractionSum(int n) {
if (n<1) {
throw new IllegalArgumentException("Out of range.");
}
double total = 1;
for (int i = 2; i <= n; i++) {
total += (1/(double)i);
}
return total;
}
}
The operation
(1/i)
works on integers and therefore produces an int result. Change it to:
(1.0/i)
to get the fractional result instead of the int result.

Parallel implementation of Levenshtein distance slows down with more threads

This is a parallel implementation of Levenshtein distance that I was writing for fun. I'm disappointed in the results. I am running this on a core i7 processor, so I have plenty of available threads. However, as I increase the thread count, the performance degrades significantly. By that I mean it actually runs slower with more threads for input of the same size.
I was hoping that someone could look at the way I am using threads, and the java.util.concurrent package, and tell me if I am doing anything wrong. I'm really only interested in reasons why the parallelism is not working as I would expect. I don't expect the reader to look at the complicated indexing going on here. I believe the calculations I'm doing are correct. But even if they are not, I think I should still be seeing a close to linear speed-up as I increase the number of threads in the threadpool.
I've included the benchmarking code I used. I'm using libraries found here for benchmarking. The second code block is what I used for benchmarking.
Thanks for any help :).
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
public class EditDistance {
private static final int MIN_CHUNK_SIZE = 5;
private final ExecutorService threadPool;
private final int threadCount;
private final String maxStr;
private final String minStr;
private final int maxLen;
private final int minLen;
public EditDistance(String s1, String s2, ExecutorService threadPool,
int threadCount) {
this.threadCount = threadCount;
this.threadPool = threadPool;
if (s1.length() < s2.length()) {
minStr = s1;
maxStr = s2;
} else {
minStr = s2;
maxStr = s1;
}
maxLen = maxStr.length();
minLen = minStr.length();
}
public int editDist() {
int iterations = maxLen + minLen - 1;
int[] prev = new int[0];
int[] current = null;
for (int i = 0; i < iterations; i++) {
int currentLen;
if (i < minLen) {
currentLen = i + 1;
} else if (i < maxLen) {
currentLen = minLen;
} else {
currentLen = iterations - i;
}
current = new int[currentLen * 2 - 1];
parallelize(prev, current, currentLen, i);
prev = current;
}
return current[0];
}
private void parallelize(int[] prev, int[] current, int currentLen,
int iteration) {
int chunkSize = Math.max(current.length / threadCount, MIN_CHUNK_SIZE);
List<Future<?>> futures = new ArrayList<Future<?>>(currentLen);
for (int i = 0; i < currentLen; i += chunkSize) {
int stopIdx = Math.min(currentLen, i + chunkSize);
Runnable worker = new Worker(prev, current, currentLen, iteration,
i, stopIdx);
futures.add(threadPool.submit(worker));
}
for (Future<?> future : futures) {
try {
Object result = future.get();
if (result != null) {
throw new RuntimeException(result.toString());
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
} catch (ExecutionException e) {
// We can only finish the computation if we complete
// all subproblems
throw new RuntimeException(e);
}
}
}
private void doChunk(int[] prev, int[] current, int currentLen,
int iteration, int startIdx, int stopIdx) {
int mergeStartIdx = (iteration < minLen) ? 0 : 2;
for (int i = startIdx; i < stopIdx; i++) {
// Edit distance
int x;
int y;
int leftIdx;
int downIdx;
int diagonalIdx;
if (iteration < minLen) {
x = i;
y = currentLen - i - 1;
leftIdx = i * 2 - 2;
downIdx = i * 2;
diagonalIdx = i * 2 - 1;
} else {
x = i + iteration - minLen + 1;
y = minLen - i - 1;
leftIdx = i * 2;
downIdx = i * 2 + 2;
diagonalIdx = i * 2 + 1;
}
int left = 1 + ((leftIdx < 0) ? iteration + 1 : prev[leftIdx]);
int down = 1 + ((downIdx < prev.length) ? prev[downIdx]
: iteration + 1);
int diagonal = penalty(x, y)
+ ((diagonalIdx < 0 || diagonalIdx >= prev.length) ? iteration
: prev[diagonalIdx]);
int dist = Math.min(left, Math.min(down, diagonal));
current[i * 2] = dist;
// Merge prev
int mergeIdx = i * 2 + 1;
if (mergeIdx < current.length) {
current[mergeIdx] = prev[mergeStartIdx + i * 2];
}
}
}
private int penalty(int maxIdx, int minIdx) {
return (maxStr.charAt(maxIdx) == minStr.charAt(minIdx)) ? 0 : 1;
}
private class Worker implements Runnable {
private final int[] prev;
private final int[] current;
private final int currentLen;
private final int iteration;
private final int startIdx;
private final int stopIdx;
Worker(int[] prev, int[] current, int currentLen, int iteration,
int startIdx, int stopIdx) {
this.prev = prev;
this.current = current;
this.currentLen = currentLen;
this.iteration = iteration;
this.startIdx = startIdx;
this.stopIdx = stopIdx;
}
@Override
public void run() {
doChunk(prev, current, currentLen, iteration, startIdx, stopIdx);
}
}
public static void main(String args[]) {
int threadCount = 4;
ExecutorService threadPool = Executors.newFixedThreadPool(threadCount);
EditDistance ed = new EditDistance("Saturday", "Sunday", threadPool,
threadCount);
System.out.println(ed.editDist());
threadPool.shutdown();
}
}
There is a private inner class Worker inside EditDistance. Each worker is responsible for filling in a range of the current array using EditDistance.doChunk. EditDistance.parallelize is responsible for creating those workers, and waiting for them to finish their tasks.
And the code I am using for benchmarks:
import java.io.PrintStream;
import java.util.concurrent.*;
import org.apache.commons.lang3.RandomStringUtils;
import bb.util.Benchmark;
public class EditDistanceBenchmark {
public static void main(String[] args) {
if (args.length != 2) {
System.out.println("Usage: <string length> <thread count>");
System.exit(1);
}
PrintStream oldOut = System.out;
System.setOut(System.err);
int strLen = Integer.parseInt(args[0]);
int threadCount = Integer.parseInt(args[1]);
String s1 = RandomStringUtils.randomAlphabetic(strLen);
String s2 = RandomStringUtils.randomAlphabetic(strLen);
ExecutorService threadPool = Executors.newFixedThreadPool(threadCount);
Benchmark b = new Benchmark(new Benchmarker(s1, s2, threadPool,threadCount));
System.setOut(oldOut);
System.out.println("threadCount: " + threadCount +
" string length: "+ strLen + "\n\n" + b);
System.out.println("s1: " + s1 + "\ns2: " + s2);
threadPool.shutdown();
}
private static class Benchmarker implements Runnable {
private final String s1, s2;
private final int threadCount;
private final ExecutorService threadPool;
private Benchmarker(String s1, String s2, ExecutorService threadPool, int threadCount) {
this.s1 = s1;
this.s2 = s2;
this.threadPool = threadPool;
this.threadCount = threadCount;
}
@Override
public void run() {
EditDistance d = new EditDistance(s1, s2, threadPool, threadCount);
d.editDist();
}
}
}
It's very easy to accidentally write code that does not parallelize very well. A common culprit is threads competing for underlying system resources (e.g. a cache line). Since this algorithm inherently acts on data that sit close to each other in memory, I strongly suspect that is the cause here.
I suggest you review this excellent article on false sharing
http://www.drdobbs.com/go-parallel/article/217500206?pgno=3
and then carefully review your code for cases where threads would block one another.
Additionally, running more threads than you have CPU cores will slow down performance if your threads are CPU bound (if you're already using all cores to near 100%, adding more threads will only add overhead for the context switches).
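If you want to see whether false sharing matters on your machine, a small experiment along these lines may help. It is only a sketch under assumptions: the class name is made up, the spacing of 16 longs assumes cache lines of at most 128 bytes, and the JIT compiler is free to optimize the loops, so treat the printed timings as a hint rather than a measurement.

```java
class FalseSharingSketch {

    static final int ITERATIONS = 20_000_000;

    // Two threads each increment their own slot of a shared long[];
    // returns the elapsed wall-clock time in nanoseconds.
    static long run(long[] counters, int slotA, int slotB) {
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counters[slotA]++;
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) counters[slotB]++;
        });
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join(); // join() also makes the threads' writes visible here
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Slots 0 and 1 are adjacent and very likely share a cache line.
        long adjacent = run(new long[2], 0, 1);
        // Slots 0 and 16 are 128 bytes apart (assumed to be on different lines).
        long padded = run(new long[32], 0, 16);
        System.out.println("adjacent slots: " + adjacent + " ns");
        System.out.println("padded slots:   " + padded + " ns");
    }
}
```

In the edit distance code, neighbouring chunks of the current array are written by different workers, so the boundaries between chunks are the places where this effect could show up.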
