Java CompletableFuture threads are still alive after the method executes - java

I have written a small program to check the behavior of CompletableFuture. I have not overridden the common pool.
I did not find any shutdown method, and when I print the active threads at the end, I find that my threads are still alive.
My question is: when will they end if I use this in a live application?
And will it create too many threads if I use it in a public API that has a lot of traffic?
My sample code:
`
package rar;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
public class Rar {
public static void main(String[] args) {
Rar r = new Rar();
Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
System.out.println(threadSet);
r.dfo();
threadSet = Thread.getAllStackTraces().keySet();
System.out.println("uyuuuu"+threadSet);
}
private static void doTask3() {
for(int i=0; i<5;i++) {
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.print(3);
}
}
public void dfo() {
System.out.println("In main");
ExecutorService executor = Executors.newFixedThreadPool(3);
CompletableFuture<Void> thenCompose =CompletableFuture.allOf(
CompletableFuture.runAsync(() -> doTask1()),
CompletableFuture.runAsync(() -> doTask2(2)),
CompletableFuture.runAsync(() -> doTask3()));
//executor.shutdown();
try {
Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
System.out.println(threadSet);
thenCompose.get();
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
System.out.println("Exiting main");
Set<Thread> threadSeWt = Thread.getAllStackTraces().keySet();
System.out.println(threadSeWt);
}
private void doTask2(int num) {
for(int i=0; i<5;i++) {
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.print("4");
}
}
private int doTask1() {
for(int i=0; i<5;i++) {
try {
Thread.sleep(5001);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.print(1);
}
return 5;
}
}
`
Sample Output:
[Thread[Finalizer,8,system], Thread[Attach Listener,5,system], Thread[Signal Dispatcher,9,system], Thread[Reference Handler,10,system], Thread[main,5,main]]
In main
[Thread[Finalizer,8,system], Thread[ForkJoinPool.commonPool-worker-1,5,main], Thread[main,5,main], Thread[ForkJoinPool.commonPool-worker-2,5,main], Thread[Attach Listener,5,system], Thread[ForkJoinPool.commonPool-worker-3,5,main], Thread[Signal Dispatcher,9,system], Thread[Reference Handler,10,system]]
431413431431431Exiting main
[Thread[Finalizer,8,system], Thread[ForkJoinPool.commonPool-worker-1,5,main], Thread[main,5,main], Thread[ForkJoinPool.commonPool-worker-2,5,main], Thread[Attach Listener,5,system], Thread[ForkJoinPool.commonPool-worker-3,5,main], Thread[Signal Dispatcher,9,system], Thread[Reference Handler,10,system]]
uyuuuu[Thread[Finalizer,8,system], Thread[ForkJoinPool.commonPool-worker-1,5,main], Thread[main,5,main], Thread[ForkJoinPool.commonPool-worker-2,5,main], Thread[Attach Listener,5,system], Thread[ForkJoinPool.commonPool-worker-3,5,main], Thread[Signal Dispatcher,9,system], Thread[Reference Handler,10,system]]

The executor in your dfo() method is never used.
When you call runAsync() without an executor argument, the task runs on the common pool. That's why you see ForkJoinPool.commonPool-worker threads in your debug output.
By default, the size of that pool is limited to "number of your CPU cores - 1".
Try running 20 tasks, and you'll see that the thread count stops growing once it reaches that maximum.
You don't need to stop the threads of the common ForkJoinPool. From the documentation:
its threads are slowly reclaimed during periods of non-use
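If you would rather control the thread count and lifetime yourself, one option is to pass your own executor as the second argument of runAsync() and shut it down when you are done. The following is only a minimal sketch under that assumption; the class name PoolDemo and the method runOnOwnPool are made up for illustration. Note also that common pool workers are daemon threads, so they do not keep the JVM alive on their own.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class PoolDemo {

    // Hypothetical helper: runs taskCount trivial tasks on a dedicated pool of 3 threads
    // and reports whether that pool terminated after shutdown().
    static boolean runOnOwnPool(int taskCount) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            CompletableFuture<?>[] futures = new CompletableFuture<?>[taskCount];
            for (int i = 0; i < taskCount; i++) {
                // The second argument makes the task run on our pool, not on the common pool.
                futures[i] = CompletableFuture.runAsync(() -> { }, executor);
            }
            // Wait until every task has completed.
            CompletableFuture.allOf(futures).join();
        } finally {
            // Unlike the common pool, a pool we created must be shut down by us.
            executor.shutdown();
        }
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("own pool terminated: " + runOnOwnPool(20));
    }
}
```

With a dedicated pool, the 20 tasks share 3 threads, and those threads are gone once the pool has terminated.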

Related

How to get server status using multi-threads periodically

The below code works fine and it connects to a given server (host, port) and gets the connection status.
What it does is:
PollService implements the Callable interface and connects to a server(host, port) then it returns the status.
Since this should happen periodically, it iterates over the HashMap entries in an infinite while(true) loop.
The problem: On the server side, I see that it takes 2 or 3 seconds for a request to reach the thread, whereas if I use a Runnable with a periodic schedule, it connects within 1 second. It looks like iterating over the HashMap in an infinite loop is a slow approach.
However, I cannot use a Runnable because it doesn't return the connection status, which I need to use later.
Below is the ServiceMonitor class (client) which connects to the server.
package org.example;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;
public class ServicesMonitor {
private ExecutorService scheduledExecutorService = null;
private static Logger logger = Logger.getLogger(ServicesMonitor.class.getName());
private final Map<ServiceType, List<ClientMonitorService>> clientMonitorServicesMap = new HashMap<>();
public void registerInterest(ClientMonitorService clientMonitorService) {
clientMonitorServicesMap.computeIfAbsent(clientMonitorService.getServiceToMonitor().getServiceType(), v -> new ArrayList<>()).add(clientMonitorService);
}
public Map<ServiceType, List<ClientMonitorService>> getClineMonitorService() {
return clientMonitorServicesMap;
}
public void poll(){
//Observable.interval(1, TimeUnit.SECONDS).st
}
public void pollServices() {
scheduledExecutorService = Executors.newFixedThreadPool(clientMonitorServicesMap.size());
try {
while (true) {
clientMonitorServicesMap.forEach((k, v) -> {
Future<Boolean> val = scheduledExecutorService.submit(new PollService(k));
try {
boolean result = val.get();
System.out.println("service " + k.getHost() + ":" + k.getPort() + "status is " + result);
if (result) {
List<ClientMonitorService> list = v.stream().filter(a -> LocalDateTime.now().getSecond() % a.getServiceToMonitor().getFreqSec() == 0)
.collect(Collectors.toList());
list.stream().forEach(a -> System.out.println(a.getClientId()));
}
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
});
}
} catch (Exception e) {
logger.log(Level.SEVERE, e.getMessage());
} finally {
scheduledExecutorService.shutdown();
}
}
}
How can I improve the performance of this code by reducing the time it takes to connect to the server?
How can I improve this code in general?
After switching to get(1, TimeUnit.SECONDS), I started to see an improvement on the server side as well (the threads are reached in less than 1 second), since the client no longer waits more than 1 second for each result.
while (true) {
clientMonitorServicesMap.forEach((k, v) -> {
Future<Boolean> val = scheduledExecutorService.submit(new PollService(k));
try {
boolean result = val.get(1, TimeUnit.SECONDS);
System.out.println("service " + k.getHost() + ":" + k.getPort() + "status is " + result);
if (result) {
List<ClientMonitorService> list = v.stream()
//.filter(a -> LocalDateTime.now().getSecond() % a.getServiceToMonitor().getFreqSec() == 0)
.collect(Collectors.toList());
list.stream().forEach(a -> System.out.println(a.getClientId()));
}
} catch (InterruptedException e) {
logger.log(Level.WARNING,"Interrupted -> " + k.getHost()+":"+k.getPort());
} catch (ExecutionException e) {
logger.log(Level.INFO,"ExecutionException exception -> "+ k.getHost()+":"+k.getPort());
} catch (TimeoutException e) {
logger.log(Level.INFO,"TimeoutException exception -> "+ k.getHost()+":"+k.getPort());
}
});
}
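Another way to keep the periodic schedule and still get a status back is to wrap each Callable in a Runnable that stores its result in a shared map. This is only a sketch of that idea and not the code from the question: PeriodicPoller, watch, snapshot and demo are hypothetical names, and the two lambdas in demo stand in for the real PollService.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicPoller {
    // Latest known status per service key; written by scheduler threads, read by anyone.
    private final Map<String, Boolean> latestStatus = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler;

    public PeriodicPoller(int threads) {
        this.scheduler = Executors.newScheduledThreadPool(threads);
    }

    // Runs check every periodMillis and records each result under key.
    public void watch(String key, Callable<Boolean> check, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                latestStatus.put(key, check.call());
            } catch (Exception e) {
                // An exception escaping the task would suppress all later runs,
                // so a failed check is recorded as "down" instead.
                latestStatus.put(key, false);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    public Map<String, Boolean> snapshot() {
        return new HashMap<>(latestStatus);
    }

    public void stop() {
        scheduler.shutdownNow();
    }

    // Small demonstration with two stand-in checks instead of real connections.
    static Map<String, Boolean> demo() {
        PeriodicPoller poller = new PeriodicPoller(2);
        try {
            poller.watch("up", () -> true, 100);
            poller.watch("down", () -> { throw new IllegalStateException("connection refused"); }, 100);
            // Wait (at most about 5 seconds) until both checks have reported once.
            for (int i = 0; i < 50 && poller.snapshot().size() < 2; i++) {
                Thread.sleep(100);
            }
            return poller.snapshot();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return poller.snapshot();
        } finally {
            poller.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because every service has its own schedule, a slow server no longer delays the checks of the others, and the caller reads the most recent status from the map instead of blocking on Future.get().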

How to wait for full completion of a completable future with runAsync?

This test fails:
package com.stackoverflow.demo;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ForkJoinPool;
import org.junit.Assert;
import org.junit.Test;
public class AsyncTest {
@Test
public void test1() {
Assert.assertTrue("please run this test in a machine with 2 or more cores", ForkJoinPool.getCommonPoolParallelism() > 1);
CompletableFuture<String> cf = CompletableFuture.completedFuture("ok");
ConcurrentLinkedQueue<String> out = new ConcurrentLinkedQueue<>();
cf.thenRunAsync(() -> {
out.add("one");
try {
Thread.sleep(2000);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
out.add("two");
}, ForkJoinPool.commonPool());
cf.join();
Assert.assertEquals(2, out.size());
}
}
I was surprised because I expected cf.join() to take all attached tasks into account. I am sure it says somewhere in the documentation that join only waits for the initial task, but somehow I missed it.
How can I get the behavior I want: Wait for a CompletableFuture and all its attached subtasks to complete?
Fixed it while proof-reading my post:
public class AsyncTest {
@Test
public void test1() {
Assert.assertTrue("please run this test in a machine with 2 or more cores", ForkJoinPool.getCommonPoolParallelism() > 1);
CompletableFuture<String> cf = CompletableFuture.completedFuture("ok");
ConcurrentLinkedQueue<String> out = new ConcurrentLinkedQueue<>();
CompletableFuture<Void> cf2 = cf.thenRunAsync(() -> {
out.add("one");
try {
Thread.sleep(2000);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
out.add("two");
}, ForkJoinPool.commonPool());
cf2.join();
Assert.assertEquals(2, out.size());
}
}
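The same idea extends to several dependent stages: each thenRunAsync call returns its own future, and CompletableFuture.allOf can combine them into one future to join on. A minimal sketch, assuming two independent follow-up tasks (the class and method names are invented for illustration):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

public class JoinDependents {

    // Attaches two async stages to an already completed future and waits for both of them.
    static int runAndCount() {
        ConcurrentLinkedQueue<String> out = new ConcurrentLinkedQueue<>();
        CompletableFuture<String> cf = CompletableFuture.completedFuture("ok");

        // Each call returns a new future that represents the dependent stage.
        CompletableFuture<Void> first = cf.thenRunAsync(() -> out.add("one"));
        CompletableFuture<Void> second = cf.thenRunAsync(() -> out.add("two"));

        // Joining cf would return immediately; joining the combined future waits for both stages.
        CompletableFuture.allOf(first, second).join();
        return out.size();
    }

    public static void main(String[] args) {
        System.out.println(runAndCount());
    }
}
```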

Spark application does not stop when multiple threads share the same Spark context

I have tried to reproduce the problem I am facing. My problem statement: a folder contains multiple files. I need to do a word count for each file and print the result. The files should be processed in parallel, although of course there is a limit to the parallelism. I have written the following code to accomplish this, and it runs fine. The cluster has a MapR installation of Spark, with spark.scheduler.mode = FIFO.
Q1 - Is there a better way to accomplish the task mentioned above?
Q2 - I have observed that the application does not stop even after it has completed the word counting of the available files. I am unable to figure out how to deal with this.
package groupId.artifactId;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
public class Executor {
/**
* @param args
*/
public static void main(String[] args) {
final int threadPoolSize = 5;
SparkConf sparkConf = new SparkConf().setMaster("yarn-client").setAppName("Tracker").set("spark.ui.port","0");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
ExecutorService executor = Executors.newFixedThreadPool(threadPoolSize);
List<Future> listOfFuture = new ArrayList<Future>();
for (int i = 0; i < 20; i++) {
if (listOfFuture.size() < threadPoolSize) {
FlexiWordCount flexiWordCount = new FlexiWordCount(jsc, i);
Future future = executor.submit(flexiWordCount);
listOfFuture.add(future);
} else {
boolean allFutureDone = false;
while (!allFutureDone) {
allFutureDone = checkForAllFuture(listOfFuture);
System.out.println("Threads not completed yet!");
try {
Thread.sleep(2000);//waiting for 2 sec, before next check
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
printFutureResult(listOfFuture);
System.out.println("printing of future done");
listOfFuture.clear();
System.out.println("future list got cleared");
}
}
try {
executor.awaitTermination(5, TimeUnit.MINUTES);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private static void printFutureResult(List<Future> listOfFuture) {
Iterator<Future> iterateFuture = listOfFuture.iterator();
while (iterateFuture.hasNext()) {
Future tempFuture = iterateFuture.next();
try {
System.out.println("Future result " + tempFuture.get());
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ExecutionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
private static boolean checkForAllFuture(List<Future> listOfFuture) {
boolean status = true;
Iterator<Future> iterateFuture = listOfFuture.iterator();
while (iterateFuture.hasNext()) {
Future tempFuture = iterateFuture.next();
if (!tempFuture.isDone()) {
status = false;
break;
}
}
return status;
}
package groupId.artifactId;
import java.io.Serializable;
import java.util.Arrays;
import java.util.concurrent.Callable;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;
public class FlexiWordCount implements Callable<Object>,Serializable {
private static final long serialVersionUID = 1L;
private JavaSparkContext jsc;
private int fileId;
public FlexiWordCount(JavaSparkContext jsc, int fileId) {
super();
this.jsc = jsc;
this.fileId = fileId;
}
private static class Reduction implements Function2<Integer, Integer, Integer>{
@Override
public Integer call(Integer i1, Integer i2) {
return i1 + i2;
}
}
private static class KVPair implements PairFunction<String, String, Integer>{
@Override
public Tuple2<String, Integer> call(String paramT)
throws Exception {
return new Tuple2<String, Integer>(paramT, 1);
}
}
private static class Flatter implements FlatMapFunction<String, String>{
@Override
public Iterable<String> call(String s) {
return Arrays.asList(s.split(" "));
}
}
@Override
public Object call() throws Exception {
JavaRDD<String> jrd = jsc.textFile("/root/folder/experiment979/" + fileId +".txt");
System.out.println("inside call() for fileId = " + fileId);
JavaRDD<String> words = jrd.flatMap(new Flatter());
JavaPairRDD<String, Integer> ones = words.mapToPair(new KVPair());
JavaPairRDD<String, Integer> counts = ones.reduceByKey(new Reduction());
return counts.collect();
}
}
}
Why is the program not closing automatically?
Ans: You have not stopped the SparkContext. Try changing the main method to this:
public static void main(String[] args) {
final int threadPoolSize = 5;
SparkConf sparkConf = new SparkConf().setMaster("yarn-client").setAppName("Tracker").set("spark.ui.port","0");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
ExecutorService executor = Executors.newFixedThreadPool(threadPoolSize);
List<Future> listOfFuture = new ArrayList<Future>();
for (int i = 0; i < 20; i++) {
if (listOfFuture.size() < threadPoolSize) {
FlexiWordCount flexiWordCount = new FlexiWordCount(jsc, i);
Future future = executor.submit(flexiWordCount);
listOfFuture.add(future);
} else {
boolean allFutureDone = false;
while (!allFutureDone) {
allFutureDone = checkForAllFuture(listOfFuture);
System.out.println("Threads not completed yet!");
try {
Thread.sleep(2000);//waiting for 2 sec, before next check
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
printFutureResult(listOfFuture);
System.out.println("printing of future done");
listOfFuture.clear();
System.out.println("future list got cleared");
}
}
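// Stop accepting new tasks. Without shutdown(), awaitTermination() only waits for the
// full timeout, and the pool's non-daemon threads keep the JVM alive afterwards.
executor.shutdown();
try {
executor.awaitTermination(5, TimeUnit.MINUTES);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// Stop the Spark context so the application can exit.
jsc.stop();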
}
Is there a better way?
Ans: Yes. Pass the directory containing the files to the SparkContext and call .textFile on the directory; Spark will then parallelize the reads from the directory across the executors. If you create threads yourself and reuse the same SparkContext to submit a separate job for each file, you add the extra overhead of submitting work to the YARN queue for each one.
I think the fastest approach is to pass the entire directory directly, create an RDD from it, and then let Spark launch parallel tasks to process all the files on different executors. You can experiment with the .repartition() method on the RDD, as it launches that many tasks to run in parallel.

Why is this code executing sequentially?

Below code :
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class ThreadTest {
private static int counter = 0;
private static ExecutorService executorService = Executors.newCachedThreadPool();
private static List<Integer> intValues = new ArrayList<Integer>();
public static void main(String args[]){
for(int counter = 0; counter < 10; ++counter){
intValues.add(testCallback());
}
for(int i : intValues){
System.out.println(i);
}
System.exit(0);
}
public static Integer testCallback() {
Future<Integer> result = executorService.submit(new Callable<Integer>() {
public Integer call() throws Exception {
counter += 1;
Thread.sleep(500);
return counter;
}
});
try {
return result.get();
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
return null;
}
}
Outputs :
1
2
3
4
5
6
7
8
9
10
This program takes approximately 5 seconds to run. I am trying to execute multiple invocations of the testCallback method in separate threads, so I would expect the method to run in 10 threads concurrently, with each thread taking approximately 500 milliseconds. So overall I expect the program to run in < 1 second.
Why is counter not being invoked in separate threads concurrently?
result.get();
This is a blocking call that waits for the task to complete.
Therefore, you're waiting for each task to finish before starting the next one.
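To let the tasks overlap, submit all of them first and call get() only afterwards. The sketch below illustrates that ordering; it is not the original program. It swaps the static int counter for an AtomicInteger, because a plain counter += 1 is not safe once the tasks really run concurrently, and the names ParallelSubmit and runAll are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelSubmit {

    // Submits all tasks before waiting for any of them, then collects the results.
    static List<Integer> runAll(int tasks) {
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger counter = new AtomicInteger();
        try {
            // Phase 1: submit everything; submit() returns immediately.
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    int value = counter.incrementAndGet();
                    Thread.sleep(500);
                    return value;
                }));
            }
            // Phase 2: block on the results; by now all tasks are already running.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<Integer> results = runAll(10);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(results + " in " + elapsedMillis + " ms");
    }
}
```

Each task still receives a distinct value from 1 to 10, but which task gets which value depends on thread scheduling, so the printed order can vary.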

HtmlUnit WebClient Timeout

In my previous questions about HtmlUnit
Skip particular Javascript execution in HTML unit
and
Fetch Page source using HtmlUnit : URL got stuck
I had mentioned that the URL is getting stuck. I also found out that it gets stuck because one of the methods (parse) in the HtmlUnit library never returns.
I did further work on this. I wrote code to abandon the method if it takes longer than the specified timeout (in seconds) to complete.
import java.io.IOException;
import java.net.MalformedURLException;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
public class HandleHtmlUnitTimeout {
public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException, TimeoutException
{
Date start = new Date();
String url = "http://ericaweiner.com/collections/";
doWorkWithTimeout(url, 60);
}
public static void doWorkWithTimeout(final String url, long timeoutSecs) throws InterruptedException, TimeoutException {
//maintains a thread for executing the doWork method
ExecutorService executor = Executors.newFixedThreadPool(1);
//logger.info("Starting method with "+timeoutSecs+" seconds as timeout");
//set the executor thread working
final Future<?> future = executor.submit(new Runnable() {
public void run()
{
try
{
getPageSource(url);
}
catch (Exception e)
{
throw new RuntimeException(e);
}
}
});
//check the outcome of the executor thread and limit the time allowed for it to complete
try {
future.get(timeoutSecs, TimeUnit.SECONDS);
} catch (Exception e) {
//ExecutionException: deliverer threw exception
//TimeoutException: didn't complete within downloadTimeoutSecs
//InterruptedException: the executor thread was interrupted
//interrupts the worker thread if necessary
future.cancel(true);
//logger.warn("encountered problem while doing some work", e);
throw new TimeoutException();
}finally{
executor.shutdownNow();
}
}
public static void getPageSource(String productPageUrl)
{
try {
if(productPageUrl == null)
{
productPageUrl = "http://ericaweiner.com/collections/";
}
WebClient wb = new WebClient(BrowserVersion.FIREFOX_3_6);
wb.getOptions().setTimeout(120000);
wb.getOptions().setJavaScriptEnabled(true);
wb.getOptions().setThrowExceptionOnScriptError(true);
wb.getOptions().setThrowExceptionOnFailingStatusCode(false);
HtmlPage page = wb.getPage(productPageUrl);
wb.waitForBackgroundJavaScript(4000);
wb.closeAllWindows();
}
catch (FailingHttpStatusCodeException e)
{
e.printStackTrace();
}
catch (MalformedURLException e)
{
e.printStackTrace();
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
This code does return from the doWorkWithTimeout(url, 60); method, but the program does not terminate.
When I try a similar implementation with the following code:
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import org.apache.log4j.Logger;
public class HandleScraperTimeOut {
private static Logger logger = Logger.getLogger(HandleScraperTimeOut .class);
public void doWork() throws InterruptedException {
logger.info(new Date()+ "Starting worker method ");
Thread.sleep(20000);
logger.info(new Date()+ "Ending worker method ");
//perform some long running task here...
}
public void doWorkWithTimeout(int timeoutSecs) {
//maintains a thread for executing the doWork method
ExecutorService executor = Executors.newFixedThreadPool(1);
logger.info("Starting method with "+timeoutSecs+" seconds as timeout");
//set the executor thread working
final Future<?> future = executor.submit(new Runnable() {
public void run()
{
try
{
doWork();
}
catch (Exception e)
{
throw new RuntimeException(e);
}
}
});
//check the outcome of the executor thread and limit the time allowed for it to complete
try {
future.get(timeoutSecs, TimeUnit.SECONDS);
} catch (Exception e) {
//ExecutionException: deliverer threw exception
//TimeoutException: didn't complete within downloadTimeoutSecs
//InterruptedException: the executor thread was interrupted
//interrupts the worker thread if necessary
future.cancel(true);
logger.warn("encountered problem while doing some work", e);
}
executor.shutdown();
}
public static void main(String a[])
{
HandleScraperTimeOut hcto = new HandleScraperTimeOut ();
hcto.doWorkWithTimeout(30);
}
}
If anybody can have a look and tell me what the issue is, it would be really helpful.
For more details about the issue, you can look at Skip particular Javascript execution in HTML unit
and
Fetch Page source using HtmlUnit : URL got stuck
Update 1
The strange thing is that future.cancel(true); returns TRUE in both cases.
What I expected was:
With HtmlUnit it should return FALSE, since the process is still hanging.
With a normal Thread.sleep(); it should return TRUE, since the process
got cancelled successfully.
Update 2
It only hangs with the http://ericaweiner.com/collections/ URL. If I give any other URL, e.g. http://www.google.com or http://www.yahoo.com, it does not hang. In those cases it throws InterruptedException and comes out of the process.
It seems that the http://ericaweiner.com/collections/ page source has certain elements which are causing problems.
Future.cancel(boolean) returns:
false if the task could not be cancelled, typically because it has already completed normally
true otherwise
Cancelled means that the task did not finish before cancel was called, the cancelled flag was set to true and, if requested, the thread was interrupted.
Interrupting the thread means that Thread.interrupt was called and nothing more. Future.cancel(boolean) does not check whether the thread actually stopped.
So it is correct that cancel returns true in both cases.
Interrupting a thread asks it to stop as soon as possible, but this is not enforced. You can try to make it stop or fail by closing a resource it needs. I usually do that with a thread that is reading (waiting for incoming data) from a socket: I close the socket so it stops waiting.
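The socket trick can be sketched as follows. This is a self-contained illustration over a loopback connection, not HtmlUnit code; the class name CloseToUnblock and the method readerStopsAfterClose are invented for the example. A reader thread blocks in read() because the peer never sends anything, and closing the socket from another thread makes that read() fail so the thread can end.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseToUnblock {

    // Returns true if the blocked reader thread ended after its socket was closed.
    static boolean readerStopsAfterClose() {
        try (ServerSocket server = new ServerSocket(0)) {   // port 0: any free port
            Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
            try (Socket peer = server.accept()) {
                Thread reader = new Thread(() -> {
                    try {
                        // Blocks here: the peer never writes anything.
                        client.getInputStream().read();
                    } catch (IOException e) {
                        // Expected once the socket is closed from another thread.
                    }
                });
                reader.start();
                Thread.sleep(200);   // give the reader time to block in read()

                // Closing the resource the thread depends on makes read() throw.
                client.close();

                reader.join(5000);
                return !reader.isAlive();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reader stopped: " + readerStopsAfterClose());
    }
}
```

Whether something similar is possible with HtmlUnit depends on which resource the stuck call is waiting on, which the question does not show.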
