Parallel stream doesn't seem to run fully in parallel - java

1. A Set's parallelStream doesn't use enough threads.
Java 8's parallelStream doesn't run fully in parallel.
On my computer, a Set's parallelStream does not use enough threads when the number of tasks is smaller than the number of processors.
public class ParallelStreamSplitTest {

    @Test
    public void setStreamParallelTest() {
        System.out.printf("Total processor count : %d \n", Runtime.getRuntime().availableProcessors());
        long start = System.currentTimeMillis();
        IntStream.range(1, 8).boxed().collect(Collectors.toCollection(HashSet::new)).parallelStream().forEach((index) -> {
            System.out.println("Starting " + Thread.currentThread().getName() + ", index=" + index + ", " + new Date());
            try {
                Thread.sleep(1000);
            } catch (Exception e) {
                // ignored for the purposes of this test
            }
        });
        long end = System.currentTimeMillis();
        System.out.println(Thread.currentThread().getName() + "'s elapsed time : " + (end - start));
    }

    @Test
    public void intStreamParallelTest() {
        System.out.printf("Total processor count : %d \n", Runtime.getRuntime().availableProcessors());
        long start = System.currentTimeMillis();
        IntStream.range(1, 8).parallel().forEach(index -> {
            System.out.println("Starting " + Thread.currentThread().getName() + ", index=" + index + ", " + new Date());
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // ignored for the purposes of this test
            }
        });
        long end = System.currentTimeMillis();
        System.out.println(Thread.currentThread().getName() + "'s elapsed time : " + (end - start));
    }
}
In my code, setStreamParallelTest takes 4 seconds whereas intStreamParallelTest takes 1 second.
I expected setStreamParallelTest to finish in 1 second as well.
Is this a bug?
2. Is it okay to use a parallel stream to call another API in a web application? If not, why?
My web application needs to call another API server in parallel, so I use a parallel stream to make the calls.
Sets.newHashSet(api1, api2, api3, api4).parallelStream().forEach(api -> callApiSync(api))
I think all requests handled by my server share one fork-join pool, so it looks dangerous when one of the API responses is slow.
Is that correct?

The contract for parallelStream says:
Returns a possibly parallel Stream with this collection as its source. It is allowable for this method to return a sequential stream.
If you want to invoke several tasks in parallel, use an ExecutorService.
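A minimal sketch of that suggestion, assuming each task is a blocking call like the one-second sleep in the question. The class name DedicatedPoolExample and the method runAll are made up for illustration. The pool has one thread per task and is separate from the common fork-join pool, so a slow call only ties up its own thread:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DedicatedPoolExample {

    // Runs taskCount blocking tasks on a pool of taskCount threads and
    // returns the names of the threads that did the work.
    static Set<String> runAll(int taskCount, long sleepMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(taskCount);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                tasks.add(() -> {
                    Thread.sleep(sleepMillis); // stands in for a slow API call
                    return Thread.currentThread().getName();
                });
            }
            Set<String> threadNames = new TreeSet<>();
            // invokeAll blocks until every task has completed.
            for (Future<String> future : pool.invokeAll(tasks)) {
                threadNames.add(future.get());
            }
            return threadNames;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        Set<String> names = runAll(7, 1000);
        System.out.println(names.size() + " threads used, elapsed ms: "
                + (System.currentTimeMillis() - start));
    }
}
```

With 7 tasks and 7 pool threads, the elapsed time should be close to the duration of a single task. In a web application you would probably create the pool once and reuse it across requests instead of creating it per call as this demo does.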

Related

How to delay asynchronously when a method is called consecutively?

I want to have a delay of 1 minute before the printFirst() method is called without affecting the main thread.
Code
I tried
// define delaying print-method using Timer
static void printFirst() {
new java.util.Timer().schedule(
new java.util.TimerTask() {
public void run() {
System.out.println(ts() + " First");
}
},60000
);
}
// main to run
System.out.println(ts() + " Zero");
printFirst();
printFirst();
printFirst();
System.out.println(ts() + " Second");
System.out.println(ts() + " Third");
System.out.println(ts() + " Fourth");
Actual Output
but the output was
Timestamp: 2023-01-05 17:40:43.664 Zero
Timestamp: 2023-01-05 17:40:43.666 Second
Timestamp: 2023-01-05 17:40:43.667 Third
Timestamp: 2023-01-05 17:40:43.667 Fourth
Timestamp: 2023-01-05 17:41:13.681 First
Timestamp: 2023-01-05 17:41:13.681 First
Timestamp: 2023-01-05 17:41:13.681 First
Expected
I was expecting an interval of 1 min between the 3 lines ending with "First".
Timestamp: 2023-01-05 17:40:43.664 Zero
Timestamp: 2023-01-05 17:40:43.666 Second
Timestamp: 2023-01-05 17:40:43.667 Third
Timestamp: 2023-01-05 17:40:43.667 Fourth
Timestamp: 2023-01-05 17:41:43.667 First
Timestamp: 2023-01-05 17:42:43.667 First
Timestamp: 2023-01-05 17:43:43.667 First
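You can achieve the expected result with a single-threaded ExecutorService created by Executors.newSingleThreadExecutor(). Submitted tasks are queued and run one after another, so each 60-second sleep starts only after the previous task has finished. See the code below.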
static ExecutorService es = Executors.newSingleThreadExecutor();
static void printFirst() {
es.submit(() -> {
try {
TimeUnit.MILLISECONDS.sleep(60000);
System.out.println(ts() + " First");
} catch (InterruptedException e) {
e.printStackTrace();
}
});
}
The main method is still the same.
System.out.println(ts() + " Zero");
printFirst();
printFirst();
printFirst();
System.out.println(ts() + " Second");
System.out.println(ts() + " Third");
System.out.println(ts() + " Fourth");
When run, the results are:
2023-01-06 00:05:45.72 Zero
2023-01-06 00:05:45.763 Second
2023-01-06 00:05:45.763 Third
2023-01-06 00:05:45.763 Fourth
2023-01-06 00:06:45.776 First
2023-01-06 00:07:45.782 First
2023-01-06 00:08:45.784 First
As an option, you can set up your timer for periodic execution and limit the number of runs:
static void printFirst(long delay, int numberOfIterations) {
new java.util.Timer().schedule(
new java.util.TimerTask() {
private int counter = 0;
public void run() {
System.out.println(ts() + " First");
if(++counter == numberOfIterations) {
this.cancel();
}
}
}, delay, delay
);
}
Then run it with a single call: printFirst(60000, 3);
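The same bounded repetition could also be sketched with a ScheduledExecutorService instead of a Timer. This is only an illustration: the class LimitedRepeat and its repeat method are hypothetical names, and the awaitTermination call is there only so the demo can report the final count. A caller that must not block would leave it out.

```java
import java.util.Date;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LimitedRepeat {

    // Runs task every periodMillis, at most `runs` times, then shuts the
    // scheduler down. Returns how many times the task actually ran.
    static int repeat(Runnable task, long periodMillis, int runs, long timeoutMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger counter = new AtomicInteger();
        scheduler.scheduleAtFixedRate(() -> {
            task.run();
            if (counter.incrementAndGet() >= runs) {
                // By default, shutdown() cancels the remaining periodic executions.
                scheduler.shutdown();
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        try {
            // Only for the demo: wait until the last run has happened.
            scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // Short period so the demo finishes quickly; use 60_000 for one minute.
        int ran = repeat(() -> System.out.println(new Date() + " First"), 1000, 3, 10_000);
        System.out.println("Executions: " + ran);
    }
}
```

Unlike a Timer that is never cancelled, the scheduler's thread ends after the last run, so it does not keep the JVM alive.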
Issue
The delay in Timer.schedule(TimerTask task, long delay) is only the initial delay: the time between scheduling the task and its start. See the parameter in the docs:
delay - delay in milliseconds before task is to be executed.
Solution
For your intent (a repeating interval) you need the overloaded sister method schedule(TimerTask task, long delay, long period), passing the 1-minute interval as the third argument, period:
Schedules the specified task for repeated fixed-delay execution, beginning after the specified delay. Subsequent executions take place at approximately regular intervals separated by the specified period.
import java.util.Timer;
import java.util.TimerTask;
// define the task, here in a method on demand
public static TimerTask createPrintTask(String text) {
return new TimerTask() {
public void run() {
System.out.println(ts() + text); // not sure how ts() is defined
}
};
}
// define print-method using Timer with interval and zero delay
static void printScheduledWithInterval(String text, int intervalMillis) {
var initialDelayMillis = 0; // start immediately (first after 0 seconds)
new Timer().schedule(createPrintTask(text), initialDelayMillis, intervalMillis);
}
// MAIN
System.out.println(ts() + " Zero");
// use to start the task asynchronously (it will continue to print every interval until canceled)
printScheduledWithInterval(" First", 60_000); // 1 minute interval
System.out.println(ts() + " Second");
System.out.println(ts() + " Third");
System.out.println(ts() + " Fourth");
Note: In your code, printFirst was called 3 times, presumably to start 3 different tasks or to print 3 lines.
When does the scheduling of the task end?
printScheduledWithInterval here rather means: schedule a print job to be executed repeatedly at a fixed interval.
Since there are no requirements about the number of printed lines or a maximum execution count, the timer keeps running the task until it is cancelled.
When calling the function, you can pass the delay in milliseconds as a parameter, increasing it by one minute for each call:
System.out.println(ts() + " Zero");
printFirst(60000);
printFirst(120000);
printFirst(180000);
System.out.println(ts() + " Second");
System.out.println(ts() + " Third");
System.out.println(ts() + " Fourth");
In your function,
static void printFirst(long time) {
new java.util.Timer().schedule(
new java.util.TimerTask() {
public void run() {
System.out.println(ts() + " First");
}
},time
);
}

RxJava2 blockingSubscribe vs subscribe

I have read the explanation of blockingSubscribe() and subscribe(), but I can neither write nor find an example that shows the difference between them. Both seem to work the same way. Could someone provide an example of the two, preferably in Java?
blockingSubscribe blocks the current thread and processes the incoming events there. You can see this by running some async source:
System.out.println("Before blockingSubscribe");
System.out.println("Before Thread: " + Thread.currentThread());
Observable.interval(1, TimeUnit.SECONDS)
.take(5)
.blockingSubscribe(t -> {
System.out.println("Thread: " + Thread.currentThread());
System.out.println("Value: " + t);
});
System.out.println("After blockingSubscribe");
System.out.println("After Thread: " + Thread.currentThread());
subscribe gives no such confinement and may run on arbitrary threads:
System.out.println("Before subscribe");
System.out.println("Before Thread: " + Thread.currentThread());
Observable.timer(1, TimeUnit.SECONDS, Schedulers.io())
.concatWith(Observable.timer(1, TimeUnit.SECONDS, Schedulers.single()))
.subscribe(t -> {
System.out.println("Thread: " + Thread.currentThread());
System.out.println("Value: " + t);
});
System.out.println("After subscribe");
System.out.println("After Thread: " + Thread.currentThread());
// RxJava uses daemon threads, without this, the app would quit immediately
Thread.sleep(3000);
System.out.println("Done");

ForkJoinFramework only uses two workers

I have an application which crawls around six thousand URLs. To minimize this work I created a RecursiveTask which consumes a ConcurrentLinkedQueue of all URLs to crawl. It splits off up to 50 URLs; if the queue is then empty it crawls them directly, otherwise it first creates a new instance of itself and forks it, then crawls its subset of 50, and finally joins the forked task.
Now comes my problem: until each thread has worked off its 50, all four work quickly and at the same time. But afterwards two stop working and wait for the join, and only the other two keep working, creating new forks and crawling pages.
To visualize this I count how many URLs each thread crawls and let a JavaFX GUI show it.
What am I doing wrong, so that the ForkJoin framework only uses two of my four allowed threads? What can I do to change it?
Here is my compute method of the task:
LOG.debug(
Thread.currentThread().getId() + " Starting new Task with "
+ urlsToCrawl.size() + " left."
);
final ConcurrentLinkedQueue<D> urlsToCrawlSubset = new ConcurrentLinkedQueue<>();
for (int i = 0; i < urlsToCrawl.size() && i < config.getMaximumUrlsPerTask(); i++)
{
urlsToCrawlSubset.offer(urlsToCrawl.poll());
}
LOG.debug(
Thread.currentThread().getId() + " Crated a Subset with "
+ urlsToCrawlSubset.size() + "."
);
LOG.debug(
Thread.currentThread().getId()
+ " Now the Urls to crawl only left " + urlsToCrawl.size() + "."
);
if (urlsToCrawl.isEmpty())
{
LOG.debug(Thread.currentThread().getId() + " Crawling the subset.");
crawlPage(urlsToCrawlSubset);
}
else
{
LOG.debug(
Thread.currentThread().getId()
+ " Creating a new Task and crawling the subset."
);
final AbstractUrlTask<T, D> otherTask = createNewOwnInstance();
otherTask.fork();
crawlPage(urlsToCrawlSubset);
taskResults.addAll(otherTask.join());
}
return taskResults;
And here is a snapshot of my diagram:
P.S. If I allow up to 80 threads, it will use them until each has crawled 50 URLs and then it uses only two.
And if you're interested, here is the complete source code: https://github.com/mediathekview/MServer/tree/feature/cleanup
I fixed it. My error was that I split off a small portion, worked on it, and then waited, instead of splitting the work in half and calling myself again with the other half, and so on.
In other words, before I split and worked directly; the correct approach is to keep splitting until everything is split, and only then start working.
Here is how my code looks now:
@Override
protected Set<T> compute()
{
if (urlsToCrawl.size() <= config.getMaximumUrlsPerTask())
{
crawlPage(urlsToCrawl);
}
else
{
final AbstractUrlTask<T, D> rightTask = createNewOwnInstance(createSubSet(urlsToCrawl));
final AbstractUrlTask<T, D> leftTask = createNewOwnInstance(urlsToCrawl);
leftTask.fork();
taskResults.addAll(rightTask.compute());
taskResults.addAll(leftTask.join());
}
return taskResults;
}
private ConcurrentLinkedQueue<D> createSubSet(final ConcurrentLinkedQueue<D> aBaseQueue)
{
final int halfSize = aBaseQueue.size() / 2;
final ConcurrentLinkedQueue<D> urlsToCrawlSubset = new ConcurrentLinkedQueue<>();
for (int i = 0; i < halfSize; i++)
{
urlsToCrawlSubset.offer(aBaseQueue.poll());
}
return urlsToCrawlSubset;
}
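A self-contained sketch of the same split-in-half pattern, for anyone who wants to run it. It is not the project's code: HalvingTask, crawl, crawlAll and the threshold of 50 are stand-ins, and crawl only tags the URL instead of fetching it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class HalvingTask extends RecursiveTask<List<String>> {

    private static final int THRESHOLD = 50; // assumed maximum URLs per leaf task

    private final List<String> urls;

    HalvingTask(List<String> urls) {
        this.urls = urls;
    }

    @Override
    protected List<String> compute() {
        // Small enough: do the work directly.
        if (urls.size() <= THRESHOLD) {
            List<String> results = new ArrayList<>();
            for (String url : urls) {
                results.add(crawl(url));
            }
            return results;
        }
        // Otherwise split in half, fork one half and compute the other here.
        int mid = urls.size() / 2;
        HalvingTask left = new HalvingTask(urls.subList(0, mid));
        HalvingTask right = new HalvingTask(urls.subList(mid, urls.size()));
        left.fork();
        List<String> results = new ArrayList<>(right.compute());
        results.addAll(left.join());
        return results;
    }

    // Placeholder for the real crawling logic.
    private static String crawl(String url) {
        return "crawled:" + url;
    }

    static List<String> crawlAll(List<String> urls) {
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            return pool.invoke(new HalvingTask(urls));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 6000; i++) {
            urls.add("http://example.invalid/page" + i);
        }
        System.out.println(crawlAll(urls).size() + " URLs crawled");
    }
}
```

Because every task above the threshold immediately produces two subtasks, idle workers always have something to steal, which is what the earlier "take 50, then wait" version lacked.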

Pause execution of a loop in main method till all Threads finish Java 1.5

I am reading multiple arguments from the command line using Java 1.5. The arguments are names of flat files. I loop through the arguments in the main method and call a method which in turn creates a bunch of threads to process the file. I need to pause the loop until all threads processing the first argument complete, and then move on to create threads for the second argument. How can I queue the arguments or pause the loop execution in my main method until all threads processing the current argument complete?
Use thread pools and an Executor. Take a look at the java.util.concurrent package.
// create the pool once and reuse it for every argument
ExecutorService pool = Executors.newCachedThreadPool();
for (String argument : args) {
    // you said you want multiple threads to work on a single argument,
    // so create callables instead and hand them to the thread pool
    List<Callable<YourResult>> lstCallables = createCallablesFor(argument);
    // invokeAll() blocks until every callable has completed
    List<Future<YourResult>> futures = pool.invokeAll(lstCallables);
    for (Future<YourResult> future : futures) {
        // get() returns whatever your callable returned,
        // or throws ExecutionException if the callable failed
        future.get();
    }
    // at this point, all the threads working on the current argument are finished
    // and the next loop iteration works on the next argument
}
pool.shutdown(); // let the worker threads exit once all arguments are done
I wonder if you are looking for something like cyclic barriers.
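CyclicBarrier is one option. For a one-shot "wait until these N threads are done", a CountDownLatch (also in java.util.concurrent since Java 1.5) may be simpler. A rough sketch with made-up names (LatchPerArgument, processAll) and a placeholder where the file processing would go, written without newer language features so it stays close to Java 1.5:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LatchPerArgument {

    // Processes each argument with threadsPerArg threads and only moves on
    // to the next argument once all of them have finished.
    static List<String> processAll(String[] args, int threadsPerArg) {
        final List<String> events = Collections.synchronizedList(new ArrayList<String>());
        for (final String arg : args) {
            final CountDownLatch latch = new CountDownLatch(threadsPerArg);
            for (int i = 0; i < threadsPerArg; i++) {
                final int part = i;
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            // placeholder for the real work on one part of the file
                            events.add(arg + " part " + part);
                        } finally {
                            latch.countDown(); // signal completion even on failure
                        }
                    }
                }).start();
            }
            try {
                latch.await(); // the loop pauses here until all threads are done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return events;
            }
            events.add(arg + " done");
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : processAll(new String[] {"a.txt", "b.txt"}, 3)) {
            System.out.println(event);
        }
    }
}
```

The order of the "part" lines within one argument can vary between runs, but no line for the second file should appear before the first file's "done" line.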
Start the thread job for one argument inside the loop, so that each loop iteration starts the thread job for the next argument. The work itself is done in the thread job, wherever you defined it.
Example: this is just a snippet
for (int i = 0; i < count; i++) {
t[i] = new RunDemo();
String[] serverList = srv[i].split(",");
String logName = filename + "_" + serverList[0] + "_log";
String sql = "INSERT INTO .....(any query)";
t[i].setStr("sqlplus -L " + username[i] + "/" + password[i] + "@"
+ serverList[1] + ":" + serverList[2] + "/" + serverList[3]
+ " @" + filename1);
t[i].setLogName(logName);
t[i].setDirectory(dir);
try{
conn.UpdateQuery(sql);
log.info("Inserted into the table data with query " + sql);
}
catch (Exception e){
log.info("The data can't be inserted into table with " + e.getMessage() + " sql query " + sql);
}
new Thread(t[i]).start();
}
Here, every loop iteration creates and starts a new thread with a different serverList.
Now the job definition is given below:
public void run() {
JShell jshell = new JShell();
try {
log.info("Command is: " + this.str + " log name: " + this.LogName + " in directory: " + this.directory);
jshell.executeCommand(this.str, this.LogName, this.directory);
log.info("Executed command successfully");
} catch (Exception e1) {
log.info("Error at executing command with error stack: ");
e1.printStackTrace();
}
DBConnection conn1 = new DBConnection();
String sql = "UPDATE patcheventlog SET ENDTIME=SYSDATE WHERE LOGFILE='" + this.directory + this.LogName + "'";
try {
//conn1.callConnection("192.168.8.81", "d2he");
conn1.callConnection(ip, sid);
conn1.UpdateQuery(sql);
conn1.disposeConnection();
} catch (SQLException e) {
e.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
System.out.print(this.LogName);
}
So this is how you work with the threads inside the loop. You don't need to pause your loop.
Hope that helps.

Threads in Java, states? Also what is the right way to use them?

I have my exam presentation the day after tomorrow, so I need to get a few things straight beforehand, which I hope you can help me with.
First, I know that there are 4 thread states (i.e. Running, Ready, Blocked, Terminated), but I'm not quite sure how this works in Java. In my code I use Thread.sleep(3000) to do some waiting in the program; does this make the thread Blocked or Ready?
It has also come to my attention that I might not have used threads the right way. Let me show you some code:
public class BattleHandler implements Runnable {
private Player player;
private Monster enemyMonster;
private Dungeon dungeon;
private JTextArea log;
private GameScreen gScreen;
public void run() {
try {
runBattle();
}
catch(Exception e) { System.out.println(e);}
}
public BattleHandler(Player AttackingPlayer, JTextArea log, GameScreen gScreen) {
this.player = AttackingPlayer;
this.log = log;
this.gScreen = gScreen;
}
public void setDungeon(Dungeon dungeon) {
this.dungeon = dungeon;
}
public Dungeon getDungeon() {
return dungeon;
}
public Monster getEnemyMonster() {
return enemyMonster;
}
public void setMonster() {
// First check if dungeon have been init, if not we can't generate the mob
if(dungeon != null) {
enemyMonster = new Monster();
// Generate monster stats
enemyMonster.generateStats(dungeon);
}else {
System.out.println("Dungeon was not initialized");
}
}
public void runBattle() throws InterruptedException {
// Start battle, and run until a contester is dead.
while(player.getHealth() > 0 && enemyMonster.getHealth() > 0) {
int playerStrikeDmg = player.strike();
if(enemyMonster.blockDefend()) {
log.setText( log.getText() + "\n" + player.getName() +" tried to strike " + enemyMonster.getName()+ ", but " + enemyMonster.getName() + " Blocked.");
}else if(enemyMonster.dodgeDefend()) {
log.setText( log.getText() + "\n" + player.getName() +" tried to strike " + enemyMonster.getName()+ ", but " + enemyMonster.getName() + " Dodged.");
}else {
enemyMonster.defend(playerStrikeDmg);
log.setText( log.getText() + "\n" + player.getName() +" strikes " + enemyMonster.getName()+ " for: " + playerStrikeDmg + " left: "+ enemyMonster.getHealth());
}
if(enemyMonster.getHealth() < 1) break;
Thread.sleep(3000);
// Monster Turn
int monsterDmg = enemyMonster.strike();
if(player.blockDefend()) {
log.setText( log.getText() + "\n" + enemyMonster.getName() +" tried to strike " + player.getName()+ ", but " + player.getName()+ " Blocked.");
}else if(player.dodgeDefend()) {
log.setText( log.getText() + "\n" + enemyMonster.getName() +" tried to strike " + player.getName()+ ", but " + player.getName()+ " Dodged.");
}else {
player.defend(monsterDmg);
log.setText( log.getText() + "\n" + enemyMonster.getName() +" strikes " + player.getName()+ " for: " + monsterDmg + " left: "+ player.getHealth());
}
gScreen.updateBot();
Thread.sleep(3000);
}
When I coded this I thought it was cool, but I have seen some people make a class just for controlling the thread itself. I have just made the class that uses sleep implement Runnable (the rest of it is not shown in the code, as it is a big class).
It would be good to get this straight so I can point it out before they ask me about it, you know, to take away their ammunition. :D
Hope you guys can help me :).
Thx
Threads have more than 4 states. Also, I recommend reading Lesson: Concurrency for more information regarding threads.
Note that if you're looking to execute a task at a set interval, I highly recommend using the Executors framework.
Blocked: it will not run at all until the timeout expires. Ready means 'runnable now, but no processor is available to run it'; it will run as soon as a processor becomes available.
As all the other guys state, there are more than those, here's a simple listing:
Running - Guess what, it's running
Waiting - It waits for another thread to signal it (that's the wait() method in Java, woken up by notify()). Once signalled, such a thread can be run by the scheduler again, like the "ready" state threads.
Ready - Means that the Thread is ready for execution, once the OS-Scheduler turns to this Thread, it will execute it
Blocked - Means that there is another operation blocking this thread's execution, such as IO.
Terminated - Guess what, it's done and will be removed by the OS-Scheduler.
For a complete listing, look at the famous Wikipedia ;)
http://en.wikipedia.org/wiki/Process_state
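To see what Java itself reports, you can poll Thread.getState(). Java's Thread.State enum has six values (NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED), and a thread inside Thread.sleep(...) shows up as TIMED_WAITING rather than BLOCKED. A small sketch (the class and method names are invented for this example):

```java
import java.util.ArrayList;
import java.util.List;

class SleepStateDemo {

    // Records a thread's state before start, while it sleeps, and after it ends.
    static List<Thread.State> observe() {
        List<Thread.State> seen = new ArrayList<>();
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(3000);
            } catch (InterruptedException e) {
                // woken early by the observer below; just finish
            }
        });
        seen.add(sleeper.getState()); // not started yet

        sleeper.start();
        // Poll for up to 2 seconds until the thread has entered sleep().
        long deadline = System.currentTimeMillis() + 2000;
        while (sleeper.getState() != Thread.State.TIMED_WAITING
                && System.currentTimeMillis() < deadline) {
            Thread.yield();
        }
        seen.add(sleeper.getState()); // inside Thread.sleep

        sleeper.interrupt(); // no need to wait the full 3 seconds
        try {
            sleeper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        seen.add(sleeper.getState()); // run() has returned
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe());
    }
}
```

On a typical run this should print [NEW, TIMED_WAITING, TERMINATED]. Note that Java has no separate "Running" and "Ready" values; both are reported as RUNNABLE.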
