Java multi-threading

I'm executing a batch file using the java command, reading batch data from a text file, and putting it into a database. For example, I have to run it for 430 nodes within a 15-minute interval using the same batch file. So I divided the 430 nodes among 12 threads, so each thread handles about 40 nodes, pointing at the same batch file. But the threads running in parallel are not able to wait for the batch-file command to complete. I can't make each thread wait, because all tasks must complete within 15 minutes. Any suggestions?
Below is piece of code running multi-threading.
for (int i = 0; i < noOfMainThreads; i++) {
    // creating 12 threads, one per group of 40 nodes
    runnableArr[i] = new CodeBatchfile(nodeArr, nodeidArr); // create the Runnable before wrapping it in a Thread
    threadArr[i] = new Thread(runnableArr[i]);
}
for (int i = 0; i < noOfMainThreads; i++) {
    threadArr[i].start(); // start() is a method call, not a field
}
class CodeBatchfile implements Runnable {
    public void run() {
        for (int i = 1; i < nodename.length; i++) {
            // executing the batch file for each node handled by this thread
            cmd = filepath + " " + nodenamelocal;
            try {
                process = Runtime.getRuntime().exec(cmd, null, bdir);
                process.waitFor();
            }
            catch (Exception ex) {
                System.out.println("Exception running batch file: " + ex.getLocalizedMessage());
            }
        }
    }
}

Use an ExecutorService instead. Build a pipeline where each step works like this:
Create a job object which has all the information needed to do the task and which contains fields for the results. Create all the job objects and put them into the queue for the service to run.
So the first step would be to create 430 jobs to run the batch program. Each job would start the batch program and wait for it to terminate. After the batch terminates, you read the output and put it into the job instance.
Create an executor which runs N jobs in parallel. You will need to tune N; if it's a CPU-intensive task, N == number of cores. If it's an IO-intensive job, try higher values (2-4 times the number of CPU cores usually works well).
Put all the jobs into the executor's queue. Wait for jobs to finish, create new jobs from them, and put those into the executor's input queue.
Keep a job counter (started, finished) so you know when to stop.
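The pipeline described above could be sketched like this (a minimal sketch: the job class, the echo placeholder command, and the pool size are assumptions standing in for the asker's batch file and node list):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class BatchPipeline {

    // A job holds everything needed to run one node, plus a field for the result.
    static class BatchJob implements Callable<BatchJob> {
        final String node;   // input
        int exitCode = -1;   // result, filled in after the process ends

        BatchJob(String node) { this.node = node; }

        @Override
        public BatchJob call() throws Exception {
            // Placeholder command; the asker would run their batch file here,
            // e.g. new ProcessBuilder(filepath, node).
            Process p = new ProcessBuilder("echo", node).inheritIO().start();
            exitCode = p.waitFor(); // wait for THIS process only; other jobs run in parallel
            return this;
        }
    }

    // Run all jobs with at most `parallelism` batch processes at a time.
    static List<BatchJob> runAll(List<String> nodes, int parallelism) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        List<BatchJob> jobs = new ArrayList<>();
        for (String n : nodes) jobs.add(new BatchJob(n));
        pool.invokeAll(jobs); // blocks until every job has finished
        pool.shutdown();
        return jobs;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < 10; i++) nodes.add("node" + i);
        for (BatchJob j : runAll(nodes, 4)) {
            System.out.println(j.node + " -> exit " + j.exitCode);
        }
    }
}
```

invokeAll blocks until every submitted job is done, so the 15-minute budget is spent running at most `parallelism` batch processes at a time rather than waiting for each one sequentially.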

I think you should use a CyclicBarrier. A barrier allows you to wait at a specific point until all the threads have reached it, so after executing the batch each thread should call await() on the CyclicBarrier.
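A minimal sketch of the barrier idea (the thread count, the sleep standing in for the batch run, and the counter used to observe the barrier action are all made up for illustration):

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierDemo {
    // Runs n workers; the barrier action fires exactly once, after all n arrive.
    static int runWorkers(int n) throws InterruptedException {
        AtomicInteger barrierActions = new AtomicInteger();
        CyclicBarrier barrier = new CyclicBarrier(n, barrierActions::incrementAndGet);
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                try {
                    Thread.sleep(20L * id); // stand-in for running the batch file
                    barrier.await();        // wait here until every thread has finished
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        return barrierActions.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("barrier action ran " + runWorkers(4) + " time(s)");
    }
}
```

The barrier action runs exactly once, after the last thread arrives, which is a convenient place to put "all batches finished" logic.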


Java executor service: Waiting for all tasks to finish

I am trying to introduce concurrency into my program. The structure of the program is something like this:
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(5);
List<String> initialData = dao.fetchFromDB("input");
Queue queue = new MyQueue();
queue.add(initialData);
boolean missionAccomplished = false;
while (queue.length() > 0) {
    int startingLength = queue.length();
    for (int i = 0; i < startingLength; i++) {
        String input = queue.remove();
        if (input.equals("some value")) {
            missionAccomplished = true;
            break;
        } else {
            MyRunnable task = new MyRunnable(input, queue, dao);
            executor.execute(task);
        }
    }
    if (missionAccomplished) {
        break;
    }
}
executor.shutdown(); // shut down once, after the loop; shutting down inside the loop would reject later tasks
So the queue contains the data that needs to be processed one by one. Inside the while loop I run a for loop which picks data from the queue sequentially and performs some check on it; if the check fails, I create a runnable task with this data and hand it over to the executor (as the DB operation is time-consuming, I want to use parallelism for it). The for loop picks data only up to a certain length in a given iteration of the while.
What I want to achieve is that 'while' loop goes to next iteration only when all tasks submitted to executor in current iteration are finished.
How can this be achieved?
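One conventional way to get that per-iteration waiting, separate from the Project Loom approach discussed in the answer, is ExecutorService.invokeAll, which blocks until every task in the submitted batch has completed. A sketch with a made-up stand-in for the DB work (each task shortens its input by one character; the missionAccomplished check is omitted for brevity):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class PerIterationWait {

    // Processes items level by level; each while-iteration blocks until all
    // tasks submitted in that iteration are done, then moves on.
    static int processLevels(List<String> initial) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        List<String> queue = new ArrayList<>(initial);
        int iterations = 0;
        while (!queue.isEmpty()) {
            iterations++;
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (String input : queue) {
                tasks.add(() -> {
                    // stand-in for the DB work: shorten the string by one char
                    return input.length() > 1 ? List.of(input.substring(1)) : List.<String>of();
                });
            }
            queue.clear();
            // invokeAll returns only when every task in this batch has finished.
            for (Future<List<String>> f : executor.invokeAll(tasks)) {
                try {
                    queue.addAll(f.get());
                } catch (ExecutionException e) {
                    // log and skip the failed task
                }
            }
        }
        executor.shutdown(); // shut down once, after the loop
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("iterations: " + processLevels(List.of("abcd", "xy")));
    }
}
```

Each pass of the while loop only begins after invokeAll has returned, i.e. after all of the previous iteration's tasks are done.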
try-with-resources in Project Loom
You asked:
What I want to achieve is that 'while' loop goes to next iteration only when all tasks submitted to executor in current iteration are finished.
Project Loom promises to make this simpler.
One of the changes brought by Project Loom is that the ExecutorService interface is a sub-interface of AutoCloseable. This means we can use try-with-resources syntax. The try-with-resources automatically blocks until all submitted tasks are done/failed/canceled — just what you asked for.
Also, the executor service is automatically shut down when exiting the try. These changes mean your code becomes simpler and clearer.
Also, for code that blocks often, such as database access, you will see dramatically faster performance using virtual threads (a.k.a. fibers). Virtual threads are another new feature of Project Loom. To get this feature, call Executors.newVirtualThreadExecutor.
Experimental builds of Project Loom are available now, based on early-access Java 17. The Loom team is asking for feedback. For more info, see recent presentations and interviews by Ron Pressler of Oracle.
System.out.println( "INFO - executor service about to start. " + Instant.now() );
try (
ExecutorService executorService = Executors.newVirtualThreadExecutor() ;
)
{
for ( int i = 0 ; i < 7 ; i++ )
{
executorService.submit( ( ) -> System.out.println( Instant.now() ) );
}
}
// Notice that on reaching this point we block until all submitted tasks still running are finished,
// because that is the new behavior of `ExecutorService` being `AutoCloseable`.
System.out.println( "INFO - executor service shut down at this point. " + Instant.now() );
When run:
INFO - executor service about to start. 2021-02-08T06:27:03.500093Z
2021-02-08T06:27:03.554440Z
2021-02-08T06:27:03.554517Z
2021-02-08T06:27:03.554682Z
2021-02-08T06:27:03.554837Z
2021-02-08T06:27:03.555015Z
2021-02-08T06:27:03.555073Z
2021-02-08T06:27:03.556675Z
INFO - executor service shut down at this point. 2021-02-08T06:27:03.560723Z

How does BufferedReader readLine work in a continuous program?

So I know I can use readLine to get the program output line by line, but if I use a while loop:
String l, rcv = "";
while ((l = input.readLine()) != null)
    rcv = rcv + l;
return rcv;
But this freezes my program until the external process finishes giving output, and it can take a long time for the external program to exit. I want to listen to the output as the external process produces it.
I tried using read(), but it also freezes my program until the end. How can I read whatever output is available, do my processing, and then go back to reading output again?
You can use a separate thread to read the input stream. The idea is that the blocking operations should happen in a separate thread, so your application main thread is not blocked.
One way to do that is submitting a Callable task to an executor:
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<String> processOutput = executor.submit(() -> {
    // your code to read the stream goes here
    String l;
    StringBuilder rcv = new StringBuilder();
    while ((l = input.readLine()) != null) {
        rcv.append(l);
    }
    return rcv.toString();
});
This returns a "future" which is a way to represent a value that may not be available now but might be at some point in the future. You can check if the value is available now, or wait for the value to be present with a timeout, etc.
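For instance (a sketch: the sleeping task is a stand-in for the blocking readLine() loop), the caller can wait with a timeout instead of blocking indefinitely:

```java
import java.util.concurrent.*;

public class FutureDemo {
    // Submits a slow task and waits for its result with a timeout instead of blocking forever.
    static String readWithTimeout() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> processOutput = executor.submit(() -> {
            Thread.sleep(200); // stand-in for blocking readLine() calls
            return "collected output";
        });
        try {
            // Block for at most 5 seconds, then give up instead of hanging forever.
            return processOutput.get(5, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            processOutput.cancel(true); // interrupt the reader if it took too long
            return null;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("got: " + readWithTimeout());
    }
}
```

The main thread stays responsive while the reader thread blocks; isDone() can also be polled between other work.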

Run a command based program at a custom date-time (Add/modify/delete)

I have a Python script which takes a few params as arguments, and I need to run tasks based on this script at a given date and time with other params. I am making a UI to add/modify/delete such tasks with all the given params. How do I do it? Is there any tool available? I don't think crontabs are the best solution to this, especially due to the frequent need for task modification/deletion. The requirement is for a Linux machine.
One solution could be: create an API that reads all the tasks stored in the DB and executes the Python script on schedule, and call that API every few minutes via crontab.
But I am looking for a better solution. Suggestions are welcome.
I am assuming that all the (command-line) arguments are known beforehand, in which case you have a couple of options:
Use a Python scheduler to programmatically schedule your tasks without cron. This scheduler script can be run either as a daemon or via a cron job so that it runs all the time.
Use a Python crontab module to modify cron jobs from the Python program itself.
If the arguments to the scripts are generated dynamically at various scheduled times (or are user-provided), then the only option is to use a GUI to get the updated arguments and run a Python script to modify the cron jobs.
from datetime import datetime, timedelta
from threading import Timer

x = datetime.today()
# 1:00 am tomorrow; timedelta handles month/year boundaries,
# unlike replace(day=x.day + 1), which fails on the last day of a month
y = (x + timedelta(days=1)).replace(hour=1, minute=0, second=0, microsecond=0)
delta_t = y - x
secs = delta_t.total_seconds()  # .seconds alone would drop whole days

def hello_world():
    print("hello world")
    # ...

t = Timer(secs, hello_world)
t.start()
This will execute the function at 1 a.m. the next day.
You could do it with timer units in systemd. What are the advantages over cron?
Dependencies on other services can be defined, so that either other services must be executed first for a service to be started at all (Requires), or a service is not started if it would get into conflict with another service currently running (Conflicts).
Relative times are possible: you can have a timer unit start a service every ten minutes after its last execution. This eliminates overlapping service calls that at some point block the CPU because the cron interval is too low.
Since timer units are themselves services, they can be elegantly activated or deactivated, while cron jobs can only be deactivated by commenting them out or deleting them.
The specification of times and time spans is easier to understand than cron's.
Here is an example:
File: /etc/systemd/system/testfile.service
[Unit]
Description=Description of your app.
[Service]
User=yourusername
ExecStart=/path/to/yourscript
The Timer Unit specifies that the service unit defined above is to be started 30 minutes after booting and then ten minutes after its last activity.
File: /etc/systemd/system/testfile.timer
[Unit]
Description=Some description of your task.
[Timer]
OnBootSec=30min
OnUnitInactiveSec=10min
Persistent=true
Unit=testfile.service
[Install]
WantedBy=timers.target
One solution would be to have a daemon running in the background, waking up regularly to execute the due tasks.
It would sleep x minutes, then query the database for all the not-yet-done tasks whose due datetime is earlier than the current datetime. It would execute those tasks, mark them as done, save the results, and go back to sleep.
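The wake-up step could be sketched like this (in Java, matching the language of the asker's eventual solution; the Task class and the in-memory list are stand-ins for the real DB table):

```java
import java.time.Instant;
import java.util.*;

public class TaskPoller {
    static class Task {
        final Instant due;
        final String command;
        boolean done;

        Task(Instant due, String command) {
            this.due = due;
            this.command = command;
        }
    }

    // One wake-up of the daemon: run every not-yet-done task whose due time has passed.
    static int runDueTasks(List<Task> tasks, Instant now) {
        int ran = 0;
        for (Task t : tasks) {
            if (!t.done && !t.due.isAfter(now)) {
                // here the real daemon would exec t.command and save the result
                t.done = true;
                ran++;
            }
        }
        return ran;
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>(List.of(
                new Task(Instant.now().minusSeconds(60), "python script.py --a"),
                new Task(Instant.now().plusSeconds(3600), "python script.py --b")));
        System.out.println("ran " + runDueTasks(tasks, Instant.now()) + " task(s)");
    }
}
```

Because tasks live in the DB rather than in crontab, add/modify/delete from the UI is just a row update, and the next wake-up picks it up automatically.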
You can also use serverless computation, such as AWS Lambda which can be triggered by scheduled events. It seems to support the crontab notation or similar but you could also add the next event every time you finish one run.
I found the answer to this myself: Timers. Since my experience and use case were in Java, I used them by creating a REST API in Spring and managing an in-memory cache of timers in the Java layer as a copy of the DB. One can use timers in any language to achieve something similar. Now I can run any console-based application and pass all the required arguments inside the respective timer. Similarly, I can update or delete any timer by simply calling the .cancel() method on the respective timer from the hashmap.
public static ConcurrentHashMap<String, Timer> PostCache = new ConcurrentHashMap<>();

public String Schedulepost(Igpost igpost) throws ParseException {
    String res = "";
    TimerTask task = new TimerTask() {
        public void run() {
            System.out.println("Sample timer-based task performed on: " + new Date()
                    + "\nThread's name: " + Thread.currentThread().getName());
            System.out.println(igpost.getPostdate() + " " + igpost.getPosttime());
        }
    };
    DateFormat dateFormatter = new SimpleDateFormat("yyyy-MM-dd HH:mm");
    Date date = dateFormatter.parse(igpost.getPostdate() + " " + igpost.getPosttime());
    Timer timer = new Timer(igpost.getImageurl());
    CacheHelper.PostCache.put(igpost.getImageurl(), timer);
    timer.schedule(task, date);
    return res;
}
Thank you everybody for the suggestions.
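The schedule/cancel bookkeeping described above could be reduced to roughly this (a sketch: class and method names are made up, mirroring the PostCache idea from the snippet):

```java
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

public class TimerCache {
    static final ConcurrentHashMap<String, Timer> cache = new ConcurrentHashMap<>();

    // Schedule a one-shot task under a key so it can be cancelled or replaced later.
    static void schedule(String key, Date when, Runnable action) {
        Timer timer = new Timer(key);
        timer.schedule(new TimerTask() {
            public void run() {
                action.run();
            }
        }, when);
        cache.put(key, timer);
    }

    // Delete: cancel the timer and drop it from the cache.
    static boolean cancel(String key) {
        Timer t = cache.remove(key);
        if (t == null) return false;
        t.cancel();
        return true;
    }
}
```

Updating a task is then cancel-plus-schedule under the same key, which keeps the in-memory cache consistent with the DB row.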

Camunda - executing processes in specific order

Let's say that we have business process A. Process A might take more or less time (it's not known in advance).
Normally you can have multiple A processes running, but sometimes, during certain operations, we need to make sure that one process execution starts only after the previous one has finished.
How can we achieve this in Camunda? I tried to find something like a process dependency (so a process starts after the previous one completes) but couldn't find anything.
I thought about adding a variable to the process (like depending_process) and checking whether the specified process is done, but maybe there is a better solution.
OK, after some research I found a solution.
At the beginning of every process I check for processes started for the same customer:
final DateTime selfOrderDate = (DateTime) execution.getVariable(PROCESS_ORDER_DATE);

List<ProcessInstance> processInstanceList = execution
        .getProcessEngineServices()
        .getRuntimeService()
        .createProcessInstanceQuery()
        .processDefinitionId(execution.getProcessDefinitionId())
        .variableValueEquals(CUSTOMER_ID, execution.getVariable(CUSTOMER_ID))
        .active()
        .list();

int processesOrderedBeforeCurrentCount = 0;
for (ProcessInstance processInstance : processInstanceList) {
    if (processInstance.getId().equals(execution.getId()))
        continue;
    ExecutionEntity entity = (ExecutionEntity) processInstance;
    DateTime orderDate = (DateTime) entity.getVariable(PROCESS_ORDER_DATE);
    if (selfOrderDate.isAfter(orderDate)) {
        processesOrderedBeforeCurrentCount += 1;
    }
}
Then I save the number of previously started processes to a Camunda variable, and in the next task I check whether it equals 0. If yes, I proceed; if not, I wait 1 s (using a Camunda timer) and check again.

Waiting for a single Shell script in a series of scripts to run to completion before continuing (Java, MySQL, JUnit)

I'm working on a Java program that uses Process and Runtime to run several shell scripts for automated testing. All but one of these scripts run fine; that one causes an issue with the script calls following it. For example:
process = runtime.exec("A.sh");
process = runtime.exec("B.sh");
process = runtime.exec("C.sh");
A.sh runs fine and only takes a few seconds. B.sh, however, takes a couple of minutes, and I think this causes a problem with running C.sh, since they both interact with the same MySQL table and the overlap causes a Communications Link Failure.
Without overloading you with unnecessary information, my question is: how can I ensure a shell script has run to completion/termination before moving on to the next exec() call?
What I've tried:
process.waitFor()
This doesn't work; I don't think it waits until the script is completely done.
process.wait(long time_period)
This doesn't work either, since it causes the current thread to wait, which results in the remaining shell-script calls being skipped and the next test case beginning prematurely.
The shell script I call that causes the problem is not a simple script, but I didn't write it myself and have little understanding of what it does behind the scenes. The only relevant information I have about it is that it directly connects to the MySQL database in question whereas my program uses java.sql.* to (I believe) remotely connect (although it is a local database on a remote machine).
Edit:
After following a suggestion, I've looked into the Apache Commons Exec and tried a new strategy, unsuccessfully.
ExecuteWatchdog watchdog = new ExecuteWatchdog(300000); // five minutes
CommandLine cmdLine = CommandLine.parse("./directory/shell.sh");
DefaultExecutor executor = new DefaultExecutor();
executor.setExitValue(0);
executor.setWatchdog(watchdog);
int exitVal = executor.execute(cmdLine);
// A line to log the exit val in another file
My log gives no indication that the shell script was actually run, as the time between a logged statement saying "shell.sh begins" and "test 2 starts" is essentially the same instant, which means the roughly two-minute process that shell.sh runs never happens. Where did I go wrong?
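For reference, a plain-JDK sketch of running scripts strictly one after another: ProcessBuilder with inheritIO() keeps the child's output flowing to the console (an unread, filling output pipe is a common reason a waitFor() appears to misbehave), and waitFor() blocks until that script terminates before the next one starts. The echo commands are placeholders for the real scripts:

```java
import java.util.List;

public class SequentialScripts {
    // Runs each command fully to completion before starting the next one.
    static int[] runInOrder(List<List<String>> commands) throws Exception {
        int[] exitCodes = new int[commands.size()];
        for (int i = 0; i < commands.size(); i++) {
            Process p = new ProcessBuilder(commands.get(i))
                    .inheritIO() // let output flow to the console so the pipe never fills up
                    .start();
            exitCodes[i] = p.waitFor(); // block until THIS script terminates
        }
        return exitCodes;
    }

    public static void main(String[] args) throws Exception {
        // In the asker's setup these would be the real scripts, e.g. List.of("./A.sh").
        int[] codes = runInOrder(List.of(List.of("echo", "A.sh done"),
                                         List.of("echo", "B.sh done")));
        for (int c : codes) System.out.println("exit " + c);
    }
}
```

Each exit code can be checked before launching the next script, so a failing B.sh can abort the run instead of colliding with C.sh.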
I use Apache Commons Exec. It has synchronous and asynchronous execution support, and an execution timeout can be set.
First paragraph from their tutorial page:
At this point we can safely assume that you would like to start some
subprocesses from within your Java application and you spent some time
here to do it properly. You look at Commons Exec and think "Wow -
calling Runtime.exec() is easy and the Apache folks are wasting their
and my time with tons of code". Well, we learned it the hard way (in
my case more than once) that using plain Runtime.exec() can be a
painful experience. Therefore you are invited to delve into
commons-exec and have a look at the hard lessons the easy way ...
Advanced usage example (some code is missing, like BusinessException and StreamUtil.closeQuietly, but it can easily be replaced):
ExecuteWatchdog watchdog = new ExecuteWatchdog(EXECUTION_TIMEOUT_IN_MS);
DefaultExecutor executor = new DefaultExecutor();
executor.setWatchdog(watchdog);
executor.setExitValue(0);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
ByteArrayOutputStream errorStream = new ByteArrayOutputStream();
executor.setStreamHandler(new PumpStreamHandler(outputStream, errorStream));
try {
    log.info(commandLine.toString());
    int exitCode = executor.execute(commandLine, (Map<?, ?>) null);
    if (exitCode != 0)
        throw new BusinessException("Process exited with non-zero exit code.");
    return outputStream.toString();
} catch (ExecuteException e) {
    String errorStreamStr = null;
    if (errorStream.size() != 0)
        errorStreamStr = errorStream.toString();
    StringBuilder errorMessageBuilder = new StringBuilder();
    errorMessageBuilder.append("main.error").append(":\n").append(e.getMessage()).append("\n\n");
    if (errorStreamStr != null) {
        errorMessageBuilder.append("additional.error").append(":\n").append(errorStreamStr).append("\n\n");
    }
    errorMessageBuilder.append("command.line").append(":\n").append(commandLine.toString());
    if (log.isDebugEnabled())
        log.debug(errorMessageBuilder.toString());
    throw new BusinessException(errorMessageBuilder.toString());
} catch (IOException e) {
    throw new IllegalStateException(e);
} finally {
    StreamUtil.closeQuietly(outputStream, errorStream);
}
