I want to submit my MapReduce job using the YARN Java API. I tried to follow the Writing YARN Applications guide, but I don't know what to add to the amContainer. Below is the code I have written:
package org.apache.hadoop.examples;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.GetNewApplicationResponse;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.util.Records;
import org.mortbay.util.ajax.JSON;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class YarnJob {
    private static Logger logger = LoggerFactory.getLogger(YarnJob.class);

    public static void main(String[] args) throws Throwable {
        Configuration conf = new Configuration();
        YarnClient client = YarnClient.createYarnClient();
        client.init(conf);
        client.start();
        System.out.println(JSON.toString(client.getAllQueues()));
        System.out.println(JSON.toString(client.getConfig()));
        //System.out.println(JSON.toString(client.getApplications()));
        System.out.println(JSON.toString(client.getYarnClusterMetrics()));
        YarnClientApplication app = client.createApplication();
        GetNewApplicationResponse appResponse = app.getNewApplicationResponse();
        ApplicationId appId = appResponse.getApplicationId();
        // Create launch context for app master
        ApplicationSubmissionContext appContext = Records.newRecord(ApplicationSubmissionContext.class);
        // set the application id
        appContext.setApplicationId(appId);
        // set the application name
        appContext.setApplicationName("test");
        // Set the queue to which this application is to be submitted in the RM
        appContext.setQueue("default");
        // Set up the container launch context for the application master
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        //amContainer.setLocalResources();
        //amContainer.setCommands();
        //amContainer.setEnvironment();
        appContext.setAMContainerSpec(amContainer);
        appContext.setResource(Resource.newInstance(1024, 1));
        appContext.setApplicationType("MAPREDUCE");
        // Submit the application to the applications manager
        client.submitApplication(appContext);
        //client.stop();
    }
}
I can run a MapReduce job properly from the command line:
hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount /user/admin/input /user/admin/output/
But how can I submit this wordcount job through the YARN Java API?
You do not use YarnClient to submit the job; instead, use the MapReduce APIs to submit it. See this link for an example.
However, if you need more control over the job, such as getting the completion status, mapper phase status, reducer phase status, etc., you can use
job.submit();
Instead of
job.waitForCompletion(true)
You can use the methods job.mapProgress() and job.reduceProgress() to get the status. There are lots of methods on the Job object which you can explore.
As for your query about
hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount /user/admin/input /user/admin/output/
What is happening here is that you are running your driver program, which is packaged in wordcount.jar. Instead of doing "java -jar wordcount.jar" you are using "hadoop jar wordcount.jar"; you could just as well use "yarn jar wordcount.jar". Hadoop/YARN sets up the necessary additional classpath entries compared to the java -jar command. This executes the main() of your driver program, which is in the class org.apache.hadoop.examples.WordCount, as specified in the command.
You can check out the source here: Source for WordCount class.
The only reason I would assume you want to submit the job via YARN is to integrate it with some kind of service which kicks off MapReduce 2 jobs on certain events.
For this you can always have your driver's main() look something like this:
public class MyMapReduceDriver extends Configured implements Tool {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        /******/
        int errCode = ToolRunner.run(conf, new MyMapReduceDriver(), args);
        System.exit(errCode);
    }

    @Override
    public int run(String[] args) throws Exception {
        while (true) {
            try {
                runMapReduceJob();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    private void runMapReduceJob() throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        /******/
        job.submit();
        // Poll the job state and report progress until it finishes
        while (job.getJobState() == JobStatus.State.RUNNING || job.getJobState() == JobStatus.State.PREP) {
            Thread.sleep(1000);
            System.out.println(" Map: " + StringUtils.formatPercent(job.mapProgress(), 0)
                    + " Reducer: " + StringUtils.formatPercent(job.reduceProgress(), 0));
        }
    }
}
Hope this helps.
I am trying to run the AWS CLI from a batch script to sync files with S3, then automatically close the cmd window.
In all my batch scripts that do not involve the AWS CLI, the Process.waitFor method causes the cmd window to close automatically once the process finishes, but this is not the case when I have an AWS CLI command in there.
The S3 sync finishes and I am left with an open cmd window, and the program will not continue until I manually close it.
Is there something special I need to do in order to make Process.waitFor work in this case, or otherwise automatically close the cmd window upon script completion?
This question is unique because the command normally returns just fine, but does not in the specific case of using the AWS CLI.
You're probably not reading the process output, so it's blocked trying to write to stdout.
This works for me:
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.CompletableFuture;
public class S3SyncProcess {
    public static void main(String[] args) throws IOException, InterruptedException {
        // sync dir
        Process process = Runtime.getRuntime().exec(
                new String[] {"aws", "s3", "sync", "dir", "s3://my.bucket"}
        );
        CompletableFuture.runAsync(() -> pipe(process.getInputStream(), System.out));
        CompletableFuture.runAsync(() -> pipe(process.getErrorStream(), System.err));
        // Wait for exit
        System.exit(process.waitFor());
    }

    private static void pipe(InputStream in, OutputStream out) {
        int c;
        try {
            while ((c = in.read()) != -1) {
                out.write(c);
            }
        } catch (IOException e) {
            // ignore
        }
    }
}
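If you don't need to process the output in Java, a simpler variant (just a sketch, assuming the same aws command line) is to let the child inherit this JVM's standard streams with ProcessBuilder.inheritIO(), so the pipes can never fill up:
import java.io.IOException;
public class S3SyncInherit {
    public static void main(String[] args) throws IOException, InterruptedException {
        // inheritIO() forwards the child's stdout/stderr straight to this JVM's streams,
        // so the child can never block on an unread pipe.
        Process process = new ProcessBuilder("aws", "s3", "sync", "dir", "s3://my.bucket")
                .inheritIO()
                .start();
        System.exit(process.waitFor());
    }
}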
I am working on a web application and I am stuck, so I need your help.
I developed a Java web application using the Struts framework. The application takes a source folder from the user, copies it into a unique directory on the server, and executes a batch process on each source folder; the batch writes its logs inside the respective folder.
Uploading and copying the source folder works fine, but the real issue I am facing is executing the batch concurrently.
When one user uploads a source folder and starts the batch execution, and at the same time another user also uploads a source folder and starts the batch, I need to know when the first user's batch completes and when the second user's completes. How do I track whether concurrent threads have completed or not? I am using Executor executor = Executors.newCachedThreadPool(); please find the code below.
private class BackgroundTask extends SwingWorker<Integer, String> {
    String srcpath;
    String log = "";

    BackgroundTask(String path) {
        this.srcpath = path;
    }

    private int status;

    public BackgroundTask() {
    }

    @Override
    protected Integer doInBackground() {
        try {
            final File batchFile = new File("D://chetan//bin//runner.bat");
            ProcessBuilder builder = new ProcessBuilder();
            builder.directory(new File(srcpath));
            String[] cmd = { "cmd", "/c", batchFile.getAbsolutePath(), "-X" };
            for (int x = 0; x < cmd.length; x++) {
                System.out.println("Command :" + cmd[x]);
            }
            builder.command(cmd);
            Process process;
            process = builder.start();
            InputStream is1 = process.getInputStream();
            InputStreamReader isr = new InputStreamReader(is1);
            BufferedReader br = new BufferedReader(isr);
            String line;
            File logfile;
            File logpath = new File(srcpath + File.separator + "LOG");
            if (logpath.isDirectory()) {
                logfile = new File(logpath + File.separator + "runner.log");
            } else {
                logpath.mkdir();
                logfile = new File(logpath + File.separator + "runner.log");
            }
            logfile.createNewFile();
            while ((line = br.readLine()) != null) {
                //appendText(line);
                log += line + "\n";
            }
            FileUtils.writeStringToFile(logfile, log);
            process.getInputStream().close();
            process.getOutputStream().close();
            process.getErrorStream().close();
            process.destroy();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return status;
    }

    @Override
    protected void process(java.util.List<String> messages) {
        // statusLabel.setText((this.getState()).toString());
    }

    @Override
    protected void done() {
    }
}
Second Class
public class RunnerAnalysisAction extends ActionSupport implements SessionAware {
    private BackgroundTask backgroundTask;

    public String execute() {
        String projectRoot = "D:\\Sample_DemoProjects\\DEMO_375530\\";
        backgroundTask = new BackgroundTask(projectRoot + ProjectName);
        Executor executor = Executors.newCachedThreadPool();
        executor.execute(backgroundTask);
        return SUCCESS;
    }
}
The above code works fine and creates the log file in the corresponding source folder, but I need to know when the batch completes its task, because once it does, the application will trigger an email to the user with the log. How do I know whether a particular task has completed or not? Please provide some sample code. Thanks.
You didn't specify the nature of your web application. If it is not clustered (i.e. running as a single instance), then keep a static flag (or an instance variable inside the ActionServlet, which is a singleton). Set this flag to true when a worker task from a previous user is already running and reset it to false when the thread completes. Use this flag to check whether another thread (from another user) can run: true means it can't run, false means it can. Alternatively, define the ExecutorService as a shared (static) instance, make it a single-threaded pool, and submit workers to it. Hope you got the idea.
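For example, a minimal sketch of the shared single-threaded executor idea (class, field and method names here are placeholders, not part of your code):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
public class BatchRunner {
    // One shared executor for the whole (non-clustered) web application
    private static final ExecutorService EXECUTOR = Executors.newSingleThreadExecutor();
    // Flag indicating whether a batch is currently running
    private static final AtomicBoolean RUNNING = new AtomicBoolean(false);

    public static boolean trySubmit(Runnable batchTask) {
        // Refuse the submission if a previous batch has not finished yet
        if (!RUNNING.compareAndSet(false, true)) {
            return false;
        }
        EXECUTOR.submit(() -> {
            try {
                batchTask.run();
            } finally {
                RUNNING.set(false); // allow the next user once this batch completes
            }
        });
        return true;
    }
}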
To track concurrent threads I have used java.util.concurrent.Future.
A Future represents the result of an asynchronous computation. Methods are provided to check if the computation is complete, to wait for its completion, and to retrieve the result of the computation. The result can only be retrieved using method get when the computation has completed, blocking if necessary until it is ready.
You can try this complete example
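A minimal sketch of the Future-based tracking (the Callable body and the e-mail step are placeholders for your batch and notification code):
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class FutureTrackingExample {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newCachedThreadPool();
        // Submit the batch as a Callable so we get a Future back
        Future<Integer> result = executor.submit((Callable<Integer>) () -> {
            // run the batch here and return its exit code
            return 0;
        });
        // isDone() lets you poll; get() blocks until the task has finished
        int exitCode = result.get();
        System.out.println("Batch finished with exit code " + exitCode);
        // at this point you could trigger the e-mail with the log
        executor.shutdown();
    }
}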
I'm launching an external application from within my Java application (8u11). However, the application becomes non-responsive to UI input under Windows XP and Windows 7, showing the standard hour glass/spinner.
I've narrowed this problem down to whether or not I use Process.waitFor(). If I call it I see the problem; if I don't, it works fine. The application also un-freezes if I then quit the Java application.
My question is: why is this the case? How can calling waitFor() possibly affect the internal workings of a child process? And how can I avoid this problem?
The application in question is LinPhone.exe, but I don't believe the issue is specific to that application; there must be some general way in which it handles standard IO etc. that I'm interfering with by calling waitFor().
I need to use Process.waitFor() so I can track when the application has exited.
I've simplified the issue to this SSCCE:
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
public class LinphoneTest {
    public static void main(String[] args) throws IOException,
            InterruptedException {
        String phoneAppPath = "C:\\Program Files\\Linphone\\bin\\linphone.exe";
        ProcessBuilder processBuilder = new ProcessBuilder(phoneAppPath);
        // move up from bin/linephone.exe
        File workingDir = new File(phoneAppPath).getParentFile()
                .getParentFile();
        processBuilder.directory(workingDir);
        processBuilder.redirectErrorStream();
        Process process = processBuilder.start();
        final BufferedReader stdout = new BufferedReader(
                new InputStreamReader(process.getInputStream()));
        String line = null;
        try {
            while (((line = stdout.readLine()) != null)) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        new Thread(() -> {
            try {
                process.waitFor();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }, "process wait").start();
        Thread.sleep(Long.MAX_VALUE);
    }
}
To summarise the discussion: I was not handling the standard error stream, causing linphone to lock up as its buffer for standard error became full, especially as linphone produces a lot of verbose output on standard error. The article "When Runtime.exec() won't" provides an excellent summary of the pitfalls involved in calling processes from Java.
There was also a typo: I was calling the non-standardly named getter
processBuilder.redirectErrorStream();
when I should have been calling the "setter"
processBuilder.redirectErrorStream(true);
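For reference, a corrected sketch of the launch code with that fix applied (same paths as in the SSCCE): merge stderr into stdout and keep draining it until the process exits, after which waitFor() returns promptly.
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
public class LinphoneFixed {
    public static void main(String[] args) throws Exception {
        String phoneAppPath = "C:\\Program Files\\Linphone\\bin\\linphone.exe";
        ProcessBuilder processBuilder = new ProcessBuilder(phoneAppPath);
        processBuilder.directory(new File(phoneAppPath).getParentFile().getParentFile());
        processBuilder.redirectErrorStream(true); // the actual setter: merge stderr into stdout
        Process process = processBuilder.start();
        try (BufferedReader stdout = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = stdout.readLine()) != null) {
                System.out.println(line); // keep draining so linphone never blocks on a full buffer
            }
        }
        System.out.println("linphone exited with code " + process.waitFor());
    }
}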
I want to prevent the user from running my Java application multiple times in parallel.
To prevent this, I create a lock file when the application is opened and delete the lock file when the application is closed.
While the application is running, you cannot open another instance of the jar. However, if you kill the application through the Task Manager, the window-closing event is not triggered and the lock file is not deleted.
How can I make sure the lock file method works, or what other mechanism could I use?
You could use a FileLock; this also works in environments where multiple users share ports:
String userHome = System.getProperty("user.home");
File file = new File(userHome, "my.lock");
try {
    FileChannel fc = FileChannel.open(file.toPath(),
            StandardOpenOption.CREATE,
            StandardOpenOption.WRITE);
    FileLock lock = fc.tryLock();
    if (lock == null) {
        System.out.println("another instance is running");
    }
} catch (IOException e) {
    throw new Error(e);
}
Also survives Garbage Collection.
The lock is released once your process ends, doesn't matter if regular exit or crash or whatever.
Similar discussion is at
http://www.daniweb.com/software-development/java/threads/83331
Bind a ServerSocket. If it fails to bind, then abort the startup. Since a ServerSocket can be bound only once, only a single instance of the program will be able to run.
And before you ask: no, just because you bind a ServerSocket does not mean you are open to network traffic. That only comes into effect once the program starts "listening" on the port with accept().
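For example (a sketch; the port number 49152 is arbitrary, pick one that is known to be free on your machines):
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;
public class SingleInstanceSocket {
    // Keep a reference so the socket stays bound for the lifetime of the JVM
    private static ServerSocket lockSocket;

    public static void main(String[] args) {
        try {
            // Bind to the loopback address only; accept() is never called,
            // so nothing is actually served on this port.
            lockSocket = new ServerSocket(49152, 1, InetAddress.getLoopbackAddress());
        } catch (BindException e) {
            System.err.println("Another instance is already running.");
            return;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        // ... application code ...
    }
}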
I see two options you can try:
Use a Java shutdown hook.
Have your lock file hold the main process id. The process should exist when you launch another instance; if it is not found in your system, you can assume that the lock can be dismissed and overwritten (a sketch of this idea follows).
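A minimal sketch of the second option, assuming Java 11+ and the ProcessHandle API (the lock file path and names are just examples):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
public class PidLockFile {
    public static void main(String[] args) throws IOException {
        Path lockFile = Path.of(System.getProperty("user.home"), "myapp.pid");
        if (Files.exists(lockFile)) {
            long oldPid = Long.parseLong(Files.readString(lockFile).trim());
            // Is the process that wrote the file still alive?
            Optional<ProcessHandle> previous = ProcessHandle.of(oldPid);
            if (previous.isPresent() && previous.get().isAlive()) {
                System.err.println("Another instance (pid " + oldPid + ") is already running.");
                return;
            }
            // Stale lock file from a killed instance: fall through and overwrite it.
        }
        Files.writeString(lockFile, String.valueOf(ProcessHandle.current().pid()));
        lockFile.toFile().deleteOnExit(); // best-effort clean-up on normal exit
        // ... application code ...
    }
}
Note that the check and the write are not atomic, so two instances started at exactly the same moment could race; the FileLock-based answers avoid that window.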
Creating a server socket bound to a specific port with a ServerSocket instance as the application starts is a straightforward way.
Note that ServerSocket.accept() blocks, so running it in its own thread makes sense, to avoid blocking the main thread.
Here is an example that throws an exception when another instance is detected:
public static void main(String[] args) {
    assertNoOtherInstanceRunning();
    ... // application code then
}

public static void assertNoOtherInstanceRunning() {
    new Thread(() -> {
        try {
            new ServerSocket(9000).accept();
        } catch (IOException e) {
            throw new RuntimeException("the application is probably already started", e);
        }
    }).start();
}
You could write the process id of the process that created the lock file into the file.
When you encounter an existing lock file, you do not just quit, but you check if the process with that id is still alive. If not, you can go ahead.
You can create a server socket like
new ServerSocket(65535, 1, InetAddress.getLocalHost());
at the very beginning of your code. Then, if a BindException ("Address already in use") is caught in the main block, you can display the appropriate message.
There is already a method available in the File class to achieve this: deleteOnExit(), which ensures the file is automatically deleted when the JVM exits. However, it does not cater for forcible terminations; one should use FileLock in the case of forcible termination.
For more details check https://docs.oracle.com/javase/7/docs/api/java/io/File.html
Thus a code snippet which could be used in the main method looks like this:
public static void main(String args[]) throws Exception {
    File f = new File("checkFile");
    if (!f.exists()) {
        f.createNewFile();
    } else {
        System.out.println("App already running");
        return;
    }
    f.deleteOnExit();
    // whatever your app is supposed to do
    System.out.println("Blah Blah");
}
..what other mechanism could I use?
If the app has a GUI, it can be launched using Java Web Start. The JNLP API provided to Web Start offers the SingleInstanceService. Here is my demo of the SingleInstanceService.
You can write something like this: if the lock file exists, try to delete it; if it cannot be deleted, you can say that the application is already running. Otherwise, create the same file again and redirect sysout and syserr to it.
This works for me.
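A sketch of that approach (the file name is just an example; it relies on the OS refusing to delete a file another process still holds open, which is the behaviour on Windows, so it is platform-dependent):
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintStream;
public class DeleteRecreateLock {
    public static void main(String[] args) throws FileNotFoundException {
        File lockFile = new File("application.lock");
        // If the file exists but cannot be deleted, another instance still has it open.
        if (lockFile.exists() && !lockFile.delete()) {
            System.err.println("The application is already running.");
            return;
        }
        // Recreate the file and keep it open by redirecting stdout/stderr into it.
        PrintStream log = new PrintStream(lockFile);
        System.setOut(log);
        System.setErr(log);
        // ... application code ...
    }
}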
Simple lock and advanced lock
I developed 2 solutions for this problem. I was also looking for an easy way of doing this without using any libraries and without a lot of code.
My solutions are based on https://stackoverflow.com/a/46705579/10686802, which I have improved upon. Therefore I would like to thank @akshaya pandey and @rbento.
Simple file lock
package YOUR_PACKAGE_NAME;
import java.io.File;
import java.io.IOException;
/**
 * Minimal reproducible example (MRE) - Example of a simple lock file.
 * @author Remzi Cavdar - ict@remzi.info - @Remzi1993
 */
public class Main {
    public static void main(String[] args) {
        /*
         * Prevents the user from starting multiple instances of the application.
         * This is done by creating a temporary file in the app directory.
         * The temp file should be excluded from git and is called App.lock in this example.
         */
        final File FILE = new File("App.lock");
        try {
            if (FILE.createNewFile()) {
                System.out.println("Starting application");
            } else {
                System.err.println("The application is already running!");
                return;
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        /*
         * Register a shutdown hook to delete the lock file when the application is closed. Even when forcefully closed
         * with the task manager. (Tested on Windows 11 with JavaFX 19)
         */
        FILE.deleteOnExit();
        // Whatever your app is supposed to do
    }
}
Advanced lock
package YOUR_PACKAGE_NAME;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
/**
 * Minimal reproducible example (MRE) - Example of a more advanced lock system.
 * @author Remzi Cavdar - ict@remzi.info - @Remzi1993
 */
public class Main {
    public static void main(String[] args) {
        /*
         * Prevents the user from starting multiple instances of the application.
         * This is done by creating a temporary file in the app directory.
         * The temp file should be excluded from git and is called App.lock in this example.
         */
        final File FILE = new File("App.lock");
        if (FILE.exists()) {
            System.err.println("The application is already running!");
            return;
        }
        try (
                FileOutputStream fileOutputStream = new FileOutputStream(FILE);
                FileChannel channel = fileOutputStream.getChannel();
                FileLock lock = channel.lock()
        ) {
            System.out.println("Starting application");
        } catch (FileNotFoundException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        /*
         * Register a shutdown hook to delete the lock file when the application is closed. Even when forcefully closed
         * with the task manager. (Tested on Windows 11 with JavaFX 19)
         */
        FILE.deleteOnExit();
        // Whatever your app is supposed to do
    }
}
Testing
Tested on: 31-10-2022
Tested OS: Windows 11 - Version 21H2 (OS Build 22000.1098)
Tested with: OpenJDK 19 - Eclipse Temurin JDK with Hotspot 19+36(x64)
I closed the application normally and also forcefully closed it with Task Manager on Windows; both times the lock file was deleted upon (force) close.
I struggled with this same problem for a while... none of the ideas presented here worked for me. In all cases, the lock (file, socket or otherwise) did not persist into the 2nd process instance, so the 2nd instance still ran.
So I decided to try an old-school approach: simply create a .pid file with the process id of the first process. Any second process then quits if it finds the .pid file and the process number specified in the file is confirmed to still be running. This approach worked for me.
There is a fair bit of code, which I provide here in full for your use... a complete solution.
package common.environment;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import javax.annotation.Nonnull;
import javax.annotation.Nullable;
import java.io.*;
import java.nio.charset.Charset;
public class SingleAppInstance
{
private static final @Nonnull Logger log = LogManager.getLogger(SingleAppInstance.class.getName());
/**
* Enforces that only a single instance of the given component is running. This
* is resilient to crashes, unexpected reboots and other forceful termination
* scenarios.
*
* @param componentName = Name of this component, for disambiguation with other
* components that may run simultaneously with this one.
* @return = true if the program is the only instance and is allowed to run.
*/
public static boolean isOnlyInstanceOf(@Nonnull String componentName)
{
boolean result = false;
// Make sure the directory exists
String dirPath = getHomePath();
try
{
FileUtil.createDirectories(dirPath);
}
catch (IOException e)
{
throw new RuntimeException(String.format("Unable to create directory: [%s]", dirPath));
}
File pidFile = new File(dirPath, componentName + ".pid");
// Try to read a prior, existing pid from the pid file. Returns null if the file doesn't exist.
String oldPid = FileUtil.readFile(pidFile);
// See if such a process is running.
if (oldPid != null && ProcessChecker.isStillAlive(oldPid))
{
log.error(String.format("An instance of %s is already running", componentName));
}
// If that process isn't running, create a new lock file for the current process.
else
{
// Write current pid to the file.
long thisPid = ProcessHandle.current().pid();
FileUtil.createFile(pidFile.getAbsolutePath(), String.valueOf(thisPid));
// Try to be tidy. Note: This won't happen on exit if forcibly terminated, so we don't depend on it.
pidFile.deleteOnExit();
result = true;
}
return result;
}
public static @Nonnull String getHomePath()
{
// Returns a path like C:/Users/Person/
return System.getProperty("user.home") + "/";
}
}
class ProcessChecker
{
private static final @Nonnull Logger log = LogManager.getLogger(ProcessChecker.class.getName());
static boolean isStillAlive(@Nonnull String pidStr)
{
String OS = System.getProperty("os.name").toLowerCase();
String command;
if (OS.contains("win"))
{
log.debug("Check alive Windows mode. Pid: [{}]", pidStr);
command = "cmd /c tasklist /FI \"PID eq " + pidStr + "\"";
}
else if (OS.contains("nix") || OS.contains("nux"))
{
log.debug("Check alive Linux/Unix mode. Pid: [{}]", pidStr);
command = "ps -p " + pidStr;
}
else
{
log.warn("Unsupported OS: Check alive for Pid: [{}] return false", pidStr);
return false;
}
return isProcessIdRunning(pidStr, command); // call generic implementation
}
private static boolean isProcessIdRunning(@Nonnull String pid, @Nonnull String command)
{
log.debug("Command [{}]", command);
try
{
Runtime rt = Runtime.getRuntime();
Process pr = rt.exec(command);
InputStreamReader isReader = new InputStreamReader(pr.getInputStream());
BufferedReader bReader = new BufferedReader(isReader);
String strLine;
while ((strLine = bReader.readLine()) != null)
{
if (strLine.contains(" " + pid + " "))
{
return true;
}
}
return false;
}
catch (Exception ex)
{
log.warn("Got exception using system command [{}].", command, ex);
return true;
}
}
}
class FileUtil
{
static void createDirectories(@Nonnull String dirPath) throws IOException
{
File dir = new File(dirPath);
if (dir.mkdirs()) /* If false, directories already exist so nothing to do. */
{
if (!dir.exists())
{
throw new IOException(String.format("Failed to create directory (access permissions problem?): [%s]", dirPath));
}
}
}
static void createFile(@Nonnull String fullPathToFile, @Nonnull String contents)
{
try (PrintWriter writer = new PrintWriter(fullPathToFile, Charset.defaultCharset()))
{
writer.print(contents);
}
catch (IOException e)
{
throw new RuntimeException(String.format("Unable to create file at %s! %s", fullPathToFile, e.getMessage()), e);
}
}
static @Nullable String readFile(@Nonnull File file)
{
try
{
try (BufferedReader fileReader = new BufferedReader(new FileReader(file)))
{
StringBuilder result = new StringBuilder();
String line;
while ((line = fileReader.readLine()) != null)
{
result.append(line);
if (fileReader.ready())
result.append("\n");
}
return result.toString();
}
}
catch (IOException e)
{
return null;
}
}
}
To use it, simply invoke it like this:
if (!SingleAppInstance.isOnlyInstanceOf("my-component"))
{
    // quit
}
I hope you find this helpful.
Finally, I found a really simple library to achieve this. You can use JUnique.
The JUnique library can be used to prevent a user to run at the same
time more instances of the same Java application.
This is an example of how to use it, from the documentation:
public static void main(String[] args) {
    String appId = "myapplicationid";
    boolean alreadyRunning;
    try {
        JUnique.acquireLock(appId);
        alreadyRunning = false;
    } catch (AlreadyLockedException e) {
        alreadyRunning = true;
    }
    if (!alreadyRunning) {
        // Start sequence here
    }
}
Here is a pretty rudimentary approach.
If your application is launched from a script, check the running java/javaw processes and their command lines before launching.
On Windows:
REM check if there is a javaw process running your.main.class
REM if found, go to the end of the script and skip the launch of a new instance
WMIC path win32_process WHERE "Name='javaw.exe'" get CommandLine 2>nul | findstr your.main.class >nul 2>&1
if %ERRORLEVEL% EQU 0 goto:eof
javaw your.main.class