Java maximum heap size on SGE cluster

Java maximum heap size on SGE cluster - java

I have a pipeline written in python calling some processes in Java. The pipeline runs with two possible modes, on local mode (on a single node) or on SGE cluster.
When I set the option to cluster mode, the error message in the logs are such this
Invalid maximum heap size: -Xmx4g -jar
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
In local mode, there is no error and no problems.
My question is what can cause such an error ?
My class to run jobs either locally or on cluster is as follow
class LocalJobManager(JobManager):
def __init__(self):
self.cmd_strs = []
def add_job(self, cmd, cmd_args, **kwargs):
cmd_str = ' '.join([cmd, ] + [str(x) for x in cmd_args])
self.cmd_strs.append(cmd_str)
def run_job(self, cmd, cmd_args, **kwargs):
cmd_str = ' '.join([cmd, ] + [str(x) for x in cmd_args])
self._run_cmd(cmd_str)
def wait(self):
for cmd_str in self.cmd_strs:
self._run_cmd(cmd_str)
def _run_cmd(self, cmd_str):
'''
Throw exception if run command fails
'''
process = subprocess.Popen(cmd_str, stdin=subprocess.PIPE, shell=True)
process.stdin.close()
sts = os.waitpid(process.pid, 0)
if sts[1] != 0:
raise Exception('Failed to run {0}\n'.format(cmd_str))
class ClusterJobManager(JobManager):
def __init__(self, log_dir=None):
import drmaa
self._drmaa = drmaa
self.log_dir = log_dir
if self.log_dir is not None:
make_directory(self.log_dir)
self.session = self._drmaa.Session()
self.session.initialize()
self.job_ids = Queue()
self._lock = threading.Lock()
def add_job(self, cmd, cmd_args, mem=4, max_mem=10, num_cpus=1):
job_id = self._run_job(cmd, cmd_args, mem, max_mem, num_cpus)
self.job_ids.put(job_id)
def run_job(self, cmd, cmd_args, mem=4, max_mem=10, num_cpus=1):
job_id = self._run_job(cmd, cmd_args, mem, max_mem, num_cpus)
self._check_exit_status(job_id)
def wait(self):
self._lock.acquire()
job_ids = []
while not self.job_ids.empty():
job_ids.append(self.job_ids.get())
self.session.synchronize(job_ids, self._drmaa.Session.TIMEOUT_WAIT_FOREVER, False)
self._lock.release()
for job_id in job_ids:
self._check_exit_status(job_id)
def close(self):
self.session.control(self._drmaa.Session.JOB_IDS_SESSION_ALL, self._drmaa.JobControlAction.TERMINATE)
self.session.exit()
def _run_job(self, cmd, cmd_args, mem, max_mem, num_cpus):
job_template = self._init_job_template(cmd, cmd_args, mem, max_mem, num_cpus)
job_id = self.session.runJob(job_template)
self.session.deleteJobTemplate(job_template)
return job_id
def _init_job_template(self, cmd, cmd_args, mem, max_mem, num_cpus):
job_template = self.session.createJobTemplate()
job_template.remoteCommand = cmd
job_template.args = [str(x) for x in cmd_args]
job_template.workingDirectory = os.getcwd()
if self.log_dir is not None:
job_template.errorPath = ':' + self.log_dir
job_template.outputPath = ':' + self.log_dir
job_template.nativeSpecification = '-l mem_free={mem}G,mem_token={mem}G,h_vmem={max_mem}G -V -w n -pe ncpus {num_cpus}'.format(**locals())
return job_template
def _check_exit_status(self, job_id):
return_value = self.session.wait(job_id, self._drmaa.Session.TIMEOUT_WAIT_FOREVER)
if return_value.exitStatus != 0:
raise Exception('Job {0} failed with exit status {1}.'.format(return_value.jobId,
return_value.exitStatus))
Usually the Could not create the Java Virtual Machine (as I am reading though some forum) is caused by syntax error, even though the command called is correct and works locally, besides the class to run jobs on the cluster show above, runs for everything except Java
Thanks

I've run into this on SGE. You may have a default hard memory limit set around 4GB and Java seems to use a bit more than that 4GB you are setting in the -Xmx4g parameter during initialization. Can you see if you administrator has set a hard memory limit? Typically you will set or override the default limit using:
qsub -l h_vmem=16G
Try giving far more memory than needed via that parameter and see if that solves it and then ratchet down the h_vmem as far down as it goes without crashing.

Related

Activating virtualenv via Java ProcessBuilder

Getting the following when trying to programmatically activate Python's virtualenv via the code below:
java.io.IOException: Cannot run program "." (in directory "/Users/simeon.../..../reporting"): error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at VirtualEnvCreateCmdTest.runCommandInDirectory(VirtualEnvCreateCmdTest.java:30)
at VirtualEnvCreateCmdTest.createVirtEnv(VirtualEnvCreateCmdTest.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at ......
Caused by: java.io.IOException: error=13, Permission denied
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 25 more
Code:
public class VirtualEnvCreateCmdTest {
private final static Logger LOG = LoggerFactory.getLogger(VirtualEnvCreateCmdTest.class);
private void runCommandInDirectory(String path,String ... command) throws Throwable
{
LOG.info("Running command '"+String.join(" ",command)+"' in path '"+path+"'");
ProcessBuilder builder = new ProcessBuilder(command)
.directory(new File(path))
.inheritIO();
Process pr = builder.start();
final String failureMsg = format("Failed to run '%s' in path '%s'. Got exit code: %d", join(" ",command), path, pr.exitValue());
LOG.info("prepared message {}: ", failureMsg);
if(!pr.waitFor(120, TimeUnit.SECONDS))
{
throw new Exception(failureMsg);
}
int output = IOUtils.copy(pr.getInputStream(), System.out);
int exitCode=pr.exitValue();
if(exitCode!=0)
throw new Exception(failureMsg);
}
#Test
public void createVirtEnv() throws Throwable {
String path = "/Users/simeon/.../reporting";
String [] commands = new String[]{".", "activate"};
//String [] commands = new String [] {"/bin/bash", "-c", ". /Users/simeon/..../venv2.7/bin/activate"};
runCommandInDirectory(path, commands);
}
Changing permissioning on the file, doesn't seem to work:
chmod u+x ./bin/activate
Doing it via /bin/bash still doesn't work though with a different error.
At the same time the following works fine on the command line:
. /Users/simeon/.../venv2.7/bin/activate
Any samples of how one would invoke python virtualenv activate command from within Java?
=== What worked: =====
The following ended up working for me:
private void runDjangoMigrate() throws Throwable {
final String REPORTING_PROJECT_LOCATION = "/Users/simeon.../.../reporting";
final String UNIX_SHELL_LOCATION = "/bin/bash";
final String PYTHON_VIRTUALENV_ACTIVATOR_COMMAND =". /Users/simeon.../.../venv2.7/bin/activate;";
final String PYTHON_VIRTUALENV_ACTIVATOR_COMMAND =". " + PYTHON_VIRTUALENV_ACTIVATE_SCRIPT_LOCATION + ";";
final String DJANGO_MANAGE_MODULE = " manage.py";
final String[] DJANGO_MIGRATE_COMMAND = new String[] { UNIX_SHELL_LOCATION, "-c", PYTHON_VIRTUALENV_ACTIVATOR_COMMAND
+ PYTHON_INTERPRETER + DJANGO_MANAGE_MODULE + " migrate --noinput --fake-initial" };
runCommandInDirectory(REPORTING_PROJECT_LOCATION, DJANGO_MIGRATE_COMMAND);
}
private void runCommandInDirectory(String path, String... command) throws Throwable {
LOG.info(format("Running '%s' command in path '%s'", join(" ",command),path));
ProcessBuilder builder = new ProcessBuilder(command).directory(new File(path)).inheritIO();
Process pr = null;
pr = builder.start();
IOUtils.copy(pr.getInputStream(), System.out);
boolean terminated = pr.waitFor(SPAWNED_PYTHON_PROCESS_TTL_SEC, SECONDS);
int exitCode = -1;
if (terminated) {
exitCode = pr.exitValue();
}
if (exitCode != 0) {
final String failureMsg = format("Failed to run '%s' in path '%s'. Got exit code: %d", join(" ", command),
path, pr.exitValue());
throw new Exception(failureMsg);
}
}
and producing the following expected output:
[java] System check identified some issues:
[java]
[java] WARNINGS:
[java] ?: (urls.W005) URL namespace 'admin' isn't unique. You may not be able to reverse all URLs in this namespace
[java] Building permissions...
[java] Operations to perform:
[java] Apply all migrations: admin, auth, contenttypes, sessions
[java] Running migrations:
[java] No migrations to apply.
[java] System check identified some issues:
[java]
[java] WARNINGS:
[java] ?: (urls.W005) URL namespace 'admin' isn't unique. You may not be able to reverse all URLs in this namespace
[java] Building permissions...
[java] User exists, exiting normally due to --preserve
[java] System check identified some issues:
[java]

You can activate a Python virtual environment in bash (and perhaps zsh) and Python, but not in Java or any other environment. The main thing to understand is that virtual environment activation doesn't do some system magic — the activation script just changes the current environment, and Python virtual environments are prepared to changes shell or Python but nothing else.
In terms of Java it means that when you call runCommandInDirectory() it runs a new shell, that new shell briefly activates a virtual env, but then the shell exits and all changes that "activates" the virtual env are gone.
This in turn means that if you need to run some shell or Python commands in a virtual env you have to activate the environment in every runCommandInDirectory() invocation:
String [] commands1 = new String [] {"/bin/bash", "-c", ". /Users/simeon/..../venv2.7/bin/activate; script1.sh"};
runCommandInDirectory(path, commands)
String [] commands2 = new String [] {"/bin/bash", "-c", ". /Users/simeon/..../venv2.7/bin/activate; script2.sh"};
runCommandInDirectory(path, commands)
For Python scripts it's a bit simpler as you can run python from the environment and it automagically activates the environment:
String [] commands = new String [] {"/Users/simeon/..../venv2.7/bin/python", "script.py"};
runCommandInDirectory(path, commands)

Using wildcard in to install a server printer

In java, I am trying to use a cmd command to install a printer from a printer server. Knowing the 3 first letters of the printer name, I would like to use wildcards. I have tried the following command but it doesn't do anything.
Process p3 = Runtime.getRuntime().exec( "CSCRIPT c:\\windows\\System32\\Printing_Admin_Scripts\\en-US\\prnmngr.vbs -l -s \\\\svmsimp1 -a -p \"slj05%\" " );

Read the prnmngr.vbs script. There is next function for parsing the command line (abbreviated, omitted parts replaced by '...'):
' Parse the command line into its components
function ParseCommandLine(iAction, strServer, strPrinter, strDriver _
, strPort, strUser, strPassword)
on error resume next
'...'
dim oArgs
dim iIndex
'...'
iIndex = 0
set oArgs = wscript.Arguments
while iIndex < oArgs.Count
select case oArgs(iIndex)
case "-a"
iAction = kActionAdd
'...'
case "-p"
iIndex = iIndex + 1
strPrinter = oArgs(iIndex)
'...'
case else
Usage(true)
exit function
end select
iIndex = iIndex + 1
wend
'...'
end function
Crucial line strPrinter = oArgs(iIndex) says the script takes printer name from command line parameter -p "printer" as such, with no attempt to treat any wildcards. Neither in this function nor anywhere else in the whole script.
BTW, all parameters to the ParseCommandLine function are passed by reference (default if ByVal and ByRef are omitted).
Conclusion: although you would like to use wildcards, it's impossible with -a argument (add local printer) as in next call
'...'
select case iAction
case kActionAdd
iRetval = AddPrinter(strServer, strPrinter, strDriver _
, strPort, strUser, strPassword)
'...'
we can see (inside AddPrinter function):
'...'
set oPrinter = oService.Get("Win32_Printer").SpawnInstance_
'...'
oPrinter.DriverName = strDriver
oPrinter.PortName = strPort
oPrinter.DeviceID = strPrinter
oPrinter.Put_(kFlagCreateOnly)
'...'
On the other hand, you could parse output from prnmngr -l -s server to get desired printer names.

Scala - Fails to execute a process through terminal in a particular scenario

I'm using graphviz to generate graphs based on the messages passed in a scala program.
To invoke the graphviz application from inside the scala program, I'm using the exec() method (similar to Java). It successfully executed the command and created the graph when I used the below code snippet:
var cmd: String = "dot -Tpng Graph.dot -o Graph.png"
var run: Runtime = Runtime.getRuntime() ;
var pr: Process = run.exec(cmd) ;
However It fails to execute after changing the path of the input and output files (I just included a directory inside which the input file and output file resides as shown below)
def main(args: Array[String]): Unit = {
var DirectoryName: String = "Logs"
var GraphFileName: String = DirectoryName + File.separator + "Graph.dot"
val GraphFileObj: File = new File(GraphFileName)
// var cmd: String = "dot -Tpng Graph.dot -o Graph.png"
var cmd: String = "dot -Tpng \"" + GraphFileObj.getAbsolutePath + "\" -o \"" + DirectoryName + File.separator + "Graph.png\"" ;
println(cmd)
var run: Runtime = Runtime.getRuntime() ;
var pr: Process = run.exec(cmd) ;
}
The same command when executed through terminal gives proper output. Can you please help me to find what I'm missing?

exec is not a shell...e.g. quoting won't work as you expect, and thus your path (which may contain spaces, etc) will not be processed as you expect. The command will be broken apart using StringTokenizer, and your literal quotes will be...well..literal.
Use the form of exec that takes an array instead, so you can tokenize the command correctly.
val args = Array[String]("dot", "-Tpng", GraphFileObj.getAbsolutePath, ...);
run.exec(args)

Powershell: Capturing standard out and error with Process object

I want to start a Java program from PowerShell and get the results printed on the console.
I have followed the instructions of this question:
Capturing standard out and error with Start-Process
But for me, this is not working as I expected. What I'm doing wrong?
This is the script:
$psi = New-object System.Diagnostics.ProcessStartInfo
$psi.CreateNoWindow = $true
$psi.UseShellExecute = $false
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$psi.FileName = 'java.exe'
$psi.Arguments = #("-jar","tools\compiler.jar","--compilation_level", "ADVANCED_OPTIMIZATIONS", "--js", $BuildFile, "--js_output_file", $BuildMinFile)
$process = New-Object System.Diagnostics.Process
$process.StartInfo = $psi
$process.Start() | Out-Null
$process.WaitForExit()
$output = $process.StandardOutput.ReadToEnd()
$output
The $output variable is always empty (and nothing is printed on the console of course).

The docs on the RedirectStandardError property suggests that it is better to put the WaitForExit() call after the ReadToEnd() call. The following works correctly for me:
$psi = New-object System.Diagnostics.ProcessStartInfo
$psi.CreateNoWindow = $true
$psi.UseShellExecute = $false
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$psi.FileName = 'ipconfig.exe'
$psi.Arguments = #("/a")
$process = New-Object System.Diagnostics.Process
$process.StartInfo = $psi
[void]$process.Start()
$output = $process.StandardOutput.ReadToEnd()
$process.WaitForExit()
$output

Small variation so that you can selectively print the output if needed. As in if your looking just for error or warning messages and by the way Keith you saved my bacon with your response...
$psi = New-object System.Diagnostics.ProcessStartInfo
$psi.CreateNoWindow = $true
$psi.UseShellExecute = $false
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$psi.FileName = 'robocopy'
$psi.Arguments = #("$HomeDirectory $NewHomeDirectory /MIR /XF desktop.ini /XD VDI /R:0 /W:0 /s /v /np")
$process = New-Object System.Diagnostics.Process
$process.StartInfo = $psi
[void]$process.Start()
do
{
$process.StandardOutput.ReadLine()
}
while (!$process.HasExited)

Here is a modification to paul's answer, hopefully it addresses the truncated output. i did a test using a failure and did not see truncation.
function Start-ProcessWithOutput
{
param ([string]$Path,[string[]]$ArgumentList)
$Output = New-Object -TypeName System.Text.StringBuilder
$Error = New-Object -TypeName System.Text.StringBuilder
$psi = New-object System.Diagnostics.ProcessStartInfo
$psi.CreateNoWindow = $true
$psi.UseShellExecute = $false
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$psi.FileName = $Path
if ($ArgumentList.Count -gt 0)
{
$psi.Arguments = $ArgumentList
}
$process = New-Object System.Diagnostics.Process
$process.StartInfo = $psi
[void]$process.Start()
do
{
if (!$process.StandardOutput.EndOfStream)
{
[void]$Output.AppendLine($process.StandardOutput.ReadLine())
}
if (!$process.StandardError.EndOfStream)
{
[void]$Error.AppendLine($process.StandardError.ReadLine())
}
Start-Sleep -Milliseconds 10
} while (!$process.HasExited)
#read remainder
while (!$process.StandardOutput.EndOfStream)
{
#write-verbose 'read remaining output'
[void]$Output.AppendLine($process.StandardOutput.ReadLine())
}
while (!$process.StandardError.EndOfStream)
{
#write-verbose 'read remaining error'
[void]$Error.AppendLine($process.StandardError.ReadLine())
}
return #{ExitCode = $process.ExitCode; Output = $Output.ToString(); Error = $Error.ToString(); ExitTime=$process.ExitTime}
}
$p = Start-ProcessWithOutput "C:\Program Files\7-Zip\7z.exe" -ArgumentList "x","-y","-oE:\PowershellModules",$NewModules.FullName -verbose
$p.ExitCode
$p.Output
$p.Error
the 10ms sleep is to avoid spinning cpu when nothing to read.

I was getting the deadlock scenario mentioned by Ash using Justin's solution. Modified the script accordingly to subscribe to async event handlers to get the output and error text which accomplishes the same thing but avoids the deadlock condition.
Seemed to resolve the deadlock issue in my testing without altering the return data.
# Define global variables used in the Start-ProcessWithOutput function.
$global:processOutputStringGlobal = ""
$global:processErrorStringGlobal = ""
# Launch an executable and return the exitcode, output text, and error text of the process.
function Start-ProcessWithOutput
{
# Function requires a path to an executable and an optional list of arguments
param (
[Parameter(Mandatory=$true)] [string]$ExecutablePath,
[Parameter(Mandatory=$false)] [string[]]$ArgumentList
)
# Reset our global variables to an empty string in the event this process is called multiple times.
$global:processOutputStringGlobal = ""
$global:processErrorStringGlobal = ""
# Create the Process Info object which contains details about the process. We tell it to
# redirect standard output and error output which will be collected and stored in a variable.
$ProcessStartInfoObject = New-object System.Diagnostics.ProcessStartInfo
$ProcessStartInfoObject.FileName = $ExecutablePath
$ProcessStartInfoObject.CreateNoWindow = $true
$ProcessStartInfoObject.UseShellExecute = $false
$ProcessStartInfoObject.RedirectStandardOutput = $true
$ProcessStartInfoObject.RedirectStandardError = $true
# Add the arguments to the process info object if any were provided
if ($ArgumentList.Count -gt 0)
{
$ProcessStartInfoObject.Arguments = $ArgumentList
}
# Create the object that will represent the process
$Process = New-Object System.Diagnostics.Process
$Process.StartInfo = $ProcessStartInfoObject
# Define actions for the event handlers we will subscribe to in a moment. These are checking whether
# any data was sent by the event handler and updating the global variable if it is not null or empty.
$ProcessOutputEventAction = {
if ($null -ne $EventArgs.Data -and $EventArgs.Data -ne ""){
$global:processOutputStringGlobal += "$($EventArgs.Data)`r`n"
}
}
$ProcessErrorEventAction = {
if ($null -ne $EventArgs.Data -and $EventArgs.Data -ne ""){
$global:processErrorStringGlobal += "$($EventArgs.Data)`r`n"
}
}
# We need to create an event handler for the Process object. This will call the action defined above
# anytime that event is triggered. We are looking for output and error data received by the process
# and appending the global variables with those values.
Register-ObjectEvent -InputObject $Process -EventName "OutputDataReceived" -Action $ProcessOutputEventAction
Register-ObjectEvent -InputObject $Process -EventName "ErrorDataReceived" -Action $ProcessErrorEventAction
# Process starts here
[void]$Process.Start()
# This sets up an asyncronous task to read the console output from the process, which triggers the appropriate
# event, which we setup handlers for just above.
$Process.BeginErrorReadLine()
$Process.BeginOutputReadLine()
# Wait for the process to exit.
$Process.WaitForExit()
# We need to wait just a moment so the async tasks that are reading the output of the process can catch
# up. Not having this sleep here can cause the return values to be empty or incomplete. In my testing,
# it seemed like half a second was enough time to always get the data, but you may need to adjust accordingly.
Start-Sleep -Milliseconds 500
# Return an object that contains the exit code, output text, and error text.
return #{
ExitCode = $Process.ExitCode;
OutputString = $global:processOutputStringGlobal;
ErrorString = $global:processErrorStringGlobal;
ExitTime = $Process.ExitTime
}
}

Perl Pipe not redirecting Java process output

I'm trying to control a game server and display it's output in real time. This is what I have so far:
#!/usr/bin/perl -w
use IO::Socket;
use Net::hostent; # for OO version of gethostbyaddr
$PORT = 9000; # pick something not in use
$server = IO::Socket::INET->new( Proto => 'tcp',
LocalPort => $PORT,
Listen => SOMAXCONN,
Reuse => 1);
die "can't setup server" unless $server;
print "[Server $0 accepting clients]\n";
while ($client = $server->accept()) {
$client->autoflush(1);
print $client "Welcome to $0; type help for command list.\n";
$hostinfo = gethostbyaddr($client->peeraddr);
printf "[Connect from %s]\n", $hostinfo->name || $client->peerhost;
print $client "Command? ";
while ( <$client>) {
next unless /\S/; # blank line
if (/quit|exit/i) {
last; }
elsif (/fail|omg/i) {
printf $client "%s\n", scalar localtime; }
elsif (/start/i ) {
if (my $ping_pid = open(JAVA, "screen java -jar craftbukkit-0.0.1-SNAPSHOT.jar |")) {
while (my $ping_output = <JAVA>) {
# Do something with the output, let's say print
print $client $ping_output;
# Kill the C program based on some arbitrary condition (in this case
# the output of the program itself).
}
}
printf $client "I think it started...\n Say status for output\n"; }
elsif (/stop/i ) {
print RSPS "stop";
close(RSPS);
print $client "Should be closed.\n"; }
elsif (/status/i ) {
$output = RSPS;
print $client $output; }
else {
print $client "FAIL\n";
}
} continue {
print $client "Command? ";
}
close $client;
}
It starts the process just fine, the only flaw is that it's not outputting the output of the Java process to the socket (It is displaying the output in the terminal window that Perl was initiated with) I've tried this with ping and it worked just fine, any ideas?
Thanks in advance!

It sounds like the Java code (or maybe it's screen) is printing some output to the standard error stream. Assuming that you don't want to capture it separately, some easy fixes are:
Suppress it:
open(JAVA, "screen java -jar craftbukkit-0.0.1-SNAPSHOT.jar 2>/dev/null |")
Capture it in the standard output stream:
open(JAVA, "screen java -jar craftbukkit-0.0.1-SNAPSHOT.jar 2>&1 |")

screen redirects the output to the "screen". Get rid of the screen in the command.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Java maximum heap size on SGE cluster - java

Related

Activating virtualenv via Java ProcessBuilder

Using wildcard in to install a server printer

Scala - Fails to execute a process through terminal in a particular scenario

Powershell: Capturing standard out and error with Process object

Perl Pipe not redirecting Java process output

Categories

Resources