ProcessBuilder/Runtime.exec() with Weka Command Line Demonstrating Peculiar Behavior - java

Below is basically an MCVE of my full problem, which is much messier. What you need to know is that the following line runs when directly put in terminal:
java -classpath /path/to/weka.jar weka.filters.MultiFilter \
-F "weka.filters.unsupervised.attribute.ClusterMembership -I first" \
-i /path/to/in.arff
This is relatively straightforward. All I am doing is clustering the data from in.arff using the default settings for the ClusterMembership filter, except that I want to ignore the first attribute. The MultiFilter is there because my actual project has other filters, so it needs to stay. As mentioned, this works fine. However, when I try to run the same line with ProcessBuilder, I get a "Quote parse error", and it seems like the whole structure of nested quotes breaks down. One way of demonstrating this is trying to get the following to work:
List<String> args = new ArrayList<String>();
args.add("java");
args.add("-cp");
args.add("/path/to/weka.jar");
args.add("weka.filters.MultiFilter");
args.add("-F");
args.add("\"weka.filters.unsupervised.attribute.ClusterMembership");
args.add("-I");
args.add("first\"");
args.add("-i");
args.add("/path/to/in.arff");
ProcessBuilder pb = new ProcessBuilder(args);
// ... Run the process below
At first glance, you might think this is identical to the above line (that's certainly what my naive self thought). In fact, if I just print args out with spaces in between each one, the resulting strings are identical and run perfectly if directly copy and pasted to the terminal. However, for whatever reason, the program won't work as I got the message (from Weka) Quote parse error. I tried googling and found this question about how ProcessBuilder adds extra quotes to the command line (this led me to try numerous combinations of escape sequences, all of which did not work), and read this article about how ProcessBuilder/Runtime.exec() work (I tried both ProcessBuilder and Runtime.exec(), and ultimately the same problem persisted), but couldn't find anything relevant to what I needed. Weka already had bad documentation, and then their Wikispace page went down a couple weeks ago due to Wikispaces shutting down, so I have found very little info on the Weka side.
My question then is this: Is there a way to get something like the second example I put above to run such that I can group arguments together for much larger commands? I understand it may require some funky escape sequences (or maybe not?), or perhaps something else I have not considered. Any help here is much appreciated.
Edit: I updated the question to hopefully give more insight into what my problem is.

You don't need to group arguments together. It doesn't even work, as you've already noted. Take a look at what happens when I call my Java program like this:
java -jar Test.jar -i -s "-t 500"
This is my "program":
public class Test {
    public static void main(String[] args) {
        for (String arg : args) {
            System.out.println(arg);
        }
    }
}
And this is the output:
-i
-s
-t 500
The quotes are not included in the arguments; the shell uses them to group words into a single argument and then strips them. ProcessBuilder involves no shell, so when you pass the arguments the way you did, every list element reaches Weka verbatim: the literal quote characters are still there, spread over three separate arguments, which confuses the option parser.
The quotes are only necessary when you have nested components, e.g. FilteredClassifier. Maybe my answer on another Weka question can help you with those nested components. (I recently changed the links to their wiki to point to the Google cache until they established a new wiki.)
Since you didn't specify what case exactly caused you to think about grouping, you could try to get a working command line for Weka and then use that one as input for a program like mine. You can then see how you would need to pass them to a ProcessBuilder.
For your example I'd guess the following will work:
List<String> args = new ArrayList<String>();
args.add("java");
args.add("-cp");
args.add("/path/to/weka.jar");
args.add("weka.filters.MultiFilter");
args.add("-F");
args.add("weka.filters.unsupervised.attribute.ClusterMembership -I first");
args.add("-i");
args.add("/path/to/in.arff");
ProcessBuilder pb = new ProcessBuilder(args);
Additional details
What happens inside Weka is basically the following: The options from the arguments are first processed by weka.filters.Filter, then all non-general filter options are processed by weka.filters.MultiFilter, which contains the following code in setOptions(...):
filters = new Vector<Filter>();
while ((tmpStr = Utils.getOption("F", options)).length() != 0) {
options2 = Utils.splitOptions(tmpStr);
filter = options2[0];
options2[0] = "";
filters.add((Filter) Utils.forName(Filter.class, filter, options2));
}
Here, tmpStr is the value for the -F option and will be processed by Utils.splitOptions(tmpStr) (source code). There, all the quoting and unquoting magic happens, so that the next component will receive an options array that looks just like it would if it were a first-level component.
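To make the splitting step more concrete, here is a minimal sketch of the idea. It is not Weka's actual Utils.splitOptions implementation: the class name SplitOptionsSketch and the method split are made up for illustration, backslash escapes are ignored, and the exception message only imitates the error quoted in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitOptionsSketch {

    // Simplified stand-in for the idea behind Weka's Utils.splitOptions (NOT the real code):
    // split on whitespace, keep double-quoted runs together, and drop the quotes.
    // Assumption: backslash escapes are not handled here.
    static List<String> split(String optionString) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;
        for (char c : optionString.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
                hasToken = true; // an empty pair of quotes still counts as a token
            } else if (Character.isWhitespace(c) && !inQuotes) {
                if (hasToken) {
                    parts.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        if (inQuotes) {
            // An opening quote without a closing one cannot be parsed.
            throw new IllegalArgumentException("Quote parse error.");
        }
        if (hasToken) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // The value of -F as a single argument: splits into class name plus its options.
        System.out.println(split("weka.filters.unsupervised.attribute.ClusterMembership -I first"));
    }
}
```

With the list from the question, the value of -F would be just "weka.filters.unsupervised.attribute.ClusterMembership preceded by a literal quote that is never closed, and a splitter like this one has to give up on it.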

Related

running curl command in java is not working

I followed these tutorials How to use cURL in Java?, https://www.baeldung.com/java-curl to learn how to run curl from Java but it is not working.
The curl command (to delete page) is running fine from the terminal and gives a response in XML, but nada when I try in java, no error, no XML, the page also remains intact.
This is what I tried:
public class TryCurl {
    static String command = "curl -u username:password -X POST -F cmd=\"deletePage\" -F path=\"/content/demo-task/section\" https://author1.dev.demo.adobecqms.net/bin/wcmcommand";
    public static void main(String[] args) throws Exception {
        Process process = Runtime.getRuntime().exec(command);
        process.getInputStream();
        process.destroy();
    }
}
What am I doing wrong here?
1. Why are you using curl for this? That's very fragile: it is likely to fail in the future because it requires a lot of things to be in place, such as curl being installed on the machine you run this on, curl being on the path, the password not containing spaces or other special characters, and more. Java is perfectly capable of making a POST request to a server.
2. That's not how launching a process works. Runtime.exec is not a shell: things like untangling quotes into arguments are shellisms, and exec doesn't do any of that. Use ProcessBuilder and pass each argument separately, without quotes.
3. You can't just get the input stream and discard it. You need to actually read the bytes off of it, or curl may just sit there waiting for somebody to grab the data it is writing to standard out. This gets complicated, which brings us back to #1, which is probably a lot easier here.
4. You immediately call process.destroy(), which will, rather obviously, destroy that process, hence the name. curl will just quit, because you asked it to, before it so much as finishes sending the POST.
execing is a lot more complicated than it sounds.
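If you do stay with curl, the points about separate arguments, reading the output, and not destroying the process early could look roughly like the sketch below. The class and method names (CurlViaProcessBuilder, curlCommand, run) are made up for illustration, the credentials and URL are copied from the question, and it assumes curl is installed and on the path.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class CurlViaProcessBuilder {

    // One list element per argument; no shell-style quotes around the values.
    static List<String> curlCommand() {
        return Arrays.asList(
                "curl", "-u", "username:password", "-X", "POST",
                "-F", "cmd=deletePage",
                "-F", "path=/content/demo-task/section",
                "https://author1.dev.demo.adobecqms.net/bin/wcmcommand");
    }

    // Starts the command, drains its output, and waits for it to exit before returning.
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so a single reader drains both
        Process process = pb.start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        int exitCode = process.waitFor(); // no destroy(): let the process finish on its own
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode + ":\n" + output);
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(curlCommand()));
    }
}
```

Merging stderr into stdout keeps the sketch short; if you need the two streams separately, each has to be drained on its own thread or redirected to a file.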

Capture "locate" output with ProcessBuilder [duplicate]

I need to build the following command using ProcessBuilder:
"C:\Program Files\USBDeview\USBDeview.exe" /enable "My USB Device"
I tried with the following code:
ArrayList<String> test = new ArrayList<String>();
test.add("\"C:\\Program Files\\USBDeview\\USBDeview.exe\"");
test.add("/enable \"My USB Device\"");
ProcessBuilder processBuilder = new ProcessBuilder(test);
processBuilder.start().waitFor();
However, this passes the following to the system (verified using Sysinternals Process Monitor)
"C:\Program Files\USBDeview\USBDeview.exe" "/enable "My USB Device""
Note the quote before /enable and the two quotes after Device. I need to get rid of those extra quotes because they make the invocation fail. Does anyone know how to do this?
Joachim is correct, but his answer is insufficient when your process expects unified arguments as below:
myProcess.exe /myParameter="my value"
As seen by stefan, ProcessBuilder will see spaces in your argument and wrap it in quotes, like this:
myProcess.exe "/myParameter="my value""
Breaking up the parameter values as Joachim recommends will result in a space between /myparameter= and "my value", which will not work for this type of parameter:
myProcess.exe /myParameter= "my value"
According to Sun, in their infinite wisdom, it is not a bug and double quotes can be escaped to achieve the desired behavior.
So to finally answer stefan's question, this is an alternative that SHOULD work, if the process you are calling does things correctly:
ArrayList<String> test = new ArrayList<String>();
test.add("\"C:\\Program Files\\USBDeview\\USBDeview.exe\"");
test.add("/enable \\\"My USB Device\\\"");
This should give you the command "C:\Program Files\USBDeview\USBDeview.exe" "/enable \"My USB Device\"", which may do the trick; YMMV.
As far as I understand, since ProcessBuilder has no idea how parameters are to be passed to the command, you'll need to pass the parameters separately to ProcessBuilder:
ArrayList<String> test = new ArrayList<String>();
test.add("\"C:\\Program Files\\USBDeview\\USBDeview.exe\"");
test.add("/enable");
test.add("\"My USB Device\"");
First, you need to split up the arguments yourself - ProcessBuilder doesn't do that for you - and second you don't need to put escaped quotes around the argument values.
ArrayList<String> test = new ArrayList<String>();
test.add("C:\\Program Files\\USBDeview\\USBDeview.exe");
test.add("/enable");
test.add("My USB Device");
The quotes are necessary on the command line in order to tell the cmd parser how to break up the words into arguments, but ProcessBuilder doesn't need them because it's already been given the arguments pre-split.
I wasn't able to get it to work in any of the above ways. I ended up writing the command (with "\ " for each space) into a separate script file, then calling that script file.
Split the arguments and add each one to the command list. ProcessBuilder will add quotes around an argument itself if it contains a space.
ArrayList<String> cmd= new ArrayList<String>();
cmd.add("C:\\Program Files\\USBDeview\\USBDeview.exe");
cmd.add("/enable");
cmd.add("My USB Device");

Using java code in the django framework

Okay, so I have a simple interface that I designed with the Django framework that takes natural language input from a user and stores it in table.
Additionally I have a pipeline that I built with Java using the cTAKES library to do named entity recognition i.e. it will take the text input submitted by the user and annotate it with relevant UMLS tags.
What I want to do is take the input given from the user then once, its submitted, direct it into my java-cTAKES pipeline then feed the annotated output back into the database.
I am pretty new to the web development side of this and can't really find anything on integrating scripts in this sense. So, if someone could point me to a useful resource or just in the general right direction that would be extremely helpful.
=========================
UPDATE:
Okay, so I have figured out that the subprocess is the module that I want to use in this context and I have tried implementing some simple code based on the documentation but I am getting an
Exception Type: OSError
Exception Value: [Errno 2] No such file or directory
Exception Location: /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py in _execute_child, line 1335.
A brief overview of what I'm trying to do:
This is the code I have in views. Its intent is to take text input from the model form, POST that to the DB and then pass that input into my script, which produces an XML file that is stored in another column in the DB. I'm very new to Django, so I'm sorry if this is a simple fix, but I couldn't find any helpful documentation relating Django to subprocess.
def queries_create(request):
    if not request.user.is_authenticated():
        return render(request, 'login_error.html')
    form = QueryForm(request.POST or None)
    if form.is_valid():
        instance = form.save(commit=False)
        instance.save()
        p=subprocess.Popen([request.POST['post'], './path/to/run_pipeline.sh'])
        p.save()
    context = {
        "title":"Create",
        "form": form,
    }
    return render(request, "query_form.html", context)
Model code snippet:
class Query(models.Model):
    problem/intervention = models.TextField()
    updated = models.DateTimeField(auto_now=True, auto_now_add=False)
    timestamp = models.DateTimeField(auto_now=False, auto_now_add=True)
UPDATE 2:
Okay so the code is no longer breaking by changing the subprocess code as below
def queries_create(request):
    if not request.user.is_authenticated():
        return render(request, 'login_error.html')
    form = QueryForm(request.POST or None)
    if form.is_valid():
        instance = form.save(commit=False)
        instance.save()
        p = subprocess.Popen(['path/to/run_pipeline.sh'], stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
        (stdoutdata, stderrdata) = p.communicate()
        instance.processed_data = stdoutdata
        instance.save()
    context = {
        "title":"Create",
        "form": form,
    }
    return render(request, "query_form.html", context)
However, I am now getting a "Could not find or load main class pipeline.CtakesPipeline" that I don't understand since the script runs fine from the shell in this working directory. This is the script I am trying to call with subprocess.
#!/bin/bash
INPUT=$1
OUTPUT=$2
CTAKES_HOME="full/path/to/CtakesClinicalPipeline/apache-ctakes-3.2.2"
UMLS_USER="####"
UMLS_PASS="####"
CLINICAL_PIPELINE_JAR="full/path/to/CtakesClinicalPipeline/target/CtakesClinicalPipeline-0.0.1-SNAPSHOT.jar"
[[ $CTAKES_HOME == "" ]] && CTAKES_HOME=/usr/local/apache-ctakes-3.2.2
CTAKES_JARS=""
for jar in $(find ${CTAKES_HOME}/lib -iname "*.jar" -type f)
do
CTAKES_JARS+=$jar
CTAKES_JARS+=":"
done
current_dir=$PWD
cd $CTAKES_HOME
java -Dctakes.umlsuser=${UMLS_USER} -Dctakes.umlspw=${UMLS_PASS} -cp
${CTAKES_HOME}/desc/:${CTAKES_HOME}/resources/:${CTAKES_JARS%?}:
${current_dir}/${CLINICAL_PIPELINE_JAR} -
-Dlog4j.configuration=file:${CTAKES_HOME}/config/log4j.xml -Xms512M -Xmx3g
pipeline.CtakesPipeline $INPUT $OUTPUT
cd $current_dir
I'm not sure how to go about fixing this error so any help is appreciated.
If I understand you correctly, you want to pipe the value of request.POST['post'] to the program run_pipeline.sh and store the output in a field of your instance.
You are calling subprocess.Popen incorrectly. It should be:
p = subprocess.Popen(['/path/to/run_pipeline.sh'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
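Then pass in the input and read the output. communicate() only sends data to the program's stdin if you hand it over as the input argument:
(stdoutdata, stderrdata) = p.communicate(input=request.POST['post'].encode('utf-8'))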
Then save the data, e.g. in a field of your instance
instance.processed_data = stdoutdata
instance.save()
I suggest you first make sure to get the call to the subprocess working in a Python shell and then integrate it in your Django app.
Please note that creating a (potentially long-running) subprocess in a request is really bad practice and can lead to a lot of problems. The best practice is to delegate long-running tasks to a job queue. For Django, Celery is probably the most commonly used one. There is a bit of setup involved, though.

Java: how to read options from command line like -input or --h?

Searched for a while now...
I don't want to use a special parser, just built-in methods.
The problem is I want to read options like -i=something in a simple way.
Then - in the script - I can call these options like args[i] or so.
Is there any way?
Thanks
EDIT: Example
Command Line: java scriptname -write="test"
Script: System.out.println(args[write]);
Output: test
If you want to pass parameters as key-value pairs when invoking a Java program, you can do that using the -D flag like this:
java -Demail=test@gmail.com -DuserName="John Watson" MainProgram
If your value has spaces in it, enclose it in double quotes. You can then access these values as system properties in your code like this:
String email = System.getProperty("email");
String name = System.getProperty("userName");
All the parameters passed after the class name in the java command are accessible in the args[] array of the main method. You can access them only by index, not as key-value pairs.
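If you would rather keep the -write="test" style from the question, a few lines of plain Java can turn args[] into a map keyed by option name. This is only a sketch under assumptions: the class name SimpleArgs and the method parse are made up, options are expected in the form -key=value or --key=value, and an option without "=" is stored with an empty value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SimpleArgs {

    // Turns arguments like -write=test or --h into a map from option name to value.
    // Arguments that do not start with '-' are treated as positional and skipped.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-")) {
                continue;
            }
            String stripped = arg.replaceFirst("^-+", ""); // drop one or more leading dashes
            int eq = stripped.indexOf('=');
            if (eq < 0) {
                options.put(stripped, ""); // flag without a value
            } else {
                options.put(stripped.substring(0, eq), stripped.substring(eq + 1));
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        // The shell strips the quotes, so -write="test" arrives here as -write=test.
        System.out.println(options.get("write"));
    }
}
```

Called as java SimpleArgs -write="test", the lookup options.get("write") plays the role of args[write] from the question.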

Executing linux commands from inside java program

I am trying to create a GUI using java swing. From there I have to run linux system commands. I tried using exec(). But the exec() function is unable to parse the string if it contains single quotes. The code which I have used is as follows-
Process p = Runtime.getRuntime().exec("cpabe-enc pub_key message.txt '( it_department or ( marketing and manager ) )'");
BufferedReader stdInput = new BufferedReader(new InputStreamReader(p.getInputStream()));
But when I run the program I get an error: syntax error at "'(".
The same command runs when I write
Process p = Runtime.getRuntime().exec("cpabe-enc pub_key message.txt default");
Please help. Thanks in advance for your help.
Split up the parameters into an array instead, one string for each argument, and use the exec method that takes a String[]. That generally works better for arguments.
Something along the lines of:
Runtime.getRuntime().exec(new String[] {"cpabe-enc", "pub_key", "message.txt", "( it_department or ( marketing and manager ) )"});
or whatever your exact parameters are.
It's because the runtime does not interpret the '(...)' as a single parameter like you intend.
Try using ProcessBuilder instead:
http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/ProcessBuilder.html
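To see why the single quotes do not help, recall that the Javadoc for Runtime.exec(String) says the command string is broken into tokens with a plain StringTokenizer, which splits on whitespace and knows nothing about quotes. The sketch below reproduces just that splitting step; the class name ExecTokenizingSketch and the method tokenize are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ExecTokenizingSketch {

    // Splits a command line the way Runtime.exec(String) is documented to:
    // on whitespace only, with no special treatment of quote characters.
    static List<String> tokenize(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String command =
                "cpabe-enc pub_key message.txt '( it_department or ( marketing and manager ) )'";
        // Print each token in brackets to make the argument boundaries visible.
        for (String token : tokenize(command)) {
            System.out.println("[" + token + "]");
        }
    }
}
```

The policy ends up spread over nine separate arguments, the first of which is '( with the quote still attached, and that matches the syntax error reported in the question.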
I recently got this kind of problem solved. I was using javaFX to call shell scripts on button click .. which is very much similar to your swing application scenario...
Here are the links hope it might help you...
How to code in java to run unix shell script which use rSync internally in windows environment using cygwin?
Getting error in calling shell script in windows environment using java code and cygwin...!
Happy coding... :)
