Executing a terminal command from Java [duplicate]

This question already has answers here:
How to make pipes work with Runtime.exec()?
(4 answers)
Closed 8 years ago.
I am new to Java and learning it. I want to run a command-line program from a Java class.
This is the command I want to run (Linux only):
path/to/folder/$ echo "Inhibition of NF-kappaB activation reversed the anti-apoptotic effect of isochamaejasmin." | ./geniatagger
This will give me output, which I want to store in a Java object.
String output:
Inhibition Inhibition NN B-NP O
of of IN B-PP O
NF-kappaB NF-kappaB NN B-NP B-protein
activation activation NN I-NP O
reversed reverse VBD B-VP O
the the DT B-NP O
anti-apoptotic anti-apoptotic JJ I-NP O
effect effect NN I-NP O
of of IN B-PP O
isochamaejasmin isochamaejasmin NN B-NP O
. . . O O
Please guide me on how to achieve this.

Command-line interpreters, usually called shells, are a typically platform-specific feature that Java does not support well. That is not to say it is impossible, but you have to ask yourself whether you want this feature to work on all platforms or not.
If you don't care about being multi-platform, you can look into Runtime.exec.
E.g:
Runtime.getRuntime().exec("cmd /c echo Hello World!");
If you do care, you may be better off doing the same thing directly in Java. Most shell commands can be replicated in Java without much code; the result is cross-platform, and the performance is often much better.
If you decide to invoke external commands anyway and want to act on their output, learn how to access the process's standard input and output streams, as in the sketch below.
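Note that Runtime.exec does not start a shell, so the pipe (|) in your command line would not be interpreted; you would either have to wrap the whole pipeline in sh -c "..." or, more robustly, start ./geniatagger directly and feed it from Java. A minimal sketch of the latter using ProcessBuilder (assuming ./geniatagger reads sentences on standard input and writes the tagged columns to standard output):

import java.io.*;
import java.nio.charset.StandardCharsets;

public class TaggerRunner {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Start the tagger directly; no shell or echo is needed because
        // we write the sentence to its stdin from Java.
        ProcessBuilder pb = new ProcessBuilder("./geniatagger");
        pb.directory(new File("path/to/folder")); // the tagger's working directory
        pb.redirectErrorStream(true);             // merge stderr into stdout
        Process p = pb.start();

        // Send the sentence, then close stdin so the tagger knows input has ended.
        try (Writer w = new OutputStreamWriter(p.getOutputStream(), StandardCharsets.UTF_8)) {
            w.write("Inhibition of NF-kappaB activation reversed the anti-apoptotic effect of isochamaejasmin.\n");
        }

        // Collect the tagger's output into a String.
        StringBuilder output = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        p.waitFor();
        System.out.print(output); // the tab-separated tagger output shown above
    }
}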
UPDATE
I think I now see what you mean: perhaps you want to write a program like ./geniatagger yourself, one that accepts input piped to it from the command line (e.g. the output of echo)?
If so, look at using System.in, the standard input stream.
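For example, a minimal filter-style program (the class name is just for illustration) that reads piped input line by line via System.in:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class StdinFilter {
    public static void main(String[] args) throws Exception {
        // Reads whatever was piped in, e.g.: echo "some text" | java StdinFilter
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println("read: " + line); // replace with real processing
        }
    }
}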


Automatically run the command alternatives --config java to choose the option provided with update-alternatives --install [duplicate]

This question already has answers here:
Passing arguments to an interactive program non-interactively
(5 answers)
Closed 2 years ago.
Is it possible to have a bash script automatically handle prompts that would normally be presented to the user, applying default actions? Currently I am using a bash script to call an in-house tool that prompts the user (Y/N) to complete actions. The script I'm writing needs to be completely "hands-off", so I need a way to send Y or N to the prompt to allow the program to continue execution. Is this possible?
A simple
echo "Y Y N N Y N Y Y N" | ./your_script
This allows you to pass any sequence of "Y" or "N" answers to your script. (If the script reads one answer per line, send them newline-separated instead, e.g. printf 'Y\nY\nN\n' | ./your_script.)
This is not "auto-completion", this is automation. One common tool for these things is called Expect.
You might also get away with just piping input from yes.
If you only have Y to send:
$> yes Y | ./your_script
If you only have N to send:
$> yes N | ./your_script
I found the best way to send input is to use cat and a text file to pass along whatever input you need.
cat "input.txt" | ./Script.sh
In my situation I needed to answer some questions not with Y or N but with text or a blank line. I found the best way to do this was to create a shell script file; in my case I called it autocomplete.sh.
I needed to answer some questions for a Doctrine schema exporter, so my file looked like this.
-- This is an example only --
php vendor/bin/mysql-workbench-schema-export mysqlworkbenchfile.mwb ./doctrine << EOF
`#Export to Doctrine Annotation Format` 1
`#Would you like to change the setup configuration before exporting` y
`#Log to console` y
`#Log file` testing.log
`#Filename [%entity%.%extension%]`
`#Indentation [4]`
`#Use tabs [no]`
`#Eol delimeter (win, unix) [win]`
`#Backup existing file [yes]`
`#Add generator info as comment [yes]`
`#Skip plural name checking [no]`
`#Use logged storage [no]`
`#Sort tables and views [yes]`
`#Export only table categorized []`
`#Enhance many to many detection [yes]`
`#Skip many to many tables [yes]`
`#Bundle namespace []`
`#Entity namespace []`
`#Repository namespace []`
`#Use automatic repository [yes]`
`#Skip column with relation [no]`
`#Related var name format [%name%%related%]`
`#Nullable attribute (auto, always) [auto]`
`#Generated value strategy (auto, identity, sequence, table, none) [auto]`
`#Default cascade (persist, remove, detach, merge, all, refresh, ) [no]`
`#Use annotation prefix [ORM\]`
`#Skip getter and setter [no]`
`#Generate entity serialization [yes]`
`#Generate extendable entity [no]` y
`#Quote identifier strategy (auto, always, none) [auto]`
`#Extends class []`
`#Property typehint [no]`
EOF
The thing I like about this strategy is that you can comment what your answers are, and with EOF a blank line is just that (the default answer). It turns out, by the way, that this exporter tool has its own JSON counterpart for answering these questions, but I figured that out after I did this =).
To run the script, simply cd into the directory you want and run 'sh autocomplete.sh' in a terminal.
In short, by using a << EOF heredoc in combination with newlines, you can answer each question of the prompt as necessary: each new line is a new answer.
My example just shows how this can be done with comments as well, using the ` character, so you remember what each step is.
Note the other advantage of this method: you can answer with more than just Y or N... in fact, you can answer with blanks!
Hope this helps someone out.
There is a standard utility for exactly this: 'yes'.
To answer all questions with the same answer, you can run
yes [answer] |./your_script
Or you can use it inside your script to give a specific answer to each question.

Expect Programming: How to expect exactly what is prompted?

I am asking this question particularly for an Expect implementation in Java. However, I would like to know general suggestions as well.
In Expect programming, is it possible to expect exactly what is prompted after spawning a new process?
For example, instead of expecting some pattern or a fixed string, isn't it better to just expect whatever is actually prompted? I feel this should be really helpful at times (especially when there's no conditional sending).
Consider the sample Java code here that uses the JSch and Expect4j Java libraries to do SSH and execute a list of commands (ls, pwd, mkdir testdir) on the remote machine.
My question here is: why is it necessary to specify a pattern for the prompt? Is it not possible to get the exact prompt from the Channel itself and expect it?
I've programmed in "expect" and in "java".
I think you misunderstand what "expect" basically does. It doesn't look for exact items prompted after spawning a new process.
An expect program basically consists of:
Something that reads the terminal
A set of patterns (typically regular expressions), each coupled to a block of code.
So, when a new process is spawned, there's a loop that looks something like this
while (terminal.hasMoreText()) {
    bufferedText += terminal.readInput();
    for (Pattern pattern : patterns) {
        if (pattern.matches(bufferedText)) {
            String match = pattern.getMatch(bufferedText);
            bufferedText.removeAllTextBefore(match);
            bufferedText.removeText(match);
            pattern.executeBlock();
        }
    }
}
Of course, this is a massive generalization. But it is close enough to illustrate that expect itself doesn't "exactly expect" anything after launching a process. The program provided to the expect interpreter (which primarily consists of patterns and blocks of code to execute when the patterns match) contains the items which the interpreter's loop will use to match the process's output.
This is why you see some pretty odd expect scripts. For example, nearly everyone "expects" "ogin:" instead of "Login:" because there's little consistency on whether the login prompt is upper or lower case.
You don't have to expect anything. You're free to just send commands immediately and indiscriminately.
It's considered good practice to only reply to specific prompts so that you don't accidentally ruin something by saying the wrong thing at the wrong time, but you're entirely free to ignore this.
The main consideration is that while your normal flow might be:
$ create-backup
$ mkdir latest
$ mv backup.tar.gz latest
With no expectations and just blindly writing input, you can end up with this:
$ create-backup
Disk full, cleanup started...
Largest file: precious-family-memories.tar (510MB)
[R]emove, [S]ave, [A]bort
Invalid input: m
Invalid input: k
Invalid input: d
Invalid input: i
Removing file...
$ latest
latest: command not found
$ mv backup.tar.gz latest
whereas a program that expects $ before continuing would just wait and eventually realize that things are not going according to plan.
A few commands are sensitive to timing (e.g. telnet), but other than that you can send commands whenever you want, with or without waiting for anything at all.
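To make the "only reply to specific prompts" idea concrete in Java (since this thread started from a Java Expect implementation), here is a bare-bones sketch; the shell invocation, the prompt regex, and the command list are assumptions for illustration, not a real Expect replacement:

import java.io.*;
import java.util.regex.Pattern;

public class MiniExpect {
    public static void main(String[] args) throws Exception {
        // Spawn an interactive shell; merge stderr so the prompt is visible on stdout.
        Process shell = new ProcessBuilder("/bin/sh", "-i").redirectErrorStream(true).start();
        Writer in = new OutputStreamWriter(shell.getOutputStream());
        Reader out = new InputStreamReader(shell.getInputStream());
        Pattern prompt = Pattern.compile("\\$ $"); // assumed prompt: "$ " at end of buffer

        String[] commands = {"create-backup", "mkdir latest", "mv backup.tar.gz latest"};
        for (String cmd : commands) {
            waitFor(out, prompt); // send the next command only once the prompt appears
            in.write(cmd + "\n");
            in.flush();
        }
    }

    // Accumulate output until the pattern matches; this loop is the heart of "expect".
    static void waitFor(Reader out, Pattern prompt) throws IOException {
        StringBuilder buf = new StringBuilder();
        int c;
        while ((c = out.read()) != -1) {
            buf.append((char) c);
            if (prompt.matcher(buf).find()) {
                return;
            }
        }
    }
}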

Calling JAR from Python

I have the code below:
from subprocess import Popen, PIPE, STDOUT
p = Popen(['java', '-jar', 'action.jar'], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
stdout1, stderr1 = p.communicate(input=sample_input1)
print "Result is", stdout1
p = Popen(['java', '-jar', 'action.jar'], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
stdout2, stderr2 = p.communicate(input=sample_input2)
print "Result is", stdout2
Loading the JAR takes a lot of time and is very inefficient. Is there any way to avoid reloading it for the second run, i.e. load it once at the beginning and keep using that instance instead of the second p = Popen(...)? When I tried to remove the second Popen call, Python complained:
"ValueError: I/O operation on closed file".
Is there any solution to this? Thanks!
communicate() waits for the process to terminate, so that explains the error you're getting -- the second time you call it, the process isn't running any more.
It really depends on how that JAR was written, and the kind of input it expects. If it supports executing its action more than once based on input, and if you can reformat your input that way, then it would work. If the JAR does its thing once and terminates, there's not much you can do.
If you don't mind writing a bit of Java, you can add a wrapper around the classes in action.jar that takes both your sample inputs in turn and passes them to the code in the jar.
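A sketch of such a wrapper; the entry point action.Action.process is a placeholder for whatever API action.jar actually exposes:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class BatchWrapper {
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        // One request per line, one response per line; the JVM starts only once.
        while ((line = in.readLine()) != null) {
            String result = action.Action.process(line); // placeholder call into action.jar
            System.out.println(result);
        }
    }
}

On the Python side you would then keep a single Popen object alive and write to p.stdin and read from p.stdout line by line, instead of calling communicate() twice.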
You can also save the cost of starting up the Java Virtual Machine each time by using a tool like Nailgun.

Slow ANTLR4 generated Parser in Python, but fast in Java

I am trying to convert an ANTLR3 grammar to an ANTLR4 grammar, in order to use it with the antlr4-python2-runtime.
This grammar is a C/C++ fuzzy parser.
After converting it (basically removing tree operators and semantic/syntactic predicates), I generated the Python2 files using:
java -jar antlr4.5-complete.jar -Dlanguage=Python2 CPPGrammar.g4
And the code is generated without any error, so I import it in my python project (I'm using PyCharm) to make some tests:
import sys, time
from antlr4 import *
from parser.CPPGrammarLexer import CPPGrammarLexer
from parser.CPPGrammarParser import CPPGrammarParser

currenttimemillis = lambda: int(round(time.time() * 1000))

def is_string(object):
    return isinstance(object, str)

def parsecommandstringline(argv):
    if 2 != len(argv):
        raise IndexError("Invalid args size.")
    if is_string(argv[1]):
        return True
    else:
        raise TypeError("Argument must be str type.")

def doparsing(argv):
    if parsecommandstringline(argv):
        print("Arguments: OK - {0}".format(argv[1]))
        input = FileStream(argv[1])
        lexer = CPPGrammarLexer(input)
        stream = CommonTokenStream(lexer)
        parser = CPPGrammarParser(stream)
        print("*** Parser: START ***")
        start = currenttimemillis()
        tree = parser.code()
        print("*** Parser: END *** - {0} ms.".format(currenttimemillis() - start))

def main(argv):
    tree = doparsing(argv)

if __name__ == '__main__':
    main(sys.argv)
The problem is that the parsing is very slow. With a file containing ~200 lines it takes more than 5 minutes to complete, while the parsing of the same file in antlrworks only takes 1-2 seconds.
Analyzing the antlrworks tree, I noticed that the expr rule and all of its descendants are called very often, and I think I need to simplify/change these rules to make the parser operate faster.
Is my assumption correct or did I make some mistake while converting the grammar? What can be done to make parsing as fast as on antlrworks?
UPDATE:
I exported the same grammar to Java and it only took 795 ms to complete the parsing. The problem seems to lie with the Python implementation rather than the grammar itself. Is there anything that can be done to speed up the Python parsing?
I've read here that Python can be 20-30 times slower than Java, but in my case Python is ~400 times slower!
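For comparison, the equivalent Java driver is only a few lines (a sketch against the standard ANTLR 4 Java runtime, assuming the Java-target CPPGrammarLexer/CPPGrammarParser have been generated from the same grammar):

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;

public class ParseTimer {
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        // Lex and parse the file given as the first argument.
        CPPGrammarLexer lexer = new CPPGrammarLexer(CharStreams.fromFileName(args[0]));
        CPPGrammarParser parser = new CPPGrammarParser(new CommonTokenStream(lexer));
        parser.code(); // same entry rule as in the Python script
        System.out.println("Parsed in " + (System.currentTimeMillis() - start) + " ms");
    }
}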
I confirm that the Python 2 and Python 3 runtimes have performance issues. With a few patches, I got a 10x speedup on the python3 runtime (~5 seconds down to ~400 ms).
https://github.com/antlr/antlr4/pull/1010
I faced a similar problem so I decided to bump this old post with a possible solution. My grammar ran instantly with the TestRig but was incredibly slow on Python 3.
In my case the fault was the non-greedy token that I was using to produce one line comments (double slash in C/C++, '%' in my case):
TKCOMM : '%' ~[\r\n]* -> skip ;
This is somewhat backed by a post from sharwell in this discussion: https://github.com/antlr/antlr4/issues/658
When performance is a concern, avoid using non-greedy operators, especially in parser rules.
To test this scenario you may want to remove non-greedy rules/tokens from your grammar.
Posting here since it may be useful to people that find this thread.
Since this was posted, there have been several performance improvements to Antlr's Python target. That said, the Python interpreter will be intrinsically slower than Java or other compiled languages.
I've put together a Python accelerator code generator for Antlr's Python3 target. It uses Antlr C++ target as a Python extension. Lexing & parsing is done exclusively in C++, and then an auto-generated visitor is used to re-build the resulting parse tree in Python. Initial tests show a 5x-25x speedup depending on the grammar and input, and I have a few ideas on how to improve it further.
Here is the code-generator tool: https://github.com/amykyta3/speedy-antlr-tool
And here is a fully-functional example: https://github.com/amykyta3/speedy-antlr-example
Hope this is useful to those who prefer using Antlr in Python!
I use ANTLR's Python 3 target these days, and a file of ~500 lines takes less than 20 seconds to parse, so switching to the Python 3 target might help.

How to use Stanford CoreNLP java library with Ruby for sentiment analysis?

I'm trying to do sentiment analysis on a large corpus of tweets in a local MongoDB instance with Ruby on Rails 4, Ruby 2.1.2 and Mongoid ORM.
I've used the freely available https://loudelement-free-natural-language-processing-service.p.mashape.com API on Mashape.com; however, it starts timing out after pushing through a few hundred tweets in rapid-fire sequence. Clearly it isn't meant for going through tens of thousands of tweets, and that's understandable.
So next I thought I'd use the Stanford CoreNLP library promoted here: http://nlp.stanford.edu/sentiment/code.html
The default usage, in addition to using the library in Java 1.8 code, seems to be to use XML input and output files. For my use case this is annoying given I have tens of thousands of short tweets as opposed to long text files. I would want to use CoreNLP like a method and do a tweets.each type of loop.
I guess one way would be to construct an XML file with all of the tweets and then get one out of the Java process and parse that and put it back to the DB, but that feels alien to me and would be a lot of work.
So, I was happy to find on the site linked above a way to run CoreNLP from the command line and accept the text as stdin so that I didn't have to start fiddling with the filesystem but rather feed the text as a parameter. However, starting up the JVM separately for each tweet adds a huge overhead compared to using the loudelement free sentiment analysis API.
Now, the code I wrote is ugly and slow but it works. Still, I'm wondering if there's a better way to run the CoreNLP java program from within Ruby without having to start fiddling with the filesystem (creating temp files and giving them as params) or writing Java code?
Here's the code I'm using:
def self.mass_analyze_w_corenlp # batch run the method in multiple Ruby processes
  todo = Tweet.all.exists(corenlp_sentiment: false).limit(5000).sort(follow_ratio: -1) # start with the "least spammy" tweets based on follow ratio
  counter = 0
  todo.each do |tweet|
    counter = counter + 1
    fork { tweet.analyze_sentiment_w_corenlp } # run the analysis in a separate Ruby process
    if counter >= 5 # when five concurrent processes are running, wait until they finish to preserve memory
      Process.waitall
      counter = 0
    end
  end
end

def analyze_sentiment_w_corenlp # run the sentiment analysis for each tweet object
  text_to_be_analyzed = self.text.gsub("'") { " " }.gsub('"') { ' ' } # fetch the text field of the DB item; strip quotes that confuse the command line
  start = "echo '"
  finish = "' | java -cp 'vendor/corenlp/*' -mx250m edu.stanford.nlp.sentiment.SentimentPipeline -stdin"
  command_string = start + text_to_be_analyzed + finish # assemble the command for the command-line usage below
  output = `#{command_string}` # run CoreNLP on the command line, equivalent to system('...')
  to_db = output.gsub(/\s+/, "").downcase # since CoreNLP uses indentation, remove unnecessary whitespace
  # output is in the format of "neutral", "positive", "negative" and so on
  puts "Sentiment analysis successful, sentiment is: #{to_db} for tweet #{text_to_be_analyzed}."
  self.corenlp_sentiment = to_db # insert the result as a field on the object
  self.save! # sentiment analysis done!
end
You can at least avoid the ugly and dangerous command line stuff by using IO.popen to open and communicate with the external process, for example:
input_string = "
foo
bar
baz
"
output_string =
  IO.popen("grep 'foo'", 'r+') do |pipe|
    pipe.write(input_string)
    pipe.close_write
    pipe.read
  end
puts "grep said #{output_string.strip} but not bar"
EDIT: to avoid the overhead of reloading the Java program on each item, you can open the pipe around the todo.each loop and communicate with the process like this:
inputs = ['a', 'b', 'c', 'd']

IO.popen('cat', 'r+') do |pipe|
  inputs.each do |s|
    pipe.write(s + "\n")
    out = pipe.readline
    puts "cat said '#{out.strip}'"
  end
end
That is, if the Java program supports such line-buffered "batch" input; if it doesn't, it should not be very difficult to modify it to do so.
As suggested in the comments by @Qualtagh, I decided to use JRuby.
I first attempted to do everything in Java (read directly from MongoDB, analyze with Java / CoreNLP, and write back to MongoDB), but the MongoDB Java Driver was more complex to use than the Mongoid ORM I use with Ruby, which is why I felt JRuby was more appropriate.
Exposing the Java code as a REST service would have required me first to learn how to write a REST service in Java, which might have been easy, or maybe not; I didn't want to spend time figuring that out.
So the code I needed to run my analysis was:
def analyze_tweet_with_corenlp_jruby
  require 'java'
  require 'vendor/CoreNLPTest2.jar' # I made this Java JAR with IntelliJ IDEA; it includes both CoreNLP and my initialization class
  analyzer = com.me.Analyzer.new # the Java class I made for running the CoreNLP analysis; it initializes CoreNLP with the correct annotations etc.
  result = analyzer.analyzeTweet(self.text) # self.text is where the text-to-be-analyzed resides
  self.corenlp_sentiment = result # adds the result into this field in the MongoDB model
  self.save!
  return "#{result}: #{self.text}" # for debugging purposes
end
