NLTK unable to find java.exe (spontaneous path reduction) - java

Similar questions were posted here and here, and my question is actually based on what was suggested in answers to those questions.
I am trying to parse some German texts using the Stanford Parser and NLTK.
from nltk.parse.stanford import StanfordParser
import os
os.environ['STANFORD_PARSER'] ='C:\PretestKorpus\stanford-parser-full-2018-10-17'
os.environ['STANFORD_MODELS'] = 'C:\PretestKorpus\stanford-parser-full-2018-10-17'
parser=StanfordParser(model_path="C:\PretestKorpus\germanPCFG.ser.gz")
new=list(parser.raw_parse("Es war einmal ein Bauer"))
Then, of course, I get the "NLTK was unable to find the java file!" error.
So I set configurations like this:
nltk.internals.config_java('C:\Program Files (x86)\Java\jre1.8.0_251\bin\java.exe')
but it returns
NLTK was unable to find the C:\Program Files (x86)\Java\jre1.8.0_251in\java.exe file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
So, somehow Python reduces the path \\jre1.8.0_251\bin\java.exe to \\jre1.8.0_251in\java.exe
Looks like this:
Setting the environment variable does not help either (it returns the "NLTK was unable to find the java file!" error). Obviously, Python does not read the path correctly. But why, and how can I fix it? Any help will be appreciated.

In Python, \b inside a string literal is resolved to a backspace character. That is why you see the white BS in the picture: the console tries to represent this special character (BS for backspace).
What you need to do is escape each \ inside your string, like so:
nltk.internals.config_java('C:\\Program Files (x86)\\Java\\jre1.8.0_251\\bin\\java.exe')
It is good practice to always escape backslash characters in string literals (or to use a raw string literal, prefixed with r), so you can be sure that problems like this one never occur.
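A small sketch of the difference, using a shortened path fragment rather than the full path from the question: the plain literal ends up containing a backspace character, while the escaped and raw forms keep the backslash.

```python
# Plain literal: \b is interpreted as the backspace character (0x08).
plain = 'jre1.8.0_251\bin'
# Escaped backslash: the string contains a real backslash followed by "bin".
escaped = 'jre1.8.0_251\\bin'
# Raw string literal: backslashes are kept exactly as written.
raw = r'jre1.8.0_251\bin'

print(repr(plain))     # 'jre1.8.0_251\x08in'
print(escaped == raw)  # True
```

One caveat with raw strings: a raw literal cannot end in a single backslash, so a path with a trailing backslash still needs the escaped form.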

Related

Logstash.bat cannot find main class Error

I'm trying to install Logstash on my system, and when I run logstash.bat
I get the error below:
Error: Could not find or load main class 7.1.0\logstash-7.2.0\logstash-7.2.0\log stash-core\lib\jars\animal-sniffer-annotations-1.14.jar;D:\ELK
What's the reason?
I have also added the lines below to the logstash.bat file.
set JAVA_HOME=path\to\custom_jdk_folder\jdk_8u161
set CLASSPATH=%JAVA_HOME%\bin
But the error still exists.
The reason is that ... somehow ... the batch file has gotten the Java command line wrong. It looks like it has misinterpreted something as the class name.
The current version of the logstash.bat file is here. As you can see, it is assembling the Java command line from a variety of things including:
parameters on the command line,
the logstash "jvm.options" file, and
the list of logstash's JAR files from "logstash-core\lib\jars"
It is unclear what has actually gone wrong, but this kind of problem often happens if there is an unexpected (unquoted) space in the Java command line.
My recommendation would be to debug what the BAT file is actually doing, starting by finding out what the command line actually looks like.
Also, take a look at the explanation of what the command line should look like in:
What does "Could not find or load main class" mean?
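To illustrate the unquoted-space hypothesis, here is a deliberately simplified Python model of command-line splitting (real Windows argument parsing has more rules). The install directory, JAR names, and main class name are made up for the demonstration; the error text in the question would be consistent with a directory such as D:\ELK 7.1.0, but that is only a guess.

```python
def split_args(command_line):
    """Rough model: split on whitespace, keeping double-quoted parts together."""
    args, current, in_quotes = [], "", False
    for ch in command_line:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch.isspace() and not in_quotes:
            if current:
                args.append(current)
                current = ""
        else:
            current += ch
    if current:
        args.append(current)
    return args

# Hypothetical install directory containing a space, with two stand-in JARs.
jar_dir = r"D:\ELK 7.1.0\logstash-core\lib\jars"
classpath = jar_dir + r"\a.jar;" + jar_dir + r"\b.jar"

unquoted = split_args("java -cp " + classpath + " some.MainClass")
quoted = split_args('java -cp "' + classpath + '" some.MainClass')

# Unquoted: the classpath stops at the first space, and the next token is
# what java would try to load as the main class.
print(unquoted[2])
print(unquoted[3])
# Quoted: the whole classpath stays one argument, followed by the class name.
print(quoted[2:])
```

In the unquoted case the token after the truncated classpath starts with the remainder of the directory name and ends where the next space occurs, which has the same shape as the "main class" reported in the question.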

Matlab installation (LD_LIBRARY_PATH) messes up other library files

I am trying to install Matlab on a Linux machine, but setting LD_LIBRARY_PATH (as the installation requires) breaks other library files. I am not a Linux expert, but I have tried several things and cannot get it working correctly. I have even contacted Matlab support, got the issue escalated to the dev team, and was basically told "haha sucks to suck". I have seen that a few other people online have had the same issue, but either their questions were never answered or they had a slightly different problem and their solution didn't apply to me.
Installing on a VM running Ubuntu:
I set LD_LIBRARY_PATH as the instructions say, then it breaks network files. I can ping google.com, but I cannot nslookup google.com or visit it in a browser. Nslookup provides this error:
nslookup: /usr/local/MATLAB/MATLAB_Runtime/v90/bin/glnxa64/libcrypto.so.1.0.0: no version information available (required by /usr/lib/libdns.so.100)
03-Feb-2016 11:32:22.361 ENGINE_by_id failed (crypto failure)
03-Feb-2016 11:32:22.362 error:25070067:DSO support routines:DSO_load:could not load the shared library:dso_lib.c:244:
03-Feb-2016 11:32:22.363 error:260B6084:engine routines:DYNAMIC_LOAD:dso not found:eng_dyn.c:447:
03-Feb-2016 11:32:22.363 error:2606A074:engine routines:ENGINE_by_id:no such engine:eng_list.c:418:id=gost
(null): dst_lib_init: crypto failure
The installation worked though (I can run my Java programs that reference compiled Matlab functions). Unsetting LD_LIBRARY_PATH fixes the network files but then I can't run programs anymore.
Installing on EC2 instance:
On an EC2 instance it does not break the network files (nslookup is fine). Instead it messes up Python library files. Trying to use any aws cli command, I get the error:
File "/usr/bin/aws", line 19, in <module>
import awscli.clidriver
File "/usr/lib/python2.7/dist-packages/awscli/clidriver.py", line 16, in <module>
import botocore.session
File "/usr/lib/python2.7/dist-packages/botocore/session.py", line 25, in <module>
import botocore.config
File "/usr/lib/python2.7/dist-packages/botocore/config.py", line 18, in <module>
from botocore.compat import six
File "/usr/lib/python2.7/dist-packages/botocore/compat.py", line 139, in <module>
import xml.etree.cElementTree
File "/usr/lib64/python2.7/xml/etree/cElementTree.py", line 3, in <module>
from _elementtree import *
ImportError: PyCapsule_Import could not import module "pyexpat"
Printing sys.path in Python shows lib-dynload is already there though, so that doesn't seem to be the problem.
And when trying to run the program, I get:
Exception in thread "main" java.lang.LinkageError: libXt.so.6: cannot open shared object file: No such file or directory
at com.mathworks.toolbox.javabuilder.internal.DynamicLibraryUtils.dlopen(Native Method)
at com.mathworks.toolbox.javabuilder.internal.DynamicLibraryUtils.loadLibraryAndBindNativeMethods(DynamicLibraryUtils.java:134)
at com.mathworks.toolbox.javabuilder.internal.MWMCR.<clinit>(MWMCR.java:1529)
at VectorAddExample.VectorAddExampleMCRFactory.newInstance(VectorAddExampleMCRFactory.java:48)
at VectorAddExample.VectorAddExampleMCRFactory.newInstance(VectorAddExampleMCRFactory.java:59)
at VectorAddExample.VectorAddClass.<init>(VectorAddClass.java:62)
at com.mypackage.Example.main(Example.java:13)
I'm at a brick wall and really have no clue how to proceed.
Maybe something else already needs LD_LIBRARY_PATH set in order to work. Make sure you prepend to it rather than overwrite it:
export LD_LIBRARY_PATH=new/path:$LD_LIBRARY_PATH
Edit:
OK, if LD_LIBRARY_PATH was initially empty, this suggests that Matlab comes with shared libraries that are incompatible with your system ones:
nslookup: /usr/local/MATLAB/MATLAB_Runtime/v90/bin/glnxa64/libcrypto.so.1.0.0: no version information available (required by /usr/lib/libdns.so.100)
suggests that /usr/lib/libdns.so.100 needs libcrypto.so.1.0.0, which is now being resolved to the one that comes with MATLAB, which is incompatible.
You can check the dependencies of a shared library with:
ldd /usr/lib/libcrypto.so.1.0.0
and hopefully you can find a configuration that keeps both MATLAB and your system happy. Unfortunately, this may involve a lot of trial and error.
If there is no such configuration, you can try setting LD_LIBRARY_PATH only when you run MATLAB:
LD_LIBRARY_PATH=$MATLAB_LD_LIBRARY_PATH matlab
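The same per-process idea can be sketched in Python: build a modified copy of the environment and pass it only to the child process, so the parent shell and every other program keep the system libraries. The directory list is taken from the paths mentioned in this thread; the child command here is a harmless stand-in that just prints the variable, and would be replaced by the real program (for example a java command).

```python
import os
import subprocess
import sys

# MATLAB Runtime directories as quoted in this thread; adjust to your install.
MATLAB_DIRS = [
    "/usr/local/MATLAB/MATLAB_Runtime/v90/runtime/glnxa64",
    "/usr/local/MATLAB/MATLAB_Runtime/v90/bin/glnxa64",
    "/usr/local/MATLAB/MATLAB_Runtime/v90/sys/os/glnxa64",
]

def env_with_matlab_libs(base_env):
    """Return a copy of base_env with the MATLAB dirs prepended to LD_LIBRARY_PATH."""
    env = dict(base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = MATLAB_DIRS + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env

child_env = env_with_matlab_libs(os.environ)

# Stand-in child process: prints the LD_LIBRARY_PATH it was started with.
command = [sys.executable, "-c", "import os; print(os.environ['LD_LIBRARY_PATH'])"]
result = subprocess.run(command, env=child_env, capture_output=True, text=True, check=True)
print(result.stdout.strip())
```

Because only the copy is modified, os.environ in the calling process is untouched, which is the point of the approach.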
Edit 2:
Well, the Python issue seems to boil down to pyexpat, which is a wrapper around the standard expat XML parser. Try running the following (the file name is a guess, since I don't have a Linux machine at hand right now):
ldd /usr/local/lib/python2.7/site-packages/libpyexpat.so
and see what that depends on. Probably, it will be libexpat.so, which is now being resolved to MATLAB's version.
Try the following command:
export LD_LIBRARY_PATH=/usr/local/MATLAB/MATLAB_Runtime/v90/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v90/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v90/sys/os/glnxa64:$LD_LIBRARY_PATH
Perhaps not helpful for the OP, but if you are generating a Python package with MATLAB, you could modify the __init__.py file that MATLAB generates for your package.
Specifically, the generated __init__.py file contains the following line (as of MATLAB 2017a):
PLATFORM_DICT = {'Windows': ['PATH','dll',''], 'Linux': ['LD_LIBRARY_PATH','so','libmw'], 'Darwin': ['DYMCR_LIBRARY_PATH','dylib','libmw']}
For the Linux platform, you could simply replace LD_LIBRARY_PATH with something else, such as MCR_LIBRARY_PATH, to avoid interfering with your system's shared libraries.
sed -i -e 's/LD_LIBRARY_PATH/MCR_LIBRARY_PATH/g' /MY/PACKAGE/BUILD/PATH/__init__.py
Then, obviously, export MCR_LIBRARY_PATH before using Python.
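The same substitution can be sketched in Python. One difference from the sed command is worth noting: a plain global replace also matches inside longer names that contain LD_LIBRARY_PATH, such as the macOS variable DYLD_LIBRARY_PATH, so this sketch uses a word-boundary pattern to touch only the stand-alone name. The demo runs on a throwaway file with made-up content, not on a real generated package.

```python
import re
import tempfile
from pathlib import Path

def rename_library_path_var(init_file, new_name="MCR_LIBRARY_PATH"):
    """Replace the stand-alone name LD_LIBRARY_PATH in a file; return the count."""
    path = Path(init_file)
    text = path.read_text()
    # \b keeps the match from firing inside e.g. DYLD_LIBRARY_PATH.
    patched_text, count = re.subn(r"\bLD_LIBRARY_PATH\b", new_name, text)
    path.write_text(patched_text)
    return count

# Demo on a temporary stand-in file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "__init__.py"
    demo_file.write_text(
        "PLATFORM_DICT = {'Linux': ['LD_LIBRARY_PATH','so','libmw'], "
        "'Darwin': ['DYLD_LIBRARY_PATH','dylib','libmw']}\n"
    )
    replaced = rename_library_path_var(demo_file)
    patched = demo_file.read_text()

print(replaced)
print(patched)
```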

Why couldn't "org.antlr.v4.runtime.misc.TestRig" be found or loaded?

So, here's my problem. My ANTLR4 code compiles successfully, without errors, and now I want to test it. The ANTLR4 documentation says that to test my application I should run:
java org.antlr.v4.runtime.misc.TestRig
I've tried this and got following error:
Error: Main Class org.antlr.v4.runtime.misc.TestRig couldn't be found or load.
I checked whether my CLASSPATH was set, and everything is set as it should be. I also tried moving the file directly into my test folder, opening CMD there and running the command again, but I got the same error. Searching the Internet didn't help, as no one seems to have encountered this error with ANTLR4 before.
Specs:
Java 1.7.0.55
ANTLR 4.4
There seems to be something wrong with your classpath, contrary to your belief that everything is okay.
When I download the ANTLR 4 JAR and run TestRig:
wget http://www.antlr.org/download/antlr-4.4-complete.jar
...
java -cp antlr-4.4-complete.jar org.antlr.v4.runtime.misc.TestRig
I see the following on my console:
java org.antlr.v4.runtime.misc.TestRig GrammarName startRuleName
[-tokens] [-tree] [-gui] [-ps file.ps] [-encoding encodingname]
[-trace] [-diagnostics] [-SLL]
[input-filename(s)]
Use startRuleName='tokens' if GrammarName is a lexer grammar.
Omitting input-filename makes rig read from stdin.
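One way to test the classpath claim without Java is to look inside the JARs listed on the classpath, since a JAR is a ZIP archive and a class is stored at a path derived from its fully qualified name. The following Python sketch does that; the helper names are made up, directory and wildcard classpath entries are not handled, and the demo uses an empty stand-in JAR rather than the real ANTLR download.

```python
import os
import tempfile
import zipfile

def jar_contains_class(jar_path, class_name):
    """True if the JAR has an entry for the fully qualified class name."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

def jars_providing(classpath, class_name):
    """Return the JAR files on a classpath string that contain the class."""
    hits = []
    for item in classpath.split(os.pathsep):
        # Only plain .jar entries are checked in this sketch.
        if item.lower().endswith(".jar") and os.path.isfile(item):
            if jar_contains_class(item, class_name):
                hits.append(item)
    return hits

TEST_RIG = "org.antlr.v4.runtime.misc.TestRig"

# Demo with a stand-in JAR that only contains an empty TestRig.class entry.
with tempfile.TemporaryDirectory() as tmp:
    fake_jar = os.path.join(tmp, "stand-in.jar")
    with zipfile.ZipFile(fake_jar, "w") as jar:
        jar.writestr("org/antlr/v4/runtime/misc/TestRig.class", b"")
    found = jars_providing(os.pathsep.join([tmp, fake_jar]), TEST_RIG)
    missing = jars_providing(tmp, TEST_RIG)

print(found)
print(missing)
```

On a real system one would call jars_providing(os.environ.get("CLASSPATH", ""), TEST_RIG); an empty result would support the suspicion that the ANTLR JAR is not actually on the classpath.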

Getting a TypeError: wrong argument type Array (expected Module) with include java_import

I'm just starting out with JRuby, and when I test the following code, saved as a script,
include java_import
frame = javax.swing.JFrame.new
frame.getContentPane.add javax.swing.JLabel.new('Hello, World!')
frame.setDefaultCloseOperation javax.swing.JFrame::EXIT_ON_CLOSE
frame.pack
frame.set_visible true
I get the following error
Switch to inspect mode.
irb(main):001:0> include java_import
TypeError: wrong argument type Array (expected Module)
from org/jruby/RubyModule.java:2068:in `include'
from (irb):1:in `evaluate'
from org/jruby/RubyKernel.java:1066:in `eval'
from org/jruby/RubyKernel.java:1392:in `loop'
from org/jruby/RubyKernel.java:1174:in `catch'
from org/jruby/RubyKernel.java:1174:in `catch'
from C:\jruby-1.7.2\/bin/jirb_swing:54:in `(root)'
irb(main):002:0>
Can someone help me identify what I'm doing wrong? I have jruby-1.7.2 installed (C:\jruby-1.7.2) and added to the PATH in the system variables, and I am running Windows 7. At the command prompt, both jruby and java report their versions correctly.
I also launched the jirb_swing via jruby(1.9.3).exe and at the prompt typed
irb(main):001:0> include java_import
and I get the same error message
Update
Hi All
After searching further on the Internet pages on JRuby and Java, I have resolved the problem successfully.
First the two ways that worked for me in calling Java from JRuby:
Using "include Java" (without the quotes in my script file, and with a capital J)
Using "require 'java'" (without the double quotes; note the single quotes around java, which now starts with a lowercase j)
The "java_import" I was using was a mistake on my part: I was careless when reading the detailed documentation on GitHub (https://github.com/jruby/jruby/wiki/CallingJavaFromJRuby).
I will research the differences, or lack thereof, between the "include" and "require" statements and post back. As a real novice, I found this a real eye-opener.
Sri

MeCab path parameters do not accept whitespace on Windows

I have successfully used MeCab Java for calling Mecab from my Java code.
I use the following statement to initialize the tagger:
tagger = new Tagger("--node-format=%f[7]\\t --unk-format=%m\\t --eos-format=\\n --rcfile=" + filePath + "/mecabrc" + " --dicdir=" + filePath + "/ipadic");
Now I am facing the problem that filePath might contain whitespace characters, for example: c:\folder name\. When I try using such a path, I get an error from MeCab saying:
java.lang.RuntimeException: C:\src\c\common\mecab\src\main\c\tagger.cpp(151) [load_dictionary_resource(param)] C:\src\c\common\mecab\src\main\c\param.cpp(71) [ifs] no such file or directory: c:/folder
Which means MeCab did not handle the whitespace correctly.
Any idea how I can get MeCab to accept whitespace in a Windows file path?
I read the MeCab source code and there is no way to get MeCab to accept white space in the path without editing the source and compiling a custom version. You have at least three work-around options:
Rename the directory to something without spaces
Use a relative path if possible
Use Windows 8.3 (short) filenames
Here is a link with more information on how to get 8.3 filenames in Java.
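As a rough illustration of the 8.3 workaround, here is a Python sketch (not the Java approach the link describes) that asks Windows for the short form of a path through the Win32 function GetShortPathNameW. The helper name short_path is made up; on non-Windows systems it returns the path unchanged. The path must exist, and short names can be disabled on a volume, in which case the result may still contain spaces.

```python
import ctypes
import os

def short_path(path):
    """Return the Windows 8.3 short form of an existing path.

    On non-Windows systems the path is returned unchanged.
    """
    if os.name != "nt":
        return path
    buffer = ctypes.create_unicode_buffer(260)
    length = ctypes.windll.kernel32.GetShortPathNameW(path, buffer, len(buffer))
    # 0 means failure; a value >= the buffer size means the buffer was too small.
    if length == 0 or length >= len(buffer):
        raise OSError("could not get a short path for %r" % path)
    return buffer.value

# Hypothetical directory name with a space, as in the question.
file_path = short_path(r"c:\folder name")
options = "--rcfile=" + file_path + "/mecabrc" + " --dicdir=" + file_path + "/ipadic"
print(options)
```

If the short form comes back without spaces, the resulting option string can be passed to the tagger as before.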
