JPype (Python): importing a folder of jars

I am using JPype to work with Java classes in Python.
I have a folder that contains multiple self-written .jar files.
I know how to import multiple .jar files the long way:
...
CLASSPATH = "/path/to/jars/first.jar:/path/to/jars/second.jar"
jpype.startJVM(jpype.getDefaultJVMPath(), "-ea", "-Djava.class.path=%s" % CLASSPATH)
MYLIB= jpype.JPackage("org").mylib
MyClass = MYLIB.MyClass
myObj = MyClass()
This works fine, but I think there might be a better way.
I already tried this:
CLASSPATH = "/path/to/jars/*.jar"
and this:
CLASSPATH = "/path/to/jars/*"
In both cases the following error occurs:
user@user:~/path/to/python/$ python test.py
Traceback (most recent call last):
File "test.py", line 23, in <module>
myObj = MyClass()
File "/usr/local/lib/python2.7/dist-packages/JPype1-0.6.2-py2.7-linux-x86_64.egg/jpype/_jpackage.py", line 60, in __call__
raise TypeError("Package {0} is not Callable".format(self.__name))
TypeError: Package org.mylib.MyClass is not Callable
My Question:
Is there an easy way to import a folder that contains multiple .jar files in JPype?

You can build the classpath from the folder's contents in Python instead of hardcoding every jar (this needs import os):
CLASSPATH = ":".join("path/to/jars/" + name for name in os.listdir("path/to/jars"))
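A slightly more defensive variant of the same idea, as a sketch only: it keeps just the .jar files and uses the platform's path separator, since the wildcard forms tried in the question do not appear to be expanded when passed through -Djava.class.path. The helper name build_classpath is hypothetical, and "/path/to/jars" is the placeholder folder from the question.

```python
import glob
import os


def build_classpath(jar_dir):
    """Return a classpath string listing every .jar file found in jar_dir."""
    # sorted() keeps the order stable between runs.
    jars = sorted(glob.glob(os.path.join(jar_dir, "*.jar")))
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    return os.pathsep.join(jars)


if __name__ == "__main__":
    # "/path/to/jars" is the placeholder folder from the question.
    classpath = build_classpath("/path/to/jars")
    print("-Djava.class.path=%s" % classpath)
```

The printed option is the same string the question passes to jpype.startJVM, so it should be usable there unchanged.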

Related

tabula-py unable to read pdf file

My code:
import tabula
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
file_path = dir_path + '\ALPINE_' + str(20191107) + '.pdf'
print(file_path)
df = tabula.read_pdf('ALPINE_20191107.pdf',multiple_tables=True, pages="all")
result:
runfile('C:/Users/Admin/Documents/lucas/testTabula.py.py', wdir='C:/Users/Admin/Documents/lucas')
Traceback (most recent call last):
File "<ipython-input-29-a6b390aef3cf>", line 1, in <module>
runfile('C:/Users/Admin/Documents/lucas/sem título0.py', wdir='C:/Users/Admin/Documents/lucas')
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/Admin/Documents/lucas/sem título0.py", line 12, in <module>
df = tabula.read_pdf('ALPINE_20191107.pdf',multiple_tables=True, pages="all")
File "C:\ProgramData\Anaconda3\lib\site-packages\tabula\io.py", line 332, in read_pdf
return _extract_from(raw_json, pandas_options)
File "C:\ProgramData\Anaconda3\lib\site-packages\tabula\io.py", line 664, in _extract_from
df[c] = pd.to_numeric(df[c], errors="ignore")
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\numeric.py", line 138, in to_numeric
raise TypeError("arg must be a list, tuple, 1-d array, or Series")
TypeError: arg must be a list, tuple, 1-d array, or Series
The function doesn't seem to work. I could type the path directly to make it even simpler, but that didn't work either. It could be a problem with the PDF file, but I have already seen the same script work with the same file in another environment.
I already have Java on both possible PATHs ('C:\Program Files\Java\jre1.8.0_231\bin'), as the documentation describes, but it doesn't seem to matter: the error occurs with or without them set on PATH. I also tried adding the JDK, but that didn't solve it either.
I notice the error mentions pandas, so maybe tabula-py conflicts with my pandas version (the latest), but I'm not sure.
Python is 3.7.4 and Java is the latest version as of this date.
I had the same issue. I was using the version installed with pip, tabula-py 2.0.0. I uninstalled it and installed from Anaconda with conda install -c conda-forge tabula-py, which gave me tabula-py 1.4.1 and resolved the issue.
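Since the fix above depends on which versions are actually installed, it may help to print them before and after switching. The following is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata); installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The two packages named in the question and in the traceback.
    for name in ("tabula-py", "pandas"):
        print("%s: %s" % (name, installed_version(name)))
```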

rJava NoSuchFileException when running jar from R

I'm trying to run a runnable JAR file from R, using the rJava package. This jar has to write and read some files to/from external folders, that are in the same path as the jar file itself, like this:
mypath/myjar.jar
mypath/folder1
mypath/folder2
mypath/input_file1.txt
mypath/input_file2.txt
The program works fine if I call it by opening a console in "mypath" and running the jar the following way:
java -jar myjar.jar input_file1.txt input_file2.txt false
But when I try to run this code in R, using rJava, it crashes at some point because it can find neither mypath/folder1 nor mypath/folder2, even though the working directory is correctly set to "mypath".
.jinit(".", force.init=TRUE) # this starts the JVM
.jaddClassPath("myjar.jar")
jobject <- .jnew("package_name/Main") ## call the constructor
result_java <- rJava::.jcall(obj = jobject, returnSig = "V", method = "main", c("input_file1.txt","input_file2.txt","false"))
The Java program is in fact called and finds the input files, which are also in mypath, but for some reason it crashes when it tries to write to folders in mypath (such as folder1 and folder2) with the error:
Error executing task java.nio.file.NoSuchFileException: folder1/some_file.txt
I really have no idea what's going on, spent hours on this. Am I missing something really obvious here?
When you run the jar with java from a console, you are inside mypath, so folder1 and folder2 are visible to your code.
You could pass the directory's location as an extra argument and have your Java code use that explicit path instead of the relative folder1:
result_java <-
  rJava::.jcall(
    obj = jobject,
    returnSig = "V",
    method = "main",
    c(
      "input_file1.txt",
      "input_file2.txt",
      "false",
      "full_path_to_your_mypath_location"))
Then, inside main, you can simply open full_path_to_your_mypath_location/some_file.txt. When the code runs from R, the process is probably no longer inside the directory with your code. You can also try changing the working directory:
setwd(full_path_to_your_mypath_location)

RobotFramework ImportError: No module named foo

I have a class in Java which looks like this:
package com.charandeepmatta.keywords;
import org.robotframework.javalib.annotation.RobotKeyword;
import org.robotframework.javalib.annotation.RobotKeywords;
@RobotKeywords
public class SampleKeywords {
    @RobotKeyword
    public void printToErrorStream() {
        System.err.println("!!! Hello from keyword developed in java ...");
    }
}
And my test case looks like this
*** Settings ***
Library    org.robotframework.javalib.library.AnnotationLibrary    /**.class
*** Test Cases ***
Keyword defined in java class can print to error stream
    Print To Error Stream
When I try to run it on RIDE it gives me the following error
[ ERROR ] Error in file 'C:\Users\BFerreira\git\robotframework-maven-project\src\main\robot\suite\OwnDevelopedKeywordTestCase.txt':
Importing test library 'org.robotframework.javalib.library.AnnotationLibrary' failed:
ImportError: No module named org.robotframework.javalib.library
Traceback (most recent call last):
None
PYTHONPATH:
C:\Python27\lib\site-packages\robot\libraries
C:\Python27\lib\site-packages
C:\Windows\system32\python27.zip
C:\Python27\DLLs
C:\Python27\lib
C:\Python27\lib\plat-win
C:\Python27\lib\lib-tk
C:\Python27
C:\Python27\lib\site-packages\wx-2.8-msw-unicode
.
C:\Users\user1\git\robotframework-maven-project\src\main\robot\suite
Everything is in the same classpath, can anyone help?
From the looks of your output, you are not executing with jybot/Jython. Jython is required to load Java classes in a Python interpreter. Here is what the output would look like if you were:
PYTHONPATH:
C:\apps\Python27\Lib\site-packages
C:\apps\jython2.5.3\Lib\site-packages\setuptools-0.6c11-py2.5.egg
C:\apps\jython2.5.3\Lib\site-packages\pip-1.2.1-py2.5.egg
C:\apps\jython2.5.3\Lib
__classpath__
__pyclasspath__/
C:\apps\jython2.5.3\Lib\site-packages
.
c:\ws\local
CLASSPATH:
C:\apps\jython2.5.3\jython.jar
A word of caution: if you run the Robot Framework jar (e.g. java -jar robotframework-2.5.3.jar ...) as some examples suggest, all classpath settings are ignored. You would have to put all your dependencies in one jar for that way to work...
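To check which interpreter is actually executing the tests, a small script like the following can be run with the same launcher. It is a sketch that relies on Jython reporting a sys.platform value starting with "java"; running_on_jython is a hypothetical helper name.

```python
import sys


def running_on_jython():
    """Return True when this interpreter is Jython, which can load Java classes."""
    # Jython sets sys.platform to a string starting with "java";
    # CPython uses values such as "win32" or "linux".
    return sys.platform.startswith("java")


if __name__ == "__main__":
    print("sys.platform: %s" % sys.platform)
    print("Running on Jython: %s" % running_on_jython())
```

If this reports False, libraries such as org.robotframework.javalib.library.AnnotationLibrary cannot be imported by that interpreter, which matches the ImportError in the question.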

Error in Udf of pig

I am new to Pig. I wrote a UDF and used it in my Pig script, but it gives the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve UserDefined.PartsOfSpeech using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Here is my UDF code
public String exec(Tuple input) throws IOException {
//my code here
}
Here is my pig script
REGISTER /home/bigdata/NetBeansProjects/UserDefined/dist/UserDefined.jar
a = load '/user/bigdata/json' using TextLoader() as (input:chararray);
b = foreach a GENERATE UserDefined.PartsOfSpeech(input);
In the above code, UserDefined is my package name and PartsOfSpeech is my class name.
The error message says that Pig cannot find UserDefined.PartsOfSpeech.
What package declaration does PartsOfSpeech.java have at the top of the file?
If the package declaration is package com.my.company; try this instead:
REGISTER /home/bigdata/NetBeansProjects/UserDefined/dist/UserDefined.jar
a = load '/user/bigdata/json' using TextLoader() as (input:chararray);
b = foreach a GENERATE com.my.company.PartsOfSpeech(input);
That is, replace UserDefined.PartsOfSpeech(input) with com.my.company.PartsOfSpeech(input) since the UDF is located in the package com.my.company.
Also, consider using the DEFINE keyword in your Pig script so you don't need to repeat com.my.company every time you use PartsOfSpeech.
REGISTER /home/bigdata/NetBeansProjects/UserDefined/dist/UserDefined.jar
DEFINE PartsOfSpeech com.my.company.PartsOfSpeech();
a = load '/user/bigdata/json' using TextLoader() as (input:chararray);
b = foreach a GENERATE PartsOfSpeech(input);
There is more information about DEFINE in Chapter 5 of Alan Gates' Programming Pig: http://chimera.labs.oreilly.com/books/1234000001811/ch05.html#udf_define.
Here is an example of DEFINE from Gates' book:
--define.pig
register 'your_path_to_piggybank/piggybank.jar';
define reverse org.apache.pig.piggybank.evaluation.string.Reverse();
divs = load 'NYSE_dividends' as (exchange:chararray, symbol:chararray,
date:chararray, dividends:float);
backwards = foreach divs generate reverse(symbol);
Before compiling your UDF (a Java class), make sure the package name is declared properly. For example, if you declare the package as
package com.pig.udf;
then the directory layout on your Linux box must match it.
You can follow the steps below to create the jar:
Create the package directory using
mkdir -p com/pig/udf
Create your Java class in that directory, with the declaration package com.pig.udf
Compile your Java source code using the command
javac -cp /usr/lib/pig-0.12.0.2.0.6.0-76.jar YourClass.java
Then go back up to the directory that contains com/, where the jar will be created:
cd ../../..
Now create the jar using the command below
jar -cvf yourJarName.jar com/
Register the jar in your script using the keyword "register" followed by the path of the jar
Now call your UDF by its fully qualified name, com.pig.udf.YourJavaClassName
For your scenario:
REGISTER /home/bigdata/NetBeansProjects/UserDefined/dist/UserDefined.jar
a = load '/user/bigdata/json' using TextLoader() as (input:chararray);
b = foreach a GENERATE com.pig.udf.PartsOfSpeech(input);
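Both answers depend on knowing the fully qualified class name that is actually inside the jar. Because a jar is a zip archive, one way to check is to list its .class entries. The sketch below does that with Python's zipfile module; list_class_names is a hypothetical helper, and the jar path is supplied on the command line.

```python
import sys
import zipfile


def list_class_names(jar_path):
    """Return the fully qualified names of the top-level classes in a jar."""
    names = []
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            # Skip non-class entries and nested or anonymous classes (Outer$Inner).
            if entry.endswith(".class") and "$" not in entry:
                # com/pig/udf/PartsOfSpeech.class -> com.pig.udf.PartsOfSpeech
                names.append(entry[: -len(".class")].replace("/", "."))
    return sorted(names)


if __name__ == "__main__" and len(sys.argv) > 1:
    for name in list_class_names(sys.argv[1]):
        print(name)
```

Whatever name is printed for PartsOfSpeech is the one to use after GENERATE or in DEFINE.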

Using a Python Script in Java (Eclipse)

I've been looking to incorporate a Python script a friend made for me into a Java application that I am trying to develop. After some trial and error I found out about Jython and used the PythonInterpreter to try to run the script.
However, when I run it, I get an error within the Python script. This is odd because when I run the script outside of Java (Eclipse IDE in this case), it works fine and does exactly what I need it to (extract all the images from the .docx files stored in its own directory).
Can someone help me out here?
Java:
import org.python.core.PyException;
import org.python.util.PythonInterpreter;
public class SPImageExtractor
{
    public static void main(String[] args) throws PyException
    {
        try
        {
            PythonInterpreter.initialize(System.getProperties(), System.getProperties(), new String[0]);
            PythonInterpreter interp = new PythonInterpreter();
            interp.execfile("C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Image-Extractor2.py");
        }
        catch(Exception e)
        {
            System.out.println(e.toString());
            e.printStackTrace();
        }
    }
}
Java Error regarding Python Script:
Traceback (most recent call last):
File "C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Image-Extractor2.py", line 19, in <module>
thisDir,_ = path.split(path.abspath(argv[0]))
IndexError: index out of range: 0
Traceback (most recent call last):
File "C:/Documents and Settings/user/workspace/Intern Project/Proposals/Converted Proposals/Image-Extractor2.py", line 19, in <module>
thisDir,_ = path.split(path.abspath(argv[0]))
IndexError: index out of range: 0
Python:
from os import path, chdir, listdir, mkdir, getcwd
from sys import argv
from zipfile import ZipFile
from time import sleep
#A few notes -
#(1) when I do something like " _,variable = something ", that is because
#the function returns two variables, and I only need one. I don't know if it is a
#common convention to use the '_' symbol as the name for the unused variable, but
#I saw it in some guy's code in the past, and I started using it.
#(2) I use "path.join" because on unix operating systems and windows operating systems
#they use different conventions for paths like '\' vs '/'. path.join works on all operating
#systems for making paths.
#Defines what extensions to look for within the file (you can add more to this)
IMAGE_FILE_EXTENSIONS = ('.bmp', '.gif', '.jpg', '.jpeg', '.png', '.tif', '.tiff')
#Changes to the directory in which this script is contained
thisDir = getcwd()
chdir(thisDir)
#Lists all the files/folders in the directory
fileList = listdir('.')
for file in fileList:
    #Checks if the item is a file (opposed to being a folder)
    if path.isfile(file):
        #Fetches the files extension and checks if it is .docx
        _,fileExt = path.splitext(file)
        if fileExt == '.docx':
            #Creates directory for the images
            newDirectory = path.join(thisDir, file + "-Images")
            if not path.exists(newDirectory):
                mkdir(newDirectory)
            currentFile = open(file,"r")
            for line in currentFile:
                print line
            sleep(5)
            #Opens the file as if it is a zipfile
            #Then lists the contents
            try:
                zipFileHandle = ZipFile(file)
                nameList = zipFileHandle.namelist()
                for archivedFile in nameList:
                    #Checks if the file extension is in the list defined above
                    #And if it is, it extracts the file
                    _,archiveExt = path.splitext(archivedFile)
                    if archiveExt in IMAGE_FILE_EXTENSIONS:
                        zipFileHandle.extract(archivedFile, newDirectory)
            except:
                pass
My guess is that you don't get command-line arguments when the script is run through the embedded interpreter, which is not that surprising: where would it get the values from, and what would the correct value be?
os.getcwd()
Return a string representing the current working directory.
That would return the working directory, but presumably that's not what you want.
Not tested, but I think os.path.dirname(os.path.realpath(__file__)) should work.
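The suggestions above can be combined into one helper that does not depend on argv being populated. This is only a sketch: script_dir is a hypothetical name, and whether __file__ is defined when a script is run through PythonInterpreter.execfile is an assumption that has not been verified here, which is why the code falls back to argv[0] and then to the working directory.

```python
import os
import sys


def script_dir():
    """Best-effort guess at the directory containing the running script."""
    try:
        # Preferred: the location of this source file, if the runtime sets __file__.
        return os.path.dirname(os.path.realpath(__file__))
    except NameError:
        # __file__ is not defined in this context.
        if sys.argv and sys.argv[0]:
            return os.path.dirname(os.path.abspath(sys.argv[0]))
        # Last resort: the current working directory (may not be the script's folder).
        return os.getcwd()


if __name__ == "__main__":
    print(script_dir())
```

In the question's script, thisDir = script_dir() would replace the line that indexes argv[0] directly.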
