Flink can't access files in JAR - java

I tried to run a JAR in a Flink cluster, but I get this FileNotFoundException:
Caused by: java.io.FileNotFoundException: File file:/tmp/flink-web-88bf3f41-94fc-40bd-a865-bb0e6d5ac95c/flink-web-upload/82227475-523d-4607-8ab2-09bae8602248-tutorial-1.0-jar-with-dependencies.jar!/ldbc_sample/edges.csv does not exist or the user running Flink ('userA') has insufficient permissions to access it.
at org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:106)
The CSV files are located in a folder in the resources directory of the project.
I access the file path like this:
URL resource = Helper.class.getClassLoader().getResource("ldbc_sample");
return resource.getPath();
I opened the JAR and made sure that the files definitely exist, and I also ran it locally, and it worked.
What do I have to do to make sure that Flink can access my CSV files?

Maybe you want to pass your .csv as an argument to your program?
Something like:
def main(args: Array[String]): Unit = {
  val ldbcSample = ParameterTool.fromArgs(args).getRequired("ldbc_sample")
  ...
}
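With this approach the path is supplied at submission time, for example flink run tutorial.jar --ldbc_sample /data/ldbc_sample/edges.csv (the paths here are just placeholders), so it points at a real file the cluster can reach rather than at an entry inside the uploaded JAR.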
or you can put the different arguments in a .properties file:
ldbc_sample: /ldbc_sample/edges.csv
topic_source: TOPIC_NAME
val jobParams = ParameterTool.fromArgs(args)
val jobArgs = ParameterTool.fromPropertiesFile(jobParams.getRequired("properties_file_path"))
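If the job itself is written in Java, as in the question, the same idea looks roughly like the sketch below; the class name, argument name, and paths are placeholders, not part of the original answer:

import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.utils.ParameterTool;

public class LdbcSampleJob {
    public static void main(String[] args) throws Exception {
        // Submitted e.g. as: flink run tutorial.jar --ldbc_sample /data/ldbc_sample/edges.csv
        ParameterTool params = ParameterTool.fromArgs(args);
        String edgesPath = params.getRequired("ldbc_sample");

        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read the CSV from a path the cluster nodes can reach,
        // not from a path pointing inside the uploaded JAR.
        env.readTextFile(edgesPath).print();
    }
}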

Related

Java JAR file runs on local machine but missing file on others

The JAR file contains the ffmpeg.exe file, and it runs normally on my machine without any problems. However, if I try to run it on another computer, the stack trace tells me java.io.IOException: Cannot run program "ffmpeg.exe": CreateProcess error=2, The system cannot find the file specified. The way I used it was:
FFMpeg ffmpeg = new FFMpeg("ffmpeg.exe"); // in res folder
...
// FFMpeg class
public FFMpeg(String ffmpegEXE) {
    this.ffmpegEXE = ffmpegEXE;
}
The quick fix is to put ffmpeg.exe in the same folder as your .jar file.
If you want to read the file from the resources folder, you have to change this code:
URL resource = Test.class.getResource("ffmpeg.exe");
String filepath = Paths.get(resource.toURI()).toFile().getAbsolutePath();
FFMpeg ffmpeg = new FFMpeg(filepath);
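Note that resource.toURI() only maps to a real file while the program runs from an unpacked classpath (for example in the IDE). Once ffmpeg.exe is packaged inside the JAR, one option, which is not part of the original answer, is to copy the resource to a temporary file first; a rough sketch, assuming the executable sits at the root of the JAR:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Copies a resource bundled inside the JAR to a temp file and returns its absolute path.
static String extractToTempFile(String resourceName, String suffix) throws IOException {
    Path temp = Files.createTempFile("bundled", suffix);
    try (InputStream in = Test.class.getResourceAsStream(resourceName)) {
        Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
    }
    temp.toFile().deleteOnExit();
    return temp.toAbsolutePath().toString();
}

// Usage:
FFMpeg ffmpeg = new FFMpeg(extractToTempFile("/ffmpeg.exe", ".exe"));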

How to add jars on hive shell in a java application

I am trying to add a JAR in the Hive shell. I am aware of the global option on the server, but my requirement is to add JARs per session in the Hive shell.
I have used the FileSystem class for the hdfs dfs commands to add the JARs to the HDFS file system.
This is what I have tried:
1) Created a folder /tmp on HDFS
2) Added the file to the HDFS file system using the FileSystem.copyFromLocalFile method
(equivalent to hdfs dfs -put myjar.jar /tmp)
3) Set permissions on the file on the fs file system
4) Checked that the JAR was loaded to HDFS using the getFileSystem method
5) Listed files on the fs FileSystem using listFiles to confirm the JARs are there.
This works and the JARs are loaded to HDFS, but I cannot add the JARs to the Hive session.
When I try to add one in the Hive shell, I do the following:
statement = setStmt(createStatement(getConnection()));
query = "add jar " + path;
statement.execute(query);
I get this error [for an example path of /tmp/myjar.jar]:
Error while processing statement: /tmp/myjar.jar does not exist
Other permutations of the path, such as
query = "add jar hdfs://<host>:<port>" + path;
query = "add jar <host>:<port>" + path;
also result in an error.
The command to list JARs works (but returns no results):
query = "list jars";
ResultSet rs = statement.executeQuery(query);
I managed to solve this issue.
The process failed because of the configuration of the FileSystem.
This object is what we upload the JARs to before adding them to the session.
This is how you initialize the FileSystem:
FileSystem fs = FileSystem.newInstance(conf);
The conf object should have the properties of the Hive server.
In order for the process to work, I needed to set the following parameter on the Configuration object:
conf.set("fs.defaultFS", hdfsDstStr);
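Putting the upload steps and this fix together, here is a minimal sketch; the host, port, and paths are placeholders, and the exact ADD JAR path that Hive accepts may depend on your setup:

import java.sql.Statement;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Uploads a local JAR to HDFS and registers it for the current Hive session.
static void uploadAndAddJar(Statement statement, String localJar, String hdfsDir) throws Exception {
    Configuration conf = new Configuration();
    // The key part of the fix: point the FileSystem at the cluster HDFS.
    conf.set("fs.defaultFS", "hdfs://<host>:<port>");
    FileSystem fs = FileSystem.newInstance(conf);

    // Equivalent of: hdfs dfs -put myjar.jar /tmp
    Path target = new Path(hdfsDir, new Path(localJar).getName());
    fs.copyFromLocalFile(new Path(localJar), target);
    fs.setPermission(target, new FsPermission("755"));

    // Register it for this Hive session over JDBC.
    statement.execute("add jar " + target.toUri().getPath());
}

// e.g. uploadAndAddJar(statement, "/local/path/myjar.jar", "/tmp");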

java.io.FileNotFoundException by using play dist

The code below works fine when I start the project with play start:
object LogFile {
  implicit val formats = DefaultFormats
  private var fileInput = new FileInputStream("./conf/log4j.properties")
  private val properties = new Properties
  properties.load(fileInput)

  def test(head: String, data: String) {
    System.setProperty("my.log", "scala.txt")
    PropertyConfigurator.configure(properties)
    val log = Logger.getLogger(head)
    log.error(data)
  }
}
but when I use sudo /home/ubuntu/play/play dist
and run the resulting package, I get:
[error] play - Cannot invoke the action, eventually got an error:
java.io.FileNotFoundException: ./conf/log4j.properties (No such file or directory)
What am I doing wrong?
I am using Scala 2.10 with Play Framework 2.2.
You're missing the Log4j Properties file
./conf/log4j.properties
You're probably missing the file:
/home/ubuntu/project/conf/log4j.properties
The sudo command changes the user that you are executing as, so the new user possibly has different environment variables.
(Note: project is the application name.)
Also, you're using a relative path, ./conf/log4j.properties, which is resolved at runtime against the working directory you are executing in.
Possible solutions:
1) Don't use a relative path; use an absolute path instead.
2) Change the home directory in the profile of the user that you are executing your application as (the root user?).
3) Copy the missing file to the directory where your application is looking for it.

Start a java application from Hadoop YARN

I'm trying to run a Java application from a YARN application (in detail: from the ApplicationMaster in the YARN app). All examples I found deal with bash scripts that are run.
My problem seems to be that I distribute the JAR file wrongly to the nodes in my cluster. I specify the JAR as a local resource in the YARN client.
Path jarPath2 = new Path("/hdfs/yarn1/08_PrimeCalculator.jar");
jarPath2 = fs.makeQualified(jarPath2);
FileStatus jarStat2 = null;
try {
    jarStat2 = fs.getFileStatus(jarPath2);
    log.log(Level.INFO, "JAR path in HDFS is " + jarStat2.getPath());
} catch (IOException e) {
    e.printStackTrace();
}
LocalResource packageResource = Records.newRecord(LocalResource.class);
packageResource.setResource(ConverterUtils.getYarnUrlFromPath(jarPath2));
packageResource.setSize(jarStat2.getLen());
packageResource.setTimestamp(jarStat2.getModificationTime());
packageResource.setType(LocalResourceType.ARCHIVE);
packageResource.setVisibility(LocalResourceVisibility.PUBLIC);
Map<String, LocalResource> res = new HashMap<String, LocalResource>();
res.put("package", packageResource);
So my JAR is supposed to be distributed to the ApplicationMaster and be unpacked since I specify the ResourceType to be an ARCHIVE. On the AM I try to call a class from the JAR like this:
String command = "java -cp './package/*' de.jofre.prime.PrimeCalculator";
When running the application, the Hadoop logs tell me: "Could not find or load main class de.jofre.prime.PrimeCalculator". The class exists at exactly the path shown in the error message.
Any ideas what I am doing wrong here?
I found out how to start a Java process from an ApplicationMaster. In fact, my problem was with the command used to start the process, even though it is the officially documented way provided by the Apache Hadoop project.
What I did instead was to specify the packageResource to be a file, not an archive:
packageResource.setType(LocalResourceType.FILE);
Now the NodeManager does not extract the resource but leaves it as a file, in my case a JAR.
To start the process I call:
java -jar primecalculator.jar
To start a JAR without specifying a main class on the command line, you have to specify the main class in the MANIFEST file (manually, or let Maven do it for you).
To sum it up: I did NOT add the resource as an archive but as a file, and I did not use -cp to add the symlink folder that Hadoop creates for an extracted archive. I simply started the JAR via the -jar parameter, and that's it.
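As a rough sketch of how the pieces fit together on the client side, reusing packageResource from the snippet above; the resource name "package" and the log redirects are assumptions, not the exact original code:

// Localize the JAR as a plain FILE; YARN places it in the container's working
// directory under the map key, here "./package".
packageResource.setType(LocalResourceType.FILE);
Map<String, LocalResource> res = new HashMap<String, LocalResource>();
res.put("package", packageResource);

// Launch the JAR via -jar; the main class is taken from its MANIFEST.
ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
ctx.setLocalResources(res);
ctx.setCommands(Collections.singletonList(
        "java -jar ./package"
        + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
        + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));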
Hope it helps you guys!

Deployment in tomcat

I am running into a problem.
I have deployed a WAR file. When I run it locally through Tomcat it works fine, but when I run it on another system by giving my system IP and then the project folder, e.g.
http://192.168.0.145/DllTest, it loads the applet; but when I click a button to load the functionality it throws an exception:
Exception in thread "AWT-EventQueue-3" java.lang.UnsatisfiedLinkError: Expecting an absolute path of the library: http:\192.168.0.145:8080\DllTest\lib\jinvoke.dll
It works fine locally but not on another system. Please tell me what the problem is.
Is it a rights issue or something else?
You cannot load a DLL from an external host. It has to be an absolute disk file system path, as the exception message already hints. Your best bet is to download it manually to a temp file and load that instead.
File dllFile = File.createTempFile("jinvoke", ".dll");
InputStream input = new URL(getCodeBase(), "lib/jinvoke.dll").openStream();
OutputStream output = new FileOutputStream(dllFile);
// Write input to output and close the streams.
byte[] buffer = new byte[8192];
for (int length; (length = input.read(buffer)) > 0;) {
    output.write(buffer, 0, length);
}
output.close();
input.close();
// Then load it using the absolute disk file system path.
System.load(dllFile.getAbsolutePath());
dllFile.deleteOnExit();
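Note: System.load accepts an absolute file path, while System.loadLibrary expects a bare library name resolved against java.library.path, which is why load is used for the extracted temp file.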
