How do I set my EMR Classpath - java

I am running a job on an AWS EMR cluster, and am having issues with a Jackson library conflict. Based on the article here I tried to add a bootstrap step to set my classpath with the following script:
#!/bin/bash
export HADOOP_USER_CLASSPATH_FIRST=true;
echo "HADOOP_CLASSPATH=s3n://bucket/myjar.jar" > /home/hadoop/conf/hadoop-user-env.sh
I have built my jar so that all its dependencies are included in it. The first problem when I do this is that my "enable debugging" step dies with the following error:
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2427)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2440)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2479)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2461)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
at com.amazon.elasticmapreduce.scriptrunner.ScriptRunner.fetchFile(ScriptRunner.java:39)
at com.amazon.elasticmapreduce.scriptrunner.ScriptRunner.main(ScriptRunner.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
... 13 more
So I have two questions. First, what is wrong with this setup that breaks the enable debugging step? Second, is it valid to give my classpath as an S3 location? If not, what should the value of:
/path/to/my.jar
be in the example on the page indicated above?

Looking at your bootstrap action, it looks like there is a mistake in the redirection. The script should look like the following:
#!/bin/bash
export HADOOP_USER_CLASSPATH_FIRST=true
echo "HADOOP_CLASSPATH=/path/to/my.jar" >> /home/hadoop/conf/hadoop-user-env.sh
Note the '>>' characters. A single '>' means that you're replacing the entire file with the output of the 'echo' command, whereas a double '>>' means you're appending that line to the end of the file. Additionally, the trailing semicolon isn't needed in a Bash script.
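To see the difference between the two redirections outside the shell, here is a minimal Java sketch. It is only an illustration: the class name AppendVsOverwrite and the temp file are made up for the example, and StandardOpenOption.TRUNCATE_EXISTING and StandardOpenOption.APPEND stand in for '>' and '>>' respectively.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

// Hypothetical demo class; not part of EMR or Hadoop.
final class AppendVsOverwrite {

    // Rough equivalent of `echo line > file`: existing content is discarded.
    static void overwrite(Path file, String line) throws IOException {
        Files.write(file, Collections.singletonList(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    // Rough equivalent of `echo line >> file`: existing content is kept.
    static void append(Path file, String line) throws IOException {
        Files.write(file, Collections.singletonList(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // A throwaway file standing in for hadoop-user-env.sh.
        Path env = Files.createTempFile("hadoop-user-env", ".sh");
        overwrite(env, "# settings that were already in the file");

        append(env, "HADOOP_CLASSPATH=/path/to/my.jar");
        System.out.println("after >> : " + Files.readAllLines(env, StandardCharsets.UTF_8));

        overwrite(env, "HADOOP_CLASSPATH=/path/to/my.jar");
        System.out.println("after >  : " + Files.readAllLines(env, StandardCharsets.UTF_8));

        Files.delete(env);
    }
}
```

After the append the file holds both lines; after the overwrite only the HADOOP_CLASSPATH line is left, which is what the single '>' in the question does to hadoop-user-env.sh.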
References : http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-config_hadoop-user-env.sh.html
PS : Amazon's awesome support found this question and replied to my email; although this question was not asked by me. So this is the attribution to the author - AWS Support Engineer named Rendy O.

Failed to use ESAPI 2.1.0.1: AccessController class (org.owasp.esapi.reference.DefaultAccessController) CTOR threw exception

I am encountering a problem using esapi-2.1.0.1.jar.
My development environment:
JDK 1.8
Eclipse 2018-12
ESAPI works fine in method A, but fails in method B.
package com;
import org.owasp.esapi.ESAPI;
public class MainEntryPoint {
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        System.out.println("A");
        System.out.println(ESAPI.encoder().encodeForHTML("<li>ABC some html here"));
        System.out.println("A end");
        System.out.println("B");
        System.out.println(ESAPI.accessController().toString());
        System.out.println("B end");
    }
}
console:
A
System property [org.owasp.esapi.opsteam] is not set
System property [org.owasp.esapi.devteam] is not set
Attempting to load ESAPI.properties via file I/O.
Attempting to load ESAPI.properties as resource file via file I/O.
Found in 'org.owasp.esapi.resources' directory: C:\Users\Ansticelee\.esapi\2101\ESAPI.properties
Loaded 'ESAPI.properties' properties file
SecurityConfiguration for Validator.ConfigurationFile.MultiValued not found in ESAPI.properties. Using default: false
Attempting to load validation.properties via file I/O.
Attempting to load validation.properties as resource file via file I/O.
Found in 'org.owasp.esapi.resources' directory: C:\Users\Ansticelee\.esapi\2101\validation.properties
Loaded 'validation.properties' properties file
<li>ABC some html here
A end
B
Exception in thread "main" org.owasp.esapi.errors.ConfigurationException: java.lang.reflect.InvocationTargetException AccessController class (org.owasp.esapi.reference.DefaultAccessController) CTOR threw exception.
at org.owasp.esapi.util.ObjFactory.make(ObjFactory.java:129)
at org.owasp.esapi.ESAPI.accessController(ESAPI.java:85)
at com.MainEntryPoint.main(MainEntryPoint.java:20)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.owasp.esapi.util.ObjFactory.make(ObjFactory.java:86)
... 2 more
Caused by: java.lang.NoClassDefFoundError: org/apache/commons/configuration/ConfigurationException
at org.owasp.esapi.reference.DefaultAccessController.<init>(DefaultAccessController.java:32)
at org.owasp.esapi.reference.DefaultAccessController.getInstance(DefaultAccessController.java:22)
... 7 more
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.ConfigurationException
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 9 more
What I did:
1. Created a new project in Eclipse (Dynamic Web Project).
2. Put "esapi-2.1.0.1.jar" and "log4j-1.2.12.jar" into the folder WebContent/WEB-INF/lib.
3. Put "ESAPI.properties" and "validation.properties" into the folder C:\Users\Ansticelee\.esapi\2101.
4. Set the VM arguments:
-Dorg.owasp.esapi.resources="C:\Users\Ansticelee\.esapi\2101"
5. Right-clicked MainEntryPoint.java > Run As > Java Application.
(Reminder: MainEntryPoint.java is my Java file.)
Details of step 4: in Eclipse, right-click the project > Run As > Run Configurations... > Java Application > esapi2101 > Arguments > VM arguments.
Details of steps 2 and 3:
I downloaded esapi-2.1.0.1.jar from https://repo1.maven.org/maven2/org/owasp/esapi/esapi/2.1.0.1/
I downloaded log4j-1.2.12.jar from https://mvnrepository.com/artifact/log4j/log4j/1.2.12
I downloaded the 2 properties files from https://github.com/ESAPI/esapi-java-legacy/tree/2.1/configuration/.esapi
Details of method B:
I saw it on page 16 (step 5) of https://owasp.org/www-pdf-archive/JavaEE-ESAPI_2.0a_install.pdf
This is my first time posting a question. Thanks for your patience in reading.
How can I make method B above work?
The root cause of this is that Eclipse can't find the class
org.apache.commons.configuration.ConfigurationException
as evidenced by this ClassNotFoundException:
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.ConfigurationException
That means that you are missing whatever Apache Commons jar contains that class. (There are several Apache Commons libraries that ESAPI uses, so I can't be more specific without doing extra research and I'm in a hurry at the moment.) Given that you mentioned that you explicitly added the Log4J jar (in step 2 you mention 'put "esapi-2.1.0.1.jar" & "log4j-1.2.12.jar" into folder WebContent/WEB-INF/lib'), it appears that you don't have your Eclipse project configured as a Maven or Gradle project. Reconfigure your Eclipse project as a Maven or Gradle project and those transitive dependency jars should get downloaded and pulled in for you automatically. If you don't take that approach, you are going to keep running into similar problems.
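One quick way to confirm which classes are missing before launching the real program is to probe the classpath directly. The following is a minimal sketch, not something from ESAPI itself: ClasspathCheck and missingClasses are hypothetical names, and the list of class names is just the two that appear in the question. The missing class sits in the org.apache.commons.configuration package, which is the package used by Apache Commons Configuration 1.x.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of ESAPI.
final class ClasspathCheck {

    // Returns the names from the input that the current class loader cannot load.
    static List<String> missingClasses(List<String> classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Class names taken from the question's code and stack trace.
        List<String> required = Arrays.asList(
                "org.owasp.esapi.ESAPI",
                "org.apache.commons.configuration.ConfigurationException");
        List<String> missing = missingClasses(required);
        if (missing.isEmpty()) {
            System.out.println("All required classes are on the classpath.");
        } else {
            for (String name : missing) {
                System.out.println("Missing from classpath: " + name);
            }
        }
    }
}
```

Run it from the same Eclipse run configuration as MainEntryPoint so that it sees the same classpath.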
One last important point... please use ESAPI 2.5.1.0, which is the latest version as of this posting. Because of the small size of the ESAPI team, we can only actively address issues that arise in the latest version at the time the question is posted. (In this case, I don't think your problem has anything to do with what version of ESAPI you are using, which is why I responded.) Also, if you plan to use ESAPI for anything serious, you need to use a version that doesn't have any known vulnerabilities. Note that 2.1.0.1 has 2 known CVEs; see https://mvnrepository.com/artifact/org.owasp.esapi/esapi for details and available versions.
Hope this helps.

How to run/implement a benchmark for OptaPlanner?

I need assistance with benchmarking using OptaPlanner. There are two issues I am running into. The first is seeing the results from benchmarking the provided examples. I see that the vehiclerouting example has two apps. One of them is VehicleRoutingBenchmarkApp. I ran this application and thought that the index.html would be generated however it wasn't. So I am not clear on how to do this.
The second issue is implementation. I am just experimenting, so I added the code snippet as instructed by the documentation in the VehicleRoutingApp.main() so that I can see what will happen.
Documentation link here.
System.out.println("-------------- benchmark stuff --------------");
PlannerBenchmarkFactory plannerBenchmarkFactory = PlannerBenchmarkFactory.createFromXmlResource(
"org/optaplanner/examples/vehiclerouting/benchmark/vehicleRoutingBenchmarkConfig.xml");
PlannerBenchmark plannerBenchmark = plannerBenchmarkFactory.buildPlannerBenchmark();
plannerBenchmark.benchmark();
The result of this was a constant stream of log output. It is difficult to follow what is being conveyed. How do I get all of that translated into a nice GUI, as I believe index.html is supposed to do? Thanks in advance.
The stack trace I am getting is the following:
Exception in thread "main" java.lang.IllegalStateException: The directory dataDir (C:\Intellij\Workspace\optaplanner-developer\data\vehiclerouting) does not exist.
Either the working directory should be set to the directory that contains the data directory (which is not the data directory itself), or the system property org.optaplanner.examples.dataDir should be set properly.
The data directory is different in a git clone (optaplanner/optaplanner-examples/data) and in a release zip (examples/sources/data).
In an IDE (IntelliJ, Eclipse, NetBeans), open the "Run configuration" to change "Working directory" (or add the system property in "VM options").
at org.optaplanner.examples.common.persistence.AbstractSolutionDao.<init>(AbstractSolutionDao.java:46)
at org.optaplanner.examples.common.persistence.XStreamSolutionDao.<init>(XStreamSolutionDao.java:32)
at org.optaplanner.examples.vehiclerouting.persistence.VehicleRoutingDao.<init>(VehicleRoutingDao.java:25)
at org.optaplanner.examples.vehiclerouting.persistence.VehicleRoutingImporter.<init>(VehicleRoutingImporter.java:57)
at org.optaplanner.examples.vehiclerouting.persistence.VehicleRoutingFileIO.<init>(VehicleRoutingFileIO.java:28)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at org.optaplanner.core.config.util.ConfigUtils.newInstance(ConfigUtils.java:46)
at org.optaplanner.benchmark.config.ProblemBenchmarksConfig.buildSolutionFileIO(ProblemBenchmarksConfig.java:149)
at org.optaplanner.benchmark.config.ProblemBenchmarksConfig.buildProblemBenchmarkList(ProblemBenchmarksConfig.java:111)
at org.optaplanner.benchmark.config.SolverBenchmarkConfig.buildSolverBenchmark(SolverBenchmarkConfig.java:88)
at org.optaplanner.benchmark.config.PlannerBenchmarkConfig.buildPlannerBenchmark(PlannerBenchmarkConfig.java:210)
at org.optaplanner.benchmark.impl.XStreamXmlPlannerBenchmarkFactory.buildPlannerBenchmark(XStreamXmlPlannerBenchmarkFactory.java:156)
at org.optaplanner.examples.common.app.CommonBenchmarkApp.buildAndBenchmark(CommonBenchmarkApp.java:68)
at org.optaplanner.examples.vehiclerouting.app.VehicleRoutingBenchmarkApp.main(VehicleRoutingBenchmarkApp.java:24)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
For those trying to learn OptaPlanner who have the same question, the solution is in your referenced benchmarkConfig.xml file. You should create an .xml file that has a root <plannerBenchmark> tag. Inside of this you can add another tag called <benchmarkDirectory>. This is where you specify where you would like your report to be generated. Be sure to set your <inputSolutionFile> to point to the proper datasets, which can be either .xml or .vrp. The rest works like magic.
It should look something like the following:
<plannerBenchmark>
<benchmarkDirectory>local/data/report/vehiclerouting</benchmarkDirectory>
.....
<inputSolutionFile>data/vehiclerouting/unsolved/TestCase_1.xml</inputSolutionFile>
.....
</plannerBenchmark>
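Regarding the IllegalStateException in the question: the message itself says the examples need either the right working directory or the org.optaplanner.examples.dataDir system property. A small stand-alone check like the one below can tell you which directory will be looked up before you start a benchmark. It is only a sketch: DataDirCheck and resolveDataDir are hypothetical names, and the fallback to a "data" folder under the working directory is an assumption based on the wording of the error message, not on the OptaPlanner source.

```java
import java.io.File;

// Hypothetical helper; not part of OptaPlanner.
final class DataDirCheck {

    // Property name copied from the exception message in the question.
    static final String DATA_DIR_PROPERTY = "org.optaplanner.examples.dataDir";

    // Assumption: when the property is unset, "data" under the working directory is used.
    static File resolveDataDir(String propertyValue, String workingDir) {
        if (propertyValue != null && !propertyValue.isEmpty()) {
            return new File(propertyValue);
        }
        return new File(workingDir, "data");
    }

    public static void main(String[] args) {
        String workingDir = System.getProperty("user.dir");
        File dataDir = resolveDataDir(System.getProperty(DATA_DIR_PROPERTY), workingDir);
        // "vehiclerouting" is the example's data subdirectory named in the exception.
        File exampleDir = new File(dataDir, "vehiclerouting");

        System.out.println("Working directory : " + workingDir);
        System.out.println("Expected data dir : " + exampleDir.getAbsolutePath());
        System.out.println(exampleDir.isDirectory()
                ? "Data directory found."
                : "Data directory missing: change the working directory or set -D" + DATA_DIR_PROPERTY);
    }
}
```

Run it with the same working directory and VM options as the benchmark run configuration in your IDE.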

cobertura-instrument.sh fails to instrument jar file with java.lang.NoClassDefFoundError: net.sourceforge.cobertura.instrument.InstrumentMain

I'm trying to instrument a jar file (from the Spacewalk project) so I can measure the code coverage of my testing, but it is failing:
# /opt/cobertura-2.1.1/cobertura-instrument.sh --datafile /tmp/out /usr/share/rhn/lib/rhn.jar
Exception in thread "main" java.lang.NoClassDefFoundError: net.sourceforge.cobertura.instrument.InstrumentMain
Caused by: java.lang.ClassNotFoundException: net.sourceforge.cobertura.instrument.InstrumentMain
at java.net.URLClassLoader.findClass(URLClassLoader.java:432)
at java.lang.ClassLoader.loadClass(ClassLoader.java:676)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
at java.lang.ClassLoader.loadClass(ClassLoader.java:642)
Could not find the main class: net.sourceforge.cobertura.instrument.InstrumentMain. Program will exit.
I have also tried to provide one random class from that jar (ideally I want to instrument all of them), but with the same result:
# jar tf rhn.jar | tail
org/cobbler/CobblerConnection.class
[...]
# /opt/cobertura-2.1.1/cobertura-instrument.sh --datafile /tmp/out /usr/share/rhn/lib/rhn.jar org.cobbler.CobblerConnection
I'm pretty sure I'm just missing what it is trying to tell me.
I'm using cobertura-2.1.1 downloaded from SourceForge and extracted into /opt, running on Red Hat Enterprise Linux 6.
OK, this was simple:
# dos2unix /opt/cobertura-2.1.1/cobertura-instrument.sh
Also, the script is missing a bash shebang (#!/bin/bash), so you might need to add it to the beginning of the file (I do not know why it worked for me even without that).
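The fact that dos2unix fixed it suggests the script had Windows (CRLF) line endings, so the shell saw a stray carriage return at the end of each line. If dos2unix isn't available, a small program can at least confirm that diagnosis. This is a sketch with hypothetical names (CrlfCheck, countCarriageReturns); it only counts carriage-return bytes and does not modify the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; not part of Cobertura.
final class CrlfCheck {

    // Counts carriage-return bytes (0x0D); a Unix-style script should contain none.
    static int countCarriageReturns(byte[] content) {
        int count = 0;
        for (byte b : content) {
            if (b == '\r') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CrlfCheck <path-to-script>");
            return;
        }
        byte[] content = Files.readAllBytes(Paths.get(args[0]));
        int carriageReturns = countCarriageReturns(content);
        System.out.println(carriageReturns == 0
                ? "No carriage returns found; line endings look Unix-style."
                : carriageReturns + " carriage returns found; the file likely has CRLF line endings.");
    }
}
```

Pointing it at /opt/cobertura-2.1.1/cobertura-instrument.sh before and after running dos2unix should show the count dropping to zero.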

ZMQ - libzmq.so.3: cannot open shared object file: No such file or directory

I'm trying to embed ZeroMQ in my app. I followed this guide to install ZMQ, and up to that point everything works fine.
I have this line of code in my app:
ZMQ.Context m_context = ZMQ.context(1);
but this line raises the following exception:
Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/libjzmq-812339378390536247.lib: libzmq.so.3: cannot open shared object file: No such file or directory
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1939)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1864)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1825)
at java.lang.Runtime.load0(Runtime.java:792)
at java.lang.System.load(System.java:1059)
at org.zeromq.EmbeddedLibraryTools.loadEmbeddedLibrary(EmbeddedLibraryTools.java:136)
at org.zeromq.EmbeddedLibraryTools.<clinit>(EmbeddedLibraryTools.java:22)
at org.zeromq.ZMQ.<clinit>(ZMQ.java:38)
at com.castaclip.verticals.Messenger.<init>(Messenger.java:125)
at com.castaclip.verticals.PushMessenger.<init>(PushMessenger.java:30)
at com.castaclip.verticals.pushserver.App.setup(App.java:60)
at com.castaclip.verticals.pushserver.App.main(App.java:41)
The error points exactly to this line.
P.S.: It's a little difficult to fully explain this question. If you have any questions, please let me know. Thanks.
If you've successfully built libzmq and jzmq in that order, I would run:
$ sudo ldconfig
to update the system library cache. Then I would check to see if LD_LIBRARY_PATH is defined like Raffian mentioned, or set your library path explicitly to something like:
$ java -Djava.library.path=/usr/lib:/usr/local/lib
I finally figured out the problem.
I was using zeromq-2.1.10, and this was part of the problem.
So I installed zeromq-3.2.3 from source and the problem was resolved.
I encountered a mystifying instance of this message when I ran:
# java -Djava.library.path=/usr/hf/zmq/lib/ -cp '/usr/hf/lib/*:.' com.zmqtest.MA
Exception in thread "main"
java.lang.UnsatisfiedLinkError: /usr/hf/zmq/lib/libjzmq.so:
libzmq.so.3: cannot open shared object file: No such file or directory
which was fixed with a solution that makes no sense at all to me:
# LD_LIBRARY_PATH=/usr/hf/zmq/lib/ java -Djava.library.path=/usr/hf/zmq/lib/ -cp '/usr/hf/lib/*:.' com.zmqtest.MA
Weird.
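A possible explanation for why LD_LIBRARY_PATH made the difference: java.library.path only tells the JVM where to find the library it loads itself (libjzmq.so here). libzmq.so.3 is a dependency of that library, and it is resolved by the operating system's dynamic linker, which consults LD_LIBRARY_PATH and the ldconfig cache but knows nothing about java.library.path. The sketch below (hypothetical class NativeLibSearch, file names taken from the error messages above) prints which directories on each path actually contain the two libraries.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; not part of jzmq.
final class NativeLibSearch {

    // Returns the directories in a path list (separated by File.pathSeparator)
    // that contain a regular file with the given name.
    static List<File> directoriesContaining(String pathList, String fileName) {
        List<File> hits = new ArrayList<>();
        if (pathList == null || pathList.isEmpty()) {
            return hits;
        }
        for (String dir : pathList.split(Pattern.quote(File.pathSeparator))) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(new File(dir));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Searched by the JVM when it loads the JNI binding.
        String javaLibraryPath = System.getProperty("java.library.path");
        // Searched by the dynamic linker when it resolves the binding's own dependencies.
        String ldLibraryPath = System.getenv("LD_LIBRARY_PATH");

        System.out.println("libjzmq.so on java.library.path: "
                + directoriesContaining(javaLibraryPath, "libjzmq.so"));
        System.out.println("libzmq.so.3 on LD_LIBRARY_PATH : "
                + directoriesContaining(ldLibraryPath, "libzmq.so.3"));
    }
}
```

An empty second list does not by itself mean the load will fail, since the dynamic linker also checks the ldconfig cache and its default directories, which is why running sudo ldconfig after installing libzmq can fix the same error.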

bin/hive giving issue with the errors

I have installed Hive from source by running ant package.
As described in the cwiki.apache.org documentation, I have also set $HIVE_HOME and added it to $PATH, but running the command from the base directory (bin/hive or hive)
gives the following error.
I applied the patch (HIVE-3606.1.patch) to resolve it, but it's still not working.
Command to apply the patch:
hive-0.10.0-bin]$ patch -p0 < ~/Downloads/HIVE-3606.1.patch
To run Hive:
hive-0.10.0-bin]$ bin/hive
Exception in thread "main" java.lang.NoSuchFieldError: ALLOW_UNQUOTED_CONTROL_CHARS
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFJSONTuple.<clinit>(GenericUDTFJSONTuple.java:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerGenericUDTF(FunctionRegistry.java:545)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerGenericUDTF(FunctionRegistry.java:539)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:472)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:202)
at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:86)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:635)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Can anyone help here?
It's most probably because your Hadoop uses a different (older) version of the Jackson libraries than Hive. As a quick workaround, you can replace jackson-core-asl-X.X.X.jar and jackson-mapper-asl-X.X.X.jar in $HADOOP_HOME/lib with the newer ones from $HIVE_HOME/lib.
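To check which Jackson jar actually wins on the classpath, you can ask the JVM where it loaded a class from. A minimal sketch, with some assumptions: WhichJar and locationOf are hypothetical names, and org.codehaus.jackson.JsonParser is used as the default on the assumption that it is the Jackson 1.x class whose Feature enum is expected to define ALLOW_UNQUOTED_CONTROL_CHARS.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper; not part of Hive or Hadoop.
final class WhichJar {

    // Returns the jar or directory a class was loaded from,
    // or null when the JVM does not record one (e.g. for JDK core classes).
    static URL locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) {
        // Assumed default: the Jackson 1.x parser class from jackson-core-asl.
        String name = args.length > 0 ? args[0] : "org.codehaus.jackson.JsonParser";
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class<?> type = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " loaded from " + locationOf(type));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath that bin/hive ends up using; if the printed location is under $HADOOP_HOME/lib, the older Hadoop copy is shadowing Hive's.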
It's because you are working with an old version of Hadoop.
If you have Hadoop, it's better to compile the source code yourself with the following commands for an old version of Hadoop:
$ svn co http://svn.apache.org/repos/asf/hive/trunk hive
$ cd hive
$ mvn clean install -Phadoop-2,dist
Check this link for more info: https://cwiki.apache.org/confluence/display/Hive/GettingStarted
Then, rename the jackson* files in $HADOOP_HOME/lib by adding an .old suffix to them (it's good practice not to delete them, as we may want them in the future):
$ mv jackson-core-asl-1.0.1.jar jackson-core-asl-1.0.1.jar.old
$ mv jackson-mapper-asl-1.0.1.jar jackson-mapper-asl-1.0.1.jar.old
You can find the new jackson compiled files somewhere around Hive's packaging folder, mine is in:
packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT/bin/hcatalog/share/webhcat/svr/lib
If you can't find them, that's OK. Use the following command in your Hive directory:
$ find ./ -iname "*jackson*"
It will show you all the jackson* files that it can find. Then go to that specific folder that contains them and copy all of them to the $HADOOP_HOME/lib (currently we may just need the "jackson-core-*" but we copy all for future use):
$ cp jackson* $HADOOP_HOME/lib
Ask if you have more enquiries.
