I am trying to test if I can jar 3rd party dependencies into my jar, SSH the jar over to a remote machine, and then run a map reduce job.
Process:
With my project, I run mvn clean package and that produces the files my-appy-1.0-SNAPSHOT.jar and original-my-appy-1.0-SNAPSHOT.jar. I scp the first file over to my remote machine and run the command:
hadoop jar hadoop my-appy-1.0-SNAPSHOT.jar /user/bli1/wordcount/input /user/bli1/wordcount/output
I also tried:
hadoop my-appy-1.0-SNAPSHOT.jar WordCount /user/bli1/wordcount/input /user/bli1/wordcount/output
I'm not sure why I am getting this error:
Error: Could not find or load main class my-appy-1.0-SNAPSHOT.jar
pom.xml:
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.2</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
<artifactSet>
<excludes>
<exclude>org.hamcrest:*</exclude>
<exclude>org.mockito:*</exclude>
<exclude>org.objenesis:*</exclude>
</excludes>
</artifactSet>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/LICENSE</exclude>
<exclude>META-INF/license</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<manifestEntries>
<Main-Class>com.mycompany.app.WordCount</Main-Class>
<Build-Number>1</Build-Number>
</manifestEntries>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
Open your *.jar file and check its META-INF/MANIFEST.MF file. It should contain a "Main-Class" entry naming the class that has the "public static void main(String[] args)" method. If there is no "Main-Class" entry, add it yourself.
This link explains how to do that: https://docs.oracle.com/javase/tutorial/deployment/jar/appman.html
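One way to check the manifest without unpacking the jar by hand is a small Java helper along these lines. This is only a sketch: the class name ManifestCheck is made up for illustration, and it relies solely on the standard java.util.jar API.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the original project.
public class ManifestCheck {

    /** Returns the Main-Class manifest attribute of the jar, or null if it has none. */
    static String mainClassOf(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Manifest manifest = jarFile.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the jar to inspect, e.g. target/my-appy-1.0-SNAPSHOT.jar
        String mainClass = mainClassOf(Paths.get(args[0]));
        System.out.println(mainClass == null ? "No Main-Class entry" : "Main-Class: " + mainClass);
    }
}
```

Running it against the shaded jar from the question should show whether the ManifestResourceTransformer entry actually made it into the final artifact.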
You need to give the fully qualified name of the main class, including the package name, and use the jar subcommand:
hadoop jar my-appy-1.0-SNAPSHOT.jar com.mycompany.app.WordCount /user/bli1/wordcount/input /user/bli1/wordcount/output
The general form of the command for running a job from a jar file is:
hadoop jar jar_filename.jar package_name.class_name HDFS_inputfile_name HDFS_output_directory
Your command should be
hadoop jar my-appy-1.0-SNAPSHOT.jar com.mycompany.app.WordCount /user/bli1/wordcount/input /user/bli1/wordcount/output
I am migrating from Dataflow SDK 1.x to 2.x, which uses Apache Beam. The code runs properly when I run it in Eclipse using Run As > Java Application. But when I build it using mvn clean compile assembly:single and then run the class using the command java -cp
I get these logs followed by an exception
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files
from the classpath: will stage 1 files. Enable logging at DEBUG level to see
which files will be staged.
Apr 25, 2018 11:17:01 AM com.example.LogParser parse
INFO: Currently processing file: File1.gz
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;Ljava/lang/Object;)V
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$Write.expand(BigQueryIO.java:1512)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$Write.expand(BigQueryIO.java:1041)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
    at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:472)
    at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:286)
    at com.example.LogParser.parse(LogParser.java:541)
    at com.example.LogParser.main(LogParser.java:167)
I tried updating the Guava version, but it didn't help.
I also noticed that when I run my class from Eclipse, I get this log line, which differs from the one I get when running from the jar file:
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files
from the classpath: will stage 147 files. Enable logging at DEBUG level to
see which files will be staged.
I was able to solve the error by replacing maven-assembly-plugin with the following Maven plugins and then building with mvn package instead of mvn assembly:single:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${maven-compiler-plugin.version}</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>${maven-surefire-plugin.version}</version>
<configuration>
<parallel>all</parallel>
<threadCount>4</threadCount>
<redirectTestOutputToFile>true</redirectTestOutputToFile>
</configuration>
<dependencies>
<dependency>
<groupId>org.apache.maven.surefire</groupId>
<artifactId>surefire-junit47</artifactId>
<version>${maven-surefire-plugin.version}</version>
</dependency>
</dependencies>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>${maven-jar-plugin.version}</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>${maven-shade-plugin.version}</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<finalName>${project.artifactId}-bundled-${project.version}</finalName>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/LICENSE</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
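A NoSuchMethodError on Preconditions.checkArgument usually suggests that a different Guava version was picked up at runtime than the one Beam was compiled against; that reading is an assumption, not something confirmed in this thread. A small diagnostic like the sketch below can show which jar a class is actually loaded from. The class name WhichJar is hypothetical.

```java
import java.security.CodeSource;

// Hypothetical diagnostic, not part of the original project.
public class WhichJar {

    /** Returns the location a class was loaded from, or a note for JDK/bootstrap classes. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap / JDK runtime)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. com.google.common.base.Preconditions
        Class<?> type = Class.forName(args[0]);
        System.out.println(type.getName() + " loaded from " + locationOf(type));
    }
}
```

Run with the same classpath as the failing command, it reports the jar that supplies the class, which helps tell a stale dependency apart from a packaging problem.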
Using Maven, I built my project into a JAR that includes all the dependencies except for one big dependency. The dependencies are included using:
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.2.1</version>
<configuration>
<archive>
<manifest>
<mainClass>com.mypackage.Main</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
Exclusion of dependencies is done with <scope>provided</scope>
The target myjar.jar is in the same folder as BigExternalJar.jar, but when I try to run:
java -cp ".:BigExternalJar.jar:myjar.jar" -jar myjar.jar
I get an exception for missing classes (those classes are from BigExternalJar.jar).
How can one pack dependencies into a JAR, using Maven only, but still be able to add additional JARs in classpath? Note that the BigExternalJar is not always in the same folder so I cannot add it manually to the MANIFEST file.
There are two similar questions that might look duplicate but they do not have an answer to this situation.
Eclipse: How to build an executable jar with external jar? AND
Running a executable JAR with external dependencies
The classpath argument is ignored if you use the -jar option. Only the classpath provided in the manifest is used.
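In practice this usually means dropping -jar and naming the main class explicitly, for example java -cp "BigExternalJar.jar:myjar.jar" com.mypackage.Main, so that the -cp value is honored. To see which classpath the JVM really ended up with, a tiny diagnostic such as the sketch below can be bundled into the jar. The class name ClasspathDump is made up for illustration.

```java
import java.io.File;

// Hypothetical diagnostic, not part of the original project.
public class ClasspathDump {

    /** Returns the classpath the running JVM was started with. */
    static String effectiveClasspath() {
        return System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        // Print one classpath entry per line so missing jars are easy to spot.
        for (String entry : effectiveClasspath().split(File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}
```

Comparing the output of a -jar launch with a -cp launch should make the difference between the two modes visible.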
<build>
<plugins>
<!-- compiler plugin: sets the JDK version -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.7</source>
<target>1.7</target>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>xxx.xxx.yourmain</mainClass>
</transformer>
<transformer
implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/spring.handlers</resource>
</transformer>
<transformer
implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/spring.schemas</resource>
</transformer>
</transformers>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
Please try this configuration; all external jars will be bundled into your packaged jar.
I have a Java project that I use maven for dependency resolution and building the self-contained, runnable/uber jar (via maven-shade-plugin currently) and I'd like to start playing around with mixing Kotlin into the projects for some new features.
Is using maven to create a runnable/uber jar comprised of Java (primarily) and Kotlin something that is relatively simple to do, and/or supported? Or am I looking at a hack job of gluing together a bunch of stuff that might work if done right? Kotlin isn't a requirement, but since it's supposed to be able to mix freely with Java, it's something I've been wanting to try out on this project. Really trying to gauge if it's a can of worms, or no big deal (and if it's nbd, best direction to look into?)
Here is the build section of my pom.xml so you can see how I'm creating it currently
<build>
<finalName>my-server</finalName>
<resources>
<resource>
<directory>src/main/resources</directory>
<filtering>true</filtering>
</resource>
</resources>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.0.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<!-- exclude signatures -->
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<manifestEntries>
<Main-Class>com.mycompany.myproject.mymainclass</Main-Class>
</manifestEntries>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.0</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
Anyone have experience doing something like this? Simple? or just stay away?
I am creating an uber jar, i.e. a jar with dependencies, for my project. The project uses a number of properties files. I want to be able to change these properties files before running the project, so I want them to live outside the jar. Here are the relevant sections of my pom:
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.2</version>
<configuration>
<artifactSet>
<excludes>
<exclude>**/*.properties</exclude>
<exclude>**/*.json</exclude>
</excludes>
</artifactSet>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifest>
<mainClass>path.to.main.Main</mainClass>
</manifest>
<manifestEntries>
<Class-Path>.</Class-Path>
<Class-Path>conf/</Class-Path>
</manifestEntries>
</archive>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>2.4</version>
<executions>
<execution>
<id>copy-resources</id>
<phase>install</phase>
<goals>
<goal>copy-resources</goal>
</goals>
<configuration>
<outputDirectory>${basedir}/target/conf</outputDirectory>
<resources>
<resource>
<directory>src/main/resources</directory>
<includes>
<include>**/*.properties</include>
<include>**/*.json</include>
</includes>
</resource>
</resources>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
So essentially, I want to create a folder ${basedir}/target/conf and copy all the .properties and .json files to it. Also, here is how I am reading the files:
InputStream in = this.getClass().getClassLoader().getResourceAsStream("filename.properties");
I am facing a couple of problems:
When I run mvn clean install, I still see all the .properties and .json files in the classes folder. Shouldn't they have been excluded?
The conf folder is created with all of the files, but when I run the jar and try to change the properties, the changes are not picked up. How can I ensure that the conf folder is added to the classpath?
I want to be able to load the .properties and .json files from the src/main/resources folder while I am developing, so I don't want to put them in a separate folder. Is this possible?
I was facing the same issue, where the uber jar was not reading the external configuration file.
I tried the configuration below and it worked like a charm. It may help someone else whose uber jar is not reading external files.
I am not sure if this is the best way, but I haven't found any other solution online :)
I included the resources using IncludeResourceTransformer.
The filter removes the properties files from the uber jar.
The Class-Path entry conf/ makes the jar read the properties from the external folder.
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions> <!-- Run shade goal on package phase -->
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<!-- add Main-Class to manifest file -->
<transformer
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<manifestEntries>
<Main-Class>JobName</Main-Class>
<Class-Path>conf/</Class-Path>
</manifestEntries>
</transformer>
<transformer
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
<resource>src/main/resources/config.properties</resource>
<file>${project.basedir}/src/main/resources/config.properties</file>
</transformer>
</transformers>
<finalName>FinalJarName</finalName>
<filters>
<filter>
<artifact>groupId:artifactId</artifact>
<excludes>
<exclude>**/*.properties</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
good luck.
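An alternative that does not depend on the manifest Class-Path is to look for the file in an external folder first and fall back to the classpath copy, which also keeps src/main/resources usable during development. The sketch below assumes a folder such as conf/ next to the jar; the class name ExternalConfig and the method load are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of the original project.
public class ExternalConfig {

    /** Loads the named properties file from confDir if present, otherwise from the classpath. */
    static Properties load(Path confDir, String name) throws IOException {
        Properties props = new Properties();
        Path external = confDir.resolve(name);
        if (Files.isRegularFile(external)) {
            // The external copy wins, so edits made after packaging are picked up.
            try (InputStream in = Files.newInputStream(external)) {
                props.load(in);
            }
            return props;
        }
        // Fallback: the copy packaged in the jar (or in target/classes during development).
        try (InputStream in = ExternalConfig.class.getClassLoader().getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException(name + " not found in " + confDir + " or on the classpath");
            }
            props.load(in);
        }
        return props;
    }
}
```

A call such as ExternalConfig.load(Paths.get("conf"), "config.properties") would then resolve relative to the working directory the jar is started from.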
If I wanted to create a jar file without META-INF nonsense using jar utility I can pass the -M switch, which will:
-M do not create a manifest file for the entries
Note that this is a feature of the jar utility. If I use it, I will get a jar without the META-INF folder and included MANIFEST, basically just an archive of type jar with whatever files/directories I put in it.
How do I do this with the maven-jar-plugin? I need to do this to conform to another process. (They expect a jar with very specific file/folder layout and I cannot have a META-INF folder at the root of the jar file.)
I've got the configuration to create the jar file just right and I don't want to mess with another plugin...
The maven-jar-plugin has no option to disable creation of the manifest, but you can disable the Maven descriptor directory like this:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<addMavenDescriptor>false</addMavenDescriptor>
<manifest>
<addClasspath>false</addClasspath>
</manifest>
</archive>
</configuration>
</plugin>
If you absolutely need to remove the META-INF folder, you can use the maven-shade-plugin like this:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
You can use maven-shade-plugin to achieve the desired effect:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<includes>
<include>${project.groupId}:${project.artifactId}</include>
</includes>
</artifactSet>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
The configuration filters out the META-INF directory and includes only the current project so that dependencies are not attached.
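To confirm that the resulting jar really has no META-INF entries before handing it to the other process, a quick check along these lines may help. It is a sketch using only java.util.zip; the class name MetaInfCheck is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original project.
public class MetaInfCheck {

    /** Returns the names of all entries under META-INF/ in the given jar. */
    static List<String> metaInfEntries(Path jar) throws IOException {
        List<String> found = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith("META-INF/")) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the jar to inspect
        List<String> found = metaInfEntries(Paths.get(args[0]));
        System.out.println(found.isEmpty() ? "No META-INF entries" : "Found: " + found);
    }
}
```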