List all YARN applications in a Hadoop cluster through Java

Running the command yarn application -list on my Hadoop cluster returns the list of running applications.
I want to fetch this list using Java.
Currently I am using the YarnClient API with these dependencies:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.7.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>2.7.0</version>
</dependency>
My code looks like this:
YarnConfiguration conf = new YarnConfiguration();
YarnClient yarnClient = YarnClient.createYarnClient();
yarnClient.init(conf);
yarnClient.start();
List<ApplicationReport> list = yarnClient.getApplications();
System.out.print(list.size());
yarnClient.stop();
But this hangs at the line List<ApplicationReport> list = yarnClient.getApplications() and does not move forward.

I had my code hang on #getApplications() when my YarnConfiguration wasn't properly configured. By default it uses 0.0.0.0:8032 as the YARN ResourceManager address. I had to override this with the correct address:
YarnConfiguration conf = new YarnConfiguration();
conf.set("yarn.resourcemanager.address", "<hostname>:<port>");
YarnClient yarnClient = YarnClient.createYarnClient();
yarnClient.init(conf);
yarnClient.start();
I tested this with Hadoop 2.6.0, but it looks like the defaults are the same for 2.7.0 as well (see sources).
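To tell a wrong ResourceManager address apart from some other problem, it can help to check whether anything accepts TCP connections at the configured host and port before calling getApplications(). The sketch below uses only java.net; RmAddressProbe and isReachable are hypothetical names, and a successful connection only shows that something is listening there, not that it is the ResourceManager.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class RmAddressProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() fails with an IOException on refusal, timeout, or an unknown host
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the real ResourceManager host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8032;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

If the probe returns false for the address you put in yarn.resourcemanager.address, the client has nothing to talk to and the hang is expected.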

How to make spring boot app run on alternate port?

I have a Spring Boot (2.5.3) app running on a CentOS VM behind a firewall. I normally build a fat jar, then run it with a config passed via CLI:
mvn clean package spring-boot:repackage
java -jar target/service.jar --spring.config.location=/path/to/config.properties
Then I run curl GET commands: curl --key /a/b --cert /x/y "https://server-name:8767/path?arg=..."
It works using port 8767, which is set in the config; I chose this port at random a while back.
Since then, I've tried to make it work with a different port. I opened more ports in the Linux firewall-cmd public zone, including 8768 and 9000. The problem is that no matter what I try, the only port I can get the app to work on is 8767. It seems like I've somehow hard-wired it to that port!
Normally server.port is set in the config, but even if I pass another port with --server.port=xxxx via CLI, the app runs and the logs show it is exposed on xxxx; however, curl can only ever reach 8767, and other ports time out. If I set server.port=xxxx in the config, the outcome is the same.
What do I need to do to use a different port? (I saw this...would it help me?)
Dependencies (nothing special)
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
</parent>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</dependency>
UPDATE: @Vinit - my main class is exactly like yours, except for a println I added to let me know it's running:
System.out.println("Running...");
As for my application.properties, I cannot paste them as I'm behind a firewall, but they are basically the ones below, and there is more than one of each:
logging.level
server.port=xxxx // as described above, i've tried declaring here or cli
server.ssl
# custom auth properties
customauth.url
spring.profiles.active
spring.application.name
spring.task.scheduling
spring.jmx.enabled
swagger
management.endpoints
sanitization
spring.jackson
On another note, I run
sudo netstat -nlp | grep "<port>"
...before I run the app (where <port> is the port I have either in my config or passed via CLI), and I get no results. Then I run the app, repeat the netstat call, and sure enough that port is listening. But it is the same thing: with 8767, all is well; with 8768, it times out.
Spring Boot takes CLI arguments into account when you pass them to the SpringApplication.run method in the main method. The main class should look like this:
@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
Pass args as an argument to the run method and it should take the CLI arguments into account. With this class, if --server.port=8080 is used as a CLI argument, the Spring application should run on port 8080.
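The effect of dropping args can be illustrated with a small stand-alone sketch. This is not Spring Boot's implementation; PortResolution and resolvePort are hypothetical names, and the code only mimics the idea that a --server.port command-line argument, when it is actually handed over, wins over a value from a properties file.

```java
import java.util.Properties;

final class PortResolution {

    /**
     * Returns the port from a --server.port=N argument if present,
     * otherwise from the server.port property, otherwise defaultPort.
     */
    static int resolvePort(String[] args, Properties fileProps, int defaultPort) {
        String prefix = "--server.port=";
        for (String arg : args) {
            if (arg.startsWith(prefix)) {
                return Integer.parseInt(arg.substring(prefix.length()));
            }
        }
        String fromFile = fileProps.getProperty("server.port");
        return fromFile != null ? Integer.parseInt(fromFile.trim()) : defaultPort;
    }

    public static void main(String[] args) {
        Properties fileProps = new Properties();
        fileProps.setProperty("server.port", "8767");
        // With the real args the CLI value is visible; with an empty array it is silently ignored.
        System.out.println("args passed:  " + resolvePort(args, fileProps, 8080));
        System.out.println("args dropped: " + resolvePort(new String[0], fileProps, 8080));
    }
}
```

In the question the logs already show the app on the requested port, so the arguments are evidently being passed; the sketch only covers the case this answer describes.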
This is the order in which Spring Boot evaluates the different ways of setting the server port:
embedded server configuration
command-line arguments
property files
main @SpringBootApplication configuration
So two typical causes when the server.port parameter does not work are overriding behavior in a WebServerFactoryCustomizer or in the main method of your SpringBootApplication.

Payara embedded change port and add command line parameters

As part of my JEE routine I run a JUnit test using Payara embedded and Maven.
But the process is not optimal.
I need to change the port from the default 8080 to 8888, for instance.
I also receive the following error when the test is run:
Caused by: javax.naming.NamingException: Lookup failed for 'resource/frontPageDirectory' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.namin
I can probably use the following command line flag
-Ddeployment.resource.validation=false
but I don't know how to apply it in my maven file.
My maven file is simply:
<dependency>
<groupId>fish.payara.extras</groupId>
<artifactId>payara-embedded-all</artifactId>
<version>5.192</version>
<scope>test</scope>
</dependency>
So my question is how do I add these parameters to my maven pom file?
Kim

Errors trying to start ZetaSQL planner

I'm trying to run a Beam pipeline with SQL transforms, parsed with ZetaSQL. I begin with setting options with
options.setPlannerName("org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner");
When I try creating my SqlTransform with any given query, I get
java.util.ServiceConfigurationError: org.apache.beam.repackaged.sql.com.google.zetasql.ClientChannelProvider: Provider org.apache.beam.repackaged.sql.com.google.zetasql.JniChannelProvider could not be instantiated
at java.util.ServiceLoader.fail (ServiceLoader.java:232)
at java.util.ServiceLoader.access$100 (ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService (ServiceLoader.java:384)
at java.util.ServiceLoader$LazyIterator.next (ServiceLoader.java:404)
at java.util.ServiceLoader$1.next (ServiceLoader.java:480)
at org.apache.beam.repackaged.sql.com.google.zetasql.ClientChannelProvider.loadChannel (ClientChannelProvider.java:31)
...
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:82)
at org.apache.beam.sdk.Pipeline.applyInternal (Pipeline.java:539)
at org.apache.beam.sdk.Pipeline.applyTransform (Pipeline.java:473)
at org.apache.beam.sdk.values.PCollection.apply (PCollection.java:357)
...
I've added the following relevant dependencies to my POM in maven:
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
<version>2.15.0</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-sql</artifactId>
<version>2.16.0</version>
</dependency>
Is there something else I'm missing here?
Unfortunately, the ZetaSQL planner does not currently work on a Mac or on older versions of Linux. Please see the comment from Rui in:
ZetaSQL Sample Using Apache beam
It looks like this PR may be useful here (I have not dug deeply into this to confirm):
https://github.com/google/zetasql/pull/3
As a workaround, could you try a newer version of Linux? Maybe in a container?
This is resolved in the latest version of ZetaSQL, which will be used with Beam 2.21. You can also try pulling down a newer version of ZetaSQL (at least 2020.03.2):
<dependency>
<groupId>com.google.zetasql</groupId>
<artifactId>zetasql-jni-channel</artifactId>
<version>2020.03.2</version>
</dependency>

SchemaCrawler error when adding MariaDB artifact

When I add this to the pom.xml:
<!-- https://mvnrepository.com/artifact/us.fatehi/schemacrawler-mariadb -->
<dependency>
<groupId>us.fatehi</groupId>
<artifactId>schemacrawler-mariadb</artifactId>
<version>14.08.06</version>
</dependency>
Then I get an error:
java.util.ServiceConfigurationError: schemacrawler.tools.databaseconnector.DatabaseConnector: Provider schemacrawler.server.mariadb.MariaDBDatabaseConnector could not be instantiated
..
Caused by: java.lang.NoSuchMethodError: schemacrawler.tools.databaseconnector.DatabaseConnector.<init>(Lschemacrawler/tools/databaseconnector/DatabaseServerType;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)V
I am trying to connect to an Oracle database. This works if I omit MariaDB from the pom.
I am using a higher version of SchemaCrawler:
<dependency>
<groupId>us.fatehi</groupId>
<artifactId>schemacrawler</artifactId>
<version>14.21.02</version>
</dependency>
<!-- https://mvnrepository.com/artifact/us.fatehi/schemacrawler-oracle -->
<dependency>
<groupId>us.fatehi</groupId>
<artifactId>schemacrawler-oracle</artifactId>
<version>14.21.02</version>
</dependency>
I would like to keep the MariaDB dependency in the pom.xml and still be able to read Oracle with SchemaCrawler. The error occurs after connecting to the database, in the last line of the following code:
Connection dbConnection = DatabaseBroker.getDbConnection(
eventName,
cbDatabase.getValue(),
tConnectionString.getValue(),
tUsername.getValue(),
tPassword.getValue()
);
//Schema schema = SchemaCrawler.getSchema(dbConnection, SchemaInfoLevel.detailed(), new SchemaCrawlerOptions());
//SchemaCrawler sc = new SchemaCrawler(dbConnection, null);
try
{
Catalog catalog = SchemaCrawlerUtility.getCatalog(dbConnection, null);
You are using incompatible versions of the main SchemaCrawler library (14.21.02) and a SchemaCrawler database plugin (14.08.06); the NoSuchMethodError shows that the plugin expects a DatabaseConnector constructor that the newer library no longer has. You do not need a plugin for MariaDB if you are connecting to Oracle. In fact, SchemaCrawler will work with most databases even without a SchemaCrawler database plugin on the classpath.
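A ServiceConfigurationError like the one above is only a wrapper; the useful information is in its cause chain. The sketch below is a generic, stdlib-only helper, not part of SchemaCrawler. ProviderFailure, rootCause and looksLikeVersionMismatch are hypothetical names, and treating a linkage-type root cause as a sign of mismatched jar versions is a heuristic assumption.

```java
import java.util.ServiceConfigurationError;

final class ProviderFailure {

    /** Walks the cause chain and returns the innermost throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Heuristic: a missing method, field, or class at the root usually means
     * two jars on the classpath were built against different versions.
     */
    static boolean looksLikeVersionMismatch(Throwable t) {
        Throwable root = rootCause(t);
        return root instanceof IncompatibleClassChangeError  // includes NoSuchMethodError, NoSuchFieldError
                || root instanceof NoClassDefFoundError;
    }

    public static void main(String[] args) {
        // Simulated failure shaped like the one in the question.
        Throwable failure = new ServiceConfigurationError(
                "Provider could not be instantiated",
                new NoSuchMethodError("DatabaseConnector.<init>"));
        System.out.println("root cause: " + rootCause(failure));
        System.out.println("version mismatch suspected: " + looksLikeVersionMismatch(failure));
    }
}
```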

@Remote JNDI Communication: Wildfly to JBoss AS 5.1.0.GA

Architecture:
Windows Client -> Wildfly JAX-RS Services -> JBoss 5.1.0.GA legacy system.
I am getting a java.lang.ClassCastException: javax.naming.Reference cannot be cast to com.interfaces.GroupBookingManagerRemote when communicating here between Wildfly JAX-RS Services and JBoss 5.1.0.GA legacy system.
As I am communicating from Wildfly to JBoss AS 5.1.0.GA I am attempting to connect using JNDI.
In my Wildfly Server Maven pom I include:
<dependency>
<groupId>jboss</groupId>
<artifactId>jnp-client</artifactId>
<version>4.2.2.GA</version>
</dependency>
This gives me access to the required org.jnp.* classes and interfaces.
I simply use the following code to connect to my remote machine and retrieve back a GroupBookingManager. However the issue appears when I attempt to cast the class to the interface GroupBookingManagerRemote.
Properties env = new Properties();
env.setProperty(Context.PROVIDER_URL, "jnp://myremoteserver:1099");
env.setProperty(Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory");
env.setProperty(Context.URL_PKG_PREFIXES, "org.jboss.naming:org.jnp.interfaces");
InitialContext initialContext = new InitialContext(env);
Object ref = initialContext.lookup("MyEARFile/GroupBookingManager/remote");
if (ref != null) {
bookingManager = (GroupBookingManagerRemote) ref; // java.lang.ClassCastException: javax.naming.Reference cannot be cast
}
I have a myclient.jar file which I have added to my Wildfly application that contains the remote interface GroupBookingManagerRemote.
Does anyone see any issue with what I have done?
Thanks,
Darren
Thanks for your help Gimby,
I found the answer myself after a bit more messing about.
From Wildfly 8.1.0 (client) -> JBoss AS 5
You do not require any JBoss 5 jars
First, you need a reference to the interface that you wish to use on the client side. This can live in a your-project-client.jar. If you use Maven, you can create a repository and build the Maven directory structure with mvn install:install-file:
mvn install:install-file -DlocalRepositoryPath=DirectoryName -DcreateChecksum=true -Dpackaging=jar -Dfile=Path-to-your-project-client.jar -DgroupId=YourGroupId -DartifactId=YourartifactId -Dversion=1.0
Then, to connect to the remote machine and cast the reference back to your interface, use:
final Properties env = new Properties();
env.put(Context.INITIAL_CONTEXT_FACTORY, org.jboss.naming.remote.client.InitialContextFactory.class.getName());
env.put(Context.PROVIDER_URL, "remote://remoteserver:4447");
InitialContext initialContext = new InitialContext(env);
This uses the WildFly remote:// protocol, which comes with the remote naming and EJB client libraries in wildfly-ejb-client-bom:
<dependency>
<groupId>org.wildfly</groupId>
<artifactId>wildfly-ejb-client-bom</artifactId>
<version>8.1.0.Final</version>
<scope>compile</scope>
<type>pom</type>
</dependency>
I also required this dependency for communication:
<dependency>
<groupId>org.jboss.xnio</groupId>
<artifactId>xnio-nio</artifactId>
<version>3.2.2.Final</version>
<scope>compile</scope>
</dependency>
And this one for the remote naming:
<dependency>
<groupId>org.jboss</groupId>
<artifactId>jboss-remote-naming</artifactId>
<version>2.0.1.Final</version>
</dependency>
Also note that the port is not the usual JBoss 5 JNDI port (1099); it is the default remoting port, 4447. The lookup then looks like this:
Object ref = initialContext.lookup("ejb:Your-EAR/YourClass/remote!" + YourClass.class.getName());
You can then cast your reference to your interface and use it as normal.
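The settings and the lookup name from this answer can be collected in one small helper, which keeps the host, port and naming pattern in a single place. This is only a sketch: RemoteNamingSettings and its method names are hypothetical, the factory class is referenced by name so the snippet compiles without the WildFly client jars, and the lookup pattern is copied from the answer rather than checked against a server.

```java
import java.util.Properties;
import javax.naming.Context;

final class RemoteNamingSettings {

    /** Builds the JNDI environment for a remote:// connection as described above. */
    static Properties environment(String host, int port) {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.jboss.naming.remote.client.InitialContextFactory");
        env.put(Context.PROVIDER_URL, "remote://" + host + ":" + port);
        return env;
    }

    /** Assembles a lookup name following the pattern used in the answer. */
    static String lookupName(String earName, String beanName, String interfaceName) {
        return "ejb:" + earName + "/" + beanName + "/remote!" + interfaceName;
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your own server and bean names.
        System.out.println(environment("remoteserver", 4447));
        System.out.println(lookupName("Your-EAR", "YourClass", "com.example.YourClass"));
    }
}
```

Passing the result of environment(...) to new InitialContext(...) still requires the dependencies listed above on the classpath.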
Hope this makes sense.
