Write to MongoDB from Apache Storm - Java

I am trying to write a custom stream grouping which writes to MongoDB. I am running a local cluster for now. I have a custom stream grouping class and a Mongo object, and I write to MongoDB in both prepare() and chooseTasks(). It writes to MongoDB, but the supervisors cannot start. I see this error in the supervisor log:
b.s.d.worker [ERROR] Error on initialization of server mk-worker
java.lang.NoClassDefFoundError: com/mongodb/MongoClient
at storm.starter.MongoMonitorObject.<init>(MongoMonitorObject.java:23) ~[stormjar.jar:0.10.0]
at storm.starter.ModStreamGrouping.prepare(ModStreamGrouping.java:94) ~[stormjar.jar:0.10.0]
I am making changes in the storm starter project for now.
public class ModStreamGrouping implements CustomStreamGrouping, Serializable {
    private List<Integer> targetTasks = new ArrayList<>();
    private int numTasks;

    @Override
    public List<Integer> chooseTasks(int taskId, List<Object> values) {
        System.out.println("taskId = " + taskId);
        System.out.println("values = " + values);
        // placeholder routing: send every tuple to the first target task
        return Collections.singletonList(targetTasks.get(0));
    }

    @Override
    public void prepare(WorkerTopologyContext context, GlobalStreamId stream, List<Integer> targetTasks) {
        MongoMonitorObject mmo = new MongoMonitorObject(targetTasks);
        System.out.println(" in prep() ");
        System.out.println("targetTasks = " + targetTasks);
        this.targetTasks = targetTasks;
        numTasks = targetTasks.size();
    }
}
public class MongoMonitorObject {
private static final Logger LOG = LoggerFactory.getLogger(MongoMonitorObject.class);
public MongoMonitorObject(java.util.List<java.lang.Integer> targetTasks){
try{
MongoClient mongoClient = new MongoClient("localhost", 27017);
DB db = mongoClient.getDB( "loadDB" );
DBCollection collection = db.getCollection("testCollection");
for (Integer task : targetTasks) {
BasicDBObject document = new BasicDBObject();
document.put("tid", task);
collection.insert(document);
}
}
catch (UnknownHostException e) {
System.out.println(" in UnknownHostException ");
LOG.info(" in UnknownHostException ");
}
catch (Exception e) {
System.out.println(" in Exception ");
LOG.info(" in Exception ");
}
}
}
The stream grouping is defined in ModStreamGrouping.java and the Mongo connection is defined in MongoMonitorObject.java. Both belong to the storm.starter package.
I can upload the topology, but the supervisors cannot spawn workers. There's a small link I'm missing somewhere, but I don't know where exactly. I added the following to storm-starter's pom.xml to include MongoDB connectivity:
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongo-java-driver</artifactId>
<version>2.13.3</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>bson</artifactId>
<version>2.13.3</version>
</dependency>

Edit:
I read it here: https://github.com/mongodb/mongo-java-driver
mongo-java-driver is an all-in-one jar; it contains bson and core.
Therefore the mongo-java-driver dependency alone is enough.
If you use the mongodb-driver dependency instead, the bson and mongodb-driver-core dependencies are also needed.
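So a single dependency along these lines should be sufficient (the version shown here is just an example of a 3.x release):
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongo-java-driver</artifactId>
    <version>3.2.2</version>
</dependency>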
Original post:
Try adding mongodb-driver-core and use the newer versions of the dependencies:
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-core</artifactId>
<version>3.2.2</version>
</dependency>
The MongoDB Java Driver uber-artifact, containing mongodb-driver, mongodb-driver-core, and bson
Check it here
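Note that declaring the dependency is not enough by itself: the class must also end up inside the topology jar (stormjar.jar) that Storm ships to the workers, otherwise you get exactly this NoClassDefFoundError. A minimal sketch of a Maven Shade plugin configuration that bundles dependencies into the topology jar (the plugin version here is illustrative):
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- version is illustrative -->
      <version>2.4.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>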

Related

Cannot get JUnit test to run for DynamoDB class

I have a class that scans a column from a DynamoDB table using the AWS SDK for Java (main method taken out for simplicity):
public class fetchCmdbColumn {
public static List<String> CMDB(String tableName, String tableColumn) throws Exception {
DynamoDbClient client = DynamoDbClient.builder()
.region(Region.EU_WEST_1)
.build();
List<String> ListValues = new ArrayList<>();
try {
ScanRequest scanRequest = ScanRequest.builder()
.tableName(tableName)
.build();
ScanResponse response = client.scan(scanRequest);
for (Map<String, AttributeValue> item : response.items()){
Set<String> keys = item.keySet();
for (String key : keys) {
if (key.equals(tableColumn)) {
ListValues.add(item.get(key).s()) ;
}
}
}
//To check what is being returned, comment out below
// System.out.println(ListValues);
} catch (DynamoDbException e){
e.printStackTrace();
System.exit(1);
}
client.close();
return ListValues;
}
}
I also have a JUnit test created for that class:
public class fetchCMDBTest {
// Define the data members required for the test
private static String tableName = "";
private static String tableColumn = "";
@BeforeAll
public static void setUp() throws IOException {
// Run tests on Real AWS Resources
try (InputStream input = fetchCMDBTest.class.getClassLoader().getResourceAsStream("config.properties")) {
Properties prop = new Properties();
if (input == null) {
System.out.println("Sorry, unable to find config.properties");
return;
}
//load a properties file from class path, inside static method
prop.load(input);
// Populate the data members required for all tests
tableName = prop.getProperty("environment_list");
tableColumn = prop.getProperty("env_name");
} catch (IOException ex) {
ex.printStackTrace();
}
}
@Test
void fetchCMDBtable() throws Exception{
try {
fetchCMDB.CMDB(tableName, tableColumn);
System.out.println("Test 1 passed");
} catch (Exception e) {
System.out.println("Test 1 failed!");
e.printStackTrace();
}
}
}
When I run the test using mvn test, I get the error:
software.amazon.awssdk.core.exception.SdkClientException: Multiple HTTP implementations were found on the classpath ,
even though I have only declared the client builder once in the class.
What am I missing?
I run the unit tests from the IntelliJ IDE. I find using the IDE works better than the command line. Once I set up the config.properties file that contains the values for the tests and run them, all tests pass.
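For reference, a config.properties for the test above might look like this (the key names match what setUp() reads; the values are placeholders):
# name of the DynamoDB table to scan
environment_list=YourDynamoDbTableName
# name of the column/attribute to collect
env_name=YourColumnName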
In fact, we test all Java V2 code examples in this manner to ensure they all work.
I also tested all DynamoDB examples from the command line using mvn test. All passed.
Amend your test to build a single instance of the DynamoDB client and then as your first test, make sure it was created successfully. See if this works for you. Once you get this working, add more tests!
public class DynamoDBTest {
private static DynamoDbClient ddb;
@BeforeAll
public static void setUp() throws IOException {
// Run tests on Real AWS Resources
Region region = Region.US_WEST_2;
ddb = DynamoDbClient.builder().region(region).build();
try (InputStream input = DynamoDBTest.class.getClassLoader().getResourceAsStream("config.properties")) {
Properties prop = new Properties();
if (input == null) {
System.out.println("Sorry, unable to find config.properties");
return;
}
//load a properties file from class path, inside static method
prop.load(input);
} catch (IOException ex) {
ex.printStackTrace();
}
}
@Test
@Order(1)
public void whenInitializingAWSService_thenNotNull() {
assertNotNull(ddb);
System.out.println("Test 1 passed");
}
}
Turns out my pom file contained other HTTP clients, so I had to remove the likes of:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3</artifactId>
<exclusions>
<exclusion>
<groupId>software.amazon.awssdk</groupId>
<artifactId>netty-nio-client</artifactId>
</exclusion>
<exclusion>
<groupId>software.amazon.awssdk</groupId>
<artifactId>apache-client</artifactId>
</exclusion>
</exclusions>
</dependency>
and replaced them with:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>aws-crt-client</artifactId>
<version>2.14.13-PREVIEW</version>
</dependency>
as mentioned in https://aws.amazon.com/blogs/developer/introducing-aws-common-runtime-http-client-in-the-aws-sdk-for-java-2-x/
As a complement to the other answers, only option 4 from the reference worked for me:
Option 4: Change the default HTTP client using a system property in Java code.
I defined it in the setUp() method of my integration test using JUnit 5.
@BeforeAll
public static void setUp() {
System.setProperty(
SdkSystemSetting.SYNC_HTTP_SERVICE_IMPL.property(),
"software.amazon.awssdk.http.apache.ApacheSdkHttpService");
}
And because I am using Gradle:
implementation ("software.amazon.awssdk:s3:${awssdk2Version}") {
exclude group: 'software.amazon.awssdk', module: 'netty-nio-client'
exclude group: 'software.amazon.awssdk', module: 'apache-client'
}
implementation "software.amazon.awssdk:aws-crt-client:2.17.71-PREVIEW"

Getting "not talking to master and retries used up" Exception at DBCursor.hasNext()

I am getting "not talking to master and retries used up" Exception at statement DBCursor.hasNext().
When I searched, got the solution as set the preference. Still i am getting the issue.
My code is as below:
public void sampleTest() throws Exception
{
MongoClient client = new MongoClient("192.168.20.117", 27017);
DB database = client.getDB("CLME2ECORE");
boolean auth = database.authenticate("tecnotree", ("tecnotree").toCharArray());
DBCollection collection = database.getCollection("RegistrationRequest");
collection.setReadPreference(ReadPreference.primary());
BasicDBObject andQuery = new BasicDBObject("serviceRequest.serviceRequestSubtype.masterCode","RETPOSTREG");
andQuery.append("serviceRequest.serviceRequestStatus.masterCode", "PYMTPEND");
BasicDBObject andFields = new BasicDBObject("serviceRequest.customer.profileDetails.basicDetails.customerCode",1);
andFields.append("_id", 0);
DBCursor dbCursor = collection.find(andQuery);
DBObject dbObject;
dbCursor.setReadPreference(ReadPreference.primary());
if(dbCursor.hasNext())
{
dbObject = dbCursor.next();
String value = dbObject.get("serviceRequest.customer.profileDetails.basicDetails.customerCode").toString();
}
client.close();
}
I am using these Maven dependencies:
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongo-java-driver</artifactId>
<version>2.13.0</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>bson</artifactId>
<version>2.13.0</version>
</dependency>
Please help to resolve this issue.
Make these changes, then it will work:
Pass the complete replica set seed list while connecting to MongoDB; do not pass an individual server IP.
Change the read preference to primaryPreferred instead of primary only.
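A minimal sketch of both changes with the 2.x driver, using a connection string in place of the single-host constructor in sampleTest() (the host names and the replica set name rs0 are placeholders):
// list every replica set member and prefer the primary, but allow fallback
MongoClientURI uri = new MongoClientURI(
        "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0&readPreference=primaryPreferred");
MongoClient client = new MongoClient(uri);
DB database = client.getDB("CLME2ECORE");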
I met exactly the same problem as you. I'm using Java Spring to connect to a secondary (slave) Mongo database. The following code should be helpful.
@Configuration
@PropertySource("classpath:param.properties")
public class MongoConfig {
public @Bean MongoClientFactoryBean mongo() {
MongoClientFactoryBean mongo = new MongoClientFactoryBean();
mongo.setHost(mongoAddress);
mongo.setPort(mongoPort);
MongoClientOptions mco = new MongoClientOptions.Builder().readPreference(ReadPreference.secondaryPreferred()).build();
mongo.setMongoClientOptions(mco);
return mongo;
}
#Value("${mongo.address}")
private String mongoAddress;
#Value("${mongo.port}")
private Integer mongoPort;
}
The key line is the following:
MongoClientOptions mco = new MongoClientOptions.Builder().readPreference(ReadPreference.secondaryPreferred()).build();
which configures the Mongo client as secondary preferred.

Apache Flink integration with Elasticsearch

I am trying to integrate Flink with Elasticsearch 2.1.1. I am using the Maven dependency:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch2_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>
and here's the Java code where I am reading the events from a Kafka queue (which works fine). Somehow the events are not getting posted to Elasticsearch, and there is no error either. In the code below, if I change any of the settings related to the port, hostname, cluster name, or index name of Elasticsearch, I immediately see an error; but currently it doesn't show any error, nor do any new documents get created in Elasticsearch.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// parse user parameters
ParameterTool parameterTool = ParameterTool.fromArgs(args);
DataStream<String> messageStream = env.addSource(new FlinkKafkaConsumer082<>(parameterTool.getRequired("topic"), new SimpleStringSchema(), parameterTool.getProperties()));
messageStream.print();
Map<String, String> config = new HashMap<>();
config.put(ElasticsearchSink.CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS, "1");
config.put(ElasticsearchSink.CONFIG_KEY_BULK_FLUSH_INTERVAL_MS, "1");
config.put("cluster.name", "FlinkDemo");
List<InetSocketAddress> transports = new ArrayList<>();
transports.add(new InetSocketAddress(InetAddress.getByName("localhost"), 9300));
messageStream.addSink(new ElasticsearchSink<String>(config, transports, new TestElasticsearchSinkFunction()));
env.execute();
}
private static class TestElasticsearchSinkFunction implements ElasticsearchSinkFunction<String> {
private static final long serialVersionUID = 1L;
public IndexRequest createIndexRequest(String element) {
Map<String, Object> json = new HashMap<>();
json.put("data", element);
return Requests.indexRequest()
.index("flink").id("hash"+element).source(json);
}
@Override
public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
indexer.add(createIndexRequest(element));
}
}
I was indeed running it on the local machine and debugging as well, but the only thing I was missing was to properly configure logging, since most Elasticsearch issues are described in "log.warn" statements. The issue was an exception inside "BulkRequestHandler.java" in the elasticsearch-2.2.1 client API, which was throwing the error "org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: type is missing;". I had created the index but not a type, which I find pretty strange, as it should primarily be concerned with the index and create the type by default.
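In other words, the fix was to set an explicit type on the index request. A minimal sketch of the corrected request builder from the question (the type name here is just an example):
public IndexRequest createIndexRequest(String element) {
    Map<String, Object> json = new HashMap<>();
    json.put("data", element);
    return Requests.indexRequest()
            .index("flink")
            .type("flink-log") // ES 2.x rejects index requests without a type
            .id("hash" + element)
            .source(json);
}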
I have found a very good example of Flink & Elasticsearch Connector
First, the Maven dependency:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch2_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>
Second, the example Java code:
public static void writeElastic(DataStream<String> input) {
Map<String, String> config = new HashMap<>();
// This instructs the sink to emit after every element, otherwise they would be buffered
config.put("bulk.flush.max.actions", "1");
config.put("cluster.name", "es_keira");
try {
// Add elasticsearch hosts on startup
List<InetSocketAddress> transports = new ArrayList<>();
transports.add(new InetSocketAddress("127.0.0.1", 9300)); // port is 9300 not 9200 for ES TransportClient
ElasticsearchSinkFunction<String> indexLog = new ElasticsearchSinkFunction<String>() {
public IndexRequest createIndexRequest(String element) {
String[] logContent = element.trim().split("\t");
Map<String, String> esJson = new HashMap<>();
esJson.put("IP", logContent[0]);
esJson.put("info", logContent[1]);
return Requests
.indexRequest()
.index("viper-test")
.type("viper-log")
.source(esJson);
}
@Override
public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
indexer.add(createIndexRequest(element));
}
};
ElasticsearchSink<String> esSink = new ElasticsearchSink<>(config, transports, indexLog);
input.addSink(esSink);
} catch (Exception e) {
System.out.println(e);
}
}

How to Run PDI Transformation with Database from Java?

I am trying to run a PDI transformation involving a database (any database, but NoSQL ones are preferred) from Java.
I've tried using MongoDB and Cassandra and got missing plugins; I've already asked here: Running PDI Kettle on Java - Mongodb Step Missing Plugins, but no one has replied yet.
I've tried switching to a SQL database using PostgreSQL too, but it still doesn't work. From the research I did, I think it is because I didn't set up the database connection from Java properly, yet I haven't found any tutorial or direction that works for me. I've tried following the directions from this blog: http://ameethpaatil.blogspot.co.id/2010/11/pentaho-data-integration-java-maven.html but still got some problems about the repository (because I don't have one, and it seems to be required).
The transformations are fine when I run them from Spoon. They only fail when I run them from Java.
Can anyone help me run a PDI transformation involving a database? Where did I go wrong?
Has anyone ever succeeded in running a PDI transformation involving either a NoSQL or SQL database? What DB did you use?
I'm sorry if I asked too many questions; I am so desperate. Any kind of information will be very appreciated. Thank you.
Executing PDI jobs from Java is pretty straightforward. You just need to import all the necessary jar files (for the databases) and then call the Kettle classes. The best way is obviously to use Maven to control the dependencies. In the Maven pom.xml file, just declare the database drivers.
A sample Maven file would be something like the one below, assuming you are using Pentaho v5.0.0GA and PostgreSQL as the database:
<dependencies>
<!-- Pentaho Kettle Core dependencies development -->
<dependency>
<groupId>pentaho-kettle</groupId>
<artifactId>kettle-core</artifactId>
<version>5.0.0.1</version>
</dependency>
<dependency>
<groupId>pentaho-kettle</groupId>
<artifactId>kettle-dbdialog</artifactId>
<version>5.0.0.1</version>
</dependency>
<dependency>
<groupId>pentaho-kettle</groupId>
<artifactId>kettle-engine</artifactId>
<version>5.0.0.1</version>
</dependency>
<dependency>
<groupId>pentaho-kettle</groupId>
<artifactId>kettle-ui-swt</artifactId>
<version>5.0.0.1</version>
</dependency>
<dependency>
<groupId>pentaho-kettle</groupId>
<artifactId>kettle5-log4j-plugin</artifactId>
<version>5.0.0.1</version>
</dependency>
<!-- The database dependency files. Use it if your kettle file involves database connectivity. -->
<dependency>
<groupId>postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>9.1-902.jdbc4</version>
</dependency>
</dependencies>
You can check my blog for more. It works for database connections.
Hope this helps :)
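For completeness, here is a minimal sketch (not from the blog) of calling the Kettle classes once those jars are on the classpath; the class name and .ktr path are placeholders:
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {
    public static void main(String[] args) throws Exception {
        // bootstrap the Kettle engine once per JVM
        KettleEnvironment.init();
        // load the transformation designed in Spoon
        TransMeta transMeta = new TransMeta("path/to/your_transformation.ktr");
        Trans trans = new Trans(transMeta);
        trans.execute(null);
        trans.waitUntilFinished();
        if (trans.getErrors() > 0) {
            throw new RuntimeException("Errors occurred while executing the transformation");
        }
    }
}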
I had the same problem in an application using the Pentaho libraries. I resolved the problem with this code:
The singleton to init Kettle:
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* Initializes the Kettle environment configuration.
*
* @author Marcos Souza
* @version 1.0
*
*/
public class AtomInitKettle {
private static final Logger LOGGER = LoggerFactory.getLogger(AtomInitKettle.class);
private AtomInitKettle() throws KettleException {
try {
LOGGER.info("Iniciando kettle");
KettleJNDI.protectSystemProperty();
KettleEnvironment.init();
LOGGER.info("Kettle iniciado com sucesso");
} catch (Exception e) {
LOGGER.error("Message: {} Cause {} ", e.getMessage(), e.getCause());
}
}
}
And the code that saved me:
import java.io.File;
import java.util.Properties;
import org.pentaho.di.core.Const;
import org.pentaho.di.core.exception.KettleException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class KettleJNDI {
private static final Logger LOGGER = LoggerFactory.getLogger(KettleJNDI.class);
public static final String SYS_PROP_IC = "java.naming.factory.initial";
private static boolean init = false;
private KettleJNDI() {
}
public static void initJNDI() throws KettleException {
String path = Const.JNDI_DIRECTORY;
LOGGER.info("Kettle Const.JNDI_DIRECTORY= {}", path);
if (path == null || path.equals("")) {
try {
File file = new File("simple-jndi");
path = file.getCanonicalPath();
} catch (Exception e) {
throw new KettleException("Error initializing JNDI", e);
}
Const.JNDI_DIRECTORY = path;
LOGGER.info("Kettle null > Const.JNDI_DIRECTORY= {}", path);
}
System.setProperty("java.naming.factory.initial", "org.osjava.sj.SimpleContextFactory");
System.setProperty("org.osjava.sj.root", path);
System.setProperty("org.osjava.sj.delimiter", "/");
}
public static void protectSystemProperty() {
if (init) {
return;
}
System.setProperties(new ProtectionProperties(SYS_PROP_IC, System.getProperties()));
if (LOGGER.isInfoEnabled()) {
LOGGER.info("Kettle System Property Protector: System.properties replaced by custom properies handler");
}
init = true;
}
public static class ProtectionProperties extends Properties {
private static final long serialVersionUID = 1L;
private final String protectedKey;
public ProtectionProperties(String protectedKey, Properties prprts) {
super(prprts);
if (protectedKey == null) {
throw new IllegalArgumentException("Properties protection was provided a null key");
}
this.protectedKey = protectedKey;
}
@Override
public synchronized Object setProperty(String key, String value) {
// We forbid changes in general, but do it silent ...
if (protectedKey.equals(key)) {
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Kettle System Property Protector: Protected change to '" + key + "' with value '" + value + "'");
}
return super.getProperty(protectedKey);
}
return super.setProperty(key, value);
}
}
}
I think your problem is with the database connection. You can configure it in the transformation and do not need to use JNDI.
public class DatabaseMetaStep {
private static final Logger LOGGER = LoggerFactory.getLogger(DatabaseMetaStep.class);
/**
* Adds the database access configuration.
*
* @return the configured DatabaseMeta
*/
public static DatabaseMeta createDatabaseMeta() {
DatabaseMeta databaseMeta = new DatabaseMeta();
LOGGER.info("Carregando informacoes de acesso");
databaseMeta.setHostname("localhost");
databaseMeta.setName("stepName");
databaseMeta.setUsername("user");
databaseMeta.setPassword("password");
databaseMeta.setDBPort("port");
databaseMeta.setDBName("database");
databaseMeta.setDatabaseType("MonetDB"); // sql, MySql ...
databaseMeta.setAccessType(DatabaseMeta.TYPE_ACCESS_NATIVE);
return databaseMeta;
}
}
Then you need to set the databaseMeta on the TransMeta:
DatabaseMeta databaseMeta = DatabaseMetaStep.createDatabaseMeta();
TransMeta transMeta = new TransMeta();
transMeta.setUsingUniqueConnections(true);
transMeta.setName("transMetaName");
List<DatabaseMeta> databases = new ArrayList<>();
databases.add(databaseMeta);
transMeta.setDatabases(databases);
I tried your code with a transformation without JNDI and it works!
But I needed to add this repository to my pom.xml:
<repositories>
<repository>
<id>pentaho-releases</id>
<url>http://repository.pentaho.org/artifactory/repo/</url>
</repository>
</repositories>
Also, when I try with a datasource I get this error: Cannot instantiate class: org.osjava.sj.SimpleContextFactory [Root exception is java.lang.ClassNotFoundException: org.osjava.sj.SimpleContextFactory]
Complete log here:
https://gist.github.com/eb15f8545e3382351e20.git
[FIX]: Add this dependency:
<dependency>
<groupId>pentaho</groupId>
<artifactId>simple-jndi</artifactId>
<version>1.0.1</version>
</dependency>
After that a new error occurs:
transformation_with_jndi - Dispatching started for transformation [transformation_with_jndi]
Table input.0 - ERROR (version 5.0.0.1.19046, build 1 from 2013-09-11_13-51-13 by buildguy) : An error occurred, processing will be stopped:
Table input.0 - Error occured while trying to connect to the database
Table input.0 - java.io.File parameter must be a directory. [D:\opt\workspace-eclipse\invoke-ktr-jndi\simple-jndi]
Complete log : https://gist.github.com/jrichardsz/9d74c7263f3567ac4b45
[EXPLANATION] This is due to the initialization inside KettleEnvironment.init():
https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/KettleEnvironment.java
There is an initialization:
if (simpleJndi) {
JndiUtil.initJNDI();
}
And in JndiUtil:
String path = Const.JNDI_DIRECTORY;
if ((path == null) || (path.equals("")))
https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/JndiUtil.java
And in the Const class:
public static String JNDI_DIRECTORY = NVL(System.getProperty("KETTLE_JNDI_ROOT"), System.getProperty("org.osjava.sj.root"));
https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/Const.java
So we need to set the KETTLE_JNDI_ROOT variable.
[FIX] A small change to your example: just add this
System.setProperty("KETTLE_JNDI_ROOT", jdbcPropertiesPath);
before
KettleEnvironment.init();
A complete example based on your code:
import java.io.File;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;
public class ExecuteSimpleTransformationWithJndiDatasource {
public static void main(String[] args) {
String resourcesPath = (new File(".").getAbsolutePath())+"\\src\\main\\resources";
String ktr_path = resourcesPath+"\\transformation_with_jndi.ktr";
//KETTLE_JNDI_ROOT could be the simple-jndi folder in your pdi or spoon home.
//in this example, is the resources folder
String jdbcPropertiesPath = resourcesPath;
try {
/**
* Initialize the Kettle environment
*/
System.setProperty("KETTLE_JNDI_ROOT", jdbcPropertiesPath);
KettleEnvironment.init();
/**
* Create a trans object to properly assign the ktr metadata.
*
* @filedb: The ktr file path to be executed.
*
*/
TransMeta metadata = new TransMeta(ktr_path);
Trans trans = new Trans(metadata);
// Execute the transformation
trans.execute(null);
trans.waitUntilFinished();
// checking for errors
if (trans.getErrors() > 0) {
System.out.println("Erroruting Transformation");
}
} catch (KettleException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
For a complete example check my github channel:
https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/tree/master/running-etl-transformation-using-java/invoke-transformation-from-java-jndi/src/main/resources

How can you find the latest version of a maven artifact from Java using aether?

Their documentation is really slim and I was unable to figure it out.
I found a partial answer here, but it doesn't have all the code.
The Aether Team maintains a demo page with such an example: FindNewestVersion.
Simplified a bit, this is what it comes down to.
Add to your POM the Aether dependencies:
<dependencies>
<dependency>
<groupId>org.eclipse.aether</groupId>
<artifactId>aether-impl</artifactId>
<version>${aetherVersion}</version>
</dependency>
<dependency>
<groupId>org.eclipse.aether</groupId>
<artifactId>aether-connector-basic</artifactId>
<version>${aetherVersion}</version>
</dependency>
<dependency>
<groupId>org.eclipse.aether</groupId>
<artifactId>aether-transport-file</artifactId>
<version>${aetherVersion}</version>
</dependency>
<dependency>
<groupId>org.eclipse.aether</groupId>
<artifactId>aether-transport-http</artifactId>
<version>${aetherVersion}</version>
</dependency>
<dependency>
<groupId>org.apache.maven</groupId>
<artifactId>maven-aether-provider</artifactId>
<version>${mavenVersion}</version>
</dependency>
</dependencies>
<properties>
<aetherVersion>1.1.0</aetherVersion>
<mavenVersion>3.3.9</mavenVersion>
</properties>
And then you can use it like so:
public static void main(String[] args) {
RemoteRepository central = new RemoteRepository.Builder("central", "default", "http://repo1.maven.org/maven2/").build();
RepositorySystem repoSystem = newRepositorySystem();
RepositorySystemSession session = newSession(repoSystem);
Artifact artifact = new DefaultArtifact("groupId:artifactId:[0,)");
VersionRangeRequest request = new VersionRangeRequest(artifact, Arrays.asList(central), null);
try {
VersionRangeResult versionResult = repoSystem.resolveVersionRange(session, request);
System.out.println(versionResult.getHighestVersion());
} catch (VersionRangeResolutionException e) {
e.printStackTrace();
}
}
private static RepositorySystem newRepositorySystem() {
DefaultServiceLocator locator = MavenRepositorySystemUtils.newServiceLocator();
locator.addService(RepositoryConnectorFactory.class, BasicRepositoryConnectorFactory.class);
locator.addService(TransporterFactory.class, FileTransporterFactory.class);
locator.addService(TransporterFactory.class, HttpTransporterFactory.class);
return locator.getService(RepositorySystem.class);
}
private static RepositorySystemSession newSession(RepositorySystem system) {
DefaultRepositorySystemSession session = MavenRepositorySystemUtils.newSession();
LocalRepository localRepo = new LocalRepository("target/local-repo");
session.setLocalRepositoryManager(system.newLocalRepositoryManager(session, localRepo));
return session;
}
This creates a reference to the Maven Central repository and uses the version range [0,) to specify that we're interested in all versions, with an unbounded upper bound. Finally, a version range query is performed, which enables us to determine the latest version.
This is from the project's Aether Demonstration and Examples site. I didn't try to run it, but it should be your answer.
public static void main( String[] args ) throws Exception
{
System.out.println( "------------------------------------------------------------" );
System.out.println( FindNewestVersion.class.getSimpleName() );
RepositorySystem system = Booter.newRepositorySystem();
RepositorySystemSession session = Booter.newRepositorySystemSession( system );
Artifact artifact = new DefaultArtifact( "org.eclipse.aether:aether-util:[0,)" );
VersionRangeRequest rangeRequest = new VersionRangeRequest();
rangeRequest.setArtifact( artifact );
rangeRequest.setRepositories( Booter.newRepositories( system, session ) );
VersionRangeResult rangeResult = system.resolveVersionRange( session, rangeRequest );
Version newestVersion = rangeResult.getHighestVersion();
System.out.println( "Newest version " + newestVersion + " from repository "
+ rangeResult.getRepository( newestVersion ) );
}
