Kafka-Connect add SQL JAR file to classpath - java

I am trying to deploy a connect-standalone job to stream from an MSSQL server, but I am running into an issue (Kafka Connect is part of my Ambari deployment, not Docker). This is the properties file I am using:
name=JdbcSourceConnector
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.user=ue
connection.password=pw
tasks.max=1
connection.url=jdbc:sqlserver://servername
topic.prefix=iblog
query=SELECT * FROM IB_WEBLOG_DUMMY_small
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
poll.interval.ms=5000
table.poll.interval.ms=120000
mode=incrementing
incrementing.column.name=ID
I have added the JAR file sqljdbc42.jar to /usr/share/java
and have run export CLASSPATH=/usr/share/java/*
However, I still run into the error: Failed to find any class that implements Connector and which name matches io.confluent.connect.jdbc.JdbcSourceConnector
Am I doing something wrong, or is there something else I can check?

Kafka-Connect is part of my Ambari deployment
That would imply you are using a Hortonworks installation.
You need to do the following (a shell sketch of these steps appears after the list):
git clone https://github.com/confluentinc/kafka-connect-jdbc/
Check out a release branch that ideally matches your Kafka version. For example, branch v3.1.2 corresponds to Kafka 0.10.1.1.
mvn clean package will generate the build output in target/ of that project.
SCP those files to all Kafka Connect workers in your cluster into /usr/hdp/current/kafka/.../share/java/kafka-connect-jdbc (create this directory if it does not exist).
Restart the Kafka processes to pick up the new CLASSPATH settings.
You may also need some extra Confluent packages that the JDBC connector depends on.
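A rough shell sketch of those steps; the branch, the worker hostname, and the elided part of the /usr/hdp path are examples you will need to adjust for your cluster:
git clone https://github.com/confluentinc/kafka-connect-jdbc/
cd kafka-connect-jdbc
git checkout v3.1.2    # release branch matching your Kafka version
mvn clean package      # build output lands in target/
# copy the built jars to every Connect worker; replace the "..." with your actual
# path segment and create the directory first if it does not exist
scp target/kafka-connect-jdbc-*.jar worker-host:/usr/hdp/current/kafka/.../share/java/kafka-connect-jdbc/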

You need to include the kafka-connect-jdbc JAR file, which contains the io.confluent.connect.jdbc.JdbcSourceConnector class.
If you are using Maven, you can add it as a dependency.
Add the following repository to your project, if you haven't done so yet:
<repository>
<id>confluent</id>
<url>http://packages.confluent.io/maven/</url>
</repository>
After this, add the following dependency:
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-connect-jdbc</artifactId>
<version>3.3.0</version> <!-- or whichever version you need -->
</dependency>
https://github.com/confluentinc/kafka-connect-jdbc/issues/356

I had the same problem, with the Couchbase connector not being found:
ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113) java.util.concurrent.ExecutionException: org.apache.kafka.connect.errors.ConnectException: Failed to find any class that implements Connector and which name matches com.couchbase.connect.kafka.CouchbaseSourceConnector
Setting CLASSPATH was overwriting the existing classpath, and I could not get appending to it to work.
Instead, I moved the required JAR files from kafka-connect-couchbase/*.jar to /path/kafka_version/libs/ (see the sketch below).
libs is the folder where all the Kafka JAR files are stored.
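For example (a sketch; both paths are placeholders from the description above):
# copy the connector jars into the Kafka libs folder that is already on the classpath
cp /path/kafka-connect-couchbase/*.jar /path/kafka_version/libs/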

I ran into the same issue and resolved it by running connect-standalone from the root folder of the Confluent installation; in my case this was /opt/confluent-5.0.1.
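A minimal sketch of that invocation; the two .properties filenames are placeholders for your own worker and connector configs:
cd /opt/confluent-5.0.1
./bin/connect-standalone worker.properties jdbc-source.properties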

Related

Ivy Install task fails with JSCH SFTP error 4 first time, but is successful on subsequent attempts

I am trying to use the Ant Ivy install task to copy a library from one repository to another.
Here is some example code within my Ant target:
<ivy:install organisation="testOrg" module="testModuleName" revision="1.2.3" from="fromRepo" to="toRepo"/>
The fromRepo and toRepo are defined in a local ivysettings.xml file.
The resolve (from fromRepo) of the library is successful but the install to toRepo fails, with an SFTP Code 4 error.
impossible to install testOrg#testModuleName;1.2.3: java.io.IOException: Failure
at org.apache.ivy.plugins.repository.sftp.SFTPRepository.put(SFTPRepository.java:164)
at org.apache.ivy.plugins.repository.AbstractRepository.put(AbstractRepository.java:130)
at org.apache.ivy.plugins.resolver.RepositoryResolver.put(RepositoryResolver.java:234)
at org.apache.ivy.plugins.resolver.RepositoryResolver.publish(RepositoryResolver.java:215)
at org.apache.ivy.core.install.InstallEngine.install(InstallEngine.java:150)
at org.apache.ivy.Ivy.install(Ivy.java:537)
at org.apache.ivy.ant.IvyInstall.doExecute(IvyInstall.java:102)
at org.apache.ivy.ant.IvyTask.execute(IvyTask.java:271)
...
Caused by: 4: Failure
at com.jcraft.jsch.ChannelSftp.throwStatusError(ChannelSftp.java:2833)
at com.jcraft.jsch.ChannelSftp.mkdir(ChannelSftp.java:2142)
at org.apache.ivy.plugins.repository.sftp.SFTPRepository.mkdirs(SFTPRepository.java:186)
at org.apache.ivy.plugins.repository.sftp.SFTPRepository.mkdirs(SFTPRepository.java:184)
at org.apache.ivy.plugins.repository.sftp.SFTPRepository.put(SFTPRepository.java:160)
... 37 more
However, if I simply run the same target again, the install completes successfully!
It seems to be some issue with creating a directory, judging from com.jcraft.jsch.ChannelSftp.mkdir(ChannelSftp.java:2142) in the stack trace.
After the 1st run, the testOrg/testModuleName directory exists (only testOrg had existed previously).
On the 2nd run, the testOrg/testModuleName/1.2.3 directory is created (along with the library artifacts).
If, after the 1st run, I delete the testOrg/testModuleName directory it created, the next run returns the code 4 error again.
My Ant library directory contains jsch-0.1.50.jar, which I assume is used to upload to the destination Ivy server.
In addition I am using:
Ant 1.8.4
Ivy 2.4.0
Java 1.7.0_80
By debugging the Ivy SFTP source code that creates the new directories on the destination toRepo repository, I was able to see why this was happening.
The code is in the method SFTPRepository.mkdirs(), which recursively calls itself to create each directory in the path if it does not already exist.
For my example the directory being uploaded was:
/toRepo/testOrg/testModuleName//1.2.3/
You can see the double slash (//) in the middle of the path.
This meant that the mkdirs() method tried to create the testModuleName directory twice; the 2nd attempt failed, which caused the code 4 error.
The reason there is a double slash in the path is that there is no branch for this artifact.
Within my Ivy settings file, the SFTP resolver patterns (for my toRepo repository) were configured as:
<ivy pattern="/toRepo/[organisation]/[module]/[branch]/[revision]/ivy-[revision].xml"/>
<artifact pattern="/toRepo/[organisation]/[module]/[branch]/[revision]/[artifact]-[revision].[ext]"/>
The /[branch]/ part of the pattern is what was generating the // in the path.
There are 2 configurations, one for the ivy.xml file itself and the other for all other artifacts.
Ivy patterns allow the use of parentheses for optional parts of the pattern.
So changing my configuration to:
<ivy pattern="/toRepo/[organisation]/[module](/[branch])/[revision]/ivy-[revision].xml"/>
<artifact pattern="/toRepo/[organisation]/[module](/[branch])/[revision]/[artifact]-[revision].[ext]"/>
fixed the issue, and the Ivy install functioned as expected.
This means that for artifacts with no branch defined (such as 3rd-party artifacts), the branch directory is not included in the path.

Zeppelin does not see dependencies from custom repository

I want to add our company Artifactory repository to the Zeppelin Spark interpreter, following this document.
So, the URL of our artifactory looks like
http://artifactory.thecompany.com:8081/artifactory/
Access is not restricted to a specific user, and artifacts are downloadable both from my machine and from the machine where Zeppelin is running (I tried this with curl).
I've copied the artifact ID from my build.gradle, so I am pretty sure it is correct. However, when I try to add the artifact that should be found in my company's Artifactory, I get this error:
Error setting properties for interpreter 'spark.spark': Could not find artifact com.feedvisor.dataplatform:data-platform-schema-scala:jar:3.0.19-SNAPSHOT in central (http://repo1.maven.org/maven2/)
This error message suggests that Zeppelin did not even try to look for my dependency in the custom repository.
I tried to play with the Artifactory URL, using:
http://artifactory.thecompany.com:8081/artifactory/
http://artifactory.thecompany.com:8081/
as well as with the "snapshot" property of the "Add New Repository" form (using both true and false), but nothing helped. The error message does not go away, and classes from the referenced artifact are not found.
Thanks in advance.
For Zeppelin to use your company's repo by default you can set ZEPPELIN_INTERPRETER_DEP_MVNREPO in your ${Z_HOME}/conf/zeppelin-env.sh:
export ZEPPELIN_INTERPRETER_DEP_MVNREPO=http://artifactory.thecompany.com:8081/artifactory/
Alternatively, you can use the Dynamic Dependency Loading feature of the notebook:
%dep
z.reset()
z.addRepo("Artifactory").url("http://artifactory.thecompany.com:8081/artifactory/").snapshot()
z.load("com.feedvisor.dataplatform:data-platform-schema-scala:3.0.19-SNAPSHOT")

What causes FileNotFoundException: ...pdq.jar with db2jcc4?

When adding db2jcc4.jar to the system classpath, Tomcat 8.0 raises a FileNotFoundException for a JAR file that has no apparent relation to my project, pdq.jar.
I couldn't find it anywhere on my system or work out where it might come from, until a search turned up the answer below.
In this case, I have my CATALINA_HOME pointed to C:\tomcat8.0\apache-tomcat-8.0.41, and my project has the following Maven dependency defined:
<dependency>
<groupId>com.ibm.db2.jcc</groupId>
<artifactId>db2jcc4</artifactId>
<version>10.1</version>
<scope>system</scope>
<systemPath>${env.CATALINA_HOME}/lib/db2jcc4-10.1.jar</systemPath>
</dependency>
This might happen with newer versions of the Db2 JCC driver:
Beginning with version 4.16 of the IBM Data Server Driver for JDBC and SQLJ, which is shipped with Db2 10.5 on Linux, UNIX, or Windows operating systems, the MANIFEST.MF file for db2jcc4.jar contains a reference to pdq.jar.
IBM Support offers 2 options:
Resolving the problem
To prevent the java.io.FileNotFoundException, you can take one of the following actions:
Edit the MANIFEST.MF file, and remove this line: Class-Path: pdq.jar
Edit the context.xml file for Apache Tomcat, and add an entry like the following one to set the value of scanClassPath to false.
Personally, I prefer the second approach, which can be done as follows:
<Context>
...
<JarScanner scanClassPath="false" />
...
</Context>
According to this IBM KB article, the problem comes from the MANIFEST, which lists pdq.jar, a third-party optimization tool.
I had both db2jcc4.jar and db2jcc4.10.1.jar in my lib folder.
While the article suggests editing the MANIFEST file in db2jcc4.jar, version 10.1 does not include this entry at all.
Removing db2jcc4.jar solved my problem, so a solution in this case could also be to upgrade db2jcc4 from an older version to version 10.1, or, if that is not possible, to edit the manifest file as instructed (sketched below).
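A hedged sketch of that manifest edit; the Tomcat lib path is an example, and the sed -i flag assumes GNU sed:
cd /path/to/tomcat/lib
unzip db2jcc4.jar META-INF/MANIFEST.MF                   # extract only the manifest
sed -i '/^Class-Path: pdq.jar/d' META-INF/MANIFEST.MF    # remove the pdq.jar reference
zip db2jcc4.jar META-INF/MANIFEST.MF                     # write the edited manifest back into the jar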
You just need to update the JAR from db2jcc4.jar to db2jcc4-10.1.jar.
You can find the Maven dependency / JAR via that link.
Kayvan Tehrani's answer explains what's going on here and that this error can be ignored.
Another alternative to clean up the logs is to create a dummy pdq.jar and place it into Tomcat's lib folder.
jar -cf pdq.jar ""
(The ": no such file or directory" message from this command is expected.)

Jenkins ERROR: Failed to create /usr/share/tomcat7/.m2 on Maven project

I am running Jenkins ver. 2.60.2, and it doesn't seem possible, within a Maven job, to define a local repository anywhere other than /usr/share/tomcat7/.m2.
Here are my attempts:
I created a Global Maven settings.xml and a Settings file with the Config File Management Plugin, which contains:
<settings>
<localRepository>/srv/maven/.m2/repository</localRepository>
...
</settings>
I created a new Maven project and tried to make the job see that file by attempting all of the following:
a) Defining either the Settings file or the Global settings file (I created two identical files) within the build step.
b) Adding a "Provide Configuration files" pre-step and then using the variable MY_SETTINGS in either the Goals and options or MAVEN_OPTS.
c) Using "Provide Configuration files" within the build environment (and using MY_SETTINGS in the same way as in the previous step); a rough sketch of that field value follows.
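For illustration, the Goals and options value in attempts (b) and (c) would look roughly like the following; MY_SETTINGS is the variable name configured in the Provide Configuration files step, and the goals themselves are placeholders:
-s $MY_SETTINGS -gs $MY_SETTINGS clean install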
However, none of these seems to work. The job always fails, trying to use the default Maven repository location (/usr/share/tomcat7/.m2), which I have no idea how to redefine:
provisioning config files...
copy managed file [MYFILE settings] to file:/srv/webapps/jenkins/jobs/testJob/workspace#tmp/config3408982272576109420tmp
provisioning config files...
copy managed file [MYFILE settings] to file:/srv/webapps/jenkins/jobs/testJob/workspace#tmp/config2203063037747373567tmp
Parsing POMs
using global settings config with name MYFILE settings
Replacing all maven server entries not found in credentials list is true
Deleting 1 temporary files
ERROR: Failed to create /usr/share/tomcat7/.m2
Finished: FAILURE
Do you know how to make this work within a Maven Job type in Jenkins?

How to publish CruiseControl LATEST build artifacts to a static URL

I have a multi-module Java Maven project for which I want to build a Maven site and Javadocs, and have CruiseControl publish the latest daily builds to a configured static location.
The trouble is that the CruiseControl artifactPublisher lets you specify a dest directory, but the actual destination is timestamped with the time of the last build. I want to be able to publish to a location that gets overwritten on each build, such as:
http://cc-buildserver/cruisecontrol/artifacts/gameplatform-documentation/
artifactPublisher documentation:
dir - will copy all files from this directory
dest - parent directory of actual destination directory; actual destination directory name will be the build timestamp
subdirectory - subdirectory under the unique (timestamp) directory to contain artifacts
For example, if I have a CruiseControl project called gameplatform-documentation and configure my artifactPublisher as follows:
<project name="gameplatform-documentation" forceOnly="true" requireModification="false" forceBuildNewProject="false" buildafterfailed="false">
...
<schedule>
<composite time="2300">
<maven2
mvnhome="${mvn.home}"
pomfile="${dev.root}/gameplatform-parent/pom.xml"
goal="site" />
</composite>
</schedule>
<publishers>
<artifactspublisher
dir="${dev.root}/gameplatform-parent/target/site"
dest="artifacts/gameplatform-documentation" />
</publishers>
</project>
I end up with my Maven-generated site and Javadocs in a different directory on each build:
http://cc-buildserver/cruisecontrol/cruisecontrol/artifacts/gameplatform-documentation/20091110130202/
Maybe I need to use a custom AntPublisher or FTPPublisher and create another web server to host the published docs. I could also use CC source control tools to check the documentation into our SVN server and use that to serve the documentation.
How can this be accomplished?
We ended up using Maven's site deploy plugin to publish the documentation artifacts over SCP (using a Cygwin SSHD server set up on a Windows server) to our CruiseControl server's "artifact" folder:
<distributionManagement>
<site>
<id>dev.website</id>
<url>scp://user@buildserver/cygdrive/c/Users/user/servers/cruisecontrol-project-2.8.3/artifacts/documentation/project/gameplatform</url>
</site>
</distributionManagement>
Then we're able to access the nightly-built documentation by visiting:
http://buildserver:8081/cruisecontrol/artifacts/documentation/project/gameplatform
