Apache Beam Streaming from Pub/Sub to ElasticSearch - java

I'm writing a java streaming pipeline with Apache Beam that reads messages from Google Cloud PubSub and should write them into an ElasticSearch instance. Currently, I'm using the direct runner, but the plan is to deploy the solution on Google Cloud Dataflow.
First of all, I wrote a pipeline that reads from PubSub and writes to text files, and it works. Then I set up the ElasticSearch instance, and that works as well: I wrote some documents with curl and it was easy.
Then, when I tried to perform the write with Beam's ElasticSearch connector, I started to get errors. Specifically, I get java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest, in spite of the fact that I added the dependency to my pom.xml file.
What I'm doing is essentially this:
messages.apply(
        "TwoMinWindow",
        Window.into(FixedWindows.of(new Duration(120 * 1000)))
).apply(
        "ElasticWrite",
        ElasticsearchIO.write()
                .withConnectionConfiguration(
                        ElasticsearchIO.ConnectionConfiguration
                                .create(new String[]{"http://xxx.xxx.xxx.xxx:9200"}, "streaming_data", "string")
                                .withUsername("xxxx")
                                .withPassword("xxxxxxxx")
                )
);
Using the DirectRunner, I'm able to connect to PubSub, but I get an exception when the pipeline tries to connect to the ElasticSearch instance:
java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest(Ljava/lang/String;Ljava/lang/String;[Lorg/apache/http/Header;)Lorg/elasticsearch/client/Response;
at org.apache.beam.sdk.util.UserCodeException.wrap (UserCodeException.java:34)
at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$Write$WriteFn$DoFnInvoker.invokeSetup (Unknown Source)
at org.apache.beam.sdk.transforms.reflect.DoFnInvokers.tryInvokeSetupFor (DoFnInvokers.java:50)
at org.apache.beam.runners.direct.DoFnLifecycleManager$DeserializingCacheLoader.load (DoFnLifecycleManager.java:104)
at org.apache.beam.runners.direct.DoFnLifecycleManager$DeserializingCacheLoader.load (DoFnLifecycleManager.java:91)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture (LocalCache.java:3528)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync (LocalCache.java:2277)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad (LocalCache.java:2154)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get (LocalCache.java:2044)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get (LocalCache.java:3952)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad (LocalCache.java:3974)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get (LocalCache.java:4958)
at org.apache.beam.runners.direct.DoFnLifecycleManager.get (DoFnLifecycleManager.java:61)
at org.apache.beam.runners.direct.ParDoEvaluatorFactory.createEvaluator (ParDoEvaluatorFactory.java:129)
at org.apache.beam.runners.direct.ParDoEvaluatorFactory.forApplication (ParDoEvaluatorFactory.java:79)
at org.apache.beam.runners.direct.TransformEvaluatorRegistry.forApplication (TransformEvaluatorRegistry.java:169)
at org.apache.beam.runners.direct.DirectTransformExecutor.run (DirectTransformExecutor.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
at java.util.concurrent.FutureTask.run (FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest(Ljava/lang/String;Ljava/lang/String;[Lorg/apache/http/Header;)Lorg/elasticsearch/client/Response;
at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO.getBackendVersion (ElasticsearchIO.java:1348)
at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$Write$WriteFn.setup (ElasticsearchIO.java:1200)
What I added to the pom.xml is:
<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${beam.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>${elastic.version}</version>
</dependency>
I'm stuck with this problem and I don't know how to solve it. If I use a JestClient, I'm able to connect to ElasticSearch without any issue.
Do you have any suggestions?

You are using a newer version of RestClient that does not have the method performRequest(String, String, Header...). If you look at the latest source code, you can see that the method takes a Request now, whereas in older versions there were overloads that took Strings and Headers.
These methods were deprecated and then removed from the code on September 1, 2018.
Either change your code to use the newer Elasticsearch client API, or specify an older version of the library (it needs to be before 7.0.x, e.g. 6.8.4) that is compatible with your code.
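For illustration, a minimal sketch of the two client styles (the class name and the host are placeholders, not taken from the question):

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class RestClientSmokeTest {
    public static void main(String[] args) throws IOException {
        // Placeholder host/port -- replace with your instance.
        RestClient restClient = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build();
        try {
            // Pre-7.0 overload (the one Beam's ElasticsearchIO calls here;
            // removed from the client in 7.0):
            // Response response = restClient.performRequest("GET", "/");

            // 6.4+ replacement: every call goes through a Request object.
            Response response = restClient.performRequest(new Request("GET", "/"));
            System.out.println(response.getStatusLine());
        } finally {
            restClient.close();
        }
    }
}

Since the pom in the question already uses a ${elastic.version} property, pinning it to a 6.x release, e.g. <elastic.version>6.8.4</elastic.version>, keeps the removed overloads available to the Beam connector.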

Related

The [0.0.0] version of the [jdbc] client is not compatible with Elasticsearch version [7.10.0]

I have a strange problem with the Elasticsearch JDBC driver version after packaging.
<dependency>
    <groupId>org.elasticsearch.plugin</groupId>
    <artifactId>x-pack-sql-jdbc</artifactId>
    <version>7.10.0</version>
</dependency>
When I run my code in IDEA to access Elasticsearch, it works normally.
Next, I execute mvn package to get a jar with dependencies.
When I run this jar to access Elasticsearch, the error is as follows:
java.sql.SQLException: Server sent bad type [action_request_validation_exception]. Original type was [Validation Failed: 1: The [0.0.0] version of the [jdbc] client is not compatible with Elasticsearch version [7.10.0];]. [org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: The [0.0.0] version of the [jdbc] client is not compatible with Elasticsearch version [7.10.0];
at org.elasticsearch.action.ValidateActions.addValidationError(ValidateActions.java:26)
at org.elasticsearch.xpack.sql.action.AbstractSqlQueryRequest.validate(AbstractSqlQueryRequest.java:239)
at org.elasticsearch.xpack.sql.action.SqlQueryRequest.validate(SqlQueryRequest.java:79)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:144)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:83)
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:86)
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:75)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:412)
at org.elasticsearch.xpack.sql.plugin.RestSqlQueryAction.lambda$prepareRequest$0(RestSqlQueryAction.java:116)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:115)
at org.elasticsearch.xpack.security.rest.SecurityRestFilter.handleRequest(SecurityRestFilter.java:88)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:258)
at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:340)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:191)
at org.elasticsearch.http.AbstractHttpServerTransport.dispatchRequest(AbstractHttpServerTransport.java:319)
at org.elasticsearch.http.AbstractHttpServerTransport.handleIncomingRequest(AbstractHttpServerTransport.java:384)
......
I guess there was a problem with the version metadata when packaging, but I haven't found a solution.
I found some Elasticsearch source code that may be useful:
https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/sql/sql-client/src/main/java/org/elasticsearch/xpack/sql/client/ClientVersion.java#L112
Based on it, I added the following configuration to pom.xml, and the issue was solved.
<manifestEntries>
    <X-Compile-Elasticsearch-Version>7.10.0</X-Compile-Elasticsearch-Version>
</manifestEntries>
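For context, here is a sketch of where that entry typically sits, assuming the fat jar is built with the maven-assembly-plugin (the question does not say which packaging plugin is used):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
        <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <archive>
            <!-- The SQL client reads this manifest entry to report its own
                 version; repackaging drops the driver's original manifest,
                 which is why the client falls back to [0.0.0]. -->
            <manifestEntries>
                <X-Compile-Elasticsearch-Version>7.10.0</X-Compile-Elasticsearch-Version>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>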

Could not create File Client. Mismatch found for java and native libraries java build version

I am trying to access a MapR path remotely from a Spring Boot application. I have set the fs.mapr.bailout.on.library.mismatch property to false to avoid the error on version mismatch. Still, whenever getFileStatus() is called, the application stops with the following error:
2020-12-15 10:07:18,7377 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:691 Thread: 123145425235968 Mismatch found for java and native libraries java build version 6.0.1.20180404222005.GA, native build version 6.0.1.20190808152212.GA java patch vserion $Id: mapr-version: 6.0.1.20180404222005.GA 1aeeb6d3c17c777fcba0, native patch version $Id: mapr-version: 6.0.1.20190808152212.GA 1aeeb6d3c17c777fcba0
2020-12-15 10:07:18,7378 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:708 Thread: 123145425235968 Client initialization failed.
Code:
FileSystem fileSystem = FileSystem.newInstance(new Configuration());
Configuration conf = fileSystem.getConf();
conf.set("fs.mapr.bailout.on.library.mismatch", "false");
Path OffsetPath = new Path(filePath);
FileStatus file = fileSystem.getFileStatus(OffsetPath); ====> This statement gives error
hbase dependencies used:
<dependency>
    <groupId>com.mapr.fs</groupId>
    <artifactId>mapr-hbase</artifactId>
    <version>6.0.1-mapr</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.1.8-mapr-1710</version>
</dependency>
How can I correct this?
So it isn't a great idea to just disable the version mismatch error. Some mismatch is OK for some tasks, but some mismatch can be fatal.
To say much more, it would be helpful to have a little bit of clarification:
- When you say remote access, are you talking from one cluster to another, or from a client machine that is outside the cluster?
- Assuming you mean the second case (a client outside the cluster), what did you install on the client machine?
- Why did you disable the version mismatch check? What versions are you worried will not match?
- Does the POSIX interface or the local loopback NFS work from the problem node?
- Does the hadoop shell work from the problem node?
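One additional observation on the code in the question (a side note, not a confirmed fix): Hadoop file systems pick up their configuration when the instance is created, so setting fs.mapr.bailout.on.library.mismatch on the Configuration of an already-created FileSystem may be too late for a property that is consumed during client initialization. A minimal sketch of the ordering that would apply it, reusing filePath from the question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Set the property before the FileSystem (and its native client) is created.
Configuration conf = new Configuration();
conf.set("fs.mapr.bailout.on.library.mismatch", "false");
FileSystem fileSystem = FileSystem.newInstance(conf);
FileStatus file = fileSystem.getFileStatus(new Path(filePath));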

Vert.x use OpenSslEngine for HTTP server

I'm trying to use OpenSSL engine on an HTTP server.
My configuration looks like this:
HttpServerOptions options = new HttpServerOptions()
        .setSsl(config.getSsl())
        .setSslEngineOptions(new OpenSSLEngineOptions())
        .setClientAuth(ClientAuth.REQUEST)
        .setKeyStoreOptions(keystoreOptions)
        .setTrustStoreOptions(truststoreOptions)
        .setEnabledSecureTransportProtocols(enabledSecureTransportProtocols);
I'm using Vert.x 3.6.2, which brings in Netty 4.1.30 as a dependency. I also added to my pom:
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-tcnative</artifactId>
    <version>2.0.20.Final</version>
    <classifier>linux-x86_64-fedora</classifier>
</dependency>
I chose that classifier because my HTTP server is deployed on RHEL 7 with OpenSSL 1.0.1 (I know it's old). I'm getting the following error:
io.vertx.core.VertxException: OpenSSL is not available
As far as I can see from the logs, the Netty handler tries to load the netty-tcnative native library and can't find it on the classpath:
netty-tcnative not in the classpath; OpenSslEngine will be unavailable
I don't know how to resolve this issue.
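One option that may be worth trying (an assumption, not a verified fix for this setup): the statically linked tcnative artifact bundles BoringSSL instead of loading the system OpenSSL, which sidesteps both the platform classifier and the OpenSSL 1.0.1 version question:

<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-tcnative-boringssl-static</artifactId>
    <!-- statically linked BoringSSL; no OS-specific classifier needed -->
    <version>2.0.20.Final</version>
</dependency>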

What is the Kerberos method?

I'm trying to connect to Hive with JDBC. I keep getting this error. I tried looking it up but could not find anything useful.
This is my connection string:
jdbc:hive2://hostname.xxx.com:10000/default;principal=hive/hostname.xxx.com#HADOOP_ENV.COM
What is this error: java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosTicket(Ljavax/security/auth/Subject;)Z
That method exists in Hadoop 2.8 but not in Hadoop 2.7 -- so my guess is that your project dependencies are not aligned with whatever version of Hadoop you have in Production.
Code in trunk:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/KerberosUtil.java
Code in branch-2.8.0:
https://github.com/apache/hadoop/blob/branch-2.8.0/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/KerberosUtil.java
Code in branch-2.7.4:
https://github.com/apache/hadoop/blob/branch-2.7.4/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/KerberosUtil.java
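For example (a sketch, not part of the original answer), aligning the client-side dependencies with the cluster means pinning the Hadoop artifacts explicitly, with the version matching what runs in production:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-auth</artifactId>
    <!-- hasKerberosTicket(Subject) exists from 2.8.0 onwards;
         match this to your production Hadoop version -->
    <version>2.8.0</version>
</dependency>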
Kerberos is an authentication protocol that is used by Hive server (https://en.wikipedia.org/wiki/Kerberos_(protocol))
The problem you are describing is more about a missing library in your pom.xml. Have you included <artifactId>hive-jdbc</artifactId>?
I think your Kerberos ticket is not generated properly.
Can you try running these two commands, in order, as the user you are trying to connect with:
kdestroy (deletes any Kerberos ticket generated before)
kinit (generates a new ticket)
Then try to connect again.

websocket code not working on Prod Tomcat Server - 404 Error

I am trying to run my first WebSocket app and referred to this link to get some sample code. I simply created the CustomEndPoint and WSClient classes and an HTML file, then ran it in the NetBeans IDE, and it worked like a charm.
I tried to deploy it on a Tomcat server whose URL is accessible over HTTPS, changing ws:// to wss://, and it worked in my dev environment. But when I deployed the same code to the production environment, it throws the error below in the console:
WebSocket connection to 'wss://xxxxxx-xxx.xxxx.com/websoc/ratesrv' failed: Error during WebSocket handshake: Unexpected response code: 404
For the dev environment, the WS call below is working:
wsocket = new WebSocket("wss://dev_ip:8443/websoc/ratesrv");
For prod I am using (note the .com in the URL):
wss://xxxxx-xxxxx.xx.com/websoc/ratesrv
Do I need to explicitly provide the port number in PROD as well?
Does your Tomcat version include a WebSocket runtime?
If it does, you must delete all the WebSocket dependencies from your WAR. Ensure that you run mvn clean after changing their scope to provided (see the sketch below).
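For illustration, a provided-scope WebSocket API dependency typically looks like this (assuming the standard javax.websocket API artifact):

<dependency>
    <groupId>javax.websocket</groupId>
    <artifactId>javax.websocket-api</artifactId>
    <version>1.1</version>
    <!-- Tomcat ships its own JSR-356 implementation; "provided" keeps
         this jar out of the WAR so the two do not clash at runtime -->
    <scope>provided</scope>
</dependency>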
If your Tomcat does not include a WebSocket runtime, you should add one. If you want to use Tyrus, just put:
<dependency>
    <groupId>org.glassfish.tyrus</groupId>
    <artifactId>tyrus-container-servlet</artifactId>
    <version>1.12</version>
</dependency>
<dependency>
    <groupId>org.glassfish.tyrus</groupId>
    <artifactId>tyrus-client</artifactId>
    <version>1.12</version>
</dependency>
And check that there are no errors in the Tomcat console when deploying.
Try to use the latest version of Tomcat.
Tomcat 8.5.20 worked for me.
