How to stream data from a Kafka broker using a Java Spark consumer - java

I have been trying to consume data from a Kafka broker running on my local network. I have a producer pushing messages with a string key and a JSON value. I have been trying to learn Spark and have written up a script in IntelliJ, but whenever I run it I get:
java: cannot access scala.Serializable
class file for scala.Serializable not found
package org.example;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import scala.*;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        // myApp is the name of the application to show in the cluster UI.
        // setMaster takes a Spark, Mesos, or YARN cluster URL;
        // local[*] just sets it up to run in local mode.
        // To start streaming you need to create a context, and the first step is creating your configuration.
        SparkConf conf = new SparkConf().setAppName("myApp").setMaster("local[*]");
        JavaSparkContext c = new JavaSparkContext(conf);
        JavaStreamingContext ssc = new JavaStreamingContext(c, new Duration(1000)); // here is where we create our streaming context

        Map<String, Object> kafkaParams = new HashMap<>();
        Collection<String> topics = Arrays.asList("TestTopic", "TestTopic1"); // a collection of topics to pull data from
        JavaInputDStream<ConsumerRecord<String, String>> stream;
        kafkaParams.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "10.0.0.24:9092,"); // this is where we point our streaming context at the Kafka broker
        kafkaParams.put(ConsumerConfig.CLIENT_ID_CONFIG, StringDeserializer.class);
        //kafkaParams.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, Json.class);
        kafkaParams.put(ConsumerConfig.GROUP_ID_CONFIG, "myGroup"); // giving the broker our consumer group
        kafkaParams.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // pull all records the broker has from the beginning
        kafkaParams.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);

        // You can also create an RDD if batch processing is a better fit; see:
        // https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
        stream = KafkaUtils.createDirectStream( // here the stream is actually created and instantiated
                ssc, // the streaming context we created earlier
                LocationStrategies.PreferConsistent(), // distributes partitions among consumers equally
                ConsumerStrategies.Subscribe(topics, kafkaParams) // Subscribe consumes from a fixed set of topics; SubscribePattern takes a regex of topics of interest
        );

        stream.foreachRDD(x -> x.foreach(System.out::println));
        ssc.start();
    }
}
Here is my pom file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.example</groupId>
    <artifactId>Kafka-Streaming-spark</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <java.version>18</java.version>
        <maven.compiler.source>18</maven.compiler.source>
        <maven.compiler.target>18</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.12.13</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.12</artifactId>
            <version>3.3.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.12</artifactId>
            <version>3.3.1</version>
        </dependency>
    </dependencies>
</project>
I just set it up to print out records to see if it was even consuming messages, but I can't even get it to run. Also, I apologize in advance if my question is poor; this is my first time posting a question.

Related

Data from Kafka is not printed in console when I submitted a jar file (Spark Streaming + Kafka integration 3.1.1)

There is no error when I submit the jar file, but data isn't printed when I send data using the HTTP protocol. (Data is printed fine when I check using kafka-console-consumer.sh.)
[Picture, submitted a jar file: data isn't printed]
The code and dependencies in the jar file are down below.
[Picture, kafka-console-consumer.sh: data is printed]
Command:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --group test-consumer --topic test01 --from-beginning
[JAVA FILE]
2-1, Dependencies
<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>3.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.12</artifactId>
        <version>3.1.1</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.12</artifactId>
        <version>3.1.1</version>
    </dependency>
</dependencies>
2-2, Code
package SparkTest.SparkStreaming;

import org.apache.spark.streaming.*;
import org.apache.spark.streaming.api.java.*;
import java.util.*;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.kafka010.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;

public final class JavaWordCount {
    public static void main(String[] args) throws Exception {
        // Create a StreamingContext with a batch interval of 1 second
        SparkConf conf = new SparkConf().setMaster("yarn").setAppName("JavaWordCount");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

        // load a topic from the broker
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "test-consumer");
        kafkaParams.put("auto.offset.reset", "latest");
        kafkaParams.put("enable.auto.commit", false);

        Collection<String> topics = Arrays.asList("test01");
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferBrokers(),
                        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
                );

        JavaDStream<String> data = stream.map(v -> v.value()); // map each record to its value, yielding a D-Stream of strings
        data.print();

        jssc.start();
        jssc.awaitTermination();
    }
}
You're using --from-beginning in the console consumer, but auto.offset.reset=latest in the Spark code. Therefore, you need to run the producer while Spark is running if you want to see any data.
You will also want to consider using the spark-sql-kafka-0-10 Structured Streaming dependency instead, as you can find in the KafkaWordCount example (see the sketch below).
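To illustrate, here is a minimal Structured Streaming sketch of the same consumer. Treat it as an assumption-laden illustration rather than drop-in code: it presumes matching spark-sql_2.12 and spark-sql-kafka-0-10_2.12 artifacts are on the classpath, and it reuses the broker address and topic name from the question.

package SparkTest.SparkStreaming;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public final class StructuredKafkaRead {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("StructuredKafkaRead")
                .getOrCreate();

        // Kafka rows arrive with binary key/value columns, so cast the value to a string
        Dataset<Row> df = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "test01")
                .option("startingOffsets", "earliest") // mirrors --from-beginning
                .load();

        // Print each micro-batch to the console, like the DStream version's data.print()
        df.selectExpr("CAST(value AS STRING)")
                .writeStream()
                .outputMode("append")
                .format("console")
                .start()
                .awaitTermination();
    }
}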

Maven / Firebase - Cannot find symbol: variable FirestoreClient

I am trying to connect to Firebase within my Java project, managed with Maven. I am following Firebase's Getting Started with Cloud Firestore exactly to set up my development environment and initialize Cloud Firestore. This is what my pom.xml and myClass.java look like.
My dependency in pom.xml:
<dependencies>
    <!-- This dependency is added for Firebase usage -->
    <dependency>
        <groupId>com.google.firebase</groupId>
        <artifactId>firebase-admin</artifactId>
        <version>7.1.0</version>
    </dependency>
My myClass.java:
package ca.uhn.fhir.example;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import ca.uhn.fhir.context.FhirContext;
import ca.uhn.fhir.rest.server.RestfulServer;
import ca.uhn.fhir.rest.server.interceptor.ResponseHighlighterInterceptor;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.firestore.Firestore;
import com.google.firebase.FirebaseApp;
import com.google.firebase.FirebaseOptions;
import java.io.*;

@WebServlet("/*")
public class Example02_SimpleRestfulServer extends RestfulServer {
    @Override
    protected void initialize() throws ServletException {
        // Create a context for the appropriate version
        setFhirContext(FhirContext.forR4());
        // Register resource providers
        // Calls the constructor on the resource provider class?
        registerProvider(new Example01_PatientResourceProvider());
        // Format the responses in nice HTML
        registerInterceptor(new ResponseHighlighterInterceptor());
        try {
            // Initialize Firebase
            // Use a service account
            String credentials_path = "/my/path/dfdfdfdfd93.json";
            InputStream serviceAccount = new FileInputStream(credentials_path);
            GoogleCredentials credentials = GoogleCredentials.fromStream(serviceAccount);
            FirebaseOptions options = new FirebaseOptions.Builder()
                    .setCredentials(credentials)
                    .build();
            FirebaseApp.initializeApp(options);
            Firestore db = FirestoreClient.getFirestore();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
However, I run into the error:
cannot find symbol [ERROR] symbol: variable FirestoreClient
I have tried adding other related dependencies, as shown below, but they are not found or don't seem to fix the problem. I think I am following the Firebase tutorial exactly, so what might I be doing wrong? Why is FirestoreClient not recognized when many of the other symbols are?
<!-- Attempting to fix errors (this doesn't seem to do anything) -->
<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-firestore -->
<!-- <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-firestore</artifactId>
    <version>2.2.4</version>
</dependency> -->
<!-- Not found -->
<!-- https://mvnrepository.com/artifact/com.google.firebase/firebase-firestore -->
<!-- <dependency>
    <groupId>com.google.firebase</groupId>
    <artifactId>firebase-firestore</artifactId>
    <version>18.1.0</version>
</dependency> -->
<!-- Not found -->
<!-- https://mvnrepository.com/artifact/com.google.firebase/firebase-core -->
<!-- <dependency>
    <groupId>com.google.firebase</groupId>
    <artifactId>firebase-core</artifactId>
    <version>18.0.2</version>
</dependency> -->
This is happening because you did not import the FirestoreClient class. Add the following import to your Example02_SimpleRestfulServer class and it will be fixed:
import com.google.firebase.cloud.FirestoreClient;
NOTE: This import should also be in the example in the documentation you shared. I would recommend you open a bug report in Google's IssueTracker so they can fix that documentation, if you'd like.
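For context, here is a minimal sketch of the initialization with that import in place. The standalone class name is illustrative, the credentials path is the placeholder from the question, and it assumes firebase-admin 7.1.0 as in the question's pom:

import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.firestore.Firestore;
import com.google.firebase.FirebaseApp;
import com.google.firebase.FirebaseOptions;
import com.google.firebase.cloud.FirestoreClient; // the missing import

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FirestoreInitSketch {
    public static void main(String[] args) {
        try {
            // Use a service account (the path is the placeholder from the question)
            InputStream serviceAccount = new FileInputStream("/my/path/dfdfdfdfd93.json");
            GoogleCredentials credentials = GoogleCredentials.fromStream(serviceAccount);
            FirebaseOptions options = new FirebaseOptions.Builder()
                    .setCredentials(credentials)
                    .build();
            FirebaseApp.initializeApp(options);

            // Resolves now that FirestoreClient is imported
            Firestore db = FirestoreClient.getFirestore();
            System.out.println("Connected to Firestore: " + db);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}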

@SqsListener results in an exception and does not work

I have a very simple Spring Cloud AWS project. I'm using Java 11.
Here is the pom:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.2.5.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.demo.arf</groupId>
    <artifactId>testsqs-boot</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>testsqs-boot</name>
    <description>Demo project for Spring Boot</description>
    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>Hoxton.SR3</version>
            <type>pom</type>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.junit.vintage</groupId>
                    <artifactId>junit-vintage-engine</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-aws</artifactId>
            <version>2.2.1.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-aws-messaging</artifactId>
            <version>2.2.1.RELEASE</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
Config class:
package com.demo.arf.testsqsboot;

import com.amazonaws.services.sqs.AmazonSQSAsync;
import com.amazonaws.services.sqs.AmazonSQSAsyncClientBuilder;
import org.springframework.beans.factory.annotation.Value;
//import org.springframework.cloud.aws.messaging.core.QueueMessagingTemplate;
import org.springframework.cloud.aws.messaging.config.SimpleMessageListenerContainerFactory;
import org.springframework.cloud.aws.messaging.core.QueueMessagingTemplate;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
//import com.amazonaws.services.sqs.AmazonSQSAsync;
//import com.amazonaws.services.sqs.AmazonSQSAsyncClientBuilder;

@Configuration
public class SQSConfig {

    @Value("${cloud.aws.region.static}")
    private String region;

    @Value("${cloud.aws.credentials.access-key}")
    private String awsAccessKey;

    @Value("${cloud.aws.credentials.secret-key}")
    private String awsSecretKey;

    @Bean
    public QueueMessagingTemplate queueMessagingTemplate() {
        return new QueueMessagingTemplate(amazonSQSAsync());
    }

    public AmazonSQSAsync amazonSQSAsync() {
        return AmazonSQSAsyncClientBuilder.standard().withRegion(Regions.US_EAST_1)
                .withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials(awsAccessKey, awsSecretKey)))
                .build();
    }

    @Bean
    public SimpleMessageListenerContainerFactory simpleMessageListenerContainerFactory(AmazonSQSAsync amazonSQS) {
        SimpleMessageListenerContainerFactory factory = new SimpleMessageListenerContainerFactory();
        factory.setAmazonSqs(amazonSQS);
        factory.setMaxNumberOfMessages(10);
        return factory;
    }
}
The controller class, which sends and receives messages:
package com.demo.arf.testsqsboot.controller;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.aws.messaging.core.QueueMessagingTemplate;
import org.springframework.cloud.aws.messaging.listener.annotation.SqsListener;
import org.springframework.messaging.support.MessageBuilder;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SQSController {

    @Autowired
    private QueueMessagingTemplate queueMessagingTemplate;

    private static final Logger LOG = LoggerFactory.getLogger(SQSController.class);

    @GetMapping("/send-sqs-message")
    public String sendMessage() {
        String sqsEndPoint = "https://sqs.us-east-2.amazonaws.com/1234567879/my_queue";
        queueMessagingTemplate.convertAndSend(sqsEndPoint, MessageBuilder.withPayload("hello from Spring Boot").build());
        return "Hello SQS";
    }

    @SqsListener("my_queue")
    public void getMessage(String message) {
        LOG.info(" *********** Message from SQS Queue - " + message);
    }
}
application.yml:
server:
  port: 9001
cloud:
  aws:
    region:
      static: us-east-1
      auto: false
    credentials:
      access-key: "asdmnasdn"
      secret-key: "sfkjsdjksdkj"
    end-point:
      uri: https://sqs.us-east-2.amazonaws.com/1234567879/my_queue
I can get the send working fine, but when I add the listener, I get the following error during startup and the listener does not receive messages:
2020-03-15 01:02:00.677 INFO 15423 --- [ main] o.s.web.context.ContextLoader : Root WebApplicationContext: initialization completed in 3853 ms
2020-03-15 01:02:01.109 INFO 15423 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.amazonaws.util.XpathUtils (file:/Users/arf/.m2/repository/com/amazonaws/aws-java-sdk-core/1.11.415/aws-java-sdk-core-1.11.415.jar) to method com.sun.org.apache.xpath.internal.XPathContext.getDTMManager()
WARNING: Please consider reporting this to the maintainers of com.amazonaws.util.XpathUtils
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2020-03-15 01:02:01.749 WARN 15423 --- [ main] s.c.a.m.l.SimpleMessageListenerContainer : Ignoring queue with name 'my_queue': The queue does not exist.; nested exception is com.amazonaws.services.sqs.model.QueueDoesNotExistException: The specified queue does not exist for this wsdl version. (Service: AmazonSQS; Status Code: 400; Error Code: AWS.SimpleQueueService.NonExistentQueue; Request ID: 62821505-3f34-5434-a6ee)
2020-03-15 01:02:01.749 INFO 15423 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService
Also, one basic question: how does @SqsListener know where to find the AWS account info and the SQS URI?
I've made it work with the following change in the config class. However, I wonder how most of the sample programs online work without this code (which constructs AmazonSQSAsync with withEndpointConfiguration):
public QueueMessagingTemplate queueMessagingTemplate(AmazonSQSAsync amazonSQS) {
    return new QueueMessagingTemplate(amazonSQS);
}

@Bean
@Primary
public AmazonSQSAsync amazonSQS(AWSCredentialsProvider credentials) {
    return AmazonSQSAsyncClientBuilder.standard()
            .withCredentials(credentials)
            .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(localStackSqsUrl, awsRegion))
            .build();
}

@Bean
@Primary
public AWSCredentialsProvider awsCredentialsProvider() {
    return new AWSCredentialsProviderChain(
            new AWSStaticCredentialsProvider(
                    new BasicAWSCredentials("local", "stack")));
}
A couple of things:
- First, NEVER store your AK/SK in a property or yml file like that. I can tell those are fake values, but you'll always want to pull these from ~/.aws/credentials or instance metadata. AWS clients like AmazonSQSAsyncClientBuilder will do this automatically if you just call .standard(); there is no need for the credentials provider (see the sketch below).
- Second, the same goes for the region.
- Third, I believe @SqsListener will use the ContainerFactory bean you defined earlier; at least, that's how @JmsListener works.
- The error message you received means that your queue name was not found in your account in your selected region. You told it us-east-1, but in your send code you specified us-east-2. My guess, based on your post, is that your queue is in us-east-2, since your question came up about @SqsListener, not the QueueMessagingTemplate.
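As a minimal sketch of the first and last points (assuming the queue actually lives in us-east-2): calling .standard() without an explicit credentials provider lets the default chain resolve credentials on its own.

import com.amazonaws.regions.Regions;
import com.amazonaws.services.sqs.AmazonSQSAsync;
import com.amazonaws.services.sqs.AmazonSQSAsyncClientBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SqsClientConfig {

    @Bean
    public AmazonSQSAsync amazonSQSAsync() {
        // No explicit credentials: the default provider chain reads
        // ~/.aws/credentials, environment variables, or instance metadata.
        return AmazonSQSAsyncClientBuilder.standard()
                .withRegion(Regions.US_EAST_2) // the region the queue actually lives in
                .build();
    }
}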

No listing appears using Swagger on Jersey 2 with Grizzly

I have set up Swagger on Jersey 2 and Grizzly (no web.xml). I am able to access the Swagger page; however, my API resources do not appear.
My main file is seen below:
package com.beraben.jersey.test;

import com.wordnik.swagger.jaxrs.config.BeanConfig;
import java.net.URI;
import org.glassfish.grizzly.http.server.CLStaticHttpHandler;
import org.glassfish.grizzly.http.server.HttpServer;
import org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpServerFactory;
import org.glassfish.jersey.server.ResourceConfig;

/**
 * @author Evan
 */
public class jerseyTestMain {

    public static final String BASE_URI = "http://localhost:8080/myapp/";

    public static HttpServer startServer() {
        // create a resource config that scans for JAX-RS resources and providers
        // in the com.beraben.jersey.test package
        final ResourceConfig rc = new ResourceConfig().packages("com.beraben.jersey.test", "com.wordnik.swagger.jersey.listing");
        BeanConfig config = new BeanConfig();
        config.setResourcePackage("com.beraben.jersey.test");
        config.setVersion("1.0.0");
        config.setScan(true);
        // create and start a new instance of grizzly http server
        // exposing the Jersey application at BASE_URI
        return GrizzlyHttpServerFactory.createHttpServer(URI.create(BASE_URI), rc);
    }

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws InterruptedException {
        final HttpServer server = startServer();
        CLStaticHttpHandler staticHttpHandler = new CLStaticHttpHandler(jerseyTestMain.class.getClassLoader(), "swagger-ui/");
        server.getServerConfiguration().addHttpHandler(staticHttpHandler, "/docs");
        Object syncObj = new Object();
        synchronized (syncObj) {
            syncObj.wait();
        }
    }
}
I also have an API set up, seen below:
package com.beraben.jersey.test;

import com.wordnik.swagger.annotations.Api;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

/**
 * @author Evan
 */
@Path("myresource")
@Api(value = "/myresource")
public class MyResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String getIt() {
        return "Got it!";
    }
}
I have no problem using the API; it returns correctly. But for some reason I can't get Swagger to show details about the API calls. Is there something more I need to do to get it to show details about the existing API in my code?
My static files are copied from the sample project jersey2-grizzly2-swagger-demo.
Also for reference, here is my pom file (the one slight difference from the demo project is that I don't use dependencyManagement to get jersey-bom; instead I reference it directly).
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.beraben</groupId>
    <artifactId>jersey-test</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <dependencies>
        <dependency>
            <groupId>org.glassfish.jersey.archetypes</groupId>
            <artifactId>jersey-quickstart-grizzly2</artifactId>
            <version>2.22.2</version>
        </dependency>
        <dependency>
            <groupId>org.glassfish.jersey</groupId>
            <artifactId>jersey-bom</artifactId>
            <version>2.22.2</version>
            <type>pom</type>
        </dependency>
        <dependency>
            <groupId>org.glassfish.jersey.containers</groupId>
            <artifactId>jersey-container-grizzly2-http</artifactId>
            <version>2.22.2</version>
        </dependency>
        <dependency>
            <groupId>com.wordnik</groupId>
            <artifactId>swagger-jersey-jaxrs_2.10</artifactId>
            <version>1.3.13</version>
        </dependency>
    </dependencies>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
</project>
After looking at the Google forums, it turns out that the standalone version of Swagger does not actually work.
I created a separate Maven web app and added my Jersey project as a dependency. This worked after fiddling around to find the Swagger version that matched the Jersey version I was using.

How can I connect Storm and D3.js using Redis and Flask?

I have my Storm testing topology done. Previously, I created a d3 script in an HTML page that read its data from a text file. I now want it to read the data directly from a Storm topology (a bolt, maybe?), but I have no clue how to do it. I'm using the Hortonworks Sandbox for the testing. Any help would be appreciated.
Thanks in advance!
I've found a Storm package for Redis that I'm trying to use now. It allows you to set up a bolt for writing to Redis, and I've set up the node already. My problem now is that Eclipse can't find the imports in the Java code or the ones in the pom.xml. I've downloaded the package. My current Java bolt and imports are:
package Storm.practice.Storm.Prova;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.testing.TestWordSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import backtype.storm.utils.Utils;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.topology.base.BaseRichSpout;
import java.util.Map;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import storm.external.*; // error from here
import storm.external.storm-redis.org.apache.storm.redis.common.config.JedisClusterConfig;
import org.apache.storm.redis.common.config.JedisPoolConfig;
import org.apache.storm.redis.common.mapper.RedisDataTypeDescription;
import org.apache.storm.redis.common.mapper.RedisStoreMapper;
import redis.clients.jedis.JedisCommands;//to here
..........
class MortsStoreMapper implements RedisStoreMapper {
    private RedisDataTypeDescription description;
    private final String hashKey = "wordCount";

    public WordCountStoreMapper() {
        description = new RedisDataTypeDescription(
                RedisDataTypeDescription.RedisDataType.HASH, hashKey);
    }

    @Override
    public RedisDataTypeDescription getDataTypeDescription() {
        return description;
    }

    @Override
    public String getKeyFromTuple(ITuple tuple) {
        return tuple.getStringByField("word");
    }

    @Override
    public String getValueFromTuple(ITuple tuple) {
        return tuple.getStringByField("count");
    }
}
And my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>Storm.practice</groupId>
    <artifactId>Storm.Prova</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>Storm.Prova</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-core</artifactId>
            <version>0.9.1-incubating</version>
        </dependency>
        <dependency> <!-- error from here... -->
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-redis</artifactId>
            <version>{0.9.1-incubating}</version>
            <type>jar</type>
        </dependency> <!-- ...to here -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.2.1</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>java</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <mainClass>Storm.practice.Storm.Prova.ProvaTopology</mainClass>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
The errors are that Eclipse can't find the dependencies and the packages.
Based on your scenario, I think you will need some system or code in the middle that reads data from Storm and pushes it to D3. You can try something like WSO2 CEP [1], which can connect to Storm and uses WebSockets to push events to a dashboard based on d3 [2].
In your scenario, you can map the logic in your Storm bolt to a Siddhi query [3] and then get those events from Storm into WSO2 CEP. Then you can create a WebSocket publisher to send events to your D3 code using the built-in WebSocket capabilities of the server.
Please note that this is one possible solution based on your requirements; you might be better off utilizing the capabilities of an already existing CEP system that integrates with Storm and D3.
Hope this helps!
[1] http://wso2.com/products/complex-event-processor/
[2] https://docs.wso2.com/display/CEP400/Visualizing+Results+in+the+Analytics+Dashboard
[3] https://docs.wso2.com/display/CEP400/Sample+0501+-+Processing+a+Simple+Filter+Query+with+Apache+Storm+Deployment
I know it's a bit late, almost a year, but I was reviewing my account and saw this question.
I finally used Redis, interfaced with Jedis, which I import as a Maven artifact. Once this was working and I was able to see the results with the Redis monitor via telnet, I created a simple Node.js script, launched it, and the data was arriving at the client, and hence at d3. I needed Socket.io and Redis.js to achieve this, but it is working now.
If someone needs some details, please ask me and I will help you happily.
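In case it helps anyone following the same route, here is a rough sketch of the bolt-side write with Jedis. The class name, host, port, and hash key are illustrative placeholders, not the exact code from my topology:

import redis.clients.jedis.Jedis;

// Rough sketch: write word counts into a Redis hash that the Node.js
// side (Socket.io plus the Redis client) can read and push on to d3.
public class RedisWriter implements AutoCloseable {
    private final Jedis jedis = new Jedis("localhost", 6379); // placeholder host/port

    // Store one word count under a hash key the consumer side knows about
    public void store(String word, long count) {
        jedis.hset("wordCount", word, Long.toString(count));
    }

    @Override
    public void close() {
        jedis.close();
    }
}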
