I have written code to fetch Twitter tweets using Kafka. It works fine, but partitioning does not work. I want to create 3 partitions for one topic. How do I pass the values to the partitioner class? Any suggestions on where I am going wrong?
public class kafkaSpoutFetchingRealTweets {
private String consumerKey;
private String consumerSecret;
private String accessToken;
private String accessTokenSecret;
private TwitterStream twitterStream;
// assumed logger (e.g. log4j) so the logger.info(...) calls below compile
private static final Logger logger = Logger.getLogger(kafkaSpoutFetchingRealTweets.class);
/**
* @param context
*/
void start(final Context context) {
/** Producer properties **/
Properties props = new Properties();
props.put("metadata.broker.list",
context.getString(Constant.BROKER_LIST));
props.put("partitioner.class","SimplePartitioner");
props.put("serializer.class", context.getString(Constant.SERIALIZER));
props.put("request.required.acks",
context.getString(Constant.REQUIRED_ACKS));
props.put("producer.type", "async");
// props.put("partitioner.class", context.getClass());
ProducerConfig config = new ProducerConfig(props);
final Producer<String, String> producer = new Producer<String, String>(
config);
/** Twitter properties **/
consumerKey = context.getString(Constant.CONSUMER_KEY_KEY);
consumerSecret = context.getString(Constant.CONSUMER_SECRET_KEY);
accessToken = context.getString(Constant.ACCESS_TOKEN_KEY);
accessTokenSecret = context.getString(Constant.ACCESS_TOKEN_SECRET_KEY);
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey(consumerKey);
cb.setOAuthConsumerSecret(consumerSecret);
cb.setOAuthAccessToken(accessToken);
cb.setOAuthAccessTokenSecret(accessTokenSecret);
cb.setJSONStoreEnabled(true);
cb.setIncludeEntitiesEnabled(true);
twitterStream = new TwitterStreamFactory(cb.build()).getInstance();
/** Twitter listener **/
StatusListener listener = new StatusListener() {
// The onStatus method is executed every time a new tweet comes
// in.
public void onStatus(Status status) {
if(("en".equals(status.getLang())) && ("en".equals(status.getUser().getLang()))){
KeyedMessage<String, String> data = new KeyedMessage<String, String>(
context.getString(Constant.data),
DataObjectFactory.getRawJSON(status));
producer.send(data);
System.out.println(DataObjectFactory.getRawJSON(status));
}
}
public void onDeletionNotice(
StatusDeletionNotice statusDeletionNotice) {
}
public void onTrackLimitationNotice(int numberOfLimitedStatuses) {
}
public void onScrubGeo(long userId, long upToStatusId) {
}
public void onException(Exception ex) {
ex.printStackTrace();
logger.info("Shutting down Twitter sample stream...");
twitterStream.shutdown();
}
public void onStallWarning(StallWarning warning) {
System.out.println("stallWarning");
}
};
// fq was not declared in the original snippet; a twitter4j FilterQuery is assumed here so the code compiles.
// The language restriction only applies if twitterStream.filter(fq) is used instead of sample().
FilterQuery fq = new FilterQuery();
String[] lang = { "en" };
fq.language(lang);
twitterStream.addListener(listener);
twitterStream.sample();
}
public static void main(String[] args) {
try {
Context context = new Context(args[0]);
kafkaSpoutFetchingRealTweets tp = new kafkaSpoutFetchingRealTweets();
tp.start(context);
} catch (Exception e) {
e.printStackTrace();
logger.info(e.getMessage());
}
}
}
So there are a couple of problems.
Your question and code don't match up. Your question asks about creating a topic with 3 partitions, but the code and example you provided explain how to determine which partition a message should be sent to, given that you've already created a topic with 3 partitions.
If you actually want to create a topic with 3 partitions, you need to use the command-line client. A sample can be found in the quickstart: http://kafka.apache.org/documentation.html#quickstart
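For reference, the quickstart's topic-creation command for that era of Kafka looks roughly like this (the ZooKeeper address and topic name are placeholders to replace with your own):
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic your_topic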
If you actually just want to determine which partition to send data to, you'll need to provide more information about the problem you're encountering. Are all messages going to the same partition? Then you need to look at how the partition is calculated in the SimplePartitioner class that you specify in your config. What is in the SimplePartitioner class?
props.put("partitioner.class","SimplePartitioner");
I have a scheduler which produces an event, and my consumer consumes this event. The payload of this event is a JSON with the fields below:
private String topic;
private String partition;
private String filterKey;
private long CustId;
Now I need to trigger one more consumer which will take all this information that I get in the response from the first consumer.
@KafkaListener(topics = "<**topic-name-from-first-consumer-response**>", groupId = "group", containerFactory = "kafkaListenerFactory")
public void consumeJson(List<User> data, Acknowledgment acknowledgment,
        @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
        @Header(KafkaHeaders.OFFSET) List<Long> offsets) {
    // consumer code goes here...
}
I need to create some dynamic variable which I can pass in place of the topic name.
Similarly, I am using filtering in the configuration file and I need to pass the key dynamically in the configuration.
factory.setRecordFilterStrategy(new RecordFilterStrategy<String, Object>() {
@Override
public boolean filter(ConsumerRecord<String, Object> consumerRecord) {
if(consumerRecord.key().equals("**Key will go here**")) {
return false;
}
else {
return true;
}
}
});
How can we dynamically inject these values from the response of the first consumer and trigger the second consumer? Both consumers are in the same application.
You cannot do that with an annotated listener; the configuration is only used during initialization. You would need to create a listener container yourself (using the ConcurrentKafkaListenerContainerFactory) to dynamically create a listener.
EDIT
Here's an example.
@SpringBootApplication
public class So69134055Application {
public static void main(String[] args) {
SpringApplication.run(So69134055Application.class, args);
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so69134055").partitions(1).replicas(1).build();
}
}
#Component
class Listener {
private static final Logger log = LoggerFactory.getLogger(Listener.class);
private static final Method otherListen;
static {
try {
otherListen = Listener.class.getDeclaredMethod("otherListen", List.class);
}
catch (NoSuchMethodException | SecurityException ex) {
throw new IllegalStateException(ex);
}
}
private final ConcurrentKafkaListenerContainerFactory<String, String> factory;
private final MessageHandlerMethodFactory methodFactory;
private final KafkaAdmin admin;
private final KafkaTemplate<String, String> template;
public Listener(ConcurrentKafkaListenerContainerFactory<String, String> factory, KafkaAdmin admin,
KafkaTemplate<String, String> template, KafkaListenerAnnotationBeanPostProcessor<?, ?> bpp) {
this.factory = factory;
this.admin = admin;
this.template = template;
this.methodFactory = bpp.getMessageHandlerMethodFactory();
}
@KafkaListener(id = "so69134055", topics = "so69134055")
public void listen(String topicName) {
try (AdminClient client = AdminClient.create(this.admin.getConfigurationProperties())) {
NewTopic topic = TopicBuilder.name(topicName).build();
client.createTopics(List.of(topic)).all().get(10, TimeUnit.SECONDS);
}
catch (Exception e) {
log.error("Failed to create topic", e);
}
ConcurrentMessageListenerContainer<String, String> container =
this.factory.createContainer(new TopicPartitionOffset(topicName, 0));
BatchMessagingMessageListenerAdapter<String, String> adapter =
new BatchMessagingMessageListenerAdapter<>(this, otherListen);
adapter.setHandlerMethod(new HandlerAdapter(
this.methodFactory.createInvocableHandlerMethod(this, otherListen)));
FilteringBatchMessageListenerAdapter<String, String> filtered =
new FilteringBatchMessageListenerAdapter<>(adapter, record -> !record.key().equals("foo"));
container.getContainerProperties().setMessageListener(filtered);
container.getContainerProperties().setGroupId("group.for." + topicName);
container.setBeanName(topicName + ".container");
container.start();
IntStream.range(0, 10).forEach(i -> this.template.send(topicName, 0, i % 2 == 0 ? "foo" : "bar", "test" + i));
}
void otherListen(List<String> others) {
log.info("Others: {}", others);
}
}
spring.kafka.consumer.auto-offset-reset=earliest
Output, showing that the filter was applied to the records with bar in the key:
Others: [test0, test2, test4, test6, test8]
In my Flink script I have a stream that I get from one Kafka topic, manipulate, and send back to Kafka using the sink.
public static void main(String[] args) throws Exception {
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
Properties p = new Properties();
p.setProperty("bootstrap.servers", servers_ip_list);
p.setProperty("gropu.id", "Flink");
FlinkKafkaConsumer<Event_N> kafkaData_N =
new FlinkKafkaConsumer("CorID_0", new Ev_Des_Sch_N(), p);
WatermarkStrategy<Event_N> wmStrategy =
WatermarkStrategy
.<Event_N>forMonotonousTimestamps()
.withIdleness(Duration.ofMinutes(1))
.withTimestampAssigner((Event, timestamp) -> {
return Event.get_Time();
});
DataStream<Event_N> stream_N = env.addSource(
kafkaData_N.assignTimestampsAndWatermarks(wmStrategy));
The part above is working fine with no problems at all; the part below is where I'm getting the issue.
String ProducerTopic = "CorID_0_f1";
DataStream<Stream_Blocker_Pojo.block> box_stream_p= stream_N
.keyBy((Event_N CorrID) -> CorrID.get_CorrID())
.map(new Stream_Blocker_Pojo());
FlinkKafkaProducer<Stream_Blocker_Pojo.block> myProducer = new FlinkKafkaProducer<>(
ProducerTopic,
new ObjSerializationSchema(ProducerTopic),
p,
FlinkKafkaProducer.Semantic.EXACTLY_ONCE); // fault-tolerance
box_stream_p.addSink(myProducer);
No errors, everything works fine. This is the Stream_Blocker_Pojo where I map a stream, manipulating it and sending out a new one. (I have simplified my code, keeping just 4 variables and removing all the math and data processing.)
public class Stream_Blocker_Pojo extends RichMapFunction<Event_N, Stream_Blocker_Pojo.block>
{
public class block {
public Double block_id;
public Double block_var2 ;
public Double block_var3;
public Double block_var4;}
private transient ValueState<block> state_a;
#Override
public void open(Configuration parameters) throws Exception {
state_a = getRuntimeContext().getState(new ValueStateDescriptor<>("BoxState_a", block.class));
}
public block map(Event_N input) throws Exception {
p1.Stream_Blocker_Pojo.block current_a = state_a.value();
if (current_a == null) {
current_a = new p1.Stream_Blocker_Pojo.block();
current_a.block_id = 0.0;
current_a.block_var2 = 0.0;
current_a.block_var3 = 0.0;
current_a.block_var4 = 0.0;}
current_a.block_id = input.f_num_id;
current_a.block_var2 = input.f_num_2;
current_a.block_var3 = input.f_num_3;
current_a.block_var4 = input.f_num_4;
state_a.update(current_a);
return new block();
};
}
This is the implementation of the Kafka Serialization schema.
public class ObjSerializationSchema implements KafkaSerializationSchema<Stream_Blocker_Pojo.block>{
private String topic;
private ObjectMapper mapper;
public ObjSerializationSchema(String topic) {
super();
this.topic = topic;
}
@Override
public ProducerRecord<byte[], byte[]> serialize(Stream_Blocker_Pojo.block obj, Long timestamp) {
byte[] b = null;
if (mapper == null) {
mapper = new ObjectMapper();
}
try {
b= mapper.writeValueAsBytes(obj);
} catch (JsonProcessingException e) {
}
return new ProducerRecord<byte[], byte[]>(topic, b);
}
}
When I open the messages that I sent from my Flink script using Kafka, I find that all the variables are "null":
CorrID b'{"block_id":null,"block_var1":null,"block_var2":null,"block_var3":null,"block_var4":null}
It looks like I'm sending out an empty object with no values, but I'm struggling to understand what I'm doing wrong. I think the problem could be in my implementation of Stream_Blocker_Pojo or maybe in ObjSerializationSchema. Any help would be really appreciated. Thanks.
There are two probable issues here:
Are you sure the variable of type block that you are passing doesn't have null fields? You may want to debug that part to be sure.
The reason may also be in the ObjectMapper: you should have getters and setters available for your block, otherwise Jackson may not be able to access them.
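On the first point: the posted map method updates current_a and stores it in state, but then returns new block(), whose Double fields are all null, which by itself would explain the all-null JSON. A minimal sketch of the method returning the populated object instead (same fields as in the question) would be:
@Override
public block map(Event_N input) throws Exception {
    block current_a = state_a.value();
    if (current_a == null) {
        current_a = new block();   // first event for this key
    }
    current_a.block_id = input.f_num_id;
    current_a.block_var2 = input.f_num_2;
    current_a.block_var3 = input.f_num_3;
    current_a.block_var4 = input.f_num_4;
    state_a.update(current_a);
    return current_a;              // return the populated object, not a fresh all-null one
}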
I have a Kafka producer class which works fine. The producer fills the Kafka topic. Its code is as follows:
public class kafka_test {
private final static String TOPIC = "flinkTopic";
private final static String BOOTSTRAP_SERVERS = "10.32.0.2:9092,10.32.0.3:9092,10.32.0.4:9092";
public FlinkKafkaConsumer<String> createStringConsumerForTopic(
String topic, String kafkaAddress, String kafkaGroup) {
// ************************** KAFKA Properties ******
Properties props = new Properties();
props.setProperty("bootstrap.servers", kafkaAddress);
props.setProperty("group.id", kafkaGroup);
FlinkKafkaConsumer<String> myconsumer = new FlinkKafkaConsumer<>(
topic, new SimpleStringSchema(), props);
myconsumer.setStartFromLatest();
return myconsumer;
}
private static Producer<Long, String> createProducer() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
props.put(ProducerConfig.CLIENT_ID_CONFIG, "MyKafkaProducer");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
return new KafkaProducer<>(props);
}
public void runProducer(String msg) throws Exception {
final Producer<Long, String> producer = createProducer();
try {
final ProducerRecord<Long, String> record = new ProducerRecord<>(TOPIC, msg );
RecordMetadata metadata = producer.send(record).get();
System.out.printf("sent record(key=%s value='%s')" + " metadata(partition=%d, offset=%d)\n",
record.key(), record.value(), metadata.partition(), metadata.offset());
} finally {
producer.flush();
producer.close();
}
}
}
public class producerTest {
public static void main(String[] args) throws Exception{
kafka_test objKafka=new kafka_test();
String pathFile="/home/cfms11/IdeaProjects/pooyaflink2/KafkaTest/quickstart/lastDay4.csv";
String delimiter="\n";
objKafka.createStringProducer("flinkTopic",
"10.32.0.2:9092,10.32.0.3:9092,10.32.0.4:9092");
Scanner scanner = new Scanner(new File(pathFile));
scanner.useDelimiter(delimiter);
int i=0;
while(scanner.hasNext()){
if (i==0)
TimeUnit.MINUTES.sleep(1);
objKafka.runProducer(scanner.next());
i++;
}
scanner.close();
}
}
Because I want to provide data for my Flink program, I use Kafka. In fact, I have this part of the code to consume data from the Kafka topic:
Properties props = new Properties();
props.setProperty("bootstrap.servers",
"10.32.0.2:9092,10.32.0.3:9092,10.32.0.4:9092");
props.setProperty("group.id", kafkaGroup);
FlinkKafkaConsumer<String> myconsumer = new FlinkKafkaConsumer<>(
"flinkTopic", new SimpleStringSchema(), props);
DataStream<String> text = env.addSource(myconsumer.setStartFromEarliest());
I want to run the producer code at the same time that my program is running. My goal is that the producer sends one record to the topic and the consumer can poll that record from the topic at the same time.
Would you please tell me how this is possible and how to manage it.
I think you need to create two class files: one is the producer, the other is the consumer. Create the topic first and then run the consumer, or run the producer directly.
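A rough sketch of what the consumer-side class could look like, reusing the createStringConsumerForTopic helper from the question (the group id "flinkGroup" is an assumption; the topic and broker list are the ones shown above). It is started as its own program while the producer runs separately:
public class consumerTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        kafka_test objKafka = new kafka_test();
        FlinkKafkaConsumer<String> myconsumer = objKafka.createStringConsumerForTopic(
                "flinkTopic",
                "10.32.0.2:9092,10.32.0.3:9092,10.32.0.4:9092",
                "flinkGroup");
        DataStream<String> text = env.addSource(myconsumer);
        text.print();   // replace with the real processing of the Flink program
        // env.execute() blocks, which is why the producer runs as a separate process
        env.execute("Kafka consumer job");
    }
}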
I created an app for getting info from upwork.com. I use the Java lib and Upwork OAuth 1.0. The problem is that local requests to the API work fine, but when I deploy to Google Cloud my code does not work. I get {"error":{"code":"503","message":"Exception: IOException"}}.
I create UpworkAuthClient to return an OAuthClient, which is then used for requests in JobClient.
public void run() {
UpworkAuthClient upworkClient = new UpworkAuthClient();
upworkClient.setTokenWithSecret("USER TOKEN", "USER SECRET");
OAuthClient client = upworkClient.getOAuthClient();
//set query
JobQuery jobQuery = new JobQuery();
jobQuery.setQuery("query");
List<JobQuery> jobQueries = new ArrayList<>();
jobQueries.add(jobQuery);
// Get request of job
JobClient jobClient = new JobClient(client, jobQuery);
List<Job> result = jobClient.getJob();
}
public class UpworkAuthClient {
public static final String CONSUMERKEY = "UPWORK KEY";
public static final String CONSUMERSECRET = "UPWORK SECRET";
public static final String OAUTH_CALLBACK = "https://my-app.com/main";
OAuthClient client ;
public UpworkAuthClient() {
Properties keys = new Properties();
keys.setProperty("consumerKey", CONSUMERKEY);
keys.setProperty("consumerSecret", CONSUMERSECRET);
Config config = new Config(keys);
client = new OAuthClient(config);
}
public void setTokenWithSecret (String token, String secret){
client.setTokenWithSecret(token, secret);
}
public OAuthClient getOAuthClient() {
return client;
}
public String getAuthorizationUrl() {
return this.client.getAuthorizationUrl(OAUTH_CALLBACK);
}
}
public class JobClient {
private JobQuery jobQuery;
private Search jobs;
public JobClient(OAuthClient oAuthClient, JobQuery jobQuery) {
jobs = new Search(oAuthClient);
this.jobQuery = jobQuery;
}
public List<Job> getJob() throws JSONException {
JSONObject job = jobs.find(jobQuery.getQueryParam());
List<Job> jobList = parseResponse(job);
return jobList;
}
}
The local dev server works fine and I get results on my local machine, but not in the Cloud.
I will be glad for any ideas, thanks!
{"error":{"code":"503","message":"Exception: IOException"}}
This doesn't seem like a response returned by the Upwork API. Could you please provide the full response, including the returned headers, so we can take a more precise look into it?
I have an app where I filter messages according to some rules (the presence of certain keywords or regexps). These rules are to be stored in a .properties file (as they must be persistent). I've figured out how to read data from this file. Here is the relevant part of the code:
public class Config {
private static final Config ourInstance = new Config();
private static final CompositeConfiguration prop = new CompositeConfiguration();
public static Config getInstance() {
return ourInstance;
}
public Config(){
}
public synchronized void load() {
try {
prop.addConfiguration(new SystemConfiguration());
System.out.println("Loading /rules.properties");
final PropertiesConfiguration p = new PropertiesConfiguration();
p.setPath("/home/mikhail/bzrrep/DLP/DLPServer/src/main/resources/rules.properties");
p.load();
prop.addConfiguration(p);
} catch (ConfigurationException e) {
e.printStackTrace();
}
final int processors = prop.getInt("server.processors", 1);
// If you don't see this line - likely config name is wrong
System.out.println("Using processors:" + processors);
}
public void setKeyword(String customerId, String keyword){
}
public void setRegexp(String customerId, String regexp)
{}
}
As you can see, I'm going to add values to some properties. Here is the .properties file itself:
users = admin, root, guest
users.admin.keywords = admin
users.admin.regexps = test-5, test-7
users.root.keywords = root
users.root.regexps = *
users.guest.keywords = guest
users.guest.regexps =
I have a GUI for the user to add keywords and regexps to this config. So, how do I implement the methods setKeyword and setRegexp?
The easiest way I found is to read the current values of the property into a String[], add the new value there, and set the property:
props.setProperty(fieldName, values);
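For example, a sketch of setKeyword along those lines, assuming the PropertiesConfiguration built in load() is kept as a field (called p here) and that the keys follow the users.<id>.keywords layout shown above; setRegexp would work the same way with the .regexps key:
public synchronized void setKeyword(String customerId, String keyword) {
    try {
        String key = "users." + customerId + ".keywords";
        // read the current values, add the new one, then write the property back
        List<Object> values = new ArrayList<>(p.getList(key));
        if (!values.contains(keyword)) {
            values.add(keyword);
        }
        p.setProperty(key, values);
        p.save();   // persist the change to rules.properties
    } catch (ConfigurationException e) {
        e.printStackTrace();
    }
}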