Samza/Kafka Failed to Update Metadata - java

I am currently writing a Samza script that will simply take data from one Kafka topic and output it to another Kafka topic. I have written a very basic StreamTask; however, upon execution I am running into an error.
The error is below:
Exception in thread "main" org.apache.samza.SamzaException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 193 ms.
at org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:112)
at org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.writeConfig(CoordinatorStreamSystemProducer.java:129)
at org.apache.samza.job.JobRunner.run(JobRunner.scala:79)
at org.apache.samza.job.JobRunner$.main(JobRunner.scala:48)
at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 193 ms
I'm not entirely sure how to configure the job, or have the script write, the required Kafka metadata. Below is my code for the StreamTask and the properties file. I added the Metadata section to the properties file to see whether that would help, but to no avail. Is that the right direction, or am I missing something entirely?
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.TaskCoordinator;
import org.apache.samza.system.SystemStream;
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.system.OutgoingMessageEnvelope;
/*
* Take all messages received and send them to
* a Kafka topic called "words"
*/
public class TestStreamTask implements StreamTask {
    // System stream for the Kafka topic "words"
    private static final SystemStream OUTPUT_STREAM = new SystemStream("kafka", "words");

    @Override
    public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
        String message = (String) envelope.getMessage(); // pull the message from the stream
        for (String word : message.split(" ")) {
            collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, word, 1)); // emit each word to the "words" topic
        }
    }
}
# Job
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
job.name=test-words
# YARN
yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz
# Task
task.class=samza.examples.wikipedia.task.TestStreamTask
task.inputs=kafka.test
task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka
task.checkpoint.replication.factor=1
# Metrics
metrics.reporters=snapshot,jmx
metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
metrics.reporter.snapshot.stream=kafka.metrics
metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
# Serializers
serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
# Systems
systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.samza.msg.serde=string
systems.kafka.consumer.zookeeper.connect=localhost:2181/
systems.kafka.consumer.auto.offset.reset=largest
systems.kafka.producer.bootstrap.servers=localhost:9092
# Metadata
systems.kafka.metadata.bootstrap.servers=localhost:9092

This question is about Kafka 0.8, which, if I am not mistaken, has been out of support for quite a while.
That, combined with the fact that people only ran into this issue intermittently (and nobody seems to have struggled with it in recent years), gives me good confidence that upgrading to a more recent version of Kafka will resolve the problem.
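If upgrading is not an option right away, a quick sanity check outside Samza can at least confirm whether the broker at localhost:9092 is reachable and serving metadata at all, which is the usual cause of this timeout. The snippet below is a minimal, hypothetical smoke test, not part of the Samza job; the topic name "test" is taken from task.inputs, and the exact timeout property name depends on the client version.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.serialization.StringSerializer;

public class MetadataSmokeTest {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // On 0.8.x new-producer clients the metadata wait is "metadata.fetch.timeout.ms";
        // newer clients use "max.block.ms". Raise it while testing so a slow startup
        // does not masquerade as an unreachable broker.
        props.put("metadata.fetch.timeout.ms", "30000");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // partitionsFor() forces a metadata fetch and throws a TimeoutException
            // if no broker can be reached.
            System.out.println(producer.partitionsFor("test"));
        }
    }
}

If this standalone check also times out, the problem is broker connectivity rather than anything Samza-specific.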

Related

NoClassDefFoundError: Could not initialize class com.yammer.metrics.Metrics on storm supervisors when trying to read topics from kafka

I am running a topology that reads from 100 Kafka topics using 100 Storm spouts, with 50 bolts emitting the data to my own data-processing class (10 instances/threads per bolt). I am getting the following errors now and then (not always), and only on some of the supervisor hosts (not always the same ones). I clearly have com.yammer.metrics.Metrics on my classpath.
However, the error keeps bugging me:
java.lang.NoClassDefFoundError: Could not initialize class com.yammer.metrics.Metrics
at kafka.metrics.KafkaMetricsGroup$class.newTimer(KafkaMetricsGroup.scala:52)
at kafka.consumer.FetchRequestAndResponseMetrics.newTimer(FetchRequestAndResponseStats.scala:25)
at kafka.consumer.FetchRequestAndResponseMetrics.<init>(FetchRequestAndResponseStats.scala:26)
at kafka.consumer.FetchRequestAndResponseStats.<init>(FetchRequestAndResponseStats.scala:37)
at kafka.consumer.FetchRequestAndResponseStatsRegistry$$anonfun$2.apply(FetchRequestAndResponseStats.scala:50)
at kafka.consumer.FetchRequestAndResponseStatsRegistry$$anonfun$2.apply(FetchRequestAndResponseStats.scala:50)
at kafka.utils.Pool.getAndMaybePut(Pool.scala:61)
at kafka.consumer.FetchRequestAndResponseStatsRegistry$.getFetchRequestAndResponseStats(FetchRequestAndResponseStats.scala:54)
at kafka.consumer.SimpleConsumer.<init>(SimpleConsumer.scala:39)
at kafka.javaapi.consumer.SimpleConsumer.<init>(SimpleConsumer.scala:34)
at storm.kafka.DynamicPartitionConnections.register(DynamicPartitionConnections.java:43)
at storm.kafka.PartitionManager.<init>(PartitionManager.java:58)
at storm.kafka.ZkCoordinator.refresh(ZkCoordinator.java:78)
at storm.kafka.ZkCoordinator.getMyManagedPartitions(ZkCoordinator.java:45)
at storm.kafka.KafkaSpoutWrapper.nextTuple(KafkaSpoutWrapper.java:74)
at backtype.storm.daemon.executor$fn__8840$fn__8855$fn__8886.invoke(executor.clj:577)
at backtype.storm.util$async_loop$fn__551.invoke(util.clj:491)
at clojure.lang.AFn.run(AFn.java:22)
at java.base/java.lang.Thread.run(Thread.java:833)
Anyone having any ideas about this would be greatly appreciated! Thanks!
The Storm version I am using: 0.10.2
The Kafka version I am using: kafka_2.10-0.8.2.1
The error is odd because it does not always happen, does not always hit the same supervisor host, and never hits all of the supervisor hosts at once. I don't know where to begin.
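One thing that may help narrow it down, given that only some supervisors are affected: a tiny, purely diagnostic check (a sketch, not a known fix) run from each worker, for example from a bolt's prepare(), that logs whether metrics-core is actually loadable there and which jar it came from.

public final class MetricsClasspathCheck {
    public static void log() {
        try {
            Class<?> metrics = Class.forName("com.yammer.metrics.Metrics");
            System.out.println("metrics-core loadable, source: "
                    + metrics.getProtectionDomain().getCodeSource());
        } catch (Throwable t) {
            // NoClassDefFoundError / ExceptionInInitializerError both land here,
            // pointing at a missing or conflicting metrics-core jar on this worker.
            System.err.println("com.yammer.metrics.Metrics not usable here: " + t);
        }
    }
}

If some hosts report a different (or no) code source, the workers are not running with identical classpaths, which would explain the intermittent failures.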

Redistribution fails for large messages in ActiveMQ Artemis

I'm using ActiveMQ Artemis version 2.19.1, and I'm facing an issue in a 6-node (3 masters) cluster where redistribution fails for large messages with the warnings below:
23:35:05,551 WARN [org.apache.activemq.artemis.core.server] AMQ222303: Redistribution by Redistributor[TEST_QUEUE/2244] of messageID = 196,950,715 failed: java.lang.UnsupportedOperationException: Method not supported with Large Messages
at org.apache.activemq.artemis.protocol.amqp.broker.AMQPLargeMessage.getData(AMQPLargeMessage.java:311) [artemis-amqp-protocol-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.protocol.amqp.broker.AMQPMessage.anyMessageAnnotations(AMQPMessage.java:1374) [artemis-amqp-protocol-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.protocol.amqp.broker.AMQPMessage.hasScheduledDeliveryTime(AMQPMessage.java:1352) [artemis-amqp-protocol-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.processRoute(PostOfficeImpl.java:1499) [artemis-server-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.core.server.cluster.impl.Redistributor$1.run(Redistributor.java:169) [artemis-server-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.19.1.jar:2.19.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_322]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_322]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.19.1.jar:2.19.1]
Later I see the broker removing consumers with the warning below, since properties=null:
00:10:28,280 WARN [org.apache.activemq.artemis.core.server] AMQ222151: removing consumer which did not handle a message, consumer=ServerConsumerImpl [id=57, filter=null, binding=LocalQueueBinding [address=TEST_QUEUE, queue=QueueImpl[name=TEST_QUEUE, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::name=localhost], temp=false]#9c000b7, filter=null, name=TEST_QUEUE, clusterName=TEST_QUEUE1e359f55-c92b-11ec-b908-005056a3af3f]], message=Reference[157995028]:RELIABLE:AMQPLargeMessage( [durable=true, messageID=157995028, address=TEST_QUEUE, size=0, scanningStatus=SCANNED, applicationProperties={VER=05, trackingId=62701757c80c3004d037ded6}, messageAnnotations={}, properties=null, extraProperties = TypedProperties[_AMQ_AD=TEST_QUEUE]]: java.lang.IllegalArgumentException: Array must not be empty or null
at org.apache.qpid.proton.codec.CompositeReadableBuffer.append(CompositeReadableBuffer.java:688) [proton-j-0.33.10.jar:]
at org.apache.qpid.proton.engine.impl.DeliveryImpl.send(DeliveryImpl.java:345) [proton-j-0.33.10.jar:]
at org.apache.qpid.proton.engine.impl.SenderImpl.send(SenderImpl.java:74) [proton-j-0.33.10.jar:]
at org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerSenderContext$LargeMessageDeliveryContext.deliverInitialPacket(ProtonServerSenderContext.java:686) [artemis-amqp-protocol-2.19.1.jar:2.19.1]
at org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerSenderContext$LargeMessageDeliveryContext.deliver(ProtonServerSenderContext.java:587) [artemis-amqp-protocol-2.19.1.jar:2.19.1]
I have 6 consumers for this queue. If one message out of many (say 1,000) is large, the other messages should still be processed, but instead processing stops completely and the queue is left with 0 consumers.
When you send a large message, the first block that is sent is parsed for properties. Most likely you are using properties so large that the server cannot parse the first few bytes of the message. Avoid large properties and keep the large portion of the message in the body only.
We followed up with many tests on this JIRA, and the only plausible scenarios are large properties, or an incomplete message that your client generates in a way the server cannot parse:
https://issues.apache.org/jira/browse/ARTEMIS-3837
If you provide a reproducer showing how the property fails to be parsed, we will follow up with a possible fix.
Please move any further discussion to the JIRA. It could be a bug triggered by your anti-pattern.
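To illustrate the suggestion above, here is a rough JMS sketch (any JMS 2.0 client speaking AMQP, e.g. Qpid JMS; the queue name and property keys are taken from the warning in the question) that keeps application properties tiny and puts the large payload only in the message body:

import javax.jms.BytesMessage;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class LargeBodySmallPropsSender {
    public static void send(ConnectionFactory factory, String trackingId, byte[] largePayload) throws Exception {
        try (Connection connection = factory.createConnection()) {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("TEST_QUEUE"));
            BytesMessage message = session.createBytesMessage();
            message.setStringProperty("VER", "05");              // keep properties a few bytes each
            message.setStringProperty("trackingId", trackingId); // routing metadata only
            message.writeBytes(largePayload);                    // the large part lives in the body
            producer.send(message);
        }
    }
}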

Does anyone know how to fix a hanging Kafka producer?

Can anyone please tell me about this exception?
ERROR [kafka-producer-network-thread | producer-2] c.o.p.a.s.CalculatorAdapter [CalculatorAdapter.java:285]
Cannot send outgoingDto with decision id = 46d1-9491-123ce9c7a916 in kafka:
org.springframework.kafka.core.KafkaProducerException: Failed to send;
nested exception is org.apache.kafka.common.errors.TimeoutException:
Expiring 1 record(s) for save-request-0:604351 ms has passed since batch creation
at org.springframework.kafka.core.KafkaTemplate.lambda$buildCallback$4(KafkaTemplate.java:602)
at org.springframework.kafka.core.DefaultKafkaProducerFactory$CloseSafeProducer$1.onCompletion(DefaultKafkaProducerFactory.java:871)
at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1356)
at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:231)
at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:197)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:676)
at org.apache.kafka.clients.producer.internals.Sender.sendProducerData(Sender.java:380)
at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:323)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.errors.TimeoutException:
Expiring 1 record(s) for save-request-0:604351 ms has passed since batch creation
I have been fighting with this for two weeks now.
I have gone through a bunch of suggested fixes, but none of them helped.
My program sends messages about 60 kilobytes in size, but they do not reach the Kafka server.
The entire Java application log is filled with exceptions of this kind.
My guess is that filling a batch takes longer than the transaction allows, so the message is never sent.
// example
Properties props = new Properties();
...
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 60000); // 60 KB batch size
...
Producer<String, String> producer = new KafkaProducer<>(props);
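Beyond the batch size itself, a few related producer settings control how long a record may sit in a batch before the "Expiring 1 record(s)" error fires. The sketch below uses illustrative values, not recommendations; the constant names come from the standard Kafka clients, and delivery.timeout.ms requires clients 2.1 or newer.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerTimeoutExample {
    public static Producer<String, String> build(String bootstrapServers) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);           // 64 KB, so a single 60 KB record fits in one batch
        props.put(ProducerConfig.LINGER_MS_CONFIG, 5);                // don't wait long for a batch to fill up
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);   // per-request timeout
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000); // total budget before "Expiring N record(s)"
        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 1048576);   // must exceed the largest single record
        return new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());
    }
}

Note that batch.size is only an upper bound; the producer does not wait for a batch to fill before sending, so if records still expire after tuning, an unreachable or overloaded broker is the more likely culprit.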
Check out these articles:
Kafka Producer Batch
Kafka Producer batch size
Batch size configuration
http://cloudurable.com/blog/kafka-tutorial-kafka-producer-advanced-java-examples/index.html
https://kafka.apache.org/26/javadoc/org/apache/kafka/clients/producer/ProducerConfig.html

How to display topics using Kafka Clients in Java?

public static void main(String[] args) throws Exception {
    Properties config = new Properties();
    config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
    AdminClient admin = AdminClient.create(config);
    Set<String> topicNames = admin.listTopics().names().get();
}
I am catching an ExecutionException with the error messages: org.apache.kafka.common.errors.TimeoutException: Call(callName:listTopics, deadlineMs=1599813311360, tries=1, nextAllowedTryMs=-9223372034707292162) timed out at 9223372036854775807 after 1 attempt(s)
The stack trace points to the class KafkaFutureImpl in wrapAndThrow.
I can't really paste the entire error since I am writing all of this from my mobile phone.
I am using Kafka Clients 2.6.0 and JDK 1.8.0_191.
This is very weird since the timeout happens instantly; I have also tried passing arguments into get(time, timeUnit) and I get the same result.
EDIT:
I was missing some dependencies in the POM. That resolved the issue.
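For reference, once the dependencies are in place, a minimal complete listing looks roughly like this; the explicit per-call timeout is optional and illustrative.

import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListTopicsOptions;

public class ListTopicsExample {
    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "mybroker.ip.address:9092");
        try (AdminClient admin = AdminClient.create(config)) {
            Set<String> names = admin.listTopics(new ListTopicsOptions().timeoutMs(10000))
                                     .names()
                                     .get(); // blocks until the call completes or times out
            names.forEach(System.out::println);
        }
    }
}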

Kafka Streams creates avro topic without schema

I developed a Java application which reads data from an Avro topic using the Schema Registry, then makes simple transformations and prints the result to the console. By default I used the GenericAvroSerde class for keys and values. Everything worked fine, except that I had to define additional configuration for each serde, like this:
final Map<String, String> serdeConfig = Collections.singletonMap("schema.registry.url", kafkaStreamsConfig.getProperty("schema.registry.url"));
final Serde<GenericRecord> keyGenericAvroSerde = new GenericAvroSerde();
final Serde<GenericRecord> valueGenericAvroSerde = new GenericAvroSerde();
keyGenericAvroSerde.configure(serdeConfig, true);
valueGenericAvroSerde.configure(serdeConfig, false);
Without that I always get an error like:
Exception in thread "NTB27821-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Failed to deserialize value for record. topic=CH-PGP-LP2_S20-002_agg, partition=0, offset=4482940
at org.apache.kafka.streams.processor.internals.SourceNodeRecordDeserializer.deserialize(SourceNodeRecordDeserializer.java:46)
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:84)
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:117)
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:474)
at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:642)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:548)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:519)
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 69
Caused by: java.lang.NullPointerException
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:122)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:93)
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:55)
at io.confluent.kafka.streams.serdes.avro.GenericAvroDeserializer.deserialize(GenericAvroDeserializer.java:63)
at io.confluent.kafka.streams.serdes.avro.GenericAvroDeserializer.deserialize(GenericAvroDeserializer.java:39)
at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:65)
at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:55)
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:56)
at org.apache.kafka.streams.processor.internals.SourceNodeRecordDeserializer.deserialize(SourceNodeRecordDeserializer.java:44)
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:84)
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:117)
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:474)
at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:642)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:548)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:519)
Well, it was unusual, but fine; after that (once I added the configuration call as posted above) it worked, and my application was able to do all the operations and print out the result.
But!
When I tried to call through() - just to post data to a new topic - I faced the problem I am asking about: THE TOPIC WAS CREATED WITHOUT A SCHEMA.
How can that be?
The interesting fact is that the data is being written, but it is:
a) in binary format, so a simple consumer cannot read it;
b) without a schema, so an Avro consumer can't read it either:
Processed a total of 1 messages
[2017-10-05 11:25:53,241] ERROR Unknown error when running consumer: (kafka.tools.ConsoleConsumer$:105)
org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 0
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:182)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:203)
at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:379)
at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:372)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getSchemaByIdFromRegistry(CachedSchemaRegistryClient.java:65)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getBySubjectAndId(CachedSchemaRegistryClient.java:131)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:122)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:93)
at io.confluent.kafka.formatter.AvroMessageFormatter.writeTo(AvroMessageFormatter.java:122)
at io.confluent.kafka.formatter.AvroMessageFormatter.writeTo(AvroMessageFormatter.java:114)
at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:140)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:78)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:53)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
Of course I checked out the schema registry for the subject:
curl -X GET http://localhost:8081/subjects/agg_value_9-value/versions
{"error_code":40401,"message":"Subject not found."}
But the same call for another topic, written by the Java app that produces the initial data, shows that the schema exists:
curl -X GET http://localhost:8081/subjects/CH-PGP-LP2_S20-002_agg-value/versions
[1]
Both applications use an identical "schema.registry.url" configuration.
Just to summarize: the topic is created and data is written that can be read with a simple consumer, but it is binary and the schema doesn't exist.
I also tried to create a schema with Landoop to somehow match the data, but with no success - and in any case that is not the proper way to use Kafka Streams; everything should be done on the fly.
Help, please!
When through() is called, the default serde defined via StreamsConfig is used unless the user specifically overrides it. Which default serde did you use? To be correct you should be using a serde based on AbstractKafkaAvroSerializer, which will automatically register the schema for that through topic.
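To make that concrete, one way to get the Schema Registry involved for the intermediate topic is to hand through() the same serdes the question already configured with schema.registry.url. This is only a sketch: the topic name "agg_value_9" is assumed from the subject lookup above, and the exact overload depends on the Streams version.

import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class ThroughWithAvroSerdes {
    static KStream<GenericRecord, GenericRecord> forward(
            KStream<GenericRecord, GenericRecord> aggregated,
            Serde<GenericRecord> keyGenericAvroSerde,
            Serde<GenericRecord> valueGenericAvroSerde) {
        // Kafka Streams 1.0+; on 0.10.x/0.11 the equivalent overload is
        // aggregated.through(keyGenericAvroSerde, valueGenericAvroSerde, "agg_value_9")
        return aggregated.through("agg_value_9",
                Produced.with(keyGenericAvroSerde, valueGenericAvroSerde));
    }
}

With the configured Avro serdes in play, the records written to the through topic carry the registry schema id, and the corresponding -value subject should appear in the registry on the first write.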
