Creating a generic Avro sink in Spring Cloud Stream / Dataflow - java

I am trying to create a generic receiver for Avro messages in Spring Cloud Data Flow but I am running into some bother. The setup I currently have is a processor converting my input data into an Avro message and pushing it out to the sink. I am using the Spring Schema Registry server and I can see the schema being POSTed to it and stored successfully; I can also see it being retrieved successfully from the registry server by my sink.
If I place the Avro-generated class from the processor into the sink application and configure my sink like so, with the type declared, it works perfectly.
@StreamListener(Sink.INPUT)
public void logHandler(DataRecord data) {
    LOGGER.info("data='{}'", data.toString());
}
However, I would like to make it so that my sink does not need to be aware of the schema ahead of time, e.g. use the schema from the schema registry and access the fields through data.get("fieldName").
I was hoping to accomplish this through use of the Avro GenericRecord like so:
@StreamListener(Sink.INPUT)
public void logHandler(GenericRecord data) {
    LOGGER.info("data='{}'", data.toString());
}
But this throws an exception into the logs:
2017-12-05 12:10:15,206 DEBUG -L-2 o.s.w.c.RestTemplate:691 - GET request for "http://192.168.99.100:8990/datarecord/avro/v1" resulted in 200 (null)
org.springframework.messaging.converter.MessageConversionException: No schema can be inferred from type org.apache.avro.generic.GenericRecord and no schema has been explicitly configured.
Is there a way to accomplish what I am trying to do?

Related

spring-cloud-schema-registry-client does not convert to Avro and I get org.springframework.http.converter.HttpMessageNotWritableException

My goal is to have Spring automatically marshal requests and responses between POJOs and Avro format, and handle the Avro schema files automatically. I have tried, to no avail, for several days to get spring-cloud-schema-registry-client to automatically serialize/deserialize application/avro messages sent back by the controller as a response. The request comes in with the Accept header "application/avro" (I also tried application/emp.v1+avro and other combinations).
No matter what I try, the response message is not automatically converted by Spring Cloud's AvroSchemaMessageConverter. As per their documentation, the Avro schema should be automatically inferred from the POJO, and the response should be sent back in Avro out of the box.
The dependency I'm using is
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-schema-registry-client</artifactId>
    <version>1.1.5</version>
</dependency>
My application.yml file has:
spring:
  cloud:
    stream:
      bindings:
        output:
          contentType: application/*+avro
    schema:
      avro:
        dynamicSchemaGenerationEnabled: true
Can someone please help me understand why this doesn't work and why I keep getting the error message:
org.springframework.http.converter.HttpMessageNotWritableException: No converter for [class java.util.Arrays$ArrayList]
Do I need to register the schema explicitly?
For example, I'm returning a list of Employee POJOs.
If needed I can try to generate the Avro schema from the Employee POJO using jackson-dataformats and then manually register it with the AvroConverter bean, but as per my understanding this should be done out of the box by spring-cloud-schema-registry.
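If I do go that route, a minimal sketch of the schema generation with Jackson's Avro module would presumably look something like this (assuming jackson-dataformat-avro is on the classpath; Employee is my existing POJO):

import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.AvroSchema;
import com.fasterxml.jackson.dataformat.avro.schema.AvroSchemaGenerator;

public class EmployeeSchemaGenerator {
    public static void main(String[] args) throws Exception {
        AvroMapper mapper = new AvroMapper();
        AvroSchemaGenerator generator = new AvroSchemaGenerator();

        // Introspect the Employee POJO and build an Avro schema from its fields
        mapper.acceptJsonFormatVisitor(Employee.class, generator);
        AvroSchema schema = generator.getGeneratedSchema();

        // Print the schema JSON so it could be registered manually if needed
        System.out.println(schema.getAvroSchema().toString(true));
    }
}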
Additional Information:
My controller returns a list of Employee POJOs like
ResponseEntity<List<Employee>>
I used @EnableSchemaRegistryClient in my main class along with @SpringBootApplication.
In my test method I'm using:
MvcResult result = mockMvc
.perform(MockMvcRequestBuilders.get(url)
//.contentType(MediaType.APPLICATION_JSON_VALUE)
.accept("application/avro"))
.andExpect(content().contentType("application/avro"))
I'm no longer on the Spring team, and I believe the schema registry is no longer maintained. But it was never intended to work with a controller to serialize/deserialize payloads; it was meant to work with Spring Cloud Stream's own serialization mechanism (version 1.x).
I'm afraid it would not work in this case without some tinkering on your end to build a serializer that looks up the registry for a given schema and serializes it back and forth. Note you would also need a header carrying the FQN of the schema in order to query the registry, and caching the schema would be a must so as not to impact your app.
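To make the shape of that concrete, here is a rough sketch of the serializing side of such a converter. It is purely illustrative: fetchSchemaFromRegistry() is a hypothetical stub standing in for the actual registry lookup, and the schema FQN is assumed to arrive via the header mentioned above.

import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class RegistryBackedAvroSerializer {

    // Cache schemas by FQN so the registry is not queried for every message
    private final Map<String, Schema> schemaCache = new ConcurrentHashMap<>();

    public byte[] serialize(GenericRecord record, String schemaFqn) throws Exception {
        Schema schema = schemaCache.computeIfAbsent(schemaFqn, this::fetchSchemaFromRegistry);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Hypothetical stub: a real implementation would GET the schema text from the
    // registry's REST endpoint by FQN and parse it with new Schema.Parser().parse(...)
    private Schema fetchSchemaFromRegistry(String schemaFqn) {
        throw new UnsupportedOperationException("registry lookup not implemented in this sketch");
    }
}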
Hopefully someone from the Spring team could provide some better insights on this.

How does Kafka Schema registration happen in Spring Cloud Stream?

I am trying to understand how to use Spring Cloud Stream with the Kafka binder.
Currently, I am trying to register an Avro schema with my Confluent Schema Registry and send messages to a topic.
I am unable to understand how the schema registration is done by Spring Cloud Stream behind the scenes.
Let's take this example from the Spring Cloud Stream samples.
The Avro schema is located in src/resources/avro.
When the mvn compile goal is run, the POJO for the Avro schema is generated and the producer can post data.
But what I am not able to understand is how Spring Cloud Stream does the Avro schema registration.
@Autowired
StreamBridge streamBridge;

@Bean
public Supplier<Sensor> supplier() {
    return () -> {
        Sensor sensor = new Sensor();
        sensor.setId(UUID.randomUUID().toString() + "-v1");
        sensor.setAcceleration(random.nextFloat() * 10);
        sensor.setVelocity(random.nextFloat() * 100);
        sensor.setTemperature(random.nextFloat() * 50);
        return sensor;
    };
}

@Bean
public Consumer<Sensor> receiveAndForward() {
    return s -> streamBridge.send("sensor-out-0", s);
}

@Bean
Consumer<Sensor> receive() {
    return s -> System.out.println("Received Sensor: " + s);
}
Is it done when the beans are created?
Or is it done when the first message is sent? If so, how does Spring Cloud Stream know where to find the .avsc file?
Basically, what is happening under the hood?
There seems to be no mention of this in the docs.
Thanks.
Your serialization strategy (in this case, Avro) is always handled in the serializers (for producers) and deserializers (for consumers).
You can have Avro-(de)serialized keys and/or Avro-(de)serialized values, which means you pass KafkaAvroSerializer.class / KafkaAvroDeserializer.class to the producer/consumer configs, respectively. On top of this, you must pass the schema.registry.url to the client configs as well.
So behind the scenes, Spring Cloud Stream makes your application Avro-compatible when it creates your producers/consumers (using the configs found in application.properties or elsewhere). Your clients will connect to the schema registry on startup (the logs will tell you if the connection failed), but no schema registration is done out of the box at that point.
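Outside of Spring, this is roughly the client configuration being described: a plain Kafka producer wired with Confluent's Avro serializer. The broker address, registry URL and topic name below are placeholders, and Sensor is the class generated from the sample's schema.

import java.util.Properties;
import java.util.UUID;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import io.confluent.kafka.serializers.KafkaAvroSerializer;

public class PlainAvroProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // Avro handling lives in the value serializer...
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
        // ...which also needs to know where the schema registry lives
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaProducer<String, Sensor> producer = new KafkaProducer<>(props)) {
            Sensor sensor = new Sensor();
            sensor.setId(UUID.randomUUID().toString() + "-v1");
            // The first send for this subject is what triggers schema registration
            producer.send(new ProducerRecord<>("sensors", sensor));
        }
    }
}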
Schema registration is done on the first message that gets sent. If you haven't already noticed, the generated POJOs contain the schemas themselves, so Spring Cloud Stream doesn't need the .avsc files at all. For example, my last generated Avro POJO contained (the SCHEMA$ field a few lines in):
@org.apache.avro.specific.AvroGenerated
public class AvroBalanceMessage extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
    private static final long serialVersionUID = -539731109258473824L;
    public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"AvroBalanceMessage\",\"namespace\":\"tech.nermindedovic\",\"fields\":[{\"name\":\"accountNumber\",\"type\":\"long\",\"default\":0},{\"name\":\"routingNumber\",\"type\":\"long\",\"default\":0},{\"name\":\"balance\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"default\":\"0.00\"},{\"name\":\"errors\",\"type\":\"boolean\",\"default\":false}]}");
    public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; }
    .......
When the producer sends this POJO, it communicates with the registry about the current version of the schema. If the schema is not in the registry, the registry will store it and identify it by ID. The producer then sends the message along with its schema ID to the Kafka broker. On the other side, the consumer gets the message and checks whether it has seen that ID before (IDs are cached so you don't always have to retrieve the schema from the registry); if it hasn't, it asks the registry for the information about the message's schema.
A bit outside the scope of Spring Cloud Stream, but one can also use the Schema Registry's REST API to manually register schemas.
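For instance, a bare-bones sketch of that manual call with Java's built-in HttpClient, assuming a Confluent Schema Registry on localhost:8081, a subject named sensors-value, and a trimmed-down schema purely for illustration:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ManualSchemaRegistration {
    public static void main(String[] args) throws Exception {
        // The schema JSON has to be escaped into a single "schema" field, per the SR REST API
        String body = "{\"schema\":\"{\\\"type\\\":\\\"record\\\",\\\"name\\\":\\\"Sensor\\\","
                + "\\\"fields\\\":[{\\\"name\\\":\\\"id\\\",\\\"type\\\":\\\"string\\\"}]}\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/subjects/sensors-value/versions"))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The response body contains the ID the registry assigned to this schema
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}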

Spring Cloud #StreamListener condition deprecated what is the alternative

We have multiple consumer applications listening to the same Kafka topic, and the producer sets a message header when sending a message to the topic so that a specific instance can evaluate the header and process the message, e.g.
@StreamListener(target = ITestSink.CHANNEL_NAME, condition = "headers['franchiseName'] == 'sydney'")
public void fullfillOrder(@Payload TestObj message) {
    log.info("sydney order request received message is {}", message.getName());
}
In Spring Cloud Stream 3.0.0 @StreamListener is deprecated, and I could not find an equivalent of the condition property in the functional model.
Any suggestion?
Though I was not able to find an equivalent for the functional approach either, I do have a suggestion.
The @StreamListener annotation's condition does not change the fact that the application must still consume the message, read its header, and filter out specific records before passing them to the listener (fullfillOrder()). So it's safe to assume you're consuming every message that hits the topic regardless (via the event receiver Spring Cloud Stream has implemented under the hood), but the listener only gets executed when header == sydney.
If there were a way to configure that event receiver (to discard messages before they hit the listener), I would suggest looking into that. If not, I would resort to filtering out any messages (non-sydney) before doing any processing. If you're familiar with Spring Cloud Stream's functional approach, that would look something like this:
@Bean
public Consumer<Message<TestObj>> fulfillOrder() {
    return msg -> {
        // to get a header: msg.getHeaders().get(key, valueType)
        // filter out bad messages here
    };
}
or
@Bean
public Consumer<ConsumerRecord<?, TestObj>> fulfillOrder() {
    return msg -> {
        // msg.headers().lastHeader("franchiseName").value() -> filter them out
    };
}
Other:
The code above assumes you're integrating the kafka-clients API with Spring Cloud Stream via spring-cloud-stream-binder-kafka. Based on the tags listed, I will note that Spring Cloud Stream has two Kafka binders: one for the Kafka client library and one for the Kafka Streams library.
Setting Spring Cloud and frameworks aside, the high-level DSL in Kafka Streams doesn't give you access to headers, but the low-level Processor API does. From the example, it seems you're using the client binder and not spring-cloud-stream-binder-kafka-streams (the Kafka Streams binder). I haven't seen an implementation of Spring Cloud Stream plus the Kafka Streams binder using the low-level Processor API, so I can't tell if that was the aim.

Spring Cloud - SQS

I'm trying to get a simple queue handler working with the Spring Cloud framework. I've successfully got the message handler polling the queue. However, the problem I'm seeing is that when I post a message to the queue, my handler fails to unmarshal the payload into the required Java object.
#MessageMapping("MyMessageQueue")
#SuppressWarnings("UnusedDeclaration")
public void handleCreateListingMessage(#Headers Map<String, String> headers, MyMessage message) {
//do something with the MyMessage object
}
The error I'm getting is
No converter found to convert to class MyMessage
As I understand it, @MessageMapping should use Jackson to unmarshal my JSON payload into a MyMessage object. However, it's complaining that it cannot find a converter.
Has anyone come across this?
I'm using the 1.0.0.BUILD-SNAPSHOT version of Spring Cloud.
Jackson is only used if a contentType header with the value application/json is set on the SQS message. Otherwise the converters do not know what type of content is contained in the message's payload, and the right converter cannot be chosen.
If you use QueueMessagingTemplate#convertAndSend as in the reference application, the contentType header is set automatically.
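A short sketch of the sending side, assuming spring-cloud-aws-messaging's QueueMessagingTemplate, an injected AmazonSQSAsync client, and the MyMessageQueue queue from the question:

import java.util.Collections;
import java.util.Map;

import org.springframework.cloud.aws.messaging.core.QueueMessagingTemplate;
import org.springframework.messaging.MessageHeaders;
import org.springframework.util.MimeTypeUtils;

import com.amazonaws.services.sqs.AmazonSQSAsync;

public class MyMessageSender {

    private final QueueMessagingTemplate messagingTemplate;

    public MyMessageSender(AmazonSQSAsync amazonSqs) {
        this.messagingTemplate = new QueueMessagingTemplate(amazonSqs);
    }

    public void send(MyMessage message) {
        // convertAndSend serializes the payload and sets the contentType header,
        // so the @MessageMapping handler can pick the Jackson converter
        messagingTemplate.convertAndSend("MyMessageQueue", message);
    }

    public void sendWithExplicitHeader(MyMessage message) {
        // Equivalent, with the contentType header set explicitly
        Map<String, Object> headers =
                Collections.singletonMap(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON);
        messagingTemplate.convertAndSend("MyMessageQueue", message, headers);
    }
}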

Camel Serialize MessageContentsList

I receive a SOAP message to a CXF endpoint, with a Long and String value.
eg. [5, 'test']
The Camel route receiving the messages is already using dataFormat=POJO.
I need to send these parameters on ActiveMQ to another application.
If I use:
<convertBodyTo type="java.lang.String"/>
The logs show the body contains only 5; 'test' is not sent.
I tried converting to a POJO before converting to a String, but I can't find proper documentation on writing TypeConverters (seriously, who can read this and figure out actual code from it?), e.g.
<convertBodyTo type="com.company.InfoPojo"/>
<convertBodyTo type="java.lang.String"/>
If I try to just forward the CXF data to the queue without any converting, I get:
Failed to extract body due to: javax.jms.JMSException: Failed to
build body from content. Serializable class not available to broker.
Reason: java.lang.ClassNotFoundException: Forbidden class
org.apache.cxf.message.MessageContentsList! This class is not allowed
to be serialized. Add package with
'org.apache.activemq.SERIALIZABLE_PACKAGES' system property..
Anyone know what the best option here is?
Thanks
You should marshal the parameters to XML or JSON (or any other format that takes your fancy) before sending them to the queue. The consumer will then need to unmarshal them.
No need to mess around with type converters. Camel's data formats make this really easy: https://github.com/apache/camel/blob/master/components/readme.adoc#data-formats
JSON: https://github.com/apache/camel/blob/master/docs/user-manual/en/json.adoc
JAXB: https://github.com/apache/camel/blob/master/components/camel-jaxb/src/main/docs/jaxb-dataformat.adoc
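As a rough Java DSL illustration of that approach (the endpoint URIs are placeholders; in the XML DSL you are already using, the equivalent is a <marshal> step before the queue endpoint):

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.model.dataformat.JsonLibrary;

public class CxfToQueueRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("cxf:bean:myCxfEndpoint")
            // Marshal the CXF MessageContentsList to JSON so the broker never
            // has to deserialize a forbidden Java class
            .marshal().json(JsonLibrary.Jackson)
            .to("activemq:queue:myQueue");
    }
}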
Made a bean and sent the exchange to it:
public void process(Exchange exchange) throws Exception {
    log.info("Converting CXF values for queue.");
    Object[] args = exchange.getIn().getBody(Object[].class);
    String patientKey = String.valueOf((Long) args[0]);
    String destinationUrl = (String) args[1];
    exchange.getOut().setBody(patientKey + "|" + destinationUrl);
}
