I'm working with Spring Cloud Stream and wanted to fiddle with KStreams/KTables a little.
I'm looking for the methodology for turning a standard Kafka topic into a stream.
I've done this in KSQL, but I'm trying to figure out if there is a way to have Spring Boot handle it. The best I can find are examples where both the @Input and @Output channels are already KStreams, but I think that is not what I want.
Kafka Setup
Inside Spring Boot I'm doing the following:
My data arrives on: force-entities-topic
I then "clean" the data by removing the [UTC] tag from the time field and re-publish it on:
force-entities-topic-clean
From there I was hoping to take that output and build both a KStream and a KTable keyed on the platformUID field.
Input data
So the data I'm working with is:
{
    "platformUID": "UID",
    "type": "TLPS",
    "state": "PLATFORM_INITIALIZED",
    "fuelremaining": 5.9722E+24,
    "latitude": 39,
    "longitude": -115,
    "altitude": 0,
    "time": "2018-07-18T00:00:00Z[UTC]"
}
KSQL
I can run these KSQL commands to create what I need. (Here I'm reading time in as a string, as opposed to an actual timestamp, which is what I do in the Java/Kotlin implementation.)
CREATE STREAM force_no_key (
platformUID string,
type string,
state string,
fuelremaining DOUBLE,
latitude DOUBLE,
longitude DOUBLE,
altitude DOUBLE
) with (
kafka_topic='force-entities-topic',
value_format='json');
From there I make another stream (because I couldn't get it to read the key correctly)
CREATE STREAM force_with_key
WITH (KAFKA_TOPIC='blue_force_with_key') AS
select PLATFORMUID as UID, LATITUDE as lat, LONGITUDE as LON, ALTITUDE as ALT, state, type
FROM force_no_key
PARTITION BY UID;
And from this point
CREATE TABLE FORCE_TABLE
( UID VARCHAR,
LAT DOUBLE,
LON DOUBLE,
ALT DOUBLE
) WITH (KAFKA_TOPIC = 'force_with_key',
VALUE_FORMAT='JSON',
KEY = 'UID');
Java Style!
Where I think I'm running into trouble is here. I define my binding interface:
interface ForceStreams {

    companion object {
        // Binding names; their destinations are configured in application.yml below
        const val DIRTY_INPUT = "dirty-force-in"
        const val CLEANED_OUTPUT = "clean-force-out"
        const val CLEANED_INPUT = "clean-force-in"
        const val STREAM_OUT = "stream-out"
        const val TABLE_OUT = "table-out" // referenced below; assumed missing from the original snippet
    }

    @Input(DIRTY_INPUT)
    fun initialInput(): MessageChannel

    @Output(CLEANED_OUTPUT)
    fun cleanOutput(): SubscribableChannel

    @Input(CLEANED_INPUT)
    fun cleanInput(): MessageChannel

    @Output(STREAM_OUT)
    fun cleanedBlueForceMessage(): KStream<String, ForceEntity>

    @Output(TABLE_OUT)
    fun tableOutput(): KTable<String, ForceEntity>
}
And then I do the cleaning with this block:
@StreamListener(ForceStreams.DIRTY_INPUT)
@SendTo(ForceStreams.CLEANED_OUTPUT)
fun forceTimeCleaner(@Payload message: String): ForceEntity {
    // Parse the raw JSON, normalise the type and time fields, then map to ForceEntity
    val inputMap: Map<String, Any> = objectMapper.readValue(message)
    val map = inputMap.toMutableMap()

    map["type"] = map["type"].toString().replace("-", "_")
    map["time"] = map["time"].toString().replace("[UTC]", "")

    val json = objectMapper.writeValueAsString(map)
    val fe: ForceEntity = objectMapper.readValue(json, ForceEntity::class.java)
    return fe
}
But here I'm going from a MessageChannel to a SubscribableChannel.
What I'm unsure how to do is go from a SubscribableChannel to either a KStream<String, ForceEntity> or a KTable<String, ForceEntity>.
Any help would be appreciated - thanks
Edit - application.yml
server:
  port: 8888
spring:
  application:
    name: Blue-Force-Table
  kafka:
    bootstrap-servers: # This seems to be for the KStreams the other config is for normal streams
      - localhost:19092
  cloud:
    stream:
      defaultBinder: kafka
      kafka:
        binder:
          brokers: localhost:19092
      bindings:
        dirty-force-in:
          destination: force-entities-topic
          contentType: application/json
        clean-force-in:
          destination: force-entities-topic-clean
          contentType: application/json
        clean-force-out:
          destination: force-entities-topic-clean
          contentType: application/json
        stream-out:
          destination: force_stream
          contentType: application/json
        table-out:
          destination: force_table
          contentType: application/json
I guess the follow-on question is: is this even possible? Can you mix and match binders within a single function?
In the first StreamListener you receive data through the DIRTY_INPUT binding and write it out through the CLEANED_OUTPUT binding. Then you need another StreamListener where you receive that data as a KStream, do the processing, and write the output.
First processor:
@StreamListener(ForceStreams.DIRTY_INPUT)
@SendTo(ForceStreams.CLEANED_OUTPUT)
fun forceTimeCleaner(@Payload message: String): ForceEntity {
    ....
Change the following to a KStream binding.
@Input(CLEANED_INPUT)
fun cleanInput(): MessageChannel

to

@Input(CLEANED_INPUT)
fun cleanInput(): KStream<String, ForceEntity>
Second processor:
@StreamListener(CLEANED_INPUT)
@SendTo(STREAM_OUT)
public KStream<String, ForceEntity> process(
        KStream<String, ForceEntity> forceEntityStream) {
    return forceEntityStream
        ........
        .toStream();
}
Currently, the Kafka Streams binder in Spring Cloud Stream does not support writing the data out as a KTable. Only KStream objects are allowed on the output (KTable bindings are allowed on the input). If that is a hard requirement, you need to look into Spring Kafka, where you can go lower level and perform that kind of outbound operation.
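For illustration, here is a minimal sketch of what that second processor could look like in Kotlin, matching the poster's bindings. The re-keying and reduce step are assumptions about the intended logic (keep the latest entity per platformUID), and serdes for ForceEntity are assumed to be configured elsewhere:

@StreamListener(ForceStreams.CLEANED_INPUT)
@SendTo(ForceStreams.STREAM_OUT)
fun process(forceEntityStream: KStream<String, ForceEntity>): KStream<String, ForceEntity> {
    return forceEntityStream
        // re-key every record by the platformUID field
        .selectKey { _, entity -> entity.platformUID }
        .groupByKey()
        // keep the latest entity per key; the resulting KTable stays internal
        .reduce { _, latest -> latest }
        // only a KStream can leave through the binding, so convert back
        .toStream()
}

The table itself lives inside the topology (backed by a state store); only its changelog is emitted through the stream-out binding.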
Hope that helps.
I am creating a plugin to host a UDF; it is an http_get function, receiving the httpAddress path, the query parameters and the request headers. It is implemented in Scala:
@ScalarFunction(value = "http_get", deterministic = true)
@Description("Returns the result of an Http Get request")
@SqlType(value = StandardTypes.VARCHAR)
def httpGetFromArrayMap(
    @SqlType(StandardTypes.VARCHAR) httpAddress: Slice,
    @SqlType(constants.STRING_MAP) parameters: ImmutableMap[Slice, Slice],
    @SqlNullable @SqlType(constants.STRING_MAP) headers: ImmutableMap[Slice, Slice],
): String = {
  val stringHeaders = castSliceMap(headers)
  val stringParams = castSliceMap(parameters)
  val request = Http(httpAddress.toStringUtf8).headers(stringHeaders).params(stringParams)
  val stringResponse = request.asString.body
  stringResponse
}
When running on Trino, it raises the following exception:
io.trino.spi.TrinoException: Exact implementation of http_get do not match expected java types.
The problem is: what is the corresponding Java type for map(varchar,varchar)?
I've tried many:
Scala Map[String, String]
Java Map<String, String>
Java Map<Slice, Slice>
ImmutableMap<String, String>
ImmutableMap<Slice, Slice>
I can't find any example of a plugin implementing a function which receives a map.
Any help is appreciated. Thanks
Disclaimer: I'm not familiar at all with Trino UDFs.
According to the examples you can find in Trino repository, the type to use for a SqlType("map(X,y)") is io.trino.spi.block.Block.
The Block can then be manipulated to extract its content as if it were a regular Map.
See for instance: https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/operator/scalar/MathFunctions.java#L1340
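Not having tried this against a real Trino build, here is a rough sketch in Scala of how the signature could look once the map parameter is declared as a Block. It assumes an SPI version where a map(varchar,varchar) argument arrives as a single block with keys and values at alternating positions (the pattern used in the MathFunctions example linked above); newer Trino releases expose a dedicated SqlMap type instead, so treat the iteration as illustrative and the object/function names as hypothetical:

import io.airlift.slice.{Slice, Slices}
import io.trino.spi.block.Block
import io.trino.spi.function.{ScalarFunction, SqlType}
import io.trino.spi.`type`.StandardTypes
import io.trino.spi.`type`.VarcharType.VARCHAR

object HttpGetFunctionSketch {

  @ScalarFunction("http_get_draft")
  @SqlType(StandardTypes.VARCHAR)
  def httpGetDraft(
      @SqlType(StandardTypes.VARCHAR) httpAddress: Slice,
      @SqlType("map(varchar,varchar)") parameters: Block): Slice = {

    // Keys sit at even positions and values at odd positions inside the map block
    val params = (0 until parameters.getPositionCount by 2).map { i =>
      VARCHAR.getSlice(parameters, i).toStringUtf8 ->
        VARCHAR.getSlice(parameters, i + 1).toStringUtf8
    }.toMap

    // ... issue the HTTP request with httpAddress.toStringUtf8 and params here ...
    Slices.utf8Slice(params.toString)
  }
}

Whatever the exact accessors are in your Trino version, the key point from the answer stands: declare the parameter as Block (or the map type your SPI version provides), not as a Java or Scala Map.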
I have three components: REST, Cassandra and Kafka, and I am using Apache Camel. When a request is received, I want to add a record to Cassandra and after that add the same record to Kafka, finally generating the REST response. Maybe pipeline is not a good solution for me, because the Cassandra part is InOnly and doesn't produce any out exchange!
I wrote this route:
rest().path("/addData")
.consumes("text/plain")
.produces("application/json")
.post()
.to("direct:requestData");
from("direct:requestData")
.pipeline("direct:init",
"cql://localhost/myTable?cql=" + CQL,
"direct:addToKafka"
)
.process(exchange -> {
var currentBody = (List<?>) exchange.getIn().getBody();
var body = new Data((String) currentBody.get(0), (Long) currentBody.get(1), (String) currentBody.get(2));
exchange.getIn.setBody(body.toJsonString());
});
from("direct:init")
.process(exchange -> {
var currentBody = exchange.getIn().getBody();
var body = Arrays.asList(generateId(), generateTimeStamp, currentBody);
exchange.getIn().setBody(body);
});
from("direct:addToKafka")
.process(
// do sth to add kafka
);
I tried things such as setting patternExtention to InOut for Cassandra, and finally understood that this is impossible, because patternExtention is used for consumers and I am using Cassandra in the route as a producer.
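For the elided direct:addToKafka leg, a producer endpoint from the camel-kafka component is the usual way to publish the record; a small sketch with placeholder topic and broker values (not the poster's actual configuration):

from("direct:addToKafka")
        // topic name and broker address below are assumptions for illustration;
        // the list body built in direct:init is converted to a String record value
        .convertBodyTo(String.class)
        .to("kafka:force-topic?brokers=localhost:9092");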
I want to read from multiple topics, so I declared them in the YAML file, comma separated, but I'm getting the below error:
java.lang.IllegalStateException: Topic(s) [ topic-1 , topic-2, topic-3, topic-4, topic-5, topic-6, topic-7] is/are not present and missingTopicsFatal is true
Spring:
  kafka:
    topics:
      tp: topic-1 , topic-2, topic-3, topic-4, topic-5, topic-6, topic-7
@KafkaListener(topics = "#{'${spring.kafka.topics.tp}'.split(',')}",
        concurrency = "190",
        clientIdPrefix = "client1",
        groupId = "group1")
public void listenData(final ConsumerRecord<Object, Object> inputEvent) throws Exception {
    handleMessage(inputEvent);
}
If I declare all the topics inside the KafkaListener annotation, it works fine.
Remove the spaces
tp: topic-1,topic-2,topic-3,topic-4,topic-5,topic-6,topic-7
Or use
.split(' *, *')
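Applied to the listener from the question, the second option only changes the SpEL split pattern so that the spaces around the commas are swallowed:

@KafkaListener(topics = "#{'${spring.kafka.topics.tp}'.split(' *, *')}",
        concurrency = "190",
        clientIdPrefix = "client1",
        groupId = "group1")
public void listenData(final ConsumerRecord<Object, Object> inputEvent) throws Exception {
    handleMessage(inputEvent);
}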
I have a request that looks like the following:
package pricing
import scala.beans.BeanProperty
class Request(@BeanProperty var name: String, @BeanProperty var surname: String) {
def this() = this(name="defName", surname="defSurname")
}
The handler is as follows:
package pricing
import com.amazonaws.services.lambda.runtime.{Context, RequestHandler}
import scala.collection.JavaConverters
import spray.json._
class ApiGatewayHandler extends RequestHandler[Request, ApiGatewayResponse] {
  import DefaultJsonProtocol._

  def handleRequest(input: Request, context: Context): ApiGatewayResponse = {
    val headers = Map("x-foo" -> "coucou")
    val msg = "Hello " + input.name
    val message = Map[String, String]("message" -> msg)
    ApiGatewayResponse(
      200,
      message.toJson.toString(),
      JavaConverters.mapAsJavaMap[String, Object](headers),
      true
    )
  }
}
which has been documented as:
functions:
  pricing:
    handler: pricing.ApiGatewayHandler
    events:
      - http:
          path: pricing/test
          method: get
          documentation:
            summary: "submit your name and surname, the API says hi"
            description: ".. well, the summary is pretty exhaustive"
            requestBody:
              description: "Send over name and surname"
            queryParams:
              - name: "name"
                description: "your 1st name"
              - name: "surname"
                description: ".. guess .. "
            methodResponses:
              - statusCode: "200"
                responseHeaders:
                  - name: "x-foo"
                    description: "you can foo in here"
                responseBody:
                  description: "You'll see a funny message here"
                responseModels:
                  "application/json": "HelloWorldResponse"
Well, this is a copy and paste from one of the tutorials, and it is not working.
I guess that the BeanProperty refers to body object properties; this is what I can gather from the example here.
What if I would like to have query strings?
One attempt was:
package pricing

import scala.beans.BeanProperty
import spray.json._

abstract class ApiGatewayGetRequest(
    @BeanProperty httpMethod: String,
    @BeanProperty headers: Map[String, String],
    @BeanProperty queryStringParameters: Map[String, String])

abstract class ApiGatewayPostRequest(
    @BeanProperty httpMethod: String,
    @BeanProperty headers: Map[String, String],
    @BeanProperty queryStringParameters: Map[String, String])

class HelloWorldRequest(
    @BeanProperty httpMethod: String,
    @BeanProperty headers: Map[String, String],
    @BeanProperty queryStringParameters: Map[String, String]
) extends ApiGatewayGetRequest(httpMethod, headers, queryStringParameters) {

  private def getParam(param: String): String =
    queryStringParameters get param match {
      case Some(s) => s
      case None => "default_" + param
    }

  def name: String = getParam("name")
  def surname: String = getParam("surname")

  def this() = this("GET", Map.empty, Map.empty)
}
Which results in:
{
"message":"Hello default_name"
}
suggesting that the class has been initialized with an empty map in place of the queryStringParameters, which were however submitted correctly:
Mon Sep 25 20:45:22 UTC 2017 : Endpoint request body after
transformations:
{"resource":"/pricing/test","path":"/pricing/test","httpMethod":"GET","headers":null,"queryStringParameters":{"name":"ciao", "surname":"bonjour"},"pathParameters":null,"stageVariables":null,
...
Note:
I am following this path because I feel it would be convenient and expressive to replace the Map in @BeanProperty queryStringParameters: Map[String, String] with a type T, for example
case class Person(@BeanProperty val name: String, @BeanProperty val surname: String)
However, the code above treats {"name":"ciao", "surname":"bonjour"} as a String, without figuring out that it should deserialize that String.
EDIT
I have also tried to replace the Scala Map with a java.util.Map[String, String], without success.
By default, Serverless enables proxy integration between the lambda and API Gateway. What this means for you is that API Gateway is going to pass an object containing all the metadata about the request into your handler, as you have noticed:
Mon Sep 25 20:45:22 UTC 2017 : Endpoint request body after transformations: {"resource":"/pricing/test","path":"/pricing/test","httpMethod":"GET","headers":null,"queryStringParameters":{"name":"ciao", "surname":"bonjour"},"pathParameters":null,"stageVariables":null, ...
This clearly doesn't map to your model which has just the fields name and surname in it. There are several ways you could go about solving this.
1. Adapt your model
Your attempt with the HelloWorldRequest class does actually work if you make your class a proper POJO by making the fields mutable (and thus creating the setters for them):
class HelloWorldRequest(
    @BeanProperty var httpMethod: String,
    @BeanProperty var headers: java.util.Map[String, String],
    @BeanProperty var queryStringParameters: java.util.Map[String, String]
) extends ApiGatewayGetRequest(httpMethod, headers, queryStringParameters) {
AWS Lambda documentation states:
The get and set methods are required in order for the POJOs to work with AWS Lambda's built in JSON serializer.
Also keep in mind that Scala's Map is not supported.
2. Use a custom request template
If you don't need the metadata, then instead of changing your model you can make API Gateway pass only the data you need into the lambda using mapping templates.
In order to do this, you need to tell Serverless to use plain lambda integration (instead of proxy) and specify a custom request template.
Amazon API Gateway documentation has an example request template which is almost perfect for your problem. Tailoring it a little bit, we get
functions:
  pricing:
    handler: pricing.ApiGatewayHandler
    events:
      - http:
          path: pricing/test
          method: get
          integration: lambda
          request:
            template:
              application/json: |
                #set($params = $input.params().querystring)
                {
                #foreach($paramName in $params.keySet())
                "$paramName" : "$util.escapeJavaScript($params.get($paramName))"
                #if($foreach.hasNext),#end
                #end
                }
This template will make a JSON out of the query string parameters, and it will now be the input of the lambda:
Endpoint request body after transformations: { "name" : "ciao" }
Which maps properly to your model.
Note that disabling proxy integration also changes the response format. You will notice that now your API returns your response model directly:
{"statusCode":200,"body":"{\"message\":\"Hello ciao\"}","headers":{"x-foo":"coucou"},"base64Encoded":true}
You can fix this by either modifying your code to return only the body, or by adding a custom response template:
response:
  template: $input.path('$.body')
This will transform the output into what you expect, but will blatantly ignore the statusCode and headers. You would need to make a more complex response configuration to handle those.
3. Do the mapping yourself
Instead of extending RequestHandler and letting AWS Lambda map the JSON to a POJO, you can instead extend RequestStreamHandler, which will provide you an InputStream and an OutputStream, so you can do the (de)serialization with the JSON serializer of your choice.
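A minimal sketch of that third option in Scala, assuming spray-json (already used in the question) and the proxy event shape shown above; the class name and field handling are illustrative rather than a drop-in implementation:

import java.io.{InputStream, OutputStream}
import com.amazonaws.services.lambda.runtime.{Context, RequestStreamHandler}
import spray.json._
import DefaultJsonProtocol._
import scala.io.Source

class StreamingPricingHandler extends RequestStreamHandler {
  override def handleRequest(input: InputStream, output: OutputStream, context: Context): Unit = {
    // Parse the raw proxy-integration event ourselves instead of relying on POJO mapping
    val event = Source.fromInputStream(input, "UTF-8").mkString.parseJson.asJsObject
    val params = event.fields.get("queryStringParameters") match {
      case Some(obj: JsObject) => obj.convertTo[Map[String, String]]
      case _ => Map.empty[String, String] // queryStringParameters may be null
    }

    val body = JsObject("message" -> JsString("Hello " + params.getOrElse("name", "default_name")))
    val response = JsObject(
      "statusCode" -> JsNumber(200),
      "headers" -> JsObject("x-foo" -> JsString("coucou")),
      "body" -> JsString(body.compactPrint)
    )
    output.write(response.compactPrint.getBytes("UTF-8"))
    output.close()
  }
}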
I'm playing around with Kotlin and Spark, creating a RESTful web service. However, I'm struggling to parse a JSON POST request. I have the following endpoint...
post("") { req, res ->
var objectMapper = ObjectMapper()
println(req.body())
val data = objectMapper.readValue(req.body(), User::class.java)
usersDao.save(data.name, data.email, data.age)
res.status(201)
"okies"
}
However, I'm getting a 500 error; it's not actually printing an error, just returning a 500.
It seems to be this line: val data = objectMapper.readValue(req.body(), User::class.java). I'm attempting to convert the JSON body into a User object. Here's my User class...
data class User(val name: String, val email: String, val age: Int, val id: Int)
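One likely culprit (an assumption, since no stack trace is shown) is that a plain ObjectMapper cannot instantiate a Kotlin data class that has no no-argument constructor. A sketch of the same endpoint with jackson-module-kotlin on the classpath, which lets Jackson call the primary constructor instead:

import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import com.fasterxml.jackson.module.kotlin.readValue

post("") { req, res ->
    // jacksonObjectMapper() registers the KotlinModule; note that every
    // non-nullable constructor parameter (including id) must then be present
    // in the request body, or deserialization will still fail
    val objectMapper = jacksonObjectMapper()
    val data: User = objectMapper.readValue(req.body())
    usersDao.save(data.name, data.email, data.age)
    res.status(201)
    "okies"
}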