How do I convert this spring-integration configuration from XML to Java?

This particular piece makes sense to implement in the application rather than XML because it is a constant across the entire cluster, not localized to a single job.
From dissecting the XSD, it looks to me like the XML for int-kafka:outbound-channel-adapter constructs a KafkaProducerMessageHandler.
There is no visible way to set the channel, the topic, or most of the other attributes.
Note to potential downvoters - (rant on) I have been RTFM'ing for a week and am more confused than when I started. My choice of language has graduated from adjectives through adverbs, and I'm starting to borrow words from other languages. The answer may be in there. But if it is, it is not locatable by mere mortals. (rant off)
XML configuration:
<int-kafka:outbound-channel-adapter id="kafkaOutboundChannelAdapter"
                                    kafka-template="kafkaTemplate"
                                    auto-startup="false"
                                    channel="outbound-staging"
                                    topic="foo"
                                    sync="false"
                                    message-key-expression="'bar'"
                                    send-failure-channel="failures"
                                    send-success-channel="successes"
                                    partition-id-expression="2">
</int-kafka:outbound-channel-adapter>
If so, then I would expect the java config to look something like this:
@Bean
public KafkaProducerMessageHandler kafkaOutboundChannelAdapter() {
    KafkaProducerMessageHandler result = new KafkaProducerMessageHandler(kafkaTemplate());
    result.set?????(); // WTH?? No methods for most of the attributes?!!!
    return result;
}
EDIT: Additional information about the high level problem being solved
As a part of a larger project, I am trying to implement the textbook example from https://docs.spring.io/spring-batch/4.0.x/reference/html/spring-batch-integration.html#remote-partitioning , with Kafka backing instead of JMS backing.
I believe the final integration flow should be something like this:
partitionHandler -> messagingTemplate -> outbound-requests (DirectChannel) -> outbound-staging (KafkaProducerMessageHandler) -> kafka
kafka -> executionContainer (KafkaMessageListenerContainer) -> inboundKafkaRequests (KafkaMessageDrivenChannelAdapter) -> inbound-requests (DirectChannel) -> serviceActivator (StepExecutionRequestHandler)
serviceActivator (StepExecutionRequestHandler) -> reply-staging (KafkaProducerMessageHandler) -> kafka
kafka -> replyContainer (KafkaMessageListenerContainer) -> inboundKafkaReplies (KafkaMessageDrivenChannelAdapter) -> inbound-replies (DirectChannel) -> partitionHandler

Not sure what you mean by them being missing, but this is what I see in the source code of that KafkaProducerMessageHandler:
public void setTopicExpression(Expression topicExpression) {
    this.topicExpression = topicExpression;
}

public void setMessageKeyExpression(Expression messageKeyExpression) {
    this.messageKeyExpression = messageKeyExpression;
}

public void setPartitionIdExpression(Expression partitionIdExpression) {
    this.partitionIdExpression = partitionIdExpression;
}

/**
 * Specify a SpEL expression to evaluate a timestamp that will be added in the Kafka record.
 * The resulting value should be a {@link Long} type representing epoch time in milliseconds.
 * @param timestampExpression the {@link Expression} for timestamp to wait for result
 * of send operation.
 * @since 2.3
 */
public void setTimestampExpression(Expression timestampExpression) {
    this.timestampExpression = timestampExpression;
}
and so on.
You also have access to the superclass setters, for example setSync() for the sync attribute of your XML variant.
The input-channel is not a MessageHandler responsibility. It belongs to the endpoint and can be configured via @ServiceActivator alongside that @Bean.
See more info in the Core Spring Integration Reference Manual: https://docs.spring.io/spring-integration/reference/html/#annotations_on_beans
Also, there is a very important chapter at the beginning: https://docs.spring.io/spring-integration/reference/html/#programming-tips
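Putting that together, a Java configuration equivalent of your XML could look roughly like this. This is only a sketch, not compiled here: the channel names, the kafkaTemplate() bean and the expression values come from your XML, while the generic types and the exact send-success/failure setter names should be double-checked against your spring-integration-kafka version.
// assumed imports: org.springframework.expression.common.LiteralExpression,
//                  org.springframework.integration.expression.ValueExpression
@Bean
@ServiceActivator(inputChannel = "outbound-staging", autoStartup = "false")
public KafkaProducerMessageHandler<String, String> kafkaOutboundChannelAdapter() {
    KafkaProducerMessageHandler<String, String> handler =
            new KafkaProducerMessageHandler<>(kafkaTemplate());
    handler.setTopicExpression(new LiteralExpression("foo"));      // topic="foo"
    handler.setMessageKeyExpression(new LiteralExpression("bar")); // message-key-expression="'bar'"
    handler.setPartitionIdExpression(new ValueExpression<>(2));    // partition-id-expression="2"
    handler.setSync(false);                                        // sync="false"
    handler.setSendFailureChannelName("failures");                 // send-failure-channel="failures"
    handler.setSendSuccessChannelName("successes");                // send-success-channel="successes"
    return handler;
}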
In addition, it might be better to consider using the Java DSL instead of direct MessageHandler usage:
Kafka
    .outboundChannelAdapter(producerFactory)
    .sync(true)
    .messageKey(m -> m
            .getHeaders()
            .get(IntegrationMessageHeaderAccessor.SEQUENCE_NUMBER))
    .headerMapper(mapper())
    .partitionId(m -> 0)
    .topicExpression("headers[kafka_topic] ?: '" + topic + "'")
    .configureKafkaTemplate(t -> t.id("kafkaTemplate:" + topic))
    .get();
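As a rough illustration (assuming a ProducerFactory bean plus the "outbound-staging" channel and "foo" topic from your XML, and that you simply want the literal topic rather than the header expression above), such a spec would typically be registered as part of an IntegrationFlow bean:
@Bean
public IntegrationFlow kafkaOutboundFlow(ProducerFactory<String, String> producerFactory) {
    return IntegrationFlows.from("outbound-staging")
            .handle(Kafka.outboundChannelAdapter(producerFactory)
                    .topic("foo")
                    .sync(false))
            .get();
}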
See more info about Java DSL in the mentioned Spring Integration Docs: https://docs.spring.io/spring-integration/reference/html/#java-dsl

Add partitions for Kafka topic dynamically using Spring Boot?

I was able to inspect particular topic for its partitions:
public void addPartitionIfNotExists(int partitionId) {
    Map<String, TopicDescription> games = kafkaAdmin.describeTopics("games");
    TopicDescription gamesTopicDescription = games.get("games");
    List<TopicPartitionInfo> partitionsInfo = gamesTopicDescription.partitions();
    boolean partitionIdExists = partitionsInfo.stream()
            .anyMatch(partitionInfo -> partitionInfo.partition() == partitionId);
    if (!partitionIdExists) {
        // missing part
    }
}
But I haven't been able to add a new partition to an already existing topic at runtime. I don't know if that is even possible.
See KafkaAdminOperations Javadocs for more info:
/**
 * Create topics if they don't exist or increase the number of partitions if needed.
 * @param topics the topics.
 */
void createOrModifyTopics(NewTopic... topics);
Not sure about your logic around partitionIdExists, though, since a partition in a Kafka topic is just an index number. So, if there is a partition 3, it doesn't mean that there are no partitions 1 or 2. Therefore the NewTopic API is as simple as numPartitions. Nothing more.
Technically, what you are asking is just covered by that createOrModifyTopics() and that's it: you don't need to check for topics yourself.
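For illustration only, a minimal sketch (assuming an autowired KafkaAdmin and Spring Kafka's TopicBuilder; the "games" topic and partitionId come from your snippet) of letting createOrModifyTopics() do the work:
// creates the topic if it is missing, or raises its partition count if needed
kafkaAdmin.createOrModifyTopics(
        TopicBuilder.name("games")
                .partitions(partitionId + 1) // partition ids are 0-based, so id N requires N+1 partitions
                .build());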

Using KeycloakOIDCFilter with Spark UI - cannot configure

We are attempting to use KeycloakOIDCFilter as our Apache Spark UI filter. However, we are struggling to configure the KeycloakOIDCFilter itself.
We have, in spark-defaults.conf:
spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter
This is picked up successfully, and the Spark master logs show this filter being applied to all URL routes.
We have generated a client config file in the Keycloak Admin Console, which has spit out a keycloak-oidc.json.
But how do we tell KeycloakOIDCFilter about this information?
From the Spark docs
Filter parameters can also be specified in the configuration,
by setting config entries of the form spark.<class name of filter>.param.<param name>=<value>
For example:
spark.ui.filters=com.test.filter1
spark.com.test.filter1.param.name1=foo
spark.com.test.filter1.param.name2=bar
In our case that would seem to be:
spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.<name>=<value>
However, the KeycloakOIDCFilter Java class has only two constructors. One takes no parameters at all and one takes a KeycloakConfigResolver.
The Keycloak Java servlet filter adapter docs only talk about web.xml which isn't applicable in the case of configuring Spark.
So how can we properly configure/point to parameters for the KeycloakOIDCFilter servlet filter?
Update: We've determined that spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file can be used to point to a config file, but it appears that Spark does not use SessionManager, leading to a separate error that may or may not be resolvable.
I haven't tested the solution but, according to the Keycloak and Spark documentation you cited, and the source code of KeycloakOIDCFilter, assuming you are using a file in your filesystem, the following configuration could work:
spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter
spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.file=/path/to/keycloak-oidc.json
Or this other one if your config is accessible as a web app resource, through getServletContext().getResourceAsStream(...), instead of a file:
spark.ui.filters=org.keycloak.adapters.servlet.KeycloakOIDCFilter
spark.org.keycloak.adapters.servlet.KeycloakOIDCFilter.param.keycloak.config.path=/WEB-INF/keycloak-oidc.json
Please note that they indicate that filter parameters can also be specified in the configuration: as far as I know, this doesn't mean that the filter needs any special constructor or anything similar.
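In other words, Spark passes those values to the filter as plain servlet init parameters, which any filter can read in its init() method. A purely hypothetical filter (illustrative only, not the Keycloak implementation) would receive them like this:
import java.io.IOException;
import javax.servlet.*;

public class ExampleFilter implements Filter {

    private String configFile;

    @Override
    public void init(FilterConfig filterConfig) {
        // "keycloak.config.file" is the init-parameter name used in the question
        this.configFile = filterConfig.getInitParameter("keycloak.config.file");
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        chain.doFilter(req, res);
    }

    @Override
    public void destroy() {
    }
}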
This configuration is performed by the addFilters:
/**
 * Add filters, if any, to the given ServletContextHandlers. Always adds a filter at the end
 * of the chain to perform security-related functions.
 */
private def addFilters(handler: ServletContextHandler, securityMgr: SecurityManager): Unit = {
  conf.get(UI_FILTERS).foreach { filter =>
    logInfo(s"Adding filter to ${handler.getContextPath()}: $filter")
    val oldParams = conf.getOption(s"spark.$filter.params").toSeq
      .flatMap(Utils.stringToSeq)
      .flatMap { param =>
        val parts = param.split("=")
        if (parts.length == 2) Some(parts(0) -> parts(1)) else None
      }
      .toMap
    val newParams = conf.getAllWithPrefix(s"spark.$filter.param.").toMap
    JettyUtils.addFilter(handler, filter, oldParams ++ newParams)
  }
}
and addFilter:
def addFilter(
    handler: ServletContextHandler,
    filter: String,
    params: Map[String, String]): Unit = {
  val holder = new FilterHolder()
  holder.setClassName(filter)
  params.foreach { case (k, v) => holder.setInitParameter(k, v) }
  handler.addFilter(holder, "/*", EnumSet.allOf(classOf[DispatcherType]))
}
methods in the JettyUtils class in the source code of Spark UI.

Thread safety for method that returns Mono based on mutable attribute in Java

In my Spring Boot application I have a component that is supposed to monitor the health status of another, external system. This component also offers a public method that reactive chains can subscribe to in order to wait for the external system to be up.
@Component
public class ExternalHealthChecker {

    private static final Logger LOG = LoggerFactory.getLogger(ExternalHealthChecker.class);

    private final WebClient webClient = WebClient.builder().build(); // config omitted

    private volatile boolean isUp = true;
    private volatile CompletableFuture<String> completeWhenUp = new CompletableFuture<>();

    @Scheduled(cron = "0/10 * * ? * *")
    private void checkExternalSystemHealth() {
        webClient.get() //
                .uri("/health") //
                .retrieve() //
                .bodyToMono(Void.class) //
                .doOnError(this::handleHealthCheckError) //
                .doOnSuccess(nothing -> this.handleHealthCheckSuccess()) //
                .subscribe(); //
    }

    private void handleHealthCheckError(final Throwable error) {
        if (this.isUp) {
            LOG.error("External System is now DOWN. Health check failed: {}.", error.getMessage());
        }
        this.isUp = false;
    }

    private void handleHealthCheckSuccess() {
        // the status changed from down -> up, which has to complete the future that might be currently waited on
        if (!this.isUp) {
            LOG.warn("External System is now UP again.");
            this.isUp = true;
            this.completeWhenUp.complete("UP");
            this.completeWhenUp = new CompletableFuture<>();
        }
    }

    public Mono<String> waitForExternalSystemUPStatus() {
        if (this.isUp) {
            LOG.info("External System is already UP!");
            return Mono.empty();
        } else {
            LOG.warn("External System is DOWN. Requesting process can now wait for UP status!");
            return Mono.fromFuture(completeWhenUp);
        }
    }
}
The method waitForExternalSystemUPStatus is public and may be called from many different threads. The idea behind this is to provide some of the reactive flux chains in the application with a way of pausing their processing until the external system is up. These chains cannot process their elements while the external system is down.
someFlux
    .doOnNext(record -> LOG.info("Next element"))
    .delayUntil(record -> externalHealthChecker.waitForExternalSystemUPStatus())
    ... // starting processing
The issue here is that I can't really wrap my head around which part of this code needs to be synchronised. I think there should not be an issue with multiple threads calling waitForExternalSystemUPStatus at the same time, as this method is not writing anything, so I feel like this method does not need to be synchronised. However, the method annotated with @Scheduled will also run on its own thread and will in fact write the value of isUp and also potentially change the reference of completeWhenUp to a new, uncompleted future instance. I have marked these two mutable attributes with volatile because, from reading about this keyword in Java, it feels to me like it would help guarantee that the threads reading these two values see the latest value. However, I am unsure if I also need to add synchronized keywords to parts of the code. I am also unsure if the synchronized keyword plays well with reactor code; I have a hard time finding information on this. Maybe there is also a way of providing the functionality of the ExternalHealthChecker in a more complete, reactive way, but I cannot think of any.
I'd strongly advise against this approach. The problem with threaded code like this is that it becomes immensely difficult to follow and reason about. I think you'd at least need to synchronise the parts of handleHealthCheckSuccess() and waitForExternalSystemUPStatus() that reference your completeWhenUp field, otherwise you could have a race hazard on your hands (only one writes to it, but it might be read out of order after that write). But there could well be something else I'm missing, and if so it may show up as one of those annoying "one in a million" type bugs that's almost impossible to pin down.
There should be a much more reliable & simple way of achieving this though. Instead of using the Spring scheduler, I'd create a flux when your ExternalHealthChecker component is created as follows:
healthCheckStream = Flux.interval(Duration.ofMinutes(10))
        .flatMap(i -> webClient.get().uri("/health")
                .retrieve()
                .bodyToMono(String.class)
                .map(s -> true)
                .onErrorResume(e -> Mono.just(false)))
        .cache(1);
...where healthCheckStream is a field of type Flux<Boolean>. (Note it doesn't need to be volatile, as you'll never replace it so cross-thread worries don't apply - it's the same stream that will be updated with different results every 10 minutes based on the healthcheck status, whatever thread you'll access it from.)
This essentially creates a stream of healthcheck response values every 10 minutes, always caches the latest response, and turns it into a hot source. This means that the usual "nothing happens until you subscribe" rule doesn't apply in this case: the flux will start executing immediately, and any new subscribers that come in on any thread will always get the latest result, be that a pass or a fail. handleHealthCheckSuccess(), handleHealthCheckError(), isUp, and completeWhenUp are then all redundant and can go, and your waitForExternalSystemUPStatus() can just become a single line:
return healthCheckStream.filter(x -> x).next();
...then job done, you can call that from anywhere and you'll have a Mono that will only complete when the system is up.
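Putting those pieces together, a rough sketch of the slimmed-down component under the above assumptions (the 10-second interval mirrors the question's cron rather than the minutes used above, the WebClient configuration is omitted as in the question, and the return type becomes Mono<Boolean>):
@Component
public class ExternalHealthChecker {

    private final WebClient webClient = WebClient.builder().build(); // config omitted

    // hot, cached stream of health check results, started once when the bean is created
    private final Flux<Boolean> healthCheckStream = Flux.interval(Duration.ofSeconds(10))
            .flatMap(i -> webClient.get().uri("/health")
                    .retrieve()
                    .bodyToMono(String.class)
                    .map(s -> true)
                    .onErrorResume(e -> Mono.just(false)))
            .cache(1);

    public Mono<Boolean> waitForExternalSystemUPStatus() {
        // completes with the latest (or next) health check result that reports UP
        return healthCheckStream.filter(up -> up).next();
    }
}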

What is the top first use case you think of, when you see the 'flatMap' method in someone else's code?

Sorry for a somewhat theoretical question, but I'd like to find a way of quickly reading someone else's functional code by building templates for how chains of methods are used.
For example:
Case 1.
When I see use of the .peek method, or .wireTap from Spring Integration, I primarily expect logging, triggering monitoring, or just running some transitional external action, for instance:
.peek(params ->
log.info("creating cache configuration {} for key class \"{}\" and value class \"{}\"",
params.getName(), params.getKeyClass(), params.getValueClass()))
or
.peek(p ->
Try.run(() -> cacheService.cacheProfile(p))
.onFailure(ex ->
log.warn("Unable to cache profile: {}", ex.toString())))
or
.wireTap(sf -> sf.handle(msg -> {
    monitoring.profileRequestsReceived();
    log.trace("Client info request(s) received: {}", msg);
}))
Case 2.
When I see use of the .map method, or .transform from Spring Integration, I understand that I'm about to get the result of someFunction(input), for instance:
.map(e -> GenerateTokenRs.builder().token(e.getKey()).phoneNum(e.getValue()).build())
or
.transform(Message.class, msg -> {
    ErrorResponse response = (ErrorResponse) msg.getPayload();
    MessageBuilder builder = ...; // some transforming
    return builder.build();
})
Current case.
But I don't have such a common view of the .flatMap method.
Would you give me your opinion on this, please?
Add 1:
To Turamarth: I know the difference between the .map and .flatMap methods. I actively use both .map and .flatMap in my code.
But I'm asking the community for their experience and coding templates.
It always helps to study the signature/javadoc of the streamish methods to understand them:
The flatMap() operation has the effect of applying a one-to-many transformation to the elements of the stream, and then flattening the resulting elements into a new stream.
So, typical code I expect, or wrote myself:
return someMap.values().stream().flatMap(Collection::stream)
The values of that map are sets, and I want to pull the entries of all these sets into a single stream for further processing here.
In other words: it is about "pulling out things", and getting them into a stream/collection for further processing.
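For instance, a tiny self-contained illustration of that "pulling out" pattern (the map and its contents are made up):
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class FlatMapDemo {
    public static void main(String[] args) {
        Map<String, Set<Integer>> someMap = Map.of(
                "a", Set.of(1, 2),
                "b", Set.of(3, 4));
        // flatMap turns the stream of sets into one stream of all their elements
        List<Integer> all = someMap.values().stream()
                .flatMap(Set::stream)
                .sorted()
                .collect(Collectors.toList());
        System.out.println(all); // [1, 2, 3, 4]
    }
}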
I've found one more use template for .flatMap.
Let's have a look at the following code:
String s = valuesFromDb
        .map(v -> v.get(k))
        .getOrElse("0");
where Option<Map<String, String>> valuesFromDb = Option.of(.....).
If there's an entry k=null in the map, then we'll get null as the result of the code above.
But we'd like to have "0" in this case as well.
So let's add .flatMap:
String s = valuesFromDb
        .map(v -> v.get(k))
        .flatMap(Option::of)
        .getOrElse("0");
Regardless of having null as the map's value, we will get "0".

Flux endpoint from infinite java stream

I have an issue while processing a flux that is built from a Stream.generate construct.
The Java stream is fetching some data from a remote source, hence I implemented a custom supplier that has the data fetching logic embedded, and then used it to populate the Stream.
Stream.generate(new SearchSupplier(...))
My idea is to detect an empty list and use the Java 9 takeWhile feature:
Stream.generate(new SearchSupplier(this, queryBody))
.takeWhile(either -> either.isRight() && either.get().nonEmpty())
(using Vavr's Either construct)
The repository layer flux will then do:
return Flux.fromStream(
        this.searchStream(...) // this is where the stream gets generated
    )
    .map(Either::get)
    .flatMap(Flux::fromIterable);
The "service" layer is composed of some transformation steps on the flux, but the method signature is something like Flux<JsonObject> search(...).
Finally, the controller layer has a GetMapping:
@GetMapping(produces = "application/stream+json")
public Flux search(...) {
    return searchService.search(...) // this is the Flux<JsonObject> part
            .subscriberContext(...) // stuff I need available during processing
            .doOnComplete(() -> log.debug("DONE"));
}
My problem is that the Flux seems to never terminate.
Doing a call from Postman, for example, just shows the 'Loading...' indicator in the response section. When I terminate the process from my IDE, the results are then flushed to Postman and I see what I'm expecting. Also, the doOnComplete lambda never gets called.
What I noticed is that if I change the source of a Flux:
Flux.fromArray(...) // hardcoded array of lists of jsons
the doOnComplete lambda is called, the HTTP connection closes, and the results are displayed in Postman.
Any idea of what might be the issue?
Thanks.
You could create the Flux directly using code that looks like this. Note that I'm adding some assumed methods which you would need to implement based on how your SearchSupplier works:
Flux<SearchResultType> flux = Flux.generate(
        () -> new SearchSupplier(this, queryBody),
        (supplier, sink) -> {
            SearchResultType current = supplier.next();
            if (isNotLast(current)) {
                sink.next(current);
            } else {
                sink.complete();
            }
            return supplier;
        },
        supplier -> anyCleanupOperations(supplier)
);
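Assuming SearchResultType is the Vavr Either from your repository layer, that generated Flux can then simply replace the Flux.fromStream(...) source, so that sink.complete() propagates downstream and the controller's doOnComplete finally fires:
return flux                            // the Flux.generate(...) from above
        .map(Either::get)              // unwrap the Either, as in the original code
        .flatMap(Flux::fromIterable);  // flatten each result list into individual elements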
