I'm using the MongoDB Reactive Streams Java API, which I implemented following this example, but I'm encountering a serious problem: sometimes, when I try to query a collection, the await method doesn't work and it hangs until the timeout is reached.
The onSubscribe method gets called correctly, but then none of onNext, onError, or onComplete gets called.
There doesn't seem to be a specific circumstance causing this issue.
This is my code:
MongoDatabase database = MongoDBConnector.getClient().getDatabase("myDb");
MongoCollection<Document> collection = database.getCollection("myCollection");
FindPublisher<Document> finder = collection.find(Filters.exists("myField"));
SettingSubscriber tagSub = new SettingSubscriber(finder);
// SettingSubscriber is a subclass of ObservableSubscriber which calls publisher.subscribe(this)
tagSub.await(); //this is where it hangs
return tagSub.getWrappedData();
I wrote a simple implementation of what I assumed the SettingSubscriber looked like and tried to recreate the problem using a Groovy script. I couldn't: my code runs without hanging, prints each output record, and exits. Code for reference below:
@Grab(group = 'org.mongodb', module = 'mongodb-driver-reactivestreams', version = '4.3.3')
@Grab(group = 'org.slf4j', module = 'slf4j-api', version = '1.7.32')
@Grab(group = 'ch.qos.logback', module = 'logback-classic', version = '1.2.6')
import com.mongodb.MongoClientSettings;
import com.mongodb.MongoCredential;
import com.mongodb.ServerAddress;
import com.mongodb.reactivestreams.client.MongoClients;
import com.mongodb.reactivestreams.client.MongoClient;
import com.mongodb.reactivestreams.client.MongoDatabase;
import com.mongodb.reactivestreams.client.MongoCollection;
import com.mongodb.reactivestreams.client.FindPublisher;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import org.reactivestreams.Subscriber;
import org.reactivestreams.Subscription;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.CountDownLatch;
MongoClientSettings.Builder clientSettingsBuilder = MongoClientSettings.builder()
.applyToClusterSettings { clusterSettingsBuilder ->
clusterSettingsBuilder.hosts( Arrays.asList(new ServerAddress("localhost", 27017)))
};
MongoClient mongoClient = MongoClients.create(clientSettingsBuilder.build());
MongoDatabase database = mongoClient.getDatabase("myDb");
MongoCollection<Document> collection = database.getCollection("myCollection");
FindPublisher<Document> finder = collection.find(Filters.exists("myField"));
SettingSubscriber tagSub = new SettingSubscriber(finder);
tagSub.await();
class SettingSubscriber implements Subscriber<Document> {
private final CountDownLatch latch = new CountDownLatch(1);
private Subscription subscription;
private List<Document> data = new ArrayList<>();
public SettingSubscriber(FindPublisher<Document> finder) {
finder.subscribe(this);
}
@Override
public void onSubscribe(final Subscription subscription) {
this.subscription = subscription;
subscription.request(1);
}
@Override
public void onNext(final Document document) {
System.out.println("Received: " + document);
data.add(document);
subscription.request(1);
}
@Override
public void onError(final Throwable throwable) {
throwable.printStackTrace();
latch.countDown();
}
@Override
public void onComplete() {
System.out.println("Completed");
latch.countDown();
}
public List<Document> getWrappedData() {
return data;
}
public void await() throws Throwable {
await(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
}
public void await(final long timeout, final TimeUnit unit) throws Throwable {
if (!latch.await(timeout, unit)) {
System.out.println("Publish timed out");
}
}
}
Can you compare this implementation of the SettingSubscriber with yours to see if anything is missing?
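For reference, one detail worth checking when you compare (an assumption about the cause, since the failing ObservableSubscriber isn't shown): a Reactive Streams publisher emits nothing until the subscriber calls Subscription.request(...), so onSubscribe firing while onNext/onError/onComplete stay silent is exactly what a missing or zero request looks like. A minimal sketch of the critical part:
@Override
public void onSubscribe(final Subscription s) {
    this.subscription = s;
    // Without a request, a conforming publisher never emits anything,
    // and await() blocks until the timeout is reached.
    s.request(Long.MAX_VALUE); // effectively unbounded; or request(1) and re-request in onNext
}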
I'm getting an error with StreamIdentifier when trying to use MultiStreamTracker in a Kinesis consumer application.
java.lang.IllegalArgumentException: Unable to deserialize StreamIdentifier from first-stream-name
What is causing this error? I can't find a good example of using the tracker with Kinesis.
The stream name works when using a consumer with a single stream, so I'm not sure what is happening. It looks like the consumer is trying to parse the accountId and streamCreationEpoch, but when I create the identifiers I am using the singleStreamInstance method. Is the stream name required to have these values? They appear to be optional from the code.
This test is part of a complete example on GitHub.
package kinesis.localstack.example;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.KinesisProducerConfiguration;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.localstack.LocalStackContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.cloudwatch.CloudWatchAsyncClient;
import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;
import software.amazon.awssdk.services.kinesis.KinesisAsyncClient;
import software.amazon.kinesis.common.ConfigsBuilder;
import software.amazon.kinesis.common.InitialPositionInStream;
import software.amazon.kinesis.common.InitialPositionInStreamExtended;
import software.amazon.kinesis.common.KinesisClientUtil;
import software.amazon.kinesis.common.StreamConfig;
import software.amazon.kinesis.common.StreamIdentifier;
import software.amazon.kinesis.coordinator.Scheduler;
import software.amazon.kinesis.exceptions.InvalidStateException;
import software.amazon.kinesis.exceptions.ShutdownException;
import software.amazon.kinesis.lifecycle.events.InitializationInput;
import software.amazon.kinesis.lifecycle.events.LeaseLostInput;
import software.amazon.kinesis.lifecycle.events.ProcessRecordsInput;
import software.amazon.kinesis.lifecycle.events.ShardEndedInput;
import software.amazon.kinesis.lifecycle.events.ShutdownRequestedInput;
import software.amazon.kinesis.processor.FormerStreamsLeasesDeletionStrategy;
import software.amazon.kinesis.processor.FormerStreamsLeasesDeletionStrategy.NoLeaseDeletionStrategy;
import software.amazon.kinesis.processor.MultiStreamTracker;
import software.amazon.kinesis.processor.ShardRecordProcessor;
import software.amazon.kinesis.processor.ShardRecordProcessorFactory;
import software.amazon.kinesis.retrieval.KinesisClientRecord;
import software.amazon.kinesis.retrieval.polling.PollingConfig;
import static java.util.stream.Collectors.toList;
import static org.assertj.core.api.Assertions.assertThat;
import static org.awaitility.Awaitility.await;
import static org.testcontainers.containers.localstack.LocalStackContainer.Service.CLOUDWATCH;
import static org.testcontainers.containers.localstack.LocalStackContainer.Service.DYNAMODB;
import static org.testcontainers.containers.localstack.LocalStackContainer.Service.KINESIS;
import static software.amazon.kinesis.common.InitialPositionInStream.TRIM_HORIZON;
import static software.amazon.kinesis.common.StreamIdentifier.singleStreamInstance;
@Testcontainers
public class KinesisMultiStreamTest {
static class TestProcessorFactory implements ShardRecordProcessorFactory {
private final TestKinesisRecordService service;
public TestProcessorFactory(TestKinesisRecordService service) {
this.service = service;
}
@Override
public ShardRecordProcessor shardRecordProcessor() {
throw new UnsupportedOperationException("must have streamIdentifier");
}
public ShardRecordProcessor shardRecordProcessor(StreamIdentifier streamIdentifier) {
return new TestRecordProcessor(service, streamIdentifier);
}
}
static class TestRecordProcessor implements ShardRecordProcessor {
public final TestKinesisRecordService service;
public final StreamIdentifier streamIdentifier;
public TestRecordProcessor(TestKinesisRecordService service, StreamIdentifier streamIdentifier) {
this.service = service;
this.streamIdentifier = streamIdentifier;
}
@Override
public void initialize(InitializationInput initializationInput) {
}
@Override
public void processRecords(ProcessRecordsInput processRecordsInput) {
service.addRecord(streamIdentifier, processRecordsInput);
}
@Override
public void leaseLost(LeaseLostInput leaseLostInput) {
}
@Override
public void shardEnded(ShardEndedInput shardEndedInput) {
try {
shardEndedInput.checkpointer().checkpoint();
} catch (Exception e) {
throw new IllegalStateException(e);
}
}
@Override
public void shutdownRequested(ShutdownRequestedInput shutdownRequestedInput) {
}
}
static class TestKinesisRecordService {
private List<ProcessRecordsInput> firstStreamRecords = Collections.synchronizedList(new ArrayList<>());
private List<ProcessRecordsInput> secondStreamRecords = Collections.synchronizedList(new ArrayList<>());
public void addRecord(StreamIdentifier streamIdentifier, ProcessRecordsInput processRecordsInput) {
if(streamIdentifier.streamName().contains(firstStreamName)) {
firstStreamRecords.add(processRecordsInput);
} else if(streamIdentifier.streamName().contains(secondStreamName)) {
secondStreamRecords.add(processRecordsInput);
} else {
throw new IllegalStateException("no list for stream " + streamIdentifier);
}
}
public List<ProcessRecordsInput> getFirstStreamRecords() {
return Collections.unmodifiableList(firstStreamRecords);
}
public List<ProcessRecordsInput> getSecondStreamRecords() {
return Collections.unmodifiableList(secondStreamRecords);
}
}
public static final String firstStreamName = "first-stream-name";
public static final String secondStreamName = "second-stream-name";
public static final String partitionKey = "partition-key";
DockerImageName localstackImage = DockerImageName.parse("localstack/localstack:latest");
@Container
public LocalStackContainer localstack = new LocalStackContainer(localstackImage)
.withServices(KINESIS, CLOUDWATCH)
.withEnv("KINESIS_INITIALIZE_STREAMS", firstStreamName + ":1," + secondStreamName + ":1");
public Scheduler scheduler;
public TestKinesisRecordService service = new TestKinesisRecordService();
public KinesisProducer producer;
@BeforeEach
void setup() {
KinesisAsyncClient kinesisClient = KinesisClientUtil.createKinesisAsyncClient(
KinesisAsyncClient.builder().endpointOverride(localstack.getEndpointOverride(KINESIS)).region(Region.of(localstack.getRegion()))
);
DynamoDbAsyncClient dynamoClient = DynamoDbAsyncClient.builder().region(Region.of(localstack.getRegion())).endpointOverride(localstack.getEndpointOverride(DYNAMODB)).build();
CloudWatchAsyncClient cloudWatchClient = CloudWatchAsyncClient.builder().region(Region.of(localstack.getRegion())).endpointOverride(localstack.getEndpointOverride(CLOUDWATCH)).build();
MultiStreamTracker tracker = new MultiStreamTracker() {
private List<StreamConfig> configs = List.of(
new StreamConfig(singleStreamInstance(firstStreamName), InitialPositionInStreamExtended.newInitialPosition(TRIM_HORIZON)),
new StreamConfig(singleStreamInstance(secondStreamName), InitialPositionInStreamExtended.newInitialPosition(TRIM_HORIZON)));
@Override
public List<StreamConfig> streamConfigList() {
return configs;
}
@Override
public FormerStreamsLeasesDeletionStrategy formerStreamsLeasesDeletionStrategy() {
return new NoLeaseDeletionStrategy();
}
};
ConfigsBuilder configsBuilder = new ConfigsBuilder(tracker, "KinesisPratTest", kinesisClient, dynamoClient, cloudWatchClient, UUID.randomUUID().toString(), new TestProcessorFactory(service));
scheduler = new Scheduler(
configsBuilder.checkpointConfig(),
configsBuilder.coordinatorConfig(),
configsBuilder.leaseManagementConfig(),
configsBuilder.lifecycleConfig(),
configsBuilder.metricsConfig(),
configsBuilder.processorConfig().callProcessRecordsEvenForEmptyRecordList(false),
configsBuilder.retrievalConfig()
);
new Thread(scheduler).start();
producer = producer();
}
@AfterEach
public void teardown() throws ExecutionException, InterruptedException, TimeoutException {
producer.destroy();
Future<Boolean> gracefulShutdownFuture = scheduler.startGracefulShutdown();
gracefulShutdownFuture.get(60, TimeUnit.SECONDS);
}
public KinesisProducer producer() {
var configuration = new KinesisProducerConfiguration()
.setVerifyCertificate(false)
.setCredentialsProvider(localstack.getDefaultCredentialsProvider())
.setMetricsCredentialsProvider(localstack.getDefaultCredentialsProvider())
.setRegion(localstack.getRegion())
.setCloudwatchEndpoint(localstack.getEndpointOverride(CLOUDWATCH).getHost())
.setCloudwatchPort(localstack.getEndpointOverride(CLOUDWATCH).getPort())
.setKinesisEndpoint(localstack.getEndpointOverride(KINESIS).getHost())
.setKinesisPort(localstack.getEndpointOverride(KINESIS).getPort());
return new KinesisProducer(configuration);
}
@Test
void testFirstStream() {
String expected = "Hello";
producer.addUserRecord(firstStreamName, partitionKey, ByteBuffer.wrap(expected.getBytes(StandardCharsets.UTF_8)));
var result = await().timeout(600, TimeUnit.SECONDS)
.until(() -> service.getFirstStreamRecords().stream()
.flatMap(r -> r.records().stream())
.map(KinesisClientRecord::data)
.map(r -> StandardCharsets.UTF_8.decode(r).toString())
.collect(toList()), records -> records.size() > 0);
assertThat(result).anyMatch(r -> r.equals(expected));
}
@Test
void testSecondStream() {
String expected = "Hello";
producer.addUserRecord(secondStreamName, partitionKey, ByteBuffer.wrap(expected.getBytes(StandardCharsets.UTF_8)));
var result = await().timeout(600, TimeUnit.SECONDS)
.until(() -> service.getSecondStreamRecords().stream()
.flatMap(r -> r.records().stream())
.map(KinesisClientRecord::data)
.map(r -> StandardCharsets.UTF_8.decode(r).toString())
.collect(toList()), records -> records.size() > 0);
assertThat(result).anyMatch(r -> r.equals(expected));
}
}
Here is the error I am getting:
[Thread-9] ERROR software.amazon.kinesis.coordinator.Scheduler - Worker.run caught exception, sleeping for 1000 milli seconds!
java.lang.IllegalArgumentException: Unable to deserialize StreamIdentifier from first-stream-name
at software.amazon.kinesis.common.StreamIdentifier.multiStreamInstance(StreamIdentifier.java:75)
at software.amazon.kinesis.coordinator.Scheduler.getStreamIdentifier(Scheduler.java:1001)
at software.amazon.kinesis.coordinator.Scheduler.buildConsumer(Scheduler.java:917)
at software.amazon.kinesis.coordinator.Scheduler.createOrGetShardConsumer(Scheduler.java:899)
at software.amazon.kinesis.coordinator.Scheduler.runProcessLoop(Scheduler.java:419)
at software.amazon.kinesis.coordinator.Scheduler.run(Scheduler.java:330)
at java.base/java.lang.Thread.run(Thread.java:829)
According to the documentation:
The serialized stream identifier should be of the following format: account-id:StreamName:streamCreationTimestamp
So your code should be like this:
private List<StreamConfig> configs = List.of(
new StreamConfig(multiStreamInstance("111111111:multiStreamTest-1:12345"), InitialPositionInStreamExtended.newInitialPosition(TRIM_HORIZON)),
new StreamConfig(multiStreamInstance("111111111:multiStreamTest-2:12389"), InitialPositionInStreamExtended.newInitialPosition(TRIM_HORIZON)));
Note: this will also change the leaseKey format to account-id:StreamName:streamCreationTimestamp:ShardId.
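If you don't want to hard-code the account id and creation timestamp (e.g., in a test against LocalStack), you could look them up at runtime instead. A sketch, assuming the AWS SDK v2 KinesisAsyncClient already used in the test above; the helper name serializedStreamIdentifier is made up:
import software.amazon.awssdk.services.kinesis.KinesisAsyncClient;
import software.amazon.awssdk.services.kinesis.model.StreamDescriptionSummary;
// Hypothetical helper: builds the "account-id:StreamName:streamCreationTimestamp"
// string that StreamIdentifier.multiStreamInstance(...) expects.
static String serializedStreamIdentifier(KinesisAsyncClient kinesis, String streamName) throws Exception {
    StreamDescriptionSummary summary = kinesis
            .describeStreamSummary(b -> b.streamName(streamName))
            .get()
            .streamDescriptionSummary();
    // ARN format: arn:aws:kinesis:<region>:<account-id>:stream/<name>
    String accountId = summary.streamARN().split(":")[4];
    long creationEpoch = summary.streamCreationTimestamp().getEpochSecond();
    return accountId + ":" + streamName + ":" + creationEpoch;
}
The result can then be passed to multiStreamInstance(...) when building each StreamConfig.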
There is code that is supposed to do load testing for some function that performs an HTTP call (we call it callInit here) and collect some data in the LoaTestMetricsData:
the collected responses
and the total duration of the execution.
import io.reactivex.Observable;
import io.reactivex.Scheduler;
import io.reactivex.Single;
import io.reactivex.observers.TestObserver;
import io.reactivex.schedulers.Schedulers;
import io.reactivex.subjects.PublishSubject;
import io.reactivex.subjects.Subject;
import io.restassured.internal.RestAssuredResponseImpl;
import io.restassured.response.Response;
import org.junit.jupiter.api.Test;
import java.time.Duration;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import static java.lang.Thread.sleep;
import static org.hamcrest.CoreMatchers.equalTo;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.allOf;
import static org.hamcrest.Matchers.greaterThanOrEqualTo;
import static org.hamcrest.Matchers.lessThan;
public class TestRx {
@Test
public void loadTest() {
int CALL_N_TIMES = 10;
final long CALL_NIT_EVERY_MILLISECONDS = 100;
final LoaTestMetricsData loaTestMetricsData = loadTestHttpCall(
this::callInit,
CALL_N_TIMES,
CALL_NIT_EVERY_MILLISECONDS
);
assertThat(loaTestMetricsData.responseList.size(), is(equalTo(Long.valueOf(CALL_N_TIMES).intValue())));
long errorCount = loaTestMetricsData.responseList.stream().filter(x -> x.getStatusCode() != 200).count();
long executionTime = loaTestMetricsData.duration.getSeconds();
//assertThat(errorCount, is(equalTo(0)));
assertThat(executionTime , allOf(greaterThanOrEqualTo(1L),lessThan(3L)));
}
// --
private Single<Response> callInit() {
try {
return Single.fromCallable(() -> {
System.out.println("...");
sleep(1000);
Response response = new RestAssuredResponseImpl();
return response;
});
} catch (Exception ex) {
throw new RuntimeException(ex.getMessage());
}
}
// --
private LoaTestMetricsData loadTestHttpCall(final Supplier<Single<Response>> restCallFunction, long callnTimes, long callEveryMilisseconds) {
long startTimeMillis = System.currentTimeMillis();
final LoaTestMetricsData loaDestMetricsData = new LoaTestMetricsData();
final AtomicInteger atomicInteger = new AtomicInteger(0);
final TestObserver<Response> testObserver = new TestObserver<Response>() {
public void onNext(Response response) {
loaDestMetricsData.responseList.add(response);
super.onNext(response);
}
public void onComplete() {
loaDestMetricsData.duration = Duration.ofMillis(System.currentTimeMillis() - startTimeMillis);
super.onComplete();
}
};
final Subject<Response> subjectInitCallResults = PublishSubject.create(); // Memo: Subjects are hot so if you don't observe them the right time, you may not get events. Thus: subscribe first then emit (onNext)
final Scheduler schedulerIo = Schedulers.io();
subjectInitCallResults
.subscribeOn(schedulerIo)
.subscribe(testObserver); // subscribe first
final Observable<Long> source = Observable.interval(callEveryMilisseconds, TimeUnit.MILLISECONDS).take(callnTimes);
source.subscribe(x -> {
final Single<Response> singleResult = restCallFunction.get();
singleResult
.subscribeOn(schedulerIo)
.subscribe( result -> {
int count = atomicInteger.incrementAndGet();
if(count == callnTimes) {
subjectInitCallResults.onNext(result); // then emit
subjectInitCallResults.onComplete();
} else {
subjectInitCallResults.onNext(result);
}
});
});
testObserver.awaitTerminalEvent();
testObserver.assertComplete();
testObserver.assertValueCount(Long.valueOf(callnTimes).intValue()); // !!!
return loaDestMetricsData;
}
}
The LoaTestMetricsData is defined as:
public class LoaTestMetricsData {
public List<Response> responseList = new ArrayList<>();
public Duration duration;
}
Sometimes the test fails with this error:
java.lang.AssertionError: Value counts differ; expected: 10 but was: 9 (latch = 0, values = 9, errors = 0, completions = 1)
Expected :10
Actual :9 (latch = 0, values = 9, errors = 0, completions = 1)
Could someone tell me why?
It seems some of the subjectInitCallResults.onNext() calls were not executed or consumed, but why? I understand that PublishSubject is a hot observable, so I subscribe to it before emitting anything (onNext) to it.
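One possibility worth ruling out (an assumption, not a confirmed diagnosis): the onNext calls arrive from several Schedulers.io() worker threads concurrently, and RxJava Subjects are not thread-safe for concurrent onNext unless they are serialized, so an emission can be lost or arrive after onComplete. A minimal sketch of that change:
// Serialize the subject so concurrent onNext calls from different io()
// workers cannot interleave with each other or with onComplete.
final Subject<Response> subjectInitCallResults =
        PublishSubject.<Response>create().toSerialized();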
UPDATE:
What would fix it is this ugly code, which waits for the subject to fill up:
while(subjectInitCallResults.count().blockingGet() != callnTimes) {
Thread.sleep(100);
}
..
testObserver.awaitTerminalEvent();
But is there a proper / better way of doing it?
Thanks.
I have a problem with a simple workflow implemented with the Activiti engine and Java.
I'm not able to execute a task asynchronously. The workflow is very simple.
The upper part of the workflow starts a process, executes the "Start Service Task", emits a signal, and executes the "Loop Service Task", repeating 10 times.
The bottom part is triggered by the signal emitted in the upper part and must execute asynchronously with respect to the loop part, but in reality it blocks the "Loop Service Task".
I tried setting the async attribute in the bottom flow, but in that case the bottom part is not executed.
Here is the link to the GitHub project:
https://github.com/giane88/testActiviti
package configuration;
import org.activiti.engine.ProcessEngine;
import org.activiti.engine.ProcessEngineConfiguration;
import org.activiti.engine.RepositoryService;
import org.activiti.engine.impl.asyncexecutor.AsyncExecutor;
import org.activiti.engine.impl.asyncexecutor.ManagedAsyncJobExecutor;
import org.activiti.engine.impl.cfg.StandaloneInMemProcessEngineConfiguration;
public class WorkflowConfiguration {
final ProcessEngine processEngine;
public WorkflowConfiguration(final String workFlowName) {
processEngine = setUpProcessEngine(workFlowName);
}
public ProcessEngine getProcessEngine() {
return processEngine;
}
private ProcessEngine setUpProcessEngine(String workFlowName) {
ProcessEngineConfiguration cfg = new StandaloneInMemProcessEngineConfiguration()
.setJdbcUrl("jdbc:h2:mem:activiti;DB_CLOSE_DELAY=1000")
.setJdbcUsername("sa")
.setJdbcPassword("")
.setJdbcDriver("org.h2.Driver")
.setDatabaseSchemaUpdate(ProcessEngineConfiguration.DB_SCHEMA_UPDATE_TRUE);
final ProcessEngine processEngine = cfg.buildProcessEngine();
RepositoryService repositoryService = processEngine.getRepositoryService();
repositoryService.createDeployment().addClasspathResource("activiti/" + workFlowName)
.deploy();
return processEngine;
}
}
package configuration;
import org.activiti.engine.ProcessEngine;
import org.activiti.engine.runtime.ProcessInstance;
import java.util.HashMap;
import java.util.Map;
public class WorkflowManipulator {
private final Map<String, Object> nextDelegateVariables;
private final String wfName;
private final ProcessEngine engine;
public WorkflowManipulator(String wfName, ProcessEngine engine) {
this.nextDelegateVariables = new HashMap<>();
this.wfName = wfName;
this.engine = engine;
}
public ProcessInstance startProcess() {
if (nextDelegateVariables.size() > 0) {
return engine.getRuntimeService().startProcessInstanceByKey(wfName, nextDelegateVariables);
} else {
return engine.getRuntimeService().startProcessInstanceByKey(wfName);
}
}
}
@Log4j2
public class TestWorkFlowMain {
public static void main(String[] args) throws IOException {
WorkflowConfiguration workflowConfiguration = new WorkflowConfiguration("test.bpmn");
WorkflowManipulator workflowManipulator = new WorkflowManipulator("testProcess", workflowConfiguration.getProcessEngine());
ProcessInstance processInstance = workflowManipulator.startProcess();
}
}
package delegates;
import lombok.extern.log4j.Log4j2;
import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.JavaDelegate;
@Log4j2
public class AsyncServiceTask implements JavaDelegate {
@Override
public void execute(DelegateExecution execution) throws Exception {
log.info("Sleeping for 3 second");
Thread.sleep(3000);
log.warn("AsyncCompleted");
}
}
As usual, I found the solution a few minutes after posting the question.
The problem was in the process engine configuration: you need to set up an AsyncExecutor and enable and activate it, as in the example below.
ProcessEngineConfiguration cfg;
AsyncExecutor asyncExecutor = new ManagedAsyncJobExecutor();
cfg = new StandaloneInMemProcessEngineConfiguration()
.setAsyncExecutor(asyncExecutor)
.setAsyncExecutorEnabled(true)
.setAsyncExecutorActivate(true);
final ProcessEngine processEngine = cfg.buildProcessEngine();
Getting the below error stack trace while working with Kafka Streams.
UPDATE: as per @matthias-j-sax, I have implemented my own Serdes with a default constructor for WrapperSerde, but I'm still getting the following exceptions:
org.apache.kafka.streams.errors.StreamsException: stream-thread [streams-request-count-4c239508-6abe-4901-bd56-d53987494770-StreamThread-1] Failed to rebalance.
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests (StreamThread.java:836)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce (StreamThread.java:784)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop (StreamThread.java:750)
at org.apache.kafka.streams.processor.internals.StreamThread.run (StreamThread.java:720)
Caused by: org.apache.kafka.streams.errors.StreamsException: Failed to configure value serde class myapps.serializer.Serdes$WrapperSerde
at org.apache.kafka.streams.StreamsConfig.defaultValueSerde (StreamsConfig.java:972)
at org.apache.kafka.streams.processor.internals.AbstractProcessorContext.<init> (AbstractProcessorContext.java:59)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.<init> (ProcessorContextImpl.java:42)
at org.apache.kafka.streams.processor.internals.StreamTask.<init> (StreamTask.java:136)
at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask (StreamThread.java:405)
at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask (StreamThread.java:369)
at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks (StreamThread.java:354)
at org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks (TaskManager.java:148)
at org.apache.kafka.streams.processor.internals.TaskManager.createTasks (TaskManager.java:107)
at org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned (StreamThread.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete (ConsumerCoordinator.java:259)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded (AbstractCoordinator.java:367)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup (AbstractCoordinator.java:316)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll (ConsumerCoordinator.java:290)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce (KafkaConsumer.java:1149)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll (KafkaConsumer.java:1115)
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests (StreamThread.java:827)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce (StreamThread.java:784)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop (StreamThread.java:750)
at org.apache.kafka.streams.processor.internals.StreamThread.run (StreamThread.java:720)
Caused by: java.lang.NullPointerException
at myapps.serializer.Serdes$WrapperSerde.configure (Serdes.java:30)
at org.apache.kafka.streams.StreamsConfig.defaultValueSerde (StreamsConfig.java:968)
at org.apache.kafka.streams.processor.internals.AbstractProcessorContext.<init> (AbstractProcessorContext.java:59)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.<init> (ProcessorContextImpl.java:42)
at org.apache.kafka.streams.processor.internals.StreamTask.<init> (StreamTask.java:136)
at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask (StreamThread.java:405)
at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask (StreamThread.java:369)
at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks (StreamThread.java:354)
at org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks (TaskManager.java:148)
at org.apache.kafka.streams.processor.internals.TaskManager.createTasks (TaskManager.java:107)
at org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned (StreamThread.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete (ConsumerCoordinator.java:259)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded (AbstractCoordinator.java:367)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup (AbstractCoordinator.java:316)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll (ConsumerCoordinator.java:290)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce (KafkaConsumer.java:1149)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll (KafkaConsumer.java:1115)
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests (StreamThread.java:827)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce (StreamThread.java:784)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop (StreamThread.java:750)
at org.apache.kafka.streams.processor.internals.StreamThread.run (StreamThread.java:720)
Here's my use case:
I will be getting JSON responses as input to the stream, and I want to count the requests whose status codes are not 200. Initially, I went through the Kafka Streams documentation (both the official docs and Confluent's) and implemented WordCountDemo, which works fine. Then I tried to write this code, but I'm getting this exception. I am very new to Kafka Streams; I went through the stack trace but couldn't understand the context, hence I came here for help!
Here's my code:
LogCount.java
package myapps;
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import myapps.serializer.JsonDeserializer;
import myapps.serializer.JsonSerializer;
import myapps.Request;
public class LogCount {
public static void main(String[] args) {
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-request-count");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
JsonSerializer<Request> requestJsonSerializer = new JsonSerializer<>();
JsonDeserializer<Request> requestJsonDeserializer = new JsonDeserializer<>(Request.class);
Serde<Request> requestSerde = Serdes.serdeFrom(requestJsonSerializer, requestJsonDeserializer);
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, requestSerde.getClass().getName());
final StreamsBuilder builder = new StreamsBuilder();
KStream<String, Request> source = builder.stream("streams-requests-input");
source.filter((k, v) -> v.getHttpStatusCode() != 200)
.groupByKey()
.count()
.toStream()
.to("streams-requests-output", Produced.with(Serdes.String(), Serdes.Long()));
final Topology topology = builder.build();
final KafkaStreams streams = new KafkaStreams(topology, props);
final CountDownLatch latch = new CountDownLatch(1);
System.out.println(topology.describe());
// attach shutdown handler to catch control-c
Runtime.getRuntime().addShutdownHook(new Thread("streams-shutdown-hook") {
@Override
public void run() {
streams.close();
latch.countDown();
}
});
try {
streams.cleanUp();
streams.start();
latch.await();
} catch (Throwable e) {
System.exit(1);
}
System.exit(0);
}
}
JsonDeserializer.java
package myapps.serializer;
import com.google.gson.Gson;
import org.apache.kafka.common.serialization.Deserializer;
import java.util.Map;
public class JsonDeserializer<T> implements Deserializer<T> {
private Gson gson = new Gson();
private Class<T> deserializedClass;
public JsonDeserializer(Class<T> deserializedClass) {
this.deserializedClass = deserializedClass;
}
public JsonDeserializer() {
}
@Override
@SuppressWarnings("unchecked")
public void configure(Map<String, ?> map, boolean b) {
if(deserializedClass == null) {
deserializedClass = (Class<T>) map.get("serializedClass");
}
}
@Override
public T deserialize(String s, byte[] bytes) {
if(bytes == null){
return null;
}
return gson.fromJson(new String(bytes),deserializedClass);
}
@Override
public void close() {
}
}
JsonSerializer.java
package myapps.serializer;
import com.google.gson.Gson;
import org.apache.kafka.common.serialization.Serializer;
import java.nio.charset.Charset;
import java.util.Map;
public class JsonSerializer<T> implements Serializer<T> {
private Gson gson = new Gson();
@Override
public void configure(Map<String, ?> map, boolean b) {
}
@Override
public byte[] serialize(String topic, T t) {
return gson.toJson(t).getBytes(Charset.forName("UTF-8"));
}
@Override
public void close() {
}
}
As I mentioned, I will be getting JSON as input; the structure is like this:
{
"RequestID":"1f6b2409",
"Protocol":"http",
"Host":"abc.com",
"Method":"GET",
"HTTPStatusCode":"200",
"User-Agent":"curl%2f7.54.0",
}
The corresponding Request.java file looks like this:
package myapps;
public final class Request {
private String requestID;
private String protocol;
private String host;
private String method;
private int httpStatusCode;
private String userAgent;
public String getRequestID() {
return requestID;
}
public void setRequestID(String requestID) {
this.requestID = requestID;
}
public String getProtocol() {
return protocol;
}
public void setProtocol(String protocol) {
this.protocol = protocol;
}
public String getHost() {
return host;
}
public void setHost(String host) {
this.host = host;
}
public String getMethod() {
return method;
}
public void setMethod(String method) {
this.method = method;
}
public int getHttpStatusCode() {
return httpStatusCode;
}
public void setHttpStatusCode(int httpStatusCode) {
this.httpStatusCode = httpStatusCode;
}
public String getUserAgent() {
return userAgent;
}
public void setUserAgent(String userAgent) {
this.userAgent = userAgent;
}
}
EDIT: when I exit from kafka-console-consumer.sh, it says Processed a total of 0 messages.
As the error indicates, a class is missing a no-argument default constructor, here Serdes$WrapperSerde:
Could not find a public no-argument constructor
The issue is this construct:
Serde<Request> requestSerde = Serdes.serdeFrom(requestJsonSerializer, requestJsonDeserializer);
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, requestSerde.getClass().getName());
Serdes.serdeFrom returns a WrapperSerde that does not have an empty default constructor. Thus, you cannot pass it into the StreamsConfig. You can use a Serde generated like this only if you pass the object into the corresponding API calls (i.e., to overwrite the default Serde for certain operators).
To make it work (i.e., to be able to set the Serde in the config), you would need to implement a proper class that implements the Serde interface.
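For example, a standalone Serde with a public no-argument constructor could look like this (a sketch that reuses the question's JsonSerializer/JsonDeserializer; the class name RequestSerde is made up):
package myapps.serializer;
import java.util.Map;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serializer;
import myapps.Request;
// A Serde Kafka Streams can instantiate reflectively from the config,
// because it has a public no-arg constructor.
public class RequestSerde implements Serde<Request> {
    private final JsonSerializer<Request> serializer = new JsonSerializer<>();
    private final JsonDeserializer<Request> deserializer = new JsonDeserializer<>(Request.class);
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }
    @Override
    public void close() { }
    @Override
    public Serializer<Request> serializer() { return serializer; }
    @Override
    public Deserializer<Request> deserializer() { return deserializer; }
}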
The requestSerde.getClass().getName() approach did not work for me. I needed to provide my own WrapperSerde implementation in an inner class. You would probably need to do the same, with something like:
public class MySerde extends WrapperSerde<Request> {
    public MySerde() {
        // Construct the serializer/deserializer here, so the class keeps
        // a no-arg constructor and can be instantiated from the config.
        super(new JsonSerializer<>(), new JsonDeserializer<>(Request.class));
    }
}
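With a class like that available, the config line from the question becomes (sketch):
// Point the default value serde at the class with the no-arg constructor.
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, MySerde.class.getName());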
Instead of specifying it in the properties, you can add the custom Serde when creating the stream:
KStream<String, Request> source = builder.stream("streams-requests-input",Consumed.with(Serdes.String(), requestSerde));
Is there a way to start Elasticsearch within a Gradle build before running integration tests and stop it afterwards?
My approach so far is the following, but it blocks the further execution of the Gradle build:
task runES(type: JavaExec) {
main = 'org.elasticsearch.bootstrap.Elasticsearch'
classpath = sourceSets.main.runtimeClasspath
systemProperties = ["es.path.home":"$buildDir/elastichome",
"es.path.data":"$buildDir/elastichome/data"]
}
For my purposes, I decided to start Elasticsearch within my integration test in Java code.
I tried ElasticsearchIntegrationTest, but that didn't work with Spring, because it doesn't harmonize with SpringJUnit4ClassRunner.
I found it easier to start Elasticsearch in the before method:
My test class, testing some 'dummy' productive code (indexing a document):
import static org.hamcrest.CoreMatchers.notNullValue;
import static org.junit.Assert.assertThat;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.ImmutableSettings.Builder;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.indices.IndexAlreadyExistsException;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeBuilder;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
public class MyIntegrationTest {
private Node node;
private Client client;
@Before
public void before() {
createElasticsearchClient();
createIndex();
}
@After
public void after() {
this.client.close();
this.node.close();
}
@Test
public void testSomething() throws Exception {
// do something with elasticsearch
final String json = "{\"mytype\":\"bla\"}";
final String type = "mytype";
final String id = index(json, type);
assertThat(id, notNullValue());
}
/**
* some productive code
*/
private String index(final String json, final String type) {
// create Client
final Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "mycluster").build();
final TransportClient tc = new TransportClient(settings).addTransportAddress(new InetSocketTransportAddress(
"localhost", 9300));
// index a document
final IndexResponse response = tc.prepareIndex("myindex", type).setSource(json).execute().actionGet();
return response.getId();
}
private void createElasticsearchClient() {
final NodeBuilder nodeBuilder = NodeBuilder.nodeBuilder();
final Builder settingsBuilder = nodeBuilder.settings();
settingsBuilder.put("network.publish_host", "localhost");
settingsBuilder.put("network.bind_host", "localhost");
final Settings settings = settingsBuilder.build();
this.node = nodeBuilder.clusterName("mycluster").local(false).data(true).settings(settings).node();
this.client = this.node.client();
}
private void createIndex() {
try {
this.client.admin().indices().prepareCreate("myindex").execute().actionGet();
} catch (final IndexAlreadyExistsException e) {
// index already exists => we ignore this exception
}
}
}
It is also very important to use Elasticsearch version 1.3.3 or higher. See Issue 5401.