Java - composite / parallel HealthCheck using AbstractHealthIndicator

I have an application that connects to multiple ActiveMQ queues, and I need to add a health check endpoint (i.e. /health) that tests connectivity to those queues. To do this I am using AbstractHealthIndicator, provided by Spring Boot Actuator.
The main problem is that when the checks are defined as separate health indicators, they run sequentially. If some of the connections (5 in total) have an issue or time out, the overall time needed to check every queue increases significantly.
To overcome this, I created a single health indicator that extends AbstractHealthIndicator and runs the check for each queue in parallel, using an ExecutorService and futures.
MQ health check class implementation:
public class HealthCheck extends AbstractHealthCheck implements Callable<Health> {
private String queueName;
public HealthCheck(JmsTemplate jmsTemplate, String queueName) {
super(jmsTemplate);
this.queueName = queueName;
}
public String getQueueName() {
return queueName;
}
@Override
public Health call() throws JMSException {
Health.Builder healthBuilder = new Health.Builder();
healthBuilder.withDetail("QueueName", getQueueName());
ConnectionFactory connectionFactory = super.getJmsTemplate().getConnectionFactory();
try (Connection connection = connectionFactory.createConnection()) {
connection.start();
healthBuilder.up();
}
catch (Exception e) {
healthBuilder.down();
}
return healthBuilder.build();
}
}
Health check indicator implementation which will be used by Spring Actuator:
public class HealthChecker extends AbstractHealthIndicator {
private final List<AbstractHealthCheck> healthCheckList;
private List<Future<Health>> futureListHealth;
private ExecutorService executorService = Executors.newFixedThreadPool(5);
public HealthChecker(final List<AbstractHealthCheck> healthCheckList) {
this.healthCheckList = healthCheckList;
}
@Override
protected synchronized void doHealthCheck(final Builder builder) throws Exception {
futureListHealth = new ArrayList<>();
Map<String, Object> compositeHealthCheckDetails = new HashMap<>();
builder.up();
futureListHealth = executorService.invokeAll(healthCheckList, 5, TimeUnit.SECONDS);
futureListHealth.forEach(future -> {
try {
Health futureHealthCheckResult = future.get();
compositeHealthCheckDetails.put(futureHealthCheckResult.getDetails().get("QueueName").toString(), futureHealthCheckResult);
} catch (CancellationException | ExecutionException | InterruptedException ex) {
compositeHealthCheckDetails.put(/* how to get queue details ? */, new Health.Builder().down().withDetail("error", ex.getMessage()).build());
builder.down();
System.out.println("Cancellation Exception Occurred");
}
});
builder.withDetails(compositeHealthCheckDetails);
builder.build();
}
}
The main challenge occurs when some of the checks time out and future.get() throws a CancellationException for them.
The ActiveMQ / JMS connectionFactory.createConnection() call does not react to interruption (InterruptedException), so I cannot handle the cancellation inside the HealthCheck class itself, and I lose the Health details of the timed-out check (the queue name, for instance).
I am trying to understand the best way to handle this scenario.
Questions I have:
Within the CancellationException catch section, what is the best way to obtain the queue name for which the future was created?
Currently, for each timed-out health check, I recreate a Health.Builder with a down state and the exception details, as shown below. Is there an alternative or better way of doing this?
compositeHealthCheckDetails.put(/* how to get queue details ? */, new Health.Builder().down().withDetail("error", ex.getMessage()).build());
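One possible way to recover the queue name in the catch block relies on a documented property of ExecutorService.invokeAll: the returned futures are in the same order as the submitted tasks, so the future at index i belongs to the task at index i. The sketch below shows that idea with plain JDK types only. NamedCheck, runAll and the "UP"/"DOWN" strings are hypothetical stand-ins for the HealthCheck class and the Health objects above, not Spring or ActiveMQ APIs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class NamedCheckDemo {

    /** Hypothetical stand-in for a per-queue check: a name plus simulated work. */
    static final class NamedCheck implements Callable<String> {
        final String queueName;
        final long delayMillis;

        NamedCheck(String queueName, long delayMillis) {
            this.queueName = queueName;
            this.delayMillis = delayMillis;
        }

        @Override
        public String call() throws Exception {
            Thread.sleep(delayMillis); // simulates a slow or fast connection attempt
            return "UP";
        }
    }

    /** Runs all checks in parallel and maps each result back to its queue name by index. */
    static Map<String, String> runAll(List<NamedCheck> checks, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, checks.size()));
        try {
            // Futures come back in the same order as the tasks in 'checks'.
            List<Future<String>> futures =
                    pool.invokeAll(checks, timeoutMillis, TimeUnit.MILLISECONDS);
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < futures.size(); i++) {
                String name = checks.get(i).queueName; // name is known even if the task failed
                try {
                    results.put(name, futures.get(i).get());
                } catch (CancellationException e) {
                    // invokeAll cancels tasks that did not finish before the timeout.
                    results.put(name, "DOWN (timed out)");
                } catch (ExecutionException e) {
                    results.put(name, "DOWN (" + e.getCause() + ")");
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With this shape the catch block no longer needs anything from the future itself; the name comes from the task list. The same index-based lookup should carry over to the HealthChecker above if the list elements expose getQueueName().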

Related

What is the best way to detect that a Kafka cluster is down?

The Kafka Consumer API is nice enough to hide transient connection errors and simply resume reading from its current offset if a Kafka broker dies and comes back up.
But in some applications it's important to alert and stop processing data (from other sources) if the entire Kafka cluster is down (i.e. all brokers).
I've browsed the various APIs, and that doesn't seem to be a feature.
The closest I've come is to submit an Admin call and, depending on a timeout, conclude that the Kafka cluster is down:
Properties properties = ... // Load properties from somewhere.
int timeout = 5_000; // 5 second timeout
AdminClient adminClient = AdminClient.create(properties);
try {
adminClient.listTopics(new ListTopicsOptions().timeoutMs(timeout)).listings().get();
// Here we know the cluster is up as call returned within timeout.
} catch (ExecutionException ex) {
// Here we know that the cluster is down as the call timed out.
}
Is this the best way to do it?
Another way is to query ZooKeeper, but the above approach will also work in situations where there's a network problem between the application and Kafka.
Although I would suggest using a proper monitoring tool, if you still want to do this programmatically, one option is to use AdminClient and try fetching the topic names.
For example,
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("request.timeout.ms", 5000);
try {
AdminClient adminClient = AdminClient.create(properties);
ListTopicsResult topics = adminClient.listTopics();
Set<String> names = topics.names().get();
} catch(InterruptedException | ExecutionException e) {
System.err.println("Kafka is unavailable");
}
Note however that the above won't throw an exception if some of the brokers are down (but obviously, if a broker is down it doesn't mean that the Kafka Cluster itself is down, as data should still be accessible)
Your approach looks fine. A similar approach (using Spring's HealthIndicator's notion) is what MartinX3 did here.
His solution:
@Component
public class KafkaHealthIndicator implements HealthIndicator {
private final Logger log = LoggerFactory.getLogger(KafkaHealthIndicator.class);
private KafkaTemplate<String, String> kafka;
public KafkaHealthIndicator(KafkaTemplate<String, String> kafka) {
this.kafka = kafka;
}
/**
* Return an indication of health.
*
* @return the health for
*/
@Override
public Health health() {
try {
kafka.send("kafka-health-indicator", "❥").get(100, TimeUnit.MILLISECONDS);
} catch (InterruptedException | ExecutionException | TimeoutException e) {
return Health.down(e).build();
}
return Health.up().build();
}
}
You might also want to combine other metric checks before returning Health.up().build(), such as ActiveControllerCount = 0, depending on what your use case requires in order to consider the entire cluster down.

How to create pool of clients which can handle just one task at once

My application starts a couple of clients that communicate with Steam. There are two types of task I can ask a client to perform. For the first type I don't care about blocking, for example asking a client for its friends. For the second type, I can submit only one task at a time to a client, and I need to wait until the client finishes it asynchronously. I am not sure whether there is an existing design pattern for this, but below is what I have tried. When I request the second type of task, I remove the client from the queue and return it once the task is done. I don't know if this is a good solution, because I can 'lose' clients if I do something wrong.
@Component
public class SteamClientWrapper {
private Queue<DotaClientImpl> clients = new LinkedList<>();
private final Object clientLock = new Object();
public SteamClientWrapper() {
}
@PostConstruct
public void init() {
// starting clients here clients.add();
}
public DotaClientImpl getClient() {
return getClient(false);
}
public DotaClientImpl getClient(boolean freeLast) {
// Use the same lock as postClient so both methods guard the queue consistently.
synchronized (clientLock) {
if (!clients.isEmpty()) {
return freeLast ? clients.poll() : clients.peek();
}
}
return null;
}
public void postClient(DotaClientImpl client) {
if (client == null) {
return;
}
synchronized (clientLock) {
clients.offer(client);
clientLock.notify();
}
}
public void doSomethingBlocking() {
DotaClientImpl client = getClient(true);
client.doSomething();
}
}
It sounds like you could use Spring's ThreadPoolTaskExecutor to do that.
An Executor is basically what you tried to build: it stores tasks in a queue and processes the next one as soon as the previous one has finished.
This is often used to run tasks in parallel, but it can also reduce overhead for serial processing.
A sample of doing it this way can be found at
https://dzone.com/articles/spring-and-threads-taskexecutor
To ensure only one client task runs at a time, simply set the configuration to
executor.setCorePoolSize(1);
executor.setMaxPoolSize(1);
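On the worry about 'losing' clients: a different option from the executor suggestion is to keep the idle clients in a BlockingQueue and only hand one out through a method that returns it in a finally block. This is a minimal sketch using JDK classes only. ClientPool, withClient and idleCount are hypothetical names, and the generic type C stands in for DotaClientImpl.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

/** Hypothetical pool that lends each client to at most one exclusive task at a time. */
class ClientPool<C> {
    private final BlockingQueue<C> idle;

    /** Expects a non-empty collection; the queue capacity equals the number of clients. */
    ClientPool(Collection<C> clients) {
        this.idle = new ArrayBlockingQueue<>(clients.size(), false, clients);
    }

    /** Blocks until a client is free, runs the task, and always returns the client. */
    <R> R withClient(Function<C, R> task) throws InterruptedException {
        C client = idle.take();
        try {
            return task.apply(client);
        } finally {
            // Runs even if the task throws, so the client cannot be lost.
            idle.add(client);
        }
    }

    /** Number of clients currently available. */
    int idleCount() {
        return idle.size();
    }
}
```

This version assumes the exclusive task finishes before task.apply returns. If the client completes its work asynchronously, as described in the question, the idle.add(client) call would have to move into that completion callback instead.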

google spanner singleton reconnect on fault

I am creating a SpannerSingleton to stay connected for the duration of the app's life.
I'm interested in connection durability: if there is a session/connection issue, how can I recreate a session?
One idea was to spawn a new connection, raising setMaxSessions to a higher number if more than 90% of the pool is exhausted. Like the opposite of exponential backoff? But where and how can I do that? I could not find anything in the client library that would let me monitor the pool status or client count.
I went with the Bill Pugh singleton because it seemed like a good choice...
Here is what I have:
public class SpannerSingleton {
private static Spanner spanner;
private static SpannerOptions options;
private static SessionPoolOptions sessionPoolOps = SessionPoolOptions
.newBuilder()
.setMaxSessions(1000) // 1000 concurrent queries
.setMinSessions(100) // keep 100 alive
.setMaxIdleSessions(100) // how many to keep from being idle and closed
.build();
private SpannerSingleton() {
try {
options = SpannerOptions
.newBuilder()
.setSessionPoolOption(sessionPoolOps)
.build();
spanner = options.getService();
} catch (Exception e) {
e.printStackTrace();
}
}
private static class SingletonHelper{
private static final Spanner CONNECTION = new SpannerSingleton().spanner;
}
public static synchronized Spanner getSpanner() {
return SingletonHelper.CONNECTION;
}
}
I use a Factory pattern to make the dbClient
public class SpannerFactory {
private static Spanner spanner = SpannerSingleton.getSpanner();
private static DatabaseId dbId;
public static DatabaseClient getConnection(String instance) {
if (Util.isEmpty(instance)) return null;
if ("mickey".equalsIgnoreCase(instance)) {
dbId = DatabaseId.of(spanner.getOptions().getProjectId(), "instance1", "mickey");
}
if ("mouse".equalsIgnoreCase(instance)) {
dbId = DatabaseId.of(spanner.getOptions().getProjectId(), "instance1", "mouse");
}
return spanner.getDatabaseClient(dbId);
}
}
What I would like to add is something that checks the connection pool to see how close to starved we are and then recreates itself... I might be overthinking it, but what happens if the connection is disrupted?
The client library should take care of maintaining a healthy session pool; the user shouldn't have to worry about sessions/connections explicitly.
As documented for the Java client, if you set MaxSessions correctly, the client will take care of maintaining that many sessions.
At a high level, the flow would be like:
If currentSessions < MaxSessions {
if !idleSessions.empty()
use an idle session.
else
CreateNewSession.
} else {
Block/Fail based on action chosen in : ActionOnExhaustion.
}
If you want to avoid the small overhead of CreateSession during request processing, one recommended option is to set minSessions and maxSessions to the same value, matching your concurrent TPS requirement, so that that many sessions are ready to use from the start.
For additional details, such as session monitoring and keeping idle sessions alive, please refer to the documentation at: https://cloud.google.com/spanner/docs/sessions

Execute each subtask in parallel in a multithreaded environment

I am working on a library that takes a DataRequest object as input. Based on that object, I construct a URL, call our app servers using Apache HttpClient, and return the response to the customer using our library. Some customers will call the executeSync method and some will call the executeAsync method to get the data.
executeSync() - waits until I have a result, returns the result.
executeAsync() - returns a Future immediately which can be processed after other things are done, if needed.
Below is my DataClient class which has above two methods:
public class DataClient implements Client {
private final ForkJoinPool forkJoinPool = new ForkJoinPool(16);
private CloseableHttpClient httpClientBuilder;
// initializing httpclient only once
public DataClient() {
try {
RequestConfig requestConfig =
RequestConfig.custom().setConnectionRequestTimeout(500).setConnectTimeout(500)
.setSocketTimeout(500).setStaleConnectionCheckEnabled(false).build();
SocketConfig socketConfig =
SocketConfig.custom().setSoKeepAlive(true).setTcpNoDelay(true).build();
PoolingHttpClientConnectionManager poolingHttpClientConnectionManager =
new PoolingHttpClientConnectionManager();
poolingHttpClientConnectionManager.setMaxTotal(300);
poolingHttpClientConnectionManager.setDefaultMaxPerRoute(200);
httpClientBuilder =
HttpClientBuilder.create().setConnectionManager(poolingHttpClientConnectionManager)
.setDefaultRequestConfig(requestConfig).setDefaultSocketConfig(socketConfig).build();
} catch (Exception ex) {
// log error
}
}
@Override
public List<DataResponse> executeSync(DataRequest key) {
List<DataResponse> responsList = null;
Future<List<DataResponse>> responseFuture = null;
try {
responseFuture = executeAsync(key);
responsList = responseFuture.get(key.getTimeout(), key.getTimeoutUnit());
} catch (TimeoutException | ExecutionException | InterruptedException ex) {
responsList =
Collections.singletonList(new DataResponse(DataErrorEnum.CLIENT_TIMEOUT,
DataStatusEnum.ERROR));
responseFuture.cancel(true);
// logging exception here
}
return responsList;
}
@Override
public Future<List<DataResponse>> executeAsync(DataRequest key) {
DataFetcherTask task = new DataFetcherTask(key, this.httpClientBuilder);
return this.forkJoinPool.submit(task);
}
}
Below is my DataFetcherTask class which also has a static class DataRequestTask which calls our app servers by making URL:
public class DataFetcherTask extends RecursiveTask<List<DataResponse>> {
private final DataRequest key;
private final CloseableHttpClient httpClientBuilder;
public DataFetcherTask(DataRequest key, CloseableHttpClient httpClientBuilder) {
this.key = key;
this.httpClientBuilder = httpClientBuilder;
}
@Override
protected List<DataResponse> compute() {
// Create subtasks for the key and invoke them
List<DataRequestTask> requestTasks = requestTasks(generateKeys());
invokeAll(requestTasks);
// All tasks are finished if invokeAll() returns.
List<DataResponse> responseList = new ArrayList<>(requestTasks.size());
for (DataRequestTask task : requestTasks) {
try {
responseList.add(task.get());
} catch (InterruptedException | ExecutionException e) {
Thread.currentThread().interrupt();
return Collections.emptyList();
}
}
return responseList;
}
private List<DataRequestTask> requestTasks(List<DataRequest> keys) {
List<DataRequestTask> tasks = new ArrayList<>(keys.size());
for (DataRequest key : keys) {
tasks.add(new DataRequestTask(key));
}
return tasks;
}
// In this method I am making a HTTP call to another service
// and then I will make List<DataRequest> accordingly.
private List<DataRequest> generateKeys() {
List<DataRequest> keys = new ArrayList<>();
// use key object which is passed in contructor to make HTTP call to another service
// and then make List of DataRequest object and return keys.
return keys;
}
/** Inner class for the subtasks. */
private static class DataRequestTask extends RecursiveTask<DataResponse> {
private final DataRequest request;
public DataRequestTask(DataRequest request) {
this.request = request;
}
@Override
protected DataResponse compute() {
return performDataRequest(this.request);
}
private DataResponse performDataRequest(DataRequest key) {
MappingHolder mappings = DataMapping.getMappings(key.getType());
List<String> hostnames = mappings.getAllHostnames(key);
for (String hostname : hostnames) {
String url = generateUrl(hostname);
HttpGet httpGet = new HttpGet(url);
httpGet.setConfig(generateRequestConfig());
httpGet.addHeader(key.getHeader());
try (CloseableHttpResponse response = httpClientBuilder.execute(httpGet)) {
HttpEntity entity = response.getEntity();
String responseBody =
TestUtils.isEmpty(entity) ? null : IOUtils.toString(entity.getContent(),
StandardCharsets.UTF_8);
return new DataResponse(responseBody, DataErrorEnum.OK, DataStatusEnum.OK);
} catch (IOException ex) {
// log error
}
}
return new DataResponse(DataErrorEnum.SERVERS_DOWN, DataStatusEnum.ERROR);
}
}
}
For each DataRequest object there is a DataResponse object. Now once someone calls our library by passing DataRequest object, internally we make List<DataRequest> object and then we invoke each DataRequest object in parallel and return List<DataResponse> back where each DataResponse object in the list will have response for corresponding DataRequest object.
Below is the flow:
Customer will call DataClient class by passing DataRequest object. They can call executeSync() or executeAsync() method depending on their requirements.
Now in the DataFetcherTask class (which is a RecursiveTask one of ForkJoinTask's subtypes), given a key object which is a single DataRequest, I will generate List<DataRequest> and then invokes each subtask in parallel for each DataRequest object in the list. These subtasks are executed in the same ForkJoinPool as the parent task.
Now in the DataRequestTask class, I am executing each DataRequest object by making an URL and return its DataResponse object back.
Problem Statement:
Since this library is called in a very high-throughput environment, it has to be very fast. For the synchronous call, is executing in a separate thread OK here? It incurs the extra cost and resources of a thread, along with the cost of context switches, so I am a little confused. I am also using ForkJoinPool here, which saves me from using an extra thread pool, but is it the right choice?
Is there a better, more efficient way to do the same thing? I am using Java 7 and have access to the Guava library as well, so if that can simplify anything I am open to it.
It looks like we are seeing some contention under very heavy load. Is there any way this code can run into thread contention under very heavy load?
I think in your situation it's better to use an asynchronous HTTP call; see HttpAsyncClient. Then you don't need a thread pool.
In the executeAsync method, create an empty CompletableFuture<DataResponse> and pass it to the client call; in the callback, set the result by calling complete on it (or completeExceptionally if an exception is raised).
The executeSync method implementation looks good.
edit:
For Java 7, you only need to replace CompletableFuture with a promise-style implementation from Guava, such as SettableFuture (which implements ListenableFuture), or anything similar.
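The callback-to-future idea described in this answer can be sketched roughly as follows. Note that CompletableFuture requires Java 8, while the question mentions Java 7. AsyncFetcher and Callback are hypothetical interfaces standing in for whatever callback API the asynchronous HTTP client exposes; they are not HttpAsyncClient types, and the String body is a placeholder for DataResponse.

```java
import java.util.concurrent.CompletableFuture;

class CallbackBridge {

    /** Hypothetical callback interface; a real async client defines its own. */
    interface Callback {
        void onSuccess(String body);
        void onFailure(Exception error);
    }

    /** Hypothetical callback-style client: starts a request and returns immediately. */
    interface AsyncFetcher {
        void fetch(String url, Callback callback);
    }

    /** Adapts the callback API to a future without tying up a pool thread per request. */
    static CompletableFuture<String> executeAsync(AsyncFetcher fetcher, String url) {
        CompletableFuture<String> result = new CompletableFuture<>();
        fetcher.fetch(url, new Callback() {
            @Override
            public void onSuccess(String body) {
                result.complete(body); // normal completion
            }

            @Override
            public void onFailure(Exception error) {
                result.completeExceptionally(error); // propagates the failure to callers
            }
        });
        return result;
    }
}
```

A synchronous variant can then call get(timeout, unit) on the returned future, much like the executeSync method in the question already does.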
The choice to use the ForkJoinPool is correct; it's designed for efficiency with many small tasks:
A ForkJoinPool differs from other kinds of ExecutorService mainly by virtue of employing work-stealing: all threads in the pool attempt to find and execute tasks submitted to the pool and/or created by other active tasks (eventually blocking waiting for work if none exist). This enables efficient processing when most tasks spawn other subtasks (as do most ForkJoinTasks), as well as when many small tasks are submitted to the pool from external clients. Especially when setting asyncMode to true in constructors, ForkJoinPools may also be appropriate for use with event-style tasks that are never joined.
I suggest trying asyncMode = true in the constructor, since in your case the tasks are never joined:
public class DataClient implements Client {
private final ForkJoinPool forkJoinPool = new ForkJoinPool(16, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true);
...
}
For executeSync() you can use forkJoinPool.invoke(task); this is the managed way to execute a task synchronously in the pool, which makes better use of resources:
@Override
public List<DataResponse> executeSync(DataRequest key) {
DataFetcherTask task = new DataFetcherTask(key, this.httpClientBuilder);
return this.forkJoinPool.invoke(task);
}
If you can use Java 8, there is already an optimised common pool: ForkJoinPool.commonPool()
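To make the pool construction and the invoke call concrete, here is a small self-contained sketch of the same fan-out pattern with trivial work in place of the HTTP calls. FanOutDemo, SquareTask and FanOutTask are hypothetical names. The pool size of 4 is arbitrary, and squaring a number stands in for performDataRequest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class FanOutDemo {

    /** Leaf subtask; squaring stands in for one remote request. */
    static final class SquareTask extends RecursiveTask<Integer> {
        private final int n;

        SquareTask(int n) {
            this.n = n;
        }

        @Override
        protected Integer compute() {
            return n * n;
        }
    }

    /** Parent task: creates one subtask per input and gathers the results in order. */
    static final class FanOutTask extends RecursiveTask<List<Integer>> {
        private final List<Integer> inputs;

        FanOutTask(List<Integer> inputs) {
            this.inputs = inputs;
        }

        @Override
        protected List<Integer> compute() {
            List<SquareTask> subtasks = new ArrayList<>(inputs.size());
            for (int n : inputs) {
                subtasks.add(new SquareTask(n));
            }
            invokeAll(subtasks); // returns once every subtask has completed
            List<Integer> results = new ArrayList<>(subtasks.size());
            for (SquareTask subtask : subtasks) {
                results.add(subtask.join()); // already done, so join() does not block
            }
            return results;
        }
    }

    /** Synchronous entry point: invoke() submits the task and waits for its result. */
    static List<Integer> run(List<Integer> inputs) {
        ForkJoinPool pool = new ForkJoinPool(
                4, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true);
        try {
            return pool.invoke(new FanOutTask(inputs));
        } finally {
            pool.shutdown();
        }
    }
}
```

In a library like the one in the question, the pool would be created once and reused rather than built per call; it is created inside run here only to keep the sketch self-contained.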

When to use Akka and when not to?

I'm currently in a situation where using Actors actually makes things more complicated than not using them. I need to execute a lot of HTTP requests without blocking the main thread. Since this is a concurrency problem and I wanted to try something other than locks, I decided to go with Akka. Now I'm torn between two approaches.
Approach one (create new Actors when they are needed):
public class Main {
public void start() {
ActorSystem system = ActorSystem.create();
// Create 5 Manager Actors (Currently the same Actor for all but this is different in actual practise)
ActorRef managers = system.actorOf(new BroadcastPool(5).props(Props.create(Actor.class)));
managers.tell(new Message(), ActorRef.noSender());
}
}
public class Actor extends UntypedActor {
@Override
public void onReceive(Object message) throws Exception {
if (message instanceof Message) {
ActorRef ref = getContext().actorOf(new SmallestMailboxPool(10).props(Props.create(Actor.class)));
// Repeat the below 10 times
ref.tell(new Message2(), getSelf());
} else if (message instanceof Message2) {
// Execute long running Http Request
}
}
}
public final class Message {
public Message() {
}
}
public final class Message2 {
public Message2() {
}
}
Approach two (create a whole lot of actors beforehand and hope it's enough):
public class Main {
public void start() {
ActorSystem system = ActorSystem.create();
ActorRef actors = system.actorOf(new SmallestMailboxPool(100).props(Props.create(Actor.class)));
ActorRef managers = system.actorOf(new BroadcastPool(5).props(Props.create(() -> new Manager(actors))));
managers.tell(new Message(), ActorRef.noSender());
}
}
public class Manager extends UntypedActor {
private ActorRef actors;
public Manager(ActorRef actors) {
this.actors = actors;
}
@Override
public void onReceive(Object message) throws Exception {
if (message instanceof Message) {
// Repeat 10 times
actors.tell(new Message2(), getSelf());
}
}
}
public class Actor extends UntypedActor {
@Override
public void onReceive(Object message) throws Exception {
if (message instanceof Message2) {
// Http request
}
}
}
public final class Message {
public Message() {
}
}
public final class Message2 {
public Message2() {
}
}
Both approaches have upsides and downsides. Approach one makes sure new requests can always be handled and never have to wait, but it leaves behind a lot of Actors that are never going to be used. Approach two, on the other hand, reuses Actors, but it might not have enough of them at some point in the future and would then have to queue the messages.
What is the best approach to solving this, and what is the most common way people deal with it?
If you think I could be doing this sort of thing a lot better (with or without Akka), please tell me! I'm pretty new to Akka and would love to learn more about it.
Based on the given information, it looks like a typical example for task-based concurrency -- not for actor-based concurrency. Imagine you have a method for doing the HTTP request. The method fetches the given URL and returns an object without causing any data races on shared memory:
private static Page loadPage(String url) {
// ...
}
You can easily fetch the pages concurrently with an Executor. There are different kinds of Executors, e.g. you can use one with a fixed number of threads.
public static void main(String... args) throws Exception { // future.get() throws checked exceptions
ExecutorService executor = Executors.newFixedThreadPool(5);
List<Future<Page>> futures = new ArrayList<>();
// submit tasks
for (String url : args) {
futures.add(executor.submit(() -> loadPage(url)));
}
// access result of tasks (or wait until it is available)
for (Future<Page> future : futures) {
Page page = future.get();
// ...
}
executor.shutdown();
}
There is no further synchronization required. The Executor framework takes care of that.
I'd use a mixed approach: create a relatively small pool of actors beforehand and grow it when needed, but keep the pool's size limited (deny requests when there are too many connections, to avoid crashing from running out of memory).
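The mixed approach is described here for actors. In the task-based style of the earlier answer, a roughly analogous setup is a ThreadPoolExecutor with a small core size, a larger maximum, a bounded queue, and a rejection policy. The sketch below is a JDK-only illustration of that idea, not Akka code. BoundedPoolDemo, newBoundedPool and countRejections are hypothetical names, and the sizes are arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolDemo {

    /** Starts with 'core' threads, grows up to 'max', and rejects work beyond that. */
    static ThreadPoolExecutor newBoundedPool(int core, int max, int queueCapacity) {
        return new ThreadPoolExecutor(
                core, max,
                30, TimeUnit.SECONDS,                        // idle extra threads are retired
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());       // throws when saturated
    }

    /** Submits 'attempts' tasks that block until released; returns how many were rejected. */
    static int countRejections(ThreadPoolExecutor pool, int attempts) {
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulates a long-running HTTP request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // the caller decides how to deny or retry the request
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
        }
        return rejected;
    }
}
```

One detail worth knowing with this design: ThreadPoolExecutor only adds threads beyond the core size once the queue is full, so the queue capacity effectively controls when the pool starts to grow.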
