Does Jedis support async operations - java

I am using Jedis (the Java client) to communicate with Redis. I have 3 Redis instances running on three different nodes. I want to "get" (read) some records from the 3 Redis instances, issue these "gets" (reads) in parallel, and then do some processing on the received data to form a final output.
What is the best way to do this in java?
One way is to create 3 threads and issue a "get" (read) in each of them (synchronously), wait for all 3 commands to complete, and then combine the results.
Does Jedis have a mechanism for issuing 3 "gets" (any command for that matter) asynchronously, with a callback feature?
I have 3 different Redis instances. So do you suggest using "ShardedJedisPipeline" (jedis/tests/ShardedJedisPipelineTest.java) for interacting with these Redis instances in parallel?
A normal Jedis pipeline (jedis/tests/PipeliningTest.java) just sends multiple commands to a single Redis instance, where they are executed one after another on the server (with all responses available at the end).
So I assume I have to use "ShardedJedisPipeline". But there are two limitations to this:
1. I want to execute a Lua script (i.e. "eval") on the 3 Redis instances in parallel.
2. I don't want sharding (the hash algorithm used by Jedis) to distribute data or decide on its own which instance data is read from. We have a different strategy for distributing data, so I want to be able to specify which Redis instance a record should be stored in and, accordingly, which instance it should be read from. keyTags seem to provide this mechanism, but I am not sure how to use them with "eval".
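(For concreteness, direct routing with a plain, non-sharded Jedis connection would look like this sketch; the host, script, and key are placeholders:)

import java.util.Collections;
import redis.clients.jedis.Jedis;

// One plain Jedis per instance gives full control over routing: "eval"
// runs on exactly the instance you pick, with no sharding involved.
public class DirectEval {
    public static void main(String[] args) {
        try (Jedis node1 = new Jedis("redis-node-1", 6379)) {
            Object result = node1.eval(
                    "return redis.call('GET', KEYS[1])",    // any Lua script
                    Collections.singletonList("some-key"),  // KEYS
                    Collections.emptyList());               // ARGV
            System.out.println(result);
        }
    }
}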

You can use pipelining as mentioned.
AsyncJedis is a work in progress and will be released with the next version of Jedis. It will be based on Netty and will be compatible with vert.x.

Until then you can roll it yourself with an ExecutorService and three Jedis instances, then await the futures that are returned.
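A minimal sketch of that roll-it-yourself approach (hosts and the key are placeholders): one task per Redis instance, each doing a blocking GET on its own thread, then awaiting the futures and combining the results.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import redis.clients.jedis.Jedis;

// Issues one GET per Redis instance in parallel and combines the results.
public class ParallelGets {
    public static void main(String[] args) throws Exception {
        List<String> hosts = Arrays.asList("redis-node-1", "redis-node-2", "redis-node-3");
        ExecutorService pool = Executors.newFixedThreadPool(hosts.size());

        List<Future<String>> futures = new ArrayList<>();
        for (String host : hosts) {
            futures.add(pool.submit(() -> {
                try (Jedis jedis = new Jedis(host, 6379)) {
                    return jedis.get("some-key"); // blocking, but on its own thread
                }
            }));
        }

        StringBuilder combined = new StringBuilder();
        for (Future<String> f : futures) {
            combined.append(f.get()); // waits for that GET to finish
        }
        pool.shutdown();
        System.out.println(combined);
    }
}

The same shape works for "eval": just replace jedis.get(...) with jedis.eval(script, keys, args).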

As of Feb 2015, Jedis apparently does not support async operations against a single Redis instance, as you need: https://github.com/xetorthio/jedis/issues/241
What I would do in your case is go ahead with 3 threads and proceed with Futures and an ExecutorService, as #Xorlev suggested above.

Related

Passing one message to just one MessageListener with Spring Data Redis

I have a service in place which monitors the key-expiry topic __keyevent@*__:expired in Redis. I am running 3 instances of the service, which means 3 message listeners.
The RedisKeyExpirationListener is set up based on the suggestion in this solution: https://developpaper.com/implementation-code-of-expired-key-monitoring-in-redis-cluster/
The above solution suggests using a distributed Redis lock to make sure the same event is not processed again by a different node. Is there a different solution that makes Redis deliver an event to just 1 node, so we get true parallel processing across the 3 nodes rather than the same event being handled by multiple nodes?
I know how to implement a distributed lock with Redis, but I want to understand whether there are specific settings to ensure the event is sent to only 1 active message listener and not to all the key-expiration listeners.
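For reference, the deduplication the linked article suggests can be as small as one atomic set-if-absent per event. A minimal sketch, assuming Spring Data Redis 2.1+ (for the setIfAbsent overload with a TTL); the key prefix and TTL are illustrative:

import java.time.Duration;
import org.springframework.data.redis.connection.Message;
import org.springframework.data.redis.connection.MessageListener;
import org.springframework.data.redis.core.StringRedisTemplate;

// Every listener instance still receives every expired-key event; only
// the node that wins the SETNX race processes it.
public class DedupingExpirationListener implements MessageListener {
    private final StringRedisTemplate redis;

    public DedupingExpirationListener(StringRedisTemplate redis) {
        this.redis = redis;
    }

    @Override
    public void onMessage(Message message, byte[] pattern) {
        String expiredKey = new String(message.getBody());
        Boolean won = redis.opsForValue()
                .setIfAbsent("processed:" + expiredKey, "1", Duration.ofMinutes(5));
        if (Boolean.TRUE.equals(won)) {
            // this node won the race: handle the event exactly once
        }
    }
}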

Redis Redisson - Strategies for workers

I am new to Redis and Redisson, but reasonably up to date with what is available now.
Mostly from here: https://github.com/redisson/redisson/wiki/9.-distributed-services#91-remote-service
The case here involves a worker on only one server out of, say, many. The worker discovers images which can be downloaded later on. They can be pushed to an executor to download later; however, that is not persistent, and so we will lose out.
Redisson offers an ExecutorService. But I was wondering: do all nodes share or pitch in to perform the work by default? Is there a way to control it so that only one gets to do the work? As for the state accessed in the Runnable / Callable, I am guessing there have to be restrictions on what can be used, since it is a closure with access to its environment? No access?
Redisson also offers something called distributed remote services. How are they different from an ExecutorService in this regard?
Another option is to push these to a Redis list / queue / deque and work off the "messages", although the executor service, I think, would allow me to keep all that logic in the same place.
What is the best approach?
What are the rules for objects inside the closure supplied in a Runnable / Callable? Must everything be fully serializable?
How do I manage the case where a worker is working and suddenly dies (nuclear)? Can I ensure that someone else gets to do the work?
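A minimal sketch of the submit side of Redisson's distributed ExecutorService (executor name, class names, and URL are placeholders). Tasks are serialized into Redis, so they outlive the submitting JVM; only application nodes that register workers for that executor name (via registerWorkers) pull and run them, which is also how you control which node does the work:

import java.io.Serializable;
import org.redisson.Redisson;
import org.redisson.api.RExecutorService;
import org.redisson.api.RedissonClient;

public class DownloadSubmitter {

    // The task must be Serializable: everything captured in its fields is
    // shipped to the worker node, so keep the fields to plain data.
    static class DownloadTask implements Runnable, Serializable {
        private final String url;
        DownloadTask(String url) { this.url = url; }
        @Override
        public void run() {
            // download the image here; this runs on whichever node
            // registered workers for the "imageDownloads" executor
        }
    }

    public static void main(String[] args) {
        RedissonClient redisson = Redisson.create(); // default: localhost:6379
        RExecutorService executor = redisson.getExecutorService("imageDownloads");
        executor.submit(new DownloadTask("http://example.com/img.png"));
        redisson.shutdown();
    }
}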

Spark - How to create a variable that is different for each executor context?

My Spark application launches several executors.
I have several partitions that get spread over my executors.
When using map() on these partitions, I want to use a MongoDB connection (MongoDB Java Driver) and query more data from there, process this data and return it as the output of the map() function.
I want to create one connection per executor.
Each partition should then access this executor-local variable and use it to query the data.
Establishing a connection for each partition is probably not a good idea. Broadcasting the connection won't work either because it is not serializable (I think?).
To sum it up:
How to create a variable that is different for each executor context?
You should use the MongoConnector.
It handles creating the connection and is backed by a cache that efficiently handles the shutdown of any MongoClients. It is serializable, so it can be broadcast, and it can take options, a ReadConfig, or the Spark context to configure where to connect to.
MongoConnector uses the loan pattern to handle reference management of the underlying connection to MongoDB and allows access at the MongoClient, MongoDatabase, or MongoCollection level.
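To illustrate the idea the question is after (a hypothetical sketch, not the connector's internals; MongoClientHolder is an invented name): a lazily initialized static field exists once per executor JVM, so every partition processed on that executor reuses one client. MongoConnector layers caching, configuration, and clean shutdown on top of exactly this kind of per-JVM reuse.

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

// A static field is per-JVM, i.e. per executor: calling get() from inside
// map()/mapPartitions() reuses one client per executor. The connection
// string is a placeholder.
public class MongoClientHolder {
    private static MongoClient client;

    public static synchronized MongoClient get() {
        if (client == null) {
            client = MongoClients.create("mongodb://localhost:27017");
        }
        return client;
    }
}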

Using a Java thread pool, how to process some messages serially and others in parallel depending on a message characteristic?

This is more of a Java concurrency design question. I'm working on an application that needs to process many messages for many different clients. If two messages have different client names, they can be processed in parallel. However, if they have the same client name, they need to be processed serially, in order.
What’s the best way to implement this?
My current implementation is pretty simple: I wrote a wrapper class called OrderedExecutorPool. It has a list of single-threaded executors. In its submit method, it does the following to figure out which executor to submit the task to:
int executorNum = Math.abs(clientName.hashCode()) % numExecutors;
executorList.get(executorNum).submit(task);
This ensures that all messages for the same client go to the same executor while still allowing messages for different clients to be processed in parallel.
There are a couple of problems with this design:
1.) If most client names have the same hash code, then only a few executors are doing the work
2.) If one client has MANY messages, its single executor may not keep up
Is there an elegant solution to this problem that can fix the shortcomings above?
Edit
clientName is just a String. I'm just invoking the String.hashCode() method on it.
There is no JDK built-in solution that I know of. I've implemented a custom executor solution for this at my current job using the following basic logic; a sketch follows the list:
Keep an internal map of client name to work queue (each client has its own queue).
When work comes in for a client, add it to their queue.
If this is the first job on the queue, create a Runnable for this client name/queue and push it into the "real" executor (a standard JDK thread pool).
The Runnable impl just consumes tasks from a single client queue until empty and then exits.
This simple implementation is the "greedy" approach (a client will keep working until its queue is empty). If you have more clients than underlying threads, you may want a "fairer" approach, where a client executes some bounded number of tasks and then re-queues itself in the underlying executor (thus allowing other clients to get some work done).
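A minimal sketch of that design (class and method names are illustrative, not from a library):

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// All tasks for one client run serially; distinct clients run in parallel
// on the underlying pool. A queue's presence in the map means a drainer
// is active (or about to be) for that client.
public class KeyedSerialExecutor {
    private final ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    private final Map<String, Queue<Runnable>> queues = new HashMap<>();

    public void submit(String clientName, Runnable task) {
        boolean startDrainer = false;
        Queue<Runnable> queue;
        synchronized (this) {
            queue = queues.get(clientName);
            if (queue == null) {
                queue = new ArrayDeque<>();
                queues.put(clientName, queue);
                startDrainer = true; // no drainer active for this client yet
            }
            queue.add(task);
        }
        if (startDrainer) {
            final Queue<Runnable> q = queue;
            pool.execute(() -> drain(clientName, q));
        }
    }

    private void drain(String clientName, Queue<Runnable> queue) {
        while (true) {
            Runnable next;
            synchronized (this) {
                next = queue.poll();
                if (next == null) {
                    // drained: drop the mapping so the next submit for this
                    // client schedules a fresh drainer
                    queues.remove(clientName);
                    return;
                }
            }
            next.run(); // run outside the lock so other clients make progress
        }
    }
}

For the "fair" variant, have drain() stop after N tasks and resubmit itself to the pool instead of looping until the queue is empty.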

How to parallelize multiple requests to MongoDB?

I am using a single standalone MongoDB server with no special topology like replication or sharding. Currently I have an issue where MongoDB does not support more than 500 parallel requests. Note that I am using only one instance of MongoClient, and the remaining threads are used for inserts. I am using the Java executor framework to create the threads, and these threads are used to insert data into a collection [all inserts go to the same collection].
You should queue the requests before you issue them towards the database. There is no use requesting 500 things from your database in parallel. Remember that a single request comes with some costs (memory-wise, locking-wise, and so on). You are actually wasting resources by asking your database for too much at once; note that I mean this request-wise, not data-wise.
So use a queue (or several) and pool up the requests. From that pool you feed your worker threads (let's say 5 or 10 are enough), and that's it.
Take a look at the Future interface in the java.util.concurrent package. Using asynchronous processing here looks like the approach with the highest throughput and the lowest resource impact.
But check the MongoDB driver first. I would not be surprised if they have already implemented it this way. If that is the case, you just have to limit yourself by using a queue so that only, let's say, 10 or 100 requests at a time are handled by the database driver. Do some performance checks, tweaking the number of actual requests sent to the database.
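A minimal sketch of the queue-plus-workers idea (connection string, database, and collection names are placeholders): the fixed pool's internal queue buffers the producers, so at most 10 inserts hit MongoDB concurrently. MongoCollection is thread-safe, so the workers can share one instance.

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// 500 queued inserts, but only 10 in flight against the server at once.
public class BoundedInserts {
    public static void main(String[] args) throws Exception {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> coll =
                client.getDatabase("test").getCollection("events");

        ExecutorService workers = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 500; i++) {
            final int n = i;
            workers.submit(() -> {
                coll.insertOne(new Document("n", n));
            });
        }
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);
        client.close();
    }
}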
