I have a cluster with 3 nodes (on different machines), and I have some "business logic" that uses a distributed lock at startup.
Sometimes, when there is more latency, every node acquires the exclusive lock successfully, because the cluster hasn't finished starting up yet and the nodes don't see each other.
Subsequently the nodes do see each other and the cluster is correctly configured with 3 nodes. I know there is a MembershipListener to capture the "member added" event, so I could execute the "business logic" again, but I would like to know whether there is a way to determine when cluster startup has properly finished, so that I can delay executing the "business logic" until the cluster is up.
I tried to use hazelcast.initial.wait.seconds, but configuring the right number of seconds isn't deterministic, and I don't know whether it also delays the member join operations.
AFAIK, there is no such thing in Hazelcast. As the cluster is dynamic, a node can join and leave at any time, so the cluster is never "complete" or "incomplete".
You can, however:
Configure an initial wait, like you described, in order to help with initial latencies
Use hazelcast.initial.min.cluster.size to define the minimum number of members Hazelcast waits for at startup
Define a minimal quorum: the minimum number of nodes for the cluster to be considered usable/healthy (see cluster quorum)
Use the PartitionService to check whether the cluster is safe, or whether there are pending migrations (see the sketch below)
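A minimal sketch combining the minimum cluster size property with the PartitionService safety check, assuming Hazelcast 3.x (package names differ slightly in 4.x):

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.PartitionService;

public class ClusterStartup {
    public static void main(String[] args) throws InterruptedException {
        Config config = new Config();
        // Block newHazelcastInstance() until at least 3 members have joined.
        config.setProperty("hazelcast.initial.min.cluster.size", "3");

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

        // Before running the startup "business logic", also wait until there
        // are no pending partition migrations and the cluster is in a safe state.
        PartitionService partitionService = hz.getPartitionService();
        while (!partitionService.isClusterSafe()) {
            Thread.sleep(1000);
        }

        // ... acquire the distributed lock and run the "business logic" here ...
    }
}
```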
I have a jvm process that wakes a thread every X minutes.
If a condition is true -> it starts a job (JobA).
Another JVM process does almost the same, but if the condition is true,
it sends a message to a message broker which triggers the job on another server (JobB).
Now, to avoid a SPOF, I want to add another instance of this machine in my cloud.
But then I want to ensure that only a single instance of JobA runs each time.
What are my options?
There are a number of patterns for solving this common problem. You need to choose based on your exact situation and depending on which factor carries more weight in your case (performance, correctness, fault tolerance, whether misfires are allowed, etc.). The two solution groups are:
The "Quartz" way: you can use the JDBC-JobStore from the Quartz library, which was (partially) designed for this very reason. It allows multiple nodes to communicate and to share state and workload between each other. This gives you a probably perfect solution at the cost of some extra coding and setting up a shared DB (9 tables, I think) between the nodes.
Alternatively, your nodes can take care of the distribution themselves: locking on a resource (a single record in a DB, for example) can be enough to decide who is in charge for that iteration of the execution (see the sketch below). Sharing previous state, however, will require a bit more work.
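As a rough illustration of the second option, here is a sketch of claiming one iteration through a single database row. The job_lock table, its columns and the 5-minute window are hypothetical, and the interval syntax varies by database:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class JobALeaderElection {

    /**
     * Tries to claim the current iteration of JobA by updating a single row
     * (hypothetical table: job_lock(id, last_run)). The UPDATE succeeds on
     * exactly one node per interval, so only that node runs the job.
     */
    public static void runIfClaimed(String jdbcUrl, Runnable jobA) throws Exception {
        try (Connection con = DriverManager.getConnection(jdbcUrl);
             Statement st = con.createStatement()) {
            int claimed = st.executeUpdate(
                "UPDATE job_lock SET last_run = CURRENT_TIMESTAMP " +
                "WHERE id = 1 AND last_run < CURRENT_TIMESTAMP - INTERVAL '5' MINUTE");
            if (claimed == 1) {
                jobA.run();   // this node is in charge for this iteration
            }                 // otherwise another node already claimed it
        }
    }
}
```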
At our company we have a server which is distributed into a few instances. The server handles user requests. Requests from different users can be processed in parallel. Requests from the same user must be executed strictly sequentially. But they can arrive at different instances due to load balancing. Currently we use Redis-based distributed locks, but this is error-prone and requires more work around concurrency than around business logic.
What I want is something like this (more like a concept):
Distinct queue for each user
Queue is named after user id
Each request is identified by a request id
Imagine two requests from the same user arriving at two different instances concurrently:
1) Each instance puts its request id into this user's queue.
2) Additionally, they both store their request ids locally.
3) Then some broker takes a request id from the top of "some_user_queue" and moves it into "some_user_queue_processing".
4) Both instances listen on "some_user_queue_processing". They peek into it and check whether it is the request id they stored locally. If yes, they do the processing. If not, they ignore it and wait.
5) When the work is done, the server deletes this id from "some_user_queue_processing".
6) Then step 3 happens again.
And all of this happens concurrently for a lot of different users (thousands of them) and their queues.
Now, I know this sounds a lot like actors, but:
We need a solution requiring as few changes as possible, to make a fast transition from locks. Akka would force us to rewrite almost everything from scratch.
We need a production-ready solution. Quasar sounds good, but it is not production-ready yet (more precisely, its Galaxy cluster isn't).
The higher-ups at my work are very conservative; they simply don't want another dependency that we'll need to support. But we already use Redis (for distributed locks), so I thought maybe it could help with this too.
Thanks
The best solution that matches the description of your problem is Redis Cluster.
Basically, the cluster solves your concurrency problem in the following way:
Two (or more) requests from the same user will always go to the same instance, assuming that you use the user id as the key and the request as the value. The value should actually be a list of requests: when you receive one, you append it to that list. In other words, that list is your queue of requests (a single one for every user).
That mapping is made possible by the design of the cluster implementation, which is based on a range of hash slots spread across all the instances.
When a set command is executed, the cluster performs a hashing operation on the key, which yields the hash slot we are going to write to, and that slot is located on a specific instance. The cluster finds the instance that owns the corresponding slot range and then performs the write.
Likewise, when a get is performed, the cluster follows the same procedure: it finds the instance that holds the key and then retrieves the value.
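A minimal sketch of the per-user queue on top of Redis Cluster, assuming the Jedis client (the "requests:" key prefix and the seed host/port are made up; any cluster-aware client works the same way):

```java
import java.util.Collections;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class UserRequestQueue {

    private final JedisCluster cluster;

    public UserRequestQueue(String anyClusterHost, int port) {
        // The client only needs one seed node; it discovers the rest of the
        // cluster and routes each key to the instance owning its hash slot.
        this.cluster = new JedisCluster(
                Collections.singleton(new HostAndPort(anyClusterHost, port)));
    }

    /** Append a request to the per-user queue (the key is derived from the user id). */
    public void enqueue(String userId, String requestId) {
        cluster.rpush("requests:" + userId, requestId);
    }

    /** Take the oldest pending request for this user, or null if there is none. */
    public String dequeue(String userId) {
        return cluster.lpop("requests:" + userId);
    }
}
```

Because all requests for a user live under a single key, ordering reduces to popping from that one list.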
The transition from locks is very easy to perform because you only need to have the instances ready (with the cluster-enabled directive set to "yes") and then run the create command from the redis-trib.rb script.
I worked with the cluster in a production environment last summer and it behaved very well.
Does Hadoop process replicas as well? For example, worker node i, in the mapper phase, processes only the data stored on that machine. After the original data (not a replica) has been processed in the mapper phase, or perhaps even before it has finished, can there be a case where machine i processes replica data stored on that machine? Or is a replica used only when some node goes down?
Yes, processing replicas can also happen, in a specific scenario called speculative execution.
If machine i takes too long to process the data block stored on that machine, the job's ApplicationMaster starts a duplicate, parallel mapper against another replica of that data block, stored on a different machine. This new speculative mapper will run on the machine j where the replica is stored.
Whichever mapper completes its execution first has its output considered; the other, slower mapper is killed and its resources are released.
By default, speculative execution is enabled. You can toggle it with the properties below.
mapreduce.map.speculative
mapreduce.reduce.speculative
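For example, a sketch of disabling both properties for a single job through the MRv2 API (the job name is arbitrary; the same keys can also be set cluster-wide in mapred-site.xml):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SpeculationConfig {
    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();
        // Disable speculative (duplicate) task attempts for mappers and reducers.
        conf.setBoolean("mapreduce.map.speculative", false);
        conf.setBoolean("mapreduce.reduce.speculative", false);
        return Job.getInstance(conf, "my-job");
    }
}
```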
In any case, no more than one replica of a data block is stored on the same machine; every replica of the data block is kept on a different machine.
The master node (JobTracker) may or may not pick the original data; in fact, it doesn't keep track of which of the 3 replicas is the original, because when the data is saved a checksum verification is done on the file and each copy is stored cleanly. When the JobTracker wants to pick a slot for the mapper, it takes many things into account, such as the number of free map slots, the overhead on a TaskTracker and, last but not least, data locality. The closest node that satisfies most of the criteria is picked; it doesn't matter whether it holds the original or a replica, and, as mentioned, that identity isn't even tracked.
Let's assume we have 3 geographically distributed data centers, A, B and C. In each of them, a Cassandra cluster is up and running. Now assume DC A can no longer gossip with B and C.
Writes to A with LOCAL_QUORUM would still succeed, but they would no longer be propagated to B and C, and vice versa.
This situation could have some very disastrous consequences...
What I'm looking for are some tips on how to rapidly detect that DC A has become 'isolated' from the other data centers (using the native Java driver).
I remember reading about push notifications, but I seem to recall they referred only to the status of the local cluster. Does anybody have any ideas? Thanks.
The first thing to note is that if A can no longer connect to B and C, hints will be stored and delivered once the network connection is restored. So for outages that do not last a long time, there is already a safety mechanism and you don't need to do anything.
For longer outages, the best practice is to run the repair command after the outage to synchronize the replicas.
That said, if you are looking for a way to determine when inter-DC communication has been disrupted, you have several options.
1) Use a tool like DataStax OpsCenter to monitor your cluster state; it will automatically discover when these sorts of events happen and log them. I also believe you can set up triggered alerts, but I'm not an expert on how OpsCenter works.
2) Use the Java driver's public Cluster register(Host.StateListener listener) method to register a callback that is invoked on node-down events; you can then determine when an entire DC goes down (see the sketch after this list).
3) Track the current state of gossip via JMX on each of the DCs; this lets you see what each datacenter thinks about the current availability of all the machines. You can do this directly or via nodetool status.
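A sketch of option 2, assuming the DataStax Java driver 3.x: the listener tracks which hosts are up per datacenter and flags a DC once every host it knows about there has been reported down (the class name and the log line are made up):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;

/** Flags a datacenter as suspect once every known host in it is reported down. */
public class DcDownListener implements Host.StateListener {

    private final Map<String, Set<Host>> upHostsByDc = new ConcurrentHashMap<>();

    @Override public void onAdd(Host host) { onUp(host); }

    @Override public void onUp(Host host) {
        upHostsByDc.computeIfAbsent(host.getDatacenter(),
                dc -> ConcurrentHashMap.<Host>newKeySet()).add(host);
    }

    @Override public void onDown(Host host) {
        Set<Host> up = upHostsByDc.get(host.getDatacenter());
        if (up != null) {
            up.remove(host);
            if (up.isEmpty()) {
                System.out.println("All known hosts are down in DC " + host.getDatacenter());
            }
        }
    }

    @Override public void onRemove(Host host) { onDown(host); }

    @Override public void onRegister(Cluster cluster) { }

    @Override public void onUnregister(Cluster cluster) { }
}

// Registration: cluster.register(new DcDownListener());
```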
@RussS .. I don't think point (2) works when all three hosts are unreachable.
For example, I implemented a state listener and I am pointing at my cluster from my local machine. I can see that the listener gets invoked when nodes go up/down, but I don't see the listener being invoked when I unplug my Ethernet cable.
I'm having an issue; maybe you can help me.
Basically, I would like to know whether:
Quartz clustering can have its triggers changed dynamically (i.e. same config on all servers, but at a given point in time I want to change the cron expression ON A SINGLE SERVER and see this change propagated to ALL servers).
More generally, whether changes on a single server are propagated to all other servers (for example, if I stop a particular scheduler on a single node, whether all nodes stop the scheduler).
Unless you're going for the TerracottaJobStore, you probably use clustering through a database. The way it works is that scheduling data, such as Triggers and JobDetails, is saved to the database, and all Scheduler nodes synchronize on that persisted data. Therefore, a change to that data from one node is reflected on all nodes.
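For example, a sketch of changing a cron expression on one node (the trigger and group names are hypothetical); because the new trigger is written to the shared JDBC job store, the other nodes pick up the change:

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerKey;

public class RescheduleExample {

    /** Replaces the trigger's schedule with a new cron expression. */
    public static void changeCron(Scheduler scheduler, String newCron) throws Exception {
        TriggerKey key = TriggerKey.triggerKey("myTrigger", "myGroup"); // hypothetical names
        Trigger oldTrigger = scheduler.getTrigger(key);
        Trigger newTrigger = oldTrigger.getTriggerBuilder()
                .withSchedule(CronScheduleBuilder.cronSchedule(newCron))
                .build();
        // Persisted to the shared database, so every clustered node sees it.
        scheduler.rescheduleJob(key, newTrigger);
    }
}
```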
OTOH, stopping / starting / standby etc. are all management operations (as opposed to scheduling data like Triggers and JobDetails). Management state is considered node-specific and does not propagate to other nodes. According to this post, it might in the future...