I am trying to implement a distributed cache in my application using Hazelcast's IMap. The problem is that every time I get a value from the map and update it, I need to call put(key, value) again. If my value object has 10 properties and I have to update all 10, then I have to call put(key, value) 10 times. Something like this:
IMap<Integer, Employee> mapEmployees = hz.getMap("employees");
Employee emp1 = mapEmployees.get(100);
emp1.setAge(30);
mapEmployees.put(100, emp1);
emp1.setSex("F");
mapEmployees.put(100, emp1);
emp1.setSalary(5000);
mapEmployees.put(100, emp1);
If I don't do it this way, some other node that operates on the same Employee object will update it, and the employee object ends up out of sync. Is there any way to avoid calling put explicitly multiple times? With a ConcurrentHashMap I don't need to do this, because if I change the object, the map is updated as well.
As of version 3.3 you'll want to use an EntryProcessor:
What you really want to do here is build an EntryProcessor<Integer, Employee> and call it using
mapEmployees.executeOnKey( 100, new EmployeeUpdateEntryProcessor(
new ObjectContainingUpdatedFields( 30, "F", 5000 )
));
This way, Hazelcast handles locking the map on the key for that Employee object and allows you to run whatever code is in the EntryProcessor's process() method atomically including updating values in the map.
So you'd implement EntryProcessor with a custom constructor that takes an object that contains all of the properties you want to update, then in process() you construct the final Employee object that will end up in the map and do an entry.setValue(). Don't forget to create a new StreamSerializer for the EmployeeUpdateEntryProcessor that can serialize Employee objects so that you don't get stuck with java.io serialization.
Source:
http://docs.hazelcast.org/docs/3.5/manual/html/entryprocessor.html
A transaction is probably what you need. Or you may want to take a look at a distributed lock.
Note that in your solution, if this code is run by two threads, the changes made by one of them will be overwritten.
This may interest you.
You could do something like this for your Employee class (simplified code with one instance variable only):
public final class Employee
implements Frozen<Builder>
{
private final int salary;
private Employee(Builder builder)
{
salary = builder.salary;
}
public static Builder newBuilder()
{
return new Builder();
}
@Override
public Builder thaw()
{
return new Builder(this);
}
public static final class Builder
implements Thawed<Employee>
{
private int salary;
private Builder()
{
}
private Builder(Employee employee)
{
salary = employee.salary;
}
public Builder withSalary(int salary)
{
this.salary = salary;
return this;
}
@Override
public Employee freeze()
{
return new Employee(this);
}
}
}
This way, to modify your cache, you would:
Employee victim = map.get(100);
map.put(100, victim.thaw().withSalary(whatever).freeze());
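Each put replaces the whole Employee in one step, so other nodes never observe a half-updated object. The get-then-put sequence as a whole is still not atomic, so a concurrent writer's change can be lost.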
If there is a possibility that another node updates the data your node is working with, then using put() will overwrite the changes made by that node. This is usually unwanted, because it leads to data loss and an inconsistent data state.
Take a look at the IMap.replace() method and the other ConcurrentMap-related methods. If replace() fails, you have hit a conflicting change. In that case, try again:
re-read entry from hazelcast
update its fields
save to hazelcast with replace
After some failed attempts you can throw StorageException to the upper level.
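The retry steps above can be sketched with the standard library alone. This is a minimal illustration, not Hazelcast code: ConcurrentHashMap stands in for IMap (both expose the ConcurrentMap replace(key, oldValue, newValue) method), and the Emp class, the raiseSalary helper and the retry budget are hypothetical names chosen for the example. How Hazelcast compares the old value may differ from equals(), so treat that part as an assumption.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable value type; equals/hashCode let replace() detect a changed entry.
final class Emp {
    final int age;
    final int salary;

    Emp(int age, int salary) {
        this.age = age;
        this.salary = salary;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Emp)) {
            return false;
        }
        Emp other = (Emp) o;
        return age == other.age && salary == other.salary;
    }

    @Override
    public int hashCode() {
        return 31 * age + salary;
    }
}

class ReplaceRetryDemo {
    static final int MAX_ATTEMPTS = 5; // assumed retry budget

    // Returns true if the raise was stored, false if the key is absent or every attempt collided.
    static boolean raiseSalary(ConcurrentMap<Integer, Emp> map, int key, int raise) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            Emp current = map.get(key);                 // re-read the entry
            if (current == null) {
                return false;
            }
            Emp updated = new Emp(current.age, current.salary + raise); // update a copy
            if (map.replace(key, current, updated)) {   // succeeds only if the entry still equals 'current'
                return true;
            }
        }
        return false; // the caller can throw its own exception at this point
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Emp> employees = new ConcurrentHashMap<>();
        employees.put(100, new Emp(30, 5000));
        System.out.println(raiseSalary(employees, 100, 500)); // prints "true"
        System.out.println(employees.get(100).salary);        // prints "5500"
    }
}
```

Building a new value object for each attempt keeps the comparison in replace() meaningful: the old instance is never mutated, so it still represents what was read.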
You should use tryLock on your map entry:
long timeout = 60; // Define your own timeout
if (mapEmployees.tryLock(100, timeout, TimeUnit.SECONDS)){
try {
Employee emp1 = mapEmployees.get(100);
emp1.setAge(30);
emp1.setSex("F");
emp1.setSalary(5000);
mapEmployees.put(100, emp1);
} finally {
mapEmployees.unlock(100);
}
}else{
// do something else like log.warn(...)
}
See: https://docs.hazelcast.com/imdg/4.2/data-structures/fencedlock#releasing-locks-with-trylock-timeout
I am developing a metrics store (a Map), which collects metrics about some operations, such as
min
max
counter
timeElapsed[] etc.
Here the key is the name of the method and the value holds the metrics about it.
Spring can help me create a singleton MetricStore object, and I am using ConcurrentHashMap to avoid race conditions when multiple REST requests come in parallel.
My questions:
1. Do I need to make the MetricStore variable store volatile, to improve visibility among multiple requests?
2. I am using Map as the declared type and ConcurrentHashMap as the implementation. Does that matter, given that Map is not thread-safe?
@Component
class MetricStore{
public Map<String, Metric> store = new ConcurrentHashMap<>();
//OR public volatile Map<String, Metric> store = new ConcurrentHashMap<>();
}
@RestController
class MetricController{
@Autowired
private MetricStore metricStore;
@PostMapping(name="put")
public void putData(String key, Metric metricData) {
if(metricStore.store.containsKey(key)) {
// update data
}
else {
metricStore.store.put(key, metricData);
}
}
@PostMapping(name="remove")
public void removeData(String key) {
if(metricStore.store.containsKey(key)) {
metricStore.store.remove(key);
}
}
}
Do I need to make the MetricStore variable store volatile?
No, because you are not changing the value of store (i.e. store can be marked as final and the code should still compile).
I am using Map as the declared type and ConcurrentHashMap as the implementation. Does that matter, given that Map is not thread-safe?
Because you're using a ConcurrentHashMap as the implementation of Map, it is thread-safe. If you want the declared type to be more specific, Map can be changed to ConcurrentMap.
The bigger issue here is that you're using containsKey before calling put and remove when you should be using compute and computeIfPresent, which are atomic operations:
@PostMapping(name="put")
public void putData(String key, Metric metricData) {
metricStore.store.compute(key, (k, v) -> {
if (v == null) {
return metricData;
}
// update data, then return the value that should stay in the map
return v;
});
}
@PostMapping(name="remove")
public void removeData(String key) {
metricStore.store.computeIfPresent(key, (k, v) -> null);
}
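To see compute and computeIfPresent working outside Spring, here is a small standalone sketch. The Metric class shown here (a call count plus a maximum) and its merge method are assumptions made for the example; the question does not show the real Metric type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable metric: a call counter and the largest value seen.
final class Metric {
    final long count;
    final long max;

    Metric(long count, long max) {
        this.count = count;
        this.max = max;
    }

    // Combines this metric with a new sample into a fresh instance.
    Metric merge(Metric sample) {
        return new Metric(count + sample.count, Math.max(max, sample.max));
    }
}

class MetricStoreDemo {
    static final Map<String, Metric> store = new ConcurrentHashMap<>();

    static void putData(String key, Metric sample) {
        // The lambda runs atomically for this key: insert if absent, otherwise merge.
        store.compute(key, (k, v) -> v == null ? sample : v.merge(sample));
    }

    static void removeData(String key) {
        // Returning null from the remapping function removes the entry.
        store.computeIfPresent(key, (k, v) -> null);
    }

    public static void main(String[] args) {
        putData("getUser", new Metric(1, 120));
        putData("getUser", new Metric(1, 80));
        Metric m = store.get("getUser");
        System.out.println(m.count + " " + m.max);          // prints "2 120"
        removeData("getUser");
        System.out.println(store.containsKey("getUser"));   // prints "false"
    }
}
```

Returning a new Metric from the lambda, rather than mutating the stored one, means readers that call get() concurrently never see a partially updated value.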
I have a field in a class that should only be accessed directly from a getter. As an example...
public class CustomerHelper {
private final Integer customerId;
private String customerName_ = null;
public CustomerHelper(Integer customerId) {
this.customerId = customerId;
}
public String getCustomerName() {
if(customerName_ == null){
// Get data from database.
customerName_ = customerDatabase.readCustomerNameFromId(customerId);
// Maybe do some additional post-processing, like casting to all uppercase.
customerName_ = customerName_.toUpperCase();
}
return customerName_;
}
public String getFormattedCustomerInfo() {
return String.format("%s: %s", customerId, getCustomerName());
}
}
So even within the class itself, a function like getFormattedCustomerInfo should not be able to read customerName_ directly. Is there a way to prevent a class from accessing one of its own fields except through the provided getter?
There is no such mechanism in Java (or at least I think there should not be). If you are sure that getFormattedCustomerInfo should be prohibited from direct access to customerName_, create another class and compose them.
I would name the second class CustomerInfoFormatter.
Also, I would rename customerName_ to customerName: the language already expresses privacy through the private modifier, so no extra naming marker is needed.
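One possible way to compose the two classes is sketched below. The CustomerDatabase interface and the CustomerName class are hypothetical names introduced for the example; only CustomerInfoFormatter comes from the suggestion above. The lazily loaded field lives in its own class, so the formatter can only reach the value through get().

```java
// Hypothetical stand-in for the database lookup used in the question.
interface CustomerDatabase {
    String readCustomerNameFromId(Integer customerId);
}

// Owns the lazily loaded field; no other class can touch it directly.
final class CustomerName {
    private final Integer customerId;
    private final CustomerDatabase database;
    private String cachedName; // loaded on first use (not thread-safe in this sketch)

    CustomerName(Integer customerId, CustomerDatabase database) {
        this.customerId = customerId;
        this.database = database;
    }

    String get() {
        if (cachedName == null) {
            cachedName = database.readCustomerNameFromId(customerId).toUpperCase();
        }
        return cachedName;
    }
}

// Holds only a CustomerName, so the getter is the single way to reach the value.
final class CustomerInfoFormatter {
    private final Integer customerId;
    private final CustomerName name;

    CustomerInfoFormatter(Integer customerId, CustomerName name) {
        this.customerId = customerId;
        this.name = name;
    }

    String getFormattedCustomerInfo() {
        return String.format("%s: %s", customerId, name.get());
    }
}

class CustomerInfoDemo {
    public static void main(String[] args) {
        CustomerDatabase db = id -> "bob"; // fake database for the example
        CustomerInfoFormatter formatter = new CustomerInfoFormatter(7, new CustomerName(7, db));
        System.out.println(formatter.getFormattedCustomerInfo()); // prints "7: BOB"
    }
}
```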
It looks like you are trying to cache the database value, and want to protect against accessing a value which has yet to be cached.
If this is true, then the variable customerName_ should not exist in the CustomerHelper class; the cached value should exist closer to the database.
The method customerDatabase.readCustomerNameFromId(customerId) should first look at a cache, and if the cache is empty, call the database and cache the result.
Effectively, customerName_ becomes a value in the cache: Map<Integer, String> cache where the key is customerId.
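A minimal sketch of that idea follows, assuming the database call can be represented as a Function<Integer, String>. The class name CachingCustomerDatabase is hypothetical; the method name mirrors the one in the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Wraps a slow lookup with a per-id cache.
class CachingCustomerDatabase {
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    private final Function<Integer, String> slowLookup; // hypothetical database call

    CachingCustomerDatabase(Function<Integer, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    String readCustomerNameFromId(Integer customerId) {
        // computeIfAbsent calls slowLookup only when the id is not cached yet.
        return cache.computeIfAbsent(customerId, slowLookup);
    }

    public static void main(String[] args) {
        AtomicInteger databaseCalls = new AtomicInteger();
        CachingCustomerDatabase db = new CachingCustomerDatabase(id -> {
            databaseCalls.incrementAndGet();
            return "customer-" + id;
        });
        System.out.println(db.readCustomerNameFromId(42)); // prints "customer-42"
        System.out.println(db.readCustomerNameFromId(42)); // prints "customer-42"
        System.out.println(databaseCalls.get());           // prints "1"
    }
}
```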
I am measuring my application metrics using the class below, in which I increment and decrement metrics.
public class AppMetrics {
private final AtomicLongMap<String> metricCounter = AtomicLongMap.create();
private static class Holder {
private static final AppMetrics INSTANCE = new AppMetrics();
}
public static AppMetrics getInstance() {
return Holder.INSTANCE;
}
private AppMetrics() {}
public void increment(String name) {
metricCounter.getAndIncrement(name);
}
public AtomicLongMap<String> getMetricCounter() {
return metricCounter;
}
}
I am calling increment method of AppMetrics class from multithreaded code to increment the metrics by passing the metric name.
Problem Statement:
Now I want to have a metricCounter for each clientId, which is a String. The same clientId can arrive multiple times, and sometimes it will be a new clientId, so I need to look up the metricCounter map for that clientId and increment metrics on that particular map (which is the part I am not sure how to do).
What is the right way to do this, keeping in mind that it has to be thread-safe and the operations have to be atomic? I was thinking of using a map like this instead:
private final Map<String, AtomicLongMap<String>> clientIdMetricCounterHolder = Maps.newConcurrentMap();
Is this the right way? If yes, how can I populate this map with clientId as its key and the per-metric counter map as its value?
I am on Java 7.
If you use a map then you'll need to synchronize on the creation of new AtomicLongMap instances. I would recommend using a LoadingCache instead. You might not end up using any of the actual "caching" features, but the "loading" feature is extremely helpful as it will synchronize the creation of AtomicLongMap instances for you, e.g.:
LoadingCache<String, AtomicLongMap<String>> clientIdMetricCounterCache =
CacheBuilder.newBuilder().build(new CacheLoader<String, AtomicLongMap<String>>() {
@Override
public AtomicLongMap<String> load(String key) throws Exception {
return AtomicLongMap.create();
}
});
Now you can safely start updating metric counts for any client without worrying about whether the client is new or not, e.g.:
clientIdMetricCounterCache.get(clientId).incrementAndGet(metricName);
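If you would rather stay with a plain map, the creation race mentioned above can also be handled with ConcurrentMap.putIfAbsent, which works on Java 7. The sketch below swaps Guava's AtomicLongMap for a standard-library ConcurrentMap of AtomicLong counters; the class name ClientMetrics and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class ClientMetrics {
    // clientId -> (metric name -> counter)
    private final ConcurrentMap<String, ConcurrentMap<String, AtomicLong>> byClient =
            new ConcurrentHashMap<String, ConcurrentMap<String, AtomicLong>>();

    long increment(String clientId, String metricName) {
        ConcurrentMap<String, AtomicLong> counters = byClient.get(clientId);
        if (counters == null) {
            ConcurrentMap<String, AtomicLong> fresh = new ConcurrentHashMap<String, AtomicLong>();
            // putIfAbsent returns the existing map if another thread won the race.
            counters = byClient.putIfAbsent(clientId, fresh);
            if (counters == null) {
                counters = fresh;
            }
        }
        AtomicLong counter = counters.get(metricName);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong();
            counter = counters.putIfAbsent(metricName, fresh);
            if (counter == null) {
                counter = fresh;
            }
        }
        return counter.incrementAndGet();
    }

    long get(String clientId, String metricName) {
        ConcurrentMap<String, AtomicLong> counters = byClient.get(clientId);
        AtomicLong counter = counters == null ? null : counters.get(metricName);
        return counter == null ? 0L : counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        final ClientMetrics metrics = new ClientMetrics();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int n = 0; n < 1000; n++) {
                        metrics.increment("client-1", "requests");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(metrics.get("client-1", "requests")); // prints "4000"
    }
}
```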
A Map<String, Map<String, T>> is just a Map<Pair<String, String>, T> in disguise. Create a MultiKey class:
class MultiKey {
public String clientId;
public String name;
// be sure to add hashCode and equals
}
Then just use an AtomicLongMap<MultiKey>.
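A composite key only works if equals and hashCode are defined over both fields. Here is one way MultiKey could be completed, shown as an immutable variant with a two-argument constructor (an assumption; the class above uses public mutable fields). A plain HashMap stands in for AtomicLongMap so the example needs no extra library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Immutable composite key: two keys are equal when both parts are equal.
final class MultiKey {
    final String clientId;
    final String name;

    MultiKey(String clientId, String name) {
        this.clientId = clientId;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof MultiKey)) {
            return false;
        }
        MultiKey other = (MultiKey) o;
        return Objects.equals(clientId, other.clientId) && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(clientId, name);
    }
}

class MultiKeyDemo {
    public static void main(String[] args) {
        Map<MultiKey, Long> counts = new HashMap<MultiKey, Long>();
        counts.put(new MultiKey("client-1", "requests"), 3L);
        // A separately constructed but equal key finds the same entry.
        System.out.println(counts.get(new MultiKey("client-1", "requests"))); // prints "3"
        System.out.println(counts.get(new MultiKey("client-2", "requests"))); // prints "null"
    }
}
```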
Edited:
Provided the set of metrics is well defined, it wouldn't be too hard to use this data structure to view metrics for one client:
Set<String> possibleMetrics = // all the possible values for "name"
Map<String, Long> getMetricsForClient(String client) {
return Maps.asMap(possibleMetrics, m -> metrics.get(new MultiKey(client, m)));
}
The returned map will be a live unmodifiable view. It might be a bit more verbose if you're using an older Java version, but it's still possible.
I have two components:
The manager, on which add(Data) can be called. This will add some data to the manager.
The clients, which can call retrieve(predicate) on the manager. A list of the Data objects that match the given predicate is returned. If there is no such data, retrieve keeps waiting.
A typical blocking priority queue cannot be used here, since a client is not interested in every new object. Only the objects that satisfy its predicate are useful to it.
How can this be implemented in Java? I could get it working with an x.notifyAll() call after each call to add(Data) in the manager, and an x.wait() in the retrieve(predicate) method. I was wondering whether the java.util.concurrent package has higher-level facilities that can be used for this problem.
Here is an outline of something that may give you an idea. For simplicity I am going to assume that predicates and data are strings.
As you stated you do not know your predicates ahead of time so I would try to dynamically update and cache based on new incoming predicates.
Manager
public class Manager {
private Map<String, Set<String>> jobs = new HashMap<>();
private Set<String> knownPredicates = new HashSet<>();
private final static String GENERAL = "GENERAL_DATA";
public synchronized void addJob(String data){ // synchronized because jobs and knownPredicates are shared with retrieve()
Set<String> matchingPredicates = getMatchingPredicates(data);
if(matchingPredicates.isEmpty()){
updateJobs(GENERAL, data);
} else {
for(String predicate: matchingPredicates){
updateJobs(predicate, data);
}
}
synchronized(this){
notifyAll();
}
}
private Set<String> getMatchingPredicates(String data){
Set<String> matchingPredicates = new HashSet<>();
for(String knownPredicate: knownPredicates){
// Check if the data matched the predicate. If so add it to the list
}
return matchingPredicates;
}
private void updateJobs(String predicate, String data){
Set<String> dataList;
if(jobs.containsKey(predicate)){
dataList = jobs.get(predicate);
} else {
dataList = new HashSet<>();
}
dataList.add(data);
jobs.put(predicate, dataList);
}
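public synchronized Set<String> retrieve(String predicate){
Set<String> jobsToReturn = new HashSet<>();
knownPredicates.add(predicate);
if(jobs.containsKey(predicate)){
jobsToReturn = jobs.remove(predicate);
}
// Data that matched no known predicate when it was added is stored under GENERAL
Set<String> generalData = jobs.get(GENERAL);
if(generalData != null){
for(String unknownData: generalData){
//Check if unknownData matches the new predicate; if it does, add it to jobsToReturn
}
}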
cleanupData(jobsToReturn);
return jobsToReturn;
}
//Removes data that may match more than one predicate
private void cleanupData(Set<String> dataSet){
for(String data: dataSet){
for(Set <String> predicateSet: jobs.values()){
predicateSet.remove(data);
}
}
}
}
Client
public class Client implements Runnable{
private Manager managerRef;
public Client(Manager m){
managerRef = m;
}
public void run() {
while(true){
String predicate = //Get the predicate somehow
Set<String> workToDo = managerRef.retrieve(predicate);
if(workToDo.isEmpty()){
synchronized(managerRef){
try {
managerRef.wait();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return;
}
}
} else {
//Do something
}
}
}
}
The above is only a skeleton, though. You would still have to resolve some issues, such as when to clear your known predicates.
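As for the higher-level facilities asked about in the question, java.util.concurrent.locks offers ReentrantLock and Condition, which express the same wait/notify idea with an explicit lock. The following is a minimal sketch under that assumption, not a tuned implementation: PredicateManager is a hypothetical name, and every waiting client is woken on each add so it can re-check its own predicate.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Predicate;

class PredicateManager<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition dataAdded = lock.newCondition();
    private final List<T> items = new ArrayList<>();

    void add(T item) {
        lock.lock();
        try {
            items.add(item);
            dataAdded.signalAll(); // wake all waiting clients so each re-checks its predicate
        } finally {
            lock.unlock();
        }
    }

    // Blocks until at least one stored item matches, then removes and returns all matches.
    List<T> retrieve(Predicate<? super T> predicate) throws InterruptedException {
        lock.lock();
        try {
            while (true) {
                List<T> matches = new ArrayList<>();
                for (Iterator<T> it = items.iterator(); it.hasNext(); ) {
                    T item = it.next();
                    if (predicate.test(item)) {
                        matches.add(item);
                        it.remove();
                    }
                }
                if (!matches.isEmpty()) {
                    return matches;
                }
                dataAdded.await(); // releases the lock while waiting, so no add is missed
            }
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final PredicateManager<String> manager = new PredicateManager<>();
        Thread producer = new Thread(() -> {
            manager.add("apple");
            manager.add("banana");
        });
        producer.start();
        // Returns once an item starting with "b" is available, whether it arrives before or after this call.
        System.out.println(manager.retrieve(s -> s.startsWith("b"))); // prints "[banana]"
        producer.join();
    }
}
```

Because the check and the wait happen under the same lock, a client cannot miss an add that arrives between finding no match and going to sleep, which is a risk when the check and the wait() are in separate synchronized sections.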
You might need to consider implementing predicate-based caching with the following behavior:
If 'retrieve(predicate)' method has never been called and 'add(Data)' method is executed, a new Data object is simply added to the manager and cache remains empty.
If 'retrieve(predicate)' method is called, the client checks the cache for the requested predicate in order to retrieve references to the corresponding Data objects. If cache is empty or no match has been found, the system runs a search on the specified predicate against all Data objects in the manager and updates the cache. To improve the performance, if no match found, flag this up in the cache so that the subsequent queries for the same predicate are returned faster.
If 'add(Data)' method is called and cache isn't empty, the Data object being added is scanned for all predicates already in the cache and the matching objects are associated by a reference with the corresponding predicates in the cache.
Note that, as with any caching mechanism, it will be slower at the start but will improve as more objects fill up the cache.
I know how to make a collection unmodifiable in Java, but I don't understand why such a method needs to exist. Can someone please explain this with an example scenario where I would want to make my collection unmodifiable?
Thanks in advance
The most efficient way to share private data outside of a class is to simply return it. But then something outside of the class can change the data that the class depends on. Another option is to copy the data before you share. This takes time and memory to do. Unmodifiable collections will often wrap the data and simply present it without allowing an outside class to edit it. This is faster than making a copy. An outside class can optionally make a modifiable copy if it needs.
An unmodifiable collection is basically read-only which is exactly what you want in case you need to publish such collection to client code and you don't want the client code to modify the collection.
It also promotes immutability, which is generally a good thing, since you won't have to care about the state of the collection for the rest of the execution of your program. See item 15 of Effective Java (2nd Edition), "Minimize mutability". To quote Joshua Bloch:
Immutable objects are simple. An immutable object can be in exactly one state, the state in which it was created.
Note that an unmodifiable collection will not make the contained objects immutable. This is a property each of the contained objects needs to make sure of, if it is required of course.
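A short sketch of both points, using a hypothetical holder class: the wrapper returned by Collections.unmodifiableList rejects structural changes and is a live view rather than a copy, yet the elements it exposes can still be mutated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewDemo {
    private final List<StringBuilder> names = new ArrayList<>();

    void addName(String name) {
        names.add(new StringBuilder(name));
    }

    // Read-only wrapper around the private list; no elements are copied.
    List<StringBuilder> getNames() {
        return Collections.unmodifiableList(names);
    }

    public static void main(String[] args) {
        UnmodifiableViewDemo holder = new UnmodifiableViewDemo();
        holder.addName("Peter");
        List<StringBuilder> view = holder.getNames();

        try {
            view.add(new StringBuilder("Paul"));
        } catch (UnsupportedOperationException e) {
            System.out.println("add rejected");   // prints "add rejected"
        }

        holder.addName("Paul");
        System.out.println(view.size());           // prints "2": the view reflects later changes

        view.get(0).append("!");                   // the element itself is still mutable
        System.out.println(view.get(0));           // prints "Peter!"
    }
}
```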
Take a look at this scenario. There is an application that creates 2 users, and then wants to notify them about something. But only users with name different from Peter should get the notification.
So we have the User class:
public class User {
private String name;
private Integer id;
public User(final Integer id, final String name) {
this.id = id;
this.name = name;
}
public String getName() {
return name;
}
public Integer getId() {
return id;
}
}
The users are stored in special holder class (containing map):
public class UsersHolder {
private static Map<Integer, User> usersMap = new HashMap<Integer, User>();
public static void addUser(final User user) {
usersMap.put(user.getId(), user);
}
public static Map<Integer, User> getUsersMap() {
return usersMap;
//return Collections.unmodifiableMap(usersMap);
}
}
Then we have the UsersCreator that creates those users and stores them in a map:
public class UsersCreator {
public static void createUsers() {
UsersHolder.addUser(new User(1, "Peter"));
System.out.println("Created user " + UsersHolder.getUsersMap().get(1).getName());
UsersHolder.addUser(new User(2, "Paul"));
System.out.println("Created user " + UsersHolder.getUsersMap().get(2).getName());
}
public static void main(String[] args) {
UsersCreator.createUsers();
System.out.println("Number of users before notification: " + UsersHolder.getUsersMap().size());
new UsersNotificator().notifyAllUsersButPeters(UsersHolder.getUsersMap());
System.out.println("Number of users after notification: " + UsersHolder.getUsersMap().size());
}
}
And the notificator that notifies all but Peters:
public class UsersNotificator {
public void notifyAllUsersButPeters(final Map<Integer, User> map) {
//we don't need peters, so we'll remove them from the list;
Iterator<Entry<Integer, User>> iterator = map.entrySet().iterator();
while (iterator.hasNext()) {
if (iterator.next().getValue().getName().equals("Peter")) {
iterator.remove();
}
}
//now we can notify all from the list;
notifyUsers(UsersHolder.getUsersMap());
}
private void notifyUsers(Map<Integer, User> map) {
for (final User user : map.values())
System.out.println("notifyingUsers: " + user.getName());
}
}
Now, the notificator was handed a map that it is able to modify, and it does so. It doesn't know that it shouldn't, because this is the global usersMap. In effect it removes all users named Peter. It does this for its own purposes, but the result is visible to every other class that uses UsersHolder.
The result is as follows:
Created user Peter
Created user Paul
Number of users before notification: 2
notifyingUsers: Paul
Number of users after notification: 1
If UsersHolder returns an unmodifiableMap instead, the removal is not possible. The notificator's only option is to create a new map with the users to notify, so our UsersHolder stays safe.
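The effect can be reproduced in a few lines. In this sketch user names are plain strings instead of User objects, and the class and method names are hypothetical; it shows the removal being rejected and the notifier filtering a private copy instead.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class SafeNotifierDemo {
    // Builds the notifier's own working copy and filters that, leaving the source untouched.
    static Map<Integer, String> withoutPeters(Map<Integer, String> users) {
        Map<Integer, String> copy = new HashMap<>(users);
        copy.values().removeIf("Peter"::equals);
        return copy;
    }

    public static void main(String[] args) {
        Map<Integer, String> users = new HashMap<>();
        users.put(1, "Peter");
        users.put(2, "Paul");
        Map<Integer, String> readOnly = Collections.unmodifiableMap(users);

        Iterator<Map.Entry<Integer, String>> iterator = readOnly.entrySet().iterator();
        iterator.next();
        try {
            iterator.remove();
        } catch (UnsupportedOperationException e) {
            System.out.println("removal rejected");      // prints "removal rejected"
        }

        System.out.println(withoutPeters(readOnly));     // prints "{2=Paul}"
        System.out.println(users.size());                // prints "2"
    }
}
```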
This example is a bit long, sorry for that; I could not come up with anything shorter.
An unmodifiable map helps keep your classes immutable, which is safer (as shown in the example), especially in a multithreaded environment.
There are many situations in which you do not want your collection to be modifiable. Whenever you know that the collection is initialized with exactly the content it should contain at all times, making it unmodifiable adds safety.
The (rather long) example provided by another user shows where this often causes problems. Whenever you traverse a collection, there is a risk of changing it if you forget to work on a copy. Making the collection unmodifiable catches and prevents this easy-to-make mistake.