Is it possible to create a client-only Hazelcast node? We have Hazelcast embedded in our Java applications and they share a common hazelcast.xml. This works fine; however, when one of our JVMs is distressed, it causes the other clustered JVMs to slow down and have issues. I want to run a Hazelcast cluster outside of our application stack and update the common hazelcast.xml to point to the external cluster. I have tried various config options, but the application JVMs always want to start a listener and become members. I realize I may be asking for something that defeats the purpose of Hazelcast, but I thought it might be possible to configure an instance to be a client only.
Thanks.
You can change your application to use Hazelcast client instances, but it requires a code change.
Instead of
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
you'll need to initialize your instance by requesting a client one:
HazelcastInstance hz = HazelcastClient.newHazelcastClient();
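If it helps, here is a minimal sketch of pointing such a client at the external cluster programmatically (the addresses below are placeholders; a hazelcast-client.xml on the classpath works just as well):

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

ClientConfig clientConfig = new ClientConfig();
// Addresses of the members in the external cluster (placeholders)
clientConfig.getNetworkConfig().addAddress("10.0.0.1:5701", "10.0.0.2:5701");
// If the external cluster uses a non-default cluster/group name, set it here as well
HazelcastInstance hz = HazelcastClient.newHazelcastClient(clientConfig);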
Another option is to keep the code unchanged and configure your embedded members as "lite" members, so they don't own any partitions (i.e., they don't store cluster data).
<hazelcast xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.hazelcast.com/schema/config
                               http://www.hazelcast.com/schema/config/hazelcast-config-4.0.xsd">
    <!--
    ===== HAZELCAST LITE MEMBER CONFIGURATION =====
    Configuration element's name is <lite-member>. When you want to use a Hazelcast member as a lite member,
    set this element's "enabled" attribute to true in that member's XML configuration. Lite members do not store
    data and are used mainly to execute tasks and register listeners. They do not have partitions.
    -->
    <lite-member enabled="true"/>
</hazelcast>
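If your members are configured programmatically rather than through hazelcast.xml, the same idea as a sketch (Config.setLiteMember is available in recent Hazelcast versions):

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

Config config = new Config();
config.setLiteMember(true); // joins the cluster but owns no partitions, i.e. stores no data
HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);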
Let's say I have a Spring Boot application running in AWS ECS. Let's further suppose that Spring Cloud Config Server is overkill, and we set all application properties via environment variables defined in the current task definition.
E.g. in application.yml:
db:
  url: ${DB_URL}
Let's also assume that the task definition pulls the necessary config values from AWS Parameter Store.
If I update the corresponding DB_URL value in AWS Parameter Store, is there any reasonable way for the Spring application to see this new value short of starting up a new container?
My hunch would be that, with the container already running, the values specified by the task definition were baked into the container when it was created.
(I realize even if the updated value was visible that there's still the matter of properly updating the affected resource(s).)
Another thought might be to use AWS Secrets Manager as it seems to have the client-side caching library (https://github.com/aws/aws-secretsmanager-caching-java), but then all configuration values would have to be stored there instead of AWS Parameter Store.
I'm pretty sure I know the answer, but I wanted to make sure I'm not missing anything: Is there any other way to accomplish what's being asked besides the above? Or is the creation of a new container the only way (unless I want to switch to using, say, Spring Cloud Config Server)?
Thank you in advance!
Recreating the container is the only way to update the environment variables. This generally isn't an issue: ECS will spin up the new container and start sending traffic to it while draining connections from the old one, so your application won't be down during the process.
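For completeness, a hedged sketch of forcing that replacement without touching the task definition, assuming an AWS SDK for Java v1 version recent enough to expose the UpdateService API's forceNewDeployment flag (cluster and service names are placeholders):

import com.amazonaws.services.ecs.AmazonECS;
import com.amazonaws.services.ecs.AmazonECSClientBuilder;
import com.amazonaws.services.ecs.model.UpdateServiceRequest;

public class ForceEcsRedeploy {
    public static void main(String[] args) {
        AmazonECS ecs = AmazonECSClientBuilder.defaultClient();
        // Replaces the running tasks so they pick up the latest values the
        // task definition pulls from Parameter Store
        ecs.updateService(new UpdateServiceRequest()
                .withCluster("my-cluster")   // placeholder
                .withService("my-service")   // placeholder
                .withForceNewDeployment(true));
    }
}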
Below is my scenario from an application perspective.
We have 2 applications (.war files) that will be running in the same instance of an application server (mostly Tomcat 8). In production we may deploy App1 on 100 servers and App2 on only 50 of those 100 (App2 does not need to be distributed as widely).
These 2 applications (.war files) depend on a common custom jar (some utility classes).
I am planning to use the JCache API with the Hazelcast implementation in our apps. I have added the following dependencies in my pom.xml:
<!-- JSR 107 JCache -->
<dependency>
    <groupId>javax.cache</groupId>
    <artifactId>cache-api</artifactId>
    <version>1.0.0</version>
</dependency>

<!-- Hazelcast dependency -->
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.4</version>
</dependency>
The plan is to write a utility CacheManager in this common custom jar, which will be shared by App1 and App2.
I am planning to use only the Hazelcast server provider, since I am caching in-process, i.e. the cache will live in the application's memory.
Below is a snippet of my code:
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.spi.CachingProvider;

public class PPCacheManager {

    // Loads the default CachingProvider (Hazelcast), which reads the hazelcast.xml on the classpath
    private static CachingProvider defaultCachingProvider = Caching.getCachingProvider();

    // Obtains the default CacheManager from that provider
    private static CacheManager defaultCacheManager = defaultCachingProvider.getCacheManager();

    // Some more code goes here...
}
My hazelcast.xml:
<hazelcast xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.hazelcast.com/schema/config
                               http://www.hazelcast.com/schema/config/hazelcast-config-3.4.xsd"
           xmlns="http://www.hazelcast.com/schema/config">
    <cache name="commonClientCache">
        <key-type class-name="java.lang.String"/>
        <value-type class-name="java.lang.Object"/>
        <statistics-enabled>true</statistics-enabled>
        <management-enabled>true</management-enabled>
        <read-through>true</read-through>
    </cache>
</hazelcast>
Now I have several questions about this approach.
Is this a good way to implement in-memory caching (currently we are not looking for clustered caching)? Should this code live in the common custom jar or somewhere else?
There is some master data from the DB that I am planning to load (both applications need this data), but I am not sure how and where I should load it into memory. Note: I do not want lazy loading; I want to load this master data up front.
Where should I add the cache shutdown code to avoid memory leak issues, given that this cache is shared by both applications?
Update
Also, by implementing this approach will I have 2 copies of the cache (one per application), or will a single copy be shared across both?
I have already implemented this approach in my application, and from the Hazelcast management console I can see that only 1 cache is created, but it shows GET being executed on this cache twice.
Hazelcast is the perfect solution for what you are trying to do. Definitely no lazy loading. You don't need anything like that if you have shared memory.
As far as how many instances you'd have inside one (Tomcat) JVM: you'd have two if you instantiate Hazelcast twice; it will auto-increment the port. However, both will belong to the same cluster (what you call the "cache"), as long as the cluster name is the same. So other than looking a little silly (sharding within a single JVM), you are fine. To avoid it, you can configure one of the wars to instantiate a HazelcastClient instead; the utility jar can stay the same. The instantiation should live in some per-war configuration, e.g. a Spring config that each war has its own copy of, or you can put that config into two external directories and add them to the Catalina classpath.
The shutdown code belongs in the same place you instantiated Hazelcast, i.e. your two wars will have two shutdown calls. You can do it in Spring's destroy() of one of your high-level config (or autowired) beans, or put it in the web app's servlet context listener.
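As a sketch of that second option (the class name is illustrative; adapt to your Spring or plain Servlet setup), each war could shut down its instance(s) when the web app is stopped:

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;
import com.hazelcast.core.Hazelcast;

@WebListener
public class HazelcastShutdownListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // nothing to do here; the instance is created elsewhere (e.g. by PPCacheManager)
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        // Shuts down every Hazelcast instance created by this web app,
        // so Tomcat does not leak threads on undeploy/redeploy
        Hazelcast.shutdownAll();
    }
}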
I have a Java web application I'm trying to refactor to work with the Elastic Beanstalk way of doing things. The application will be load balanced and have (for the moment) 2 hosts without taking any advantage of auto-scaling. The issue is that there are slight configuration differences between the nodes; in particular, authenticating to certain web services is done with different credentials to effectively double throughput, as there are per-account throttling restrictions.
Currently my application keeps configuration separate from the archive, so things are relatively simple on fixed hosts: the configuration stays at a static file path and deploying the war files is all that is required.
Going down the Elastic Beanstalk path, I think I'll have to include all the configuration options inside the deployable artifact and somehow get the application to load the relevant host-specific configuration. The problem I have is deciding which configuration to load inside the application. I could use a physical attribute of the host, i.e. an IP address or instance ID, to select the relevant config:
/config-<InstanceID-1>.properties
/config-<InstanceID-2>.properties
This approach is flawed, though, given that creating an entirely new environment in Beanstalk would require me to update all the configuration files in the project to reflect the newly created instance IDs.
Has anyone come up with a good way of doing this in beanstalk?
If you have to have two different types of nodes, then you should consider SOA architecture for your application.
Create two environments, environment-a and environment-b. Either set all properties for the environments through the AWS web console, or reuse your existing configuration files and just set the specific configuration file name for each environment.
#environment-a
PARAM1 = config-environment-a.properties
#environment-b
PARAM1 = config-environment-b.properties
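On the application side, a minimal sketch of consuming that parameter (assuming PARAM1 reaches the JVM as a system property or environment variable, which depends on the Beanstalk container version; the class name and fallback are illustrative):

import java.io.InputStream;
import java.util.Properties;

public class EnvironmentConfigLoader {

    public static Properties load() throws Exception {
        // PARAM1 holds the per-environment properties file name set in the Beanstalk console
        String fileName = System.getProperty("PARAM1",
                System.getenv().getOrDefault("PARAM1", "config-environment-a.properties"));
        Properties props = new Properties();
        try (InputStream in = EnvironmentConfigLoader.class.getResourceAsStream("/" + fileName)) {
            props.load(in);
        }
        return props;
    }
}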
You share the same code base and push to either environment with the -e modifier.
#push to environment-a
$ git aws.push -e environment-a
#push to environment-b
$ git aws.push -e environment-b
You can also create a git alias to push to both environments at the same time :-)
Now, the major benefit of the SOA approach is that you can scale and manage those environments separately. It is simple and elegant.
If you want something more complex and less elegant, use a simple token-distribution service. On environment initialization, send two messages to Amazon SQS, each containing a configuration name. Then pull those messages from SQS; each instance will get exactly one from the queue. Whichever configuration name the message contains, configure your node with that configuration. :-)
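A rough sketch of the receiving side with the AWS SDK for Java v1 (queue URL and class name are illustrative):

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;
import java.util.List;

public class ConfigTokenReceiver {

    // Pulls exactly one configuration-name message from the queue, or null if none is left
    public static String receiveConfigName(String queueUrl) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        ReceiveMessageRequest request = new ReceiveMessageRequest(queueUrl)
                .withMaxNumberOfMessages(1)
                .withWaitTimeSeconds(20); // long polling
        List<Message> messages = sqs.receiveMessage(request).getMessages();
        if (messages.isEmpty()) {
            return null;
        }
        Message message = messages.get(0);
        sqs.deleteMessage(queueUrl, message.getReceiptHandle()); // claim the token
        return message.getBody(); // e.g. "config-environment-a.properties"
    }
}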
Hope it helps.
Update after #vcetinick comment:
All still seems rather complex for what should be pretty simple.
That's why I suggested separate environments. You could make your own registration service: when a node comes up, it registers with the service and in return gets its configuration params. You keep the available configurations in a persistent DB. If a node dies and the service gets another registration request, the registration service can quickly check all registered nodes (because they all left their info during registration), and if any node is not responding, its configuration data is reassigned to the new node. And now you have a single point of failure on your hands :-)
Again, there might be other ways to approach that problem.
I have a web-application running on Google AppEngine.
I have a single PRODUCTION environment, a STAGING env, and multiple development & QA envs. There are many configuration parameters that should differ between PRODUCTION and the other environments, such as API keys for services we integrate with (Google Analytics, for example). Some of those parameters are defined in code, others are defined in web.xml (inside the init-param tag of a Filter, for example), and there are other cases as well.
I know that there are a couple of approaches to do so:
Saving all parameters in the datastore (and possibly caching them in each running instance / memcache)
Deploying the applications with different system-properties / environment-variables in the web.xml
Other options...?
Anyway, I'm interested to hear your best practices for resolving this issue.
My favorite approach is to store them all in the datastore, keep a single master record with all the different properties, and make good use of memcache. That way you don't need different configuration files or to pollute your code with different configuration settings. Instead, you can deploy once and change these values from an administrative form that you create for updating this master record.
Also, if you are storing tokens and secret keys, you are aware that it is definitely not a good idea to have them in web.xml or anywhere else in the code; it is better to keep them per application in something more secure, like the datastore.
Once you have that, you can have one global function that retrieves properties by name. If you want to get the Google Analytics ID from anywhere in your app, you would use something like this:
getProperty('googleAnalyticsID')
where this global getProperty() function will try to find this value with these steps:
Check if it exists in memcache and return it
If not in memcache, update memcache from the master entity in the datastore and return it
If not in the datastore, create the entity with default values, update memcache and return
Of course, there are different approaches to retrieving data from that model, but the idea is the same: store everything in one record and use memcache.
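For illustration, a sketch of that getProperty() flow using the App Engine Java APIs (the entity kind, key name, and default value are placeholders):

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class AppConfig {

    private static final Key MASTER_KEY = KeyFactory.createKey("Config", "master");

    public static Object getProperty(String name) {
        // 1. Check memcache first
        MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
        Object cached = memcache.get(name);
        if (cached != null) {
            return cached;
        }
        // 2. Fall back to the single master entity in the datastore
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Entity master;
        try {
            master = datastore.get(MASTER_KEY);
        } catch (EntityNotFoundException e) {
            master = new Entity(MASTER_KEY); // 3. Create it if it does not exist yet
        }
        Object value = master.getProperty(name);
        if (value == null) {
            value = "";                      // default value (placeholder)
            master.setProperty(name, value);
            datastore.put(master);
        }
        memcache.put(name, value);
        return value;
    }
}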
You must have separate app ids for your production/staging/QA envs. These must be hardcoded into your appengine-web.xml (or you have a script of some sort that updates it).
After that you can code your settings based on the app id. I assume there's a Java equivalent to this:
https://developers.google.com/appengine/docs/python/appidentity/functions#get_application_id
You could put them in the datastore if they're settings that change dynamically, but if they are static per environment, it doesn't make sense to keep fetching them from the datastore.
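In Java the equivalent is SystemProperty.applicationId; a tiny sketch ("my-prod-app-id" is a placeholder for your production app id):

import com.google.appengine.api.utils.SystemProperty;

public class EnvironmentSettings {

    public static boolean isProduction() {
        // Returns the application id this code is currently running under
        return "my-prod-app-id".equals(SystemProperty.applicationId.get());
    }
}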
I am developing an application which is deployed in a cluster environment in WebSphere AS. I am using several nodes, and sometimes I would like to change configuration settings on the fly and propagate them to all nodes within the cluster. I don't want to hold the config in the DB, or at least I would like to cache it at the node level and trigger a config-refresh action which forces each node to refresh its config from some common ground (i.e. a DB or network drive), to avoid constant round trips to the config storage.
Moreover, some configuration can't simply be stored in the DB, e.g. the log level needs to be applied to the logger object in each node separately.
I was thinking about using JMS topics and a publish/subscribe approach to achieve that goal.
The idea is that each node would subscribe to the topic, and no matter which node initiates the config change, the modification would be propagated to all nodes within the cluster.
Has anyone ever tried to do that in WAS, and are there any obstacles with this approach? If there are, or if you have any other suggestion on how to solve this problem, I would be very grateful for your help.
Tx in advance,
Marcin
Here are a few options to consider as alternatives to JMS -
Use Java EE environment entries. These are scoped to the application, and WAS will automatically propagate any changes to all servers against which the application is deployed. This is a good approach since it is the standard Java EE approach to application configuration, if it is robust enough to meet your use case.
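For example (a sketch; the entry name is illustrative), an env-entry declared in web.xml can be read through JNDI, and its value adjusted per deployment from the WAS admin console:

import javax.naming.InitialContext;
import javax.naming.NamingException;

public class EnvEntryConfig {

    // Looks up the <env-entry> "config/logLevel" declared in web.xml
    public static String getLogLevel() throws NamingException {
        InitialContext ctx = new InitialContext();
        return (String) ctx.lookup("java:comp/env/config/logLevel");
    }
}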
Use a WebSphere Shared Library. This allows you to link your applications to static files external to your application (i.e. on the filesystem), such that they are available on your classpath. Although these files are located on the node file systems, there is a way that you can place these files in WebSphere's centralized configuration repository such that they are automatically propagated to all WAS nodes. For more details on this, see this answer.
Both of these options are optimized for static configuration; in other words, configuration settings that are intended to be set at assembly time or deployment time, or to be changed by system administrators. They are not typically used for values that change frequently, nor are they generally changed programmatically at runtime. WAS does, however, allow your applications to pick up these configuration settings in a rolling fashion, so that no application downtime is required.
In the end we solved the problem with perhaps not the prettiest approach, but the simplest one. Since we are using only 2 nodes, we can open the web interface of a specific node and modify settings per node. Maybe it is not very pretty, but for now it is the easiest way. The config is stored in the DB, and we are planning to trigger a config reload in each node and change the log level per node as well.