JMX Monitoring for a scaling Kubernetes StatefulSet - java

I'm trying to get JMX monitoring from jconsole for an application that is running inside a Kubernetes pod.
Currently, I'm following this method:
I expose a port, let's say 5000, in the YAML.
I create a NodePort service that binds that pod port to a port on the worker node.
I add the following 4 Java properties (see the sketch after this list):
JMX_REMOTE_AUTHENTICATE
JMX_REMOTE_PORT
JMX_REMOTE_RMI_PORT
JMX_REMOTE_SSL
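These presumably correspond to the standard com.sun.management.jmxremote.* system properties (authenticate, port, rmi.port, ssl). An alternative that avoids JVM flags is to start the RMI connector server programmatically on a fixed port; below is a minimal sketch, not the exact setup from this question, assuming port 5000, no authentication or SSL, and a hypothetical JMX_ADVERTISED_HOST environment variable carrying the address clients will dial (the worker node's IP for a NodePort):

    import java.lang.management.ManagementFactory;
    import java.rmi.registry.LocateRegistry;
    import javax.management.MBeanServer;
    import javax.management.remote.JMXConnectorServer;
    import javax.management.remote.JMXConnectorServerFactory;
    import javax.management.remote.JMXServiceURL;

    public class JmxBootstrap {
        public static void main(String[] args) throws Exception {
            int port = 5000; // must match the containerPort exposed in the YAML
            // Hypothetical env var: the address remote clients will dial.
            String host = System.getenv().getOrDefault("JMX_ADVERTISED_HOST", "127.0.0.1");
            System.setProperty("java.rmi.server.hostname", host);
            LocateRegistry.createRegistry(port); // RMI registry on the same fixed port
            MBeanServer mBeanServer = ManagementFactory.getPlatformMBeanServer();
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi://" + host + ":" + port + "/jndi/rmi://" + host + ":" + port + "/jmxrmi");
            JMXConnectorServer connectorServer =
                    JMXConnectorServerFactory.newJMXConnectorServer(url, null, mBeanServer);
            connectorServer.start(); // no authentication or SSL configured in this sketch
        }
    }

Pinning the JMX and RMI ports to the same known value, and advertising a routable java.rmi.server.hostname, is what makes the connection survive the NAT-style hop introduced by a NodePort.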
Then I can go into a monitoring tool like jvisualvm and create a connection to the public IP of the worker node hosting that pod at port 5000 and I can monitor that pod, which works great.
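The connection jvisualvm makes can also be opened programmatically, which is handy if you want to script a quick check per pod; a small sketch, using a made-up worker-node address 203.0.113.10 and the port 5000 from above:

    import javax.management.MBeanServerConnection;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class JmxProbe {
        public static void main(String[] args) throws Exception {
            // Same URL shape jconsole/jvisualvm use for a remote RMI connection.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://203.0.113.10:5000/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection connection = connector.getMBeanServerConnection();
                System.out.println("MBeans visible: " + connection.getMBeanCount());
            }
        }
    }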
Issue:
Now suppose my application scales up and a new pod comes up on another worker node. I can manually repeat all of the above steps to monitor that pod.
But that isn't ideal. Ideally, I'd like every pod to be monitored automatically as it comes online. I can add the JMX properties in my StatefulSet YAML, but do I need a NodePort service, bound to a different port, for every single pod that comes online? If so, is there a way to do this through a script or a built-in mechanism?
If anyone has experience with this, any pointers would be very helpful.

Related

Akka Clustering not working with Docker container in host node

We are trying to use application-level clustering via Akka Cluster for our distributed application, which runs within Docker containers across multiple nodes. We plan to run the Docker containers with "host" mode networking.
When the dockerized application comes up for the first time, the Akka clustering does not seem to work and we do not see any gossip messages being exchanged between the cluster nodes. This gets resolved only when we remove the file "/var/lib/docker/network/files/local-kv.db" and restart the Docker service. That is not an acceptable solution for the production deployment, so we are trying to do an RCA and provide a proper solution.
Any help here would be really appreciated.
Tried removing the file "/var/lib/docker/network/files/local-kv.db" and restarting the Docker service, which worked, but this workaround is unacceptable for the production deployment.
Tried using the bridge network mode for the dockerized container. That helps, but our current requirement is to run the container in "host" mode.
application.conf has the following settings for the host and port currently.
hostname = "" port = 2551 bind-hostname = "0.0.0.0" bind-port = 2551
No gossip messages are exchanged between the Akka cluster nodes, whereas we do see those messages after applying the workaround mentioned above.
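For context, these four settings normally sit under akka.remote.netty.tcp in application.conf, and they can also be supplied programmatically, which makes it easier to inject the host's routable address when running with host networking. A rough sketch, assuming Akka 2.4+ classic remoting and a hypothetical HOST_IP environment variable:

    import akka.actor.ActorSystem;
    import com.typesafe.config.Config;
    import com.typesafe.config.ConfigFactory;

    public class ClusterBootstrap {
        public static void main(String[] args) {
            // Hypothetical env var: the routable address of the machine running this container.
            String hostIp = System.getenv().getOrDefault("HOST_IP", "127.0.0.1");
            Config overrides = ConfigFactory.parseString(
                    "akka.remote.netty.tcp.hostname = \"" + hostIp + "\"\n"
                    + "akka.remote.netty.tcp.port = 2551\n"
                    + "akka.remote.netty.tcp.bind-hostname = \"0.0.0.0\"\n"
                    + "akka.remote.netty.tcp.bind-port = 2551\n");
            // Everything else (actor provider, seed nodes, ...) still comes from application.conf.
            ActorSystem system = ActorSystem.create("app-cluster",
                    overrides.withFallback(ConfigFactory.load()));
        }
    }

An empty hostname makes Akka fall back to the machine's local address, which may not be the address the other nodes can reach; advertising an explicit, routable hostname is usually the first thing to verify when no gossip is flowing.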

Pivotal GemFire cluster configuration

I am trying to set up a Pivotal GemFire cluster with two nodes/hosts, i.e. two different Unix servers. The idea is to create one locator and one cache server on each host, where the locators take care of load balancing among the cache servers. A replicated region will be created in both cache servers. When a client creates/updates a region entry on one cache server, using gfsh or the Java API, it should be replicated to the other one.
Using gfsh, I am able to start a locator (locator 1) and a cache server (server 1) on host_A, and likewise on host_B. I have created a region (RegionA) in both servers.
Is that all I have to do? The Pivotal tutorials talk about having a locator and multiple cache servers on the same machine; I could not find an appropriate resource that covers a multi-server/multi-host configuration.
I am starting the servers on each host like this:
start server --name=server1 --locators=host_A[10334],host_B[10334] --group=group1 --server-port=40406
start server --name=server2 --locators=host_A[10334],host_B[10334] --group=group1 --server-port=40406
When i do "list members" in gfsh, host B shows (locator 2, server 1 [from host A], server 2), but host A shows locator 1 only. Ideally i am expecting 2 locator s and 2 servers as members in both the machines. Is that not right?
The steps look just fine; are you having any issues, or is something not working while using the started cluster? You can go through Pivotal GemFire in 15 Minutes or Less to learn how to start locators and servers, and how to interact with them as well. The only extra item I can think of (not mentioned within the previous link, as all members are started locally within the same gfsh session) is that you need to correctly configure the --locators parameter when starting your members; more information about how this works can be found in How Member Discovery Works and Configuring Peer-to-Peer Discovery.
Just for your reference, you can have as many members as you want per host, there's no implicit limit about this other than the actual physical resources on the host itself (memory, disk, ports, network throughput, etc.). Keep in mind, however, that it is always better to have only one member per host to achieve the highest reliability and availability for both your data and locator services.
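For completeness, once both locators and both servers are up, a client can reach the replicated region through either locator. A small illustration, assuming a Geode-based GemFire 9+ client (org.apache.geode packages; older GemFire releases use com.gemstone.gemfire instead) and the locator ports from the question:

    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.client.ClientCache;
    import org.apache.geode.cache.client.ClientCacheFactory;
    import org.apache.geode.cache.client.ClientRegionShortcut;

    public class RegionClient {
        public static void main(String[] args) {
            // Connect through both locators so either host can serve the request.
            ClientCache cache = new ClientCacheFactory()
                    .addPoolLocator("host_A", 10334)
                    .addPoolLocator("host_B", 10334)
                    .create();
            // PROXY keeps no local copy, so every put/get goes to the servers.
            Region<String, String> region = cache
                    .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                    .create("RegionA");
            region.put("key-1", "value-1");
            System.out.println("Read back: " + region.get("key-1"));
            cache.close();
        }
    }

With a REPLICATE region on both servers, a put routed to one server should be visible when reading from the other, which is an easy way to confirm the two hosts really joined the same distributed system.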
Hope this helps, cheers.

Kubernetes: Exposed service to deployment unreachable

I deployed a container on Google Container Engine and it runs fine. Now, I want to expose it.
This application is a service that listens on 2 ports. Using kubectl expose deployment, I created 2 load balancers, one for each port.
I made 2 load balancers because the kubectl expose command doesn't seem to allow more than one port. While I defined them as type=LoadBalancer in kubectl, once these got created on GKE they were defined as forwarding rules associated with 2 target pools, also created by kubectl. kubectl also automatically created firewall rules for each balancer.
The first one I made exposes the application as it should. I am able to communicate with the application and get a response.
The 2nd one does not connect at all; I keep getting either connection refused or connection timed out. To troubleshoot this, I stripped my firewall rules down to be as permissive as possible. Since ICMP is allowed by default, pinging the IP of this balancer results in replies.
Does Kubernetes only allow one load balancer to work, even if more than one can be configured? If it matters at all, the working balancer's external IP is in the pattern 35.xxx.xxx.xxx and the IP of the balancer that's not working is 107.xxx.xxx.xxx.
As a side question, is there a way to expose more than one port using kubectl expose --port without defining a range (I just need 2 ports)?
Lastly, I tried using the Google console, but I couldn't get the load balancer or forwarding rules to work with what's on Kubernetes the way doing it through kubectl does.
Here is the command I used, modifying the port and service name on the 2nd use:
kubectl expose deployment myapp --name=my-app-balancer --type=LoadBalancer --port 62697 --selector="app=my-app"
My firewall rule is basically set to allow all incoming TCP connections over 0.0.0.0/0.
Edit:
External IP had nothing to do with it. I kept deleting & recreating the balancers until I was given an IP of xxx.xxx.xxx.xxx for the working balancer, and the balancer still worked fine.
I've also tried deleting the working balancer and re-creating the one that wasn't working, to see if it's a conflict between balancers. The 2nd balancer still didn't work, even if it was the only one running.
I'm currently investigating the code for the 2nd service of my app, though it's practically the same as the 1st service, a simple ServerSocket implementation that listens on a defined port.
After more thorough investigation (opening a console in the running pod, installing tcpdump, iptables, etc.), I found that the service (i.e. the load balancer) was, in fact, reachable. What happened in this situation was that although traffic reached the container's virtual network interface (eth0), the data wasn't routed to the listening services, even when these were bound to IP aliases of the interface (eth0:1, eth0:2).
The last step to getting this to work was to create the required routes through
iptables -t nat -A PREROUTING -p tcp -i eth0 --dport <listener-port> -j DNAT --to-destination <listener-ip>:<listener-port>
Note, there are other ways to accomplish this, but this was the one I chose. I wish the Docker/Kubernetes documentation mentioned this.
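If the second service can be changed, one of those other ways (sketched here under the assumption that the listener is the plain ServerSocket mentioned above and uses the 62697 port from the expose command) is to bind it to the wildcard address instead of a specific alias, so the DNAT rule is not needed:

    import java.io.IOException;
    import java.net.InetAddress;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class WildcardListener {
        public static void main(String[] args) throws IOException {
            // Bind to 0.0.0.0 rather than a specific alias (eth0:1, eth0:2) so the
            // listener accepts whatever destination address the balancer forwards.
            try (ServerSocket server = new ServerSocket(62697, 50, InetAddress.getByName("0.0.0.0"))) {
                while (true) {
                    try (Socket client = server.accept()) {
                        client.getOutputStream().write("ok\n".getBytes());
                    }
                }
            }
        }
    }

Note that a ServerSocket created without an explicit bind address already listens on all interfaces, so this only matters when the code binds to one specific IP.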

Kubernetes service in java does not resolve restarted service/replicationcontroller

I have a kubernetes cluster where one service (java application) connects to another service to write data (elasticsearch).
When Elasticsearch (service & replication controller) is restarted/redeployed, the Java application loses its connection, which can only be recovered by restarting the Java application (rc). This is not the desired behaviour and should be solved.
Using curl from the application's Kubernetes pod to query Elasticsearch works fine after the restart, so it is probably something Java is doing.
It does work when only the replication controller for Elasticsearch is touched, leaving the service as it is. But why does curl work in that case? Either way, this should not be the solution.
Using the same configuration in a local Docker setup without Kubernetes also does not lead to problems.
Promising solutions that did not work:
Setting networkaddress.cache.ttl or networkaddress.cache.negative.ttl to zero (or other small positive values)
Hacking /etc/nsswitch.conf as described in https://stackoverflow.com/a/32550032/363281
I'm using Kubernetes 1.1.3 and OpenJDK 8u66; the service's Dockerfile is derived from java:8.
Try java.security.Security.setProperty("networkaddress.cache.ttl", "60");
This means a sixty-second DNS cache, and you should adapt it to your needs.
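One caveat worth adding: this security property is consulted when the JVM's resolver cache policy is initialized, so it has to be set before the first host-name lookup. A minimal sketch, with a hypothetical entry point:

    import java.security.Security;

    public class Main {
        public static void main(String[] args) {
            // Set before any InetAddress lookup, otherwise the default caching policy is already fixed.
            Security.setProperty("networkaddress.cache.ttl", "60");          // cache successful lookups for 60 s
            Security.setProperty("networkaddress.cache.negative.ttl", "10"); // cache failed lookups for 10 s
            // ... start the Elasticsearch client / the rest of the application here ...
        }
    }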
One solution is not to restart your Service: a Service resolves the Pods by IPs and watches the Pods by selectors, so you don't need to restart the Service when you restart your Pods.
Now what is likely happening is that your application resolves the Service at start-up and then caches the IP. When you restart the Service it likely gets a new IP, which messes up your application's behavior. You need to check how you can reset this cache or initiate some sort of restart of that app when the pods/services change.
If you don't restart the Service, the IP won't change, but it will still proxy to the Pods that are restarted.
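If the Service does get recreated (and therefore receives a new cluster IP), another option is to resolve the Service's DNS name again on every reconnect instead of holding on to the address resolved at start-up. A rough sketch, assuming the Service is reachable under the hypothetical name "elasticsearch" on port 9200 and that the client can re-open its connection:

    import java.net.InetAddress;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class ReconnectingClient {
        static Socket connect() throws Exception {
            // Fresh lookup on every attempt; the kube-dns name stays stable even if the Service IP changes.
            InetAddress address = InetAddress.getByName("elasticsearch");
            Socket socket = new Socket();
            socket.connect(new InetSocketAddress(address, 9200), 5_000);
            return socket;
        }
    }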

Writing an ApplicationMaster for a YARN application

I am new to YARN, and I am developing a framework to launch Java applications in YARN containers. To register my ApplicationMaster with the ResourceManager, the code executes registerApplicationMaster("", 0, ""), which works fine on a single-node cluster. But the same call hangs forever on a multi-node cluster. I am wondering if not passing these parameters properly is causing this.
Even if it is not, I want to know what these are for.
public abstract RegisterApplicationMasterResponse registerApplicationMaster(String appHostName, int appHostPort, String appTrackingUrl)
appHostName - Name of the host on which master is running
appHostPort - Port master is listening on
appTrackingUrl - URL at which the master info can be seen
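For reference, a typical registration through the AMRMClient helper looks roughly like the sketch below (an illustration only, assuming the AM container's environment carries the standard YARN configuration). As far as I know these values are reported to the ResourceManager and surfaced in the application report rather than used by the RM to call back into the AM, which is why empty or zero values are commonly passed when the AM exposes no RPC endpoint or web UI:

    import java.net.InetAddress;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class MasterRegistration {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
            amClient.init(conf);
            amClient.start();
            // Advertise this AM's own host; port 0 and an empty tracking URL are
            // acceptable when the AM has no RPC endpoint or UI of its own.
            String host = InetAddress.getLocalHost().getHostName();
            RegisterApplicationMasterResponse response =
                    amClient.registerApplicationMaster(host, 0, "");
            System.out.println("Max container capability: " + response.getMaximumResourceCapability());
        }
    }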
