I have some Docker Swarm containers running on an Ubuntu 16.04.4 LTS instance on Azure. The containers run Java Spring Boot applications with Netflix OSS components such as Eureka, Ribbon, Gateway, etc. I observed that my containers consume a huge amount of memory even though the services are just REST endpoints.
I tried to limit the memory consumption by passing the Java VM arguments shown below, but it didn't help; the memory footprint didn't change.
For reference, this is the configuration I am using:
Java Version : Java 8 Alpine
Kernel Version: 4.15.0-1023-azure
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 125.9GiB
[Screenshot: memory footprint as reported by docker stats]
Java VM arguments:
docker service create --name xxxxxx-service --replicas 1 --network overnet --env JAVA_OPTS="-Xms16m -Xmx32m -XX:MaxMetaspaceSize=48m -XX:CompressedClassSpaceSize=8m -Xss256k -Xmn8m -XX:InitialCodeCacheSize=4m -XX:ReservedCodeCacheSize=8m -XX:MaxDirectMemorySize=16m -XX:+UseCGroupMemoryLimitForHeap -XX:-ShrinkHeapInSteps -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=70" 127.0.0.1:5000/xxxxxx-service
I've also looked at the application log files inside each of the containers but can't find any memory-related errors. I tried limiting the container's resources as well, following the Docker documentation ("Limit a container's resources"), but that didn't work for me either.
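For example, I tried capping the service roughly like this (a sketch; the service and registry names are placeholders, and the limit values are just what I experimented with):
docker service create --name xxxxxx-service --replicas 1 --network overnet --limit-memory 256m --reserve-memory 128m 127.0.0.1:5000/xxxxxx-service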
Any clue how I can troubleshoot this heavy memory issue?
You can troubleshoot this with a profiler such as VisualVM or JProfiler; they will show you where the memory is allocated (which types of objects, etc.).
You shouldn't use a profiler on a production system, though, if you can avoid it, because profiling can be very CPU-heavy.
Another way to find out more, which I have used in the past, is AspectJ's load-time weaving to add code that writes memory information to your log files.
This will also slow down your system, but when the aspects are well written, much less so than a profiler.
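To illustrate the AspectJ approach (a rough sketch, not code from a real project; the pointcut pattern and package name are placeholders you would adapt to your own services):
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class MemoryLoggingAspect {

    // Matches all methods of classes ending in "Service"; adjust to your packages.
    @Around("execution(* com.example..*Service.*(..))")
    public Object logMemory(ProceedingJoinPoint pjp) throws Throwable {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();
        try {
            return pjp.proceed();
        } finally {
            long after = rt.totalMemory() - rt.freeMemory();
            System.out.printf("%s: used heap %,d -> %,d bytes%n",
                    pjp.getSignature().toShortString(), before, after);
        }
    }
}
You would register the aspect in META-INF/aop.xml and start the JVM with -javaagent pointing at the AspectJ weaver jar to enable load-time weaving.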
If possible, profiling is preferable; if not, AspectJ load-time weaving may prove helpful.
You can try enabling Spring Boot Actuator and comparing its memory values with the values generated by docker stats.
To enable Actuator, you could add the following dependency to your pom.xml file:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
I generally use the HAL browser for monitoring the application and consuming the Actuator endpoints.
You can add it using the following Maven dependency:
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-rest-hal-browser</artifactId>
</dependency>
In the HAL browser you could try consuming the /metrics endpoint for your application.
A sample output would look like this.
{
"mem" : 193024,
"mem.free" : 87693,
"processors" : 4,
"instance.uptime" : 305027,
"uptime" : 307077,
"systemload.average" : 0.11,
"heap.committed" : 193024,
"heap.init" : 124928,
"heap.used" : 105330,
"heap" : 1764352,
"threads.peak" : 22,
"threads.daemon" : 19,
"threads" : 22,
"classes" : 5819,
"classes.loaded" : 5819,
"classes.unloaded" : 0,
"gc.ps_scavenge.count" : 7,
"gc.ps_scavenge.time" : 54,
"gc.ps_marksweep.count" : 1,
"gc.ps_marksweep.time" : 44,
"httpsessions.max" : -1,
"httpsessions.active" : 0,
"counter.status.200.root" : 1,
"gauge.response.root" : 37.0
}
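You can also fetch the same data without the HAL browser (assuming the default port and that the endpoint is exposed and not secured):
curl http://localhost:8080/metrics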
This way you can monitor the memory behaviour of your application and find out how much memory it actually consumes. If these figures are consistent with the report generated by docker stats, then the issue is in your code.
However, I must state that Actuator itself is not production-friendly, as it carries a significant resource overhead of its own.
Related
I have a Spring Boot application with embedded Tomcat and JavaMelody configured through a dependency in pom.xml.
I am doing performance testing on my application, but no matter how big the load is, i.e. 30 hits per second or 500 hits per second, the number of threads shown in JavaMelody never goes beyond 25-30.
Is there any default configuration in Spring Boot that limits how many threads are opened, or does it depend on resources like memory or CPU? I am not using any thread configuration in my application, which means it should default to 200 threads, if I am not wrong.
Please suggest.
Note: I am running my application with 5 GB of memory and 2 vCPUs.
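For completeness, the only thread-related setting I know of is the embedded Tomcat pool size, which (if I understand the defaults correctly for my Boot version) could be overridden in application.properties like this; I have not set it, so the default of 200 should apply:
server.tomcat.max-threads=200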
I am trying to bring one of our environments from EAP 7.0 to EAP 7.4.1, and I have managed to migrate one environment successfully. However, on another environment, as soon as I start EAP in domain mode after the upgrade, the server runs out of memory with the error below:
> "WFLYCTL0030: No resource definition is registered for address [
> (\"host\" => \"somehost-server-1\"),
> (\"server\" => \"server-1\"),
> (\"core-service\" => \"platform-mbean\"),
> (\"type\" => \"operating-system\") ]"
I have tried copying the exact configuration from the environment where EAP runs smoothly and can find no difference. Searching for this error turns up no help; all I can see is that it has something to do with the monitoring service of JBoss EAP. Can someone help?
I found the issue: apparently jBeret was internally trying to fetch details about the job executions from its table (the JDBC repository). And since there were a lot of rows in that table, the committed heap would run out.
After I deleted rows from that table, the server looks stable and everything runs smoothly. I do wonder how the server handles a large load, though, since it constantly tries to fetch that data. Is there an alternative to this solution?
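For the record, the cleanup amounted to deleting from the job repository tables, children first (the table names here are from my jBeret JDBC repository schema; verify them against your own schema before running anything like this):
DELETE FROM PARTITION_EXECUTION;
DELETE FROM STEP_EXECUTION;
DELETE FROM JOB_EXECUTION;
DELETE FROM JOB_INSTANCE;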
I am doing a spark-submit from Oozie, and the --driver-cores option is not working. For example, if I provide --driver-cores 4, YARN still creates a 1-vCore container for the driver.
Spark opts in Oozie:
<master>yarn-cluster</master>
<spark-opts>--queue testQueue --num-executors 4 --driver-cores 4
...
</spark-opts>
I have tried other config keys as well, such as --conf spark.driver.cores=4 and --conf spark.yarn.am.cores=4; those are not working either.
Any pointers would be helpful. Thanks.
If you have specified this, your program is using 4 cores; there is no doubt about that.
You are just reading it wrong.
On the ResourceManager page, with the default DefaultResourceCalculator, only memory usage is taken into account.
For vCore usage it will always show 1, because that calculator simply doesn't track cores.
If you change the resource calculator class to DominantResourceCalculator, it will show the actual core usage.
Just add this property to yarn-site.xml and restart YARN:
yarn.scheduler.capacity.resource-calculator: org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
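In yarn-site.xml form, that is (a sketch of the property element; place it alongside your other scheduler properties):
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>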
You can also verify this in the Spark History Server UI.
Before changing anything, submit a Spark job and find it in the Spark UI.
Go to the Executors section of that job and you will see all the executors used by Spark and their configurations.
I'm using TomEE, PrimeFaces 5.0 and Apache Shiro.
When I start the server, it consumes 600 MB of memory.
If I open and close a certain page that contains a lot of information and is backed by a ViewScoped bean, the memory usage climbs to 1.6 GiB. The same thing happens with other pages, even ones backed by RequestScoped beans.
I have checked that the @PreDestroy method is being called, so that isn't the problem.
Using Eclipse Memory Analyzer:
One instance of "org.apache.openejb.core.WebContext" loaded by
"org.apache.catalina.loader.StandardClassLoader # 0xa34f0cf0" occupies
1,189,717,200 (97.83%) bytes. The memory is accumulated in one
instance of "java.util.concurrent.ConcurrentHashMap$Segment[]" loaded
by "system class loader".
Keywords
java.util.concurrent.ConcurrentHashMap$Segment[]
org.apache.openejb.core.WebContext
org.apache.catalina.loader.StandardClassLoader # 0xa34f0cf0
And when I run shutdown.sh, I see the following in catalina.out:
org.apache.catalina.loader.WebappClassLoader
checkThreadLocalMapForLeaks SEVERE: The web application [/projeto-bim]
created a ThreadLocal with key of type
[org.apache.shiro.util.ThreadContext.InheritableThreadLocalMap] (value
[org.apache.shiro.util.ThreadContext$InheritableThreadLocalMap#5720d785])
and a value of type [java.util.HashMap] (value
[{org.apache.shiro.util.ThreadContext_SECURITY_MANAGER_KEY=org.apache.shiro.web.mgt.DefaultWebSecurityManager#2d258973,
org.apache.shiro.util.ThreadContext_SUBJECT_KEY=org.apache.shiro.web.subject.support.WebDelegatingSubject#7b62f42c}]) but failed to remove it when the web application was stopped. Threads
are going to be renewed over time to try and avoid a probable memory
leak.
I have tried several things, like configuring web.xml to keep only one session, or setting TomEE to persist session information to disk, but nothing worked.
What should I do?
// New information:
The memory grows to 1.6 GiB and stops because that is my maximum heap space; the web server then starts throwing OutOfMemoryError. I'll try increasing the limit to see how much more it uses.
OK, I have now increased the Java heap space to 3 GB, and my application uses it all. It is clearly a memory leak: each time I open a certain page that contains a lot of information, memory usage goes up by 300 MB and never comes back down!
What could I do?
I don't see much here that points to a leak.
WebContext is bound to your webapp in TomEE (it is normal to see it while your app is running).
All the warnings are telling you is that Apache Shiro lives in your webapp and uses ThreadLocals to keep the security context per thread. Since it was loaded by the webapp, it can create leaks, but short of patching Shiro you can't help it.
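If you want to at least stop pooled threads from carrying stale Shiro state across requests, one thing you could experiment with (a sketch, not an official Shiro recipe; ThreadContext.remove() is Shiro's API for clearing the thread's bindings) is a cleanup filter mapped around your app:
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import org.apache.shiro.util.ThreadContext;

public class ShiroThreadLocalCleanupFilter implements Filter {

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        try {
            chain.doFilter(req, res);
        } finally {
            // Unbind the Subject and SecurityManager from the container's pooled thread.
            ThreadContext.remove();
        }
    }

    public void init(FilterConfig filterConfig) {
    }

    public void destroy() {
    }
}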
I'm trying to deploy jBPM 6.1.0.Final code using Spring on a Tomcat 6.0 server. It takes more than 3 hours to start the RuntimeManager when the server starts. I have used the following:
1) Spring integration
2) Added process and task lifecycle listeners
3) Used singleton session strategy
I am not sure why it takes so much time to deploy; with jBPM 5.4 it worked just fine.
I have taken a thread dump and a memory dump, but there is nothing out of the ordinary. Are there any other ways I can see exactly which threads are hogging the time?
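(For reference, thread dumps like these can be taken with the JDK's jstack, run a few times during startup and diffed to see which threads stay busy:
jstack -l <pid> > dump-1.txt
# wait a while, then take another dump and compare the two
jstack -l <pid> > dump-2.txt
)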
EDIT - Java version 6, Tomcat version 6
So the issue has been identified. The bottleneck was the http://www.omg.org/spec/BPMN/20100524 namespace. Several such namespaces were included as XSD references in the BPMN XML file, but they weren't being loaded. The root cause is a bug in the Eclipse BPMN2 plugin, which generates incorrect XSD definitions in the XML file. After removing all XSD definitions except BPMN2.0.xsd, it started correctly.
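For anyone hitting the same bug, the cleaned-up header of the process file ends up looking roughly like this (a sketch; the ids, prefixes, schema file name and target namespace will differ in your file):
<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.omg.org/spec/BPMN/20100524/MODEL BPMN2.0.xsd"
             id="Definitions_1"
             targetNamespace="http://www.example.org/bpmn">
  <!-- process definition -->
</definitions>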