Can multiple JVM processes share memory for common classes? - java

I'd like to run multiple Java processes on my web server, one for each web app. I'm using a web framework (Play) that has a lot of supporting classes and jar files, and the Java processes use a lot of memory. One Play process shows about 225MB of "resident private" memory. (I'm testing this on Mac OS X, with Java 1.7.0_05.) The app-specific code might only be a few MB. I know that typical Java web apps are jars added to one server process (Tomcat, etc), but it appears the standard way to run Play is as a standalone app/process. If these were C programs, most of that 200MB would be shared library and not duplicated in each app. Is there a way to make this happen in Java? I see some pages about class data sharing, but that appears to apply only to the core runtime classes.

At this time and with the Oracle VM, this isn't possible.
But I agree, it would be a nice feature, especially since Java has all the information it needs to do that automatically.
Off the top of my head, I think the JIT is the only reason this can't work: the JIT takes runtime behavior into account. So if app A uses some code in a different pattern than app B, that would result in different machine code being generated at runtime.
But then, the usual "pattern" is "how often is this code used." So if app A called some method very often and B didn't, they could still share the code because A has already paid the price for optimizing/compiling it.
What you can try is deploying several applications as WAR files into a single VM. But in my experience, that often causes problems with code that doesn't correctly clean up thread locals or shutdown hooks.
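That thread-local issue can be sketched without any servlet API. This is only an illustration (the class and field names are made up): a value left in a pooled thread's ThreadLocal outlives the request that put it there unless remove() is called, which in a container pins the webapp's classloader in memory.

```java
public class ThreadLocalLeakDemo {
    // Simulates per-request state that a webapp might stash on a pooled thread.
    private static final ThreadLocal<byte[]> REQUEST_CACHE = new ThreadLocal<>();

    static void handleRequest() {
        REQUEST_CACHE.set(new byte[1024 * 1024]); // 1 MB of "request state"
        // ... request processing ...
        // Without this call, the array (and, in a container, the webapp
        // classloader it references) stays reachable from the pooled thread
        // even after the webapp is undeployed:
        REQUEST_CACHE.remove();
    }

    public static void main(String[] args) {
        handleRequest();
        // If the cleanup ran, the slot is empty again on this thread.
        System.out.println(REQUEST_CACHE.get() == null);
    }
}
```

In a real container the `remove()` would typically live in a `ServletRequestListener` or a `ServletContextListener`, so it runs even when request handling throws.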

The IBM JDK has a JVM parameter to achieve this. Check out http://www.ibm.com/developerworks/library/j-sharedclasses/
And this takes it to the next step: http://www.ibm.com/developerworks/library/j-multitenant-java/index.html

If you're using a servlet container with virtual-host support (I believe Tomcat has it), you should be able to use the play2-war-plugin. From Play 2.1 the requirement of always being the root app is going to be lifted, so you will probably be able to use any servlet container.
One thing to keep in mind is that you will probably have to tweak the war file to move stuff from WEB-INF/lib to your servlet container's lib directory to avoid loading all the classes again, and this could affect your app if it uses singletons or other forms of class-shared data.
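A quick sketch of why moving a jar up to the container's shared lib directory matters for singletons (the class name here is illustrative): two "webapp" loaders that delegate to the same parent both resolve the one copy of the class that the parent loaded, and therefore share its static state.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class SharedStaticDemo {
    public static int counter = 0; // "singleton" state held in a static field

    public static void main(String[] args) throws Exception {
        ClassLoader parent = SharedStaticDemo.class.getClassLoader();
        // Two "webapp" loaders that both delegate to the same parent,
        // as happens when a jar is moved to the container's shared lib dir.
        URLClassLoader webappA = new URLClassLoader(new URL[0], parent);
        URLClassLoader webappB = new URLClassLoader(new URL[0], parent);
        Class<?> a = webappA.loadClass("SharedStaticDemo");
        Class<?> b = webappB.loadClass("SharedStaticDemo");
        // Parent delegation means both see the identical Class object,
        // so any static ("singleton") state is shared between the apps.
        System.out.println(a == b);
    }
}
```

With the jar left in each app's WEB-INF/lib instead, each webapp loader defines its own copy of the class and the statics are independent.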

The problem of sharing memory between JVM instances is more pressing on mobile platforms, and as far as I know Android has a pretty clever solution for that in Zygote: the VM is initialized and then when running the app it is fork()ed. Linux uses copy-on-write on the RAM pages, so most of the data won't be duplicated.
Porting this solution might be possible if you're running on Linux and want to try using Dalvik as your VM (I have seen claims that there is a working port of Tomcat on Dalvik). I would expect it to be a huge amount of work that eventually saves you a few dollars on memory upgrades.

Related

Easy deployment of a jvm based web server on a remote machine

I wanted to know the easiest way to deploy a web server written in Java or Kotlin. With Node.js, I just keep all the server code on the remote machine and edit it using the sshfs plugin for VS Code. For JVM-based servers this doesn't appear as easy, since IntelliJ doesn't provide remote editing support. Is there a method for JVM-based servers that allows a quick iterative development cycle?
Do you have to keep your server code on the remote machine? How about developing and testing it locally, and deploying only when you want to test it on the actual deployment site?
I once tried to use SSHFS with IntelliJ, and because of the way IntelliJ builds its cache, the performance was terrible. The caching was still in progress after 15 minutes, and I gave up. IntelliJ without its caching and smart hints is close to a regular editor.
In my professional environment I also use Unison from time to time: https://www.cis.upenn.edu/~bcpierce/unison/. I have it configured to copy only code, not the generated sources. Most of the time it works pretty well, but it has its quirks, which can make you waste half a day debugging it.
To sum up, I see such options:
Developing and testing locally, and avoiding frequent deployments to the remote machine.
VS Code with the sshfs plugin, because why not, if it's enough for you with Node.js?
A synchronization tool like Unison.
Related answers regarding SSHFS from IntelliJ Support (several years old, but I believe they still hold true):
https://intellij-support.jetbrains.com/hc/en-us/community/posts/206592225-Indexing-on-a-project-hosted-via-SSHFS-makes-pycharm-unusable-disable-indexing-
https://intellij-support.jetbrains.com/hc/en-us/community/posts/206599275-Working-directly-on-remote-project-via-ssh-
A professional deployment won't keep source code on the remote server, for several reasons:
It's less secure. If you can change your running application by editing source code and recompiling (or even if edits are deployed automatically), it's that much easier for an attacker to do the same.
It's less stable. What happens to users who try to access your application while you are editing source files or recompiling? At best, they get an error page; at worst, they could get a garbage response, or even a leak of customer data.
It's less testable. If you edit your source code and deploy immediately, how do you test to ensure that your application works? Throwing untested buggy code directly at your users is highly unprofessional.
It's less scalable. If you can keep your source code on the server, then by definition you only have one server. (Or, slightly better, a small number of servers that share a common filesystem.) But that's not very scalable: you're clearly hosted in only one geographic location and thus vulnerable to all kinds of single points of failure. A professional web-scale deployment will need to be geographically distributed and redundant at every level of the application.
If you want a "quick iterative development cycle" then the best way to do that is with a local development environment, which may involve a local VM (managed with something like Vagrant) or a local container (managed with something like Docker). VMs and containers both provide mechanisms to map a local directory containing your source code into the running application server.

Docker container vs Java Virtual Machine

Is it true to say that, to a large degree, what is now done in Docker containers could also be done in Java with the JVM, if someone wanted to?
Besides letting you write an application in the language of your choice and offering much more customisation flexibility, does Docker basically do what Java has been doing with its virtual machine for ages, i.e. provide an execution environment separate from the underlying OS?
Generally, Docker containers cannot be done "within Java", because Docker encapsulates the application, while "within Java" means code that is loaded after the JVM launches.
The JVM is already running by the time it parses the class and searches for the main method, so encapsulation at the process level isn't possible: the process (the JVM) is already running.
Java has encapsulation techniques which do provide protection between various Java elements (see class loader hierarchies in Tomcat, for an example); but those only isolate "application plugins" from each other, the main process that is running them all is Tomcat, which is really a program loaded into an already running JVM.
This doesn't mean you can't combine the two to achieve some objective; it just means that the kinds of isolation the two products provide aren't interchangeable.
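The classloader isolation referred to above can be sketched in a few lines; this is only an illustration: a URLClassLoader with no URLs and a null parent delegates straight to the bootstrap loader, so it resolves core classes but sees nothing from the application classpath.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class IsolationDemo {
    public static void main(String[] args) throws Exception {
        // A loader with no URLs and the bootstrap loader as parent:
        // core classes still resolve, but the app classpath is invisible.
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            System.out.println(isolated.loadClass("java.lang.String").getName());
            try {
                isolated.loadClass("IsolationDemo"); // a class on the app classpath
                System.out.println("visible");
            } catch (ClassNotFoundException e) {
                System.out.println("not visible"); // isolated from the app
            }
        }
    }
}
```

Containers like Tomcat use the same mechanism in the opposite direction: each webapp gets its own child loader, so the apps can't see each other's classes, but they remain threads in one process rather than separate processes.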
what is now done in docker containers can also be done in java with jvm if someone wanted to
Short Answer: No. You could wrap a docker container around your JVM, but you cannot wrap a JVM around a docker container, non-trivially.
docker basically do what Java has been doing with its virtual machine for ages ? i.e. it provides executable environment separate from the underlying OS.
Docker containers provide isolation from other containers without introducing a virtualisation layer. Thus, they are different and more performant than VMs.
Docker can do several things that the Java JVM cannot. However, programming in Java and running on the JVM will give you several of the advantages of running in a Docker container.
I work on a large, 20-year-old Java project. We have been evolving and fixing our application without any tooling or compatibility issues for all these years. As a bonus, the application is platform independent: several of its components can run on both Windows and Linux. Because no effort was made initially to build a multiplatform application, there is one component that cannot run on Linux, but it would be relatively easy to make it work on that platform.
It would have been much more difficult to do the same thing using C or C++ and associated toolchain.

Java JVM on Docker/CoreOS

I'm learning CoreOS/Docker and am trying to wrap my mind around a few things.
With Java infrastructure, is it possible to run the JVM in its own container and have other Java apps/services use this JVM container? If not, I'm assuming the JVM has to be bundled into each container, so essentially you pull the Java Dockerfile and merge in my Java services, creating a Linux machine + Java + service container running on top of the CoreOS machine.
The only other thought I had was it might be possible to run the JVM on CoreOS itself, but it seems like this isn't possible.
It is actually possible to just untar Oracle Java into /opt, but that's a kind of last resort. The Oracle JRE and JDK binaries don't require any system libraries, so they are pretty easy to run anywhere.
I have written some pretty small JRE and JDK images, with which I was able to run Elasticsearch and other major open-source applications. I also wrote some containers that allow me to compile jars on CoreOS (errordeveloper/mvn, errordeveloper/sbt & errordeveloper/lein).
As @ISanych pointed out, running multiple Java containers will not impact disk usage much; it's pretty much equivalent to running multiple JVMs on the host. If running multiple JVMs is not quite your cup of tea, then the answer is really that the JVM wouldn't have to be as complex as it is if containers had existed before it. Still, Java in a container is pretty good: you can have one classpath that is fixed forever, and you won't get into dependency hell. Perhaps instead of building uberjars (which is what I mostly do, even though they are known to be not exactly perfect; I am lazy), one could bundle the jars in a tarball and then use ADD jars.tar /app/lib/ in the Dockerfile.
Applications that run on the JVM have to have a JVM installed in the container. So if you want to split application components into separate containers, each of these containers needs its own JVM.
On a side note, containers can talk to each other via a mechanism called container linking.
Best practice is to create an image with the JVM and then base your other images on it (FROM jvm in the Dockerfile). Even if you create many different images, they will not waste much space: Docker uses a layered file system, and all containers will share the same single copy of the JVM image. Yes, each JVM will be a separate process eating its own memory, but isolated environments are exactly what Docker is for.

JVM memory profiling when multiple applications are running on the same JVM

I am running a web-based Java application on JBoss and OFBiz. I suspect a memory leak is happening, so I did some memory profiling of the JVM on which the application runs along with JBoss and OFBiz. I suspect garbage collection is not working as expected for the application.
I used VisualVM, JConsole, YourKit, etc. to do the memory profiling. I could see how much heap memory is being used, how many classes are being loaded, how many threads are being created, and so on. But I need to know how much memory is used by the application alone, how much by JBoss and how much by OFBiz, respectively. I want to find out who is using how much memory and what the usage pattern is. That will help me identify where the memory leak is happening and where tuning is needed.
But with the memory profilers I have run so far, I was unable to differentiate the usage of each application separately. Can you tell me which tool can help with that?
There is no way to do this with Java since the Java runtime has no clear way to say "this is application A and this is B".
When you run several applications in one Java VM, you're just running one: JBoss. JBoss then has a very complex classloader but the app you're profiling is actually JBoss.
To do what you want to do, you have to apply filters but this only works when you have a memory leak in a class which isn't shared between applications (so when com.pany.app.a.Foo leaks, you can do this).
If you can't use filters, you have to look harder at the numbers to figure out what's going on. That means you'll probably have to let the app server run out of memory, create a heap dump and then look for what took most of the memory and work from there.
The only other alternative is to install a second server, deploy just one app there and watch it.
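For the heap-dump route, the dump can also be triggered from inside the JVM rather than with an external tool. This sketch is HotSpot/OpenJDK-specific (it relies on the com.sun.management diagnostic bean), and the output file name is just an example:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapDumpHelper {
    public static void main(String[] args) throws Exception {
        // Overall heap usage, the same number the profilers chart:
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println("heap used: " + mem.getHeapMemoryUsage().getUsed());

        // Trigger the same kind of dump that `jmap -dump` produces:
        HotSpotDiagnosticMXBean hotspot = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        hotspot.dumpHeap("app-heap.hprof", true); // true = live objects only
    }
}
```

The resulting .hprof file can then be opened in VisualVM, YourKit or Eclipse MAT and filtered by package, which is what makes the per-application attribution possible when the class names differ per app.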
You can install and create Docker containers, allowing you to run processes in isolation. This lets you use multiple containers with the same base image without having to install the JDK multiple times. The advantage is separation of concerns: every application can be deployed in a separate container. You can then profile any specific application running on the JVM, because each namespace provides a completely isolated view of the operating environment, including process trees, network, user IDs and mounted file systems.
Here are a couple of resources for Docker:
Deploying Java applications with Docker
JVM plus Docker: Better together
Docker
Please let me know if you have any questions!
Another good tool for finding Java memory leaks is Plumbr. You can try it out for free; it will find the cause of the java.lang.OutOfMemoryError and even show you the exact location of the problem, along with solution guidelines.
I explored various Java memory profilers and found that YourKit gave me the closest result. In the YourKit dashboard you can get links to the individual classes running, so if you are familiar with the codebase, you will know which class belongs to which app. Click on any class and you will see the CPU and memory usage related to it. Also, if you notice any issues, YourKit can help you trace back to the particular line of code in your source Java files!
If you add YourKit to Eclipse, clicking on the object name in the 'issue area' will highlight the code line in the particular source file that is the source of the problem.
Pretty cool!!

Java - Hot deployment

Recently I was reading a book about Erlang, which has a hot-deployment feature. Deployment can be done without bringing the system down: all existing requests are handled by the old version of the code, and all new requests after the deployment are served by the new code. In this case, both versions of the code are available at runtime for a while, until all the old requests have been served. Is there any approach in Java where we can keep two versions of the jar files? Are there any app/web servers that support this?
If your intention is to speed up development then JRebel is a tool for just this purpose. I wouldn't however recommend to use it to patch a production system.
JRebel detects whenever a class file has changed and reloads it into the running appserver without throwing any old state away. This is much faster compared to what most appservers do when redeploying a whole war/ear where the whole initialization process must rerun.
There are many ways to achieve hot deployment in the Java world, so you'll probably need to be a bit more specific about your context and what you are trying to achieve.
Here are some good leads / options to consider:
OSGi is a general purpose module system that supports hot deployment
Clojure is a dynamic JVM language that enables a lot of runtime interactivity. Clojure is often used for "live coding": pretty much anything can be hot-swapped and redefined at runtime. Clojure is a functional language with a strong emphasis on immutability and concurrency, so in some ways it has interesting similarities with Erlang. Clojure has some very nice web frameworks, like Noir, that are suitable for hot-swapping web server code.
The Play Framework is designed to enable hot swapping code for productivity reasons (avoiding web server restarts). Might be relevant if you are looking primarily at hot-swapping web applications.
Most Java application servers, such as JBoss, support some form of hot-swapping for web applications.
The only reason for hot updates of a production application is to provide zero downtime to its users.
LiveRebel (based on JRebel) is the tool that could be used in conjunction with Jenkins, for instance. It can do safe hotpatching as well as rolling restarts while draining the sessions on the production cluster.
Technically, you CAN do this yourself, although I wouldn't recommend it, since it can get complicated quickly. The idea is that you create a ClassLoader and load the new version of your class with it, then make sure your executing code knows about the new ClassLoader.
I would recommend just using JBoss and redeploying your jars and wars. Nice and simple for the most part.
In either case, you have to make sure you don't have any memory leaks, because you'll run out of PermGen space after a few redeployments.
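A minimal, self-contained sketch of that ClassLoader idea, assuming a full JDK so the javax.tools compiler is available. Greeter is a made-up class, and each "version" gets its own throwaway loader; old versions aren't patched in place, they just become garbage once nothing references their loader.

```java
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class HotReloadDemo {
    // Compiles the given source as class "Greeter", loads it in a fresh
    // classloader, and calls its static greet() method reflectively.
    static String compileAndRun(Path dir, String source) throws Exception {
        Files.writeString(dir.resolve("Greeter.java"), source);
        int rc = ToolProvider.getSystemJavaCompiler()
                .run(null, null, null, dir.resolve("Greeter.java").toString());
        if (rc != 0) throw new IllegalStateException("compile failed");
        // A new loader per version: the running code never sees a stale copy,
        // because it always asks the newest loader.
        try (URLClassLoader loader =
                     new URLClassLoader(new URL[]{dir.toUri().toURL()}, null)) {
            Class<?> c = loader.loadClass("Greeter");
            return (String) c.getMethod("greet").invoke(null);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("hotreload");
        System.out.println(compileAndRun(dir,
                "public class Greeter { public static String greet() { return \"v1\"; } }"));
        // "Redeploy": same class name, new body, new loader.
        System.out.println(compileAndRun(dir,
                "public class Greeter { public static String greet() { return \"v2\"; } }"));
    }
}
```

The reflective call is what real systems hide behind an interface loaded by the parent loader; the leak warning above applies here too, since anything that keeps a reference into an old loader keeps every class it defined alive.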
Add RelProxy to your toolbox. RelProxy is a hot class reloader for Java (and Groovy) that improves development, and can even be used in production, with just one significant limitation: reloading is only possible in a subset of your code.
