I am about to create a Java server/client application that uses bidirectional communication via TCP sockets. For every connecting client a new thread is created. At the moment it is running on a virtual machine from a hosting service. Now I am thinking about using Docker. But does it really make sense to switch to Docker in this case? Is Docker really meant to run permanent applications like a Java server?
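For context, the setup described, one thread per connecting client, looks roughly like this (a minimal sketch; the class name, port and line-based echo protocol are illustrative, not the actual application):

```java
// EchoServer.java -- minimal sketch of a thread-per-client TCP server.
import java.io.*;
import java.net.*;

public class EchoServer {
    public static void main(String[] args) throws IOException {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9000;
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                Socket client = server.accept();          // blocks until a client connects
                new Thread(() -> handle(client)).start(); // one thread per client
            }
        }
    }

    static void handle(Socket client) {
        try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line); // bidirectional: read, then write back
            }
        } catch (IOException e) {
            // client disconnected; the thread simply ends
        }
    }
}
```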
I'm on thin ice here, but I'm going to say it anyway: if you're running a process on Linux, to most intents and purposes, the process is already running in a container.
Containers are "sugar" atop intrinsic Linux kernel features (namespaces, cgroups etc.). Solutions such as Docker Engine mostly made these somewhat arcane kernel capabilities more broadly available and easier to use.
Containers and VMs are very distinct technologies. Extending the above, you can run VMs in containers, and you're almost always running containers in VMs.
It's containers all the way down :-)
To answer your question directly: you are already running your Java server in a container, and it's running on a VM. You may decide to do one of two things, but please read up more on each before deciding:
Add Docker (Engine) into your existing VM (if it's not already there) as a way to more easily manage your Java server as a Docker container. Benefits: unclear but see below.
Extract the Java server from the VM (!) and run it instead as a Docker container. Benefits: unclear; consequences: may not be possible with your hosting company; potential security concerns; etc.
One benefit of using containers while continuing to deploy to your existing hosting provider (and their VMs) is that you could build and test in locations other than your hosting provider, and be (mostly) guaranteed that a container image that worked during build and test will also work in production on your hosting provider.
HTH!
Is it true to say that, to a high degree, what is now done in Docker containers could also be done in Java with the JVM if someone wanted to?
Besides letting you write an application in your own language and giving you much more flexibility for customisation, does Docker basically do what Java has been doing with its virtual machine for ages? I.e., it provides an execution environment separate from the underlying OS.
Generally, Docker containers cannot be done "within Java", because Docker serves to encapsulate the whole application, whereas "within Java" means code loaded after the JVM launches.
By the time the JVM parses a class and searches for its main method, the JVM is already running. So encapsulation at the process level can't be done from inside Java, because the process (the JVM) is already running.
Java has encapsulation techniques that do provide isolation between various Java elements (see the class loader hierarchies in Tomcat, for an example); but those only isolate "application plugins" from each other. The main process running them all is Tomcat, which is itself a program loaded into an already running JVM.
This doesn't mean you can't combine the two to achieve some objective; it just means that the kinds of isolation provided by the two products aren't interchangeable.
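To make the class-loader point concrete, here is a small sketch (the class names are mine, not Tomcat's) of the kind of isolation a class loader gives inside a running JVM: the same class bytes loaded by two different loaders yield two distinct, mutually incompatible classes.

```java
// LoaderDemo.java -- two class loaders defining the same bytes produce
// two distinct classes: isolation *within* an already running JVM.
import java.io.*;

public class LoaderDemo {
    // A loader that defines the named class from raw bytes itself,
    // instead of delegating to its parent first.
    public static class IsolatingLoader extends ClassLoader {
        private final String name;
        private final byte[] bytes;
        public IsolatingLoader(String name, byte[] bytes) {
            super(null); // no parent: only the bootstrap loader above us
            this.name = name;
            this.bytes = bytes;
        }
        @Override protected Class<?> findClass(String n) throws ClassNotFoundException {
            if (n.equals(name)) return defineClass(n, bytes, 0, bytes.length);
            throw new ClassNotFoundException(n);
        }
    }

    public static class Plugin {} // stands in for an "application plugin"

    // Read a class's bytecode back from the classpath.
    public static byte[] classBytes(Class<?> c) throws IOException {
        String res = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(res);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            in.transferTo(out);
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] b = classBytes(Plugin.class);
        String n = Plugin.class.getName();
        Class<?> a = new IsolatingLoader(n, b).loadClass(n);
        Class<?> c = new IsolatingLoader(n, b).loadClass(n);
        // Same bytes, two loaders: the JVM treats them as unrelated types.
        System.out.println(a != c && !a.isAssignableFrom(c)); // prints "true"
    }
}
```

This is the mechanism Tomcat builds on, and it is isolation between classes inside one process, not isolation of the process itself, which is what Docker provides.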
what is now done in docker containers can also be done in java with jvm if someone wanted to
Short Answer: No. You could wrap a docker container around your JVM, but you cannot wrap a JVM around a docker container, non-trivially.
docker basically do what Java has been doing with its virtual machine for ages ? i.e. it provides executable environment separate from the underlying OS.
Docker containers provide isolation from other containers without introducing a virtualisation layer. Thus, they are different and more performant than VMs.
Docker can do several things that the Java JVM cannot. However, programming in Java and running on the JVM already provides several of the advantages of running in a Docker container.
I work on a large Java project that is 20 years old. We have been evolving and fixing our application without any tooling or compatibility issues for all these years. Also, as a bonus, the application is platform independent: several of its components can run on both Windows and Linux. Because no effort was made initially to build a multiplatform application, there is one component that cannot run on Linux, but it would be relatively easy to make it work on that platform.
It would have been much more difficult to do the same thing using C or C++ and associated toolchain.
I am trying to wrap my head around Apache Mesos and need clarification on a few items.
My understanding of Mesos is that it is an executable that gets installed on every physical/VM server ("node") in a cluster, and then provides a Java API (somehow) that treats all the nodes as one collective pool of computing resources (CPU/RAM/etc.). Hence, programs coded against the Java API see only a single set of resources and don't have to worry about how/where the code is deployed.
So for one, I could be fundamentally wrong in my understanding here (in which case, please correct me!). But if I'm on target, then how does the Java API (provided by Mesos) allow Java clients to tap into these resources?!? Can someone give a concrete example of Mesos in action?
Update
Take a look at my awful drawing below. If I understand the Mesos architecture correctly, we have a cluster of 3 physical servers (phys01, phys02 and phys03). Each of these physical servers is running an Ubuntu host (or whatever). Through a hypervisor, say, Xen, we can run 1+ VMs.
I am interested in Docker & CoreOS, so I'll use those in this example, but I'm guessing the same could apply to other non-container setups.
So on each VM we have CoreOS. Running on each CoreOS instance is a Mesos executable/server. All Mesos nodes in a cluster see everything underneath them as a single pool of resources, and artifacts can be arbitrarily deployed to the Mesos cluster and Mesos will figure out which CoreOS instance to actually deploy them to.
Running on top of Mesos is a "Mesos framework" such as Marathon or Kubernetes. Running inside Kubernetes are various Docker containers (C1 - C4).
Is this understanding of Mesos more or less correct?
Your summary is almost right, but it does not reflect the essence of what Mesos represents. The vision of Mesosphere, the company behind the project, is to create a "Datacenter Operating System", and Mesos is its kernel, in analogy to the kernel of a normal OS.
The API is not limited to Java, you can use C, C++, Java/Scala, or Python.
If you have set up your Mesos cluster as you describe in your question and want to use your resources, you usually do so through a framework instead of running your workload directly on it. This doesn't mean it is complicated: there is a very small example in Scala which demonstrates this. Frameworks exist for multiple popular distributed data processing systems like Apache Spark and Apache Cassandra. There are other frameworks such as Chronos, a cron at datacenter level, or Marathon, which allows you to run Docker-based applications.
Update:
Yes, Mesos will take care of placement in the cluster, as that's what a kernel does: scheduling and management of limited resources. The setup you have sketched raises several obvious questions, however.
Layers below Mesos:
Installing Mesos on CoreOS is possible but, I think, cumbersome. This is not a typical scenario for running Mesos; usually it is moved to the lowest possible layer (above Ubuntu, in your case). So I hope you have good reasons for running CoreOS plus a hypervisor.
Layers above Mesos:
Kubernetes is available as a framework, and Mesosphere seems to be putting much effort into it. There is, without question, some overlap in functionality, especially with regard to scheduling. If you want to schedule basic container-based workloads, you might be better off with Marathon or, in the future, maybe Aurora. So here too I hope you have good reasons for this particular arrangement.
Sidenote: Kubernetes is similar to Marathon, but with a broader approach, and it is pretty opinionated.
I'm learning CoreOS/Docker and am trying to wrap my mind around a few things.
With Java infrastructure, is it possible to run the JVM in its own container and have other Java apps/services use this JVM container? If not, I'm assuming the JVM would have to be bundled into each container, so essentially you have to pull the Java Dockerfile and merge my Java services into it; essentially creating a Linux machine + Java + service container running on top of the CoreOS machine.
The only other thought I had was it might be possible to run the JVM on CoreOS itself, but it seems like this isn't possible.
It is actually possible to just untar Oracle Java in /opt, but that's just a kind of last resort. The Oracle binaries of the JRE and JDK don't require any system libraries, so they're pretty easy to run anywhere.
I have written some pretty small JRE and JDK images, with which I was able to run Elasticsearch and other major open-source applications. I also wrote some containers that allow me to compile jars on CoreOS (errordeveloper/mvn, errordeveloper/sbt & errordeveloper/lein).
As @ISanych pointed out, running multiple Java containers will not impact disk usage; it's pretty much equivalent to running multiple JVMs on the host. If you find that running multiple JVMs is not quite your cup of tea, then the answer is really that the JVM wouldn't have to be as complex as it is if containers had existed before it. However, Java in a container is still pretty good, as you can have one classpath that is fixed forever and you won't get into dependency hell. Perhaps instead of building uberjars (which is what I mostly do, despite them being known to be not exactly perfect, but I am lazy), one could bundle the jars in a tarball and then use ADD jars.tar /app/lib/ in their Dockerfile.
Applications that run on JVM will have to have JVM installed in the container. So if you want to split application components into separate containers, each of these containers need to have JVM.
On a side note, containers can talk to each other via a mechanism called container linking.
Best practice would be to create an image with the JVM and then base your other images on it (FROM jvm in the Dockerfile). Even if you create many different images, they will not waste much space, as Docker uses a layered file system and all containers will share the same single copy of the JVM image layers. Yes, each JVM will be a separate process eating its own memory, but isolated environments are exactly what Docker is for.
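As a sketch of that layering (image names, paths and base image are illustrative, not canonical):

```dockerfile
# Dockerfile.jvm -- base image, built once and shared by all app images
FROM debian:stable-slim
COPY jre/ /opt/jre/                 # e.g. an unpacked JRE under /opt
ENV PATH=/opt/jre/bin:$PATH

# Dockerfile.app -- per-service image; only this thin layer differs
FROM my-jvm-image                   # the image built from Dockerfile.jvm
COPY app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]
```

Every image built FROM my-jvm-image reuses its layers on disk; only the app.jar layer is stored per service.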
In our application, remote procedure calls are handled by our own Netty-based command dispatcher system. We have a lot of modules (about 20) and I want to run all modules in separate JVMs. My problem is that RMI spawns about 17 threads for each JVM. I do not need RMI at all (as far as I know).
Can I completely disable RMI for a JVM? Or at least configure it so that it does not use this many threads?
You are most likely looking at your JVMs with a monitoring application, right? Well: these monitoring applications use RMI. So you will always see RMI threads when using a monitoring application, profiler, etc., and you will always see them using some amount of CPU time. Gathering profiling information does not work without transporting that information (via RMI) to your tool. You can implement your own transport protocol and direct the management beans to use it, but I doubt you could save enough resources to recoup your development costs.
If you don’t use any RMI, RMI won’t start any threads. But if you are using it, even if you aren’t aware of it, disabling RMI means that your software will no longer work.
StuartMarks is correct. RMI doesn't start any threads until you use it.
Possibly you are using it in some way of which you are unaware, e.g. JMX?
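One way to see this for yourself (the class name and port below are mine, invented for the example): programmatically start the same kind of JMX-over-RMI connector that monitoring tools rely on, then look at the thread list. The RMI threads appear once the connector is up.

```java
// JmxRmiDemo.java -- the JMX remote connector transports management data
// over RMI; this is where the "mystery" RMI threads come from.
import java.lang.management.ManagementFactory;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

public class JmxRmiDemo {
    static Registry registry; // kept so the registry can be unexported later

    public static JMXConnectorServer start(int port) throws Exception {
        registry = LocateRegistry.createRegistry(port); // RMI registry for the connector
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:" + port + "/jmxrmi");
        JMXConnectorServer server =
                JMXConnectorServerFactory.newJMXConnectorServer(url, null, mbs);
        server.start(); // from here on, RMI threads are running
        return server;
    }

    public static void main(String[] args) throws Exception {
        JMXConnectorServer server = start(9010);
        long rmiThreads = Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().contains("RMI")).count();
        System.out.println("RMI threads after starting the connector: " + rmiThreads);
        server.stop();
        System.exit(0); // exported RMI objects otherwise keep non-daemon threads alive
    }
}
```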
Ok, I just need to understand this. I have a JVM installed on my machine. If I develop 2 programs (2 jars with their own main classes) and run them, will they both be running on the same JVM?
If they both run on the same JVM instance, how can I make them communicate?
The system I am currently working on has many components installed in one machine but communicate with each other using RMI. Isn't it impractical for these components to use RMI when they are all running on one machine?
If I develop 2 programs (2 jars with their own main classes) and run them, will they be both running on the same JVM?
Typically no, each will be run in its own JVM process (java), unless you start one from the other in a separate thread or something.
The system I am currently working on has many components installed in one machine but communicate with each other using RMI. Isn't it impractical for these components to use RMI when they are all running on one machine?
It is sort of impractical, or at least inefficient, given all the (de)serialization that takes place. (RMI uses object serialization to marshal and unmarshal parameters.)
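For completeness, here is roughly what such a local RMI round trip looks like (the interface and names are invented for the example). Even with both ends in one machine, the String argument and result pass through serialization, sockets and the RMI thread pool:

```java
// LocalRmi.java -- a minimal RMI call where client and server share one machine.
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

interface Greeter extends Remote {
    String greet(String name) throws RemoteException; // args/results are marshalled
}

class GreeterImpl extends UnicastRemoteObject implements Greeter {
    GreeterImpl() throws RemoteException { super(); }
    public String greet(String name) throws RemoteException {
        return "Hello, " + name;
    }
}

public class LocalRmi {
    public static void main(String[] args) throws Exception {
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("greeter", new GreeterImpl());

        // Even on the same machine, this lookup + call goes through
        // serialization and a socket rather than a plain method call.
        Greeter stub = (Greeter) LocateRegistry
                .getRegistry("localhost", 1099).lookup("greeter");
        System.out.println(stub.greet("world")); // prints "Hello, world"
        System.exit(0); // exported RMI objects otherwise keep the JVM alive
    }
}
```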
OSGi (Dynamic Module System for Java)
If you are running both locally and just need components to find each other, I suggest you make them into OSGi bundles. It has been engineered for this sort of use.
(As an example, this is how Eclipse IDE components and plugins interact with each other, while being loosely coupled, and without the unnecessary (de)serialization)
The two applications will run in different virtual machines if they are started separately (even if they run on the same physical machine). OSGi has already been mentioned as a way of bringing them together, but if you want to maintain them as separate applications it might be worth considering web services as a communication method. The benefit of this over RMI is interoperability with other applications and flexibility for future development.
You might also use the observer pattern if both applications run on the same machine -- a thought.