Experience with Java clustering?

I'd like to hear from people about their experience with Java clustering (i.e. implementing HA solutions), e.g. Terracotta, JGroups, etc. It doesn't have to be web apps; experience writing custom standalone servers would be great too.
UPDATE: To be a bit more specific, I'm not that interested in web app clustering (unless it can be pulled out and run standalone). I know it works, but we need a bit more than just session clustering. I'm examining solutions in terms of ease of programming, supported topologies (i.e. single data center versus over the WAN), number of supported nodes, issues faced, and workarounds. At the moment I am doing some proof-of-concept (POC) work on Terracotta and JGroups to see if it's worth the effort for our app (which is standalone, outside of a web container).

JBoss clustering was very easy to get up and running.
It seems to work well for us.

You might want to take a look at Hazelcast. It is a super-lightweight, easy and free clustering platform with a cluster API. If you are clustering your application state/data, Hazelcast can be a great help with its distributed/partitioned queue, map, set, list and lock implementations.
Regards,
-talip
http://www.hazelcast.com
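To give a feel for the API, here is a minimal sketch of sharing state across a Hazelcast cluster (hedged against the newer instance-based API; the map and queue names are made up for illustration):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;
import java.util.Queue;

public class HazelcastSketch {
    public static void main(String[] args) {
        // Starting a second instance with the same configuration (on this or
        // another machine) forms a cluster automatically via discovery.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Distributed, partitioned map: entries are spread across members.
        Map<String, String> sessions = hz.getMap("sessions"); // name is illustrative
        sessions.put("user-42", "some state");

        // Distributed queue: any member can offer/poll.
        Queue<String> tasks = hz.getQueue("tasks");           // name is illustrative
        tasks.offer("recalculate-index");

        hz.shutdown();
    }
}
```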

You may look at Oracle Coherence (formerly Tangosol Coherence).
http://www.oracle.com/technology/products/coherence/coherencedatagrid/coherence_solutions.html

I saw a demonstration of GridGain at our local JUG and I was very impressed. The documentation is very complete and it's very easy to get it going. I haven't started using it yet, so I can't quite say that it's working for us.
http://www.gridgain.com/

JBoss Cache is a standalone open source project that JBoss clustering makes use of in the application server.
Our company made use of it in our own custom network server; it's working well so far in development, though it has yet to be deployed.
It's a pretty simple API, and it comes in two flavors: a flat cache, or a "POJO Cache" that uses instrumentation to keep state across servers. Basically, updates to fields are propagated across the network using JGroups.
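Since the question also mentions evaluating JGroups on its own, here is a minimal sketch of the kind of group messaging JBoss Cache builds on (hedged against the JGroups 3.x-era API; the cluster name is made up):

```java
import org.jgroups.JChannel;
import org.jgroups.Message;
import org.jgroups.ReceiverAdapter;

public class JGroupsSketch {
    public static void main(String[] args) throws Exception {
        JChannel channel = new JChannel();           // default UDP/multicast protocol stack
        channel.setReceiver(new ReceiverAdapter() {  // callback for incoming messages
            @Override
            public void receive(Message msg) {
                System.out.println("received: " + msg.getObject());
            }
        });
        channel.connect("demo-cluster");                  // join (or form) the group
        channel.send(new Message(null, "state update"));  // null destination = broadcast
        Thread.sleep(1000);                               // give the message time to arrive
        channel.close();
    }
}
```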

Related

Usability: How do I provide & easily deploy a (preferably node.js + MongoDB based) server backend for my users?

I'm currently planning an application (brainstorming, more or less) designed to be used in small organizations. The app will require synchronization with a backend server, e.g. for user management and some advanced, centralized functionality. This server has to be hosted locally and should be able to run on Linux, Mac and Windows. I haven't decided how I'm going to realize this, mainly because I simply don't know which approach would be the smartest.
Technically speaking, a very interesting approach seemed to be node.js + mongoose, connecting to a local MongoDB. But this is where I'm struggling: how do I ensure that it's easy and convenient for an organization's IT to set this up?
Installing node.js + MongoDB is tedious work and far from standardized and easy. I don't have the resources to provide a detailed walkthrough for every major OS and configuration, or to take over the setup myself. Ideally, the local administrator should run some sort of setup on the machine used as the server (a "regular" PC running 24/7 should suffice) and have the system up and running, similar to the way some games provide executables for hosting small game servers for a couple of friends (Minecraft, for instance).
I also thought about Java EE, though I haven't dug into any details here. I'm unsure whether this is really an option.
Many people suggest outsourcing the backend (BaaS), e.g. to parse.com or similar services. This is not an option, since it's mandatory that the backend be hosted locally.
I'm sorry if this question is too unspecific, but unfortunately, I really don't know where to start.
I can give you advice from both the sysadmin's side and the developer's side.
Sysadmin
Setting up node.js is not a big task. Setting up MongoDB correctly is. But that is not your business as an application vendor, especially not when you are a one-man-show FOSS project, as I assume. It is an administrator's task to set up a database, so let them do it. Just tell them what you need, maybe point out security concerns, and any capable sysadmin will do the job and set up the environment.
There are some things you underestimate, however.
Applications, especially useful ones, tend to get used. MongoDB has many benefits, but being polite about resources isn't exactly one of them. So running on a surplus PC may work in a software development or visual effects company, where every workstation has plenty of memory, but in an accounting firm your application will run short of resources quite fast. Do not make promises like "it will run on your surplus desktop" until you are absolutely, positively sure about it, because you have done extensive load tests to make sure you are right. Any sensible sysadmin will monitor the application anyway and scale resources up when necessary. But when you make such promises and break them, you lose the single most important factor for software: the users' trust. Once you lose it, it is very hard to get it back.
Developer
You really have to decide whether MongoDB is the right tool for the job. As soon as you have relations between your documents, where a change in one document has to be reflected in others, you have to be really careful. Ask yourself whether your decision rests on a rational, educated basis. I have seen projects implemented with NoSQL databases that would have been way better off with a relational database, just because NoSQL is some sort of everybody's darling.
It is a FAR way from node.js to Java EE. The concepts of Java EE are not necessarily easy to grasp, especially if you have little experience with application development in general and with Java in particular.
The Problem
Without knowing anything about the application, it is very hard to make a suggestion or give you advice. Why exactly does MongoDB have to be local? Couldn't it be done with a VPC? Is it a web app, desktop app or server app? Can the source code be disclosed or not? How many concurrent users per installation can be expected? Do you want a modular or monolithic app? What are your communication needs? What is your experience with programming languages? It is all about what you want to accomplish and which services you want to provide with the app.
Simple and to the point: Chef (Chef Solo for Vagrant) + Vagrant.
Vagrant provides a uniform environment that can be as close to production as you want, and Chef provides provisioning for those environments.
This repository is very close to what you want: https://github.com/TryGhost/Ghost-Vagrant
There are hundreds of thousands of chef recipes to install and configure pretty much anything in the market.

Java Web Application for ~5000 Users

For the first time (hopefully not the last) in my life I will be developing an application that has to handle a high number of users (around 5000) and manage lots of data. I have developed an application that manages lots of data (around 100 GB, not so much by many of your standards), but the user count was pretty low (around 50).
Here is the list of tools / frameworks I think I will be using:
Vaadin UI framework
Hibernate
PostgreSQL
Apache Tomcat
Memcached (for session handling)
The application will mainly be run inside a company network. It might be run on a cluster of servers or not, depends on how much money the company wants to spend to make its life easier.
So what do you think of my choices and what should I take caution of?
Cheers
The answer, as with all performance/scaling related issues is: it depends.
There is nothing in your frameworks of choice that would lead me to think it wouldn't be able to handle a large amount of users. But without knowing what exactly you want to do or what your budget is, it's impossible to pick a technology.
To ensure that your application will scale/perform, I would consider the following:
Keep the memory footprint of each session low. For example, caching stuff in the HttpSession may work when you have 50 sessions, but it is not a good idea when you have 5000.
Do as much work as you can in the database to reduce the amount of data being moved around (e.g. when displaying tables with lots of rows, make sure the paging is done in the database rather than fetching 10,000 rows back into Tomcat and then picking the first 10); see the paging sketch after this list.
Try to minimise the state that has to be kept inside the HttpSession; this makes it easier to cluster.
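As a hedged illustration of database-side paging with this stack (Hibernate over PostgreSQL; the Order entity and the page size are made up):

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class OrderPaging {
    // Order is a hypothetical mapped entity. Hibernate translates
    // setFirstResult/setMaxResults into LIMIT/OFFSET on PostgreSQL, so only
    // `pageSize` rows ever travel from the database to Tomcat.
    @SuppressWarnings("unchecked")
    public static List<Order> page(SessionFactory factory, int pageNumber, int pageSize) {
        Session session = factory.openSession();
        try {
            return session.createQuery("from Order o order by o.id")
                          .setFirstResult(pageNumber * pageSize)
                          .setMaxResults(pageSize)
                          .list();
        } finally {
            session.close();
        }
    }
}
```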
Probably the most important recommendations:
Use load-testing tools to simulate your peak load and beyond, and test. JMeter is the tool I use for performance/load testing.
When load testing, ensure:
That you actually use 5000 users (so that 5000 HttpSessions are created) and use a wide range of data (to avoid always hitting the cache).
EDIT:
I don't think 5000 users is that much, and you may find that (performance-wise) you only need a single server to handle the load (depending on the size of the server and the results of the load testing, of course; you may still want a clustered solution for failover anyway). Not every one of your 5000 users will be hitting a button concurrently; you'll mostly see the load go up in the morning when everyone logs in.
You might want to consider an Apache HTTP server in front of your Tomcat servers. Apache will provide compression, static content caching, load balancing and SSL.
Any reason for not using Spring? It has really become a de facto standard in enterprise Java applications.
Spring provides an incredibly powerful and flexible collection of technologies for enterprise Java application development, and it is used by millions of developers.
Spring is lightweight and can sit as a middle layer connecting Vaadin and Hibernate, thereby creating a clean separation of layers. Spring's transaction management is also superior to Hibernate's own. I suggest you go for it unless you have a strong reason stopping you.
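As a hedged sketch of what that middle layer might look like (the OrderService name and query are made up; declarative transactions assume a configured Hibernate transaction manager):

```java
import java.util.List;
import org.hibernate.SessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Service layer the Vaadin UI talks to; Hibernate stays hidden behind it.
@Service
public class OrderService {

    private final SessionFactory sessionFactory;

    @Autowired
    public OrderService(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // Spring opens, commits or rolls back the transaction around this method.
    @Transactional(readOnly = true)
    @SuppressWarnings("unchecked")
    public List<Order> findRecentOrders(int max) {
        return sessionFactory.getCurrentSession()
                             .createQuery("from Order o order by o.id desc")
                             .setMaxResults(max)
                             .list();
    }
}
```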
Since you asked people to weigh in, I won't hold back my opinion. ORMs in general, and Hibernate in particular, are an anti-pattern. I know, I've worked in shops that use Hibernate over the past 9 years. Knowing what I know now, I will never use it again.
I highly recommend this blog post, as it puts it more succinctly than I can:
ORM is an anti-pattern
But forgive me if I quote the bit from that blog about ORMs and anti-patterns:
The reason I call ORM an anti-pattern is because it matches the two criteria the author of AntiPatterns used to distinguish anti-patterns from mere bad habits, specifically:
It initially appears to be beneficial, but in the long term has more bad consequences than good ones.
An alternative solution exists that is proven and repeatable.
Your other technology choices seem fine. Personally, I lean more toward Jetty than Tomcat. There's a reason that Google embeds it in a lot of their projects (think GWT and PlayN); it's a younger codebase and I think more actively developed now that Eclipse has taken it over. Just my humble opinion.
[UPDATE] One more link; it's a very long read, but when making architectural decisions, reading is good.
Object/Relational Mapping: The Vietnam of Computer Science
I recommend GlassFish as the application server: Apache Tomcat can only serve simple (servlet/JSP) content, while GlassFish has a full implementation of the Java EE specification.
Depending on your specification and future goals, I would perhaps leave the normal version of Tomcat and go for Apache TomEE or, my personal preference, JBoss. As I understand it, EJBs are not very well supported in the normal Tomcat version, and that is probably something sweet to have when you want to create a couple of services, a clustered singleton service and other such things. But this is just my personal preference of course, and if your specification will not allow a more advanced EE server, then you should stick with the slick Tomcat.

RPG (iSeries) Modernization using JTOpen - What is possible?

In the near future we will be implementing a solution to modernize our iSeries applications, which are written as RPG programs with some stored procedures, and our preferred way is to leverage the latest and greatest of what Java has to offer in this space.
From googling and checking other questions here on Stack Overflow, JTOpen seems to be the de facto library/toolset that has worked for most, and I was encouraged to see that Tomcat runs on an iSeries box without any issues.
With this as the background, I am thinking of the following as the high-level solution architecture:
Install the IBM JRE and use JTOpen's capabilities to invoke RPG programs and, in some cases, directly call the stored procedures running on DB2 (see the sketch after this list).
Have Tomcat host a modern web application built using Grails and other frameworks (Camel, Smooks) to provide an application logic layer that handles any mediation and transformation required for the old functionality to be offered to the user from a browser.
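For reference, calling an RPG program through JTOpen looks roughly like the sketch below (the host, credentials, library and program names are made up; the parameter layout has to match the RPG program's parameter list):

```java
import com.ibm.as400.access.AS400;
import com.ibm.as400.access.AS400Text;
import com.ibm.as400.access.ProgramCall;
import com.ibm.as400.access.ProgramParameter;

public class RpgCallSketch {
    public static void main(String[] args) throws Exception {
        AS400 system = new AS400("myibmi.example.com", "USER", "PASSWORD"); // hypothetical

        // One 10-char input parameter and one 50-char output parameter,
        // matching what the hypothetical RPG program MYPGM expects.
        AS400Text in10 = new AS400Text(10, system);
        ProgramParameter[] parms = new ProgramParameter[] {
            new ProgramParameter(in10.toBytes("CUST001")),  // input
            new ProgramParameter(50)                        // output buffer
        };

        ProgramCall call = new ProgramCall(system, "/QSYS.LIB/MYLIB.LIB/MYPGM.PGM", parms);
        if (!call.run()) {
            // The message list explains failures (authority, parameter mismatch, ...).
            System.err.println(call.getMessageList()[0].getText());
        } else {
            AS400Text out50 = new AS400Text(50, system);
            System.out.println("result: " + out50.toObject(parms[1].getOutputData()));
        }
        system.disconnectAllServices();
    }
}
```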
Questions:
If any of you have been involved in such an exercise, please share the pitfalls of this approach.
Is there a significant performance drop with respect to response times for the end user?
Would it be better to somehow expose the JT400 code as web services and run the web app on a different machine altogether, consuming these web services?
Be very careful with calling RPG from Java, because RPG is not thread-safe without some changes.
When I was at COMMON, the product I felt was the best on the market was Profound UI. There are several others from a variety of vendors. Most of these products do not use Java; Java on the i tends to be slow. (There are things that can be done to make it faster, but native is always faster.) You'll pay the price for these products, but just imagine how much time it would take you to do this yourself. For the above, I was quoted in the $20+ thousand range, but like all IBM i products, prices vary greatly based on the system.
To directly answer your questions:
I have been doing research on modernization as time allows; the products weren't quite there yet (at the time I looked, before COMMON 2011) for what we wanted to use them for. Now it looks like they might work.
This really depends on your system. A newer system will have fewer problems than an older one. The web will always be slower than the green screen: heads-down data-entry people won't like it, while executives and younger people will love it.
Your slow point is running the business logic. It wouldn't matter which server the HTML is coming from.
I've found that, for all practical purposes, an AS/400 behaves like an AIX box when seen from Java code, and you must use jt400 (JTOpen) to communicate with AS/400-specific features like data queues, files, etc. This works pretty well, but the slowness of invoking the JVM pushes Java-based solutions toward being long-running.
Note also that QTEMP is generally unavailable as a mechanism to keep state due to the nature of prestarted jobs.
Under V6R1, Java 6 is available and runs pretty well in the "new technology" edition. You can then run almost all Java-based solutions in it, including web servers like Jetty. Note that Java defaults to code page 819 when accessing IFS files directly; Windows clients using the AS/400 as a network drive use a compatible code page.
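To make the data queue remark concrete, here is a hedged JTOpen sketch of writing to and reading from a data queue (the host, library and queue names are made up; the *DTAQ object is assumed to exist already):

```java
import com.ibm.as400.access.AS400;
import com.ibm.as400.access.AS400Text;
import com.ibm.as400.access.DataQueue;
import com.ibm.as400.access.DataQueueEntry;

public class DataQueueSketch {
    public static void main(String[] args) throws Exception {
        AS400 system = new AS400("myibmi.example.com", "USER", "PASSWORD"); // hypothetical

        // A pre-created data queue; an RPG job on the other end can read
        // what Java writes here, and vice versa.
        DataQueue queue = new DataQueue(system, "/QSYS.LIB/MYLIB.LIB/REQUESTS.DTAQ");

        AS400Text text = new AS400Text(80, system);    // fixed 80-byte entries
        queue.write(text.toBytes("PROCESS CUST001"));  // enqueue a request

        DataQueueEntry reply = queue.read(10);         // wait up to 10 seconds
        if (reply != null) {
            System.out.println("got: " + text.toObject(reply.getData()));
        }
        system.disconnectAllServices();
    }
}
```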

Monitoring a Java web application - is JMX the right choice?

We have a Java web application and we'd like to set up some basic monitoring with a view to expanding this monitoring in future. Our plan is as follows:
(1) Collect generic information (e.g. memory and threads) about the virtual machine of the web container that the application is running in.
(2) Monitor the "state" of the application. This is rather vague but at the least we'd like to see if the web application is still alive and can respond to requests.
(3) In the future we'd like to collect more information that is specific to our application. Again this is rather vague but you can assume that we might want to make certain statistics collected internally by the application available to the support staff.
Usually the web application will be deployed in a Tomcat 5.5 or 6 environment. A quick bit of searching on the web shows that JMX can be enabled for Tomcat and that JConsole can then be used to connect to the server. This gives us lots of basic information that solves point (1). Also, some information is available in the MBeans section for "Catalina" and drilling down on this I can at least, for example, see how many requests a particular servlet has received. This is not quite what we want for point (2) but at least gives us some information. There seems to be quite a lot of information there but it's rather difficult to interpret using JConsole. Perhaps there is a better tool for interpreting the MBeans exposed by Tomcat.
For point (3), it seems at first glance that we could write our own MBeans and then make these available to something like JConsole. Personally, this would involve learning about JMX, which I'm quite happy to do, but I have a concern. Having looked around, I notice that most of the textbooks on the subject haven't been updated for several years and the open source tools seem to be languishing without recent updates. So my main question is a simple one: what are your opinions on JMX? Does it have a future, or has it been superseded by something else? Given that we already have our web application but are starting from scratch on the management console, should we choose JMX, or is there something more appropriate with a better future ahead of it?
I ask this question with no personal axe to grind; I'm simply interested to hear your opinions and experiences. I'm sure there's no single correct answer, but I think an informed discussion would be useful.
Thanks in advance,
Adam.
JMX is certainly a good solution here. I wouldn't worry about it languishing. Most enterprises I've worked for recently use (or have plans to use) JMX, and I'd have to hear a pretty convincing argument before choosing something else in the Java world. It's easy to write clients (monitoring solutions) for it and you can return complex data very easily indeed. Most 3rd party components support monitoring via JMX as well.
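For point (3), a custom MBean is only a few lines. Here is a minimal sketch, assuming a made-up com.example domain and a statistic the application already tracks:

```java
// RequestStatsMBean.java -- the management interface (must be public and named
// <implementation class> + "MBean" for a Standard MBean).
public interface RequestStatsMBean {
    long getRequestCount();
}

// RequestStats.java -- the implementation, registered with the platform MBeanServer.
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RequestStats implements RequestStatsMBean {
    private final AtomicLong requestCount = new AtomicLong();

    public void increment() { requestCount.incrementAndGet(); } // called from the request path

    @Override
    public long getRequestCount() { return requestCount.get(); }

    public static void main(String[] args) throws Exception {
        RequestStats stats = new RequestStats();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // The "com.example" domain and the type name are arbitrary; pick your own convention.
        server.registerMBean(stats, new ObjectName("com.example:type=RequestStats"));
        stats.increment();
        Thread.sleep(Long.MAX_VALUE); // keep the JVM alive so JConsole/VisualVM can attach
    }
}
```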
Note that you may want to consider integration with any existing management solutions (e.g. Nagios, BMC Patrol, HP OpenView, etc.) as well. They may not be so Java-aware, but rather prefer tests like simple HTTP connectivity for checking whether a website is up (easy using Nagios), or integration using SNMP (which OpenView talks natively).
If applicable to your situation (a Java 6 update 10 JDK or later, running on the same machine), then consider using jvisualvm instead, as it can dig even deeper than JConsole.
You may find that the easiest way to do what you need is a jvisualvm plugin that knows about your application.

Any experience using Terracotta open source?

Does anybody have experience using the open source offering from Terracotta as opposed to their enterprise offering? Specifically, I'm interested in whether it is worth the effort to use Terracotta without the enterprise tools to manage your cluster.
Oversimplified usage summary: we're a small startup with a limited budget that needs to process millions of records and scale to hundreds of thousands of page views per day.
I am in the process of integrating Terracotta into my project (a sensor node network simulator). About three weeks ago I found out about Terracotta from one of my colleagues, and now my application takes advantage of grid computing using Terracotta. Below I have summarized some essential points of my experience with it.
The Terracotta site contains pretty detailed documentation. The Concept and Architecture Guide is probably a good starting point for a developer.
When you are stuck on a problem and have found no answer in the documentation, the Terracotta community forum is a good place to ask questions. The Terracotta developers seem to check it on a regular basis and are pretty responsive.
Even though Terracotta runs under the JVM and it is advertised that making your application run in a cluster is only a matter of configuration, be prepared that it may require some serious changes to your application to make it perform reasonably well. For example, I had to completely rewrite the synchronization logic of my application.
Good integration with Eclipse.
The Admin Console is a great tool and it helped me a lot in tweaking my application to perform decently under Terracotta. It collects every performance metric from servers and clients you can think of. It certainly has some GUI-related issues, but who doesn't :-)
Prefer the standard Java synchronization primitives (synchronized/wait/notify) over the java.util.concurrent.* citizens. I found that the standard primitives provide more flexibility (they can be configured to be a read or write cluster lock, or even not a lock at all) and are easier to track in the Admin Console (you see the class name of the object being locked rather than, e.g., some ReentrantLock).
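As a hedged illustration of that last point: with Terracotta DSO, plain synchronized/wait/notify on a shared object is what gets turned into cluster-wide locking. The class below is made up; the shared instance would be declared a root and its synchronized methods auto-locked in tc-config.xml:

```java
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical work queue shared across the cluster: tc-config.xml would
// declare an instance of this class as a Terracotta root and configure
// autolocks on the synchronized methods, turning them into cluster-wide locks.
public class SharedWorkQueue {
    private final Queue<String> items = new LinkedList<String>();

    public synchronized void put(String item) {
        items.add(item);
        notifyAll();                 // wakes up waiting consumers (on any node, once clustered)
    }

    public synchronized String take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();                  // releases the lock while waiting
        }
        return items.remove();
    }
}
```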
Hope that helps.
At the moment, the Terracotta enterprise tools provide only a few features beyond the open source version around things like visualization and management (like the ability to kick a client out of the cluster). That will continue to diverge and the enterprise tools are likely to boast more operator-level functionality around things like managing and monitoring, but you can certainly manage and tune an app even with the open source tools.
The enterprise license also gives you things like support, indemnification, etc which may or may not be as important to you as the tooling.
I would urge you to try it for yourself. If you'd like to see an example of a real app using Terracotta, you should check out this reference web app that was just released:
The Examinator
You may want to take a look at JBoss Cache/PojoCache, which is an in-memory distributed caching solution. The difference is that it uses a simple API to propagate objects across your 'cluster' of caches, whereas Terracotta works at the classloading/JVM level.
(They don't actually have their own JVM; they modify classes as they are loaded to make them 'clusterable'.)
Our company has had a lot of luck with JBoss Cache; I'd recommend checking it out.
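For a sense of how simple the API is, here is a minimal flat-cache sketch against the JBoss Cache 2.x-era API (the Fqn and keys are made up; a replicated configuration file would be needed for actual clustering):

```java
import org.jboss.cache.Cache;
import org.jboss.cache.CacheFactory;
import org.jboss.cache.DefaultCacheFactory;
import org.jboss.cache.Fqn;

public class JBossCacheSketch {
    public static void main(String[] args) {
        // createCache() starts a cache with the default (local) configuration;
        // passing a replicated config file instead lets peers find each other
        // via JGroups and share the same tree of data.
        CacheFactory<String, String> factory = new DefaultCacheFactory<String, String>();
        Cache<String, String> cache = factory.createCache();

        Fqn node = Fqn.fromString("/customers/42");  // tree-style address of a cache node
        cache.put(node, "name", "Alice");            // replicated to peers when clustered
        System.out.println(cache.get(node, "name"));

        cache.stop();
    }
}
```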
Update
What I see in the OP's message is: "well, I don't really know what we need (thus the lack of detailed requirements), but maybe some enterprisey tool will magically solve all our problems, known and unforeseen? That would be awesome!"
With an architectural approach like this, it's not gonna fly. No success stories from Terracotta would change that.
OSS is beneficial when the community around it can replace the commercial support. Suppose you have a problem in production. The community cannot help -- it's too small for an obscure product like this. Servers are down, the business is in danger. You see? You need a commercial license up front. No money? Well, then you're not a business, and probably not gonna become one (if nobody's willing to invest in it).
Sorry for interrupting your day-dreaming.
IMHO:
Terracotta is a clustering solution. Clustering is required for large, enterprise-grade applications. Large applications mean big budgets. Big budgets mean you can afford a commercial license from Terracotta.
To put it another way: if you don't have the budget to buy it, it's probably not beneficial for your project.
