Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
It seems the hype about cloud computing cannot be avoided, but the actual transition to that new platform is subject to many discussions...
From a Theoretical viewpoint, the following can be said:
Cloud:
architectural change (you might not install anything you want)
learning curve (because of the above)
no failover (since failure is taken care of)
granular cost (pay per Ghz or Gbyte)
instantaneous scalability (not so instantaneous, but at least transparent?) ? lower latency
Managed:
failover (depends on provider)
manual scalability (requires maintenance)
static cost (you pay the package , whether you use it fully or not)
lower cost (for entry- packages only)
data ownership ( you do )
liberty ( you do ) ? lower latency ( depends on provider)
Assuming the above is correct or not; Nevertheless, a logical position is "it depends.." .. on the application itself.
Now comes the hidden question: how would you profile your j2ee app to determine if it is a candidate to cloud or not; knowing that it is
a quite big app in number of services/ functions (i.e.; servlets)
relies on a complex database (ie. num. tables)
doesn't need much media resources, mostly text based
"Now comes the hidden question: how would you profile your j2ee app to determine if it is a candidate to cloud or not; knowing that it is"
As an aside, make that the Explicit question. Make it the TITLE of this question. Put it in the beginning of the question. If possible, delete all of your assumptions, and focus on the question.
Here's what we do.
Call some vendors for your "cloud" or "managed service" arrangement. Not too many. One or two of each.
Ask them what they support. More importantly, what they don't support.
Then, given a short list of features that aren't supported, look at your code for those features. If they don't support things you need, you have some architecture work to do. Or cross them off your preferred vendor list.
For good vendors, write a pilot contract that gives you free (or cheap) access for a few months to install and test. If it doesn't work, you haven't paid much.
"But why go through the expense of trying to host it when it may not work?"
What expense? You can spend months "studying" your code. Or you can try to host it. Usually, the "try to host it" will turn up an answer within a few days. It's less effort to just do it.
What sort of Cloud Service are you talking about? IaaS, PaaS, DaaS ?
architectural change (you might not install anything you want)
Depends: moving from a "managed server" to a Platform (e.g. GAE) might be.
learning curve (because of the above)
Amazon EC2 might not be a big learning curve if you are used to running your own server
no failover (since failure is taken care of)
Depends: EC2 -> you have to roll your own
instantaneous scalability (not so instantaneous, but at least transparent?) ? lower latency
Depends: EC2 -> you have to plan for this / use an adjunct service
There are numerous cloud providers and as far as I've seen there are two main types of them:
Ones that support a cloud computing platform (e.g. Amazon E2C, MS Azure)
Virtual instance providers providing you ability to create numerous running instances (e.g. RightScale)
As far as the platforms, be aware that relational database support is still quite poor and the learning curve is quite long.
As for virtual instance providers the learning curve is really little (you just have to fire up your instances), but instances need some way of synchronizing... for a complex application this might not work.
As for your original question: I don't think there's any standard way you could profile wether an application should / could be moved to the cloud. You probably need to familiarize yourself with the options, narrow down to a few providers and see if the benefits you would get to them would be of any significant win over managed hosting (which you're probably currently doing).
In some ways comparing google app engine (gae) and amazon ec2 is like comparing apples and oranges.
With ec2 you get an operating system, with or without installed server software (tomcat, database, etc.; your choice, depending on which ami you choose). With ec2 you need a (or be a) system administrator to keep things running smoothly. Load balancing on ec2 is something you'll have to figure out and implement; I've never done that part. The big advantage with ec2 is that you can spin up and down new instances programatically, and compared to a regular web server provider, only pay for when your instance is up and running. You use this "auto spin up/down" to implement your load balancing and failover. But you have to do the implementation (again, I have no experience with this part).
With google app engine (gae), all of the system administration is taken care of for you. It also automatically scales out as needed, both on the app side and the database side. You also only pay for what you use; an idle app that gets no hits incurs no costs. The downsides to gae are that you're restricted in the languages you can use; python and java (or things that run on the jvm, like jruby). An even bigger downside is that the database is not sql (they don't call it a database; they call it the datastore) and will likely require reworking your ddl if you have an existing database; it's a definite cost in programmer time to understand how it works and how to use it effectively.
So if you're starting from scratch (or willing to rewrite) and you have the resources and time to learn its ways, gae may be the way to go. If you have a sysadmin or sysadmin skills and have the time or know how to set up load balancing and failover then ec2 could be the way to go. If your app is going to be idle a lot then ec2 is expensive; last time I checked it was something like $70 a month to leave a small instance up.
I think the points you have to focus are:
granular cost (pay per Ghz or Gbyte)
instantaneous scalability (not so instantaneous, but at least transparent?) ? lower latency
Changing your application to run on a cloud will take a good amount of time, but it will not really matter if the cloud don't lower your costs and/or you don't really need instantaneous/fast scalability (the classic example are eCommerce app)
After considering these 2 points. The one IMO you should think about is relies on a complex database (ie. num. tables), since depending on its "conplexity", changing to a cloud environment can be really troublesome.
Related
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 9 years ago.
Improve this question
I have a web application - a simple web application archive file - that having several storage adapters for different storage types ie. MongoDB and CouchDB. By using this application I can store/query data to those databases with the web services I have written. Currently I can only have one single database instance per application, cannot have more than one which prevents me having parallel processing.
What I want is to run my application on several machines. And above of those, I want to write a UI enabling the Client to store/query the data without knowing the database types/addresses.
I have two different scenarios and wanted to ask you which one of them is a better way to do it and why.
1) Let's say I have three servers running three single databases - couchdb. I can upload my application to those servers and then with the help of my UI or a layer above my application I can define a map of servers so that I can store and query the data.
As you see above, database and application lies in the same server, so they are remote.
2) Let's say three servers are still running remotely but in this case my application is local. And I enabled it to accept several database instances.
I actually prefer the first one since in that case I won't need to extend my application but I wanted to hear what you think about it. I will be glad if you can provide some sources for that kind of distributed scenarios - I had no experience at all on that kind of stuff.
Please take a look on article, which is describing of Instagram architecture. It's quite interesting to know how 3 engineers handled 15-25 millions of user with 150 millions of photos per day.
Also I would recommend interesting blog, which describes different scalability solutions for popular web-resources:
Facebook: An Example Canonical Architecture for Scaling Billions of Messages
Tumblr Architecture - 15 Billion Page Views a Month and Harder to Scale than Twitter
There are lots of information.
But the most common things are:
keep everything as easy as it's possible
use well-known, mature technologies, which have good support of the community
scale each part of application separately
And even though you may find explanations of each these, I'd like to focus on the last one according to your requirements.
When you want to make your application horizontally scalable, you need to consider each of clusters as separate logical module, regardless on actual number of servers, involved into cluster. F.e. for your web-application you can setup several instances of that application and set a load-balancer before them. Thus users can access single entry point (e.g. http://mysite.com), meanwhile actual instance may be arbitrary.
If you need to collaborate instances between each other, then you need to avoid in-memory storage, but use "key-value" storages, such as Redis, along with Messages Brokers, such as ActiveMQ, RabbitMQ or cloud version Iron.IO etc.
Datastorage you also need to consider as single entry point, e.g. sharded cluster (f.e. MongoDB supports out-of-box auto-sharding, and most of NoSQL solutions also have it - CouchDB, HBase).
So basically you call some shard-controller, which according specific shard-key redirects to corresponding instance. But please note, that usually sharding might be quite non-trivial thing, therefore in most cases when you deal with RDBMS you need to use vertical scalability.
Considering everything above I would suggest you such structure:
For sure ideally all the servers must be near each other physically (f.e. in the same data-centre). But if you are going to use your application as World-wide, then you need to shard your instances according less latency. Here is quite interesting lecture about server's configuration (even though it's about MongoDb, I believe some approaches might be helpful in your case as well): https://www.youtube.com/watch?v=TZOH92mZIN8
But if do not need to use all your servers for distributed "map/reduce" computing, and for getting result you need only one particular server's instance, in that case I believe scenario #1 is fairly suitable and better for your needs (in case if you setup load-balancer before your instances).
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
Does anyone have any real world experience with Hazelcast distributed data grid and execution product? How has it worked for you? It has an astonishingly simple API and functionality that seems almost to good to be true for such a simple to use tool. I have done some very simple apps and it seems to work as advertised so far. So here I am looking for the real world 'reality check'. Thank you.
We've been using it in production since version 1.8+, using mainly the distributed locking feature. It works great, we've found a couple of workarounds/bugs, but those were fixed relatively fast.
With 1.8M locks per day we found no problems so far.
I recommend start using version 1.9.4.4.
There are still some issues still with its development,
http://code.google.com/p/hazelcast/issues/list
Generally, you can choose to either let it use its own multicast algorithm or specify your own ip's. We've tried it in a LAN environment and it works pretty well. Performance wise it's not bad but the monitoring tool didn't work very well as it failed to update most of the time. If you can live with the current issues then by all mean go for it. I would use it with caution but it's a great working tool IMHO.
Update:
We've been using Hazelcast for a few months now and it's working very well. The settings are relatively easy to set up and with the new updates, are comprehensive enough to customize even small things like the number of threads allowed in read/write operations.
We are using Hazelcast (1.9.4.6 now) in production integrated with a complicated transactional service. It was added to alleviate immediate database throughput issues. We have discovered that we frequently have to stop it bringing down all transaction services for at least an hour. We are running clients in superclient mode because it is the only option that even remotely meets our performance requirements (about 4 times faster than native clients.) Unfortunately stopping a superclient node causes split brain issues and causes the grid to lose records, forcing a complete shutdown of services. We have been trying to make this product work for us for almost a full year now, and even paid to have 2 hazelcast reps flown in to help. They were unable to produce a solution, but were able to let us know that we were probably doing it wrong. In their opinion it should work better but it was pretty much a wasted trip.
At this point we are on the hook for over 6 figures per year in licensing fees and we are currently using about 5 times the resources to keep the grid alive and meet our performance needs than we would be using with a clustered and optimized database stack. This was absolutely the wrong decision for us.
This product is killing us off. Use with caution, sparingly, and only for simple services.
If my own company and projects count as real world, here's my experience. I wanted to get as close to eliminating external (disk) storage in favor of limitless and persistent "RAM". For starters that eliminates CRUD plumbing which sometimes makes up to 90% of the so-called "middle tier". There are other benefits. Since RAM is your "database" you don't need any complex caches or HTTP session replication (which in turn eliminates ugly sticky session technique).
I believe RAM is the future and Hazelcast has everything to be an in-memory database: queries, transactions, etc. So I wrote a mini-framework abstracting it: to load data from the persistent storage (I can plugin anything that can store BLOBs - the fastest turned out to be MySQL). It is too long to explain why I didn't like Hazelcast's built-in persistence support. It's rather generic and rudimentary. They should remove it. It is not rocket science to implement your own distributed and optimized write-behind and write-through. Took me a week.
Everything was fine until I started performance-testing. Queries are slow - after all of the optimizations I did: indexes, Portable serialization, explicit comparators, etc. A simple "greater than" query on an indexed field takes 30 seconds on the set of 60K of 1K records (map entries). I believe Hazelcast team did everything they could. As much as I hate to say it, Java collections are still slow compared to super-optimized C++ code normal databases use. There are some open-source Java projects that address that. However at this time query persistence is unacceptable. It should be instant on a single local instance. It is an in-memory technology after all.
I switched to Mongo for the database, however left Hazelcast for shared runtime data - namely sessions. Once they improve query performance I'll switch back.
If you have alternatives to hazelcast maybe look at these first. We have it in running production mode and it is still quite buggy, just check out the open issues.
However, the integration with Spring, Hibernate etc. is quite nice and the setup is really easy :)
We use Hazelcast in our e-commerce application to make sure that our inventory is consistent.
We use extensive use of distributed locking to make sure SKU Items of inventory are modified in atomic way because there are hundred of nodes in our web application cluster that operates concurrently on these items.
Also, we use distributed map for caching purpose which are shared across all the nodes. Since scaling node in Hazelcast is so simple and it utilises all its CPU core, it gives added advantage over redis or any other caching framework.
We are using Hazelcast from last 3 years in our e-commerce application to make sure availability (supply & demand) is consistent, atomic, available & scalable.
We are using IMap (distributed map) to cache the data and Entry Processor for read & write operations to do fast in-memory operations on IMap without you having to worry about locks.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I am currently developing a proof of concept for an alternative data store. The reason why is I need to enhance a read-mostly clustered webapp, but also because I want to free myself from the pain of the sometimes overly-complex ORM+RDBMS solution.
Overall the idea is quite similar to a distributed cache with persistence (letting the cluster be the SoR), however:
want to be able to retrieve any object along with its children, by
id (providing class & id) [only that to start off, as the main
querying part is already resolved with lucene in my app].
need to have map of maps of types ( ~ tables in the relational
world), and therein distributed maps of 'dehydrated' stored objects (flattening the object graph via reflection deep cloning)
a bin log (like Prevayler, for example) for
eventual recovery if whole cluster goes down
development (and ability to refactor code / change structure)
perhaps asynchronously processed for other purposes (reporting, whatever)
eventually later on try to integrate a statically-typed query mechanism, like LINQ, Jaque or H2's JaQu / see ODBs / Lucene (?)
it has to be transaction-aware (not sure "JTA type" though)
I'm planning to implement this idea with Hazelcast (I love its super-simple API) or Terracotta (which I never used - but I'm aware of their 'sweet spot', medium-term data). If you will, my aim is to do more or less what Jonas once blogged about. Using one of these, stored data would roughly have to fit in the sum of the JVM heaps of the cluster.
This should be pretty simple to scale, would avoid the relational impedance mismatch (ie save as with an ODB) and JDBC + I/O overhead.
Do you know of other tools/frameworks or combination thereof already providing similar functionality, that I'm ignoring?
Can you suggest other ways of tackling this 'getting rid of the DB'? What flaws do you already see in this idea?
Concurrency-wise would it make sense to consider Scala instead of Java?
How about non-relational data stores such as Couch DB, Neo4j, HyperTable, HBase?
A similar question was asked one month ago - but there was no concrete solution.
BTW I just stumbled upon the concept of Enterprise Data Fabric, which, to my surprise, describes a lot of these ideas.
Definitely give Terracotta a try. It's free (unless you go Enterprise which has an SLA and support). It is a JVM-level cluster, so to speak, so you don't have the issues associated with sessions on multiple boxes behind disparate JK workers (assuming you're using this for a J2EE app).
I'm just rambling, so have a look here: http://en.wikipedia.org/wiki/Terracotta_Cluster
UPDATE numerous bits of info on Terracotta on the web too, e.g. http://blog.terracottatech.com/2007/12/fud_of_the_week_terracotta_doe.html
UPDATE2 Bit of background on my views: I work for a company with a fairly big audience. We have a enterprise MySQL running with a master and about 5 slaves (times 2 considering we have 2 channels, with 4 app servers per channel), using MySQL's JDBC Replication driver (for which we've already submitted various patches). We use Spring2.5/Hibernate3 using Spring's declarative JTA transaction management, so read-onlies go to the slaves. With the advent of numerous Ajax enhancements on a future version of our site, our DB servers' load has gone up - we create pricing summaries for thousands of products for all countries, taking into account duties/tax rules for all these countries (plus promotions and real-time auctions running all the time), then the Ajax services have the latest prices in a blink. Terracotta takes the load off the DB and app servers by making these prices available to all app servers on a JVM-layer, with all the JVMs across the boxes linked. So, server A can update the prices every few minutes, and if Ajax hits server B, the prices are available immediately. I know there are people/companies out there with similar businesses, who probably have better ideas and implementations, so I'm always open for discussion, but this is my two cents.
I get inspiration from the guys at Facebook too, for instance this very informative article:
http://www.facebook.com/note.php?note_id=23844338919
They talk about memcached which you should also definitely check out.
As Neo4j is mentioned in the question, I'm chiming in with a few thoughts on using a graph database in this case. (I'm part of the Neo4j team)
retrieving children is trivial in a
graph db
there is a map implementation
for neo4j
as graphs are native to a graph db
you could consider not to flatten the
object graph, but to persist data in nodes
and edges/relationships (this gives you
more flexibility in handling the data)
neo4j is fully transactional
With the new DB technologies emerging today, there's really no need to stay with a RDBMS if your data isn't a good fit for the relational paradigm.
Seems to me Terracotta is a perfect fit for your requirements:
cluster a map to retrieve children
via keys (e.g. clustered Map)
map of maps - no problem
no explicit bin log - but Terracotta already persists everything to disk so full cluster restart is already supported
integrated already to Compass, Hibernate Search, and Lucene for search
Transactions? Too slow. Use the cache as a datastore. With persistence you won't lose data writing to (clustered) memory and trickle back to the DB.
In addition, Terracotta does the "reflection" thing you ask for - although it doesn't use reflection as that is far too slow. It uses BCM. Only changes are propagated on the network.
Hazelcast btw requires serialization so it will be slow and will not do well at all with a map of maps data structure (every put will result in a full deep clone copy across the network) and it doesn't have any kind of persistence built in.
Interesting.
I have a view that we all develop a zoo which comprises all the abstraction layers we habitually use in our projects. And each abstraction layer is a completely different animal.
My goal is to minimize the amount of time spent on just care and feeding of the animals whenever it diverts me from solving the problem at hand - it's overhead - wasted resources. So the fewer, simpler abstraction layers we can get away with, the more productive we are.
I can usually do just fine with two beasties - OOP and RDBMS, coupled through nice, simple, minimal, hand-crafted DAL. For me, ORM is mostly overhead - one abstraction too many, and a pretty hungry one.
Don't discount the option of treating stored procedures as an abstraction tool, either. If you're real comfortable with SQL, it can be a useful resource for implementing a light-weight BL facade that means not needing to think about the ORM problem.
And this post suggests the emergence of alternatives to RDBMS for some requirements, anyway.
Thanks for your answers.
Actually, you talk about DBs which is something I want to completely take out of the picture.
The use case I'm targetting is a startup's small/medium-sized clustered webapp (boxes in a LAN, or in the cloud). It needs to retrieve objects at ~RAM-speed levels and scale fairly easily. As a side-effect, one wouldn't have to think about DB server installations, impedance mismatch, JDBC, caches, polluting domain models with annotations, etc.
Again, what I want to accomplish is something like described here, and I would love to have some more feedback on ideas concerning the actual implementation (why use Terracotta instead of Hazelcast, use serialization or deep cloning via reflection or whatever else, and also the major drawbacks of an approach like this - eg. why wouldn't you change it for your current ORM/DB setup).
It has to be super simple to integrate so it'll feature a really neat Java API, improving code readability. No other software (DB, memcached will be required).
Try GigaSpaces. I think they have exactly what you require, and if I'm not mistaken there's a free version for startups.
Some concepts:
"Space" is some place where you can store and retrieve objects
Space can be backed by any JDBC-compliant DB, automatically (no code, only configuration)
Space can be started in your java process, so all accesses are at RAM speed
Space can be clustered/partitioned in any way you want (full mirror, partial, grid).
Space supports distributed or local transactions
Check their wiki, (but only "programmer's guide" - all the rest is marketing BS).
What are the recommended steps for creating scalable web/enterprise applications in Java and .NET? I'm more interested in what it takes to go from a low volume app to a high volume one (assuming no major faults in the original architecture). For example, what are the options in each platform for clustering, load-balancing, session management, caching, and so on.
Unfortunately there are a lot of aspects of your question that are going to be context dependant. Scalability can mean different things depending on your requirements and the application. The questions that need answering as you come up with your architecture (regardless of the technology you are using) include:
For example do you want to support users working with very large amounts of data or do you simply need to support a large number users each working with relative modest amounts of data?
Does your application do more reading than writing, since reads are cheap and writes are expensive a read heavy application can be simpler to scale than a write heavy one.
Does the data in your application have be always consistent or will eventual consistency suffice? Think posting a message to a social networking site versus withdrawing money from a bank account).
How available does you application need to be? Strict high availability requirements will probably require you to have smooth fail-over to other servers when one server crashes.
There are many other questions like this that need to be answered about your specific application before a proper discussion on your application architecture can take place.
However so I don't just leave you questions and not a single answer here is what I think is the single most important thing to take into account when designing a web application for scalability: Reduce (down to zero if possible) the amount of shared session state in your application (global application counters, cached primary key blocks and the like). Replicating shared state across a cluster has a very high cost, when you are trying to scale up
Perhaps there is so many options, I am not sure anyone to considered all options in full. Usually, the choice of language comes first and then you decide how to achieve the results you want.
My guess is you have good options in just about any major general purpose language.
This would have been an issue 10 years ago, but not today IMHO.
A good comparison of available open source technologies in Java is http://java-sources.org/ As you drill down into each category, you can see this is fairly extensive and it does even consider commercial products or C#
For any scalable system, you need to consider hardware solutions as well, such as network components.
My suggestion for turning a low volume app into a high volume one, is start with something fairly high volume in the first place and expect to re-engineer it as you go.
What sort of volumes are you talking about? Everyone has different expectations of high and low volume. e.g. When google have lower than average volumes they switch of whole data centres to save power. ;)
Just came across this explanation of how MySpace scaled up, really interesting and exactly the kind of information I'm looking for. However, there is no discussion about the technologies used to create the partitioning or caching and so on...
Short answer - it depends. What (if any) business goals are involved? What is the project budget and timeframe? Which industry (is it regulated)?
I have an application that's a mix of Java and C++ on Solaris. The Java aspects of the code run the web UI and establish state on the devices that we're talking to, and the C++ code does the real-time crunching of data coming back from the devices. Shared memory is used to pass device state and context information from the Java code through to the C++ code. The Java code uses a PostgreSQL database to persist its state.
We're running into some pretty severe performance bottlenecks, and right now the only way we can scale is to increase memory and CPU counts. We're stuck on the one physical box due to the shared memory design.
The really big hit here is being taken by the C++ code. The web interface is fairly lightly used to configure the devices; where we're really struggling is to handle the data volumes that the devices deliver once configured.
Every piece of data we get back from the device has an identifier in it which points back to the device context, and we need to look that up. Right now there's a series of shared memory objects that are maintained by the Java/UI code and referred to by the C++ code, and that's the bottleneck. Because of that architecture we cannot move the C++ data handling off to another machine. We need to be able to scale out so that various subsets of devices can be handled by different machines, but then we lose the ability to do that context lookup, and that's the problem I'm trying to resolve: how to offload the real-time data processing to other boxes while still being able to refer to the device context.
I should note we have no control over the protocol used by the devices themselves, and there is no possible chance that situation will change.
We know we need to move away from this to be able to scale out by adding more machines to the cluster, and I'm in the early stages of working out exactly how we'll do this.
Right now I'm looking at Terracotta as a way of scaling out the Java code, but I haven't got as far as working out how to scale out the C++ to match.
As well as scaling for performance we need to consider high availability as well. The application needs to be available pretty much the whole time -- not absolutely 100%, which isn't cost effective, but we need to do a reasonable job of surviving a machine outage.
If you had to undertake the task I've been given, what would you do?
EDIT: Based on the data provided by #john channing, i'm looking at both GigaSpaces and Gemstone. Oracle Coherence and IBM ObjectGrid appear to be java-only.
The first thing I would do is construct a model of the system to map the data flow and try to understand precisely where the bottleneck lies. If you can model your system as a pipeline, then you should be able to use the theory of constraints (most of the literature is about optimising business processes but it applies equally to software) to continuously improve performance and eliminate the bottleneck.
Next I would collect some hard empirical data that accurately characterises the performance of your system. It is something of a cliché that you cannot manage what you cannot measure, but I have seen many people attempt to optimise a software system based on hunches and fail miserably.
Then I would use the Pareto Principle (80/20 rule) to choose the small number of things that will produce the biggest gains and focus only on those.
To scale a Java application horizontally, I have used Oracle Coherence extensively. Although some dismiss it as a very expensive distributed hashtable, the functionality is much richer than that and you can, for example, directly access data in the cache from C++ code .
Other alternatives for horizontally scaling your Java code would be Giga Spaces, IBM Object Grid or Gemstone Gemfire.
If your C++ code is stateless and is used purely for number crunching, you could look at distributing the process using ICE Grid which has bindings for all of the languages you are using.
You need to scale sideways and out. Maybe something like a message queue could be the backend between the frontend and the crunching.
Andrew, (in addition to modeling as a pipeline etc), measuring things is important. Have you ran a profiler over the code and got metrics of where most of the time is spent?
For the database code, how often does it change ? Are you looking at caching at the moment ? I assume you have looked at indexes etc over the data to speed up the Db ?
What levels of traffic do you have on the front end ? Are you caching web pages ? (It isn't too hard to say use a JMS type api to communicate between components. You can then put Web Page component on one machine (or more), and then put the integration code (c++) on another, and for many JMS products there are usually native C++ api's ie. ActiveMQ comes to mind), but it really helps to know how much of the time is in Web (JSP ?) , C++, Database ops.
Is the database storing business data, or is it being also used to pass data between Java and C++ ? You say you are using shared mem not JNI ? What level of multi-threading currently exists in the APP? Would you describe the code as being synchronous in nature or async?
Is there a physical relationship between the Solaris code and the devices that must be maintained (ie. do all the devices register with the c++ code, or can that be specified). ie. if you were to put a web load balancer on the frontend, and just put 2 machines up today is the relationhip of which devices are managed by a box initialized up front or in advance?
What are the HA requirements ? ie. just state info ? Can the HA be done just in the web tier by clustering Session data ?
Is the DB running on another machine ?
How big is the DB ? Have you optimized your queries ie. tried using explicit inner/outer joins sometimes helps versus nested sub queries (sometmes). (again look at the sql stats).