ActiveMQ Durable Topic Connection Design Opinions Sought - java

I have a very simple app that is using ActiveMQ. The use case that it solves will involve sending small atomic Topic messages.
My first pass at this functionality built one connection to the broker and reused it as needed. However, in reading some of the docs, it seems like hanging onto a connection for reuse potentially hogs resources in the JVM.
So my dilemma is: do I incur the overhead of building up and tearing down a connection for every message, or do I incur the cost of hanging onto resources that for the most part sit idle?
I know there is no one definitive answer and the real answer is "it depends", but would really like some insight and opinions from others.

I believe you should be aware of both of the criteria you mention. The solution is to use a pool of connections. In this case you share the connections and most of the time do not create a new one. A pool is also usually limited to a specific number of connections (this is my assumption, based on how I would implement it), so it doesn't take all the resources in the JVM.
Take a look at the section on PooledConnectionFactory.
Also, the decision to keep connections or to recreate them depends entirely on your usage scenario. If you plan to send messages regularly, sharing the connection is the right thing to do, since creating a connection and a session are quite expensive operations (I would also recommend sharing sessions if possible under high traffic). However, if your messages will be sent only rarely (a few times per hour? :) ), it makes sense not to keep an idle connection alive.
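To make the pooling idea concrete, here is a minimal sketch using only the JDK. It is not ActiveMQ's PooledConnectionFactory; `BoundedPool` is a hypothetical class, and the strings it hands out stand in for real connections. It shows the two properties described above: resources are reused rather than recreated, and the total number created is capped.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical bounded pool: creates at most maxSize resources and reuses them. */
final class BoundedPool<T> {
    private final BlockingQueue<T> idle;
    private final Supplier<T> factory;
    private final int maxSize;
    private final AtomicInteger created = new AtomicInteger();

    BoundedPool(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
        this.idle = new ArrayBlockingQueue<>(maxSize);
    }

    /** Returns an idle resource, creates one if under the cap, otherwise waits for a return. */
    T borrow(long timeout, TimeUnit unit) throws InterruptedException {
        T resource = idle.poll();
        if (resource != null) {
            return resource;
        }
        // Reserve a creation slot atomically so the cap is never exceeded.
        while (true) {
            int n = created.get();
            if (n >= maxSize) {
                break;
            }
            if (created.compareAndSet(n, n + 1)) {
                return factory.get();
            }
        }
        resource = idle.poll(timeout, unit);
        if (resource == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return resource;
    }

    /** Hands a borrowed resource back so the next caller can reuse it. */
    void giveBack(T resource) {
        idle.offer(resource);
    }

    int createdCount() {
        return created.get();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger ids = new AtomicInteger();
        // The strings are placeholders for real broker connections.
        BoundedPool<String> pool = new BoundedPool<>(2, () -> "conn-" + ids.incrementAndGet());
        for (int i = 0; i < 5; i++) {
            String conn = pool.borrow(1, TimeUnit.SECONDS);
            System.out.println("message " + i + " sent via " + conn);
            pool.giveBack(conn);
        }
        System.out.println("connections created: " + pool.createdCount());
    }
}
```

A real pool would also validate and evict stale connections; that part is left out here.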

Related

Java server - multiple clients handling - is using threads optimal?

I am developing a system based on server-client communication and am trying to determine the best way to handle multiple clients. Importantly, I really don't want to use any third-party libraries.
In many places on the Internet I have seen this solved by creating a separate thread for each connection, but I don't think that is the best way if I assume there will be a huge number of connections (maybe I'm wrong). So the solution I'm thinking of is:
Creating a queue of events and handling them with workers: a defined pool of threads (with a constant number n of workers). This solution seems pretty slow, but I cannot imagine how big the difference will be when handling a huge number of clients.
I've also been thinking about load balancing by running multiple instances of the server (on different physical machines), but that is only a nice add-on to any solution, not the solution itself.
I am aware that Java is not really async-friendly, but maybe I lack some knowledge and there is a nice solution. I'll be grateful for any suggestions.
Additional info:
I assume a really big number of connections
Every connection will last for a long time (days, maybe weeks)
The program will need to send some data to a specified client quite frequently
Each client will send data to the server about once every 3 seconds
To avoid discussion (as SO is not a place for them):
One client - one thread
Many clients - constant number of threads and events pool
Any async-like solution, that I'm not aware of
Anything else?
I'd suggest starting off with the simple architecture of one thread per connection. Modern JVMs on sufficiently sized systems can support thousands of threads. You might be pleasantly surprised at how well even this simple scheme works. If you need 300k connections, though, I doubt that one thread per connection will work. (But I've been wrong before.) You might have to fiddle with the thread stack size and OS resource limits.
A queueing system will help decouple the connections from the threads handling the work, but it will add to the amount of work done per message received from each client. This will also add to latency (but I'm not sure how important that is). If you have 300k connections, you'll probably want to have a pool of threads reading from the connections, and you'll also want to have more than one queue through which the work flows. If you have 300k clients sending data once every 3 seconds, that's 100k ops/sec, which is a lot of data to shove through a single queue. That might turn into a bottleneck.
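As a rough sketch of "more than one queue", the following spreads work over several single-threaded executors, each of which owns its own queue. `ShardedDispatcher` and its method names are hypothetical, and the shard count of 4 is an arbitrary assumption. The helper also reproduces the arithmetic above: 300,000 clients each sending once every 3 seconds gives 100,000 messages per second in aggregate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical dispatcher: one single-threaded executor (and thus one queue) per shard. */
final class ShardedDispatcher {
    private final List<ExecutorService> shards = new ArrayList<>();
    private final AtomicLong processed = new AtomicLong();

    ShardedDispatcher(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(Executors.newSingleThreadExecutor());
        }
    }

    /** Work for the same client always lands on the same queue, so its order is preserved. */
    void submit(int clientId, Runnable work) {
        int shard = Math.floorMod(clientId, shards.size());
        shards.get(shard).execute(() -> {
            work.run();
            processed.incrementAndGet();
        });
    }

    /** Stops accepting work, waits for the queues to drain, and returns the processed count. */
    long shutdownAndCount() throws InterruptedException {
        for (ExecutorService shard : shards) {
            shard.shutdown();
        }
        for (ExecutorService shard : shards) {
            shard.awaitTermination(10, TimeUnit.SECONDS);
        }
        return processed.get();
    }

    /** Aggregate rate: number of clients divided by the seconds between each client's messages. */
    static long messagesPerSecond(long clients, long secondsBetweenMessages) {
        return clients / secondsBetweenMessages;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("aggregate rate: " + messagesPerSecond(300_000, 3) + " msg/s");
        ShardedDispatcher dispatcher = new ShardedDispatcher(4);
        for (int client = 0; client < 1_000; client++) {
            dispatcher.submit(client, () -> { /* handle one message here */ });
        }
        System.out.println("processed: " + dispatcher.shutdownAndCount());
    }
}
```

Sharding by client id keeps per-client ordering while removing the single shared queue as a contention point; whether that matters in practice is something to measure.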
Another approach probably worth investigating is to have a pool of worker threads, but instead of each worker reading data from a queue written by connection reader threads, have each worker handle a bunch of sockets directly. Use a NIO Selector to have each thread wait on multiple sockets. Say, have 100 threads each handling 3,000 sockets. Or perhaps have 1,000 threads each handling 300 sockets. This depends on the amount of data and the amount of work necessary to process each incoming message. You'll have to experiment. This will probably be considerably simpler than using asynchronous I/O.
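A minimal sketch of one such worker, assuming a simple echo protocol purely for illustration: a single thread owns a `Selector` and services every socket registered with it. `SelectorEchoServer` and `roundTrip` are hypothetical names. In the layout described above you would run many of these workers and distribute accepted sockets among them.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

/** Hypothetical worker: one thread multiplexing many sockets through one Selector. */
final class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector = Selector.open();
    private final ServerSocketChannel server = ServerSocketChannel.open();
    private volatile boolean running = true;

    SelectorEchoServer() throws IOException {
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks a free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try {
            while (running) {
                selector.select(); // blocks until a socket is ready or wakeup() is called
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        try {
                            echo(client, buf);
                        } catch (IOException e) {
                            client.close(); // one broken client must not stop the worker
                        }
                    }
                }
            }
            for (SelectionKey key : selector.keys()) {
                key.channel().close();
            }
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void echo(SocketChannel client, ByteBuffer buf) throws IOException {
        buf.clear();
        if (client.read(buf) < 0) { // peer closed the connection
            client.close();
            return;
        }
        buf.flip();
        while (buf.hasRemaining()) { // fine for small replies; large ones should use OP_WRITE
            client.write(buf);
        }
    }

    @Override
    public void close() {
        running = false;
        selector.wakeup(); // unblocks select() so the loop can exit and clean up
    }

    /** Blocking test client: sends a message and reads back the same number of bytes. */
    static String roundTrip(int port, String message) throws IOException {
        try (Socket socket = new Socket("127.0.0.1", port)) {
            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            socket.getOutputStream().write(out);
            byte[] in = new byte[out.length];
            new DataInputStream(socket.getInputStream()).readFully(in);
            return new String(in, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        try (SelectorEchoServer worker = new SelectorEchoServer()) {
            new Thread(worker, "selector-worker").start();
            for (int i = 0; i < 3; i++) {
                System.out.println(roundTrip(worker.port(), "hello " + i));
            }
        }
    }
}
```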
Java 7 has true asynchronous IO under the NIO package, I've heard. I don't know much about it other than that it's difficult to work with.
Basic IO in Java is blocking. This means that using a fixed number of threads to support many clients is likely not possible with basic IO, as you could have all threads tied up in blocking calls reading from clients who aren't sending data.
I suggest you look into asynchronous IO with Grizzly/Netty, if you change your mind on third-party libraries.
If you haven't changed your mind, look into NIO yourself.

Implementing push-like technology

I want to be able to exchange data between my app and the server where each side has to be able to initiate sending of data. I want it to happen quickly and polling from the client side for new messages is not fast enough in my case. How do push technologies work?
I was thinking of keeping a socket connection open from the device to the server and sending and receiving raw bytes in some custom format.
Is it a good approach and what problems might I run into? What would you suggest as an alternative?
When it comes to message passing, the time needed to initialize a new connection between the server and the client usually far exceeds the time needed to send the data itself, at least for simple status-like messages. This adds significantly to the communication latency, which seems to be your main concern.
There are two main ways to deal with this issue:
Keep a connection open between both ends at all times: This is the standard way of dealing with this issue. It has the advantage of programming simplicity, but you may need to send keep-alive packets regularly to keep the connection open. This may reduce the battery life of a mobile device and increase the networking cost slightly. It may also interact unfavorably with the power-management features of a mobile device.
In addition, no matter what you do, you cannot completely eliminate the possibility of a new connection needing to be established at an inconvenient time - connections that are mostly idle do not fare very well in today's networking infrastructure, I'm afraid...
Use a connection-less protocol, such as UDP: This solution has the potential to minimize the communication and power cost, but it requires that your server and client are designed to handle the inherent unreliability of such protocols.
That said, I would not consider the actual format of the data a major concern, until some profiling demonstrates that a custom format will indeed result in significant savings. I would consider the ability to use off-the-shelf network monitoring and analysis software far more important during the development phase...
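To illustrate what "handling the inherent unreliability" of UDP can mean, here is a minimal stop-and-wait sketch: the sender repeats a datagram until it receives a one-byte acknowledgement. `UdpWithAck` and its methods are hypothetical, the retry count and timeout are arbitrary assumptions, and a real protocol would also need sequence numbers so the receiver can discard duplicates.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stop-and-wait exchange: resend a datagram until it is acknowledged. */
final class UdpWithAck {

    /** Sends the payload and waits for a 1-byte ack; returns the number of attempts used. */
    static int sendReliably(DatagramSocket socket, InetAddress host, int port,
                            byte[] payload, int maxAttempts, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        DatagramPacket out = new DatagramPacket(payload, payload.length, host, port);
        DatagramPacket ack = new DatagramPacket(new byte[1], 1);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            socket.send(out);
            try {
                socket.receive(ack);
                return attempt;
            } catch (SocketTimeoutException lost) {
                // No ack arrived in time: the datagram or its ack was lost, so try again.
            }
        }
        throw new IOException("no ack after " + maxAttempts + " attempts");
    }

    /** Receives one datagram, acknowledges it to its sender, and returns its text. */
    static String receiveAndAck(DatagramSocket socket) throws IOException {
        DatagramPacket in = new DatagramPacket(new byte[1024], 1024);
        socket.receive(in);
        socket.send(new DatagramPacket(new byte[] {1}, 1, in.getSocketAddress()));
        return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket(0, loopback)) {
            Thread receiverThread = new Thread(() -> {
                try {
                    System.out.println("got: " + receiveAndAck(receiver));
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            receiverThread.start();
            int attempts = sendReliably(sender, loopback, receiver.getLocalPort(),
                    "status=ok".getBytes(StandardCharsets.UTF_8), 5, 500);
            receiverThread.join();
            System.out.println("acknowledged after " + attempts + " attempt(s)");
        }
    }
}
```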
Push technology is loosely called Comet. The basic logic is to open a persistent HTTP connection with the server (often called HTTP streaming). As this connection will not last forever (due to default limits on the server), you should be able to reopen the connection. I am not sure how to do it on Android specifically, but this should be possible.
The basic concept behind this is explained in this blogpost
As this is a concept, it can be implemented in any server-side programming language of your choice. This tutorial gives a fair introduction to implementing COMET in PHP. socket.io is another such library if you are comfortable with JavaScript. This SO thread provides some useful links.
Coming to advantages and disadvantages,
If you want almost instant updates, COMET is the best.
If you have a limit on the number of simultaneous connections to the server, then you have to weigh COMET against that limit, since every client holds a connection open.

What is the most "expensive" for a server: open connections, sending/receiving messages, or connections/disconnections?

What is the most expensive for a server
(using Java NIO Selector and SocketChannel, but I guess the language and library don't matter anyway)
keeping many client connections open
many client connections/disconnections
receiving many messages from clients and sending many replies to clients
When solving performance issues, never trust any guesses or even "common sense". Always measure. Always profile. In case of doubt, write sample applications that do only one of the points in question (it's usually quite easy to do). Again and again you will be surprised.
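A minimal sketch of such a sample application, assuming a loopback echo server and blocking sockets for simplicity: one method times only connect/disconnect cycles, the other times only message exchange over a single open connection. `ConnectionCostProbe` is a hypothetical name, the count of 200 is arbitrary, and there is no JVM warm-up, so treat the numbers as a rough first look rather than a benchmark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical probe: isolates connection churn from message exchange on loopback. */
final class ConnectionCostProbe {

    /** Starts a daemon thread that accepts one connection at a time and echoes each byte. */
    static ServerSocket startEchoServer() throws IOException {
        ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        Thread acceptor = new Thread(() -> {
            while (!server.isClosed()) {
                try (Socket socket = server.accept()) {
                    InputStream in = socket.getInputStream();
                    OutputStream out = socket.getOutputStream();
                    int b;
                    while ((b = in.read()) >= 0) {
                        out.write(b);
                        out.flush();
                    }
                } catch (IOException e) {
                    // Server closed or client dropped; the loop re-checks isClosed().
                }
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
        return server;
    }

    /** Nanoseconds spent opening and closing count connections, one byte exchanged on each. */
    static long timeConnectionChurn(int port, int count) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
                socket.getOutputStream().write(1);
                socket.getInputStream().read();
            }
        }
        return System.nanoTime() - start;
    }

    /** Nanoseconds spent exchanging count one-byte messages over a single open connection. */
    static long timeMessages(int port, int count) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            socket.setTcpNoDelay(true);
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                socket.getOutputStream().write(1);
                socket.getInputStream().read();
            }
            return System.nanoTime() - start;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = startEchoServer()) {
            int n = 200;
            int port = server.getLocalPort();
            System.out.println("connect+close ns/op: " + timeConnectionChurn(port, n) / n);
            System.out.println("message ns/op:       " + timeMessages(port, n) / n);
        }
    }
}
```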
I think the second one will be the most expensive, if you use a good object pooling mechanism for the first one. If not, then keeping hundreds of connections open, is going to become a serious issue.
Sending and receiving messages shouldn't be that much of a hassle.

Commercial Website architecture question

I have to write an architecture case study, but there are some things that I don't know, so I'd like some pointers on the following:
The website must handle 5k simultaneous users.
The backend is composed of commercial software, some web services, some message queues, and a database.
I want to recommend Spring for the backend, to deal with the different elements and to expose some REST services.
I also want to recommend Wicket for the front end (not the point here).
What I don't know is: must I install the front end and the back end on the same Tomcat server, or on two different ones? I am tempted to use two servers for the front end, with a load balancer (no need for session replication in this case). But if I have two front-end servers, must I have two back-end servers? I don't want to create some kind of bottleneck.
Based on what I read on this blog, a really huge load is handled by a single Tomcat for the first website mentioned. But I cannot find any more information on this, so I can't tell whether it seems plausible.
If you can enlighten me so I can go on with my case study, that would be really helpful.
Thanks :)
There are probably two main reasons for having multiple servers for each tier: high availability and performance. If you're not doing this for HA reasons, then the unfortunate answer is 'it depends'.
Having two front end servers doesn't force you to have two backend servers. Is the backend going to be under a sufficiently high load that it will require two servers? It will depend a lot on what it is doing, and would be best revealed by load testing and/or profiling. For a site handling 5000 simultaneous users, though, my guess would be yes...
It totally depends on your application. How heavy are your sessions? (Wicket is known for putting a lot in the session.) How heavy are your backend processes?
It might be a better idea to come up with something that can scale. A load-balancer with the possibility to keep adding new servers for scaling.
Measurement is the best thing you can do. Create JMeter scripts and find out where your app breaks. Build a plan from there.
To expand on my comment: think through the typical process by which a client makes a request to your server:
it initiates a connection, which has an overhead for both client and server;
it makes one or more requests via that connection, holding on to resources on the server for the duration of the connection;
it closes the connection, generally releasing application resources, but generally still hogging a port number on your server for some number of seconds after the connection is closed.
So in designing your architecture, you need to think about things such as:
how many connections can you actually hold open simultaneously on your server? if you're using Tomcat or other standard server with one thread per connection, you may have issues with having 5,000 simultaneous threads; (a NIO-based architecture, on the other hand, can handle thousands of connections without needing one thread per connection); if you're in a shared environment, you may simply not be able to have that many open connections;
if clients don't hold their connections open for the duration of a "session", what is the right balance between number of requests and/or time per connection, bearing in mind the overhead of making and closing a connection (initialisation of encrypted session if relevant, network overhead in creating the connection, port "hogged" for a while after the connection is closed)
Then more generally, I'd say consider:
in whatever architecture you go for, how easily can you re-architect or replace specific components if they prove to be bottlenecks?
for each "black box" component/framework that you use, what actual problem does it solve for you, and what are its limitations? (Don't just use Tomcat because your boss's mate's best man told them about it down the pub...)
I would also agree with what other people have said-- at some point you need to not be too theoretical. Design something sensible, then run a test bed to see how it actually copes with your expected volumes of data. (You might not have the whole app built, but you can start making predictions about "we're going to have X clients sending Y requests every Z minutes, and p% of those requests will take n milliseconds and write r rows to the database"...)
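Predictions of that kind are simple arithmetic, sketched below. Apart from the 5,000 users taken from the question, every number is an assumed placeholder to be replaced with your own estimates, and `LoadEstimate` is a hypothetical helper. The in-flight figure uses Little's law (average number in the system = arrival rate x average time in the system).

```java
/** Hypothetical back-of-the-envelope load calculator. */
final class LoadEstimate {

    /** X clients each sending Y requests every Z minutes, expressed as requests per second. */
    static double requestsPerSecond(long clients, double requestsPerClient, double intervalMinutes) {
        return clients * requestsPerClient / (intervalMinutes * 60.0);
    }

    /** Little's law: average requests in flight = arrival rate x average service time. */
    static double requestsInFlight(double requestsPerSecond, double avgServiceMillis) {
        return requestsPerSecond * avgServiceMillis / 1000.0;
    }

    /** Database write rate if each request writes the given number of rows on average. */
    static double rowsPerSecond(double requestsPerSecond, double rowsPerRequest) {
        return requestsPerSecond * rowsPerRequest;
    }

    public static void main(String[] args) {
        // Assumed figures: 5,000 clients, 6 requests per minute each, 200 ms per request, 2 rows each.
        double rps = requestsPerSecond(5_000, 6, 1);
        System.out.println("requests/s:         " + rps);
        System.out.println("requests in flight: " + requestsInFlight(rps, 200));
        System.out.println("rows/s:             " + rowsPerSecond(rps, 2));
    }
}
```

The in-flight number is the one to compare against a thread-per-connection server's thread pool size.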

JMS alternative? something for decoupling sending emails from http reqs

We have a web application that does various things and sometimes emails users depending on a given action. I want to decouple the HTTP request threads from actually sending the email in case there is some trouble with the SMTP server or a backlog. In the past I've used JMS for this and had no problem with it. However, for the web app we're building at the moment, JMS feels like a bit of overkill (in terms of setup etc.), and I was wondering what other alternatives are out there.
Ideally I'd like something that I can run in-process (JVM/Tomcat), but where any pending items in the queue would be swapped to disk/DB when the servlet context is unloaded. I could of course just code something together involving an in-memory queue, but I'm looking to gain the benefit of open-source projects, so I'm wondering what's out there, if anything.
If JMS really is the answer, does anyone know of something that could fit our simple requirements?
Thanks
I'm using JMS for something similar. Our reasons for using JMS:
We already had a JMS server for something else (so it was just adding a new queue)
We wanted our application be decoupled from the processing process, so errors on either side would stay on their side
The app could drop the message in a queue, commit, and go on. No need to worry about how to persist the messages, how to start over after a crash, etc. JMS does all that for you.
I would think spring integration would work in this case as well.
http://www.springsource.org/spring-integration
Wow, this issue comes up a lot. CommonJ WorkManager is what you are looking for. A Tomcat implementation can be found here. It allows you to safely create threads in a Java EE environment but is much lighter weight than using JMS (which will obviously work as well).
Beyond JMS, for short messages you could also use Amazon Simple Queue Service (SQS).
While you might think it overkill too, consider that it requires minimal maintenance, scales nicely, has very high availability, and doesn't cost all that much.
There is no cost for creating new queues or for having an account. As far as I recall, pricing is based purely on the number of operations you do (sending messages, polling/retrieving).
The main limitation really is the message size (there are others, like not guaranteeing ordering due to its distributed nature), but that might work as is. For larger messages, you could use a related AWS service, S3, to store the actual body and just pass headers through SQS.
You could use a scheduler. Have a look at Quartz.
The idea is that you schedule a job to start at regular intervals. All requests need to be persisted somewhere. The scheduled job will read them and process them. You need to define the interval between two subsequent jobs to fit your needs.
This is the recommended way of doing things. Full-fledged application servers offer Java EE Timers for this, but these aren't available in Tomcat. Quartz is fine though and you could avoid starting your own threads, which will cause mess in some situations (e.g. in application updates).
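The pattern can be sketched with the JDK alone. Here `ScheduledExecutorService` stands in for Quartz and an in-memory queue stands in for the persisted store, so this is an illustration of the idea rather than the Quartz API; `PeriodicMailJob` and its methods are hypothetical.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical periodic job that drains pending mail requests at a fixed interval. */
final class PeriodicMailJob {
    // Stand-in for a database table or file; a real setup would persist these requests.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    /** Called from the request thread: records the request and returns immediately. */
    void enqueue(String recipient) {
        pending.add(recipient);
    }

    /** Runs drain() repeatedly, waiting intervalMillis between the end of one run and the next. */
    void start(long intervalMillis, Consumer<String> sender) {
        scheduler.scheduleWithFixedDelay(() -> drain(sender), 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Sends everything pending; on failure re-queues the item at the tail and stops this run. */
    int drain(Consumer<String> sender) {
        int sent = 0;
        String recipient;
        while ((recipient = pending.poll()) != null) {
            try {
                sender.accept(recipient);
                sent++;
            } catch (RuntimeException e) {
                pending.add(recipient); // keep it for the next scheduled run
                break;
            }
        }
        return sent;
    }

    void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        PeriodicMailJob job = new PeriodicMailJob();
        job.start(100, recipient -> System.out.println("sending mail to " + recipient));
        job.enqueue("alice@example.com");
        job.enqueue("bob@example.com");
        Thread.sleep(300); // give the scheduled job time to run
        job.stop();
    }
}
```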
I agree that JMS is overkill for this.
You can just send the e-mail in a separate thread (i.e. separate from the request handling thread). The only thing to be careful about is that if your app gets any kind of traffic at all, you may want to use a thread pool to avoid resource depletion issues. The java.util.concurrent package has some nice stuff for thread pools.
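A minimal sketch of that approach using `java.util.concurrent`, with a fixed number of threads and a bounded backlog so a stalled SMTP server cannot exhaust memory. `AsyncMailer` is a hypothetical class, and the `transport` callback stands in for whatever actually delivers the mail.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical mailer: hands delivery to a small thread pool with a bounded backlog. */
final class AsyncMailer {
    private final ThreadPoolExecutor pool;
    private final Consumer<String> transport;

    AsyncMailer(int threads, int queueCapacity, Consumer<String> transport) {
        // Fixed-size pool; the default rejection policy throws when the backlog is full.
        this.pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        this.transport = transport;
    }

    /** Returns immediately; false means the backlog is full and the mail was not accepted. */
    boolean send(String recipient) {
        try {
            pool.execute(() -> transport.accept(recipient));
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting mail and waits for queued mail to be delivered. */
    boolean shutdown(long timeout, TimeUnit unit) throws InterruptedException {
        pool.shutdown();
        return pool.awaitTermination(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncMailer mailer = new AsyncMailer(2, 100,
                recipient -> System.out.println("delivering to " + recipient));
        for (int i = 0; i < 5; i++) {
            mailer.send("user" + i + "@example.com"); // the request thread is not blocked
        }
        mailer.shutdown(5, TimeUnit.SECONDS);
    }
}
```

Note that anything still queued is lost if the JVM exits without calling `shutdown`, which is the gap that JMS or a persisted queue closes.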
Since you say the app "sometimes" emails users it doesn't sound like you're talking about a high volume of mail. A quick and dirty solution would be to just Runtime.getRuntime().exec():
sendmail recipient@domain.com
and dump the message into the resulting Process's getOutputStream(). After that it's sendmail's problem.
Figure a minute to see if you have sendmail available on the server, about fifteen minutes to throw together a test if you do, and nothing to install assuming you found sendmail. A few more minutes to construct the email headers properly (easy - here are some examples) and you're done.
Hope this helps...
