I am developing a chat website using jsp/servlet.I will be hosting my website on gooogle appengine .Now i have some doubts regarding whether to use server push or client pull technology
1)If i use server push and if i dont close the response of servlet will it cause the server to go slow?How many simultanious connection can a tyicall tomcat server can handle if i keep the socket open for the entire chat session between 2 clinets??
2)Will server push or clinet push be better??
If you are using a servlet (prior to 3.0), then I guess you'll have to go with pull because of the programming model of servlet. However, there ARE advantages in using a push model. Primarily, wasted load on server and the limitation in latency. That's why there are technologies such as comet. Servlet 3.0 also supports push model. These are commonly used in ajax based apps.
In fact I believe a push model is more suited for a chatting app. because of the fast response time (=better user experience) it can provide.
If you use a nio based implementation for push-model, you can support thousands or even more than 10k concurrent connections (obviously, your millage varies).
If you use a conventional IO based implementation, it will be likely in the range of hundreds of concurrent connections (don't take this estimation too seriously though. I'm just giving these numbers to give a very, very rough feeling).
As for tomcat, last time I checked, people were saying that it won't have a good push-model support until version 7.0. But I'm not following the current status so I'm not sure (Sorry, perhaps somebody else can help you on this). If that is the case, you might want to check out comet support of jetty.
grizzly and netty are also good NIO based network frameworks, but if you want to use JSP, and find that tomcat is not sufficient, I guess jetty would be the best bet.
edit: (some additional info)
In this "push models", it's not like the server opens a connection to the client. The connection will be kept alive, and the server will push messages as it sees fit.
Also, it's not like there are only "push" and "pull" models. You can have a hybrid, like long polling.
I don't know how are you thinking of achieving server push here. As far as I can see, server needs a request to respond over HTTP. So, when there is a request, server will respond to that.
If i use server push and if i dont close the response of servlet will it cause the server to go slow?
App Engine will not let you do that. You have to finish your response within thirty seconds, or it will be killed. The thirty seconds is also an edge case, most calculations they do (for quota and such) are based on a 75 millisecond response time.
How many simultanious connection can a tyicall tomcat server
Tomcat? I thought you are planning to use App Engine?
Pull. Always pull.
I know it's a manufacturing-oriented book but the advice from Lean Thinking (Womack & Jones) is invaluable in any context (roughly, from memory):
Start by defining value,
line up the activities that create value in the value-stream,
create flow across the value-stream,
let customers pull value from the value-stream,
compete against perfection rather than other organizations
If I misquoted them, I apologize. Anyway, all of those principles can easily be applied in the development of any software product just as they could in the production of any physical product but the one that matters for you is pull.
Letting consumers of a service pull rather than pushing to them not only makes your programming model easier, it aligns activity with demand. You can still use queuing to load-level over time, if you have to, just the way you could with push but, this way, you have complete visibility into what, exactly happens in any given transaction.
I don't quite get your first question but the answer is still pull.
The answer to your query depends on what underlying protocol you wish to use.
Since you have mentioned JSP/servlets, your app will be implemented over the HTTP protocol.
HTTP is a protocol over TCP. TCP is connection oriented and remains alive, until the connection is ended. However, HTTP connections are persistent, only for the duration of a single request-response cycle. The TCP connection is broken after every request-response cycle. So that should answer your doubt with regards to how many socket connections a typical TOMCAT server will be able to handle. The connections will not be persistent, at all. They will only last the duration of a HTTP request-response cycle.
Given this basic idea, I would suggest , you use a client pull strategy, to implement your app.
Even with server push, over HTTP, even though the name says "server push", it is always the web client that polls the server at regular intervals, which just gives an illusion of "server-push". HTTP specification mandates that the client makes a request to which the server responds.
I have considerable experience in developing chat applications (both mobile and web).
Let me know , if you need any assistance. I will be more that willing to help.
Related
I've been through different questions about this topic, however, none of them have cleared my doubts on the best approach notifying the client side of a server-client IM app.
The Problem:
The whole problem is how to notify the client application of updates. I've alread seen the following approaches:
Clients keeps checking for updates: From time to time, client app performs a check in the server to see if there are updates for that specific user;
Problem: it is not performatic at all. Suppose you have one million users and each one of them checks for new updates every second. Serve would have to deal with one million requests per second. Wont work.
Client app opens a socket: The client app opens a socket and sends its address to the server. Server, by its turn, persists this information and connects to the socket whenever it needs to notify the client of some update.
Problem: Often the client will be connected to a NAT, so, the IP it has access to is in a non-visible range. In order to send messages to this client, a port forwarding in the NAT would have to be configured, which can't be done.
Despite of the technology, I think this approach will always be used, however, I have no idea how the problem described above can be solved.
Google Cloud Message (GCM): use the GCM service to notify the client of any update. Problem: It does't seems right to use a third server to handle the IM and it raises concerns about the scalability of the system. When the number of messages and users increases exponentially, it seems that the service will go down. Despite that, it seems that passing the information for two servers before delivering to the targets just adds bottlenecks in the process.
A combination of 2 and 3: uses GCM to reach the client when the last persist addres is no longer available.
Problem: same as described in 2
XMPP: I've seen many answers indicating the use of XMPP for IM applications, however, XMPP is a protocol - as per what I've foun in the web. I don't see how it can solve the problem described in 2 for instance.
Given the options above, can someone indicate me what line should I try to go for? Which one of these approaches has the best chances of success?
Thank y'all in advanced.
Use Google Cloud Messaging. Opposing to what you stated this service is built to scale to billions of users it will generally not introduce performance bottlenecks.
What you basically want to do is to use the messaging service to wake up devices. If you insist you can then still use your client server approach and thus your own protocol to have the client lookup new messages from the backend.
I am trying to create a Jetty servlet that allows clients (web browsers, Java clients, ...) to get broadcast notifications from the web server.
The notifications should be sent in a JSON format.
My first idea was to make the client send a long-polling request, and the server respond when a notification is available using Jetty's Continuation API, then repeat.
The problem with this approach is that I am missing all the notifications that happen between 2 requests.
The only solution I found for this, is to buffer the Events on the server and use a timestamp mechanism to retransmit missed notifications, which works, but seems pretty heavy for what it does...
Any idea on how I could solve this problem more elegantly?
Thanks!
HTTP Streaming is most definitely a better solution than HTTP long-polling. WebSockets are an even better solution.
WebSockets offer the first standardised bi-directional full-duplex solution for realtime communication on the Web between any client (it doesn't have to be a web browser) and server. IMHO WebSockets are the way to go since they are a technology that will continue to be developed, supported and in demand and will only grow in usage and popularity. They're also super-cool :)
There appear to be a few WebSocket clients for Java and Jetty also supports WebSockets.
Sorry for bumping this up, yet I believe numerous people will come across this thread and the accepted answer, IMHO, is at least outdated, not to say misleading.
In order of priority I'd put it as following:
1) WebSockets is the solution nowadays. I've personally had the experience of introducing WebSockets in enterprise oriented applications. All of the major browsers (Chrome, Firefox, IE - in alphbetical order :)) support WebSockets natively. All major servers/servlets (IIS, Tomcat, Jetty) are the same and there are quite a number of frameworks in Java implementing JSR 356 APIs. There is a problem with proxies, especially in cloud deployment. Yet there is a high awareness of the WebSockets requirements, so NginX supported them already 1.5 year ago. Anyway, secured 'wss' protocol solves proxy problem in 99.9% (not 100% just to be on the safe side, never experienced that myself).
2) Long Polling is probably the second best solution, and the 'probably' part is due to 'short polling' alternative. When speaking of long polling I mean the repeated request from client to server, which responses as soon as any data available. Thus, one poll can finish in a few millis, another one - till the max wait time.
Be sure to limit the poll time to something lesser than 2mins since usually otherwise you'll need to manage timeout error in you client side. I'd propose to limit the poll time to something like tens of seconds.
To be sure, once poll finished (timely or before that) it is immediately repeated (yet better to establish some simple protocol and give to your server a chance to say to the client - 'suspend').
Cons of the long polling, which IMHO justifies the continuation of the list, is that it holds one of just a few (4, 8? still not that many) allowed connections, that browser allows each page to establish to a server. So that is can eat up ~12% to ~25% of your website's client traffic resource.
3) Short polling not that much loved by many, but sometimes i prefer it. The main Cons of this one, is, of course, the high load on the browser and the server while establishing new connections. Yet, i believe that if connections pools are used properly, that overhead much lesser than it look like on the first glance.
4) HTTP streaming, either it be page streaming via IFrame or XHR streaming, is, IMHO, highly BAD solution since it's like an accumulation of Cons of all the rest and more:
you'll hold the connections opened (resources of browser and server);
you'll still be eating up from total available client traffic limit;
most evil: you'll need to design/implement (or reuse design/implementation) the actual content delivery in order to be able to differentiate the new content from the old one (be it in pushing scripts, oh my! or tracking the length of the accumulated content). Please, don't do this.
Update (20/02/2019)
If WebSockets are not an option - Server Sent Events is the second best option IMHO - effectively browsers implemented HTTP streaming for you here at the lower level.
I have done this before using Http Streaming via Atmosphere framework and it worked fine.
Visit Comet, Streaming
if you see the atmosphere tutorial they have given multiple examples
You may want to check how they implemented this in CometD: http://cometd.org .
Or you may even consider to use that tool, without having to reinvent the wheel.
Scenario: User logs in on the client software which forms a persistent bidirectional connection with the serverside entity (server) which would process user specified tasks. When the serverside entity, while processing user's task, encounters an error or requires further user input, it will notify the client software, and wait until the client decides what to do. The client software will take the new user specifiefd inputs and send this to the serverside. The serverside continue where it last stopped with the new user specified inputs. This feedback cycle will continue until it's finished processing. The progressively updated user inputs will all be stored on the serverside and accessible and modifiable from the client software. So if a client deletes a specific input, that change will be immediately reflected on the serverside. On the serverside, an extra interface is probably required to route different user's clients to available hardware nodes (cloud) to support concurrent multi-user tasks running on the serverside.
On the client side, I suspect using sockets to connect to the server...
Now for the server, I am a little lost because there seems to be many different Java servers like Jetty & Netty. I am also practicing caution in order to not try and reinvent any wheels here.
Is building a server the right approach? or Build a webservice that will complete a specific task on demand?
I am also not just looking for a one size fits all solution (wishful thinking probably) but open to any insights on my current situation.
Netty will provide a lot of what it sounds like you need for this, without making you reinvent a socket server. That said, I would make certain that you actually need bidirectional, real-time communication between the client and server. If you can rework the problem such that the client-server communications do not need to be real-time, then things like RESTful webservices become a possibility, and (in my experience) are much less complicated and error prone.
We have a Java (Spring) web application with Tomcat servlet container.
We have a something like blog.
But the blog must load its posts dynamically with Ajax.
The client's ajax script checks for new posts every second.
I.e. Ajax must ask the server for new posts every second and it will be very heavy for database.
But what if we have hundreds of thousands connects simultaneously?
I think that we must retrieve all posts with cron every second and after that save it somewhere. But where? The main idea is to unload the database.
Any ideas about architecture?
Thanks in advance!
There is other architecture for polling that could be more optimal, depending on the case:
Long polling
Long polling is a variation of the
traditional polling technique and
allows emulation of an information
push from a server to a client. With
long polling, the client requests
information from the server in a
similar way to a normal poll. However,
if the server does not have any
information available for the client,
instead of sending an empty response,
the server holds the request and waits
for some information to be available.
Once the information becomes available
(or after a suitable timeout), a
complete response is sent to the
client. The client will normally then
immediately re-request information
from the server, so that the server
will almost always have an available
waiting request that it can use to
deliver data in response to an event.
In a web/AJAX context, long polling is
also known as Comet programming.
Long Polling
Example of Implementations of this technology:
Push Server
You could also use the observer pattern to register the requests, and notify them when an update is done.
Hundreds of thousands of concurrent users all polling our site every second makes for a huge amount of traffic. If you truly expect this load you are going to have to design your platform accordingly, probably by clustering multiple web, application and DB servers.
Remember that with a database connection pool you don't need a DB connection for every user.
I'm not as familiar with Tomcat, but in WebSphere we can set up connection pools to prepare a certain number of connections.
Also, are you mainly worried about reads or the same number of writes?
Plus, you may also want to have the database "split" depending on region etc. This way there is no single heavy load across the entire database, but it can then be split and even load balanced.
There is also the "NoSQL" databases to look into as well. Maybe something to consider. Just ideas to help out.
I have a situation where I want a Java client to have a two-way data channel with a servlet (I have control over both), so that either can begin data transferring without having to wait for the other to do something first, but to get through the firewalls this needs to be tunnelled in http or https.
I have looked around, but I do not believe I know the right terms for asking Google.
I was originally looking at http-tunneling modules, but realizing that I have a web container in the other end, I believe that the appropriate way is to think of a fat client needing to communicate home. I was thinking that the persistant connection in http 1.1 might be very useful here. I can easily do heartbeat transfers to keep the connection from ideling.
At this point in time I just need to do a proof of concept so I primarily need something that works now, which can then be optimized or even replaced later.
So, I'd appreciate pointers to projects that allow me to have a connection where either side can at will push information (like a serialized object or a descriptive stream of bytes) to the other side. I'd prefer pure Java, if at all possible.
EDIT: Thanks for the pointers. It appears that what I need, will be available in the servlet 3.0 specification, which I might end up using in the long term depending on when it will be supported in the various web containers.
For now I am investigating the Cometd package, which appears to be able to do exactly what I need for my prototype.
Search terms: comet, long-polling
These are mostly used in an AJAX context, but I see no reason why you could not use them in a Java project.
Please take a look at Eclipse Net4J,
http://wiki.eclipse.org/Net4j
It supports all the features you mentioned. A special nice feature is that it supports HTTP connection pooling so you can have lots of channels between client and server but use only a few HTTP connections.
The only problem is that it doesn't have documentation at all. You just have to read the source code. Once you figure it out, it's very easy to use.
There are a few more diagrams on old Net4J site,
http://net4j.berlios.de/
How fast does it need to be? You could always just do polling on the client. Just check for new messages every so often.
You can use the Hessian protocol over HTTP. It's a fast binary protocol for serializing data. Typically used for a web-services style RPC communication, but there's no reason it couldn't be 2-way - see Hessian mux. It's pure Java, too :-)
Generally this is done by having the server not respond to an http request immediately. It waits around for some update (or a timeout) before sending a response. Obviously some care needs to be made ensuring that the server will handle this under load.
See, for instance, Comet.