In our new project we need to implement a server application. This server receives connection requests from 50,000+ clients. The problem is that these connections have to remain open and be managed somewhere. The application should work like a telephone exchange: it receives requests from connected clients and connects them to one or more other clients, but only if those clients are also connected. A proprietary protocol is used. My questions are:
How (and where) to manage the open sockets? Should I put them in a HashMap or something? That sounds odd to me, but I have no experience with so many open connections.
Are there any frameworks available that support these connection requirements?
Thank you for your help!
How (and where) to manage the open sockets? Should I put them in a HashMap or something?
Typically each socket will be managed by a thread that will be responsible for reading and writing to the socket. You would also have a master thread that is responsible for receiving all connection requests at a predefined network interface & port (using the ServerSocket API class), which may then hand off the actual processing work to the worker/slave threads. In this case, you ought to be looking at a thread pool for the worker threads, because creating 50k threads will most likely overwhelm your OS and the hardware.
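To make that concrete, here is a minimal sketch of that accept-loop/thread-pool structure using blocking I/O. The port, the pool size, and the idea of keying clients in a ConcurrentHashMap are illustrative assumptions, not requirements:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExchangeServer {
    // Open sockets keyed by client id, so connected clients can be looked up
    // and bridged to each other. The key scheme here is an assumption.
    private static final Map<String, Socket> clients = new ConcurrentHashMap<>();

    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(200); // pool size is a guess
        try (ServerSocket server = new ServerSocket(9000)) {
            while (true) {
                Socket socket = server.accept();      // master thread accepts...
                workers.submit(() -> handle(socket)); // ...worker threads do the I/O
            }
        }
    }

    private static void handle(Socket socket) {
        // Read the proprietary protocol here, register the client in the map,
        // and route its messages to other connected clients.
    }
}
```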
Also, if you are indeed managing 50k concurrent sockets, using the NIO API (java.nio.*) over the plain IO API of Java is highly recommended, although I haven't seen too many projects requiring more than 2-5k concurrent connections. There are at least two well-known NIO-based frameworks in the Java world - Apache MINA and JBoss Netty. I would however recommend reading the well-written NIO tutorial before moving on to the NIO API or the NIO frameworks.
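If you do go with raw NIO, the heart of the server is a single selector loop along these lines (a bare-bones sketch; the port and buffer size are arbitrary, and real code needs protocol parsing and error handling):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class NioServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select(); // blocks until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(4096);
                    if (client.read(buf) == -1) { // peer closed the connection
                        key.cancel();
                        client.close();
                    }
                    // else: parse the protocol bytes in buf and route them
                }
            }
        }
    }
}
```

A single thread running this loop can service tens of thousands of sockets, which is exactly why NIO is recommended at this scale.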
Related
I've been looking into making a simple Sockets-based game in Java, and read in multiple places that client sockets are destroyed after a single exchange. Is this good practice for continued connections? The server needs to maintain a connection with a client (i.e. not using socket.accept() every time it wants to tell a client about something), but can't wait every time for the client's response. I already have the server/client running in separate threads, but won't destroying the socket after every exchange mean re-acquiring (or failing to re-acquire) a connection to that client? I've seen so many conflicting websites about sockets in Java and how they should be implemented.
There are no hard and fast rules, but it does depend slightly on what data rates you want to achieve.
For example, YouTube is a streaming video service, but the video data is delivered by means of the client using HTTPS to fetch batches of video data. Inefficient, yes, but very easy to program for. There are lots of reasons to use HTTPS for an application like YouTube (firewalls, etc.), but ultimate power saving and network performance were not among them. The "proper" way would be to use a protocol like RTP, which uses UDP to deliver small packets of data that can then be rearranged into order; you also have to deal with missing frames at the codec level, etc. Much less network traffic and friendly to bandwidth-constrained network links, but significantly more difficult to deal with when traversing firewalls, in client software, etc.
So if your game is sending modest amounts of data, the only thing wrong with setting up and tearing down a whole socket connection for every message is the nagging feeling you yourself will have that it is somehow not the most efficient solution.
Though it sounds like you have a conflict between the need to communicate between client and server and a need to process something else whilst waiting for the communication to complete. Here you're getting into asynchronous I/O territory. To make that easy I strongly suggest you take a look at ZeroMQ - that will make everything a whole lot simpler.
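For a flavour of how simple ZeroMQ makes this, here is a minimal request/reply sketch assuming the JeroMQ Java binding (the endpoint and message contents are made up, and the exact API differs slightly between JeroMQ versions):

```java
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

public class GameEndpoint {
    public static void main(String[] args) {
        try (ZContext context = new ZContext()) {
            ZMQ.Socket socket = context.createSocket(SocketType.REP);
            socket.bind("tcp://*:5555");
            while (!Thread.currentThread().isInterrupted()) {
                byte[] request = socket.recv(0); // framing is handled for you
                // ... update game state based on the request ...
                socket.send("ack".getBytes(ZMQ.CHARSET), 0);
            }
        }
    }
}
```

The library keeps the underlying connections alive and queues messages for you, so you never set up and tear down sockets per exchange.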
and read in multiple places that client sockets are destroyed after a single exchange.
Only in the places where that actually happens. There are numerous contexts where it doesn't, the outstanding example being HTTP, where every effort is made to reuse connections.
Is this good practice for continued connections?
The question is a contradiction in terms. A continued connection is a connection that isn't closed. A closed connection can't be continued.
The server needs to maintain a connection with a client (i.e. not using socket.accept() every time it wants to tell a client about something), but can't wait every time for the client's response.
The word you are groping for here is 'session'.
I already have the server/client running in separate threads, but won't destroying the socket after every exchange mean re-acquiring (or failing to re-acquire) a connection to that client?
Yes.
I've seen so many conflicting websites about sockets in Java and how they should be implemented.
You should use a connection pool at the client; a request loop at the server that looks for multiple requests per connection; a client-side facility that closes idle connections after some idle timeout; and a read timeout at the server that closes connections on which no request has been read within the timeout.
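Here is a rough sketch of what that server-side request loop could look like, assuming a simple line-based protocol and a 30-second read timeout (both are illustrative choices):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;
import java.net.SocketTimeoutException;

class RequestLoop {
    static void serve(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            s.setSoTimeout(30_000); // close if no request arrives within 30 s
            String request;
            while ((request = in.readLine()) != null) { // many requests per connection
                out.println("OK " + request);           // placeholder response
            }
        } catch (SocketTimeoutException e) {
            // idle too long: fall through and let try-with-resources close the socket
        } catch (IOException ignored) {
        }
    }
}
```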
When writing any kind of web server in Java (be it a plain web server, a RESTful webapp, or a microservice), you end up using sockets for two-way communication between client and server.
Using the common Socket and ServerSocket classes is trivial, but since sockets are blocking, you end up creating a thread for each request. With this threaded system your server will work perfectly, but it won't scale very well.
The alternative is to use channels, by means of SocketChannel, ServerSocketChannel, and Selector, which is clearly not as trivial as plain sockets.
My question is: which of these two approaches is used in production-ready code? I'm talking about medium-to-large projects like Tomcat, Jetty, Sparkjava, and the like.
I suppose they all use the channel-based approach, right?
To make a web server really scalable, you'll have to implement it with non-blocking I/O, which means building it in such a way that threads never get blocked waiting for I/O operations to complete.
Threads are relatively expensive objects. For example, memory needs to be allocated for each thread's call stack, by default on the order of one or a few MB. That means that if you create 1000 threads, the call stacks alone will already cost you around 1 GB of memory.
In a naïve server application, you might create a thread for each accepted connection (each client). This won't scale very well if you have many concurrent users.
I don't know the implementation details of servers like Tomcat and Jetty, but they are most likely implemented using non-blocking I/O.
Some info about non-blocking I/O in Tomcat: Understanding the Tomcat NIO Connector
One of the most well-known non-blocking I/O libraries in Java is Netty.
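As a taste of the programming model, here is a minimal Netty 4 echo server sketch (the port is arbitrary, and a real server would add codecs for its protocol):

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class NettyEchoServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);   // accepts connections
        EventLoopGroup workers = new NioEventLoopGroup(); // a few threads serve many channels
        try {
            ServerBootstrap b = new ServerBootstrap()
                .group(boss, workers)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                            @Override
                            public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                ctx.writeAndFlush(msg); // echo the bytes back
                            }
                        });
                    }
                });
            b.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```

Note that no thread is ever dedicated to a single connection: the event loop groups multiplex all channels, which is the non-blocking model described above.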
I know my IP address, and that of my friend.
How can I transfer objects/files between the two machines?
I am an advanced Java programmer, but have never worked with networks before.
EDIT:
I am now using an API called jnmp2p (http://code.google.com/p/jnmp2p/).
It works fine when I use internal IPs, but fails when I give the external ones.
How do I connect to a computer that isn't on my private network?
If you are looking for communication between two Java applications and do not want to meddle with low-level networking details, then you can use the following two approaches, depending on the type of applications you are dealing with.
If both applications (on the two machines) are standalone Java applications, then RMI is the best bet; a minimal sketch follows after these two options. Check out the basics from these links (1,2).
If the application receiving the files/objects is a web application, then you can write a Servlet on the server side and a client application to send the files/objects (binary) to the server. Commons FileUpload is a very popular library for this purpose.
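For the first approach, a bare-bones RMI sketch might look like this (the interface, names, and port are made up for illustration):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface; byte[] keeps the example simple.
interface FileTransfer extends Remote {
    void upload(String name, byte[] data) throws RemoteException;
}

class FileTransferServer implements FileTransfer {
    public void upload(String name, byte[] data) {
        System.out.println("Received " + name + " (" + data.length + " bytes)");
    }

    public static void main(String[] args) throws Exception {
        FileTransfer stub =
            (FileTransfer) UnicastRemoteObject.exportObject(new FileTransferServer(), 0);
        Registry registry = LocateRegistry.createRegistry(1099); // default RMI port
        registry.rebind("fileTransfer", stub);
    }
}
```

The client then obtains the stub with LocateRegistry.getRegistry(host, 1099).lookup("fileTransfer") and calls upload(...) on it as if it were a local object.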
Author of jnmp2p here. I don't maintain the library any more because I've moved on to other things. However, I do have some comments.
Peer-to-peer communication with IPs outside your private network is a hard problem. Stateful NATs and firewalls have become commonplace on both ends, and they prevent you from establishing connections between machines directly.
For example, Skype uses a rendezvous service where both machines open outbound connections to a third machine and communicate via that. Short of setting up additional infrastructure like that, any peer-to-peer solution is going to be limited to subnets within your NAT, so solutions like JNMP2P or RMI (with gross modifications) are going to be your best bet.
I'm writing a multiplayer/multiroom game (Hearts) in Java, using RMI and a centralized server.
But there's a problem: RMI callbacks will not work because clients are NATted and firewalled. I basically need the server to push data updates to clients, possibly without polling and without using raw sockets (I would rather code at a higher level).
In your opinion, what's the best solution for realizing this kind of architecture? Is an ajax application the only solution?
You say that you don't want polling, but AJAX is exactly that. You can look at Comet but it's hard to escape polling anyway (e.g. Comet itself uses polling underneath).
You could use a peer to peer framework such as JXTA.
I can suggest two main techniques.
The server has a method getUpdates, callable by clients. The method returns control to the client only when there is an update to show.
When clients register, they give the server a remote callback object.
Since this object is not registered in any RMI registry, there should not be any issue with NATted clients.
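The first technique is essentially long polling over RMI. A rough server-side sketch, assuming one update queue per client (the names and String payload are illustrative, and the implementation object would still need to be exported via UnicastRemoteObject):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

interface GameServer extends Remote {
    // Blocks server-side until an update is available for this client.
    String getUpdates(String clientId) throws RemoteException;
}

class GameServerImpl implements GameServer {
    private final Map<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();

    public String getUpdates(String clientId) throws RemoteException {
        try {
            // The RMI worker thread parks here; the client's remote call
            // returns only when push() enqueues something for it.
            return queues.computeIfAbsent(clientId, id -> new LinkedBlockingQueue<>()).take();
        } catch (InterruptedException e) {
            throw new RemoteException("interrupted", e);
        }
    }

    // Called by the game logic whenever something happens a client must see.
    void push(String clientId, String update) {
        queues.computeIfAbsent(clientId, id -> new LinkedBlockingQueue<>()).offer(update);
    }
}
```

Because the client initiates every connection, this works through NATs and firewalls at the cost of one parked call per client.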
I'm not sure how (or if) AJAX works for a non-browser-based app. You could just maintain your own pool of socket connections, held open for the duration of the application, with a thread per connection.
If you need to scale to a lot of concurrent connections, look to a non-blocking I/O framework like Apache Mina or Netty (related SO post: Netty vs Apache MINA).
The point of my question is to ask if it is accepted to use both TCP and UDP to communicate between client and server.
I am making a real-time client-server game with parts of the communication that need to be guaranteed (logging in, etc.), but other parts where it is OK to lose packets (state updates, etc.). So I would like to use UDP for most of the data communication, but I do not want to have to implement my own framework to ensure that my control communication (logging in) is guaranteed.
So, would it be reasonable to initially use TCP to manage a connection, and then send data communication back and forth on a separate port?
You should absolutely do it that way (use TCP and UDP to accomplish different communication tasks). And you don't even have to use two different ports. One will suffice. You can listen to the two different protocols on the same port.
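TCP and UDP have separate port number spaces, so binding both to the same number is legal. A minimal sketch (port 5000 is arbitrary):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.ServerSocket;
import java.net.Socket;

public class DualProtocolServer {
    public static void main(String[] args) throws Exception {
        ServerSocket tcp = new ServerSocket(5000);     // reliable: logins, control messages
        DatagramSocket udp = new DatagramSocket(5000); // lossy but fast: state updates

        new Thread(() -> {
            try {
                while (true) {
                    Socket client = tcp.accept();
                    // ... authenticate and manage the session over TCP ...
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }).start();

        byte[] buf = new byte[1024];
        while (true) {
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            udp.receive(packet);
            // ... apply the state update; occasional losses are tolerated ...
        }
    }
}
```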
It is quite reasonable and already in mainstream use. Even when browsing the Web, DNS operations are UDP-based and HTTP connections are TCP-based.
Keep in mind that you should either consider the two connection types to be completely independent or employ additional measures to properly handle any inter-dependencies. TCP connections can have timing issues at the OS and network levels and UDP connections have packet loss issues. You should take specific measures to avoid deadlocks and performance problems when the TCP part of your application stalls or a UDP packet is lost.
It is not only accepted but widely used. As a good example, BATS Exchange uses this approach in their market data distribution system to implement recovery mechanisms.