At what point is it better to switch from java.net to java.nio? .net (not the Microsoft entity) is easier to understand and more familiar, while nio is scalable, and comes with some extra nifty features.
Specifically, I need to make a choice for this situation: We have one control center managing hardware at several remote sites (each site with one computer managing multiple hardware units (a transceiver, TNC, and rotator)). My idea was to have write a sever app on each machine that acts as a gateway from the control center to the radio hardware, with one socket for each unit. From my understanding, NIO is meant for one server, many clients, but what I'm thinking of is one client, many servers.
I suppose a third option is to use MINA, but I'm not sure if that's throwing too much at a simple problem.
Each remote server will have up to 8 connections, all from the same client (to control all the hardware, and separate TX/RX sockets). The single client will want to connect to several servers at once, though. Instead of putting each server on different ports, is it possible to use channel selectors on the client side, or is it better to go multi-threaded io on the client side of things and configure the servers differently?
Actually, since the remote machines serve only to interact with other hardware, would RMI or IDL/CORBA be a better solution? Really, I just want to be able to send commands and receive telemetry from the hardware, and not have to make up some application layer protocol to do it.
Avoid NIO unless you have a good reason to use it. It's not much fun and may not be as beneficial as you would think. You may get better scalability once you are dealing with tens of thousands of connections, but at lower numbers you'll probably get better throughput with blocking IO. As always though, make your own measurements before committing to something you might regret.
Something else to consider is that if you want to use SSL, NIO makes it extremely painful.
Scalability will probably drive your choice of package. java.net will require one thread per socket. Coding it will be significantly easier. java.nio is much more efficient, but can be hairy to code around.
I would ask yourself how many connections you expect to be handling. If it's relatively few (say, < 100), I'd go with java.net.
There is almost no reason to write this kind of networking code from scratch now. Packages like netty.io will almost always get you more reliable and flexible code with fewer lines of code than a hand-crafted solution will. Also, with Netty, you can get SSL support w/o complicating your implementation at all. Libraries like netty also obviate the "async vs threads" question almost entirely, gives you good performance, and still lets you tweak the threading model as needed.
The number of connections you're talking about tells me you should use java.net. Really, there's no reason to complexify your task with non-blocking I/O. (Unless your remote systems are underpowered, but then why are you using Java on them?)
Take a look at Apache's XML-RPC package. It's easy to use, completely hides the network stuff from you, and works over good ol' HTTP. No need to worry about protocol issues ... it'll all look like method calls to you, on both ends.
Given the small number of connections involved, java.net sounds like the right solution.
Other posters talked about using XML-RPC. This is a good choice if the volumes of data being shipped are small, however I have had bad experiences with XML-based protocols when writing inter-process communications that ship large amounts of data (e.g. large request/responses, or frequent small amounts of data). The cost of XML parsing is typically orders of magnitude higher than more optimised wire formats (e.g. ASN.1).
For low volume control applications the simplicity of XML-RPC should outweigh the performance costs. For high volume data communications it may be better to use a more efficient wire protocol.
Related
I have the following situation:
I have 2 JVM processes (really 2 java processes running separately, not 2 threads) running on a local machine. Let's call them ProcessA an ProcessB.
I want them to communicate (exchange data) with one another (e.g. ProcessA sends a message to ProcessB to do something).
Now, I work around this issue by writing a temporary file and these process periodically scan this file to get message. I think this solution is not so good.
What would be a better alternative to achieve what I want?
Multiple options for IPC:
Socket-Based (Bare-Bones) Networking
not necessarily hard, but:
might be verbose for not much,
might offer more surface for bugs, as you write more code.
you could rely on existing frameworks, like Netty
RMI
Technically, that's also network communication, but that's transparent for you.
Fully-fledged Message Passing Architectures
usually built on either RMI or network communications as well, but with support for complicated conversations and workflows
might be too heavy-weight for something simple
frameworks like ActiveMQ or JBoss Messaging
Java Management Extensions (JMX)
more meant for JVM management and monitoring, but could help to implement what you want if you mostly want to have one process query another for data, or send it some request for an action, if they aren't too complex
also works over RMI (amongst other possible protocols)
not so simple to wrap your head around at first, but actually rather simple to use
File-sharing / File-locking
that's what you're doing right now
it's doable, but comes with a lot of problems to handle
Signals
You can simply send signals to your other project
However, it's fairly limited and requires you to implement a translation layer (it is doable, though, but a rather crazy idea to toy with than anything serious.
Without more details, a bare-bone network-based IPC approach seems the best, as it's the:
most extensible (in terms of adding new features and workflows to your
most lightweight (in terms of memory footprint for your app)
most simple (in terms of design)
most educative (in terms of learning how to implement IPC). (as you mentioned "socket is hard" in a comment, and it really is not and should be something you work on)
That being said, based on your example (simply requesting the other process to do an action), JMX could also be good enough for you.
I've added a library on github called Mappedbus (http://github.com/caplogic/mappedbus) which enable two (or many more) Java processes/JVMs to communicate by exchanging messages. The library uses a memory mapped file and makes use of fetch-and-add and volatile read/writes to synchronize the different readers and writers. I've measured the throughput between two processes using this library to 40 million messages/s with an average latency of 25 ns for reading/writing a single message.
What you are looking for is inter-process communication. Java provides a simple IPC framework in the form of Java RMI API. There are several other mechanisms for inter-process communication such as pipes, sockets, message queues (these are all concepts, obviously, so there are frameworks that implement these).
I think in your case Java RMI or a simple custom socket implementation should suffice.
Sockets with DataInput(Output)Stream, to send java objects back and forth. This is easier than using disk file, and much easier than Netty.
I tend to use jGroup to form local clusters between processes. It works for nodes (aka processes) on the same machine, within the same JVM or even across different servers.
Once you understand the basics it is easy working with it and having the options to actually run two or more processes in the same JVM makes it easy to test those processes easily.
The overhead and latency is minimal if both are on the same machine (usually only a TCP rountrip of about >100ns per action).
socket may be a better choice, I think.
Back in 2004 I implement code which do the job with sockets. Until then, many times I search for a better solution, because socket approach triggers firewall and my clients worry. There is no better solution until now. Client must serialize your data, send and server must receive and unserialize.
It is easy.
Basically I want a Java, Python, or C++ script running on a server, listening for player instances to: join, call, bet, fold, draw cards, etc and also have a timeout for when players leave or get disconnected.
Basically I want each of these actions to be a small request, so that players could either be processes on same machine talking to a game server, or machines across network.
Security of messaging is not an issue, this is for learning/research/fun.
My priorities:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Speed. I'm going to be playing millions of these hands as fast as I can.
Run on a shared server instance (I may have limited access to ports or things that need root)
My questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
Any good frameworks to work off of?
Message types? I'm thinking JSON or Protocol Buffers.
How to make it FAST?
Thanks guys - just looking for some pointers and suggestions. I think it is a cool problem with a lot of neat things to learn doing it.
As far as frameworks goes, Ginkgo looks promising for building a network service (which is what you're doing). The Python is very straightforward, and the asynchronicity enabled by gevent lets you do asynchronous things without generally having to worry about callbacks. The gevent core also gives you access to a lot of building blocks.
Rather than having lots of services communicating over ports, you might look into either 1) a good message queue, like RabbitMQ or 0mq, or 2) a distributed coordination server, like Zookeeper.
That being said, what you aim to do is difficult, especially if you're not familiar with the basics. It's a worthwhile endeavor to learn about those basics.
Don't worry about speed at first. Get it working, then make it scale. Of course, there are directions you can go that will make it easier to scale in the future. Zookeeper in particular gives you easy-to-implement primitives for scaling horizontally (i.e. multiple workers sharing the load). In particular, see the Zookeeper recipe book and their corresponding python implementations (courtesy of the kazoo, a gevent-based client library).
Don't forget that "fast" also means optimizing your own development time, for quicker iterations and less time cursing your development environment. So use Python, which will let you get up and running quickly now, and optimize later if you really truly start to bind on CPU time or memory use. (With this particular application, you're far more likely to bind on network IO.)
Anything else? Maybe a cup of coffee to go with your question :-)
Answering your question from the ground up would require several books worth of text with topics ranging from basic TCP/IP networking to scalable architectures, but I'll try to give you some direction nevertheless.
Questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
I would venture that if you're not clear on the definition of each of these maybe designing an implementing a service that will be "be playing millions of these hands as fast as I can" is a bit hmm, over-reaching? But don't let that stop you as they say "ignorance is bliss."
Any good frameworks to work off of?
I think your project is a good candidate for Node.js. There main reason being that Node.js is relatively scaleable and it is good at hiding the complexity required for that scalability. There are downsides to Node.js, just Google search for 'Node.js scalability critisism'.
The main point against Node.js as opposed to using a more general purpose framework is that scalability is difficult, there is no way around it, and Node.js being so high level and specific provides less options for solving though problems.
The other drawback is Node.js is Javascript not Java or Phyton as you prefer.
Message types? I'm thinking JSON or Protocol Buffers.
I don't think there's going to be a lot of traffic between client and server so it doesn't really matter I'd go with JSON just because it is more prevalent.
How to make it FAST?
The real question is how to make it scalable. Running human vs human card games is not computationally intensive, so you're probably going to run out of I/O capacity before you reach any computational limit.
Overcoming these limitations is done by spreading the load across machines. The common way to do in multi-player games is to have a list server that provides links to identical game servers with each server having a predefined number of slots available for players.
This is a variation of a broker-workers architecture were the broker machine assigns a worker machine to clients based on how busy they are. In gaming users want to be able to select their server so they can play with their friends.
Related:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Since this is in human time scales (seconds as opposed to miliseconds) the client should send keepalives say every 10 seconds with say 30 second session timeout.
The keepalives would be JSON messages in your application protocol not HTTP which is lower level and handled by the framework.
The framework itself should provide you with HTTP 1.1 connection management/pooling which allows several http sessions (request/response) to go through the same connection, but do not require the client to be always connected. This is a good compromise between reliability and speed and should be good enough for turn based card games.
Honestly, I'd start with classic LAMP. Take a stock Apache server, and a mysql database, and put your Python scripts in the cgi-bin directory. The fact that they're sending and receiving JSON instead of HTTP doesn't make much difference.
This is obviously not going to be the most flexible or scalable solution, of course, but it forces you to confront the actual problems as early as possible.
The first problem you're going to run into is game state. You claim there is no shared state, but that's not right—the cards in the deck, the bets on the table, whose turn it is—that's all state, shared between multiple players, managed on the server. How else could any of those commands work? So, you need some way to share state between separate instances of the CGI script. The classic solution is to store the state in the database.
Of course you also need to deal with user sessions in the first place. The details depend on which session-management scheme you pick, but the big problem is how to propagate a disconnect/timeout from the lower level up to the application level. What happens if someone puts $20 on the table and then disconnects? You have to think through all of the possible use cases.
Next, you need to think about scalability. You want millions of games? Well, if there's a single database with all the game state, you can have as many web servers in front of it as you want—John Doe may be on server1 while Joe Schmoe is on server2, but they can be in the same game. On the other hand, you can a separate database for each server, as long as you have some way to force people in the same game to meet on the same server. Which one makes more sense? Either way, how do you load-balance between the servers. (You not only want to keep them all busy, you want to avoid the situation where 4 players are all ready to go, but they're on 3 different servers, so they can't play each other…).
The end result of this process is going to be a huge mess of a server that runs at 1% of the capacity you hoped for, that you have no idea how to maintain. But you'll have thought through your problem space in more detail, and you'll also have learned the basics of server development, both of which are probably more important in the long run.
If you've got the time, I'd next throw the whole thing out and rewrite everything from scratch by designing a custom TCP protocol, implementing a server for it in something like Twisted, keeping game state in memory, and writing a simple custom broker instead of a standard load balancer.
I'm planning on building a Java server that will handle real time game communications between clients. What is the best type of Java implementation out there that could efficiently and, hopefully, accurately communicate between a client and server at high speeds (say 5-15 packets per second)? I know there are many types of Java networking APIs (ie. ObjectInputStream and ObjectOutputStream, DatagramPacket, KyroNet, etc.), but I'm not sure what is the most effective and/or commonly used implementation for such a scenario. I would assume that most real time games use UDP communication methods, but I understand the reliability issues that come with it. Are there UDP implementations that have some form of flow control? Anyway, thanks in advance!
A few things to consider:
Java NIO is really good, and can handle the kind of throughput/latency you are looking for. Don't use any of the older networking / serialization frameworks and APIs
Latency is really important. You basically want a minimal layer over NIO that allows you to send very fast, small, inidividual messages with minimal overhead.
Depending on the game, you may want TCP or UDP or both. Use TCP for important messages, UDP for messages that aren't strictly necessary for the game to proceed or will be subsumed by a future update (e.g. position updates in a FPS)
Some people implement their own TCP-like messaging protocol over UDP for real time games. This is probably more hassle than it's worth, but be aware of it as an option if you really need to optimise for a specific type of communication
For real time games, you are nearly always doing custom serialisation (e.g. only sending deltas rather than full updates of object positions) - so make sure your framework allows this
Given this, I'd recommend one of the following
Kryonet - lightwieght, customisable, designed for this kind of purpose
Netty - slightly more enterprise-oriented, but very capable, robust and scalable
Roll-your-own based on NIO - tricky but possible if you want really fine grained control. I've done this before, but in retrospect I probably should have picked Kryonet or Netty
Good luck!
Immidiately forget ObjectOutputStream and ObjectInputStream. These are the standard output-input mechanisms of the old standard java serialization, which is slow and produces bloat objects. Some resources to start with:
http://code.google.com/p/kryonet/
http://code.google.com/p/pyronet/
I have done some searching but haven't come up with anything on this topic. I was wondering if anyone has ever compared (to some degree) the performance difference between an RPC over a socket and a REST web service. If both do the same thing, which would have a tendency to be the better performer? I've already started building some socket code and would like to know if REST would give better performance before I progress much further. Any input would be really appreciated. Thanks indeed
RMI
Feels like a local API, much like
XMLRPC
Can provide some fairly nice remote
exception data
Java specific means this causes lock
in and limits your options
Has horrible versioning problems
between different versions of clients
Skeleton files must be compiled in
like CORBA, which is not very flexible
REST:
easy to route around firewalls
useful for uploading files as it can
be rather lightweight
very simple if you just want to shove
simple things at something and get
back an integer (like for uploaders)
easy to proxy security behind Apache
and let it take the heat
does not define any standard format
for the way the data is being
exchanged (could be JSON, YAML 1.0,
YAML 2.0, arbitrary XML format, etc)
does not define any convention about
having remote faults sent back to the
caller, integer codes are frequently
used, but method of sending back data
is not defined. Ideally this would be
standardized.
may require a lot of work on the
client side caller of the library to
make use of data (custom serialization
and so forth)
In short from here
web services do allow a loosely
coupled architecture. With RMI, you
have to make sure that the objects
stay in sync in all applications
RMI works best for smaller
applications, that are not
internet-related and thus not scalable
Its hard to imagine that REST is faster than a simple socket connection given it also goes over a Socket.
However REST may be performant enough, standard and easier to use. I would test whether REST is fast enough and meets your requirements first (or one of the many other existing solutions) before attempting your own Socket solution.
I am wondering how fast client side Javascript is compared to server side Java in terms of raw computational power.
For instance, sorting. Should it all be done server side if possible? And how about iterating through a collection?
The answer is very complex and depends on each specific situation.
A server is generally going to be orders of magnitude more powerful than a client machine; and managed code is generally much faster than scripting.
However - the client machine also usually has a lot of spare computational power that isn't being used, while the server could be running requests for thousands of users. So in that case much of the work that can be offloaded to the client is preferable.
You must understand the needs and expectations of your users for each individual piece of functionality in your application and look at the relative load versus development cost for your organization to split development between two environments and figure out what works best. For example, your users probably expect that your site does not freeze their browser or cause unfortunate "this web page is eating your computer" dialogs, so your client scripts should be written intelligently. That's not to say you can't do a ton of work on the client (you can), you just have to be smart about how you do it and remember it blocks the UI thread.
Server side Java will certainly run much faster, you'll need to benchmark for your particular case but you're probably looking at a 10-20x speed advantage.
However that probably doesn't matter much: regardless of raw computational power I would still recommend trying to do as much calculation as possible client side in Javascript for the following reasons:
Even 20x slower is still likely to be unnoticeable to the user
When you factor in the latency of client to server communications, doing it locally on the client will almost certainly be more responsive to the user
Client machines are probably not CPU-bound, so executing some additional code on them is effectively free
If you can offload work from the server to the client, you will need less server side infrastructure, which can get expensive when you need to start scaling up
Having lots of client to server communications is likely to complicate your architecture and make it harder to develop new functionality in the future.
Doing calculations on the client can often reduce bandwidth requirements
There are of course good reasons to keep things on the server e.g.:
Security implications (if client can't be trusted)
Very large data set needed (would take too long to download to client)
Need to exploit massively parallel calculations (e.g. for Google search)
Avoid need to allow for differences in clients (e.g. Javascript versions)
But if these don't apply then I would try to push things to the client as much as possible.
The big difference here is not the speed of the VMs. The difference is that a single server has to serve dozens or hundreds of clients. Another factor: round trips to the server add a lot of overhead, so you want to minimize them.
Basically, anything that's not security-critical and can be done on the client easily, should be done on the client.
These two things cannot be compared side-by-side.
There are far too many factors, and the languages are far too different, and serve far too different purposes to effectively compare their speed.
You really need to decide where you do your calculations on a case-by-case basis.
If the client machine is required to do too much work, it will degrade the performance of the app, but if the server is asked to do too much, it can slow down the response time for everybody.
Javascript is way fast enough to do sorting of data on the client. I have used it with datasets of 5,000 rows, 11 fields per row and used that to sort tables on the client (with pagination). These sorts used compare functions so that it would sort the rows by field and datatype. The actual Javascript part of the process took something on the order of the high tens of milliseconds (~80 if I recall).
I would rather push that kind of mundane task down to the client any day rather than clog up a very busy server with it. YMMV.
Don't mixup Java with Javascript - the name is similar but they are completely different languages.
Javascript is a client side, interpreted language, Java is a byte-code language running inside a virtual machine, with much more optimization for handling large data.
As of the fact, that servers running Java services are normally have much more power (faster CPUs and disk-I/O, more RAM) computing on Java is always faster on my experience.
Javascript can be used on client-side if you want to compute small datas (like sorting just a few hundred elements).
All in all you will have to decide which way is faster: compute and prepare the data on a server and transmit them to the client (where the transmit via internet is the by far biggest slowdown reason), or to compute the data already on the client-side via javascript.
My suggestion is: if there are none of the data you want on client-side are already on client-side it is meaningful to compute them on the server and transmit the already prepared data to the client. But if the data is already on the client-side and they are not more than a few hundred the better user-experience is to compute them in the user's browser.
It really depends on the boxes you are running the code, how big the data is and the availability to work with the process and other factors, plus you have to think sending data through the wire that it's expensive. You have to balance what you gonna do with that and if it's better to spend more time processing things before and let the resources free for the heavy stuff, and playing sending back and forth data.
There is not an specific answer. It depends on the power of your client and the size of the computation. Is it a smart watch, a smart phone? If you can't guarantee the power of your client, I would leave the computation to the server.