Let's assume I have a JavaScript front-end (AngularJS, for example), a Java-based back-end (Spring running on Tomcat, for example) and a database management system (SAP HANA in-memory, in my case). The data I'm working with is graphs that can change relatively quickly.
I am wondering what an efficient and fast architecture could look like. Do you usually send a whole collection of objects to the UI or do you just send deltas?
In my case, data consistency on the UI is very important for the application to work properly, but so is low latency, especially when it comes to data merges.
For consistency, I tend to re-SELECT the whole object collection from the database after every insert, but my concern is that this does not scale.
Is there a generic approach to that problem or even existing frameworks?
Edit:
Currently it is around 300 objects with a couple of integer attributes and cross-references that can change and rearrange on a millisecond timescale, but this could go up to 10,000 in the future. My challenge is the communication between front-end and back-end, so that the front-end always has a consistent data set in real time.
How close is the client to the server? Is it a mile/km away or hundreds/thousands of miles away? Is the client on the internet or is it on a high-performance VPN? Are you close to the backbone or dozens of hops away? You're not normally going to consistently get 1 millisecond latency on the web if you're trusting the general internet.
If you are on an internal company network and the client is physically close to the server, e.g., same machine, same local network, you can get single digit ms latency with WebSocket (I personally have gotten 3-4 ms across internal data centers at a big investment bank).
Don't optimize too early. That's usually a bad thing.
That said, with any high-performance UI it's always good to send just the deltas.
You may want to consider some sort of event mechanism to reduce polling of the data source. Then you would only update the data when it actually changed.
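Since the original stack is Spring, one concrete shape for combining deltas with an event mechanism is a STOMP-over-WebSocket topic that the server publishes changes to. The following is only a minimal sketch, assuming spring-boot-starter-websocket (or equivalent STOMP configuration) is in place and using a Java record for brevity; GraphDelta and the /topic/graph-deltas destination are invented names:

```java
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.stereotype.Service;

/** Hypothetical delta payload: just the IDs of changed objects, plus a
 *  monotonically increasing version so the client can detect gaps. */
record GraphDelta(long version, java.util.List<String> changedObjectIds) {}

@Service
public class GraphUpdatePublisher {
    private final SimpMessagingTemplate messaging;

    public GraphUpdatePublisher(SimpMessagingTemplate messaging) {
        this.messaging = messaging;
    }

    public void publish(GraphDelta delta) {
        // Subscribed clients apply deltas in order; if they see a version
        // gap, they re-sync by fetching a full snapshot over plain REST.
        messaging.convertAndSend("/topic/graph-deltas", delta);
    }
}
```

The version number is what buys you consistency: the client never has to guess whether it missed an update.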
Related
Basically I want a Java, Python, or C++ program running on a server, listening for player instances to join, call, bet, fold, draw cards, etc., and also handling timeouts for when players leave or get disconnected.
I want each of these actions to be a small request, so that players could either be processes on the same machine talking to a game server, or machines across the network.
Security of messaging is not an issue, this is for learning/research/fun.
My priorities:
A good scheme for detecting when players disconnect, while accounting for network latencies etc. before booting them / making them lose the hand.
Speed. I'm going to be playing millions of these hands as fast as I can.
Run on a shared server instance (I may have limited access to ports or things that need root)
My questions:
Should I listen on raw ports or sockets, or use a script behind Apache on HTTP port 80? (I'm a bit hazy on the differences between these.)
Any good frameworks to work off of?
Message types? I'm thinking JSON or Protocol Buffers.
How to make it FAST?
Thanks guys - just looking for some pointers and suggestions. I think it is a cool problem with a lot of neat things to learn doing it.
As far as frameworks go, Ginkgo looks promising for building a network service (which is what you're doing). The Python code is very straightforward, and the asynchronicity enabled by gevent lets you do asynchronous things without generally having to worry about callbacks. The gevent core also gives you access to a lot of building blocks.
Rather than having lots of services communicating over ports, you might look into either 1) a good message queue, like RabbitMQ or 0mq, or 2) a distributed coordination server, like Zookeeper.
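If you go the message-queue route in Java (one of the languages you listed), publishing a player action to RabbitMQ is only a few lines. A hedged sketch using the official amqp-client library; the queue name and message body are made up:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

public class ActionPublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumes a broker running locally
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Non-durable is fine for a learning project.
            channel.queueDeclare("player-actions", false, false, false, null);
            String msg = "{\"player\":\"p1\",\"action\":\"fold\"}";
            channel.basicPublish("", "player-actions", null,
                    msg.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```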
That being said, what you aim to do is difficult, especially if you're not familiar with the basics. It's a worthwhile endeavor to learn about those basics.
Don't worry about speed at first. Get it working, then make it scale. Of course, there are directions you can go that will make it easier to scale in the future. Zookeeper in particular gives you easy-to-implement primitives for scaling horizontally (i.e. multiple workers sharing the load). See the ZooKeeper recipe book and the corresponding Python implementations (courtesy of kazoo, a gevent-based client library).
Don't forget that "fast" also means optimizing your own development time, for quicker iterations and less time cursing your development environment. So use Python, which will let you get up and running quickly now, and optimize later if you really truly start to bind on CPU time or memory use. (With this particular application, you're far more likely to bind on network IO.)
Anything else? Maybe a cup of coffee to go with your question :-)
Answering your question from the ground up would require several books worth of text with topics ranging from basic TCP/IP networking to scalable architectures, but I'll try to give you some direction nevertheless.
Questions:
Should I listen on raw ports or sockets, or use a script behind Apache on HTTP port 80? (I'm a bit hazy on the differences between these.)
I would venture that if you're not clear on the definition of each of these, designing and implementing a service that will "be playing millions of these hands as fast as I can" is perhaps a bit over-reaching? But don't let that stop you; as they say, "ignorance is bliss."
Any good frameworks to work off of?
I think your project is a good candidate for Node.js, the main reason being that Node.js is relatively scalable and good at hiding the complexity required for that scalability. There are downsides to Node.js; just search for 'Node.js scalability criticism'.
The main point against Node.js, as opposed to a more general-purpose framework, is that scalability is difficult (there is no way around it), and Node.js, being so high-level and specific, provides fewer options for solving tough problems.
The other drawback is that Node.js means JavaScript, not the Java or Python you would prefer.
Message types? I'm thinking JSON or Protocol Buffers.
I don't think there's going to be a lot of traffic between client and server, so it doesn't really matter; I'd go with JSON just because it is more prevalent.
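If you use JSON from Java, a library like Jackson keeps the encode/decode step to a couple of lines. A sketch, assuming Jackson 2.12+ (for record support); the message fields are invented:

```java
import com.fasterxml.jackson.databind.ObjectMapper;

/** Hypothetical wire format for one player action, e.g.
 *  {"table":17,"player":"p1","action":"bet","amount":50} */
record PlayerAction(int table, String player, String action, int amount) {}

public class Wire {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    static String encode(PlayerAction a) throws Exception {
        return MAPPER.writeValueAsString(a);
    }

    static PlayerAction decode(String json) throws Exception {
        return MAPPER.readValue(json, PlayerAction.class);
    }
}
```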
How to make it FAST?
The real question is how to make it scalable. Running human vs human card games is not computationally intensive, so you're probably going to run out of I/O capacity before you reach any computational limit.
Overcoming these limitations is done by spreading the load across machines. The common way to do this in multiplayer games is to have a list server that provides links to identical game servers, each server having a predefined number of slots available for players.
This is a variation on a broker-workers architecture, where the broker machine assigns a worker machine to clients based on how busy each one is. In gaming, users want to be able to select their server so they can play with their friends.
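The slot-assignment logic in such a broker can stay very simple. A minimal Java sketch (all names invented) that hands each client the least-loaded server with a free slot:

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical view of one game server's load. */
record GameServer(String url, int capacity, int connected) {
    boolean hasSlot() { return connected < capacity; }
}

public class ListServer {
    /** Pick the least-busy server that still has a free slot. */
    public static GameServer assign(List<GameServer> servers) {
        return servers.stream()
                .filter(GameServer::hasSlot)
                .min(Comparator.comparingInt(GameServer::connected))
                .orElseThrow(() -> new IllegalStateException("all servers full"));
    }
}
```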
Related:
A good scheme for detecting when players disconnect, while accounting for network latencies etc. before booting them / making them lose the hand.
Since this is on human timescales (seconds as opposed to milliseconds), the client could send a keepalive every 10 seconds or so, with a session timeout of around 30 seconds.
The keepalives would be JSON messages in your application protocol, not HTTP keepalives, which are lower level and handled by the framework.
The framework itself should provide you with HTTP 1.1 connection management/pooling, which allows several HTTP exchanges (request/response) to go through the same connection but does not require the client to be always connected. This is a good compromise between reliability and speed and should be good enough for turn-based card games.
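The server side of that keepalive scheme is little more than a map of last-seen timestamps. A minimal sketch, assuming the 10 s / 30 s numbers above; all names are invented:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Clients send an application-level keepalive roughly every 10 s;
 *  sessions idle for more than 30 s are treated as disconnected. */
public class SessionReaper {
    private static final long TIMEOUT_MS = 30_000;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    public void onKeepalive(String playerId) {
        lastSeen.put(playerId, System.currentTimeMillis());
    }

    /** Run periodically, e.g. from a ScheduledExecutorService. */
    public void reap() {
        long now = System.currentTimeMillis();
        // Players removed here would be auto-folded / marked disconnected.
        lastSeen.entrySet().removeIf(e -> now - e.getValue() > TIMEOUT_MS);
    }
}
```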
Honestly, I'd start with classic LAMP. Take a stock Apache server and a MySQL database, and put your Python scripts in the cgi-bin directory. The fact that they're sending and receiving JSON instead of HTML doesn't make much difference.
This is obviously not going to be the most flexible or scalable solution, of course, but it forces you to confront the actual problems as early as possible.
The first problem you're going to run into is game state. You claim there is no shared state, but that's not right: the cards in the deck, the bets on the table, whose turn it is - that's all state, shared between multiple players and managed on the server. How else could any of those commands work? So, you need some way to share state between separate instances of the CGI script. The classic solution is to store the state in the database.
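In its crudest form that pattern is: load the state row, mutate it, write it back. A sketch of the database side in Java/JDBC terms (table and column names are invented); the version column gives you optimistic locking so two concurrent actions can't both win:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class GameStateStore {
    /** Load the authoritative state for one game (null if absent). */
    public String load(Connection db, long gameId) throws Exception {
        try (PreparedStatement ps =
                 db.prepareStatement("SELECT state_json FROM games WHERE id = ?")) {
            ps.setLong(1, gameId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    /** Write back; returns false if another request won the race. */
    public boolean save(Connection db, long gameId, long expectedVersion,
                        String stateJson) throws Exception {
        try (PreparedStatement ps = db.prepareStatement(
                "UPDATE games SET state_json = ?, version = version + 1 "
              + "WHERE id = ? AND version = ?")) {
            ps.setString(1, stateJson);
            ps.setLong(2, gameId);
            ps.setLong(3, expectedVersion);
            return ps.executeUpdate() == 1;
        }
    }
}
```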
Of course you also need to deal with user sessions in the first place. The details depend on which session-management scheme you pick, but the big problem is how to propagate a disconnect/timeout from the lower level up to the application level. What happens if someone puts $20 on the table and then disconnects? You have to think through all of the possible use cases.
Next, you need to think about scalability. You want millions of games? Well, if there's a single database with all the game state, you can have as many web servers in front of it as you want: John Doe may be on server1 while Joe Schmoe is on server2, but they can still be in the same game. On the other hand, you can have a separate database for each server, as long as you have some way to force people in the same game to meet on the same server. Which one makes more sense? Either way, how do you load-balance between the servers? (You not only want to keep them all busy, you want to avoid the situation where 4 players are all ready to go, but they're on 3 different servers, so they can't play each other…)
The end result of this process is going to be a huge mess of a server that runs at 1% of the capacity you hoped for, that you have no idea how to maintain. But you'll have thought through your problem space in more detail, and you'll also have learned the basics of server development, both of which are probably more important in the long run.
If you've got the time, I'd next throw the whole thing out and rewrite everything from scratch by designing a custom TCP protocol, implementing a server for it in something like Twisted, keeping game state in memory, and writing a simple custom broker instead of a standard load balancer.
What are the different ways to cache web application data for an application developed using Java and a NoSQL database? Databases also provide caching; is that the only, and always the best, option?
How else can I cache my users' data in the application? The application contains very user-specific data, as in a social network. Are there some simple rules of thumb for what types of things should be cached?
Can I also cache data on the application server itself, using Java?
If you want a rule of thumb, here's what Michael Jackson (not that Michael Jackson) said:
The First Rule of Program Optimization: Don't do it.
The Second Rule of Program Optimization (for experts only!): Don't do it yet.
The ancient tradition is that you don't optimise until you've profiled - that is, until you have hard evidence as to what actually needs to be optimised. Cacheing is a kind of optimisation; it is very likely to be important for your app, but until you are able to put your app under load and look at what objects are taking a long time to obtain (loading from the database or whatever), you won't know what needs cacheing. It really doesn't matter how smart you are, or what advice you get here - until you do that, you will not know what needs to be cached.
As for things you can cache, it could be almost anything, but I suppose you can classify it into three groups:
Things that have come fresh from the database. These are easy to cache, because at the point at which you go to the database, you have the identifying information you'd need for a cache key (primary key, query parameters, etc). By cacheing them, you save the time taken to get them from the database - this involves IO, so it is likely to be quite large.
Things that have been produced by computation in the domain model (news feeds in a social app, perhaps). These may be trickier to cache, because more contextual information goes into producing them; you might have to refactor your code to create a single point where the required information is all to hand, so you can apply cacheing to it. Or you might find that this exists already. Cacheing these will save all the database access needed to obtain the information that goes into making them, as well as all the computation; the time taken for computation may or may not be a significant addition to the time taken for IO. Invalidating cached things of this kind is likely to be much harder than pure database objects.
Things that are being sent to the browser - pages, or fragments of pages. These can be quite easy to cache, because in a properly-designed application, they're uniquely identified by either the URL, or the combination of URL and user. Cacheing these will save all the computation in your app; it can even avoid servicing requests, because it can be done by a reverse proxy sitting in front of your app server. Two problems. Firstly, it uses a huge amount of memory: the page rendered from a few kilobytes of objects could be tens or hundreds of kilobytes in size (my Facebook homepage is 50 kB). That means you have to save a vast amount of computation to make it a better deal than cacheing at the database or domain model layers, and there just isn't that much computation between the domain model and the HTML in a sensibly-designed application. Secondly, invalidation is even harder than in the domain model, and is likely to happen prohibitively often - anything which changes the page or the fragment needs to invalidate the cache.
Finally, the actual mechanism: start with something simple and in-process, like a map with limited size and a least-recently-used eviction policy. That's simple but effective. Something out-of-process like EHCache is more complicated, but has two advantages: you can share caches between multiple processes (helpful if you have a cluster, which you probably will at some point), and you can store data where the garbage collector won't see it, which might save some CPU time (might - this is too big a subject to get into here).
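For reference, that simple in-process option is only a few lines in Java, since LinkedHashMap supports access-order iteration out of the box:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A bounded map with least-recently-used eviction. Not thread-safe on
 *  its own; wrap with Collections.synchronizedMap if needed. */
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```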
But I reiterate my first point: don't cache until you know what needs to be cached, and once you do, be mindful of the limitations on the benefits of cacheing, and try to keep your cacheing strategy as simple as possible (but no simpler, of course).
I'll assume you're building a relatively typical web application that:
has a single server used for persistence
has multiple web servers
ties authenticated users to a single server via sticky sessions through a load balancer
Now, with that stated, to answer some of your questions: most persistence stores, database or NoSQL, likely have some sort of caching built in, such that if you execute the same simple query repeatedly (e.g. retrieval by primary key) they can cache the result. However, the more complex the query, the less likely the store can cache it. In addition, if there's only one server for persistence (i.e. no sharding, or write-master/read-slaves) it quickly becomes the bottleneck. So the application-level caching you want to do should usually occur on the web servers, to reduce load on the database.
As far as what should be cached, the heuristic is items frequently accessed and/or expensive to generate (in terms of database/web server processing/memory). Typical candidates are the home page and any other landing page of a site - often the best approach for these is generating a static file and serving that. The next pieces depend on your application, but typically the most effective strategy is caching as close to the final result as possible - often the HTML being served. For your social network this might be a list of featured updates or some such.
As far as user sessions are concerned, these are definitely a good candidate for caching. In this case you can probably get a lot of mileage out of judicious use of the web server's session scope (assuming a JSP server). This data lives in memory and is a good place to keep user-specific information that is shown on every page once a user authenticates (e.g. first and last name).
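For instance (a sketch, assuming the servlet API and a hypothetical User domain class), stashing the display name at login means every later page render skips the user-table lookup:

```java
import javax.servlet.http.HttpServletRequest;

/** Hypothetical domain class. */
record User(String firstName, String lastName) {}

public class LoginHandler {
    /** Cache per-user display data in the session after authentication. */
    public void onLoginSuccess(HttpServletRequest request, User user) {
        request.getSession().setAttribute("displayName",
                user.firstName() + " " + user.lastName());
    }
}
```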
The final thing to consider is cache invalidation, which really is the hard part of all this (naming things is the other hard thing in computer science). In this case using something like memcached or ehcache as others have mentioned is the right approach. ehcache can easily run in process with your Java application and does a good job of expiring things, with policies for least recently used and least frequently used, and allowing you to use both memory and disk for caching. What you'll need to think about is the situations where you need to expire something from the cache ahead of this schedule because data has changed. In this case you need to work through those dependencies in your application's architecture so that it reads/writes the cache as appropriate.
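The read-through / invalidate-on-write shape of that looks roughly like this in Java; a plain map stands in for Ehcache or memcached here, and ProfileDao/Profile are invented names:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

record Profile(long userId, String status) {}

interface ProfileDao {
    Profile loadProfile(long userId);
    void saveProfile(Profile p);
}

public class CachingProfileStore {
    private final Map<Long, Profile> cache = new ConcurrentHashMap<>();
    private final ProfileDao dao;

    public CachingProfileStore(ProfileDao dao) { this.dao = dao; }

    public Profile get(long userId) {
        // Read-through: hit the database only on a cache miss.
        return cache.computeIfAbsent(userId, dao::loadProfile);
    }

    public void update(Profile p) {
        dao.saveProfile(p);
        cache.remove(p.userId()); // evict ahead of schedule: data changed
    }
}
```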
I am wondering how fast client side Javascript is compared to server side Java in terms of raw computational power.
For instance, sorting. Should it all be done server side if possible? And how about iterating through a collection?
The answer is very complex and depends on each specific situation.
A server is generally going to be orders of magnitude more powerful than a client machine; and managed code is generally much faster than scripting.
However, the client machine also usually has a lot of spare computational power that isn't being used, while the server could be handling requests for thousands of users. In that case, offloading as much work as possible to the client is preferable.
You must understand the needs and expectations of your users for each individual piece of functionality in your application and look at the relative load versus development cost for your organization to split development between two environments and figure out what works best. For example, your users probably expect that your site does not freeze their browser or cause unfortunate "this web page is eating your computer" dialogs, so your client scripts should be written intelligently. That's not to say you can't do a ton of work on the client (you can), you just have to be smart about how you do it and remember it blocks the UI thread.
Server side Java will certainly run much faster, you'll need to benchmark for your particular case but you're probably looking at a 10-20x speed advantage.
However that probably doesn't matter much: regardless of raw computational power I would still recommend trying to do as much calculation as possible client side in Javascript for the following reasons:
Even 20x slower is still likely to be unnoticeable to the user
When you factor in the latency of client to server communications, doing it locally on the client will almost certainly be more responsive to the user
Client machines are probably not CPU-bound, so executing some additional code on them is effectively free
If you can offload work from the server to the client, you will need less server side infrastructure, which can get expensive when you need to start scaling up
Having lots of client to server communications is likely to complicate your architecture and make it harder to develop new functionality in the future.
Doing calculations on the client can often reduce bandwidth requirements
There are of course good reasons to keep things on the server e.g.:
Security implications (if client can't be trusted)
Very large data set needed (would take too long to download to client)
Need to exploit massively parallel calculations (e.g. for Google search)
Avoid need to allow for differences in clients (e.g. Javascript versions)
But if these don't apply then I would try to push things to the client as much as possible.
The big difference here is not the speed of the VMs. The difference is that a single server has to serve dozens or hundreds of clients. Another factor: round trips to the server add a lot of overhead, so you want to minimize them.
Basically, anything that's not security-critical and can be done on the client easily, should be done on the client.
These two things cannot be compared side-by-side.
There are far too many factors, and the languages are far too different, and serve far too different purposes to effectively compare their speed.
You really need to decide where you do your calculations on a case-by-case basis.
If the client machine is required to do too much work, it will degrade the performance of the app, but if the server is asked to do too much, it can slow down the response time for everybody.
JavaScript is easily fast enough to sort data on the client. I have used it with datasets of 5,000 rows, 11 fields per row, to sort tables on the client (with pagination). These sorts used compare functions to sort the rows by field and datatype. The actual JavaScript part of the process took something on the order of high tens of milliseconds (~80 ms, if I recall).
I would rather push that kind of mundane task down to the client any day rather than clog up a very busy server with it. YMMV.
Don't mix up Java with JavaScript - the names are similar, but they are completely different languages.
JavaScript here is a client-side, interpreted language; Java is a bytecode language running inside a virtual machine, with much more optimization for handling large amounts of data.
Given that servers running Java services normally have much more power (faster CPUs, faster disk I/O, more RAM), computing in Java is, in my experience, always faster.
JavaScript can be used on the client side if you want to compute small data sets (like sorting just a few hundred elements).
All in all, you have to decide which way is faster: computing and preparing the data on the server and transmitting it to the client (where transmission over the internet is by far the biggest slowdown), or computing the data on the client side via JavaScript.
My suggestion: if the data you need is not already on the client side, it makes sense to compute it on the server and transmit the prepared result to the client. But if the data is already on the client side and amounts to no more than a few hundred elements, the better user experience is to compute it in the user's browser.
It really depends on the machines running the code, how big the data is, and what resources are available; you also have to remember that sending data over the wire is expensive. You have to balance what you are going to do with the data: whether it's better to spend more time processing it up front and keep resources free for the heavy work, or to keep shipping data back and forth.
There is no single answer. It depends on the power of your client and the size of the computation. Is the client a smart watch, a smartphone? If you can't guarantee the power of your client, I would leave the computation to the server.
I have to write an architecture case study, but there are some things that I don't know, so I'd like some pointers on the following:
The website must handle 5k simultaneous users.
The backend is composed by a commercial software, some webservices, some message queues, and a database.
I want to recommend using Spring for the back-end, to deal with the different elements and to expose some REST services.
I also want to recommend Wicket for the front-end (not the point here).
What I don't know is: must I install the front and the back on the same Tomcat server, or on two different ones? I am tempted to put two servers on the front with a load balancer (no need for session replication in this case). But if I have two front servers, must I have two back servers? I don't want to create some kind of bottleneck.
Based on what I read on this blog, a really huge load is handled by just one Tomcat for the first website mentioned. But I cannot find any more information on this, so I can't tell whether it is plausible.
If you can enlighten me so I can go on with my case study, that would be really helpful.
Thanks :)
There are probably two main reasons for having multiple servers for each tier; high-availability and performance. If you're not doing this for HA reasons, then the unfortunate answer is 'it depends'.
Having two front end servers doesn't force you to have two backend servers. Is the backend going to be under a sufficiently high load that it will require two servers? It will depend a lot on what it is doing, and would be best revealed by load testing and/or profiling. For a site handling 5000 simultaneous users, though, my guess would be yes...
It totally depends on your application. How heavy are your sessions? (Wicket is known for putting a lot in the session.) How heavy are your back-end processes?
It might be a better idea to come up with something that can scale: a load balancer with the ability to keep adding new servers as needed.
Measurement is the best thing you can do. Create JMeter scripts and find out where your app breaks. Build a plan from there.
To expand on my comment: think through the typical process by which a client makes a request to your server:
it initiates a connection, which has an overhead for both client and server;
it makes one or more requests via that connection, holding on to resources on the server for the duration of the connection;
it closes the connection, releasing application resources, but generally still hogging a port number on your server for some number of seconds after the connection is closed.
So in designing your architecture, you need to think about things such as:
how many connections can you actually hold open simultaneously on your server? If you're using Tomcat or another standard server with one thread per connection, you may have issues with 5,000 simultaneous threads (a NIO-based architecture, on the other hand, can handle thousands of connections without needing one thread per connection); if you're in a shared environment, you may simply not be able to have that many open connections;
if clients don't hold their connections open for the duration of a "session", what is the right balance between the number of requests and/or time per connection, bearing in mind the overhead of making and closing a connection (initialisation of the encrypted session if relevant, network overhead in creating the connection, a port "hogged" for a while after the connection is closed)?
Then more generally, I'd say consider:
in whatever architecture you go for, how easily can you re-architecture/replace specific components if they prove to be bottlenecks?
for each "black box" component/framework that you use, what actual problem does it solve for you, and what are its limitations? (Don't just use Tomcat because your boss's mate's best man told them about it down the pub...)
I would also agree with what other people have said-- at some point you need to not be too theoretical. Design something sensible, then run a test bed to see how it actually copes with your expected volumes of data. (You might not have the whole app built, but you can start making predictions about "we're going to have X clients sending Y requests every Z minutes, and p% of those requests will take n milliseconds and write r rows to the database"...)
I have an application that's a mix of Java and C++ on Solaris. The Java aspects of the code run the web UI and establish state on the devices that we're talking to, and the C++ code does the real-time crunching of data coming back from the devices. Shared memory is used to pass device state and context information from the Java code through to the C++ code. The Java code uses a PostgreSQL database to persist its state.
We're running into some pretty severe performance bottlenecks, and right now the only way we can scale is to increase memory and CPU counts. We're stuck on the one physical box due to the shared memory design.
The really big hit here is being taken by the C++ code. The web interface is fairly lightly used to configure the devices; where we're really struggling is to handle the data volumes that the devices deliver once configured.
Every piece of data we get back from the device has an identifier in it which points back to the device context, and we need to look that up. Right now there's a series of shared memory objects that are maintained by the Java/UI code and referred to by the C++ code, and that's the bottleneck. Because of that architecture we cannot move the C++ data handling off to another machine. We need to be able to scale out so that various subsets of devices can be handled by different machines, but then we lose the ability to do that context lookup, and that's the problem I'm trying to resolve: how to offload the real-time data processing to other boxes while still being able to refer to the device context.
I should note we have no control over the protocol used by the devices themselves, and there is no possible chance that situation will change.
We know we need to move away from this to be able to scale out by adding more machines to the cluster, and I'm in the early stages of working out exactly how we'll do this.
Right now I'm looking at Terracotta as a way of scaling out the Java code, but I haven't got as far as working out how to scale out the C++ to match.
As well as scaling for performance we need to consider high availability as well. The application needs to be available pretty much the whole time -- not absolutely 100%, which isn't cost effective, but we need to do a reasonable job of surviving a machine outage.
If you had to undertake the task I've been given, what would you do?
EDIT: Based on the information provided by @john channing, I'm looking at both GigaSpaces and Gemstone. Oracle Coherence and IBM ObjectGrid appear to be Java-only.
The first thing I would do is construct a model of the system to map the data flow and try to understand precisely where the bottleneck lies. If you can model your system as a pipeline, then you should be able to use the theory of constraints (most of the literature is about optimising business processes but it applies equally to software) to continuously improve performance and eliminate the bottleneck.
Next I would collect some hard empirical data that accurately characterises the performance of your system. It is something of a cliché that you cannot manage what you cannot measure, but I have seen many people attempt to optimise a software system based on hunches and fail miserably.
Then I would use the Pareto Principle (80/20 rule) to choose the small number of things that will produce the biggest gains and focus only on those.
To scale a Java application horizontally, I have used Oracle Coherence extensively. Although some dismiss it as a very expensive distributed hashtable, the functionality is much richer than that, and you can, for example, directly access data in the cache from C++ code.
Other alternatives for horizontally scaling your Java code would be GigaSpaces, IBM ObjectGrid or GemStone GemFire.
If your C++ code is stateless and is used purely for number crunching, you could look at distributing the process using ICE Grid which has bindings for all of the languages you are using.
You need to scale sideways and out. Maybe something like a message queue could sit between the front-end and the crunching code.
Andrew, (in addition to modelling the system as a pipeline etc.) measuring things is important. Have you run a profiler over the code and obtained metrics on where most of the time is spent?
For the database code, how often does the data change? Are you looking at caching at the moment? I assume you have looked at indexes etc. over the data to speed up the DB?
What levels of traffic do you have on the front end? Are you caching web pages? It isn't too hard to use a JMS-type API to communicate between components: you could put the web page component on one machine (or more) and the integration code (C++) on another, and for many JMS products there are native C++ APIs (ActiveMQ comes to mind; see the sketch at the end of this answer). But it really helps to know how much of the time is spent in the web tier (JSP?), in C++, and in database operations.
Is the database storing business data, or is it also being used to pass data between the Java and C++ sides? You say you are using shared memory, not JNI? What level of multi-threading currently exists in the app? Would you describe the code as synchronous or asynchronous in nature?
Is there a physical relationship between the Solaris code and the devices that must be maintained (i.e. do all the devices register with the C++ code, or can that be specified)? In other words, if you were to put a web load balancer on the front end and stand up two machines today, is the mapping of which devices are managed by which box fixed up front, or can it be assigned dynamically?
What are the HA requirements? I.e. just state info? Can the HA be done just in the web tier by clustering session data?
Is the DB running on another machine?
How big is the DB? Have you optimized your queries? E.g. explicit inner/outer joins sometimes help compared with nested subqueries. (Again, look at the SQL stats.)
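Regarding the JMS suggestion earlier in this answer: the Java side of putting device data onto a queue is small. A minimal sketch assuming ActiveMQ's client library; the broker URL and queue name are invented, and the C++ side would use ActiveMQ-CPP against the same broker:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class DeviceEventSender {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection conn = factory.createConnection();
        try {
            conn.start();
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("device.events");
            MessageProducer producer = session.createProducer(queue);
            // One message per device event; consumers scale independently.
            producer.send(session.createTextMessage("device 42: state change"));
        } finally {
            conn.close();
        }
    }
}
```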