I'm trying to build a task-manager application. Two or more client applications should be able to modify tasks stored in a database over the web (for example, mark a task as done or change its title).
While creating the requirements, the following question came up:
How is it possible to inform client applications (Android apps, Java applications on a Mac) about changes in the database without constantly checking the database? I planned on storing the data objects in an SQL database on a web server.
Should I use another database? What is the current standard approach in the software engineering world? Any keywords or explanations would help!
Firstly, make sure you are not accessing the database directly from the client applications; that's a very dangerous route.
Secondly, your requirement sounds like server-side push notifications.
As far as I know, there are three ways to do this.
Poll the server for updates every X seconds (if you don't have too many clients that need to be notified, this is fine).
Use HTTP long polling.
Use WebSockets to keep a long-lived connection between the server and client applications (a minimal sketch follows below).
For mobile devices, though, if you want to be notified even when the app is closed, take a look at the push notification frameworks available out there (e.g. FCM/GCM).
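For the WebSocket option, a minimal sketch using the standard Java WebSocket API (JSR 356) might look like the following. The endpoint path and the idea of broadcasting a JSON payload to every connected client are assumptions on my part, not part of your design:

```java
import javax.websocket.OnClose;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical endpoint: clients connect here and receive task-change events.
@ServerEndpoint("/tasks/updates")
public class TaskUpdateEndpoint {

    // All currently connected client sessions.
    private static final Set<Session> SESSIONS = new CopyOnWriteArraySet<>();

    @OnOpen
    public void onOpen(Session session) {
        SESSIONS.add(session);
    }

    @OnClose
    public void onClose(Session session) {
        SESSIONS.remove(session);
    }

    /** Call this from the code that writes to the database, after a task changes. */
    public static void broadcastTaskChanged(String taskJson) {
        for (Session session : SESSIONS) {
            if (session.isOpen()) {
                try {
                    session.getBasicRemote().sendText(taskJson);
                } catch (IOException e) {
                    SESSIONS.remove(session); // drop broken connections
                }
            }
        }
    }
}
```

In a real application you would broadcast only to the clients that subscribe to the affected task list, but the push mechanism itself is this simple.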
I'm creating an online real-time multiplayer mobile game using Kryonet (a Java TCP/UDP networking library) that I'm planning to host on AWS.
The architecture is as follows: clients connect to a central login/account server that allows them to login and view their stats etc. This bit is easy, as it'll basically just be a REST API, and can be scaled in a pretty standard way (like you would any webapp).
However, the more interesting bit is when players actually play a match. For this, I plan to have a separate pool of "match" servers (EC2s). The login/account server will pair two players, then send the client the address of a particular match server. The players will then join that match server, which will host their match (perhaps lasting 5-10 minutes). The match server needs to be sticky as it will be running a real-time instance of the game, and will be sending/receiving UDP packets in real time. Each match server will probably be able to host a few hundred matches.
My question is about how I should go about scaling these match servers. I suppose I will have them auto-register with the central server at start-up, and send some type of keep alive. I could build this all myself; however, I'm wondering if AWS has tools/services that can do this all for me.
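If I build it myself, the keep-alive part could be as simple as a scheduled heartbeat; here is a rough sketch of what I have in mind (the /register and /heartbeat endpoints are placeholders I made up):

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Runs on each match server: registers with the central server, then heartbeats.
public class MatchServerHeartbeat {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final String centralServer;   // e.g. "http://central.example.com"
    private final String serverId;        // this match server's identity/address

    public MatchServerHeartbeat(String centralServer, String serverId) {
        this.centralServer = centralServer;
        this.serverId = serverId;
    }

    public void start() {
        post("/register");                                        // announce at start-up
        scheduler.scheduleAtFixedRate(() -> post("/heartbeat"),   // then keep alive
                10, 10, TimeUnit.SECONDS);
    }

    private void post(String path) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(centralServer + path + "?id=" + serverId).openConnection();
            conn.setRequestMethod("POST");
            conn.getResponseCode();   // fire and forget; the central server marks us alive
            conn.disconnect();
        } catch (Exception e) {
            // central server unreachable; it will mark us dead when heartbeats stop
        }
    }
}
```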
Okay, I've done a little more reading of the AWS documentation. It seems that I can probably achieve this as follows:
Each time two players are paired up, they are added to a queue. They are taken off the queue when a spot is free on one of the match-playing servers. When the size of this queue exceeds some threshold, the number of EC2 instances is scaled. This can be done almost entirely in AWS configuration: http://docs.aws.amazon.com/autoscaling/latest/userguide/as-using-sqs-queue.html
The tricky bit is then scaling down instances. Unlike a normal REST API, you can't just turn a server off. The server needs to finish all its current games. It seems that AWS has this covered too with lifecycle hooks.
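To make the queue step concrete, the pairing code on the login/account server would just push a message to SQS, something like the sketch below using the AWS SDK for Java v2 (the queue URL and the message format are placeholders I've made up). The queue-depth metric on that queue then drives the scaling policy described in the linked doc:

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

// Hypothetical: enqueue a paired match so the SQS backlog can drive auto scaling.
public class MatchQueue {

    private final SqsClient sqs = SqsClient.create();
    private final String queueUrl; // URL of the match queue, taken from configuration

    public MatchQueue(String queueUrl) {
        this.queueUrl = queueUrl;
    }

    public void enqueueMatch(String playerA, String playerB) {
        sqs.sendMessage(SendMessageRequest.builder()
                .queueUrl(queueUrl)
                .messageBody(playerA + "," + playerB)   // the match servers poll and parse this
                .build());
    }
}
```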
I have a concept of a game system that includes (preferably) a Java server and multi-platform clients (Web, Android, iOS).
This is a 1vs1 player-vs-player realtime game. The server performs a matchup of two players, so basically the server needs to handle many matches, each containing two players. Both players alter the same data, and each player should be updated in real time with the actions of the other player.
Can you suggest me:
1) A server-side framework/library that would ease the implementation, as I would rather not start learning node.js from scratch. :) Vert.x comes to mind.
2) Should clients hold a replica of the data and alter it locally (meaning the only data transferred are commands; here I see JMS as a good solution), or should only the server alter the data and then send the complete data set every time a change occurs?
3) How should the data be transferred? Considering the multi-platform requirement, the only thing I see as viable is WebSockets.
4) An example/tutorial of a server handling pairing of WebSocket connections? All I have ever found are 1-to-1 connections.
5) Considering scalability, can you explain how could all this work in a distributed environment?
1) I don't think node.js is such a big deal to learn. I would personally prefer a well-known, broadly used framework.
2) If you are considering mobile, the first option probably seems more sound. You should consider sending/pushing deltas during the game, and still provide functionality to retrieve the full state of the game in case the client disconnects and reconnects with the same ID.
3) WebSocket would be the best option. It is push-based, supports TLS, and is well supported. Another option is the WebRTC data channel, which is peer-to-peer most of the time. I say most of the time because if one of the users is behind a restrictive NAT router or firewall, a direct connection won't be possible and you will need a TURN (relay) server. Either way, it is less widely supported than WS.
4) You should not "pair websockets". The WS connections just feed commands into your logic, and your logic broadcasts events to whoever it wants. Despite it being a 1vs1 game, you probably want to inspect the flow of events for later debugging or analysis. So consider WS as a transport, not as an entity (see the sketch after this answer).
5) Very, very, very broad question. But assuming that you are going to use WS, and that your application will be so successful that you will need multiple servers: assume that it is impossible to guarantee that two users will connect to the same server, so you should consider a message bus that allows users on one server to play against users on another server. An EDA (Event Driven Architecture) would make sense.
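For point 4, here is a minimal sketch of the "transport, not entity" idea on a single server: a shared registry maps each match ID to its sessions and relays every command to the opponent. The endpoint path, the join protocol, and the data structures are illustrative assumptions, not a prescription:

```java
import javax.websocket.OnClose;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.PathParam;
import javax.websocket.server.ServerEndpoint;
import java.io.IOException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical endpoint: each player connects to /match/{matchId}; commands are
// relayed to the other player in the same match (and could also be logged).
@ServerEndpoint("/match/{matchId}")
public class MatchEndpoint {

    // matchId -> the (at most two) sessions playing that match
    private static final Map<String, Set<Session>> MATCHES = new ConcurrentHashMap<>();

    @OnOpen
    public void onOpen(@PathParam("matchId") String matchId, Session session) {
        MATCHES.computeIfAbsent(matchId, id -> new CopyOnWriteArraySet<>()).add(session);
    }

    @OnMessage
    public void onCommand(@PathParam("matchId") String matchId, String command, Session sender) {
        // Here the game logic would validate the command and record the event.
        for (Session s : MATCHES.getOrDefault(matchId, Set.of())) {
            if (s.isOpen() && !s.equals(sender)) {
                try {
                    s.getBasicRemote().sendText(command); // push the event to the opponent
                } catch (IOException e) {
                    // opponent's connection is broken; the game logic should handle the disconnect
                }
            }
        }
    }

    @OnClose
    public void onClose(@PathParam("matchId") String matchId, Session session) {
        Set<Session> players = MATCHES.get(matchId);
        if (players != null) {
            players.remove(session);
        }
    }
}
```

In the multi-server case from point 5, the loop over the local sessions would be replaced by publishing the event to the message bus, and each server would deliver it to whichever of the two players is connected to it.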
I have an app which will generate 5-10 new database records on one host each second.
The records don't need any checks. They just have to be recorded in a remote database.
I'm using Java for the client app.
The database is behind a server.
Sending the data can't make the app wait, so sending each single record to the remote server synchronously is probably not a good option.
Sending data must not fail. My app doesn't need an answer from the server, but it has to be 100% certain that the data arrives at the server correctly (which should be guaranteed by using, for example, an HTTP URL connection over TCP...?).
I thought about a few approaches for this:
Run the send-data code in a separate thread.
Store the data only in memory and send it to the database after a certain count is reached.
Store the data in a local database and have it sent to / pulled by the server on request.
All of this makes sense, but I'm a noob on this, and maybe there's some standard approach which I'm missing that makes things easier. I'm not sure which way to go.
Your requirements aren't very clear. My best answer is to go through your question, and try to point you in the right direction on a point-by-point basis.
"The records don't need any checks," and "My app doesn't need an answer, but it has to be 100% secure that it arrives at the server correctly."
How exactly are you planning on the client knowing that the data was received without sending a response? You should always plan to write exception handling into your app, and deal with a situation where the client's connection, or the data it sends, is dropped for some reason. These two statements you've made seem to be in conflict with one another: you don't need a response, but you need to know that the data arrives? Is your app going to use a crystal ball to divine confirmation of the data being received? (If so, please send me such a crystal ball - I'd like to use it to short the stock market.)
"Run the send data code in a separate thread," and "store the data in memory and send later," and "store the data locally and have it pulled by the server", and "sending data can't make my app wait".
Ok, so it sounds like you want non-blocking I/O. But the reality is, even with non-blocking I/O it still takes some amount of time to actually send the data. My question is, why are you asking for non-blocking and/or fast I/O? If data transfers were simply extremely fast, would it really matter if it wasn't also non-blocking? This is a design decision on your part, but it's not clear from your question why you need this, so I'm just throwing it out there.
As far as putting the data in memory and sending it later, that's not really non-blocking, or multi-tasking; that's just putting off the work until some future time. I consider that software procrastination. This method doesn't reduce the amount of time or work your app needs to do in order to process that data, it just puts it off to some future date. This doesn't gain you anything unless there's some benefit to "batching" data sending into large chunks.
The in-memory idea also sounds like a temporary buffer. Many of the I/O stream implementations are going to have a buffer built in, as well as the buffer on your network card, as well as the buffer on your router, etc., etc. Adding another buffer in your code doesn't seem to make any sense on the surface, unless you can justify why you think this will help. That is, what actual, experienced problem are you trying to solve by introducing a buffer? Also, depending on how you're sending this data (i.e. which network I/O classes you choose) you might get non-blocking I/O included as part of the class implementation.
Next, as for sending the data on a separate thread: that's fine if you need non-blocking I/O, but (1) you need to justify why that's a good idea in terms of the design of your software before you go down that route. It adds complication to your app, so unless it solves a specific, real problem (e.g. you have a UI in your app that shouldn't freeze or become unresponsive due to pending I/O operations), it's just added complication, and you won't get any added performance out of it. (2) There's a common temptation to use threads to, again, basically procrastinate work. Putting the work off onto another thread doesn't reduce the total amount of work to be done, or the total amount of I/O your app will consume in order to accomplish its function - it just puts it off onto another thread. There are times when this is highly beneficial, and maybe it's the right decision for your app, but from your description I see a lot of requested features without the justification (or an explanation of the problem you're trying to solve) to back up these feature/design choices - and that is what should ultimately drive the direction you choose.
Finally, as far as having the server "pull" it instead of it being pushed to the server, well, all you're doing here is flipping the roles, and making the server act as a client, and the client the server. Realize that "client" and "server" are relative terms, and the server is the thing that's providing the service. Simply flipping the roles around doesn't really change anything - it just flips the client/server roles from one part of the software to the other. The labels themselves are just that - labels - a convenient way to know which piece is providing the service, and which piece is consuming the service (the client).
"I have an app which will generate 5 - 10 new database records in one host each second."
This shouldn't be a problem. Any decent DB server will treat this sort of work as extremely low load. The bigger concern in terms of speed/responsiveness from the server will be things like network latency (assuming you're transferring this data over a network) and other factors regarding your I/O choices that will affect whether or not you can write 5-10 records per second - that is, your overall throughput.
The canonical, if unfortunately enterprisey, answer to this is to use a durable message queue. Your app would send messages to the queue, and a backend app would receive them and store them in a database. Once the queue has accepted a message, it guarantees that it will be made available to the receiver, even if the sender, the receiver, or the queue broker itself crashes.
On my machine, using HornetQ, it takes ~1 ms to construct and send a short text message to a durable queue. That's quick enough that you can do it as part of handling a web request without adding any noticeable additional delay. Any good message queue will support your 10 messages per second throughput. HornetQ has been benchmarked as handling 8.2 million messages per second.
I should add that message queues are not that hard to set up and use. I downloaded HornetQ, and had it up and running in a few minutes. The code needed to create a queue (using the native HornetQ API) and send and receive messages (using the JMS API) is less than a hundred lines.
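To give a rough idea of the sending side, the JMS code looks more or less like this. How you obtain the ConnectionFactory and Queue is broker-specific (a JNDI lookup, the HornetQ API, etc.), so treat those as placeholders:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

// Sketch: send one durable message per record; a backend consumer writes them to the DB.
public class RecordSender {

    private final Connection connection;
    private final Session session;
    private final MessageProducer producer;

    public RecordSender(ConnectionFactory factory, Queue recordQueue) throws Exception {
        // ConnectionFactory and Queue come from the broker (e.g. a JNDI lookup).
        connection = factory.createConnection();
        session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        producer = session.createProducer(recordQueue);
        producer.setDeliveryMode(DeliveryMode.PERSISTENT); // survive broker restarts
        connection.start();
    }

    public void send(String recordAsText) throws Exception {
        producer.send(session.createTextMessage(recordAsText));
    }
}
```

The consumer on the server side is symmetrical: create a MessageConsumer on the same queue, read each message, and insert it into the database.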
If you queue the data and send it in a thread, it should be fine if your rate is 5-10 per second and there's only one client. If you have multiple clients, to the point where your database inserts begin to get slow, you could have a problem, given your requirement that "sending data must not fail" - which is a much more difficult requirement, especially in the face of machine or network failure.
Consider the following scenario. You have more clients than your database can handle efficiently, and one of your users is a fast typist. Inserts begin to back up in memory in their app. They finish their work and shut the app down before the last ones are actually uploaded to the database. Or the machine crashes before the data is sent - or while it's sending; or worse yet, the database crashes while the data is being sent, and due to network issues the client can't really tell that its transaction has not completed.
The easy way to avoid these problems (most of them, anyway) is to make the user wait until the data is committed somewhere before allowing them to continue. If you can make the database inserts fast enough, then you can stick with a simpler scheme. If not, then you have to be more creative.
For example, you could locally write the data to disk when the user hits submit, and then upload it from another thread. This scenario needs to be smart enough to mark something that is persisted as sent (deleting it would work); and have the ability to re-scan at startup and look for unsent work to send. It also needs the ability to keep trying in the case of network or centralized server failure.
There also needs to be a way for the server side to detect duplicates, because the client machine could send the data and crash before it can mark it as sent, and then send it again upon restart. The same situation could occur if there is a bad network connection: the client could send the data, never receive confirmation from the server, time out, and then end up retrying it.
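A rough sketch of that "persist locally, upload in the background, dedup on the server" idea follows. The file layout, the uploader interface, and the idempotency-key scheme are all illustrative assumptions, not a finished design:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Each record is journalled to disk with a UUID before we consider it "accepted".
// A background loop uploads journalled records; the server dedups by UUID, so a
// crash between upload and delete only causes a harmless resend.
public class LocalJournal {

    private final Path journalDir;

    public LocalJournal(Path journalDir) throws IOException {
        this.journalDir = Files.createDirectories(journalDir);
    }

    /** Called on the user/app thread: cheap local write, no network involved. */
    public void record(String payload) throws IOException {
        String id = UUID.randomUUID().toString();           // idempotency key
        Files.write(journalDir.resolve(id + ".rec"),
                payload.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW);
    }

    /** Called from a background thread, including once at startup to catch leftovers. */
    public void uploadPending(RecordUploader uploader) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(journalDir, "*.rec")) {
            for (Path file : files) {
                String id = file.getFileName().toString().replace(".rec", "");
                String payload = Files.readString(file);
                if (uploader.upload(id, payload)) {   // server stores id + payload, ignores known ids
                    Files.delete(file);               // mark as sent
                }
                // if the upload fails, leave the file in place and retry on the next pass
            }
        }
    }

    /** Abstraction over the actual network call (HTTP, JMS, ...). */
    public interface RecordUploader {
        boolean upload(String idempotencyKey, String payload);
    }
}
```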
If you don't want the client app to block, then yes, you need to send the data from a different thread.
Once you've done that, the only thing that matters is whether you're able to send records to the database at least as fast as you're generating them. I'd start off by getting it working sending them one by one; then, if that isn't sufficient, put them into an in-memory queue and update in batches. It's hard to say more, since you don't give us any idea what determines the rate at which records are generated.
You don't say how you're writing to the database... JDBC? An ORM like Hibernate? But the principles are the same.
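If you go the thread-plus-batching route with plain JDBC, the sender side might look roughly like the sketch below; the table and column names are made up:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// App threads call enqueue() and never block on the network; this thread drains
// the queue and writes records to the database in batches.
public class BatchingDbWriter implements Runnable {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Connection connection;   // a JDBC connection to the remote DB

    public BatchingDbWriter(Connection connection) {
        this.connection = connection;
    }

    public void enqueue(String record) {
        queue.add(record);
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>();
        while (!Thread.currentThread().isInterrupted()) {
            try {
                batch.clear();
                batch.add(queue.take());          // wait for at least one record
                queue.drainTo(batch, 99);         // grab up to 99 more if available
                insertBatch(batch);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (SQLException e) {
                // real code needs a retry/persistence strategy here, as the other answers point out
            }
        }
    }

    private void insertBatch(List<String> batch) throws SQLException {
        try (PreparedStatement ps =
                     connection.prepareStatement("INSERT INTO records(payload) VALUES (?)")) {
            for (String record : batch) {
                ps.setString(1, record);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

Note that anything sitting in the in-memory queue is lost if the app crashes, which is exactly the trade-off discussed above.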
Scenario: the user logs in to the client software, which forms a persistent bidirectional connection with the server-side entity (server) that processes user-specified tasks. When the server-side entity, while processing the user's task, encounters an error or requires further user input, it notifies the client software and waits until the client decides what to do. The client software takes the new user-specified inputs and sends them to the server side. The server side continues where it last stopped with the new inputs. This feedback cycle continues until processing is finished. The progressively updated user inputs are all stored on the server side and are accessible and modifiable from the client software, so if a client deletes a specific input, that change is immediately reflected on the server side. On the server side, an extra interface is probably required to route different users' clients to available hardware nodes (cloud) to support concurrent multi-user tasks running on the server side.
On the client side, I suspect using sockets to connect to the server...
Now for the server, I am a little lost, because there seem to be many different Java server frameworks, like Jetty and Netty. I am also practicing caution in order not to reinvent any wheels here.
Is building a custom server the right approach, or should I build a web service that completes a specific task on demand?
I am also not looking for a one-size-fits-all solution (wishful thinking, probably), but I am open to any insights on my current situation.
Netty will provide a lot of what it sounds like you need for this, without making you reinvent a socket server. That said, I would make certain that you actually need bidirectional, real-time communication between the client and server. If you can rework the problem such that the client-server communications do not need to be real-time, then things like RESTful webservices become a possibility, and (in my experience) are much less complicated and error prone.
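If you do decide you need the persistent, bidirectional connection, a minimal Netty 4 server bootstrap looks roughly like this. The newline-delimited text protocol is just an assumption for illustration; a real design would define its own codec and handlers:

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.LineBasedFrameDecoder;
import io.netty.handler.codec.string.StringDecoder;
import io.netty.handler.codec.string.StringEncoder;

// Minimal persistent-connection server: each client keeps a socket open, and the
// server can write back on the same channel whenever the task needs more input.
public class TaskServer {

    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);
        EventLoopGroup workers = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(
                                    new LineBasedFrameDecoder(8192),
                                    new StringDecoder(),
                                    new StringEncoder(),
                                    new SimpleChannelInboundHandler<String>() {
                                        @Override
                                        protected void channelRead0(ChannelHandlerContext ctx, String msg) {
                                            // hand msg to the task-processing logic; when it needs
                                            // more input, write a prompt back on the same channel
                                            ctx.writeAndFlush("NEED_INPUT: " + msg + "\n");
                                        }
                                    });
                        }
                    });
            bootstrap.bind(9000).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```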