I have this concept of rewriting a game engine as a scalable collection of microservices.
It's currently a proof of concept, but the main principle is that each player's session/connection is held and managed by a single container, so containers scale up and down based on the number of connected users.
Each player container will speak to multiple other microservices to gather data and perform actions; these services will run as static sets of 2 or 3 replicas.
There is one microservice I have in mind which I feel is a bit of a bottleneck, and I'm currently looking for ways to make it more 'scalable' and 'robust'.
The microservice in question is the GameMap service. There will be multiple GameMap services (at least one service for each unique or instanced game map). Each map will contain N cells, and each cell can contain objects with different types/states (e.g. other playerObjects, ItemObjects).
I would like at least 2 replicas for each GameMap so I can instantly fail over if one were to shut down for some reason. It is important that users experience a seamless transition between the failing GameMap and its failover. To achieve that, I need consistent, up-to-date state shared between them.
Being able to load-balance traffic between the two replicas is a nice-to-have, but not essential.
So far, the one potential solution I have come up with is Hazelcast. This would allow me to keep the state of each map cell in a scalable in-memory data grid (again, for robustness and scalability).
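To illustrate, here's a rough sketch of what I imagine the cell state looking like in a Hazelcast IMap (Hazelcast 5-style API; the map name, key scheme, and string-valued state are placeholder assumptions on my part):

    import com.hazelcast.config.Config;
    import com.hazelcast.config.MapConfig;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    public class GameMapStore {
        public static void main(String[] args) {
            Config config = new Config();
            // One synchronous backup per partition: a second member already
            // holds each cell's state if the primary member goes down.
            config.addMapConfig(new MapConfig("gamemap-*").setBackupCount(1));

            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
            IMap<String, String> cells = hz.getMap("gamemap-castle-1");

            // One entry per cell; a real CellState type would replace the
            // serialized-string stand-in used here.
            cells.set("cell:12:7", "{\"objects\":[\"player:42\"]}");
            System.out.println("cell 12,7 -> " + cells.get("cell:12:7"));
        }
    }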
I expect there may be up to hundreds of state changes within a game map every second, and my concern is that this may be too slow and cause high latency between users.
Has anyone got any hints, suggestions or feedback on the scenario as a whole, or more importantly on the use of Hazelcast here?
P.S. I can upload my very crude connectivity/architecture diagram of my game engine as microservices at some point, if it helps or if anyone is interested.
It really depends on your requirements, environment etc.
Especially if you want to be HA, you probably want to replicate to different availability zones or potentially different regions, and there you will be bound by the speed of light (or need to accept a chance of data loss). In other words: the performance is mostly determined by the infrastructure.
But just to give you some ballpark numbers: for a simple read on c5.9xlarge instances on EC2, between machines in the same low-latency placement group, you are looking at 100-200 us. And running hundreds of thousands of gets per second per instance is normally not an issue.
In other words: it is very difficult to say whether this is the right approach. Depending on your situation and how important this is, I would take a single slice of your whole system and run some benchmarks to get an impression of how well it performs and how well it scales.
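For example, a crude probe of a single slice could look like the following (not a rigorous benchmark -- use something like JMH for real numbers, and run it against a multi-node cluster over the real network; all names here are placeholders):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    public class SliceProbe {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<String, String> cells = hz.getMap("gamemap-probe");
            cells.set("cell:0:0", "state");

            int iterations = 100_000;
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                cells.get("cell:0:0");
            }
            long avgNanos = (System.nanoTime() - start) / iterations;
            // Single-member numbers flatter you: every get is local.
            System.out.println("avg get: " + avgNanos / 1_000 + " us");

            hz.shutdown();
        }
    }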
But my alarm bells go off when I see the combination of 'microservices' with 'real time' and 'game engine'.
I'm currently struggling to wrap my head around some concurrency concepts in general. Suppose we have a REST API with several endpoints for updating and creating entities in our database. Let's assume we receive 100 concurrent requests for a certain update. How do we guarantee that data consistency is retained? Working with Java, I guess some options would be:
Using lock mechanisms
Using synchronization on methods in our service layer
However, surely this would have a huge impact on the scalability of our application? But I currently can't see any other way of ensuring that we don't encounter race conditions when interacting with our database. (Also, I don't think there is much point in adding synchronization to every method we write in our service.)
So, I guess the question is: how can we reliably secure our application from race conditions with concurrent requests, while at the same time retaining scalability?
I realize this is a very open-ended and conceptual question, but please if you can point me in the right direction of what area / topic I should dive into for learning, I would be grateful.
Thanks!
You have a good understanding of the problem.
You have to decide between eventual consistency and strong consistency. Strong consistency will limit scaling to a certain extent, but you also really need to sit down and be realistic/honest about your scaling needs (or your consistency needs).
It's also possible to limit the scope of consistency: for example, individual rows in a database could be strongly consistent, or you could be consistent only within a geographic region or continent. Different queries can also have different requirements.
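As a concrete example of row-level consistency without JVM-wide locks, JPA's optimistic locking lets all 100 requests proceed and only rejects the ones that collide on the same row (a sketch; the Account entity is invented, and it's jakarta.persistence here, javax.persistence on older stacks):

    import jakarta.persistence.Entity;
    import jakarta.persistence.Id;
    import jakarta.persistence.Version;

    @Entity
    public class Account {
        @Id
        private Long id;

        private long balance;

        // The JPA provider bumps this on every update; an update based on
        // a stale read fails with an OptimisticLockException instead of
        // silently overwriting, and the caller can retry it.
        @Version
        private long version;
    }

Contention is then paid per row in the database rather than per method in the JVM, so unrelated requests stay fully parallel.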
Creating efficient and strongly consistent databases is a whole field of research, and all the big tech giants have people working on it; there are too many solutions/technologies to list. Just googling something like "strong consistency scaling" will get you a ton of results to read.
We are running a setup locally where we start two instances of an Axon application. The following properties are set in application.yml:
axon:
  eventhandling:
    processors:
      SomeProcessorName:
        initialSegmentCount: 2
        threadCount: 1
        mode: TRACKING
So both nodes have a single thread and they should each process a segment. They both connect to AxonServer. How do the two instances coordinate segment claims?
If I start both of these applications using an in-memory database, I can see in AxonServer that they both attempt to claim segment 0 and that segment 1 is claimed by neither (we get a duplicated-claim/unclaimed-segment warning). If they connect to the same database, this does not happen: instance 1 claims segment 0 and instance 2 claims segment 1.
Am I then correct in assuming that identical processors have to share a database in order for this to work properly? I can't find this information immediately in the reference docs.
Does this then also mean that if I would hypothetically want to replicate a projection model for performance reasons (e.g: database server in the US and another one in the EU), this would not work properly?
To clarify: I would want both databases to build an identical query model that could both be queried separately. As it is right now (assuming that we could run two nodes on two databases), node 1 would only process events for segment 0, node 2 would only process events for segment 1. If I understand this correctly, this means that both databases only contain half of the information of the query model.
So in order to pull this off, I would have to create another near-identical codebase, with the only difference being the processor name?
I think I can give some guidance in this area.
Axon Server does not provide coordination between Tracking Tokens of TrackingEventProcessor at this point in time.
Thus, coordination of this part is purely in your application environment, or differently put, with the Axon Server client.
The most pragmatic approach would be to share the underlying storage solution for your TokenStore between both applications, so your assumption on this part is correct.
Current implementations of the TokenStore are indeed database-based - nothing stops you from coming up with a distributed solution, though, as this is all open source and freely adjustable.
I do not completely follow your hypothetical suggestion that:
Does this then also mean that if I would hypothetically want to replicate a projection model for performance reasons (e.g: database server in the US and another one in the EU), this would not work properly?
Well, this can work properly, but I think segmenting a given TrackingEventProcessor's TrackingToken is not the way to go here.
This solution is intended to share the work load of updating a single Query Model.
The 'work load' in this scenario is the Event Stream by the way.
If you're looking to replicate a given Query Model by means of reading the Event Stream, I'd indeed suggest having a second TrackingEventProcessor with an identical Event Handling Component underneath.
Note that this should not require you to 'replicate the code base'.
You should merely need to register two Event Handling Components to two distinct TrackingEventProcessors.
If you are using Spring Boot for configuration, all this is typically abstracted away from you. But if you take a look at the EventProcessingConfigurer, you should find a fair API describing how to achieve this. If things aren't clear in that area, I'd suggest opening a separate issue, as the topic somewhat diverges from the original question.
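For what it's worth, a rough sketch against Axon 4's EventProcessingConfigurer (the processor names and projection classes are invented, and the exact fluent calls may differ between versions):

    import org.axonframework.config.Configurer;
    import org.axonframework.config.DefaultConfigurer;

    class TwoProjectionConfig {
        // Two near-identical handler components, one per replica/region.
        static class UsProjection { /* @EventHandler methods */ }
        static class EuProjection { /* @EventHandler methods */ }

        static Configurer configure() {
            Configurer configurer = DefaultConfigurer.defaultConfiguration();
            configurer.eventProcessing(ep -> ep
                    // Two processors, each with its own TrackingToken and each
                    // reading the full stream, so each builds a complete model.
                    .registerTrackingEventProcessor("projection-us")
                    .registerTrackingEventProcessor("projection-eu")
                    // Route one handler instance to each processor.
                    .assignHandlerInstancesMatching("projection-us", h -> h instanceof UsProjection)
                    .assignHandlerInstancesMatching("projection-eu", h -> h instanceof EuProjection)
                    .registerEventHandler(conf -> new UsProjection())
                    .registerEventHandler(conf -> new EuProjection()));
            return configurer;
        }
    }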
Hoping this is sufficient for you to proceed #MatthiasVanEeghem!
Let's say I have an API that gives me the values of stock for the last month. The data is sampled every hour.
Now I want to make a web app that would visualize this data on a line chart. I don't need all the hourly samples, so my question is how should I make this work?
My idea is that there would be a backend app (e.g. in Java Spring) that would GET the data from the API, calculate the average for each day (using a stream, maybe a parallel stream?), put the results in a new collection, and pass it on to the front end to display in a chart.
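For the averaging step, I imagine something like this (the Sample record is a placeholder for whatever the API returns; with roughly 720 hourly samples per month, a parallel stream is probably overkill):

    import java.time.LocalDate;
    import java.time.LocalDateTime;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;
    import java.util.stream.Collectors;

    record Sample(LocalDateTime timestamp, double price) {}

    class DailyAverages {
        // Collapse hourly samples into one average per calendar day.
        static Map<LocalDate, Double> compute(List<Sample> hourly) {
            return hourly.stream().collect(Collectors.groupingBy(
                    s -> s.timestamp().toLocalDate(),
                    TreeMap::new, // keep the days in chronological order
                    Collectors.averagingDouble(Sample::price)));
        }
    }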
Start thinking from the UI: what do you need there, how often do you need it, and how fast?
Then get the data from the backend. If there is too much data at once and the API cannot do otherwise, either:
get data and reduce to what the UI needs (backend), use once and throw away
OR get data and reduce to what the UI needs (backend), keep in cache for a while
OR pre-process the data so that when the UI needs it, it will be ready
For the return format, consider something lightweight, like a simple named JSON structure: {"dayAverages": [0.34, 1253.432, ...], "month": 2, "year": 2018}. Then in the UI, adapt it to the needs of your charting lib (that part is debatable).
Also, observe how users use the UI; you may get ideas on how to optimize the experience (preload the next month, ...).
If you do this for learning purposes, consider doing it async + lambdas = bonus :)
As to your question "...how should I make this work?" --
This is extremely broad. There are many, many ways to do this. Some of these ways depend heavily on your architecture, how much traffic is expected to your app, what request-load the API can handle, etc. Here are a few general things to consider:
Any sort of MVC architecture (or similar) would be a good fit for your Web app.
You mention needing a "backend app" of some type. Not sure what you mean here, but the averaging features can be built directly into your Web app framework without needing a separate back-end app.
If you're going to calculate averages for display in the Web app, you will need to maintain state somewhere. Assuming the API doesn't give this to you, you'll need a database of some type, or at least some kind of in-memory caching engine, to facilitate this. How you do this will depend on your architecture and the traffic/load on your app (e.g. will you have multiple, load-balanced servers?).
Hope that helps. We could give more if you ask some specific questions.
I'm building a simulation in Java. So, I'll break my simulation into two parts:
1) The simulation engine
2) The simulation model
Basically, I want a little help (tips/advice) on how to split it up, i.e. what goes where.
So I'm thinking that the engine will keep track of time. It will listen for events, and when events arrive it will update the state of the simulation (I'm building a discrete event simulation). The simulation model will have the GUI and will get its logic and data from the actual engine. I'm thinking that the model will provide the actual events as input to the engine. I've been thinking about a car analogy, where the engine is the body of the car and the model is the driver. So I want it to behave like the driver (model) telling the car (engine) what to do, i.e. when to turn, when to brake, what speed to go at, etc.
Do you think I'm tackling this in the right way? I can sense that I sound a little confusing and not very clear, so I'll just clarify: what I'm looking for is some input on how I should split this up and what the responsibilities of the engine and the model should actually be.
Also, I was wondering, if I were to implement the MVC design pattern, how would that fit in with the way I'm trying to break it up?
EDIT:
By model I mean that I want the simulation to have a set of specific rules which the engine then follows. As I'm building a road traffic simulator, the rules could be things like the distribution of cars, driver profiles, and what cars may and may not do (e.g. stop for a red light). So the model is like the "brain" of the simulation, if you get what I mean, and the engine is the actual simulation of the set of "rules" specified by the model. I hope this makes more sense.
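For concreteness, this is the kind of loop I picture the engine running (a minimal sketch; the Event shape is just a placeholder for whatever the model supplies):

    import java.util.Comparator;
    import java.util.PriorityQueue;

    class Engine {
        // An event knows when it fires; the action is whatever the model
        // wants done at that simulated moment.
        record Event(double time, Runnable action) {}

        private final PriorityQueue<Event> queue =
                new PriorityQueue<>(Comparator.comparingDouble(Event::time));
        private double clock;

        void schedule(Event e) { queue.add(e); }

        // The engine owns time: pop the earliest event, jump the clock to
        // it, and run the action the model attached.
        void run(double until) {
            while (!queue.isEmpty() && queue.peek().time() <= until) {
                Event e = queue.poll();
                clock = e.time();
                e.action().run();
            }
        }
    }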
Maybe not very applicable, but in the MVC approach (Model-View-Controller), which is widespread and well accepted, the controller seems to correspond to what you call the engine. And the model is just that -- a bunch of simple, dumb Java objects with as little logic as possible, containing only the attributes of the real-world objects they represent.
So, employing this analogy with MVC, your model would be the set of roads and cars, containing just the coordinates of objects, and the engine would move cars, detect collisions, etc.
After a round of moves is finished, you'll get an updated version of the model (some cars in new positions with new velocities, some buildings burning (heh), etc.). And you'll hand this updated model to your view (whatever it may be) for rendering.
The only thing I'm unsure about here is which part of the system is going to provide input events. In usual MVC this is some external entity (usually a human operator). If by events you mean human input, it will be the same for your application. If you mean events like collisions caused by, say, cars' movements -- then it's the engine itself that will produce such events as the result of calculations on each step of the simulation.
Admittedly, this is not very classic OO design. In classic OO design, model classes such as cars would have their own internal logic, which might define that, say, a car suddenly changes its velocity out of the blue. I wouldn't go this route, because it distributes the logic of your code between model classes and controller classes. You have a set of model objects at the start of the world, and the only way forward is to influence them either with engine decisions or with real external input (like GUI input from a human). If you need a model object to change its behavior, that should be the responsibility of engine code, not model code.
Sorry for this rather incoherent speculation; this is a rather wide topic and there are whole books about such things.
You haven't given us enough information to REALLY help sketch out your simulation, but here's a good tip: anything that you can identify as a thing should be an object. So make a class Car. And a class TrafficLight. Then make a class Driver; each Car has a Driver field. And a Road would have a List<Car>.
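In code, that tip comes out roughly as follows (field choices are purely illustrative):

    import java.util.ArrayList;
    import java.util.List;

    class Driver {
        double reactionTimeSeconds; // part of the driver profile
    }

    class Car {
        Driver driver;   // each Car has a Driver
        double position; // metres along its current road
        double speed;    // metres per second
    }

    class TrafficLight {
        enum State { RED, AMBER, GREEN }
        State state = State.RED;
    }

    class Road {
        final List<Car> cars = new ArrayList<>();
        TrafficLight light;
    }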
Before you start thinking about how to implement an MVC framework, make sure you understand what it is. The most important thing about MVC is that it's about how the user interacts with a universe. So you'd want MVC if, for example, you were writing a game called SimTraffic, because not only do you need a traffic simulation, but the user needs to control it somehow too. If you were just watching a simulation occur (with no interaction), don't worry about MVC.
Forget about the GUI. Please start from the physics - there are scores of traffic simulations; I assume you have read at least one book on the subject, and if not, it is high time to do so. A starting point could be the Springer-published collection of essays on various modern models, Fundamentals of Traffic Simulation (ISBN 1441961410), Jaume Barcelo (ed.) (2010).
EDIT: I would advise first deciding on the scope of your sim: what are the constant assumptions? For what time periods will it be tuned? Will the road network change? Do you allow for car crashes, DUI idiots, onlookers filming the crash site for YouTube?
What accuracy do you need from the sim - do you want it to be used for city planning, environmental control or traffic management? What are the variables and parameters that you set? Have you got statistical data to validate your simulation and test predictions against? Do you have ready data on physical characteristics of cars/drivers in your modelled universe/city (acceleration, linear size, propensity to break traffic rules)? There are a bunch of questions that should be answered before you sit down to code...
EDIT #2: From your comment on #Victor Sorokin's answer, I gather you have a nice idea of adding drivers' expectations to the model - that would make the drivers' AI the first thing to code: yes, shortest path, but the solution to the shortest-path problem comes from stale data (with possibly variable delay). If you give drivers perfect foresight, there won't be any crashes; if you make them imperfect, you will have to model sensory input, perhaps boiled down to direction-specific probabilities of detecting an incoming car. That makes for a huge expenditure of CPU cycles, for sure.
I have a lot of existing data in my database already, and want to develop a points mechanism that computes a score for each user based on what actions they do.
I am implementing this functionality in a pluggable way, so that it is independent of the main logic, and relies on Spring events being sent around, once an entity gets modified.
The problem is what to do with the existing data. I do not want to start collecting points from now, but rather include all the data until now.
What is the most practical way to do this? Should I design my plugins to provide an index() method, which would force my system to fetch every single entity from the database, send an EntityDirtyEvent for each one to fire the points plugins, and then update it so the points get saved next to each entity? That could result in a lot of overhead, right?
The simplest thing would be to create a complex stored procedure and have index() call it. That, however, also seems like a bad idea to me. Since I will have to write the logic for computing the points in Java anyway, why have it once again in SQL? Also, in general I am not a fan of splitting business logic across different layers.
Has anyone done this before? Please help.
First let's distinguish between the implementation strategy and business rules.
Since you already have the data, consider obtaining results directly from the data. This forms the data domain model. Design the data model to store all your data. Then, create a set of queries, views and stored procedures to access and update the data.
Once you have those views, use a data access library such as Spring's JdbcTemplate to fetch this data and represent it as Java objects (lists, maps, persons, point tables, etc.).
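A sketch of that fetch with Spring's JdbcTemplate (the view and column names are made up):

    import java.util.List;
    import org.springframework.jdbc.core.JdbcTemplate;

    record UserPoints(long userId, long points) {}

    class PointsDao {
        private final JdbcTemplate jdbc;

        PointsDao(JdbcTemplate jdbc) {
            this.jdbc = jdbc;
        }

        // Map a view's rows straight into plain Java objects.
        List<UserPoints> loadAll() {
            return jdbc.query(
                    "SELECT user_id, points FROM user_points_view",
                    (rs, rowNum) -> new UserPoints(
                            rs.getLong("user_id"), rs.getLong("points")));
        }
    }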
What you have completed thus far does not change much, irrespective of what happens in the upper layers of the system. This is called Model.
Then, develop a rule base or logic implementation which determines what data is needed under which inputs, user actions, data conditions, or other conditions. In a mathematical sense, this is like a matrix. In a programming sense, it is a set of logic statements: if this and this and this is true, then get this data, else get that data, etc. This encompasses the logic of your system; hence it is called the "Controller".
Do not move this logic into the queries/stored procedure/views.
Then finally, develop a front end or "console" for this. In the simplest case, develop a console input system which takes a .. and displays a set of results. This is your "view" of the system.
You can eventually develop the view into a web application. The above command-line view can remain viable in the form of a RESTful API server.
I think there is one problem to consider here: as I understand it, there is a huge amount of data in the database, so creating only one mechanism to calculate the point system may not be the best approach.
In fact, if you don't want to start collecting points from scratch but include all the existing data, you must process and calculate the information you have now. Yes, the first run can result in overhead, but as you said, you need this data calculated.
On the other hand, you may add another mechanism that listens for changes to an entity and launches a different process capable of calculating the new point difference that applies to that particular modification.
So, you can have one service responsible for calculating the point system for a single entity, and another, which may take longer to finish, capable of calculating the global points. And if the points don't need to be calculated in real time, you can create a scheduled job responsible for launching it.
Finally, I know it's not a good approach to split the business logic across two layers (DB + Java), but sometimes it is a requirement, for example if you need to reply quickly to a request that ultimately touches a lot of records. I've found cases where there is no option other than adding business logic to the database (as stored procedures, etc.) to crunch a lot of data and return the final result to the browser client (e.g. a calculation process that must run at one specific time).
You seem to be heading in the right direction. You know you want your "points" feature decoupled from the main application. Since you are evidently already using Hibernate (implied by the tag!), you can tap into the Hibernate event system (see section 14.2 of the reference documentation). Depending upon the size/complexity of your system, you can plug your points calculations in here (if it is not a large/complex system), or you can publish your own event to be picked up by whatever software is listening.
The point of either design approach is that neither knows nor cares about your point calculations. If you are, as I am guessing, trying to create a fairly general-purpose plugin mechanism, then you publish your own events to that system from this tie-in point. If you have no plugins on a given install/setup, no one processes the events. If you have multiple plugins on another install/setup, each can decide what processing it needs to do based on the event received. In the case of the "points plugin", it would calculate its point value and store it. No stored proc required....
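If you tap Hibernate directly, a sketch could look like this (Hibernate 5-era SPI; the publish call stands in for whatever plugin bus you end up with):

    import org.hibernate.engine.spi.SessionFactoryImplementor;
    import org.hibernate.event.service.spi.EventListenerRegistry;
    import org.hibernate.event.spi.EventType;
    import org.hibernate.event.spi.PostUpdateEvent;
    import org.hibernate.event.spi.PostUpdateEventListener;
    import org.hibernate.persister.entity.EntityPersister;

    class PointsEventTap implements PostUpdateEventListener {

        static void register(SessionFactoryImplementor sessionFactory) {
            sessionFactory.getServiceRegistry()
                    .getService(EventListenerRegistry.class)
                    .appendListeners(EventType.POST_UPDATE, new PointsEventTap());
        }

        @Override
        public void onPostUpdate(PostUpdateEvent event) {
            // Re-publish as your own event; the points plugin reacts,
            // everything else ignores it.
            publishEntityDirty(event.getEntity());
        }

        @Override
        public boolean requiresPostCommitHanding(EntityPersister persister) {
            return false;
        }

        private void publishEntityDirty(Object entity) {
            // hook into your plugin/event bus here
        }
    }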
You're trying to accomplish "bootstrapping." The approach you choose should depend on how complicated the point calculations are. If stored procedures or plain update statements are the simplest solution, do that.
If the calculations are complicated, write a batch job that loads your existing data, probably orders it oldest first, and fires the events corresponding to that data as if they've just happened. The code which deals with an event should be exactly the same code that will deal with a future event, so you won't have to write any additional code other than the batch jobs themselves.
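A sketch of such a batch bootstrapper, firing the exact same Spring event the live path publishes (EntityDirtyEvent is the event from the question; the repository interface is a placeholder for your existing DAO):

    import java.util.stream.Stream;
    import org.springframework.context.ApplicationEventPublisher;

    // Placeholders standing in for your existing DAO and event class.
    interface EntityRepository {
        Stream<Object> streamAllOrderedByCreationDate();
    }

    record EntityDirtyEvent(Object entity) {}

    class PointsBootstrapper {
        private final EntityRepository repository;
        private final ApplicationEventPublisher publisher;

        PointsBootstrapper(EntityRepository repository,
                           ApplicationEventPublisher publisher) {
            this.repository = repository;
            this.publisher = publisher;
        }

        // Replay history oldest-first through the same event the live
        // path publishes, so plugins need no bootstrap special case.
        void replayAll() {
            try (Stream<Object> all = repository.streamAllOrderedByCreationDate()) {
                all.forEach(e -> publisher.publishEvent(new EntityDirtyEvent(e)));
            }
        }
    }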
Since you're only going to run this thing once, go with the simplest solution, even if it is quick and dirty.
There are two different ways.
One you already know: poll the database for changed data. In that case you are hitting the database even when nothing has changed, and that may slow down your process.
The second approach: whenever a change happens in the database, the database fires an event. You can do that using CDC (Change Data Capture). It will minimize the overhead.
You can look for more options in Spring Integration.