I have to write an architecture case study but there are some things that i don't know, so i'd like some pointers on the following :
The website must handle 5k simultaneous users.
The backend is composed by a commercial software, some webservices, some message queues, and a database.
I want to recommend to use Spring for the backend, to deal with the different elements, and to expose some Rest services.
I also want to recommend wicket for the front (not the point here).
What i don't know is : must i install the front and the back on the same tomcat server or two different ? and i am tempted to put two servers for the front, with a load balancer (no need for session replication in this case). But if i have two front servers, must i have two back servers ? i don't want to create some kind of bottleneck.
Based on what i read on this blog a really huge charge is handle by one tomcat only for the first website mentionned. But i cannot find any info on this, so i can't tell if it seems plausible.
If you can enlight me, so i can go on in my case study, that would be really helpful.
Thanks :)
There are probably two main reasons for having multiple servers for each tier; high-availability and performance. If you're not doing this for HA reasons, then the unfortunate answer is 'it depends'.
Having two front end servers doesn't force you to have two backend servers. Is the backend going to be under a sufficiently high load that it will require two servers? It will depend a lot on what it is doing, and would be best revealed by load testing and/or profiling. For a site handling 5000 simultaneous users, though, my guess would be yes...
It totally depends on your application. How heavy are your sessions? (Wicket is known for putting a lot in the session). How heavy are your backend processes.
It might be a better idea to come up with something that can scale. A load-balancer with the possibility to keep adding new servers for scaling.
Measurement is the best thing you can do. Create JMeter scripts and find out where your app breaks. Built a plan from there.
To expand on my comment: think through the typical process by which a client makes a request to your server:
it initiates a connection, which has an overhead for both client and server;
it makes one or more requests via that connection, holding on to resources on the server for the duration of the connection;
it closes the connection, generally releasing application resources, but generally still hogging a port number on your server for some number of seconds after the conncetion is closed.
So in designing your architecture, you need to think about things such as:
how many connections can you actually hold open simultaneously on your server? if you're using Tomcat or other standard server with one thread per connection, you may have issues with having 5,000 simultaneous threads; (a NIO-based architecture, on the other hand, can handle thousands of connections without needing one thread per connection); if you're in a shared environment, you may simply not be able to have that many open connections;
if clients don't hold their connections open for the duration of a "session", what is the right balance between number of requests and/or time per connection, bearing in mind the overhead of making and closing a connection (initialisation of encrypted session if relevant, network overhead in creating the connection, port "hogged" for a while after the connection is closed)
Then more generally, I'd say consider:
in whatever architecture you go for, how easily can you re-architecture/replace specific components if they prove to be bottlenecks?
for each "black box" component/framework that you use, what actual problem does it solve for you, and what are its limitations? (Don't just use Tomcat because your boss's mate's best man told them about it down the pub...)
I would also agree with what other people have said-- at some point you need to not be too theoretical. Design something sensible, then run a test bed to see how it actually copes with your expected volumes of data. (You might not have the whole app built, but you can start making predictions about "we're going to have X clients sending Y requests every Z minutes, and p% of those requests will take n milliseconds and write r rows to the database"...)
Related
We have a costumer that have 3 stores with different databases. In every store has a wildfly running some webservices which communicate between them. Each json request with 10/30 rows spends 1 seconds in average. Every wildfly uses 1,5 gb of RAM. I know that memory is always a problem in Java, but can I be more economic using some microframework like Javalin or microservices rather than a java ee app server? And node.js would be a option for better performance?
Before you start looking into a different architecture, which would probably make for a major rewrite, find out where all that time is going. Set up profiling on the WildFly servers. Start by doing that on one, then have some calls coming in. Check how much time is spent in various parts of the stack. Is one call to the web service handled rather slowly? Then see where that time goes. It might be the database access. Is one such call handled pretty quickly on the server itself once it comes in? Then your best bet is your losing time on the network layer.
Check the network traffic. You can use Wireshark or a similar tracing tool for this. See how much time actually passes between a request coming in and the response going out. Is that slow but the processing on Wildfly itself seems fast enough? Maybe there's some overhead going on (like security). Is the time between request and response very fast? You're definitely looking at the network as the culprit.
Eventually you may need to have profiling and network tracing active on all three servers simultaneously to see what's going on, or for each combination of two servers. It may turn out only one of them is the bottleneck. And if you have servers A, B and C, from the sound of it your setup might cause a call from A to B to also require a call from B to C before some result can be returned to A. If that is the case, it's little wonder you may see some serious latency.
But measure and find the root of the problem before you start deciding to change the entire framework and a different programming language. Otherwise you may put a lot of time into something for no improvement at all. If the architecture is fundamentally flawed you need to think of a different approach. If this is still in the prototyping phase that would be substantially easier.
Well, first you may prune your WildFly installation or try Quarkus :)
Let's assume that I have a JavaScript front-end (Angular.js, for example), a Java-based back-end (Spring, running on Tomcat, for example) and a database management system (SAP HANA In-Memory, in my case). For example, I have graphs that can change relatively quickly.
I am wondering what an efficient and fast architecture could look like. Do you usually send a whole collection of objects to the UI or do you just send deltas?
In my case, data consistency on the UI is very important in order for the application to work properly, but low-latency as well, especially when it comes to data merges.
When it comes to consistency, I often tend to do a SELECT from the database on an insert and read the whole object collection again, but my concerns are that this does not scale.
Is there a generic approach to that problem or even existing frameworks?
Edit:
Currently, it is around 300 objects with a couple of integer attributes and cross-references that can change and rearrange in a millisecond time, but could go up to 10000 in the future. My challenge here is the communication between front-end and back-end, so the front-end always has a consistent data set in real-time.
How close is the client to the server? Is it a mile/km away or hundreds/thousands of miles away? Is the client on the internet or is it on a high-performance VPN? Are you close to the backbone or dozens of hops away? You're not normally going to consistently get 1 millisecond latency on the web if you're trusting the general internet.
If you are on an internal company network and the client is physically close to the server, e.g., same machine, same local network, you can get single digit ms latency with WebSocket (I personally have gotten 3-4 ms across internal data centers at a big investment bank).
Don't optimize too early. That's usually a bad thing.
Although with any high-performance UI, its always good to just send the deltas.
You may want to consider some sort of event mechanism to reduce your polling the data source. Then you would only update the data when it actually changed.
I suppose this is not possible. But I am looking at best way to separate different layers of my service yet be able to access layers quickly or without overhead of IPC/RMI.
The main programming language I am using is java, but can use C++ if required.
What we have right now is a server that host database and access control. And we use RMI for consumers to request data. This slow and doesn't scale very well.
We need performance and scalability which we dont have at the moment.
What we are thinking of is using a layered architecture with database at base, access control ontop of it along with a notification bus to notify clients of changes in database.
The main problem is the overhead of communication that we want to avoid/or minimize.
Is there any magic thread that can run in two context (switch context) and share information that way. I know the short answer would be no, but what are the options?
Update
We are currently using Java RMI.
Our base layer will provide an API that can be used to create plugins that will run on top. So its not a fixed collectors/consumer we have. We can have 5-6 collectors running and same amount of consumers.
We can have upto 1000 consumers.
My first suggestion is that you should buy a book (or find an online tutorial) on building scalable applications, because you seem to be pretty lost.
Sharing a thread between processes doesn't make sense at any level - it is meaningless, but you can share the data that the thread accesses, which is probably what you want.
The fastest method will be C based IPC (e.g., shared memory, semasphores, etc: Shmget). You say you want to avoid the overhead of IPC, but really, it isn't going to get any faster than that.
But why do you want multiple processes? If you are worried about the overhead of communicating between processes, just have your threads in one process? There is no reason your different layers have to be in different processes.
But anyway, I am not convinced that your original statement that RMI is slow and doesn't scale is completely correct. If it is not scaling, you are probably not using the right framework. Maybe you have an issue that you only have one RMI end point on the server. Have you considered an J2EE system with stateless session beans?
Without knowing about your requirements, it is hard to say.
It is not possible in general to share thread between two processes due to OS design. The problem of sharing data between two or more processes is usually solved by sharing files, sharing database or sharing messages (which in turn can be synchronous or asynchronous), having processes communicate via pipes, say in Linux, or even sharing memory. You scenario description is not very precise, you need to describe all processes and how information is supposed to flow, what triggers information flow, etc.
Most likely you need high performance messaging library, https://github.com/real-logic/Aeron/ is one. But to get precise answer you would need to describe better what overhead exactly you want to minimize.
If your goal is to notify users, you should consider publish/subscribe messaging (pub/sub). There are many middleware vendors out there that provide this architecture though most are expensive in production scenarios. For open source, check out http://redis.io/topics/pubsub. (No affiliation.)
Is it better to kick off one thread that will handle one client and another for each other one that connects but tells them the server is busy, or should I just stick with a singlethreaded approach where the same thread accepts and processes the client so others can't connect to it? (if that's the case of course)
Edit: I should note that there won't be 239482340 people connecting to it. Generally only one person will be connecting to the server, but I want my app to deal with another person trying to connect without falling over.
Sticking to a single thread is better for the server since the resource consumption is very low. However, this is probably annoying for the client since it doesn't know if there is something wrong with the server or if it is just busy.
Having threads that tell additional users that the server is busy takes up more resources but is better for the client.
In your particular case, either approach should be fine. I guess it really depends on the clients and what they want... :-/
Two major strategies are commonly used to build this kind of systems:
A solution based on a multithread strategy, assigning a different thread or process to the each incoming request. This model is used in many commercial servers
The event-driven model, is based on the use of non-blocking I/O operations to respond simultaneously to a number of requests coming from different clients. This is a growing approach.
Basically I want a Java, Python, or C++ script running on a server, listening for player instances to: join, call, bet, fold, draw cards, etc and also have a timeout for when players leave or get disconnected.
Basically I want each of these actions to be a small request, so that players could either be processes on same machine talking to a game server, or machines across network.
Security of messaging is not an issue, this is for learning/research/fun.
My priorities:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Speed. I'm going to be playing millions of these hands as fast as I can.
Run on a shared server instance (I may have limited access to ports or things that need root)
My questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
Any good frameworks to work off of?
Message types? I'm thinking JSON or Protocol Buffers.
How to make it FAST?
Thanks guys - just looking for some pointers and suggestions. I think it is a cool problem with a lot of neat things to learn doing it.
As far as frameworks goes, Ginkgo looks promising for building a network service (which is what you're doing). The Python is very straightforward, and the asynchronicity enabled by gevent lets you do asynchronous things without generally having to worry about callbacks. The gevent core also gives you access to a lot of building blocks.
Rather than having lots of services communicating over ports, you might look into either 1) a good message queue, like RabbitMQ or 0mq, or 2) a distributed coordination server, like Zookeeper.
That being said, what you aim to do is difficult, especially if you're not familiar with the basics. It's a worthwhile endeavor to learn about those basics.
Don't worry about speed at first. Get it working, then make it scale. Of course, there are directions you can go that will make it easier to scale in the future. Zookeeper in particular gives you easy-to-implement primitives for scaling horizontally (i.e. multiple workers sharing the load). In particular, see the Zookeeper recipe book and their corresponding python implementations (courtesy of the kazoo, a gevent-based client library).
Don't forget that "fast" also means optimizing your own development time, for quicker iterations and less time cursing your development environment. So use Python, which will let you get up and running quickly now, and optimize later if you really truly start to bind on CPU time or memory use. (With this particular application, you're far more likely to bind on network IO.)
Anything else? Maybe a cup of coffee to go with your question :-)
Answering your question from the ground up would require several books worth of text with topics ranging from basic TCP/IP networking to scalable architectures, but I'll try to give you some direction nevertheless.
Questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
I would venture that if you're not clear on the definition of each of these maybe designing an implementing a service that will be "be playing millions of these hands as fast as I can" is a bit hmm, over-reaching? But don't let that stop you as they say "ignorance is bliss."
Any good frameworks to work off of?
I think your project is a good candidate for Node.js. There main reason being that Node.js is relatively scaleable and it is good at hiding the complexity required for that scalability. There are downsides to Node.js, just Google search for 'Node.js scalability critisism'.
The main point against Node.js as opposed to using a more general purpose framework is that scalability is difficult, there is no way around it, and Node.js being so high level and specific provides less options for solving though problems.
The other drawback is Node.js is Javascript not Java or Phyton as you prefer.
Message types? I'm thinking JSON or Protocol Buffers.
I don't think there's going to be a lot of traffic between client and server so it doesn't really matter I'd go with JSON just because it is more prevalent.
How to make it FAST?
The real question is how to make it scalable. Running human vs human card games is not computationally intensive, so you're probably going to run out of I/O capacity before you reach any computational limit.
Overcoming these limitations is done by spreading the load across machines. The common way to do in multi-player games is to have a list server that provides links to identical game servers with each server having a predefined number of slots available for players.
This is a variation of a broker-workers architecture were the broker machine assigns a worker machine to clients based on how busy they are. In gaming users want to be able to select their server so they can play with their friends.
Related:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Since this is in human time scales (seconds as opposed to miliseconds) the client should send keepalives say every 10 seconds with say 30 second session timeout.
The keepalives would be JSON messages in your application protocol not HTTP which is lower level and handled by the framework.
The framework itself should provide you with HTTP 1.1 connection management/pooling which allows several http sessions (request/response) to go through the same connection, but do not require the client to be always connected. This is a good compromise between reliability and speed and should be good enough for turn based card games.
Honestly, I'd start with classic LAMP. Take a stock Apache server, and a mysql database, and put your Python scripts in the cgi-bin directory. The fact that they're sending and receiving JSON instead of HTTP doesn't make much difference.
This is obviously not going to be the most flexible or scalable solution, of course, but it forces you to confront the actual problems as early as possible.
The first problem you're going to run into is game state. You claim there is no shared state, but that's not right—the cards in the deck, the bets on the table, whose turn it is—that's all state, shared between multiple players, managed on the server. How else could any of those commands work? So, you need some way to share state between separate instances of the CGI script. The classic solution is to store the state in the database.
Of course you also need to deal with user sessions in the first place. The details depend on which session-management scheme you pick, but the big problem is how to propagate a disconnect/timeout from the lower level up to the application level. What happens if someone puts $20 on the table and then disconnects? You have to think through all of the possible use cases.
Next, you need to think about scalability. You want millions of games? Well, if there's a single database with all the game state, you can have as many web servers in front of it as you want—John Doe may be on server1 while Joe Schmoe is on server2, but they can be in the same game. On the other hand, you can a separate database for each server, as long as you have some way to force people in the same game to meet on the same server. Which one makes more sense? Either way, how do you load-balance between the servers. (You not only want to keep them all busy, you want to avoid the situation where 4 players are all ready to go, but they're on 3 different servers, so they can't play each other…).
The end result of this process is going to be a huge mess of a server that runs at 1% of the capacity you hoped for, that you have no idea how to maintain. But you'll have thought through your problem space in more detail, and you'll also have learned the basics of server development, both of which are probably more important in the long run.
If you've got the time, I'd next throw the whole thing out and rewrite everything from scratch by designing a custom TCP protocol, implementing a server for it in something like Twisted, keeping game state in memory, and writing a simple custom broker instead of a standard load balancer.