I am designing a structure like the one below to share a master's location.
I need to develop an API to receive the master's location and store it in a database, then deliver the master's location to the relevant users. Users could be on the web, Android, or iOS to get the master's location.
So I have some questions:
Which is better, a socket or HTTP requests?
Which is better, SQL or NoSQL?
Is there anything else I need to pay attention to?
Here are answers to your questions, in order:
Which is better, a socket or HTTP requests?
For a single request/reply, they are about the same - WebSockets also need to send HTTP headers when establishing the connection. For back-and-forth communication, especially with small messages, WebSockets will be much faster because they don't need to transmit headers for every single message - a WebSocket is a normal TCP connection, and it can keep reusing the same connection instead of long polling and establishing a new one each time.
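If you go the WebSocket route, here is a minimal sketch of a server-side endpoint using the standard Java WebSocket API (JSR 356), which containers such as Tomcat support. The endpoint path and the idea of broadcasting the master's location as a text message are assumptions for illustration, not a required design:

import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import javax.websocket.OnClose;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

// Illustrative endpoint: the master pushes its location as a text message,
// and the server broadcasts it to every connected client (web/Android/iOS).
@ServerEndpoint("/location")
public class LocationEndpoint {

    private static final Set<Session> sessions = ConcurrentHashMap.newKeySet();

    @OnOpen
    public void onOpen(Session session) {
        sessions.add(session);
    }

    @OnMessage
    public void onMessage(String locationJson, Session sender) throws IOException {
        // Broadcast the master's latest location to all connected users.
        for (Session s : sessions) {
            if (s.isOpen()) {
                s.getBasicRemote().sendText(locationJson);
            }
        }
    }

    @OnClose
    public void onClose(Session session) {
        sessions.remove(session);
    }
}

A web, Android or iOS client would then open a WebSocket to that path and simply listen for location updates instead of polling.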
Which is better, SQL or NoSQL?
When trying to decide what sort of database to use for your application, it is important to identify what you actually want that database to do.
If you are dealing with large quantities of structured data, then SQL will work well for you. If you are dealing with unstructured data, then NoSQL will work better for you.
If you can divide your data into structured and unstructured parts, then you can use both solutions.
Hope this information helps!
Do tell me if anything in this post is unclear; I'll try to clear it up.
I built a FIX initiator application using the QuickFIX/J library to send orders to my broker. If you don't know what a FIX application is, think of my program as an application that sends messages to a server over a TCP connection.
To collect and send the orders created by multiple algorithms, I have a directory watcher (WatchService) that watches for modifications in a local directory that is synchronized with an S3 bucket using the AWS CLI.
This approach works well, except that I have to wait about 6-8 seconds before the file appears in my local directory so that I can parse it into FIX orders and send them to the broker's FIX app. I would really like to reduce this delay between the order's creation and the moment it is sent to the broker.
Here are the possible solutions I have thought of:
1) Reading directly from the S3 bucket without using the AWS CLI
2) Opening a different FIX session for each algorithm
3) Instead of reading from a bucket, polling a database (MySQL) for new orders. The algos would generate table rows instead of files
4) Having an API between my FIX application and the algorithms, so the algos can connect directly to my application.
Solution (1) didn't improve the order-receiving time, because it takes about the same time to list the S3 objects, get the summaries and filter the desired file.
Solution (2) I haven't tried, but I don't think it is the best one. If I have, for example, 100 different strategies, I would have to open 100 different connections, and I'm not sure whether my broker's app can handle that. But I may be wrong.
Solution (3) I also haven't tried.
Solution (4) is what I believe is ideal, but I don't know how to implement it. I tried to create a REST API, but I don't know whether that is conceptually correct. Supposing that my FIX application is currently connected to the broker's server, my idea was to (i) create a new web app exposing a REST API, (ii) receive the order info through that API, (iii) find the currently alive session and (iv) send the order to the broker's server using that session. Unfortunately, I was not able to find the current session by ID using the following code in a class other than the one running the FIX application:
SessionID sessionID = new SessionID("FIX.4.4", "CLIENT1", "FixServer");
Session session = Session.lookupSession(sessionID);
What I would like to hear from you:
What do you think is the best solution for sending FIX orders created by multiple sources?
If I want to create an API to connect two different applications, what steps can I follow?
I am sorry if I was a bit confusing. Let me know if you need further clarification.
Thank you
Q : What do you think is the best solution for sending FIX orders created by multiple sources?
Definitely (4) - i.e., consolidate your multiple sources of decisions and interface with the broker-side FIX Protocol Gateway from a single point.
Reasons:
- isolation of concerns in design/implementation/operations
- single point of authentication/latency-motivated colocation for FIX Protocol channel
- minimised costs of FIX Protocol Gateway acceptance testing (Tier-1 market participants will not let you run business with them without this, so expenses on FIX Protocol end-to-end compliance testing do matter - both cost-wise and time-wise)
Q : What are the steps that I can follow?
Follow your own use case, which defines all the MVP features that need to be ready before going into testing.
Do not try to generalise your needs into any "new-Next-Gen-API"; your trading is all about latency, so specialise on the MVP definition and do not design or implement anything beyond an MVP with minimum latency (overhead) on a point-to-point basis. Using stable, professional frameworks such as nanomsg or ZeroMQ can save you from reinventing already-invented wheels for low-latency trading messaging/signalling. Using REST is rather an anti-pattern in the third-millennium, latency-motivated, high-performance distributed-computing ecosystem for trading.
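To make that concrete, here is a minimal sketch of the single-ingress idea using JeroMQ (the pure-Java ZeroMQ implementation) with a PUSH/PULL socket pair. The address, the payload format and the class name are assumptions for illustration, not your broker's or QuickFIX/J's actual interface:

import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

// Illustrative only: each algo PUSHes serialized order requests to a single
// PULL socket owned by the FIX gateway process, which then builds and sends
// the actual FIX message over its one broker session.
public class OrderIngress {
    public static void main(String[] args) {
        try (ZContext ctx = new ZContext()) {
            ZMQ.Socket pull = ctx.createSocket(SocketType.PULL);
            pull.bind("tcp://127.0.0.1:5555");   // address is an assumption

            while (!Thread.currentThread().isInterrupted()) {
                String rawOrder = pull.recvStr();  // e.g. a small JSON/CSV payload
                // TODO: translate rawOrder into a QuickFIX/J order message and
                // send it through the already-connected session.
                System.out.println("Received order request: " + rawOrder);
            }
        }
    }
}

Each algorithm would open a PUSH socket, connect to the same address and send its serialized order request; only the gateway process keeps the single FIX session to the broker.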
I am currently trying to transfer a file from an Android device to a Java TCP server, but I am unable to find a good example that explains the structure I would need to implement this. There are many Java client & server examples out there that explain file transfer, but I want to make sure this will still work once an Android device is thrown into the mix.
My question is: how do I implement this sort of structure? And if it doesn't work, would I be better off sending the file over an HTTP connection to a PHP server? I see a lot of examples and documentation online for the latter method, so I presume it is more reliable. I would, however, prefer to use a Java server.
The file consists of a large set of coordinates recorded by the Android device, which will then be sent to the server. I have not yet established how I will store this data, but I was originally going to store it in a plain text file.
Design
The first thing you need is something to allow you to run Java code on your server.
There are a number of options. Two of the most popular technologies are Glassfish and Apache Tomcat.
Crudely speaking, Apache Tomcat is sufficient for simple client-server communication, while GlassFish is used if you need to do more complex things. Both allow servlets (which are essentially self-contained server-side classes written in Java) to run on the server.
The container runs inside a JVM (Java Virtual Machine) and dispatches each incoming request to a worker thread, which invokes your servlet to do some processing if required before sending a response back to the client. Because the container manages the threading for you, dealing with multiple concurrent requests is much simpler (no need to write your own threading code).
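As a minimal sketch of what such a servlet could look like (the URL mapping and the plain-text request body are assumptions for illustration):

import java.io.BufferedReader;
import java.io.IOException;

import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Illustrative servlet: the Android client POSTs its recorded coordinates in
// the request body; the servlet reads them and acknowledges receipt.
@WebServlet("/upload")
public class CoordinateUploadServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = req.getReader()) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        // TODO: parse and persist the coordinates (text file or database - see below).
        resp.setStatus(HttpServletResponse.SC_OK);
        resp.getWriter().println("Received " + body.length() + " characters");
    }
}

The container maps POST requests to /upload onto doPost and handles the threading for you.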
Networking (sending data to and from the server)
In networking situations the client can be a PC, an Android phone or any other device capable of connecting to the internet. As far as the server is concerned, if the client can communicate using HTTP (a standard protocol which it understands) then it doesn't care what sort of device it is. This means that solutions for desktop client-server applications are very similar to those for a phone.
You can use a library such as Apache HttpComponents to make it easier to handle HTTP requests and responses between the device and the server. Of course, you could write your own classes to do this using raw sockets, but that would be very time-consuming, particularly if you have never done it before.
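On the client side, even the JDK's built-in HttpURLConnection is enough for a simple upload, and the same code works on a desktop JVM and on Android (run it off the main thread there). The URL and the plain-text payload are assumptions for illustration:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Illustrative client upload: POST the coordinate file's contents to the servlet.
public class CoordinateUploader {

    public static int upload(String serverUrl, String coordinates) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(serverUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/plain; charset=utf-8");

        try (OutputStream out = conn.getOutputStream()) {
            out.write(coordinates.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();   // 200 means the servlet accepted it
        conn.disconnect();
        return status;
    }
}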
Storage of Data
If you have time I would recommend implementing some sort of database to store the information.
Databases have a number of benefits, such as data recovery mechanisms, indexing for fast searching, data integrity guarantees, better structuring of data and so on.
If you decide to use a database, I recommend MySQL. It is free and, more importantly, well documented.
Aside: JDBC can be used to communicate with the database from Java.
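A minimal sketch of that JDBC layer, assuming a MySQL schema with a coordinates table (the connection URL, credentials and column names are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Illustrative JDBC insert of one coordinate pair into MySQL.
public class CoordinateDao {

    private static final String URL  = "jdbc:mysql://localhost:3306/tracking"; // assumption
    private static final String USER = "app";
    private static final String PASS = "secret";

    public void save(String deviceId, double latitude, double longitude) throws SQLException {
        String sql = "INSERT INTO coordinates (device_id, latitude, longitude) VALUES (?, ?, ?)";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, deviceId);
            ps.setDouble(2, latitude);
            ps.setDouble(3, longitude);
            ps.executeUpdate();
        }
    }
}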
Source: Personal experience from implementing a similar design.
I have an app which will generate 5-10 new database records per second on one host.
The records don't need any checks; they just have to be recorded in a remote database.
I'm using Java for the client app.
The database is behind a server.
Sending the data can't make the app wait, so sending each single record to the remote server synchronously is probably not a good idea.
Sending data must not fail. My app doesn't need an answer from the server, but it has to be 100% certain that the data arrives at the server correctly (which should be guaranteed by using, for example, an HTTP URL connection over TCP...?).
I have thought about a few approaches for this:
Run the send data code in separate thread.
Store the data only in memory and send to database after certain count.
Store the data in a local database and have it sent to / pulled by the server on request.
All of this makes sense, but I'm a noob at this, and maybe there's some standard approach I'm missing that makes things easier. I'm not sure which way to go.
Your requirements aren't very clear. My best answer is to go through your question, and try to point you in the right direction on a point-by-point basis.
"The records don't need any checks," and "My app doesn't need an answer, but it has to be 100% secure that it arrives at the server correctly."
How exactly are you planning on the client knowing that the data was received without sending a response? You should always plan to write exception handling into your app and deal with situations where the client's connection, or the data it sends, is dropped for some reason. These two statements you've made seem to be in conflict with one another: you don't need a response, but you need to know that the data arrives? Is your app going to use a crystal ball to divine confirmation that the data was received? (If so, please send me such a crystal ball - I'd like to use it to short the stock market.)
"Run the send data code in a separate thread," and "store the data in memory and send later," and "store the data locally and have it pulled by the server", and "sending data can't make my app wait".
OK, so it sounds like you want non-blocking I/O. But the reality is that even with non-blocking I/O it still takes some amount of time to actually send the data. My question is: why are you asking for non-blocking and/or fast I/O? If data transfers were simply extremely fast, would it really matter if they weren't also non-blocking? This is a design decision on your part, but it's not clear from your question why you need it, so I'm just throwing that out there.
As far as putting the data in memory and sending it later, that's not really non-blocking, or multi-tasking; that's just putting off the work until some future time. I consider that software procrastination. This method doesn't reduce the amount of time or work your app needs to do in order to process that data, it just puts it off to some future date. This doesn't gain you anything unless there's some benefit to "batching" data sending into large chunks.
The in-memory idea also sounds like a temporary buffer. Many of the I/O stream implementations are going to have a buffer built in, as well as the buffer on your network card, as well as the buffer on your router, etc., etc. Adding another buffer in your code doesn't seem to make any sense on the surface, unless you can justify why you think this will help. That is, what actual, experienced problem are you trying to solve by introducing a buffer? Also, depending on how you're sending this data (i.e. which network I/O classes you choose) you might get non-blocking I/O included as part of the class implementation.
Next, as for sending the data on a separate thread: that's fine if you need non-blocking I/O, but (1) you need to justify why that's a good idea in terms of the design of your software before you go down that route, because it adds complexity to your app; unless it solves a specific, real problem (e.g. a UI that shouldn't freeze or become unresponsive due to pending I/O operations), it's just added complication, and you won't get any added performance out of it. (2) There's a common temptation to use threads to, again, basically procrastinate work. Putting the work off onto another thread doesn't reduce the total amount of work needing to be done, or the total amount of I/O your app will consume to accomplish its function - it just puts it off onto another thread. There are times when this is highly beneficial, and maybe it's the right decision for your app, but your description lists a lot of requested features without the justification (or an explanation of the problem you're trying to solve) to back up those feature/design choices, and that is what should ultimately drive the direction you choose.
Finally, as far as having the server "pull" it instead of it being pushed to the server, well, all you're doing here is flipping the roles, and making the server act as a client, and the client the server. Realize that "client" and "server" are relative terms, and the server is the thing that's providing the service. Simply flipping the roles around doesn't really change anything - it just flips the client/server roles from one part of the software to the other. The labels themselves are just that - labels - a convenient way to know which piece is providing the service, and which piece is consuming the service (the client).
"I have an app which will generate 5 - 10 new database records in one host each second."
This shouldn't be a problem. Any decent DB server will treat this sort of work as extremely low load. The bigger concern in terms of speed/responsiveness from the server will be things like network latency (assuming you're transferring this data over a network) and other factors regarding your I/O choices that will affect whether or not you can write 5-10 records per second - that is, your overall throughput.
The canonical, if unfortunately enterprisey, answer to this is to use a durable message queue. Your app would send messages to the queue, and a backend app would receive them and store them in a database. Once the queue has accepted a message, it guarantees that it will be made available to the receiver, even if the sender, the receiver, or the queue broker itself crashes.
On my machine, using HornetQ, it takes ~1 ms to construct and send a short text message to a durable queue. That's quick enough that you can do it as part of handling a web request without adding any noticeable additional delay. Any good message queue will support your 10 messages per second throughput. HornetQ has been benchmarked as handling 8.2 million messages per second.
I should add that message queues are not that hard to set up and use. I downloaded HornetQ, and had it up and running in a few minutes. The code needed to create a queue (using the native HornetQ API) and send and receive messages (using the JMS API) is less than a hundred lines.
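Here is a minimal sketch of the sending side using the plain JMS API; how you obtain the ConnectionFactory (JNDI, the broker's client library, etc.) and the queue name are assumptions for illustration:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

// Illustrative JMS producer: the client fires persistent messages at a durable
// queue and returns almost immediately; a backend consumer writes them to the database.
public class RecordSender {

    private final ConnectionFactory factory;  // obtained from the broker's client library or JNDI

    public RecordSender(ConnectionFactory factory) {
        this.factory = factory;
    }

    public void send(String recordJson) throws Exception {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("records");        // queue name is an assumption
            MessageProducer producer = session.createProducer(queue);
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);   // survives broker restarts
            TextMessage message = session.createTextMessage(recordJson);
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}

In real code you would keep the connection and session open rather than creating them per message, but the persistent delivery mode is the part that gives you the durability guarantee.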
If you queue the data and send it in a thread, it should be fine if your rate is 5-10 per second and there's only one client. If you have multiple clients, to the point where your database inserts begin to get slow, you could have a problem, given your requirement that "sending data must not fail" - which is a much more difficult requirement, especially in the face of machine or network failure.
Consider the following scenario: you have more clients than your database can handle efficiently, and one of your users is a fast typist. Inserts begin to back up in memory in their app. They finish their work and shut the app down before the last records are actually uploaded to the database. Or the machine crashes before the data is sent - or while it is sending; or, worse yet, the database crashes while it is sending, and due to network issues the client can't really tell that its transaction has not completed.
The easy way to avoid these problems (most of them, anyway) is to make the user wait until the data is committed somewhere before allowing them to continue. If you can make the database inserts fast enough, then you can stick with a simpler scheme. If not, then you have to be more creative.
For example, you could write the data to local disk when the user hits submit, and then upload it from another thread. This scheme needs to be smart enough to mark persisted data as sent (deleting it would work), and it needs the ability to re-scan at startup and look for unsent work to send. It also needs to keep retrying in the case of network or centralized-server failure.
There also needs to be a way for the server side to detect duplicates, because the client machine could send the data and crash before it can mark it as sent, and then, upon restart, send it again. The same situation could occur with a bad network connection: the client could send the data, never receive confirmation from the server, time out and end up retrying.
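A minimal sketch of that local-spool idea (the directory layout, the file naming and the UUID-based idempotency key are assumptions for illustration):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Illustrative local spool: each record is written to its own file with a unique
// id before we acknowledge the user; an uploader thread sends files and deletes
// them only after the server confirms receipt. The id lets the server drop
// duplicates if a crash or timeout causes a retry.
public class LocalSpool {

    private final Path dir = Paths.get("spool");   // directory name is an assumption

    public Path persist(String record) throws IOException {
        Files.createDirectories(dir);
        String id = UUID.randomUUID().toString();            // idempotency key
        Path file = dir.resolve(id + ".rec");
        Files.write(file, (id + "\n" + record).getBytes(StandardCharsets.UTF_8));
        return file;                                          // safe to tell the user "saved"
    }

    public void markSent(Path file) throws IOException {
        Files.delete(file);                                   // "deleting it would work"
    }
}

An uploader thread would scan the spool directory, send each file's contents together with its id, and call markSent only after the server acknowledges; the server uses the id to discard anything it has already stored.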
If you don't want the client app to block, then yes, you need to send the data from a different thread.
Once you've done that, the only thing that matters is whether you can send records to the database at least as fast as you're generating them. I'd start by getting it working sending them one by one; then, if that isn't sufficient, put them into an in-memory queue and update in batches. It's hard to say more, since you don't give us any idea what determines the rate at which records are generated.
You don't say how you're writing to the database - JDBC? An ORM like Hibernate? But the principles are the same.
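A minimal sketch of the queue-plus-batching approach with plain JDBC (the table name, the batch size and the DataSource wiring are assumptions for illustration; with Hibernate the idea is the same):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import javax.sql.DataSource;

// Illustrative writer thread: the app offers records to an in-memory queue and
// returns immediately; this thread drains the queue and inserts in batches.
public class BatchWriter implements Runnable {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final DataSource dataSource;   // configured elsewhere

    public BatchWriter(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void submit(String record) {
        queue.offer(record);               // non-blocking from the caller's point of view
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>();
        while (!Thread.currentThread().isInterrupted()) {
            try {
                batch.add(queue.take());           // wait for at least one record
                queue.drainTo(batch, 99);          // grab up to 100 in total
                flush(batch);
                batch.clear();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (Exception e) {
                e.printStackTrace();               // real code needs retry/spooling here
            }
        }
    }

    private void flush(List<String> batch) throws Exception {
        String sql = "INSERT INTO records (payload) VALUES (?)";   // table is an assumption
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String record : batch) {
                ps.setString(1, record);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}

Start it with new Thread(writer).start() and call submit(record) from the app; the caller never waits on the database.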
We have a Java (Spring) web application running in a Tomcat servlet container.
We have something like a blog.
But the blog must load its posts dynamically with Ajax.
The client's ajax script checks for new posts every second.
That is, the Ajax code must ask the server for new posts every second, which will be very heavy on the database.
And what if we have hundreds of thousands of simultaneous connections?
I think we could retrieve all the posts with a cron job every second and then save them somewhere. But where? The main idea is to take the load off the database.
Any ideas about architecture?
Thanks in advance!
There are other polling architectures that could be more optimal, depending on the case:
Long polling
Long polling is a variation of the traditional polling technique and allows emulation of an information push from a server to a client. With long polling, the client requests information from the server in a similar way to a normal poll. However, if the server does not have any information available for the client, instead of sending an empty response, the server holds the request and waits for some information to be available. Once the information becomes available (or after a suitable timeout), a complete response is sent to the client. The client will normally then immediately re-request information from the server, so that the server will almost always have an available waiting request that it can use to deliver data in response to an event. In a web/AJAX context, long polling is also known as Comet programming.
An example implementation of this technique:
Push Server
You could also use the observer pattern to register the requests, and notify them when an update is done.
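Since you are on Spring, a minimal long-polling sketch could use DeferredResult (available since Spring 3.2); the endpoint path, the 30-second timeout and the String post type are assumptions for illustration. The onNewPost method is where the observer-pattern notification mentioned above would hook in:

import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;

// Illustrative long-polling controller: requests are parked until a new post
// arrives (or a 30 s timeout fires), instead of hitting the database every second.
@RestController
public class PostPollController {

    private final Queue<DeferredResult<List<String>>> waiting = new ConcurrentLinkedQueue<>();

    @GetMapping("/posts/updates")
    public DeferredResult<List<String>> pollForNewPosts() {
        DeferredResult<List<String>> result =
                new DeferredResult<>(30_000L, Collections.emptyList());  // empty list on timeout
        result.onCompletion(() -> waiting.remove(result));
        waiting.add(result);
        return result;
    }

    // Called by whatever publishes a post (the "observer" notification).
    public void onNewPost(List<String> newPosts) {
        DeferredResult<List<String>> result;
        while ((result = waiting.poll()) != null) {
            result.setResult(newPosts);
        }
    }
}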
Hundreds of thousands of concurrent users all polling your site every second makes for a huge amount of traffic. If you truly expect this load, you are going to have to design your platform accordingly, probably by clustering multiple web, application and database servers.
Remember that with a database connection pool you don't need a DB connection for every user.
I'm not as familiar with Tomcat, but in WebSphere we can set up connection pools to prepare a certain number of connections.
Also, are you mainly worried about reads, or about a comparable number of writes?
Plus, you may also want to "split" (shard) the database, for example by region. That way there is no single heavy load across the entire database; it can be split up and even load balanced.
There are also "NoSQL" databases to look into as well. Maybe something to consider. Just ideas to help out.
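For the "retrieve the posts every second and save them somewhere" idea from the question, the simplest "somewhere" is an in-memory snapshot that every Ajax request reads. A minimal sketch (the one-second interval and the PostRepository interface are assumptions for illustration):

import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative cache: one scheduled task queries the database once per second;
// every Ajax request is answered from the in-memory snapshot, so the number of
// concurrent users no longer multiplies the database load.
public class LatestPostsCache {

    // Hypothetical DAO for fetching the newest posts.
    public interface PostRepository {
        List<String> findLatestPosts();
    }

    private final AtomicReference<List<String>> latest =
            new AtomicReference<>(Collections.emptyList());

    public void start(PostRepository repository) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> latest.set(repository.findLatestPosts()),
                0, 1, TimeUnit.SECONDS);
    }

    public List<String> getLatestPosts() {
        return latest.get();                          // served straight from memory
    }
}

With this, the database sees one query per second regardless of how many users are polling.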
Architecture:
A bunch of clients send messages to a server which sits behind a VIP. Obviously this server poses an availability risk.
Each client monitors a resource, and the server is responsible for taking action based on the status that the majority of the clients report to it - hence the need for only one server/leader.
I am thinking of adding another server as a backup behind the VIP, which gets turned on only when the first server fails. However, when the backup comes up it would have no information to process, and it would lose time waiting for clients to report and for the required thresholds to be reached, etc.
Problem:
What is the best and easiest way to have two servers share client state information while only one of them receives client traffic?
Solution 1:
I thought of having the primary server forward client state information to the backup server, so that in the event of a failure, when the backup server comes up, it can take over from there.
Is there any other way to do this? I thought of having a common/shared place to store state information from which both servers can read client state. But this doesn't work well either, since the shared store is a single point of failure too.
One option is to use a write-ahead log. Essentially, any modification you make to your state gets sent over to the backup server, which replays the change on its own copy of the state. As long as it can keep up with the streaming log, the backup is always up-to-date.
This is the approach generally used by most databases; if you use one as your backend, you may be able to get support for this with little work.
Be careful to have a plan for recovering from communication failure - either save the log to disk and resend the missing portion, or, on reconnect, send a snapshot of the state plus all log entries since the snapshot.
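A minimal sketch of that write-ahead log on the primary side (the tab-separated entry format and the BackupLink transport are assumptions for illustration):

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative write-ahead log: every state change is appended (and flushed) to
// a local log and forwarded to the backup before the primary applies it. The
// sequence number lets the backup detect and re-request missing entries.
public class WriteAheadLog {

    private final BufferedWriter writer;
    private long sequence = 0;

    public WriteAheadLog(Path logFile) throws IOException {
        this.writer = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public synchronized long append(String change, BackupLink backup) throws IOException {
        long seq = ++sequence;
        String entry = seq + "\t" + change;
        writer.write(entry);
        writer.newLine();
        writer.flush();              // durable locally before we report success
        backup.replicate(entry);     // backup replays the change on its own state copy
        return seq;
    }

    // Hypothetical transport to the backup server (socket, queue, etc.).
    public interface BackupLink {
        void replicate(String entry) throws IOException;
    }
}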
There are various distributed caching products which do the kind of thing you're talking about here. Some are supplied with application servers, such as WebSphere's DynaCache and ObjectGrid. In fact, ObjectGrid can be used in Java SE; there is no need for an application server.
Those distributed cache products use various push and pull models with pub-sub messaging to achieve consistency across the instances. Working for IBM, I'm a fan of ObjectGrid but, more importantly, a fan of not reinventing wheels. My take is that this stuff can get quite complex, so finding something off the shelf might save a load of work - various open-source solutions are available as well.
This is very much dependent on how available your solution needs to be (how many 9's). There is a spectrum of solutions.
A lightweight one could be built around Memcached: an extremely fast distributed state facility. As an example, it is used extensively on Google App Engine.
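A minimal sketch of that lightweight option using the spymemcached client (the host/port, key prefix and expiry are assumptions for illustration):

import java.net.InetSocketAddress;

import net.spy.memcached.MemcachedClient;

// Illustrative shared-state facility: both servers read and write client status
// through Memcached, so the backup can pick up the latest reports on failover.
public class SharedClientState {

    private final MemcachedClient cache;

    public SharedClientState() throws Exception {
        cache = new MemcachedClient(new InetSocketAddress("localhost", 11211)); // assumption
    }

    public void reportStatus(String clientId, String status) {
        cache.set("client-status:" + clientId, 3600, status);   // 1 hour expiry
    }

    public String readStatus(String clientId) {
        return (String) cache.get("client-status:" + clientId);
    }
}

The backup can call readStatus on failover instead of waiting for every client to report again; the trade-off, as noted above, is that the shared Memcached node itself then needs to be made sufficiently available.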