I have just started scaling an APNS provider program; unfortunately, I am really new to implementing network protocols.
The provider currently runs on a single thread and handles only a small number of notifications. I now want it to send significantly more than before.
My questions are:
According to the Apple docs I can maintain multiple connections to the gateways. My understanding is that I should run multiple threads in the provider program and maintain a separate connection in each. Is this right?
If the first one is right, here comes the real difficulty for me: my program polls a queue database every 5 seconds to check for new messages to be sent. I do not think it's a good idea for all the threads to poll this same database, because the same message could be sent to users more than once. How do I solve this problem?
I have seen connection pooling mentioned but I do not really understand what it is. Is that the thing I need to study and use? If it is, can someone offer a brief explanation of what it is and how to use it?
Thanks guys!
Your first assumption is reasonable. Each thread should have its own connection.
As for the second point, the access to the DB that contains the new messages should be synchronized. For example, you can access that DB by a synchronized method that fetches a message or several messages that haven't been processed yet, and marks them as being processed. Two threads can't access that method at the same time, and therefore won't get the same messages.
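A minimal sketch of the synchronized fetch-and-mark idea, assuming all worker threads run in the same JVM. `MessageStore` and `claimBatch` are made-up names, and the map is only an in-memory stand-in for the real queue table:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the queue table (assumption). A real provider
// would run a SELECT plus an UPDATE against the database in claimBatch().
class MessageStore {
    private final Map<Integer, String> status = new LinkedHashMap<>();

    synchronized void add(int id) {
        status.put(id, "NEW");
    }

    // Only one thread at a time can run this method, so a message is
    // marked PROCESSING before any other thread can see it as NEW.
    synchronized List<Integer> claimBatch(int max) {
        List<Integer> claimed = new ArrayList<>();
        for (Map.Entry<Integer, String> e : status.entrySet()) {
            if (claimed.size() == max) {
                break;
            }
            if ("NEW".equals(e.getValue())) {
                e.setValue("PROCESSING");
                claimed.add(e.getKey());
            }
        }
        return claimed;
    }
}
```

Note that `synchronized` only protects threads inside one process. If several provider processes poll the same database, the claim would have to be made atomic in the database itself.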
Another option is to put the messages in memory in a blocking queue (with the DB only serving as a backup in case of a crash). The threads can request an item from the queue, which blocks them until an item is available.
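Roughly, the blocking queue variant could look like the sketch below. `QueueWorkers`, `drain` and the `STOP` marker are names invented for the example, and the `sent.add(msg)` line stands in for writing to the thread's own APNS connection:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class QueueWorkers {
    private static final String STOP = "__STOP__"; // hypothetical shutdown marker

    // Starts `workers` threads that each take messages from one shared queue.
    // take() blocks until an item is available, and each item is handed
    // to exactly one thread, so no message is processed twice.
    static List<String> drain(List<String> messages, int workers) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> sent = new CopyOnWriteArrayList<>();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    while (true) {
                        String msg = queue.take();
                        if (STOP.equals(msg)) {
                            return;
                        }
                        sent.add(msg); // placeholder for sending on this thread's connection
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        queue.addAll(messages);
        for (int i = 0; i < workers; i++) {
            queue.put(STOP); // one marker per worker so every thread exits
        }
        for (Thread t : threads) {
            t.join();
        }
        return sent;
    }
}
```

In a real provider a single poller thread would read the database and put new messages on the queue, so only one thread ever touches the queue table.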
I have a problem with counting responses from a response queue. Once per day we run a job which gathers some data from the DB and sends it to a queue. When we have received all the responses we should shut down the connection. The problem is: how can we check whether all responses have arrived? Keeping the count in a global variable is risky because of concurrency issues. Any ideas? I am quite new to JMS, so maybe the solution is obvious, but I don't see it.
I don't know what your stack is or whatever tools you might be using to accomplish this but I've got this in mind and this might help you out (hopefully).
Generate a hash for each job you plan on queuing and store it in a concurrent list/map (e.g. a ConcurrentHashMap).
Send the job to the queue.
Once the job is done and sends back a response, reproduce the hash and store it in a separate concurrent list/map that holds all the jobs that are done.
Now you have two collections: all the jobs that were supposed to be executed, and the jobs you got a response from. There are multiple ways to wait until the two match. If you look up Java concurrency, you'll find plenty of tutorials and documentation. I like to use CyclicBarrier and CountDownLatch. If you plan on using either of these, take extra precautions to prevent your application from hanging or, worse, leaking memory.
OR, you could simply compare how many requests you've queued with how many responses you've received and, if they are equal, drop the connection.
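For the counting variant, a CountDownLatch sketch might look like this. `ResponseTracker` is a made-up name, and it assumes the number of expected responses is known before the first one arrives:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ResponseTracker {
    private final CountDownLatch pending;

    ResponseTracker(int expectedResponses) {
        pending = new CountDownLatch(expectedResponses);
    }

    // Call this from the response listener, once per response received.
    // countDown() is thread safe, so no global counter is needed.
    void onResponse() {
        pending.countDown();
    }

    // Returns true if every response arrived before the timeout.
    // The timeout keeps the caller from hanging if a response is lost.
    boolean awaitAll(long timeout, TimeUnit unit) throws InterruptedException {
        return pending.await(timeout, unit);
    }
}
```

The job would create the tracker with the number of messages it sent, call `awaitAll` with a generous timeout, and close the connection afterwards either way.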
I have a List which contains a lot of objects.
The problem is that I have to process these objects (processing includes cloning, deep copying, making DB calls, running business logic, etc.).
Doing this in the normal first-come-first-served fashion is really time-consuming, and in a web application this generally results in transaction timeouts on the server side (as this processing is async from the client's perspective).
How do I process those objects so as to take minimal time and not overload the DB?
I'm using java 7 on server environment.
I'm already using a messaging solution, RabbitMQ, which gets me the item and its quantity. The problem occurs when I try to deep copy items to mimic real items (business logic: every item should be uniquely processed) and save them to the DB.
After some discussion, the viable solution is to use an ABQ (ArrayBlockingQueue) which is processed by a pool of threads.
Following are the thought out benefits:
1) We won't have to manage third-party queues such as RabbitMQ.
2) At any point in time the blocking queue won't hold all the items to be processed, because the consumer threads are processing them concurrently, so it leaves a smaller memory footprint.
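A rough sketch of the ABQ-plus-thread-pool setup, to show where the bounded memory footprint comes from. `BoundedPipeline`, `run` and the `POISON` marker are invented for the example, and the counter increment stands in for the deep copy and DB save:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPipeline {
    private static final Integer POISON = Integer.MIN_VALUE; // hypothetical end-of-input marker

    // Feeds `itemCount` items through a queue of at most `capacity` entries
    // to `consumers` worker threads; returns how many items were processed.
    static int run(int itemCount, int consumers, int capacity) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumers);
        for (int i = 0; i < consumers; i++) {
            pool.submit(() -> {
                try {
                    while (!queue.take().equals(POISON)) {
                        processed.incrementAndGet(); // placeholder for deep copy + DB save
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        for (int i = 0; i < itemCount; i++) {
            queue.put(i); // blocks while the queue is full, which throttles the producer
        }
        for (int i = 0; i < consumers; i++) {
            queue.put(POISON); // one marker per consumer
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return processed.get();
    }
}
```

Because `put` blocks when the queue is full, at most `capacity` items wait in memory at once, whatever the total number of items is.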
@cody123 I'm using Spring Batch for the retry mechanism in this case.
After another round of profiling I found that the bottleneck was the DB connection pool having a low maximum number of connections.
I deduced this by running the same transaction without the DB thread pool; it went perfectly well and completed without any exception.
However, combining this with the previous approach, i.e. managing an ABQ and light commits with an HA DB, will be the best solution.
We currently have a distributed setup where we publish events to SQS, and an application running on multiple hosts drains messages from the queue, applies some transformation, and transmits the result to interested parties. I have a use case where the receiving endpoint has scalability concerns with the message volume, so we would like to batch these messages periodically (say every 15 minutes) in the application before sending them.
The incoming message rate is around 200 messages per second and each message is no more than 10 KB. The system does not need to be real time, although that would be nice to have, and ordering is not important (it's okay if a batch containing older messages gets sent first).
One approach that I can think of is maintaining an embedded database within the application (each host) that batches the events and another thread that runs periodically and clears the data.
Another approach could be to create timestamped buckets in a distributed key-value store (S3, DynamoDB, etc.), write each message to the correct bucket based on the message's timestamp, and periodically clear the buckets.
We can run into several issues here: since the messages can arrive out of order, a bucket might already have been cleared (this can be solved by having a default bucket, though), we would need to decide accurately when to clear a bucket, etc.
The way I see it, at least two components would be required: one which batches into temporary storage and another which clears it.
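To make the two components concrete, here is a minimal in-memory sketch. It is an assumption that memory rather than an embedded database is acceptable as the temporary storage, and `PeriodicBatcher`, `add`, `flush` and `start` are made-up names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class PeriodicBatcher {
    private final Object lock = new Object();
    private final Consumer<List<String>> sink; // hypothetical sender to the receiving endpoint
    private List<String> buffer = new ArrayList<>();

    PeriodicBatcher(Consumer<List<String>> sink) {
        this.sink = sink;
    }

    // Component 1: called by the queue consumers for every transformed message.
    void add(String message) {
        synchronized (lock) {
            buffer.add(message);
        }
    }

    // Component 2: swaps the buffer for an empty one and sends the old contents.
    void flush() {
        List<String> batch;
        synchronized (lock) {
            if (buffer.isEmpty()) {
                return;
            }
            batch = buffer;
            buffer = new ArrayList<>();
        }
        sink.accept(batch); // sent outside the lock so add() is not blocked
    }

    // Runs flush() on a fixed schedule, e.g. start(15, TimeUnit.MINUTES).
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::flush, period, period, unit);
        return timer;
    }
}
```

For sizing: 200 messages per second at up to 10 KB each is up to about 2 MB per second, so a 15-minute (900 second) window can reach roughly 1.8 GB across all hosts. Whether that fits in memory per host depends on how many hosts share the load.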
Any feedback on the above approaches would help. Also, this looks like a common problem; are there any existing solutions that I can leverage?
Thanks
Can you help me with these problems:
A. We have a table on which read and write operations happen simultaneously. Writes happen very frequently, so reads are very slow; sometimes my web application does not come up because of the heavy write load on this table. How could I handle such a scenario? The writes come from a different Java application while the reads come from our web application, so the web application becomes very slow. Any ideas?
B. Writes to this table happen through 200 threads; these threads take connections from a connection pool and write into the table, and this application runs 24/7. Is thread priority causing the issue and blocking the read operations from the web application?
C. Can we have master-master replication for that table only, so that writes happen in one table and reads happen in the other, and every two minutes data migrates from one table to the other?
Please advise.
Thanks in advance.
Check the connection pool size; maybe it's too small and your threads waste time waiting for a connection from the pool.
Check your database settings; if you are just running it with out-of-the-box parameters there may be a lot of room for improvement.
You probably need some kind of event-driven system: when a vehicle sends data, the DB is not updated directly; instead a message is added to a queue (e.g. JMS). Your app then caches the data on startup, and updates both the cache and the database upon receiving a message. The key thing is that the only component that interacts with the DB is your app, and data changes only when you receive an event, so you don't need to query the DB to read the data, and you can do the updates in the background using only a few threads. There are quite good open-source messaging systems (e.g. Apache ActiveMQ) and caching libraries (e.g. Ehcache), so you can build a reasonably performant and fault-tolerant system without too much effort.
I guess introducing messaging would be a serious re-engineering effort, so to solve your immediate problem replication might be the best solution: merge data from the updateable table into another one every 2 minutes, and have the tracker read that other table. Obviously this works well only if the web app just reads the data and does not update it; otherwise you need to put a lot of effort into keeping the two tables in sync. A variation on this is batching: data from the vehicles is inserted into an intermediate table and then every 2 minutes transferred into the main table, which the reader queries; the intermediate table is cleaned after each transfer.
The one true way to solve this is to use a queue of write events and to stop the writing periodically so that the reader has a chance.
Create a queue for incoming write updates
Create an AtomicXXX (see java.util.concurrent.atomic, e.g. AtomicBoolean) to use as a lock
Create a thread pool to read from the queue and execute the updates when the lock is unset
Use javax.swing.Timer to periodically set the lock and read the table data.
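The steps above might be sketched like this. `GatedWriter` and its methods are made-up names, and the scheduling of `runRead` is left out; note that javax.swing.Timer fires on the Swing event dispatch thread, so on a server a ScheduledExecutorService is probably the more natural trigger:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

class GatedWriter {
    private final BlockingQueue<Runnable> writes = new LinkedBlockingQueue<>();
    private final AtomicBoolean readInProgress = new AtomicBoolean(false);

    // Incoming updates are queued instead of hitting the table directly.
    void submitWrite(Runnable write) {
        writes.add(write);
    }

    // Writer threads call this in a loop. It applies one queued write,
    // unless a read currently holds the gate or the queue is empty.
    boolean tryApplyOne() {
        if (readInProgress.get()) {
            return false;
        }
        Runnable write = writes.poll();
        if (write == null) {
            return false;
        }
        write.run();
        return true;
    }

    // Called periodically: closes the gate, reads, then reopens the gate.
    void runRead(Runnable read) {
        readInProgress.set(true);
        try {
            read.run();
        } finally {
            readInProgress.set(false);
        }
    }
}
```

The flag only stops new writes from starting; a write that is already running when the read begins will still overlap with it. If the read needs strict exclusion, a real lock would be needed instead of a flag.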
Before trying anything too complicated try this perhaps:
1) Don't use Thread priorities, they are rarely what you want.
2) Set up your own priority scheme, perhaps simply by having a (priority) queue for both reads and writes where reads are prioritized. That is: add read and write requests to a single queue and have them block or be notified of the result.
3) Check which features your database offers for optimizing write-heavy tables.
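Point 2 could look roughly like this with a PriorityBlockingQueue. `DbRequest` and its fields are invented for the example; a real request would also carry the work to execute and a way to return the result:

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class DbRequest implements Comparable<DbRequest> {
    enum Kind { READ, WRITE } // READ is declared first, so it sorts before WRITE

    private static final AtomicLong COUNTER = new AtomicLong();

    final Kind kind;
    final String label;
    private final long seq = COUNTER.getAndIncrement(); // keeps FIFO order within a kind

    DbRequest(Kind kind, String label) {
        this.kind = kind;
        this.label = label;
    }

    @Override
    public int compareTo(DbRequest other) {
        int byKind = kind.compareTo(other.kind);
        return byKind != 0 ? byKind : Long.compare(seq, other.seq);
    }

    // Convenience factory for the single shared queue of reads and writes.
    static PriorityBlockingQueue<DbRequest> newQueue() {
        return new PriorityBlockingQueue<>();
    }
}
```

Worker threads would call `take()` on the queue and always receive a waiting read before any write. One thing to watch: under a constant stream of reads, writes could wait indefinitely, so some form of aging may be needed.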
We have a JMS queue of job statuses, and two identical processes pulling from the queue to persist the statuses via JDBC. When a job status is pulled from the queue, the database is checked to see if there is already a row for the job. If so, the existing row is updated with new status. If not, a row is created for this initial status.
What we are seeing is that a small percentage of new jobs are being added to the database twice. We are pretty sure this is because the job's initial status is quickly followed by a status update - one process gets one, another process the other. Both processes check to see if the job is new, and since it has not been recorded yet, both create a record for it.
So, my question is, how would you go about preventing this in a vendor-neutral way? Can it be done without locking the entire table?
EDIT: For those saying the "architecture" is unsound - I agree, but am not at liberty to change it.
Create a unique constraint on JOB_ID, and retry to persist the status in the event of a constraint violation exception.
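Here is a sketch of the retry logic. To keep it runnable without a database, `JobTable` is an in-memory stand-in whose `insert` fails on a duplicate key the way a unique constraint on JOB_ID would; all class and method names are made up:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class DuplicateKeyException extends RuntimeException {
}

// In-memory stand-in (assumption) for a table with a unique constraint on JOB_ID.
class JobTable {
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    boolean exists(String jobId) {
        return rows.containsKey(jobId);
    }

    void insert(String jobId, String status) {
        if (rows.putIfAbsent(jobId, status) != null) {
            throw new DuplicateKeyException(); // what the constraint violation would look like
        }
    }

    void update(String jobId, String status) {
        rows.put(jobId, status);
    }

    String status(String jobId) {
        return rows.get(jobId);
    }

    int rowCount() {
        return rows.size();
    }
}

class StatusPersister {
    static void persist(JobTable table, String jobId, String status) {
        if (table.exists(jobId)) {
            table.update(jobId, status);
            return;
        }
        try {
            table.insert(jobId, status);
        } catch (DuplicateKeyException e) {
            // The other consumer inserted the row between our check and our
            // insert, so retry the persist as an update of the existing row.
            table.update(jobId, status);
        }
    }
}
```

With JDBC the violation surfaces as an SQLException; many drivers throw the SQLIntegrityConstraintViolationException subclass, but that depends on the driver, so it is worth checking for your database.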
That being said, I think your architecture is unsound: if two processes are pulling messages from the queue, it is not guaranteed that they will write them to the database in queue order. One consumer might be a bit slower, a packet might be dropped, and so on, causing the other consumer to persist the later message first, which is then overwritten with the earlier state.
One way to guard against that is to include sequence numbers in the messages, update the row only if the sequence number is as expected, and delay the update otherwise (this is vulnerable to lost messages, though ...).
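The sequence-number idea, sketched for a single JVM. It assumes each job's messages are numbered from 0, and `SequencedStatus` is a made-up name; with two separate processes the same check would have to live in the database, for example as an update that is conditional on a stored sequence column:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class SequencedStatus {
    private final Map<String, Long> nextExpected = new HashMap<>();
    private final Map<String, TreeMap<Long, String>> delayed = new HashMap<>();
    private final Map<String, String> current = new HashMap<>();

    // Applies a status only when its sequence number is the next one expected
    // for that job; later numbers are held back until the gap is filled.
    synchronized void onMessage(String jobId, long seq, String status) {
        long expected = nextExpected.getOrDefault(jobId, 0L);
        if (seq < expected) {
            return; // stale or duplicate message, already applied
        }
        TreeMap<Long, String> waiting = delayed.computeIfAbsent(jobId, k -> new TreeMap<>());
        waiting.put(seq, status);
        while (!waiting.isEmpty() && waiting.firstKey() == expected) {
            current.put(jobId, waiting.pollFirstEntry().getValue());
            expected++;
        }
        nextExpected.put(jobId, expected);
    }

    synchronized String status(String jobId) {
        return current.get(jobId);
    }
}
```

As the answer says, a lost message leaves everything after it delayed forever, so a real version would need a timeout or some other way to give up on a gap.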
Of course, the easiest way would be to have only one consumer ...
JDBC connections are not thread safe, so there's nothing to be done about that.
"...two identical processes pulling from the queue to persist the statuses via JDBC..."
I don't understand this at all. Why two identical processes? Wouldn't it be better to have a pool of message queue listeners, each of which would handle messages landing on the queue? Each listener would have its own thread; each one would be its own transaction. A Java EE app server allows you to configure the size of the message listener pool to match the load.
I think a design that duplicates a process like this is asking for trouble.
You could also change the isolation level on the JDBC connection. If you make it SERIALIZABLE you'll ensure ACID at the price of slower performance.
Since it's an asynchronous process, performance will only be an issue if you find that the listeners can't keep up with the messages landing on the queue. If that's the case, you can try increasing the size of the listener pool until you have adequate capacity to process the incoming messages.