Large scale mailout platform using PHP - java

What is the best practice for sending email campaigns?
My company is asking me to come up with an application that is able to send hundreds of thousands of emails per day.
We have the capacity to send this amount using Amazon SES.
As a PHP developer I have created a PHP script that finds, for example, 100,000 records in the database and sends emails one by one according to user preferences. This script is executed using cron several times a day.
But this approach fails because the script is slow and the browser times out (even with a high PHP set_time_limit). In other words, it's not robust or reliable.
I was thinking to perhaps use Java or some other "active" programming language that is alive in the background and is able to handle this without timing out etc.
Have any of you had this issue before? What are your suggestions for this large-scale mailout platform?
Side note 1: We call an API in order to send emails, no sendmail etc.
Side note 2: It has to be able to call an API around 40 times per second; my script only manages 1 call per second
Side note 3: Database is MySQL

If you need to do long running tasks in PHP I recommend running the script from the command line, without a web server. You won't have the timeout issue.

WOW - Sounds like a SPAM factory :) Anyways, I would look into writing some type of service that can spin multiple threads up and process the requests that way. 40 times per second to the cloud seems like a lot. Good luck!
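A minimal Java sketch of the "long-running service with multiple threads" idea, paced to the 40 calls per second mentioned in the question. `EmailApi` is a hypothetical stand-in for whatever client wraps the real sending API, and the class and method names are illustrative only. One scheduler thread releases a recipient at a fixed rate and a small worker pool performs the actual calls, so a slow call does not stall the pacing.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PacedMailer {
    /** Hypothetical stand-in for the real email API client (an assumption). */
    interface EmailApi {
        void send(String recipient) throws Exception;
    }

    /**
     * Sends to every recipient at roughly callsPerSecond and returns the
     * number of failed calls. Blocks until all recipients were attempted.
     */
    static int sendAll(List<String> recipients, EmailApi api,
                       int callsPerSecond, int workers) throws InterruptedException {
        if (callsPerSecond <= 0 || workers <= 0) {
            throw new IllegalArgumentException("callsPerSecond and workers must be positive");
        }
        Queue<String> pending = new ConcurrentLinkedQueue<>(recipients);
        CountDownLatch done = new CountDownLatch(recipients.size());
        AtomicInteger failures = new AtomicInteger();
        ScheduledExecutorService pacer = Executors.newSingleThreadScheduledExecutor();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        long periodNanos = TimeUnit.SECONDS.toNanos(1) / callsPerSecond;

        // Release one recipient per period. After a delay the scheduler may
        // run ticks back to back to catch up, so short bursts are possible.
        pacer.scheduleAtFixedRate(() -> {
            String recipient = pending.poll();
            if (recipient == null) {
                return; // queue drained, nothing left to hand out
            }
            pool.execute(() -> {
                try {
                    api.send(recipient);
                } catch (Exception e) {
                    failures.incrementAndGet(); // a real service would log and retry
                } finally {
                    done.countDown();
                }
            });
        }, 0, periodNanos, TimeUnit.NANOSECONDS);

        done.await();
        pacer.shutdownNow();
        pool.shutdown();
        return failures.get();
    }
}
```

A call such as `PacedMailer.sendAll(recipients, api, 40, 8)` would target 40 calls per second with 8 workers; the worker count is a guess that depends on how long each API call takes.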

It would be better to purchase something like expressmail from GoDaddy.
There is always a chance of your mail being marked as spam if you send it yourself using PHP.


Java Chat application bandwidth usage?

I've tried to look around for data concerning how much of a bandwidth hog a chat application is.
In this case maybe with a Java/AJAX implementation or simply just Java, using Server/Client relationship.
I want to find out, how much bandwidth such a system would use when it's written in Java. The benchmark could be 15-20 users from all over the world and peaking at maybe 8 or 10 max connected at a time. I know it might seem vague, but I simply can't seem to find data on this specific situation.
Can anyone point me to some resources regarding this? Or chip in if possible?
Unless the chat application is sending photos or files, it will use a trivial amount of data. With a max user count of ten people at once you could wrap the messages in a bandwidth hog of xml and I would still stick with my answer: it will use a trivial amount of bandwidth.
Say all ten of your users are fast typers and very chatty. They type non-stop at 100 words per minute. Break that down to 10 sentences per minute and wrap each of these in a message to the server. Add some XML data describing who the message came from and whether it is private to another user or sent to a group of users and maybe you could get 1K per message. So each user is then sending 1K to the server every 6 seconds. With 10 users, we get 10K sent to the server every 6 seconds.
So by my estimate, we could connect your server to a 56K modem from 1995 and you'll be fine.
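The arithmetic behind that estimate can be written out as a short Java sketch. The figures (10 users, 1 KB per message, 10 messages per minute each) come from the answer above; the class and method names are illustrative, and the calculation covers only traffic sent to the server, as the estimate does.

```java
final class ChatBandwidthEstimate {
    /** Bits per second arriving at the server when every user sends at the same rate. */
    static double inboundBitsPerSecond(int users, int bytesPerMessage,
                                       double secondsBetweenMessages) {
        return users * bytesPerMessage * 8 / secondsBetweenMessages;
    }

    public static void main(String[] args) {
        // 10 sentences per minute means one message every 6 seconds per user.
        double secondsBetweenMessages = 60.0 / 10;
        double bps = inboundBitsPerSecond(10, 1024, secondsBetweenMessages);
        System.out.printf("Inbound: %.1f kbit/s (a 56K modem carries 56 kbit/s)%n", bps / 1000);
    }
}
```

With those inputs the result is roughly 13.7 kbit/s, well under the 56 kbit/s of the modem in the comparison.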
The reason you can't find data about this is that there's nothing particularly Java- or AJAX-related here. Bandwidth usage depends on the data you send and receive over the network, and therefore on the protocol that you design to pass data around; it has nothing to do with whether you use Java only, AJAX in combination with Java, CGI scripts, PL/I or Assembler.
You can code a chat application in Assembler that will be a worse bandwidth hog than a chat application coded in Java.
In order to know your bandwidth impact, you need to analyze your data model, data flow and your overall communication protocol: namely, what data is being sent, in what structure, and how frequently.

BlazeDS Polling Interval set to 0: Unwanted side-effects?

tl;dr: Setting the polling-interval to 0 has given my performance a huge boost, but I am worried about possible problems down the line.
In my application, I am doing a fair amount of publishing from our java server to our flex client, publishing on a variety of topics and sub-topics.
Recently, we have been on a round of performance improvements system-wide, and the messaging layer was proving to be a big bottleneck.
A few minutes ago, I discovered that setting the <polling-interval-millis> property in our services-config.xml to 0 causes published messages, even when there are lots of them, to be recognized by the client almost instantly instead of after the 3-second delay that is the default for polling-interval-millis. This has obviously had a tremendous impact.
So, I'm pretty happy with the current performance. The only thing is, I'm a bit nervous about unintended side effects caused by this change. In particular, I am worried about our Flash client slowing way down and about generating way too much unwanted traffic.
My preliminary testing has not borne out this fear, but before I commit the change to our repository, I was hoping that somebody with experience with this stuff would chime in.
Unfortunately your question is too general...there is no way to receive a specific answer. I'll write below some ideas, maybe they are helpful.
Decreasing the value from 3 seconds to 0 means that you receive new data much faster. If your Flex client uses this data for complex computations, the client may slow down or show obsolete data (it is a known pattern, see http://help.adobe.com/en_US/LiveCycleDataServicesES/3.1/Developing/WS3a1a89e415cd1e5d1a8a18fb122bdc0aad5-8000Update.html ). You need to understand how the data is processed, and you should probably do some client benchmarking.
The server will also have to handle more requests, so it would be good to identify the maximum number of requests per second it can handle. For that, you will need a tool like JMeter to find the maximum capacity of your system. After that, you can estimate how many requests per second you will have once the interval is reduced from 3 seconds to 0, taking into account, for example, that the number of clients is growing by 10% per month.
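A rough Java sketch of that kind of estimate. It assumes a deliberately simple model that is not taken from BlazeDS documentation: each client starts its next poll one polling interval after the previous response, so one poll cycle lasts the interval plus the round-trip time. The client count and the 50 ms round trip in `main` are hypothetical numbers; only the 3-second interval and the 10% monthly growth come from the text above.

```java
final class PollingLoadEstimate {
    /** Polls per second hitting the server under the simple cycle model described above. */
    static double requestsPerSecond(double clients, long pollingIntervalMillis,
                                    long roundTripMillis) {
        long cycleMillis = pollingIntervalMillis + roundTripMillis;
        if (cycleMillis <= 0) {
            throw new IllegalArgumentException("interval plus round trip must be positive");
        }
        return clients * 1000.0 / cycleMillis;
    }

    /** Client count after compound monthly growth, e.g. 0.10 for 10% per month. */
    static double clientsAfter(double clientsNow, double monthlyGrowth, int months) {
        return clientsNow * Math.pow(1.0 + monthlyGrowth, months);
    }

    public static void main(String[] args) {
        double clients = 200;   // hypothetical current client count
        long roundTrip = 50;    // hypothetical round-trip time in milliseconds
        for (int month = 0; month <= 12; month += 6) {
            double n = clientsAfter(clients, 0.10, month);
            System.out.printf("month %d: %.0f clients, %.0f req/s at 3000 ms, %.0f req/s at 0 ms%n",
                    month, n,
                    requestsPerSecond(n, 3000, roundTrip),
                    requestsPerSecond(n, 0, roundTrip));
        }
    }
}
```

Under this model the request rate at an interval of 0 is bounded only by the round-trip time, which is why the measured capacity from the load test matters more than the configured interval.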
The main idea is that you should do some performance testing of your API and save the scripts so you can see whether future modifications slow the system down too much. Without this, it is quite hard to guess whether it is OK to change configuration parameters.
You might want to try out long-polling. For our WebLogic servers, we don't get any problems unless we let the poll request go to 5 minutes, so we keep it to 4, then give it a 1 second rest before starting again. We have a couple of hundred total users, with 60-70 on it hard core all day. The thing to keep in mind is that you're basically turning intermittent user requests into what amounts to almost always connected telnet sessions. Depending on the browser your users are using, there can be implications from that as well, but overall we've been very pleased.

What is the fastest way to follow a link's 302 redirects to its final URL?

Given the link http://bit.ly/2994js
What is the most efficient way or library to use to get the final URL of a bit.ly, fb.me, etc. link after the 302 redirects? Assume the scale to be 10+ million of these a day, with the ability to scale across servers.
Java HttpClient?
PHP with cURL?
other?
The implementation language isn't likely to make much odds in terms of performance - there's almost nothing to do. It'll all be network latency. It's possible that using a customized network stack might help, but I wouldn't bother unless I really needed to.
I'm not sure whether a 302 response is still able to keep the connection alive with HTTP 1.1 - but if it can, that could really be a boon. That's also an argument against using cURL (which is going to start a new process, requiring a new connection) for each URL, unless there's some way of putting cURL into a batch mode. (There may be - worth investigating.)
The important thing will be to make sure that you don't hit any server so hard it thinks you're launching a DDOS attack, but to make as many requests in parallel as you can within that limit.
Note that 10,000,000 per day is only ~116 requests per second. If you've got an adequate network connection and the target servers aren't blocking you, that shouldn't be hard to achieve.
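A minimal Java sketch of following redirects by hand with the JDK's `java.net.http.HttpClient` (Java 11+), one of the options the question lists. It assumes the shortener answers HEAD requests with a redirect status and a `Location` header, which is not guaranteed for every service; the hop limit of 10 and the class and method names are illustrative choices. Sharing one client across requests lets it reuse connections where the server allows it.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

final class RedirectResolver {
    private static final int MAX_HOPS = 10; // assumption: guard against redirect loops

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Next URL to visit, or empty if the response is not a usable redirect. */
    static Optional<URI> nextHop(URI current, int status, Optional<String> location) {
        if (!isRedirect(status)) {
            return Optional.empty();
        }
        // Location may be relative, so resolve it against the current URL.
        return location.map(loc -> current.resolve(loc));
    }

    /** Follows redirects one hop at a time and returns the last URL reached. */
    static URI resolve(HttpClient client, URI start)
            throws IOException, InterruptedException {
        URI current = start;
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            HttpRequest request = HttpRequest.newBuilder(current)
                    .method("HEAD", HttpRequest.BodyPublishers.noBody())
                    .build();
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            Optional<URI> next = nextHop(current, response.statusCode(),
                    response.headers().firstValue("Location"));
            if (next.isEmpty()) {
                return current;
            }
            current = next.get();
        }
        return current;
    }

    public static void main(String[] args) throws Exception {
        // Redirects are disabled on the client so each hop is visible to resolve().
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        System.out.println(resolve(client, URI.create("http://bit.ly/2994js")));
    }
}
```

For the volumes in the question, many such lookups would run in parallel (for example via `client.sendAsync`), throttled per target host as the answer above advises.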
cURL is fastest. So, if you want absolute speed, go with writing a bash script that does it by cURL.
However, making 10+ million requests may get your IP banned by them pretty soon.
In the case of bit.ly, there is an API call (expand) that gets the target URL from the shortened URL. Other URL shortening services may have similar API calls. In those cases, you wouldn't have to handle the redirect.

Sanity Check - Is a Multiplayer Game Server in Java using TCP (ServerSocket) viable?

Please stop me before I make a big mistake :) - I'm trying to write a simple multi-player quiz game for Android phones to get some experience writing server code.
I have never written server code before.
I have experience in Java and using Sockets seems like the easiest option for me. A browser game would mean platform independence but I don't know how to get around the lack of push using Http from the Server to the Browser.
This is how the game would play out, it should give some idea of what I require;
A user starts the App and it connects using a Socket to my server.
The server waits for 4 players, groups them into a game and then broadcasts the first question for the quiz.
After all the players have submitted their answers (or 5 seconds have elapsed), the server distributes the correct answer with the next question.
That's the basics, you can probably fill in the finer details, it's just a toy project really.
MY QUESTION IS;
What are the pitfalls of using a simple JAR on the server to handle client requests? The server code registers a ServerSocket when it is first run and creates a thread pool for dealing with incoming client connections. Is there an option that is inherently better for connection to multiple clients in real time with two way communication?
A simple example is in the Sun tutorials; at the bottom you can see the source for a multithreaded server. Except that I create a pool of threads up front to reduce overhead, my server is largely the same.
How many clients do you expect this system to be able to handle? If we have a new thread for each client I can see that being a limit, also the number of free Sockets for concurrent players. Threads seem to top out at around 6500 with the number of sockets available nearly ten times that.
To be honest If my game could handle 20 concurrent players that would be fine but I'm trying to learn if this approach is inherently stupid. Any articles on setting up a simple chess server or something would be amazing, I just can't find any.
Thanks in advance oh knowledgeable ones,
Gav
You can handle 20 concurrent players fine with a Java server. The biggest thing to make sure you do is avoid any kind of blocking I/O like it was the devil itself.
As a bonus, if you stick with non-blocking I/O you can probably do the whole thing single-threaded.
Scaling much past 100 users or so may require multiple processes or servers, depending on how much load each user places on your server.
It should be able to do it without an issue as long as you code it properly.
Project Darkstar
You can get around the "push from server to client over HTTP" problem by using the Long Poll method.
However, using TCP sockets for this will be fine too. Plenty of games have been written this way.
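Whichever transport is used, the matchmaking step from the question (wait for 4 players, then group them into a game) is independent of the socket code. A minimal Java sketch, with hypothetical names and players identified by plain strings; `join` is synchronized on the assumption that several connection-handling threads from the pool may call it at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class Lobby {
    private final int playersPerGame;
    private final List<String> waiting = new ArrayList<>();

    Lobby(int playersPerGame) {
        if (playersPerGame <= 0) {
            throw new IllegalArgumentException("playersPerGame must be positive");
        }
        this.playersPerGame = playersPerGame;
    }

    /**
     * Adds a player to the waiting list. Returns the full group once enough
     * players have arrived, or empty while the lobby is still filling up.
     */
    synchronized Optional<List<String>> join(String playerId) {
        waiting.add(playerId);
        if (waiting.size() < playersPerGame) {
            return Optional.empty();
        }
        List<String> game = List.copyOf(waiting); // immutable snapshot of this group
        waiting.clear();                          // the next player starts a new group
        return Optional.of(game);
    }
}
```

The thread that receives a non-empty result would then start the game for that group and broadcast the first question.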

How to create an automated way of monitoring to see if this application is running

We have an application that should always be running. Does anyone know of an automated way to monitor whether this application is running (possibly using a batch file) and, if it is not running, to send an email notification and start the application?
Nagios is generally what's used by systems administrators that I've come across. You can script it to do whatever check you need and alert based on a variety of conditions. Works well with cacti so you can graph stuff too :)
If you want to ensure that your service always restarts should it die you could use supervise from daemontools.
An alternative to Nagios is Zabbix.
You don't mention an OS but if you're looking for something on Windows, Application Monitor might be a good start.
If you're on Linux, monit looks pretty useful.
Most monitoring systems have a built-in test which watches the process list to check that everything that should be running is running.
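A minimal Java sketch of such a process-list check, using the JDK's `ProcessHandle` API (Java 9+). The class and method names are illustrative, and `notifier` is a hypothetical hook where an email notification would be sent. The check matches on the executable's file name, which on Windows includes the `.exe` suffix; the OS may hide the command of processes owned by other users, in which case they are simply not matched.

```java
import java.io.IOException;
import java.util.List;
import java.util.Optional;

final class ProcessWatchdog {
    /** True if the command path ends in the given executable file name. */
    static boolean matches(Optional<String> command, String executableName) {
        return command
                .map(path -> path.replace('\\', '/'))                   // normalise Windows paths
                .map(path -> path.substring(path.lastIndexOf('/') + 1)) // keep the file name only
                .filter(name -> name.equals(executableName))
                .isPresent();
    }

    /** Scans the current process list for the executable. */
    static boolean isRunning(String executableName) {
        return ProcessHandle.allProcesses()
                .anyMatch(p -> matches(p.info().command(), executableName));
    }

    /**
     * If the application is not running, calls the notifier and starts it.
     * Returns true when a restart was attempted.
     */
    static boolean ensureRunning(String executableName, List<String> startCommand,
                                 Runnable notifier) throws IOException {
        if (isRunning(executableName)) {
            return false;
        }
        notifier.run(); // hypothetical hook, e.g. send the alert email
        new ProcessBuilder(startCommand).inheritIO().start();
        return true;
    }
}
```

Something like this would be run periodically from cron or the Windows Task Scheduler; the dedicated tools mentioned in the answers add alert escalation, history and graphing on top of the same basic check.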
We use Hobbit, it has a configurable table of processes which should be running (and the number of instances, red/yellow alert etc).
We are now heading to release our service that can do some monitoring tasks that usually are hard to handle by Nagios or other similar tools. We provide instant notifications (email, SMS) when:
a) your application/service does not respond for some time
b) some conditions are met (e.g. time of execution of some part of logic > X, number of emails sent < Y or whatever you want)
This is absolutely easy to use compared to Nagios or others, and it does not require installation. We spent a lot of time making it user-friendly.
As I mentioned this will be released very soon (will come back and give you the information). If you are interested in our approach we invite you to beta tests of our application (there will be some promotion for participants).
