As a Java developer I'm going to participate in a web project. So I'm trying to get informed on different aspects of web security.
Now I have come to the subject of DoS attacks and I'm trying to figure out what I can/should do as a Java developer, or whether it is rather the system administrator's job.
What comes to mind first is to implement the functionality in such a way that a single request cannot take too much time or too many resources, for example by putting limits on the amount of data processed. But I'm not sure whether this is applicable in all cases.
Should I take any care for DoS due to many requests?
Any advice will be appreciated.
Many thanks in advance!
Firstly, there's nothing either of you can do to prevent a DoS attack.
All you can do is make your code sensible (Developer), and your architecture robust (SysAdmin). It is a joint effort.
Developers should try to minimise resource usage as part of their job anyway - not just for DoS attacks.
Developers should use caches to protect the database. If every request needs to consult a list of Countries, then requesting that list from the database every single time isn't good practice anyway (a small caching sketch follows these points).
Developers should make sure that bad requests fail as quickly as possible, e.g. don't consult the Countries list at all until you've verified that their account number actually exists.
Developers should adopt approaches like REST: treating each request individually rather than maintaining Sessions in memory. This could stop your memory usage from rocketing during an attack. You don't want memory problems as well as your network being flooded!
Developers should make their application scalable. Again, REST helps here as you aren't tied to having things stored in memory. If you can run ten instances of your application at once, each handling a subset of the requests, you will last much longer in a DoS attack (and probably give your users a smoother website experience anyway).
SysAdmins should provide the load-balancing, fail-over, etc. frameworks to manage this scalability. They will also manage hardware for the instances. You could also have the option to add more instances automatically on demand, meaning that automatic server creation and deployment become important. Using VMs rather than physical boxes can help with this.
SysAdmins can set up firewalls and proxies so that, when an attack does happen, they can keep your REAL traffic coming through and stop the attack traffic. They can filter traffic by suspected IP range, block 'suspicious-looking' requests, throttle traffic levels to a gentle flow, etc.
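To make the caching point above concrete, here is a minimal read-through cache sketch in Java; CountryDao and the ten-minute refresh interval are assumptions made for this example, not part of any particular framework.

import java.util.List;

// Minimal read-through cache for a rarely-changing lookup list, so the
// database is not hit on every request. CountryDao is a hypothetical DAO and
// the ten-minute refresh interval is an illustrative choice.
public class CountryCache {
    public interface CountryDao {
        List<String> loadCountries();
    }

    private static final long REFRESH_MS = 10 * 60 * 1000;

    private final CountryDao dao;
    private List<String> cached;
    private long loadedAt;

    public CountryCache(CountryDao dao) {
        this.dao = dao;
    }

    public synchronized List<String> countries() {
        long now = System.currentTimeMillis();
        if (cached == null || now - loadedAt > REFRESH_MS) {
            cached = dao.loadCountries();   // single DB round trip per refresh
            loadedAt = now;
        }
        return cached;
    }
}

Every request after the first within the refresh window is then served from memory instead of the database.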
Overall, you can look at DoS as just "high amounts of traffic". If your application code and architecture can't cope with increasing traffic from "regular users" then you are doomed anyway, regardless of a DoS attack. When Facebook was threatened with DoS, I remember someone pointing out that "Every day is a DDoS attack for Facebook...". But it is developed and structured in such a way that it copes.
DoS attacks are usually the concern of IT. If you are developing a web application, it usually sits behind a front controller (Apache, nginx, etc.) that forwards requests to your application container (Tomcat, Rails, etc.). The front controllers usually/always have logic to deal with this issue.
If you are an application developer, then concentrate on XSS attacks (http://en.wikipedia.org/wiki/Cross-site_scripting), as that is totally within the application developer's responsibilities.
I'd say that it is the sysadmins concern mainly, but that doesn't mean that the developer shouldn't take measures to avoid it.
Since a DoS attack is usually about bogging your system down with requests so that it cannot handle real requests (Denial of Service), Wikipedia has this to say about DoS prevention:
Defending against Denial of Service attacks typically involves the use of a combination of attack detection, traffic classification and response tools, aiming to block traffic that they identify as illegitimate and allow traffic that they identify as legitimate.
In my opinion these are sysadmin tasks, since they are the ones who should be configuring the firewall, routers, switches etc.
Basically I want a Java, Python, or C++ program running on a server, listening for player instances to join, call, bet, fold, draw cards, etc., and also having a timeout for when players leave or get disconnected.
Basically I want each of these actions to be a small request, so that players could either be processes on the same machine talking to a game server, or machines across the network.
Security of messaging is not an issue, this is for learning/research/fun.
My priorities:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Speed. I'm going to be playing millions of these hands as fast as I can.
Run on a shared server instance (I may have limited access to ports or things that need root)
My questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
Any good frameworks to work off of?
Message types? I'm thinking JSON or Protocol Buffers.
How to make it FAST?
Thanks guys - just looking for some pointers and suggestions. I think it is a cool problem with a lot of neat things to learn doing it.
As far as frameworks goes, Ginkgo looks promising for building a network service (which is what you're doing). The Python is very straightforward, and the asynchronicity enabled by gevent lets you do asynchronous things without generally having to worry about callbacks. The gevent core also gives you access to a lot of building blocks.
Rather than having lots of services communicating over ports, you might look into either 1) a good message queue, like RabbitMQ or 0mq, or 2) a distributed coordination server, like Zookeeper.
That being said, what you aim to do is difficult, especially if you're not familiar with the basics. It's a worthwhile endeavor to learn about those basics.
Don't worry about speed at first. Get it working, then make it scale. Of course, there are directions you can go that will make it easier to scale in the future. Zookeeper in particular gives you easy-to-implement primitives for scaling horizontally (i.e. multiple workers sharing the load). In particular, see the Zookeeper recipe book and their corresponding Python implementations (courtesy of kazoo, a gevent-based client library).
Don't forget that "fast" also means optimizing your own development time, for quicker iterations and less time cursing your development environment. So use Python, which will let you get up and running quickly now, and optimize later if you really truly start to bind on CPU time or memory use. (With this particular application, you're far more likely to bind on network IO.)
Anything else? Maybe a cup of coffee to go with your question :-)
Answering your question from the ground up would require several books worth of text with topics ranging from basic TCP/IP networking to scalable architectures, but I'll try to give you some direction nevertheless.
Questions:
Listen on ports or use sockets or HTTP port 80 apache listening script? (I'm a bit hazy on the differences between these).
I would venture that if you're not clear on the definition of each of these, then maybe designing and implementing a service that will "be playing millions of these hands as fast as I can" is a bit, hmm, over-reaching? But don't let that stop you; as they say, "ignorance is bliss."
Any good frameworks to work off of?
I think your project is a good candidate for Node.js. The main reason is that Node.js is relatively scalable and it is good at hiding the complexity required for that scalability. There are downsides to Node.js; just do a Google search for 'Node.js scalability criticism'.
The main point against Node.js, as opposed to using a more general-purpose framework, is that scalability is difficult; there is no way around it, and Node.js being so high-level and specific gives you fewer options for solving tough problems.
The other drawback is that Node.js is JavaScript, not Java or Python as you prefer.
Message types? I'm thinking JSON or Protocol Buffers.
I don't think there's going to be a lot of traffic between client and server, so it doesn't really matter. I'd go with JSON just because it is more prevalent.
How to make it FAST?
The real question is how to make it scalable. Running human vs human card games is not computationally intensive, so you're probably going to run out of I/O capacity before you reach any computational limit.
Overcoming these limitations is done by spreading the load across machines. The common way to do this in multi-player games is to have a list server that provides links to identical game servers, with each server having a predefined number of slots available for players.
This is a variation of a broker-workers architecture where the broker machine assigns a worker machine to clients based on how busy they are. In gaming, users want to be able to select their server so they can play with their friends.
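To make the broker idea concrete, here is a minimal sketch in Java of a broker that picks the game server with the most free seats; GameServer and its fields are illustrative and not taken from any real framework.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical broker that assigns a client to the game server with the most
// free seats.
public class Broker {
    public static class GameServer {
        final String url;
        final int slots;
        final int playersOnline;

        GameServer(String url, int slots, int playersOnline) {
            this.url = url;
            this.slots = slots;
            this.playersOnline = playersOnline;
        }

        int freeSlots() {
            return slots - playersOnline;
        }
    }

    // Returns the least-loaded server that still has room, if any.
    public Optional<GameServer> assign(List<GameServer> servers) {
        return servers.stream()
                .filter(s -> s.freeSlots() > 0)
                .max(Comparator.comparingInt(GameServer::freeSlots));
    }
}

A list server could call assign(...) over the servers it knows about, or simply return the whole list if users are allowed to pick their own server.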
Related:
Have a good scheme for detecting when players disconnect, but also be able to account for network latencies, etc before booting/causing to lose hand.
Since this is on human time scales (seconds as opposed to milliseconds), the client should send keepalives, say, every 10 seconds, with, say, a 30-second session timeout.
The keepalives would be JSON messages in your application protocol, not HTTP, which is lower level and handled by the framework.
The framework itself should provide you with HTTP 1.1 connection management/pooling which allows several http sessions (request/response) to go through the same connection, but do not require the client to be always connected. This is a good compromise between reliability and speed and should be good enough for turn based card games.
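A minimal sketch of the server-side bookkeeping this implies, in Java; the player-id keyed map, the message shape in the comment, and the 30-second constant are assumptions made for illustration.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Tracks the last keepalive received per player and flags sessions that have
// been silent longer than the timeout. A keepalive could be as simple as a
// JSON message like {"type":"keepalive","player":"alice"} in your own protocol.
public class KeepaliveTracker {
    private static final long TIMEOUT_MS = 30_000;   // 30-second session timeout
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    // Call whenever a keepalive (or any other message) arrives from a player.
    public void touch(String playerId) {
        lastSeen.put(playerId, System.currentTimeMillis());
    }

    // Run periodically (e.g. every few seconds) to boot silent players.
    public void sweep() {
        long now = System.currentTimeMillis();
        lastSeen.forEach((player, seen) -> {
            if (now - seen > TIMEOUT_MS) {
                lastSeen.remove(player);
                // here: fold the player's hand and free their seat
            }
        });
    }
}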
Honestly, I'd start with classic LAMP. Take a stock Apache server and a MySQL database, and put your Python scripts in the cgi-bin directory. The fact that they're sending and receiving JSON instead of HTML doesn't make much difference.
This is obviously not going to be the most flexible or scalable solution, of course, but it forces you to confront the actual problems as early as possible.
The first problem you're going to run into is game state. You claim there is no shared state, but that's not right—the cards in the deck, the bets on the table, whose turn it is—that's all state, shared between multiple players, managed on the server. How else could any of those commands work? So, you need some way to share state between separate instances of the CGI script. The classic solution is to store the state in the database.
Of course you also need to deal with user sessions in the first place. The details depend on which session-management scheme you pick, but the big problem is how to propagate a disconnect/timeout from the lower level up to the application level. What happens if someone puts $20 on the table and then disconnects? You have to think through all of the possible use cases.
Next, you need to think about scalability. You want millions of games? Well, if there's a single database with all the game state, you can have as many web servers in front of it as you want—John Doe may be on server1 while Joe Schmoe is on server2, but they can be in the same game. On the other hand, you can have a separate database for each server, as long as you have some way to force people in the same game to meet on the same server. Which one makes more sense? Either way, how do you load-balance between the servers? (You not only want to keep them all busy, you want to avoid the situation where 4 players are all ready to go, but they're on 3 different servers, so they can't play each other…)
The end result of this process is going to be a huge mess of a server that runs at 1% of the capacity you hoped for, that you have no idea how to maintain. But you'll have thought through your problem space in more detail, and you'll also have learned the basics of server development, both of which are probably more important in the long run.
If you've got the time, I'd next throw the whole thing out and rewrite everything from scratch by designing a custom TCP protocol, implementing a server for it in something like Twisted, keeping game state in memory, and writing a simple custom broker instead of a standard load balancer.
I've been programming for a while now, and I am pretty familiar with Java and PHP and websites. What I'm confused about is how programmers use them together. I hear about how Facebook and Google use all sorts of languages like Python, C, Java, PHP all for one product, but I'm just confused on how that would be possible.
Also, another side question:
What work exactly do software engineers do when working for large online companies like Twitter and Facebook? Most of the code deals with database and information, and so what major level programming, besides what can be learned online with a few tutorials, needs to be done on the server side?
This is an incredibly broad question, but here's a shot at a vague answer. Often, large applications will have a number of components. For instance, you may have some sort of reporting engine, business logic, web interface, desktop interface, web service API, mobile interface, etc., etc., etc. Each of these could, in theory, be written in a different language and communicate via a database or something like a web service.
To your second question: at large companies there is a great deal of work to be done to maintain stability, develop new features, fix bugs as they are discovered, and work to increase efficiency. Facebook, for instance (and Google), employs a large number of software engineers to help deal with the massive volume they receive on a daily basis.
Edit: Here's a bit more clarification and a direct answer to your question.
Most of the code deals with database and information, and so what major level programming, besides what can be learned online with a few tutorials, needs to be done on the server side?
The truth is, for the most part, the high-level principles are the same. You could pretty easily build a Facebook clone after doing some basic PHP/MySQL tutorials on the web. Here's the difference: your clone would die before it reached a fraction of the users Facebook sees on a daily basis. It would be slow and unreliable, and people would leave because their data would be consistently hacked through SQL injection and other malicious attacks. And that's not even talking about distributed computing. So, yes, from a high level, that's all you need to know. The implementation and reality is much, much more complex.
As you might expect, larger "websites" are not built in the traditional sense that you have some PHP code, a few HTML templates and a database, since this kind of architecture has severe issues scaling to thousands of concurrent users.
What you can do to mitigate this is split the website into several components:
Load balancers that distribute requests to several App servers
App servers which generate the UI and handle user actions
Middleware servers that handle business logic and distribute it among DB servers
DB servers that store data in some way
Every component of this system might be implemented in a different language and you might even have different app servers depending on request type (e.g. mobile devices).
This type of system is called a multitier architecture. You can also find academic books on this topic.
Most complex products consist of numerous pieces. For example, StackExchange has code that runs in your browser, and that code is written in JavaScript precisely so it can run in a browser. But the code that builds the web pages doesn't run in a browser and so isn't written in JavaScript. And if complex database queries are needed, they're likely to be in SQL. And so on. Each piece of the big puzzle is implemented in the language most appropriate for what that piece does and the environment in which it runs.
Think about GMail. There's an in-browser piece that's written in JavaScript. There's also a web server, a database, a mail server, a bulk storage system, indexing, and many, many other pieces.
This is the actual answer you are looking for.
You are confused because you don't see how C and C++ applications are used in websites, but they are used for many things. For example, when you upload an image to Facebook that contains pornographic content, PHP won't validate that image; instead, a separate program is executed with the image's path passed as a parameter, and that application validates the image. Any data that should be stored for future use is written by that application to the same database the site is using. Likewise, if you upload an image to Google+, it loads tag suggestions for the parts of the image where people's faces are detected; that is done by such an application, which saves the image data to the common database Google is using, and PHP takes that information from there. This is the technique for developing much more functional websites.
For example, I have made a script to shut down my home computer while working on localhost:
<?php
// Ask Windows to shut the machine down, forcing apps closed, after 5 seconds.
$command = "shutdown -s -f -t 5";
shell_exec($command);
?>
Once run under Apache, this script will shut the server down. Similarly, you can pass parameters to other programs, for example to create an email account from the command line on a server that doesn't have cPanel installed.
As for the second part of your question:
Software engineers are hired to develop applications that run on the server and extend the functionality of the website. If websites were limited to web scripting languages alone, then neither Google nor Facebook could do face recognition, and artificial intelligence would not be possible for websites.
I hope this post clears up your confusion.
I have to write an architecture case study, but there are some things that I don't know, so I'd like some pointers on the following:
The website must handle 5k simultaneous users.
The backend is composed of a commercial software package, some web services, some message queues, and a database.
I want to recommend using Spring for the backend, to deal with the different elements and to expose some REST services.
I also want to recommend Wicket for the front end (not the point here).
What I don't know is: must I install the front and the back on the same Tomcat server or on two different ones? I am tempted to put two servers on the front, with a load balancer (no need for session replication in this case). But if I have two front servers, must I have two back servers? I don't want to create some kind of bottleneck.
Based on what I read on this blog, a really huge load is handled by only one Tomcat server for the first website mentioned. But I cannot find any info on this, so I can't tell whether it seems plausible.
If you can enlighten me so I can go on with my case study, that would be really helpful.
Thanks :)
There are probably two main reasons for having multiple servers for each tier; high-availability and performance. If you're not doing this for HA reasons, then the unfortunate answer is 'it depends'.
Having two front end servers doesn't force you to have two backend servers. Is the backend going to be under a sufficiently high load that it will require two servers? It will depend a lot on what it is doing, and would be best revealed by load testing and/or profiling. For a site handling 5000 simultaneous users, though, my guess would be yes...
It totally depends on your application. How heavy are your sessions? (Wicket is known for putting a lot in the session.) How heavy are your backend processes?
It might be a better idea to come up with something that can scale. A load-balancer with the possibility to keep adding new servers for scaling.
Measurement is the best thing you can do. Create JMeter scripts and find out where your app breaks. Build a plan from there.
To expand on my comment: think through the typical process by which a client makes a request to your server:
it initiates a connection, which has an overhead for both client and server;
it makes one or more requests via that connection, holding on to resources on the server for the duration of the connection;
it closes the connection, generally releasing application resources, but still hogging a port number on your server for some number of seconds after the connection is closed.
So in designing your architecture, you need to think about things such as:
how many connections can you actually hold open simultaneously on your server? If you're using Tomcat or another standard server with one thread per connection, you may have issues running 5,000 simultaneous threads; a NIO-based architecture, on the other hand, can handle thousands of connections without needing one thread per connection (a minimal sketch follows this list); and if you're in a shared environment, you may simply not be able to have that many open connections;
if clients don't hold their connections open for the duration of a "session", what is the right balance between number of requests and/or time per connection, bearing in mind the overhead of making and closing a connection (initialisation of encrypted session if relevant, network overhead in creating the connection, port "hogged" for a while after the connection is closed)
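As a rough illustration of what "NIO-based" means, here is a minimal single-threaded Java NIO echo server; the port number and buffer size are arbitrary choices, and a real server would also need partial-write handling and proper error handling.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// One thread, many connections: the selector tells us which channels are ready.
public class NioEchoServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));   // arbitrary port for the example
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                      // block until something is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int read = client.read(buf);
                    if (read == -1) {               // client closed the connection
                        client.close();
                    } else {
                        buf.flip();
                        client.write(buf);          // echo the bytes back
                    }
                }
            }
        }
    }
}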
Then more generally, I'd say consider:
in whatever architecture you go for, how easily can you re-architecture/replace specific components if they prove to be bottlenecks?
for each "black box" component/framework that you use, what actual problem does it solve for you, and what are its limitations? (Don't just use Tomcat because your boss's mate's best man told them about it down the pub...)
I would also agree with what other people have said-- at some point you need to not be too theoretical. Design something sensible, then run a test bed to see how it actually copes with your expected volumes of data. (You might not have the whole app built, but you can start making predictions about "we're going to have X clients sending Y requests every Z minutes, and p% of those requests will take n milliseconds and write r rows to the database"...)
It seems the hype about cloud computing cannot be avoided, but the actual transition to that new platform is subject to many discussions...
From a Theoretical viewpoint, the following can be said:
Cloud:
architectural change (you might not install anything you want)
learning curve (because of the above)
no failover (since failure is taken care of)
granular cost (pay per GHz or GByte)
instantaneous scalability (not so instantaneous, but at least transparent?)
lower latency?
Managed:
failover (depends on provider)
manual scalability (requires maintenance)
static cost (you pay for the package, whether you use it fully or not)
lower cost (for entry-level packages only)
data ownership (you do)
liberty (you do)
lower latency? (depends on provider)
Whether or not the above is correct, a reasonable position is nevertheless "it depends..." on the application itself.
Now comes the hidden question: how would you profile your J2EE app to determine whether or not it is a candidate for the cloud, knowing that it:
is a quite big app in terms of the number of services/functions (i.e. servlets)
relies on a complex database (i.e. a large number of tables)
doesn't need many media resources, and is mostly text based
"Now comes the hidden question: how would you profile your j2ee app to determine if it is a candidate to cloud or not; knowing that it is"
As an aside, make that the Explicit question. Make it the TITLE of this question. Put it in the beginning of the question. If possible, delete all of your assumptions, and focus on the question.
Here's what we do.
Call some vendors for your "cloud" or "managed service" arrangement. Not too many. One or two of each.
Ask them what they support. More importantly, what they don't support.
Then, given a short list of features that aren't supported, look at your code for those features. If they don't support things you need, you have some architecture work to do. Or cross them off your preferred vendor list.
For good vendors, write a pilot contract that gives you free (or cheap) access for a few months to install and test. If it doesn't work, you haven't paid much.
"But why go through the expense of trying to host it when it may not work?"
What expense? You can spend months "studying" your code. Or you can try to host it. Usually, the "try to host it" will turn up an answer within a few days. It's less effort to just do it.
What sort of Cloud Service are you talking about? IaaS, PaaS, DaaS ?
architectural change (you might not install anything you want)
Depends: moving from a "managed server" to a Platform (e.g. GAE) might be.
learning curve (because of the above)
Amazon EC2 might not be a big learning curve if you are used to running your own server
no failover (since failure is taken care of)
Depends: EC2 -> you have to roll your own
instantaneous scalability (not so instantaneous, but at least transparent?); lower latency?
Depends: EC2 -> you have to plan for this / use an adjunct service
There are numerous cloud providers and as far as I've seen there are two main types of them:
Ones that support a cloud computing platform (e.g. Amazon EC2, MS Azure)
Virtual instance providers providing you ability to create numerous running instances (e.g. RightScale)
As far as the platforms, be aware that relational database support is still quite poor and the learning curve is quite long.
As for virtual instance providers the learning curve is really little (you just have to fire up your instances), but instances need some way of synchronizing... for a complex application this might not work.
As for your original question: I don't think there's any standard way you could profile whether an application should / could be moved to the cloud. You probably need to familiarize yourself with the options, narrow down to a few providers, and see if the benefits you would get from them would be a significant win over managed hosting (which is probably what you're currently doing).
In some ways comparing google app engine (gae) and amazon ec2 is like comparing apples and oranges.
With ec2 you get an operating system, with or without installed server software (tomcat, database, etc.; your choice, depending on which ami you choose). With ec2 you need a (or be a) system administrator to keep things running smoothly. Load balancing on ec2 is something you'll have to figure out and implement; I've never done that part. The big advantage with ec2 is that you can spin up and down new instances programatically, and compared to a regular web server provider, only pay for when your instance is up and running. You use this "auto spin up/down" to implement your load balancing and failover. But you have to do the implementation (again, I have no experience with this part).
With google app engine (gae), all of the system administration is taken care of for you. It also automatically scales out as needed, both on the app side and the database side. You also only pay for what you use; an idle app that gets no hits incurs no costs. The downsides to gae are that you're restricted in the languages you can use; python and java (or things that run on the jvm, like jruby). An even bigger downside is that the database is not sql (they don't call it a database; they call it the datastore) and will likely require reworking your ddl if you have an existing database; it's a definite cost in programmer time to understand how it works and how to use it effectively.
So if you're starting from scratch (or willing to rewrite) and you have the resources and time to learn its ways, gae may be the way to go. If you have a sysadmin or sysadmin skills and have the time or know how to set up load balancing and failover then ec2 could be the way to go. If your app is going to be idle a lot then ec2 is expensive; last time I checked it was something like $70 a month to leave a small instance up.
I think the points you have to focus on are:
granular cost (pay per GHz or GByte)
instantaneous scalability (not so instantaneous, but at least transparent?); lower latency?
Changing your application to run on a cloud will take a good amount of time, but it will not really matter if the cloud doesn't lower your costs and/or you don't really need instantaneous/fast scalability (the classic example is an eCommerce app).
After considering these two points, the one IMO you should think about is "relies on a complex database (i.e. a large number of tables)", since depending on its complexity, changing to a cloud environment can be really troublesome.
I'm working with a start-up, mostly doing system administration, and I've come across some security issues that I'm not really comfortable with. I want to judge whether my expectations are accurate, so I'm looking for some insight into what others have done in this situation, and what risks/problems came up. In particular, how critical are measures like placing admin tools behind a VPN, regular security updates (OS and tools), etc.?
Keep in mind that as this is a start-up, the main goal is to get as many features as possible out the door quickly, so I'll need as much justification as I can get to get the resources for security (i.e. downtime for upgrades, dev time for application security fixes).
Background Info:
Application is LAMP as well as a custom java client-server.
Over the next 3 months, I project about 10k anonymous visitors to the site and up to 1000 authenticated users.
Younger audience (16-25) which is guaranteed to have an above average number of black-hats included.
Thanks in advance for your responses, and I'll welcome any related advice.
Also, don't forget you need to have your server secured from current (that is, soon-to-be-past) employees. Several startups were totally wiped due to employee sabotage, e.g. http://www.geek.com/articles/news/disgruntled-employee-kills-journalspace-with-data-wipe-2009015/
If security isn't thought of and built into the application and its infrastructure from day one it will be much more difficult to retrofit it in later. Now is the time to build the processes for regular OS/tool patching, upgrades, etc.
What kind of data will users be creating/storing on the site?
What effect will a breach have on your users?
What effect will a breach have on your company?
Will you be able to regain the users' trust after a breach?
Since your company is dependent on keeping existing users and attracting new ones, you should present your concerns along the lines of how the users would react to a breach. The higher-ups will understand that the users are your bread and butter.
Reputation is everything here, especially for a startup. As a startup, you don't have a long history of reliability/security/... - so all depends on users to give you the 'benefit of the doubt' when they start using your app.
If your server gets hacked and your users notice that, your reputation is gone. Once it's gone, it doesn't matter whether your app and your features are the 'next new thing' or not. It doesn't matter whether the security breach was minor or not - people won't trust your app/company anymore.
So, I would consider security to be the top priority.
I agree with Stefan about reputation. You don't want to get hacked because you were lacking on security. Not only will that hurt your site and company, it will look bad on you since you're in charge of that.
My personal opinion is to do as much as you can because no matter how much you do there will be vulnerabilities.
Unfortunately, security, like testing and documentation, is often an afterthought. You should really make sure to do risk assessments early in your site/software's life and to keep on doing assessments. I think it is important to patch all software for security holes.
These will probably be obvious:
Limit password attempts.
Sanitize your database inputs (see the sketch after this list)
Measures to prevent XSS attacks
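For the database-input point, a minimal Java sketch using a parameterized query, which keeps user input from being interpreted as SQL; the table and column names here are hypothetical.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// User input is bound as a parameter, never concatenated into the SQL string,
// so it cannot change the structure of the query.
public class UserDao {
    public boolean passwordMatches(Connection conn, String username, String passwordHash)
            throws SQLException {
        String sql = "SELECT 1 FROM users WHERE username = ? AND password_hash = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, username);
            ps.setString(2, passwordHash);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}

The same idea applies whatever language the application is written in: bind user input as parameters rather than building query strings by concatenation.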
It's also worth mentioning that, as you said, the network architecture should be set up appropriately. You should definitely have a decent firewall that's locked down as much as possible. Some people recommend putting your systems between dual firewalls of different makes so that in the event one of them has a critical vulnerability, the second will most likely not have the same vulnerability and you'll be safe. It all depends on what you can afford since it's a startup.
If you're explicitly trying to attract the sort of users who are inclined to try to crack systems, then you can pretty well bet that your system will come under attack.
You should suggest to the management that if they're not going to take security seriously, then you should just go ahead and post the company's bank statements and accounting books (in clear text) on the site, with a prominent link from the home page. At least that way, you can tell them, the end result will be about the same, but they're less likely to damage everything else to get what they're looking for.
I'd think that the reputation issue might have a slightly different cast with this audience, too -- they may forgive you for being hacked, but they probably won't forgive you for being an easy target.
Make sure you know what version and patch level your servers are running: not just the OS, but all related components and everything that is actually executing on the machine.
Then make sure you are never more than a day behind.
Not doing so leads to much pain, and you don't hear of most of it. Most of my past employers would never publicly admit to being hacked, as it reflects badly on them, so you can assume systems are getting hacked left and right, with pretty serious consequences for the companies involved; you just don't hear about most of these events.
A few basic "security" measures here that while are more reactive than proactive, are some things to consider.
1) Backup strategy: of course this is not just for when someone hacks into your site, but it is nice to be able to restore everything to its pre-hack state if possible. Make sure the backup is reliable and, most importantly, has been tested in a near-live restore drill.
2) Mitigation: have plans in place, at least on a napkin somewhere, for how to react if the server is hacked.
3) Insurance: find insurance companies that understand the world of cyber-business and the damages resulting from these things, and buy policies.
4) Someone already mentioned employee sabotage problems; you're screening your employees beforehand, right? Background checks are cheap and do dig up stuff...
My best suggestion is monitoring.
There is no perfect security and it is all about accepting risks and preventing them when necessary. However, if you have no monitoring in place you will have no way to know if something (an attack) has succeeded and how it happened.
So, keep your system updated and install a few lightweight tools to monitor it properly. If you have custom applications, add logging in there. Log errors caused by bad input, failed password attempts, and any other user-generated errors.
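A tiny sketch of what logging failed password attempts might look like in Java; the class and method names are made up for this example.

import java.util.logging.Logger;

// Hypothetical hook in the login path: record failed attempts with enough
// context (who, from where) to spot brute-force patterns later.
public class LoginAudit {
    private static final Logger LOG = Logger.getLogger(LoginAudit.class.getName());

    public void recordFailure(String username, String remoteIp) {
        LOG.warning("Failed login for '" + username + "' from " + remoteIp);
    }
}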
As for lightweight tools to monitor, there are quite a few free/open-source options:
OSSEC (to look for anomalies, changes and logs)
modsecurity (web-based monitoring)
Sucuri (whois/dns/blacklisting monitoring)
Have a look at Mod Security for the various possibilities in the software setup:
Do a Google search for "mod_security howto example"
Simple example to start: http://www.ghacks.net/2009/07/15/install-mod_security-for-better-apache-security/