February 25, 2007
Web Application Performance and Scalability
If you frequent community-driven news sites such as Digg and Reddit, you often come across posts on how to get every drop of performance out of your web application. The sister issue of scalability is often addressed as well, usually in its most visible manifestation: loss of service under overwhelming usage.
These performance and scalability guides often focus on software-level solutions. Use this data type. Use that API or framework. Use caching in your application, and here is how to implement it in Perl/Ruby/PHP/Java/.NET. There is always the tip on how to make a simple caching framework in your favorite web application language. There is always the list of settings to tune your web server and your script execution engine. These are all fine and dandy, but people always seem to overlook the obvious solution.
Most web applications generate the same data for successive requests, even highly dynamic, user-driven Web 2.0 sites. With this in mind, recall that all web applications share one simple trait: they communicate over HTTP. There is a server. There is a client. In between there is HTTP. Nothing more. Nothing less. And wouldn't you know it, the HTTP specification (RFC 2616, section 13) has a whole section on caching!
So, next time you are trying to squeeze more hits per second out of your web application, don't search the net for obscure hacks and optimizations. Turn to the specs. Adding a few extra HTTP headers to your dynamically generated responses and putting a web cache, such as Squid, in front of your main web server will do more than you realize. Even if you implement caching inside your script, you still pay the overhead of initializing the script for each request. In many cases, the functionality of your page output cache can be duplicated using HTTP caching.
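As a rough illustration of the "few extra HTTP headers" idea, here is a minimal Python sketch. The function names cache_headers and respond, the 60-second max-age, and the hash-based ETag are all assumptions made for this example, not anything prescribed by the HTTP spec or by Squid. It shows the headers a dynamic page could send so that a cache in front of it may reuse the response, plus a simplified conditional GET check.

```python
import hashlib
from email.utils import formatdate


def cache_headers(body: bytes, last_modified: float, max_age: int = 60) -> dict:
    """Build caching headers for a generated page (hypothetical helper).

    last_modified is a Unix timestamp; max_age is in seconds.
    """
    # A validator derived from the content; any scheme that changes when
    # the content changes would do. Truncated SHA-256 is an assumption.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        # Lets shared caches serve this response for max_age seconds
        # without contacting the origin server at all.
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
        "Last-Modified": formatdate(last_modified, usegmt=True),
    }


def respond(request_headers: dict, body: bytes, last_modified: float,
            max_age: int = 60):
    """Return (status, headers, body), answering 304 when the ETag matches.

    Simplified: a real If-None-Match header may hold a list of tags or "*".
    """
    headers = cache_headers(body, last_modified, max_age)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        return 304, headers, b""
    return 200, headers, body
```

Note that the 304 path still runs your script; the bigger win comes from max-age, which lets the cache answer on its own until the response goes stale.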
It makes no sense to execute a script for a simple HTTP GET request whose response has not changed since the last time it was generated. In an ideal world, the only time a request should actually hit your script server is when new data is being sent or when the HTTP cache does not have the data you seek.
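To make that ideal concrete, here is a toy in-memory model in Python of what a cache in front of your script server does. It is only a sketch: the class name ToyHttpCache, the fixed max_age, and the origin callback are hypothetical, and a real cache such as Squid honors Cache-Control, Vary, validators and much more. The point is simply to count how often the origin is actually hit.

```python
import time
from typing import Callable, Dict, Tuple


class ToyHttpCache:
    """Toy model of a caching proxy (hypothetical, not how Squid works)."""

    def __init__(self, origin: Callable[[str, str], bytes],
                 max_age: float = 60.0, clock=time.monotonic):
        self.origin = origin        # stands in for the script server
        self.max_age = max_age      # seconds a stored response stays fresh
        self.clock = clock          # injectable so freshness can be tested
        self.store: Dict[str, Tuple[float, bytes]] = {}
        self.origin_hits = 0

    def _fetch(self, method: str, path: str) -> bytes:
        self.origin_hits += 1
        return self.origin(method, path)

    def request(self, method: str, path: str) -> bytes:
        if method != "GET":
            # New data is being sent: always go to the origin and drop
            # any stored copy of this path, since it may now be outdated.
            self.store.pop(path, None)
            return self._fetch(method, path)
        entry = self.store.get(path)
        now = self.clock()
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]         # fresh hit: the script never runs
        body = self._fetch(method, path)  # miss or stale: ask the origin
        self.store[path] = (now, body)
        return body
```

With this model, a thousand GET requests for the same path within max_age cost the origin exactly one script execution.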
Now that you have been enlightened, your challenge is to implement HTTP caching. I admit, it can be difficult, especially when content on your site is highly dynamic. But just think of the smile on your face when you realize you just increased capacity by a factor of ten without buying new hardware. Never underestimate the power of HTTP.