Sunday, December 16, 2007

 

Distribute your cache like the big boys do

Suppose you have a fairly high-volume, database-intensive ASP.NET website. Chances are that you have multiple front-end webservers in a load-balanced farm. The front end scales nicely this way. Depending on your load and the complexity of your queries, sooner or later your database will become your bottleneck. Among other tricks (such as reviewing your queries, upgrading hardware, partitioning your data), you will probably try to cache some of the query results in memory. This is suitable for read-only queries whose results do not change very often. You would use the ASP.NET Cache object to store your data.
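To make the idea concrete, here is a minimal sketch of caching a query result with an expiry time. It is written in Python only to keep it short and is not the ASP.NET Cache API; `TtlCache`, `query_product` and the key `product:42` are made-up names for illustration, and the ten-minute lifetime is just an assumption.

```python
import time


class TtlCache:
    """Tiny in-process cache with per-entry expiry (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader, ttl_seconds):
        """Return the cached value, or call loader() and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database round trip
        value = loader()  # missing or expired: go to the database
        self._entries[key] = (now + ttl_seconds, value)
        return value


db_calls = []  # records every simulated database query


def query_product(product_id):
    """Stand-in for a real (expensive) database query."""
    db_calls.append(product_id)
    return {"id": product_id, "name": "Product %d" % product_id}


cache = TtlCache()
first = cache.get_or_load("product:42", lambda: query_product(42), ttl_seconds=600)
second = cache.get_or_load("product:42", lambda: query_product(42), ttl_seconds=600)
```

The second call is answered from memory, so the simulated database is queried only once. Once the entry expires, the next request pays for the query again.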

Now here comes the problem: all of your webservers have their own private cache memory. So if five customers come within ten minutes and query your database for exactly the same product, chances are that each customer is served by a different webserver and the database is still hit five times. And the cached resultsets are now claiming five times the memory. Also, the amount of available cache memory in each server is limited (normally to about 800 MB, around 1200 MB if you use the /3GB switch, and much more if you run 64-bit Windows). Sooner or later, you'll hit that limit. So, while caching hugely increases the scalability of your solution, the caching solution itself does not scale out very well with many (>2) webservers. Also, depending on your site's characteristics, you might cache more aggressively if you had (much) more memory available.
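A small simulation shows the effect. This is only a sketch: it assumes a strict round-robin load balancer (the worst case for this scenario), uses plain dictionaries as stand-ins for each server's cache, and the key `product:42` is invented.

```python
from itertools import cycle


def count_db_hits(caches, requests):
    """Serve each request from the next server's cache (round-robin).

    Returns how many requests had to go to the database.
    """
    db_hits = 0
    for cache, key in zip(cycle(caches), requests):
        if key not in cache:
            db_hits += 1  # cache miss on this server: query the database
            cache[key] = "row for " + key
    return db_hits


requests = ["product:42"] * 5  # five customers, same product

# Five webservers, each with its own private cache.
private_caches = [{} for _ in range(5)]
private_hits = count_db_hits(private_caches, requests)

# Five webservers that all look at one shared cache.
shared = {}
shared_hits = count_db_hits([shared] * 5, requests)

# How many copies of the same resultset are held in memory.
private_copies = sum(1 for c in private_caches if "product:42" in c)
shared_copies = 1 if "product:42" in shared else 0
```

With private caches the database is queried five times and the resultset is stored five times. With one shared cache it is queried once and stored once.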

A solution for this situation may be a distributed cache, like memcached. This solution keeps each cached item in memory on only one machine and uses the key to determine which machine holds which item. It means that your cache must be accessed over the network, which is significantly slower than in-process memory, but nowadays much faster than file access (the bottleneck of your database server). Many of the largest sites (Facebook, Wikipedia, YouTube, LiveJournal) use memcached, which proves that it scales like hell, but I think it can be useful in scenarios much smaller than that. You could set up a number of cheap 64-bit Linux boxes with loads of memory. Note that these boxes need not even have a hard disk, and processor requirements are very modest. This allows you to create a huge amount of in-memory cache, opening up caching scenarios that you would normally dismiss without serious thought. A Win32 port is available, so you could also run an instance on each of your webservers, using just the memory you are now using for the ASP.NET Cache. If you happen to use NHibernate, it has a memcached caching provider for caching both object instances and queries.
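The "key determines the machine" part can be sketched in a few lines. This is a simplified illustration and not the code of any particular memcached client: it uses a plain CRC32-modulo scheme (real clients may use other hashing strategies, such as consistent hashing), the server names are hypothetical, and dictionaries stand in for the remote cache processes.

```python
import zlib

# Hypothetical cache boxes; 11211 is memcached's default port.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]


def server_for(key, servers=SERVERS):
    """Pick the one server responsible for this key (hash modulo server count)."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


class ShardedCache:
    """Client-side view of a distributed cache; dicts simulate the remote nodes."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.nodes = {s: {} for s in self.servers}

    def set(self, key, value):
        # Only the node chosen by the hash ever stores this key.
        self.nodes[server_for(key, self.servers)][key] = value

    def get(self, key):
        # Every webserver computes the same node for the same key.
        return self.nodes[server_for(key, self.servers)].get(key)


cache = ShardedCache(SERVERS)
for product_id in range(100):
    cache.set("product:%d" % product_id, {"id": product_id})

stored_items = sum(len(node) for node in cache.nodes.values())
```

Because every webserver computes the same server for the same key, each item lives in exactly one place, and adding boxes adds cache memory. One drawback of the simple modulo scheme is that changing the number of servers remaps most keys.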

What a pity that the Cache object has no plug-in model.


Comments:
http://aspalliance.com/cachemanager/

And this one


-- Kris
 
CacheManager is a handy tool that allows you to inspect and manage your ASP.NET cache at runtime.

Nice, but not related to what I describe above.
 
I received a referral to the SharedCache project on CodePlex:
http://www.codeplex.com/SharedCache
I have no idea of the maturity of the product, but it fits the requirements of this post nicely and a managed solution may have advantages. On the other hand: memcached has been tested in extreme circumstances and is highly optimized (better than a managed code solution could ever be, I think).
 