Posts Tagged ‘scaling’

24
Oct

tv2.dk on Rails on Amazon EC2

What if tv2.dk (one of the most popular Danish sites) were built on Rails and hosted on Amazon EC2? This post is a follow-up to the earlier post, tv2.dk on Rails.

Let’s assume that we have 50 requests/sec on average over 24 hours. The traffic could, for example, be distributed like this:

Hours      Traffic
00 - 08    5 requests/sec
08 - 11    30 requests/sec
11 - 12    150 requests/sec
12 - 14    250 requests/sec
14 - 15    150 requests/sec
15 - 24    30 requests/sec

This works out to 50 requests/sec on average. As we saw in the tv2.dk on Rails blog post, each quad-core server gets you 80 requests/sec. Unfortunately, we need enough servers to satisfy the peak number of requests, not the average. In this example we need four quad-core servers to serve the 250 requests/sec required from 12 to 14 (together they can actually serve 320 requests/sec). So on average one server would suffice, but during the peak hours we need four.
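The average and peak figures above can be checked with a few lines of Python. This is only a sketch: the traffic profile is the made-up distribution from the table, and the names (TRAFFIC, QUAD_CORE_CAPACITY) are mine.

```python
import math

# (start_hour, end_hour, requests_per_sec), copied from the table above
TRAFFIC = [
    (0, 8, 5),
    (8, 11, 30),
    (11, 12, 150),
    (12, 14, 250),
    (14, 15, 150),
    (15, 24, 30),
]

# Requests/sec per quad-core server, the estimate from the tv2.dk on Rails post
QUAD_CORE_CAPACITY = 80

total_hours = sum(end - start for start, end, _ in TRAFFIC)

# Time-weighted average: each rate counts for as many hours as it lasts
average_rps = sum((end - start) * rps for start, end, rps in TRAFFIC) / total_hours
peak_rps = max(rps for _, _, rps in TRAFFIC)

# Round up: a fraction of a server still means one more physical machine
servers_for_average = math.ceil(average_rps / QUAD_CORE_CAPACITY)
servers_for_peak = math.ceil(peak_rps / QUAD_CORE_CAPACITY)

print(f"average: {average_rps:.0f} requests/sec -> {servers_for_average} server(s)")
print(f"peak:    {peak_rps} requests/sec -> {servers_for_peak} server(s)")
```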

What if we were able to adjust the number of servers dynamically to satisfy the current load on the website?

Amazon EC2 to the rescue! Amazon EC2 is a service that lets you run one or many virtual servers, and you pay by the hour for each virtual server. So we can start and stop virtual machines depending on the load on the website. Each EC2 instance has one (virtual) CPU core, so with 10 Mongrels doing 2 requests/sec each it should be able to handle 20 requests/sec.
So how many servers do we need to run? Dividing the traffic in each period by 20 requests/sec and rounding up gives:

Hours      Traffic             # of servers
00 - 08    5 requests/sec      1
08 - 11    30 requests/sec     2
11 - 12    150 requests/sec    8
12 - 14    250 requests/sec    13
14 - 15    150 requests/sec    8
15 - 24    30 requests/sec     2
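The server counts in the table come from dividing each traffic level by the estimated capacity of one instance and rounding up. A minimal sketch, where instances_needed is a hypothetical helper name and 20 requests/sec per instance is the estimate from the text, not a measured number:

```python
import math

# Estimated capacity of one EC2 instance: 10 Mongrels x 2 requests/sec
EC2_CAPACITY = 20


def instances_needed(requests_per_sec, capacity=EC2_CAPACITY):
    """Smallest number of instances that covers the load (never fewer than one)."""
    return max(1, math.ceil(requests_per_sec / capacity))


for rps in (5, 30, 150, 250):
    print(f"{rps:>3} requests/sec -> {instances_needed(rps)} instance(s)")
```

In a real setup you would probably also want some headroom on top of this, so that an instance is already running before the load arrives.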

If we add up the number of “server hours” per day, we get 74. How much is this going to cost? Using the AWS Simple Monthly Calculator, it turns out to cost $222 for a 30-day month. Not too bad! Of course we have to add a server for static content (or use Amazon S3) as well as a couple of DB servers. With a small instance for static content and two large instances (virtual quad cores with 7.5 GB RAM) the monthly bill goes up to $811, still not too bad.
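The 74 server hours and the $222 can be reproduced as follows. Note that the hourly price of 10 cents per small instance is an assumption: it is simply the rate implied by $222 for 74 × 30 instance hours, not something taken from a price list. The sketch only covers the application instances, not the static content and DB servers.

```python
# (hours in the period, number of instances), from the table above
SCHEDULE = [(8, 1), (3, 2), (1, 8), (2, 13), (1, 8), (9, 2)]

# Assumption: price implied by $222 / (74 * 30) instance hours, in cents
HOURLY_PRICE_CENTS = 10
DAYS_IN_MONTH = 30

server_hours_per_day = sum(hours * instances for hours, instances in SCHEDULE)
monthly_cost_usd = server_hours_per_day * DAYS_IN_MONTH * HOURLY_PRICE_CENTS / 100

# For comparison: keeping the peak number of instances running around the clock
peak_instances = max(instances for _, instances in SCHEDULE)
always_on_hours_per_day = peak_instances * 24

print(f"server hours per day:     {server_hours_per_day}")
print(f"monthly cost (app only):  ${monthly_cost_usd:.0f}")
print(f"server hours if always on: {always_on_hours_per_day}")
```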

But you also have to pay for traffic. I will cover that in another blog posting.

22
Oct

tv2.dk on Rails

Recently I started wondering: What kind of server setup would it take to run one of the most popular sites in Denmark if it was running Rails?

According to FDIM (the association of Danish Internet media) tv2.dk was the third most visited site in Denmark during August 2007:

Number of users 1.435.208
Number of visits 15.233.358
Number of page views 142.132.680

142 million page views! That’s a lot! But how much data is this?

According to Alexa the three most popular subdomains of tv2.dk are galleri.tv2.dk, nyhederne.tv2.dk, and vip.tv2.dk.
Using YSlow we get a picture of the total page size and the number of HTTP requests for the main page of each subdomain:

Subdomain           Total size    HTTP requests
galleri.tv2.dk      140.1K        19
nyhederne.tv2.dk    580.2K        85
vip.tv2.dk          464.4K        58

The average page size is 395K and the average number of requests per page is 54.

The number of page views for August is 142.132.680. Spread over the 31 days of the month, this translates to an average of 53 page views/sec. Given the average page size and number of requests per page, this translates to an average bandwidth usage of 20.5 MB/sec and an average of 2866 requests/sec.
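Here is the arithmetic behind these numbers as a small Python sketch. It assumes that the three front pages are representative of the whole site, that the traffic is spread evenly over the month, and that 1 MB is 1024 KB (which is what matches the 20.5 MB/sec figure).

```python
PAGE_VIEWS_AUGUST = 142_132_680
SECONDS_IN_AUGUST = 31 * 24 * 60 * 60

# subdomain -> (total page size in KB, HTTP requests per page), from YSlow
PAGES = {
    "galleri.tv2.dk": (140.1, 19),
    "nyhederne.tv2.dk": (580.2, 85),
    "vip.tv2.dk": (464.4, 58),
}

avg_size_kb = sum(size for size, _ in PAGES.values()) / len(PAGES)
avg_requests = sum(reqs for _, reqs in PAGES.values()) / len(PAGES)

page_views_per_sec = PAGE_VIEWS_AUGUST / SECONDS_IN_AUGUST

# Assumption: 1 MB = 1024 KB
bandwidth_mb_per_sec = page_views_per_sec * avg_size_kb / 1024
http_requests_per_sec = page_views_per_sec * avg_requests

print(f"average page size:   {avg_size_kb:.0f}K")
print(f"requests per page:   {avg_requests:.0f}")
print(f"page views/sec:      {page_views_per_sec:.0f}")
print(f"bandwidth:           {bandwidth_mb_per_sec:.1f} MB/sec")
print(f"HTTP requests/sec:   {http_requests_per_sec:.0f}")
```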

What kind of server setup do you need to handle this amount of traffic?

First of all, by far most of the traffic comes from static content (images, JavaScript files, stylesheets). Any decent web server (Apache, nginx, or in tv2.dk’s case lighttpd) can serve this: this page shows that without too much hassle it is possible to achieve 10000+ requests/sec on a decent server with enough network bandwidth.

But what about the Rails part? If we assume that each page view corresponds to one Rails request, we have to handle 53 Rails requests/sec. If we assume that the average request time is 500 ms, each Mongrel can handle 2 requests/sec. So we need 27 Mongrels to handle 53 requests/sec. As noted in the excellent article Scale rails from one box to three, four and five by Courtenay, a rule of thumb is that each CPU core can handle 10 Mongrels. So with a quad-core machine we should be able to handle 2*10*4 = 80 Rails requests/sec. Each Mongrel uses in the range of 60-100 MB; taking 80 MB as a typical value, the machine has to have 4*10*80 = 3200 MB of RAM. Rails applications can be built to scale quite well, so each additional quad-core machine with 4 gigs of RAM should get you another 80 requests/sec. With 4 of these machines you should be able to handle 320 requests/sec.
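The same back-of-the-envelope calculation in Python, so the assumptions are easy to change. All the inputs are the estimates from the paragraph above (500 ms per request, 10 Mongrels per core, 80 MB per Mongrel), not measurements.

```python
import math

TARGET_RPS = 53             # Rails requests/sec we want to handle
AVG_REQUEST_TIME_SEC = 0.5  # assumed average request time
MONGRELS_PER_CORE = 10      # rule of thumb from Courtenay's article
CORES_PER_SERVER = 4
MB_PER_MONGREL = 80         # assumed typical value within the 60-100 MB range

# A Mongrel handles one request at a time, so its throughput is 1 / request time
rps_per_mongrel = 1 / AVG_REQUEST_TIME_SEC
mongrels_needed = math.ceil(TARGET_RPS / rps_per_mongrel)

mongrels_per_server = MONGRELS_PER_CORE * CORES_PER_SERVER
rps_per_server = mongrels_per_server * rps_per_mongrel
ram_per_server_mb = mongrels_per_server * MB_PER_MONGREL

print(f"Mongrels needed for {TARGET_RPS} requests/sec: {mongrels_needed}")
print(f"requests/sec per quad-core server: {rps_per_server:.0f}")
print(f"RAM per server for Mongrels: {ram_per_server_mb} MB")
print(f"requests/sec with 4 servers: {4 * rps_per_server:.0f}")
```

Try a request time of 250 ms instead and the number of Mongrels needed is cut in half, which shows how sensitive the estimate is to that one assumption.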

There must also be some database servers to handle all the data. There are several ways to set up these servers. Take a look at Courtenay’s article for an explanation of these.