Enjoying Rails

November 4, 2007

Creating an Amazon EC2 Ubuntu 6.06 LTS server edition image

Filed under: Uncategorized — enjoyingrails @ 18:04

Last week I decided to try out Amazon EC2 mainly for running and testing Rails applications and so far it has been great fun!

This blog posting describes how I created an Ubuntu 6.06 LTS server edition image for use with Amazon EC2.

Download and install the EC2 command line tools

mkdir ~/.ec2
cd ~/.ec2
curl -O http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
unzip ec2-api-tools.zip
ln -s ec2-api-tools-1.2-13740 ec2-api-tools

Set the environment variables necessary to run the tools

export EC2_HOME=~/.ec2/ec2-api-tools
export PATH=$PATH:$EC2_HOME/bin
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home/

Download your private key and certificate from your Amazon Web Services account to the ~/.ec2 folder.

Point the tools at your credentials and generate a key pair

export EC2_PRIVATE_KEY=~/.ec2/pk-8WU9XGOPO65IKA7O96M2KEKVOS5288KU.pem
export EC2_CERT=~/.ec2/cert-8WU9XGOPO65IKA7O96M2KEKVOS5288KU.pem
ec2-add-keypair gsg-keypair > ~/.ec2/id_rsa-gsg-keypair
chmod 600 ~/.ec2/id_rsa-gsg-keypair

Launch a Fedora Core 4: Base instance.

ec2-run-instances ami-20b65349 -k gsg-keypair

This returns an instance number like i-9536dcfc.

Try running

ec2-describe-instances i-9536dcfc

until the status returned is no longer ‘pending’ but ‘running’.
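
Instead of re-running the command by hand, the polling can be scripted. This is only a minimal sketch: it assumes the output of ec2-describe-instances contains the word 'running' once the instance is up, and the names wait_until_running and DESCRIBE_CMD are made up for this example.

```shell
# Command used to query the instance state. It can be overridden so the
# loop can be tried without the EC2 tools installed (hypothetical variable).
DESCRIBE_CMD="${DESCRIBE_CMD:-ec2-describe-instances}"

# Poll until the instance reports 'running'.
# $1 = instance id, $2 = seconds between polls (default 10).
wait_until_running() {
  until $DESCRIBE_CMD "$1" | grep -q 'running'; do
    sleep "${2:-10}"
  done
  echo "$1 is running"
}
```

Calling wait_until_running i-9536dcfc then blocks until the instance is up. Note that the loop never ends if the instance fails to start, so keep an eye on it.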

Allow ssh access and log in

ec2-authorize default -p 22
ssh -i ~/.ec2/id_rsa-gsg-keypair root@ec2-67-202-21-218.compute-1.amazonaws.com

On the EC2 instance run

wget http://erichsen.net/blog/fc4-base
chmod 755 fc4-base
./fc4-base

fc4-base is a script found in this forum posting. I adapted it to create an Ubuntu 6.06 image instead of 6.10.

After the script has finished, copy the private key and certificate from the local machine to the EC2 instance

scp -i ~/.ec2/id_rsa-gsg-keypair ~/.ec2/pk-8WU9XGOPO65IKA7O96M2KEKVOS5288KU.pem root@ec2-67-202-21-218.compute-1.amazonaws.com:/root/
scp -i ~/.ec2/id_rsa-gsg-keypair ~/.ec2/cert-8WU9XGOPO65IKA7O96M2KEKVOS5288KU.pem root@ec2-67-202-21-218.compute-1.amazonaws.com:/root/

Create an image and sign it with the private key

ec2-bundle-image -i /mnt/ubuntu606base.img -k /root/pk-8WU9XGOPO65IKA7O96M2KEKVOS5288KU.pem -c /root/cert-8WU9XGOPO65IKA7O96M2KEKVOS5288KU.pem -u '5171-9220-6573'

The image must be stored on S3 so I create a bucket (from my local Mac)

sudo gem i aws-s3 -y
export AMAZON_ACCESS_KEY_ID="1IQ8AHOAWNRDMQOI91ZK"
export AMAZON_SECRET_ACCESS_KEY="VCddmbA9C4D8w/mw6aLZjzCkMoEyx5EUvouJdY/4"
s3sh
Bucket.create('erichsen.net')

On the EC2 instance I upload the image

ec2-upload-bundle -b erichsen.net -m /tmp/ubuntu606base.img.manifest.xml -a '1IQ8AHOAWNRDMQOI91ZK' -s 'VCddmbA9C4D8w/mw6aLZjzCkMoEyx5EUvouJdY/4'

From my local Mac I register the image and that’s it!

ec2-register erichsen.net/ubuntu606base.img.manifest.xml

The ec2-register command returns an AMI id, in this case ami-4acd2823. Let’s try it out

ec2-run-instances ami-4acd2823 -k gsg-keypair
ec2-describe-instances i-9536dcfc
ssh -i ~/.ec2/id_rsa-gsg-keypair root@ec2-67-202-24-151.compute-1.amazonaws.com

YES! I was able to ssh into an EC2 instance running my Ubuntu 6.06 image! Note that the ssh configuration of the image is not the best – it allows root logins which is not in general a good idea.

Hope this helps someone else wanting to play with Ubuntu on EC2.

October 24, 2007

tv2.dk on Rails on Amazon EC2

Filed under: rails, scaling — enjoyingrails @ 20:53

What if tv2.dk (one of the most popular Danish sites) was running Rails and ran on Amazon EC2? This posting is a follow-up to the tv2.dk on Rails posting.

Let’s assume that we on average during 24 hours have 50 requests/sec. The traffic during the 24 hours could for example be distributed like this

Hours      Traffic
00 – 08    5 requests/sec
08 – 11    30 requests/sec
11 – 12    150 requests/sec
12 – 14    250 requests/sec
14 – 15    150 requests/sec
15 – 24    30 requests/sec

This works out to be 50 requests/sec on average. As we saw in the tv2.dk on Rails blog posting each quad core server will get you 80 requests/sec. Unfortunately we have to have servers that have enough power to satisfy the peak number of requests and not the average number of requests. In this example we need four quad core servers in order to be able to serve the 250 requests/sec needed from 12-14 (actually they are able to serve 320 requests/sec). So on average we need one server but in the peak hours we need four servers.
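
The average and the peak can be checked with a few lines of shell. This is a sketch of the arithmetic only: the traffic profile is copied from the table above, the figure of 80 requests/sec per quad core server comes from the earlier posting, and the function names are made up for this example.

```shell
# Traffic profile from the table above, one slot per line:
# "<hours in slot> <requests/sec>".
profile="8 5
3 30
1 150
2 250
1 150
9 30"

# Time-weighted average requests/sec over the whole day.
average_rps() {
  echo "$profile" | awk '{ total += $1 * $2; hours += $1 } END { print total / hours }'
}

# Highest requests/sec in any slot.
peak_rps() {
  echo "$profile" | awk '$2 + 0 > max { max = $2 + 0 } END { print max }'
}

# Servers needed for a load, rounded up.
# $1 = requests/sec, $2 = requests/sec one server can handle.
servers_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

echo "average: $(average_rps) requests/sec"
echo "peak:    $(peak_rps) requests/sec"
echo "quad core servers for the average: $(servers_needed "$(average_rps)" 80)"
echo "quad core servers for the peak:    $(servers_needed "$(peak_rps)" 80)"
```

The gap between the last two lines is the capacity that sits idle outside the peak hours.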

What if we were able to adjust the number of servers dynamically to satisfy the current load on the website?

Amazon EC2 to the rescue! Amazon EC2 is a service that allows you to run one or many virtual servers and you pay by the hour for each virtual server. So we can start and stop a number of virtual machines depending on the load on the website. Each EC2 instance has one (virtual) CPU core, so with 10 Mongrels each doing 2 requests/sec it should be able to handle 20 requests/sec.

So how many servers do we need to run?

Hours      Traffic             # of servers
00 – 08    5 requests/sec      1
08 – 11    30 requests/sec     2
11 – 12    150 requests/sec    8
12 – 14    250 requests/sec    13
14 – 15    150 requests/sec    8
15 – 24    30 requests/sec     2

If we add up the number of “server hours” it comes to 74 per day. How much is this going to cost? Using the AWS Simple Monthly Calculator it turns out to cost $222 for a 30 day month. Not too bad! Of course we have to add a server for static content (or use Amazon S3) as well as a couple of DB servers. With a small instance server for static content and two large instances (virtual quad cores with 7.5 GB RAM) the monthly bill goes up to $811, which is still not too bad.
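
The server hours and the bill for the Rails instances can be reproduced as follows. This is a sketch under stated assumptions: the slots and the 20 requests/sec per instance come from the text above, while the price of 10 cents per instance hour is not stated in the posting and is inferred from the $222 figure. The function names are made up for this example.

```shell
# Slots from the table above: "<hours in slot> <requests/sec>".
profile="8 5
3 30
1 150
2 250
1 150
9 30"

# Instance hours per day. Each slot needs ceil(load / capacity) instances.
# $1 = requests/sec one instance can handle.
server_hours_per_day() {
  echo "$profile" | awk -v cap="$1" '
    { n = int(($2 + cap - 1) / cap); total += $1 * n }
    END { print total }'
}

# Monthly cost in whole dollars.
# $1 = instance hours per day, $2 = days, $3 = price in cents per hour.
monthly_cost() {
  echo $(( $1 * $2 * $3 / 100 ))
}

hours=$(server_hours_per_day 20)
echo "server hours per day: $hours"
# 10 cents/hour is an assumed price, see the note above.
echo "cost for 30 days: $(monthly_cost "$hours" 30 10) USD"
```

This covers only the Rails instances. The static content server and the DB servers are not included.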

But you also have to pay for traffic. I will cover that in another blog posting.

October 22, 2007

tv2.dk on Rails

Filed under: rails, scaling — enjoyingrails @ 22:02

Recently I started wondering: What kind of server setup would it take to run one of the most popular sites in Denmark if it was running Rails?

According to FDIM (the association of Danish Internet media) tv2.dk was the third most visited site in Denmark during August 2007:

Number of users 1.435.208
Number of visits 15.233.358
Number of page views 142.132.680

142 million page views! That’s a lot! But how much data is this?

According to Alexa the three most popular subdomains of tv2.dk are galleri.tv2.dk, nyhederne.tv2.dk, and vip.tv2.dk.
Using YSlow we get a picture of the total page size and the number of http requests for the main page of each subdomain:

Subdomain           Total size    HTTP requests
galleri.tv2.dk      140.1K        19
nyhederne.tv2.dk    580.2K        85
vip.tv2.dk          464.4K        58

The average page size is 395K and the average number of requests per page is 54.

The number of page views for August is 142.132.680. This translates to an average of 53 page views/sec. Given the average page size and number of requests this translates to an average bandwidth usage of 20.5 MB/sec and an average of 2866 requests/sec.
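
For anyone who wants to check the numbers, the calculation can be sketched in shell with awk. The inputs are the FDIM and YSlow figures quoted above. Two things are assumptions: that the averages are simple means over the three front pages, and that 1 MB is counted as 1024K. The function name is made up for this example.

```shell
# Prints average page views/sec, bandwidth and HTTP requests/sec
# for August 2007 (31 days).
traffic_summary() {
  awk 'BEGIN {
    page_views = 142132680                    # FDIM figure for August 2007
    seconds    = 31 * 24 * 3600               # seconds in August
    avg_kb     = (140.1 + 580.2 + 464.4) / 3  # mean page size in K
    avg_reqs   = (19 + 85 + 58) / 3           # mean HTTP requests per page
    pv_sec     = page_views / seconds

    printf "page views/sec: %.0f\n", pv_sec
    printf "bandwidth: %.1f MB/sec\n", pv_sec * avg_kb / 1024
    printf "requests/sec: %.0f\n", pv_sec * avg_reqs
  }'
}

traffic_summary
```

Only the front page of each subdomain was measured, so these are rough estimates rather than measurements of real traffic.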

What kind of server setup do you need to handle this amount of traffic?

First of all, by far most traffic comes from static content (images, javascript files, stylesheets). Any decent web server (Apache, nginx and, in tv2.dk’s case, lighttpd) can serve this: this page shows that without too much hassle it is possible to achieve 10000+ requests/sec on a decent server with enough network bandwidth.

But what about the Rails part? If we assume that each page view corresponds to one Rails request we have to handle 53 Rails requests/sec. If we assume that the average request time is 500 ms, each Mongrel can handle 2 requests/sec. So we need 27 Mongrels to be able to handle 53 requests/sec. As noted in the excellent article Scale rails from one box to three, four and five by Courtenay, a rule of thumb is that each CPU core can handle 10 Mongrels. So with a quad core machine we should be able to handle 2*10*4 = 80 Rails requests/sec. Each Mongrel uses in the range of 60-100 MB, so assuming about 80 MB each the machine has to have 4*10*80 = 3200 MB RAM. Rails applications can be built to scale quite well, so adding additional quad core machines with 4 gigs of RAM should get you another 80 requests/sec each. With 4 of these machines you should be able to handle 320 requests/sec.
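
The sizing above boils down to a handful of multiplications, sketched here in shell. All inputs are the assumptions from the paragraph (500 ms per request, 10 Mongrels per core, 80 MB per Mongrel as a midpoint of the 60-100 MB range). The variable and function names are made up for this example.

```shell
request_ms=500        # assumed average Rails request time
mongrels_per_core=10  # rule of thumb from Courtenay's article
cores=4               # quad core machine
mb_per_mongrel=80     # assumed midpoint of the 60-100 MB range

# Requests/sec a single Mongrel can serve.
rps_per_mongrel=$(( 1000 / request_ms ))

# Mongrels needed for a load of $1 requests/sec, rounded up.
mongrels_needed() {
  echo $(( ($1 + rps_per_mongrel - 1) / rps_per_mongrel ))
}

# Requests/sec that $1 quad core machines can serve.
machine_capacity() {
  echo $(( $1 * cores * mongrels_per_core * rps_per_mongrel ))
}

# RAM in MB needed by the Mongrels on one machine.
ram_per_machine() {
  echo $(( cores * mongrels_per_core * mb_per_mongrel ))
}

echo "Mongrels for 53 requests/sec: $(mongrels_needed 53)"
echo "requests/sec per machine:     $(machine_capacity 1)"
echo "requests/sec with 4 machines: $(machine_capacity 4)"
echo "RAM per machine:              $(ram_per_machine) MB"
```

Changing request_ms shows how sensitive the whole setup is to the average request time: at 250 ms the same machine would serve twice as many requests.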

There must be some database servers to handle all the data. There are several ways to set up these servers. Take a look at Courtenay’s article for an explanation of these.
