Releng has been successfully reducing the Amazon bill recently. We managed to drop the bill from $115K to $75K per month in February.
To make this happen we switched to a cheaper instance type, started using spot instances for regular builds, and started bidding for spot instances more intelligently. Introducing the jacuzzi approach also reduced the load by cutting build times.
More details below.
When we first tried to switch from m3.xlarge ($0.45 per hour) to c3.xlarge ($0.30 per hour), we hit an interesting issue: ruby wouldn't execute anything and died with a segmentation fault. It turned out that using paravirtual (PV) kernels on the c3 instance type is not a good idea, since this instance type "prefers" HVM virtualization, unlike the m3 instance types.
Massimo did a great job and ported our AMIs from PV to HVM.
This switch from PV to HVM went pretty smoothly, except that we had to add a swap file: linking libxul requires a lot of memory and the existing 7.5G wasn't enough.
This transition saved us more or less 1/3 of the build pool bill.
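As a quick sanity check on the "1/3" figure, here is a minimal sketch that computes the per-hour saving from the two on-demand prices quoted above. It only looks at the hourly price of a single instance; the real bill also depends on how many instances run and for how long, which this ignores.

```python
# Hourly on-demand prices quoted in the post (USD per hour).
M3_XLARGE_PRICE = 0.45
C3_XLARGE_PRICE = 0.30


def hourly_saving_fraction(old_price, new_price):
    """Return the fraction of the hourly cost saved by switching instance types."""
    return (old_price - new_price) / old_price


saving = hourly_saving_fraction(M3_XLARGE_PRICE, C3_XLARGE_PRICE)
print(f"saving per instance-hour: {saving:.1%}")
```

For these two prices the fraction works out to one third, which matches the saving we saw on the build pool bill.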
Smarter spot bidding
We used to bid blindly:
- I want this many spot instances in this availability zone and this is my maximum price. Bye!
- But the current price is soooo high! ... Hello? Are you there?.. Sigh...
- Hmm, where are my spot instances?! I want twice as many spot instances in this zone and this is my maximum price. Bye!
As a part of this transition we reduced the number of on-demand builders from 400 to 100!
Additionally, we can now use different instance types for builders, and we sometimes get good prices for the c3.2xlarge instance type.
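To illustrate the difference from blind bidding, here is a rough sketch of the idea: look at the current spot prices first, and only bid where the price is at or below our maximum for that instance type. This is not our actual bidding code. The `SpotOffer` type, the `pick_offer` function, and all prices and zones below are made up for the example, and a real version would fetch current prices from the EC2 API instead of a hard-coded list.

```python
from typing import Dict, List, NamedTuple, Optional


class SpotOffer(NamedTuple):
    """A current spot price for one instance type in one availability zone."""
    instance_type: str
    availability_zone: str
    current_price: float  # USD per hour


def pick_offer(offers: List[SpotOffer],
               max_bids: Dict[str, float]) -> Optional[SpotOffer]:
    """Return the cheapest offer priced at or below our maximum bid, or None.

    Instance types that have no entry in max_bids are never selected.
    """
    affordable = [
        offer for offer in offers
        if offer.instance_type in max_bids
        and offer.current_price <= max_bids[offer.instance_type]
    ]
    if not affordable:
        return None  # Everything is too expensive right now; do not bid.
    return min(affordable, key=lambda offer: offer.current_price)


# Hypothetical prices and limits, for illustration only.
offers = [
    SpotOffer("c3.xlarge", "us-east-1a", 0.35),
    SpotOffer("c3.xlarge", "us-east-1b", 0.07),
    SpotOffer("c3.2xlarge", "us-west-2a", 0.12),
]
max_bids = {"c3.xlarge": 0.25, "c3.2xlarge": 0.50}

best = pick_offer(offers, max_bids)
print(best)
```

The sketch ranks offers by raw hourly price, which is a simplification: a c3.2xlarge costs more per hour but is a bigger machine, so a fairer comparison would weight the price by what each instance type can do.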
As a part of the s/m3.xlarge/c3.xlarge/ transition we also introduced a couple of other improvements:
* Reduced EBS storage use
* Started using SSD instance storage for try builds. All your try builds are on SSDs now!
Using instance storage is not an easy thing, so we had to re-invent our own wheel to format/mount the storage on boot.
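For a feel of what "format/mount the storage on boot" involves, here is a minimal sketch, not the script we actually run. It assumes a single instance-storage device; the device name `/dev/xvdb`, the mount point `/mnt/builds`, and the helper names are hypothetical. By default it only prints the commands, because running them for real needs root and wipes the device.

```python
import subprocess


def plan_instance_storage(device, mount_point, fstype="ext4"):
    """Build the commands that format one device and mount it."""
    return [
        ["mkfs", "-t", fstype, device],    # create a fresh filesystem
        ["mkdir", "-p", mount_point],      # make sure the mount point exists
        ["mount", device, mount_point],    # mount the new filesystem
    ]


def run_plan(plan, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for command in plan:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True aborts on the first failing command.
            subprocess.run(command, check=True)


# Hypothetical device and mount point, for illustration only.
plan = plan_instance_storage("/dev/xvdb", "/mnt/builds")
run_plan(plan)
```

Instance storage starts out empty on a fresh instance, which is why the formatting step has to happen at boot rather than being baked into the AMI.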