Two weeks ago Guy Rosen posted a very interesting analysis of the EC2 instance IDs which reveals how many instances (virtual machines) have been launched on EC2 since its beginning in 2006. We’ve also been digging in our records and I can share some interesting findings.
First of all, Guy’s analysis contains one significant error which is due to the limited data set he had access to. Before May 2009 EC2 issued both even and odd instance IDs, not just even ones as he mentions. Since that date EC2 issued only even IDs until it switched to only odd ones in early September. The even/odd switches don’t seem to correlate with ID boundaries; perhaps Amazon switches between two active/standby reservation systems or something else is going on.
The formula to convert an EC2 ID into a sequential launch number, as far as we can tell, is:
Given an AWS ID as i-11223333:
Assign p1 the 1's, p2 the 2's and p3 the 3's.
Also assign p31 the first two 3's and p32 the last two 3's.
Compute:
c1 = (p1 ^ p32) ^ 0x69
c2 = (p2 ^ p31) ^ 0xe5
c3 = p3 ^ 0x4000
And finally concatenate c1-c2-c3.
(This does not include the even/odd adjustments.)
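To make the arithmetic concrete, here is a minimal Python sketch of the conversion described above. The function name launch_number is hypothetical, and packing c1-c2-c3 into a single 32-bit integer with c1 as the most significant byte is our reading of "concatenate"; like the formula itself, the sketch ignores the even/odd adjustments.

```python
def launch_number(instance_id: str) -> int:
    """Convert an ID of the form i-11223333 into a launch counter.

    Sketch only: follows the XOR formula in the text and does not
    apply any even/odd adjustment.
    """
    # Drop the "i-" prefix if present, leaving eight hex digits.
    hex_part = instance_id[2:] if instance_id.startswith("i-") else instance_id
    if len(hex_part) != 8:
        raise ValueError("expected eight hex digits after 'i-'")

    p1 = int(hex_part[0:2], 16)   # the 1's
    p2 = int(hex_part[2:4], 16)   # the 2's
    p3 = int(hex_part[4:8], 16)   # the 3's
    p31 = p3 >> 8                 # first two 3's
    p32 = p3 & 0xFF               # last two 3's

    c1 = (p1 ^ p32) ^ 0x69
    c2 = (p2 ^ p31) ^ 0xE5
    c3 = p3 ^ 0x4000

    # Assumption: c1 is the high byte, c2 the next byte, c3 the low 16 bits.
    return (c1 << 24) | (c2 << 16) | c3


print(hex(launch_number("i-11223333")))
```

For the placeholder ID i-11223333 this works out to 0x4bf47333, and under the same formula the ID i-69a54000 maps to zero.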
The upshot of Guy’s error is that he underestimates the launches by almost 2x! Here is a graph showing the instances launched daily since late 2006 that we would postulate based on his formula for instance IDs and what we’ve observed. We compute a total of 15.5 million instances (!) launched to date:

You can see that EC2 has been growing very steadily, except for dips during the holidays and a spike in activity in April of 2008. That spike was due to Animoto’s scaling to several thousand servers within a few days. We’re a little puzzled about this spike, however, because the instance ID analysis shows about 2x more servers launched than Animoto actually launched (we launched them so we know). We believe this discrepancy to be temporary, but there remain some mysteries in the instance ID allocation…
It’s also important to be clear about what an instance launch means: namely, the launch of a virtual server. It says nothing about what size server is launched (and therefore its cost per hour) or how long that server runs (and therefore how many servers are running concurrently). As a result, an “instance launch” might mean as little as 10 cents in EC2 revenue (1 small instance for 1 hr) or, for example, $7008 in EC2 revenue (1 XL instance run for 365 days), or even more. That’s quite a difference, and makes it challenging to calculate revenues based solely on total instance launch statistics.
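As a rough illustration of that spread, the two examples can be reproduced with a few lines of Python. The 10 cents per hour small-instance rate comes from the example above; the 80 cents per hour XL rate is not quoted directly but is what $7008 over 365 days implies (7008 / 8760 hours). The function name is hypothetical, and amounts are kept in integer cents to avoid floating-point rounding.

```python
HOURS_PER_DAY = 24


def launch_revenue_cents(rate_cents_per_hour: int, hours: int) -> int:
    """Revenue from a single instance launch, in cents."""
    return rate_cents_per_hour * hours


# One small instance for one hour (rate taken from the text).
small_one_hour = launch_revenue_cents(10, 1)

# One XL instance for 365 days (rate inferred from the $7008 figure).
xl_one_year = launch_revenue_cents(80, 365 * HOURS_PER_DAY)

print(f"small, 1 hour:  ${small_one_hour / 100:.2f}")
print(f"XL, 365 days:   ${xl_one_year / 100:.2f}")
print(f"ratio:          {xl_one_year // small_one_hour}x")
```

Two launches that count identically in the instance ID sequence can therefore differ by a factor of about 70,000 in revenue.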
Another interesting fact we have observed is that during 2009 many of the larger EC2 customers have been migrating to the larger instance sizes. In earlier days the predominant method of scaling was launching more servers, but we are now seeing a lot more scaling by replacing smaller servers with larger ones. Those XL servers are going like hotcakes! In addition, we see a clear pattern: the larger the server, the longer it runs. A lot of the small servers go as quickly as they came: they’re used for experimentation, development, and testing. Once you launch a large server and fill it up with data, chances are you’ll keep it running for a while. Hold onto your wallet!
Another interesting trend we’ve seen is the improvement in sysadmin-to-server ratio. Our customers who grok the RightScale platform become very effective at managing lots of servers with few people. Hundreds to thousands per sysadmin. As a result they use servers aggressively to solve business needs — whether to keep up with exponential traffic or simply flexibility during dev & test.
Overall, in terms of all cloud spending, in the last 12 months we’ve observed:
- Cloud infrastructure spending grew 380% – i.e. $$ spent on cloud provider resources
- Average cloud costs per customer grew 140% – i.e. cloud users on average are spending about 2.4X what they spent a year ago
- RightScale’s own cloud infrastructure consumption grew 440%
That’s phenomenal growth – and testimony to the value of managed cloud computing.
Meanwhile, the beat goes on, and we’re all consuming more and more cloud resources as each day passes. If you have a story about your own cloud usage, or trends and patterns you’re seeing in cloud usage in general, please post a comment or send it in.
Anatomy of an Amazon EC2 Resource ID › ec2base said
[...] http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/ [...]
egrep-cloud-cambrian-watch-2009-10-06 « すでにそこにある雲 said
[...] Amazon Usage Estimates « RightScale Blog [...]
Geva Perry said
Great follow-up analysis on Guy Rosen’s insightful post (faulty as it may have been). Your observation that we cannot extrapolate financial information on Amazon’s cloud business is of course true. But I have a suggestion.
As this may be “the $1 billion question” and you guys probably have statistically significant data on Amazon instance size and duration usage, you can extrapolate financials from your sample statistics.
Thorsten said
Geva, thanks for your comment, and just to be clear: Guy did all the hard work. We just matched it against our data and ironed out some details. The kudos are all his. With respect to revenue estimates you are correct that we ought to be able to do a pretty good extrapolation, but we’re also a close partner of AWS and respect their strategy not to talk about the revenues generated by their services.
Guy Rosen said
Great work RightScale team! It’s fabulous that you have all these IDs in your historical logs and were able to complete the picture. Firstly, I’d like to thank you for your definition of c2: it ironed out the missing piece which I described as “jumping around in a pattern which is yet to be explained”.
Are you able to generate any similar reports for EBS volumes, AMIs, etc.? I’m curious to see those expanded with the wealth of data at your disposal.
Thorsten said
Guy, the work was almost all on your end, let’s be clear about that. Your analysis of the EC2 IDs really brings to light that these IDs are not random. Hopefully that will prevent people from using them for security or privacy sensitive purposes.
WRT the other IDs, we haven’t done the analysis. I have to admit it looks less motivating; the servers are really the most exciting.
Anatomy of an Amazon EC2 Resource ID :: Jack of all Clouds :: Guy Rosen on Cloud Computing said
[...] (Oct 7th 2009): RightScale applied the findings for the two years worth of data they have in their systems. Based on that data, they estimate the [...]
Amazon Usage Estimates and Updates :: Jack of all Clouds :: Guy Rosen on Cloud Computing said
[...] decided to apply the findings to the mountain of EC2 data they have – a few years worth. Firstly, this solved a few of the [...]
Eric Hammond said
If anybody is concerned about having their identifiers studied to figure out how fast they are growing, here is the approach I use to scramble 32-bit integers so that I can unscramble them, but nobody else can (available in Perl and C, and easily portable to other sensible languages):
http://search.cpan.org/~esh/Crypt-Skip32/lib/Crypt/Skip32.pm
Once you approach 4 billion (a large number, by the way) you’ll need to extend your identifiers a bit or switch a prefix, perhaps giving away information, but then you can decide to do this when you’re at a mere 1 billion, thus confusing the seekers.
Thorsten said
Very nice idea Eric, thanks for posting it!
No more servers et la progression du Cloud chez Amazon said
[...] http://blog.rightscale.com/2009/10/05/amazon-usage-estimates/ [...]
Amazon Usage Estimates « RightScale Blog « urban-listening said
[...] October 12, 2009 via blog.rightscale.com [...]
Wow… “Cloud-compute Power” at it’s Best. 50k CPU Launches a Day! « My blog said
[...] pioneering cloud-compute friends at RightScale published some very interesting observations about Amazon Web Services’ Elastic Compute Cloud (EC2) daily CPU launches. They estimate a daily [...]
Turbo Talk » Amazon MFA and VPS, Hyper-V FUD, and physical datacenter design tips said
[...] some of the security concerns that are inherent to cloud deployments. Also, Amazon partner Rightscale also posted some interesting statistics about cloud growth over the past year, which were derived from their internal and customer [...]
Amazon EC2 Performance Drops – Too Many Users | The "Break it Down" Blog said
[...] some of the available usage statistics out there about Amazon EC2 load, it may not be surprising that the existing underlying hardware is starting to buckle under the [...]
Revisiting EC2 Instance IDs :: Jack of all Clouds :: Guy Rosen on Cloud Computing said
[...] at hand we can now uncover the constants needed for all EC2 regions. Except for us-east-1 which thanks to RightScale enjoyed a 3-year history, we did not have enough data to extract the constants for other regions. [...]
Alexey Bokov’s weblog » Blog Archive » Cloud computing links – August 2010 said
[...] Anatomy of an Amazon EC2 Resource ID and based on this anatomy EC2 usage estimates [...]
боков блог » Blog Archive » Amazon Web Services – немного фактов said
[...] id and a calculation of the number of instances launched daily – Amazon EC2 usage estimates – about 50 thousand instances within a single geographic zone [...]