Two weeks ago Guy Rosen posted a very interesting analysis of the EC2 instance IDs which reveals how many instances (virtual machines) have been launched on EC2 since its beginning in 2006. We’ve also been digging in our records and I can share some interesting findings.
First of all, Guy’s analysis contains one significant error, which is due to the limited data set he had access to. Before May 2009 EC2 issued both even and odd instance IDs, not just even ones as he mentions. From that date EC2 issued only even IDs, until it switched to only odd ones in early September. The even/odd switches don’t seem to correlate with ID boundaries; perhaps Amazon switches between two active/standby reservation systems, or something else is going on.
The formula to convert an EC2 ID into a sequential launch number, as far as we can tell, is:
Given an AWS ID of the form i-11223333:
Assign p1 the 1's, p2 the 2's and p3 the 3's.
Also assign p31 the first two 3's and p32 the last two 3's.
Compute:
    c1 = (p1 ^ p32) ^ 0x69
    c2 = (p2 ^ p31) ^ 0xe5
    c3 = p3 ^ 0x4000
And finally concatenate c1-c2-c3.
(This does not include the even/odd adjustments.)
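The conversion above can be sketched in a few lines of Python. This is only an illustration under a couple of assumptions: the ID is taken to be "i-" followed by exactly eight hex digits, and "concatenate c1-c2-c3" is read as packing the two bytes and the 16-bit value into a single 32-bit integer. The function name `ec2_launch_number` is made up for this example, and, like the formula itself, the sketch leaves out the even/odd adjustments.

```python
def ec2_launch_number(instance_id):
    """Map an EC2 instance ID such as 'i-11223333' to a launch counter.

    Follows the formula in the post; the even/odd adjustments are not applied.
    """
    # Drop the "i-" prefix if present (assumed ID layout: i-XXXXXXXX).
    hex_part = instance_id[2:] if instance_id.startswith("i-") else instance_id
    if len(hex_part) != 8:
        raise ValueError("expected 8 hex digits, got %r" % instance_id)

    p1 = int(hex_part[0:2], 16)   # the "1's": first byte
    p2 = int(hex_part[2:4], 16)   # the "2's": second byte
    p3 = int(hex_part[4:8], 16)   # the "3's": last 16 bits
    p31 = p3 >> 8                 # first two 3's (high byte of p3)
    p32 = p3 & 0xFF               # last two 3's (low byte of p3)

    c1 = (p1 ^ p32) ^ 0x69
    c2 = (p2 ^ p31) ^ 0xE5
    c3 = p3 ^ 0x4000

    # Assumption: "concatenate c1-c2-c3" means c1 is the most significant byte.
    return (c1 << 24) | (c2 << 16) | c3


if __name__ == "__main__":
    print(hex(ec2_launch_number("i-11223333")))
```

Subtracting the numbers obtained for two IDs observed at different times would then give a rough count of launches in between, subject to the even/odd caveat.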
The upshot of Guy’s error is that he underestimates the launches by almost 2x! Here is a graph showing the instances launched daily since late 2006 that we would postulate based on his formula for instance IDs and what we’ve observed. We compute a total of 15.5 million instances (!) launched to date:

You can see that EC2 has been growing very steadily, except for dips during the holidays and a spike in activity in April of 2008. That spike was due to Animoto’s scaling to several thousands of servers within a few days. We’re a little puzzled about this spike, however, because the instance ID analysis shows about 2x more servers launched than Animoto actually launched (we launched them so we know). We believe this discrepancy to be temporary, but there remain some mysteries in the instance ID allocation…
It’s also important to be clear about what an instance launch means: namely, the launch of a virtual server. It says nothing about what size server is launched (and therefore its cost per hour) or how long that server runs (and therefore how many servers are running concurrently). As a result, an “instance launch” might mean as little as 10 cents in EC2 revenue (1 small instance for 1 hr) or, for example, $7008 in EC2 revenue (1 XL instance run for 365 days), or even more. That’s quite a difference, and it makes it challenging to calculate revenues based solely on total instance launch statistics.
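To make the spread concrete, here is a small back-of-the-envelope calculation in Python. The small-instance price of 10 cents per hour comes from the example above; the XL price of 80 cents per hour is our assumption, chosen because it is the rate implied by $7008 over 365 days. The names are purely illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours

# Hourly prices in US cents, kept as integers to avoid rounding noise.
SMALL_CENTS_PER_HOUR = 10  # from the "10 cents for 1 hr" example
XL_CENTS_PER_HOUR = 80     # assumption: implied by $7008 / 8760 hours


def launch_revenue_usd(cents_per_hour, hours):
    """Revenue in dollars for one launch running `hours` at a flat hourly price."""
    return cents_per_hour * hours / 100


low = launch_revenue_usd(SMALL_CENTS_PER_HOUR, 1)
high = launch_revenue_usd(XL_CENTS_PER_HOUR, HOURS_PER_YEAR)

if __name__ == "__main__":
    print("one small instance, one hour: $%.2f" % low)
    print("one XL instance, one year:    $%.2f" % high)
```

Both of these count as exactly one "instance launch" in the ID sequence, which is why launch counts alone say so little about revenue.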
Another interesting fact we have observed is that during 2009 many of the larger EC2 customers have been migrating to the larger instance sizes. In earlier days the predominant method of scaling was to launch more servers, but we are now seeing a lot more scaling by replacing smaller servers with larger ones. Those XL servers are going like hotcakes! In addition we see a clear rule: the larger the server, the longer it runs. A lot of the small servers go as quickly as they came: they’re used for experimentation, development, and testing. Once you launch a large server and fill it up with data, chances are you’ll keep it running for a while. Hold onto your wallet!
Another interesting trend we’ve seen is the improvement in sysadmin-to-server ratio. Our customers who grok the RightScale platform become very effective at managing lots of servers with few people: hundreds to thousands of servers per sysadmin. As a result they use servers aggressively to solve business needs, whether to keep up with exponential traffic or simply for flexibility during dev & test.
Overall, in terms of all cloud spending, in the last 12 months we’ve observed:
- Cloud infrastructure spending grew 380% – i.e. $$ spent on cloud provider resources
- Average cloud costs per customer grew 140% – i.e. cloud users on average are spending 2.5X more than a year ago
- RightScale’s own cloud infrastructure consumption grew 440%
That’s phenomenal growth – and testimony to the value of managed cloud computing.
Meanwhile, the beat goes on, and we’re all consuming more and more cloud resources as each day passes. If you have a story about your own cloud usage, or trends and patterns you’re seeing in cloud usage in general, please post a comment or send it in.

Great follow-up analysis on Guy Rosen’s insightful post (faulty as it may have been). Your observation that we cannot extrapolate financial information on Amazon’s cloud business is of course true. But I have a suggestion.
As this may be “the $1 billion question” and you guys probably have statistically significant data on Amazon instance size and duration usage, you can extrapolate financials from your sample statistics.
Geva, thanks for your comment, and just to be clear: Guy did all the hard work. We just matched it against our data and ironed out some details. The kudos are all his. With respect to revenue estimates, you are correct that we ought to be able to do a pretty good extrapolation, but we’re also a close partner of AWS and respect their strategy not to talk about the revenues generated by their services.
Great work RightScale team! It’s fabulous that you have all these IDs in your historical logs and were able to complete the picture. Firstly, I’d like to thank you for your definition of c2: it ironed out the missing piece which I described as “jumping around in a pattern which is yet to be explained”.
Are you able to generate any similar reports for EBS volumes, AMIs, etc.? I’m curious to see those expanded with the wealth of data at your disposal.
Guy, the work was almost all on your end, let’s be clear about that. Your analysis of the EC2 IDs really brings to light that these IDs are not random. Hopefully that will prevent people from using them for security or privacy sensitive purposes.
WRT the other IDs, we haven’t done the analysis. I have to admit it looks less motivating; the servers are really the most exciting.
Thank you very much Guy, Thorster and the RightScale team for sharing this.
@Thorster, is there a way to make the dataset available for EBS volumes and AMIs so those interested can run the analysis themselves?
frank
If anybody is concerned about having their identifiers studied to figure out how fast they are growing, here is the approach I use to scramble 32-bit integers so that I can unscramble them, but nobody else can (available in Perl and C, and easily portable to other sensible languages):
http://search.cpan.org/~esh/Crypt-Skip32/lib/Crypt/Skip32.pm
Once you approach 4 billion (a large number, by the way) you’ll need to extend your identifiers a bit or switch a prefix, perhaps giving away information, but then you can decide to do this when you’re at a mere 1 billion, thus confusing the seekers.
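For readers who want to see the general idea of a reversible 32-bit scramble, here is a minimal Python sketch. To be clear, this is not Skip32 and not the Crypt::Skip32 API: it is a toy four-round Feistel network that uses HMAC-SHA256 from the standard library as its round function, and it has not been vetted for security. The names `scramble` and `unscramble`, the round count, and the key are all assumptions made for this example.

```python
import hashlib
import hmac


def _round_value(key, round_no, half):
    """Keyed 16-bit round function (illustrative choice: truncated HMAC-SHA256)."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")


def scramble(value, key, rounds=4):
    """Map a 32-bit integer to another 32-bit integer, reversibly."""
    left, right = value >> 16, value & 0xFFFF
    for r in range(rounds):
        # One Feistel round: swap halves and mix in the keyed round value.
        left, right = right, left ^ _round_value(key, r, right)
    return (left << 16) | right


def unscramble(value, key, rounds=4):
    """Invert scramble() by running the rounds backwards."""
    left, right = value >> 16, value & 0xFFFF
    for r in reversed(range(rounds)):
        left, right = right ^ _round_value(key, r, left), left
    return (left << 16) | right


if __name__ == "__main__":
    key = b"example-secret-key"  # hypothetical key, for illustration only
    for counter in range(3):
        public_id = scramble(counter, key)
        print(counter, "%08x" % public_id, unscramble(public_id, key))
```

Because a Feistel network is a permutation, every sequential counter value maps to a distinct 32-bit ID, and only someone holding the key can map it back.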
Very nice idea Eric, thanks for posting it!