Enterprise-class Software becoming available in the Cloud through RightScale

More and more enterprise-class software vendors are making their software available in the cloud and doing it through RightScale. Over the past two weeks the IBM DB2 team made DB2 Express-C v9.7 available, SpringSource published Hyperic HQ, and CohesiveFT published VPN-Cubed, all on the RightScale platform. Publishing software to the cloud is still a somewhat mysterious activity. While almost all software runs in infrastructure cloud environments such as Amazon EC2, publishing to the cloud creates new expectations and opportunities. Over the last year, we’ve been adding features to our platform to help ISVs publish to the cloud and are excited that the DB2 team found it easier to get the next version out using a RightScale ServerTemplate than an Amazon AMI.

I thought it would be helpful to write down how publishing software into the cloud is different from the more traditional software delivery:

Server templates, not software packages: Users expect to point and launch, not download, install, and configure. Of course some software is meant to be embedded or adapted, but in those cases there still is the opportunity to deliver a ready-to-go sample from which the embedding or adaptation can start.

In the cloud, you can launch IBM DB2 and have it running in a couple of minutes. That makes it much easier to get going and then later to start modifying configuration details. I’m sure most users will want to change the ServerTemplate published by IBM, but few will start there. Using the ServerTemplate not only gives you a server with the software already loaded, installed, and configured, it also has all the right software versions, is set up with monitoring and alerts, has logging prepped correctly, and offers other goodies.

From a vendor’s point of view the great opportunity of the cloud is to control the software environment from A to Z. You don’t need to publish a long list of required software packages and compatible versions; you just provide it all in a neatly wrapped-up ServerTemplate that automatically installs all the right components.

Free one-click trials: They don’t need to be literally one click, but the cloud offers tremendous opportunities for users to try before they buy. It’s so much easier to try out software if you can just spin up a server in the cloud, possibly with some live demo data already loaded. It’s even better when the server is running on the vendor’s dime!

From the vendor’s perspective the cost is really close to just the cloud infrastructure cost. We’ve offered $1 of free EC2 time in our trial sign-up for years now. The $1 is good for about 10 hours of a small EC2 instance and really lets people get a first touch onto RightScale. Who wouldn’t pay $1 to get a prospect to try their product?

No lonely servers: Whose software is designed to run on a lonely server these days? What use is, for example, Hyperic HQ on its own? Its purpose is to monitor other servers so you really want to embed it into multi-server deployments. CohesiveFT’s VPN-Cubed product is similarly targeted to making life easier when you have lots of servers to connect back to the main office or datacenter.

Using RightScale simplifies the configuration of multiple machines because configuration inputs can be defined across many machines at once. It’s also much easier for a vendor to provide scripts that install client plugins or agents on a customer’s other servers.

Pay-per-use: Users have come to expect more flexible billing methods in the cloud, such as pay per use. This is good and bad. The good is that it really is a requirement for enabling the scaling of resources on demand or for supporting flexible usage models. Use cases range from the famous scaling up in response to a traffic surge to being able to add a database slave server on a whim to test the performance impact of some schema transformations. Pay-per-use really makes the cloud unique and this tells me that, like it or not, pay per use is here to stay. However, other models will likely co-exist.

From a vendor perspective pay-per-use introduces new challenges. On the execution end it suddenly means that vendors need to meter the usage of their customers. I’m distinguishing metering from billing: the former is about measuring the usage and producing the data that can be used to compute per-use charges, the latter is about sending the customer a bill and getting it paid. We’ve been adding metering support to the RightScale platform for ourselves and we’re starting to make the data available to ISVs to feed into their billing.
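To make the metering versus billing split concrete, here is a minimal sketch. The record shape, function names, customer names, and the hourly rate are all hypothetical and only illustrate the separation; they do not describe how the RightScale platform stores or exposes its metering data.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UsageRecord:
    """One metered interval of server usage (hypothetical record shape)."""
    customer: str
    server_id: str
    start: datetime
    end: datetime


def metered_hours(records):
    """Metering: add up the hours used per customer. No prices are involved."""
    hours = defaultdict(float)
    for r in records:
        hours[r.customer] += (r.end - r.start).total_seconds() / 3600.0
    return dict(hours)


def invoice_amounts(hours_by_customer, rate_per_hour):
    """Billing: turn metered hours into charges at a rate the vendor chooses."""
    return {c: round(h * rate_per_hour, 2) for c, h in hours_by_customer.items()}


records = [
    UsageRecord("acme", "srv-1", datetime(2009, 5, 1, 0, 0), datetime(2009, 5, 1, 6, 0)),
    UsageRecord("acme", "srv-2", datetime(2009, 5, 1, 2, 0), datetime(2009, 5, 1, 4, 30)),
    UsageRecord("globex", "srv-3", datetime(2009, 5, 1, 0, 0), datetime(2009, 5, 1, 1, 0)),
]
hours = metered_hours(records)                        # usage data only
charges = invoice_amounts(hours, rate_per_hour=0.20)  # assumed example rate
print(hours, charges)
```

Keeping the two steps apart means the same usage data can feed whatever pricing model a vendor settles on later.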

On-site support: The servers in the cloud are very easy to access by the vendor’s support engineers and users will soon start to expect such “on-site” support. This is a true win-win proposition because it can reduce problem resolution time and increase customer satisfaction. Of course this means that the support reps need to have the skill to actually fix something and not just to dig in the knowledgebase and send an email reply.

One of the required underpinnings to enable vendor access in a controlled manner is access control. After all, the server’s user needs to be able to selectively grant access to the vendor when help is needed. What we’ve found is that the RightScale dashboard not only offers the ability to do just that, but it also gives the support engineer a lot of context and history information that can help get to the bottom of the problem quickly. As an extreme case, our support guys have responded to a number of “help, our site is down and we can’t reach our IT guy” calls and were able to get things back up without having prior knowledge of the site. (In case you’re wondering, this is not what our standard support covers, but we also don’t just let customers fall off the cliff in a situation like that.)

Delivered as a service: “And can you run it for me?” is a question prospects ask more and more. I know for myself that many times I’d rather pay the vendor to run it and sell it to me in SaaS form than go and figure it all out myself. Of course the cloud makes this much easier than ever before because provisioning planning is largely taken out of the equation. When more customers sign up, the vendor can just launch more servers. A good number of our customers do just that and use RightScale to manage what one could call virtual appliances for their customers. At the more sophisticated end, companies such as StarCut use RightScale to provision multi-tenant clusters to host many small users and they then move larger users to private clusters and even set them up with their own fully-managed auto-scaled deployment.

Runs everywhere: The final consideration is that “publishing to the cloud” is a rather deceptive term because there isn’t just one cloud. I hate to borrow the “write once, run anywhere” slogan but it really describes what vendors are looking for. It’s too early in the industry to have a clear picture of what the solution should look like, but we’ve certainly made significant strides towards enabling multi-cloud ServerTemplates in RightScale and we’ll have more coming out shortly.

To give credit where credit is due, Amazon has done a great job in preparing the runway for software vendors to make their software available in the cloud. First, the fact that EC2 is based on immutable machine images, which are not a snapshot of a server but rather a template from which new servers can be spun up, really enabled the first catalog of ready-to-launch servers. Second, the pay-per-use pricing has gotten everyone to rethink how flexible computing could be if the licensing models allowed it. Somewhat to my surprise vendors with a lot of legacy pricing, such as IBM, have jumped into this new opportunity and decided to adopt it. Third, Amazon’s DevPay service, which allows vendors to add a charge on top of Amazon’s hourly server fee, was the first offering that closed the metering and billing loop so vendors don’t have to reinvent the wheel. All this has really created a tremendous level of awareness and interest in the new ways software can be delivered in the cloud. We’re now leveraging this to introduce what we believe to be a more multi-cloud friendly and more flexible way to publish software in the cloud.


RackSpace releases draft Cloud Servers API

In case you missed it, the “cloud without an API” is about to become a real cloud with an API! (Sorry RackSpace guys, I just couldn’t resist!) Bret at RackSpace posted a blog entry asking for feedback a little over a week ago and it’s looking pretty good! If you haven’t looked at it, now is a good time. We’ve been in touch with Bret for a while and it’s good to see everything progressing. One item they solved nicely is passing “personalization” data into a new server. In the API you get to tell it to put some arbitrary data into any file you want on the root partition. This way it’s possible, for example, to set some environment variables in /etc that get picked up by various programs on the server. Nice!
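As a rough illustration of the personalization idea, the sketch below builds a server-creation request body that carries one file to drop onto the root partition. The field names ("server", "files", "path", "contents"), the base64 encoding, the file path, and the variable values are assumptions made for this example, not taken from the draft API, so check the draft itself for the real request shape.

```python
import base64
import json


def file_injection(path, text):
    """Describe one file to be written on the new server's root partition."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {"path": path, "contents": encoded}


def create_server_body(name, files):
    # Hypothetical request shape; the real field names live in the draft API.
    return {"server": {"name": name, "files": files}}


# Environment variables that programs on the server could read at start-up.
env = "APP_ENV=production\nDB_HOST=db.internal.example\n"
body = create_server_body("web-1", [file_injection("/etc/app-env", env)])
print(json.dumps(body, indent=2))
```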


Amazon adds Load balancing, Monitoring, and Auto-Scaling (updated)

[This post was updated with sections on the monitoring and auto-scaling services]

Amazon tonight launched the load balancing, monitoring, and auto-scaling services for the Elastic Compute Cloud (EC2) that it announced late last year. These features have been requested many times by EC2 users and with this release Amazon continues to show that it listens and responds to feedback. Read Jeff Barr’s description on the AWS blog and Werner Vogels’ backgrounder on his blog.

At RightScale we’ve been experimenting with a preview version of the new services for a while. We’re pretty excited because they allow us to offer new features and more choices to our customers. In particular, we’ll integrate the load balancing with what we already have. It can be used as an alternative to our haproxy based set-up or in combination with it for more flexibility. For example, for more complex web sites and for SSL sites a more application-specific load balancing layer behind Amazon’s will usually be required.

The new monitoring service will provide additional data sources to our users as well as the ability to aggregate across many servers. The service introduced by Amazon can collect data at the hypervisor level and provides a very versatile storage back-end. We’re planning on sourcing data from the service in our graphing front-end and also integrating the data into our alerting and escalation system. At scale, the use of Amazon’s monitoring service by itself actually costs half of what all of RightScale costs, monitoring included, so we’re offering great value as an integrated solution.

Finally, the auto-scaling service is something that many users have felt was missing from EC2: we’ve often heard from people looking at EC2 for the first time “you mean Amazon doesn’t automatically launch more instances when my app is overloaded?” Amazon now has an answer for those questions, which was badly needed. However, unless we’re missing something, it adds nothing to our current offering, but we’ll keep listening to what our customers tell us. We believe that the most difficult part of auto-scaling isn’t the actual launching of servers but lining up all the configuration management and lifecycle management so the new servers go into production successfully, and that dynamic runtime self-configuration is where RightScale really shines.

Load balancing

Let’s take a closer look at the new features introduced tonight starting with the load balancing. It is now possible to allocate a load balancer and have it distribute requests across multiple EC2 instances running in multiple availability zones within one region. (An availability zone is roughly equivalent to a datacenter and the two current regions are US east and EU west.) The interface to the service is pretty simple. You create a load balancer and define for which ports and protocols it should process requests. Then you launch instances and add them to the load balancer. You also define a health check that the load balancer uses to probe the instances to ensure that they’re operating. With that the load balancer is in operation and starts passing incoming requests through to healthy instances.
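The workflow just described (create a load balancer with listeners, add instances, define a health check, route to healthy instances) can be modeled with a small in-memory toy. This is not Amazon's API: the class, its method names, and the round-robin choice among healthy instances are all assumptions made for illustration.

```python
class ToyLoadBalancer:
    """In-memory stand-in for the workflow described above (not Amazon's API)."""

    def __init__(self, listeners, health_check):
        self.listeners = listeners        # e.g. [("HTTP", 80)]
        self.health_check = health_check  # callable: instance_id -> bool
        self.instances = []
        self._turn = 0

    def register(self, instance_id):
        self.instances.append(instance_id)

    def deregister(self, instance_id):
        self.instances.remove(instance_id)

    def healthy(self):
        """Probe every registered instance and keep the ones that pass."""
        return [i for i in self.instances if self.health_check(i)]

    def route(self):
        """Return the next healthy instance (round-robin is an assumption)."""
        pool = self.healthy()
        if not pool:
            return None
        choice = pool[self._turn % len(pool)]
        self._turn += 1
        return choice


down = {"i-b"}  # pretend this instance fails its health check
lb = ToyLoadBalancer([("HTTP", 80)], health_check=lambda i: i not in down)
for instance in ("i-a", "i-b", "i-c"):
    lb.register(instance)
picks = [lb.route() for _ in range(4)]
print(picks)
```

The unhealthy instance stays registered but never receives a request, which is the behavior the health check is there to provide.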

The load balancing service is designed to serve as a first level of distributing load across a number of instances, dealing specifically with DNS and handling the failure of an availability zone. Most sophisticated web sites will need an additional level of load balancing that is more customizable and more application specific, for example to map portions of the URL space to different back-end services or to optimize the handling of persistent sessions.

Some of the feature details of the load balancing service are:

  • It supports HTTP and TCP, meaning that it load balances HTTP at the request level and provides TCP switching for all other protocols. In particular, this means that it does not terminate HTTPS; instead HTTPS must be balanced at the TCP level and terminated on the user’s instances.
  • It can listen on ports 80, 443, and 1024 through 65535, which means it cannot be used to load balance many standard protocols that use ports below 1024. I’m not sure why this restriction exists, but it’s perhaps an indication that the service is primarily geared towards web sites.
  • The health checks can either issue an HTTP GET request to a specific URL and check for a 200 OK response, or they can open a TCP connection to an arbitrary port and check that the connection is accepted.
  • Servers can be added to or removed from the load balancer rotation without interrupting operation, and the load balancer can be queried for the status of instances according to the health check.
  • The load balancing occurs in two stages: first a client is directed to a specific availability zone using DNS, and then it is directed to an available instance in that zone. The zone selection is equal-weight, which means that one had better run an equal number of instances in each zone, or instances in one zone will end up with a higher load than those in the other.
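The effect of equal-weight zone selection is easy to quantify. The sketch below assumes traffic is split evenly across zones that have at least one instance and then evenly across the instances within a zone; the zone names and instance counts are example values.

```python
def per_instance_share(instances_per_zone):
    """Fraction of total traffic each instance sees with equal-weight zones."""
    # Assumption: zones without any instances receive no traffic.
    active = {zone: n for zone, n in instances_per_zone.items() if n > 0}
    zone_share = 1.0 / len(active)
    return {zone: zone_share / n for zone, n in active.items()}


shares = per_instance_share({"us-east-1a": 2, "us-east-1b": 6})
for zone, share in shares.items():
    print(f"{zone}: {share:.1%} of all requests per instance")
```

With two instances in one zone and six in the other, each instance in the smaller zone handles three times the load of an instance in the larger one.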

We’re currently planning to support the load balancing service at multiple levels. We’ll enable our server templates to use Amazon’s load balancing both instead of and in addition to our own. For simple highly scalable HTTP services Amazon’s can be used on its own, but for more complex configurations a second level of load balancing is needed. In particular for SSL sites, a back-end load balancing after the SSL termination is often required.

Monitoring

The CloudWatch monitoring service is really a special storage engine that is designed for time series data. On one end data collected periodically from servers and from other services is pumped into the monitoring store, and at the other end clients can run queries against the store to extract data from it. What this means is that while not being a complete monitoring system, it is the central storage piece to which all the others would interface.

On the data input side CloudWatch is very limited at the moment. There are seven metrics per server that the virtual machine host injects into CloudWatch, and there are four metrics for each Load Balancer instance. Not very exciting yet, but don’t be fooled, this is just the beginning. Amazon will add more and more metrics and also provide an API for inputting custom metrics. At that point it becomes really interesting!

On the data output side the store offers a number of ways to query the data. The result of a query is always an array of data points over time. What’s interesting is that one can get much more than just the raw data points back out and that’s where CloudWatch shines. For example, it is possible to retrieve the max cpu utilization across a number of servers as a time series. It’s unclear how flexible this aggregation will end up because initially the way to name the servers of interest is somewhat limited, but we’ll find out.
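As an illustration of that kind of aggregation, here is a minimal sketch that folds several per-server series into a single time series of maxima. It runs on plain Python lists and does not call CloudWatch; the data layout (timestamp, value pairs keyed by server) and the sample values are assumptions.

```python
def max_across_servers(series_by_server):
    """Combine per-server series into one series of per-timestamp maxima."""
    combined = {}
    for points in series_by_server.values():
        for ts, value in points:
            combined[ts] = max(value, combined.get(ts, value))
    return sorted(combined.items())


# CPU utilization in percent, sampled at one-minute timestamps (seconds).
cpu = {
    "i-a": [(0, 20.0), (60, 35.0), (120, 30.0)],
    "i-b": [(0, 55.0), (60, 25.0)],
}
peak = max_across_servers(cpu)
print(peak)
```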

Some other characteristics of the service:

  • data is retained for 2 weeks, so one better extract it and save it somewhere else for longer term comparison and trending
  • the smallest data resolution is 1 minute, anything input more frequently gets aggregated automatically at minute boundaries
  • the service costs $0.015 per server hour and there is no per-query charge
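Two of these characteristics lend themselves to a quick sketch: aggregation at minute boundaries and the per-server-hour cost. The rate comes from the list above; averaging as the aggregation statistic and a 30-day month are assumptions made for this example.

```python
from collections import defaultdict

RATE_PER_SERVER_HOUR = 0.015  # dollars, from the list above


def aggregate_to_minutes(samples):
    """Average (epoch_seconds, value) samples that fall in the same minute."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % 60].append(value)  # round down to the minute boundary
    return sorted((minute, sum(v) / len(v)) for minute, v in buckets.items())


def monthly_cost(servers, hours=24 * 30):
    """Monitoring cost for a number of servers over an assumed 30-day month."""
    return round(servers * hours * RATE_PER_SERVER_HOUR, 2)


samples = [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0), (119, 80.0)]
per_minute = aggregate_to_minutes(samples)
print(per_minute)
print(monthly_cost(1), monthly_cost(10))
```

At this rate one server monitored around the clock comes to about $10.80 for a 30-day month.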

Overall CloudWatch looks like a very promising service that will really gain momentum when many more metrics can be input. We’re still on the fence whether we should modify our graphing and alerting to be able to pull data from CloudWatch directly or whether we should pull data from CloudWatch on a continuous basis and re-store it in our monitoring system. In either case, we’ll make the data available to our customers.

Auto-scaling

The auto-scaling service is something that a lot of first time users of EC2 have been missing. Everyone expects Amazon to automatically scale the resources of a user since this auto-scaling is what is most often quoted in connection with the cloud. Never mind that it doesn’t really make much sense since Amazon just provides compute boxes and doesn’t have any info about the application a user is running and how one would make it scale. It’s like the UPS guys arriving at your doorstep with a bunch of Dell boxes: “here, they noticed you need more, I’ll unbox and plug them in for you”. I wish auto-scaling were that simple!

The Auto-Scaling API reflects the fact that a lot of set-up is required for auto-scaling to work. You have to define not just the auto-scaling behavior but also what to launch, captured in a “LaunchConfiguration” data structure which includes the image to launch, security group, SSH key, instance size, kernel id, ramdisk id, block device mappings and user data. If I counted correctly, you get to specify a total of 27 parameters to make auto-scaling work.
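To give a feel for the two halves of that set-up, here is a sketch of a launch configuration holding the fields listed above, next to a separate scaling rule. The class names, field names, and the CPU-threshold rule are illustrative assumptions; they are not the actual Auto-Scaling API structures.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LaunchConfig:
    """What to launch: the fields named in the post, with illustrative names."""
    image_id: str
    instance_type: str
    key_name: str
    security_groups: list = field(default_factory=list)
    kernel_id: Optional[str] = None
    ramdisk_id: Optional[str] = None
    block_device_mappings: list = field(default_factory=list)
    user_data: str = ""


@dataclass
class ScalingRule:
    """How to scale: hypothetical size bounds plus simple CPU thresholds."""
    min_size: int
    max_size: int
    scale_up_cpu: float
    scale_down_cpu: float


def desired_size(current, avg_cpu, rule):
    """Step the group size by one instance and clamp it to the bounds."""
    if avg_cpu > rule.scale_up_cpu:
        current += 1
    elif avg_cpu < rule.scale_down_cpu:
        current -= 1
    return max(rule.min_size, min(rule.max_size, current))


config = LaunchConfig("ami-example", "m1.small", "ops-key", ["web"])
rule = ScalingRule(min_size=2, max_size=5, scale_up_cpu=70.0, scale_down_cpu=20.0)
print(config)
print(desired_size(2, 85.0, rule), desired_size(5, 85.0, rule), desired_size(2, 5.0, rule))
```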

What’s really missing for the auto-scaling to fit in is all the context information that would provide a lot of the configuration information. The servers being auto-scaled don’t operate in a vacuum, they work in concert with other servers. For example, most of our customers have two “base servers” that are not auto-scaled. They often have some extra functions, like acting as front-end www servers, but otherwise they’re configured just like the additional auto-scaled servers. Well, the config of those base servers and the auto-scaled ones share a lot in common, so it’s nice to be able to set all this up in one place for the whole deployment, and not individually for each server and the auto-scaling too. The same goes for other deployment-wide information, from the web site name to the credentials for accessing the database, all that is context that can be shared across all servers in a RightScale deployment.
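The idea of deployment-wide context can be sketched as a simple merge: values defined once for the deployment apply to every server, and a server only states what differs. The input names and the merge rule below are hypothetical and are not meant to describe how RightScale resolves inputs internally.

```python
def effective_inputs(deployment_inputs, server_inputs):
    """Server-level values override deployment-wide ones; the rest is shared."""
    merged = dict(deployment_inputs)
    merged.update(server_inputs)
    return merged


# Defined once for the whole deployment (hypothetical input names).
deployment = {
    "SITE_NAME": "www.example.com",
    "DB_HOST": "db.internal.example",
    "ROLE": "app",
}

base_server = effective_inputs(deployment, {"ROLE": "app+www"})  # extra front-end duty
scaled_server = effective_inputs(deployment, {})                 # nothing to override

print(base_server)
print(scaled_server)
```

Changing the database host then takes one edit at the deployment level instead of one per server plus another in the auto-scaling definition.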

What becomes even more painful is that any change to what’s running on the servers, like a bug fix, requires creating a new image and relaunching the servers. This is where our ServerTemplates, which build the server at boot time from a base image, are a lot more flexible. You can make small changes to the server template, test that on an individual server on the side, and then slide it into the scaling. A neat alternative is being able to run a script that updates existing servers on the fly. In some cases you want a new server array that gradually scales up and takes the load over while the other scales down, other times you just want to run a quick script on the existing servers to patch up a minor config detail. You really do want the whole toolbox so you can pick the right tool for the job.

The good news is that the auto-scaling service itself is free. However, it requires that all launched instances use the CloudWatch monitoring service which costs $0.015 per instance hour.

Summary

I won’t repeat what I wrote at the beginning of the blog entry, but it’s great to see Amazon continue innovating at a break-neck pace! At a feature level what they’ve introduced tonight overlaps more with some features of RightScale than most other parts of their offering. Our focus is to provide an integrated solution on top of their array of infrastructure services. In particular, we have long had support for dynamic configuration management and advanced automation. More recently we have embraced portability across other cloud providers and even hybrid public/private clouds.


Eucalyptus Systems gets funded

Our friends and neighbors at the new Eucalyptus Systems just got funding from no less than Benchmark and BV Capital. That’s a pretty exceptional accomplishment in the current venture climate! Read more about it at VentureBeat or on the new Eucalyptus site directly. Oh, in case you missed my recent posts, Eucalyptus is open source software that lets you set up your own cloud in your own datacenter. You can then plug your Eucalyptus cloud into RightScale and manage it through the RightScale dashboard alongside all the public clouds. Go Eucalyptus!


RightScale + Ubuntu + Eucalyptus = cloud in a box

Need a cloud in a box? Want a cloud in a box? Well, then, start requisitioning a couple of machines now so you’re ready on Thursday to load up Ubuntu 9.04, install Eucalyptus, and follow the prompt to register your cloud with RightScale! And best of all, it’s all free! Free open source software and access to a free RightScale service account.

It’s been a hectic last few months and I’m sure we have some interesting times ahead, but we’re finally getting oh so close with the impending release of Ubuntu 9.04 which includes the technology preview for the Ubuntu Enterprise Cloud powered by Eucalyptus. We’ve been working very closely with both Canonical and the Eucalyptus team to ensure that all the cloud pieces will work together as seamlessly as possible.

To make it easy for you to set up your private cloud we integrated the RightScale registration into the Eucalyptus installation. This means that as you plod along installing and configuring your Eucalyptus cloud controller you’ll have the option to register your new cloud with RightScale by simply following a link on the configuration web page. It could hardly be any simpler.

What we’re supporting at the Ubuntu 9.04 release is to register your Eucalyptus cloud with RightScale and access it within your RightScale free or paid account right alongside Amazon EC2. You can invite friends to access your cloud so they can launch their own cloud servers on your cloud! We will also provide a RightImage that you can download to your cloud so you have a clean and small machine image to work with. Unfortunately, we won’t have support for ServerTemplates and automation available at the initial release. We still have a number of things to hook up on our end to make that happen, but we’ll release it as soon as it’s ready. At that point, you’ll be able to operate in your own cloud just as you do on EC2. Yikes!

But we’re by no means forgetting about Amazon EC2! We’ve been working with Canonical to ensure that the official Ubuntu 9.04 Amazon Machine Images (AMIs) work out of the box with RightScale! This means that if you launch one of the 9.04 AMIs from the RightScale dashboard then all the RightScale goodness will work: server templates, monitoring, automation, etc. If you launch the same AMI using the API or from a different console, then they’ll work as if RightScale didn’t exist. The inclusion of the RightScale start-up script in the Ubuntu AMI means that we’ll be able to continue ramping up our Ubuntu support and we won’t have to create a 9.04 image ourselves at this point. In the future, as we roll out new versions of our configuration management and automation we’ll probably release new Ubuntu RightImages ourselves, but we’ll cross that bridge when the time comes. In the meantime, enjoy Ubuntu 9.04 & RightScale seamlessly on Amazon EC2!


My cloud, your cloud, our cloud

As we’re getting ready to support private and hybrid clouds in RightScale I thought it would be worthwhile to write up some of the experience and thinking that we’ve gone through. Over the past few months we’ve seen a sustained rise in the buzz around private and hybrid clouds. As far as I’m aware, the following terminology has pretty much emerged from multiple sources:

  • a public cloud is a shared cloud computing infrastructure that anyone can access and that is connected to the public Internet
  • a private cloud is a cloud computing infrastructure owned by a single party (usually with deep pockets to pay for datacenters and machines!) and that may or may not be connected to the public Internet
  • a hybrid cloud is the union of private and public clouds that are used together to be able to leverage the benefits of both

To date cloud computing has by and large been in the realm of public clouds. There has been a lot of buzz around “I want a cloud in my own datacenter” but it is taking time for the technology to mature and for players to commit the resources and do the build-out. Let’s review the pros of public vs. private clouds (pros of one are cons of the other):

Public clouds:

  • no capital expenditures — pay per use
  • ability to pass headaches of expansion to someone else
  • no physical plant staff and reduced sysadmin staff
  • leverage high-volume Internet connectivity

Private clouds:

  • ability to control details of hardware provisioning and of hardware characteristics
  • fully owned infrastructure reduces security concerns, ability to satisfy regulatory requirements without requiring cooperation of cloud provider
  • close proximity to non-cloud datacenter resources, potentially also to offices or other parts of physical plant
  • may leverage existing resources (sunk costs)

What has been really interesting to watch is the level of interest in hybrid clouds, which is an attempt to get the best of both worlds. While many organizations would really love to have a private cloud they realize that a lot of what attracts them to the cloud computing model in the first place is intrinsic to the public cloud. So it is natural to want to put into the public cloud the workloads that benefit from its advantages and into the private cloud what is better served there.

From the bullet lists above it is clear that the benefits of the public cloud all revolve around cost and can only be duplicated internally at a really large scale. The benefits of the private cloud all revolve around control. So it makes sense that almost everyone ends up gravitating toward wanting a hybrid model.

I was originally skeptical about the whole notion of an internal cloud. What ended up convincing me is the long list of use cases we’ve encountered. Here are the most typical ones:

Develop in public, deploy in private. You may be outsourcing part of the development and so running dev and test servers in the public cloud makes it all much easier. Even if the developers are internal but distributed around the globe it’s often easier to converge in the public cloud. The flexibility to acquire and relinquish dev as well as test resources is also often a key benefit. But in the end production may have to run internally for regulatory reasons or similar concerns. In that case it should ideally run in an internal cloud that is managed the same way that dev & test resources were to ensure that everything works as planned.

Develop in private, deploy in public. You may have existing in-house dev and test resources that you’d like to bring to bear on projects that ultimately will be launched in the public cloud for connectivity, redundancy, and scalability reasons. Being able to test using the same type of environment as will be used in production is a good reason to set up an internal cloud with the same management system bridging the internal and external deployments.

Private core, public expansion. You may have applications in-house that need to stay there for regulatory or other reasons but you have related apps that can run internally or externally. For example, batch analysis, seasonal/temporary apps, etc. These can run internally or be moved to the public cloud and having the flexibility can ease the transition as well as help optimize costs.

Some runs are public, some are private, some are in-between. You may have researchers running modeling or analysis applications that span the spectrum of security requirements. Some runs or experiments use public software and run on public domain data sets, these are likely able to run in the public cloud already. Others use proprietary software and operate on very secret data sets. Some of these may never be candidates for the public cloud. Many other experiments fall in-between. Giving the researchers the ability to launch a cluster of machines on-demand internally or externally whenever they want to test something out can really enhance productivity and reduce overall cost.

What we hear across the board is the requirement to link the two types of clouds together. Users want to be able to seamlessly move applications back and forth. Oops, let me be more precise. Users want the assurance that they can develop an application and build deployments in one cloud in such a way that they can replicate the same type of deployment successfully in the other type of cloud. And they would like to use the same management system for all these deployments. It’s basically the requirement of being able to realize the above use cases.

What we fortunately hear much less frequently are ideas about seamlessly scaling apps out from the private cloud into the public cloud. Sort of transparently adding public cloud resources to your private cloud when the latter runs out of capacity on one app. The reason I’m not a fan of this is that it just raises a lot of tricky technical issues, from latency and bandwidth bottlenecks to routing and access control issues.

I believe there is a very simple realization that makes such “scale out into the public cloud” scenarios unattractive: by the time you convinced yourself that you can run half of an app in the public cloud, you effectively also convinced yourself that you could run it entirely in the public cloud! I’ll come to exceptions below, but what happens is that the most cost effective way to proceed is to move the app entirely into the public cloud and make room in your datacenter for the growth requirements of some other app that is more difficult to run in the public cloud.

Good exceptions are situations where your application has legacy requirements around the database tier but you want to run some compute intensive parts of the application in the public cloud. Say you have an Oracle RAC installation that you can’t move into the public cloud, but you need an increasing amount of horsepower to perform compute intensive analysis of some data. Then it’s interesting to evaluate how the database can stay in private and the compute stuff can scale out into the public cloud.

Next week we’ll take the first step in supporting the above use cases by allowing anyone to plug their private cloud into the RightScale service! For free! We’ll help you do the following:

  • commandeer a bunch of machines
  • create a cloud using Eucalyptus
  • register your cloud with RightScale
  • enjoy the power of RightScale for your cloud for free
  • invite your friends to your cloud

All this will be just the first step on a long road towards realizing the promise of hybrid cloud architectures, stay tuned for more!


McKinsey doesn’t ‘get’ the cloud…

It looks like the McKinsey report “Clearing the air on cloud computing” is getting some attention. It has some good stuff in it, including the warning that cloud computing is approaching the top of the Gartner hype cycle. However, its claim that cloud computing (in the guise of EC2) ends up being more expensive per server month for large enterprises than doing it in-house seems fatally flawed. In particular, it doesn’t seem to be accounting for the costs correctly and it completely overlooks the benefits of automation in the cloud, which ultimately leads to a revolution in the way compute resources are consumed.

The cost equation in the report starts on slide 22 and it’s really, really sketchy. They mix EC2 compute units and cores together (compare slides 22 and 23). They talk about “$14K/Server (2 CPU, 4 core)” which on my calculator comes out to $97/core/month over 3 years, but they have a cost of $45/mo/cpu on the same slide (and $97 doesn’t even account for the facility or power or cooling).
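
For anyone who wants to retrace that arithmetic, here is a quick sketch. It assumes plain straight-line amortization over 36 months and counts hardware cost only; the variable and function names are mine, not the report's.

```python
# Figures as quoted from the slide: "$14K/Server (2 CPU, 4 core)", over 3 years.
server_cost_usd = 14_000
cpus_per_server = 2
cores_per_server = 4
months = 3 * 12  # assumption: straight-line amortization over 36 months


def monthly_cost_per_unit(total_cost, units, months):
    """Spread a one-time cost evenly over `units` and `months`."""
    return total_cost / units / months


per_core = monthly_cost_per_unit(server_cost_usd, cores_per_server, months)
per_cpu = monthly_cost_per_unit(server_cost_usd, cpus_per_server, months)

print(f"per core: ${per_core:.2f}/month")
print(f"per CPU:  ${per_cpu:.2f}/month")
```

Under these assumptions the per-core figure rounds to $97 and the per-CPU figure is twice that, so neither reading of “cpu” gets anywhere near $45.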

On slide 24 they suddenly compare an in-house datacenter server with “75% of EC2 Large Standard Windows configuration on Amazon EC2” and nowhere do they mention that the latter cost includes the Windows license. Ouch!

Unless they actually document more details of their cost accounting I can only say that it’s flawed. This is supported by the many business line owners in large corporations that come to us and tell us they can’t believe how cheap EC2 is because their internal charge-backs by IT are $400+ per server.

The other big mystery is how McKinsey arrives at just a 10% labor reduction when moving to a “third-party cloud provider” and they quote $96/mo of labor for the cloud servers. For what? For the guy that clicks the “launch” and “terminate” buttons on the management dashboard???

Again, the report is so thin on details that it’s impossible to figure out what they’re really thinking. Clearly there is a lot of staff required to run a whole datacenter as well as a lot of service providers, from the architects and engineers to build the facility, to the hvac guys cleaning filters, the folks maintaining the UPS batteries, the genset, and the security crew. 10%, yeah, right.

What the report seems to completely overlook is the possible reduction in sysadmin costs. One of the huge benefits of the cloud is that the entire computing infrastructure can be automated. Top to bottom. That saves a lot of sysadmin labor and in the end it means that requisitioning more compute capacity can be done by the end user somewhere in a business unit instead of being an IT chore.

The report also doesn’t take into account the cost of the red tape that surrounds corporate IT. Things the business can’t do because IT can’t support them. Wasted time spent doing work-arounds instead of just launching a few more servers. Having to guess 6 months ahead of time how many servers will be needed at launch. Opportunity cost because projects don’t happen for lack of IT resources.

Well, I have to say that it would have been great to read a report that lays out the costs and assumptions clearly so one can retrace what is included and what is not. I would have loved to learn more about the corp IT costs. Alas the report fails to do that and it also fails to recognize that cloud computing revolutionizes the way compute resources are consumed, which ultimately is where the bigger benefits will come from.


RightScale supports Ubuntu on Amazon EC2

Fear not, this is not a rehash of our press release from the other day. I was browsing what people wrote about Ubuntu and EC2 and I’m amazed at some of the confusion. The most bizarre so far is an article about “Ubuntu’s next wave: Open server, closed cloud” talking about how Canonical is steering users to Amazon’s “closed cloud” in order to monetize Ubuntu and thus how they’re betraying free software. Very weird conspiracy theory!

Back to topic: the reason we’ve decided to support Ubuntu is that we couldn’t afford not to :-) From the very beginning we decided to support a single OS, because the development and testing overhead meant we couldn’t afford to support a second one, and at that time CentOS was the right choice. All along we’ve had users using Ubuntu with RightScale and we did our best to support them without spending a lot of official time on it. But perhaps 6-9 months ago it became clear that demand for Ubuntu server was on the rise and that we’d better pay attention.

What finally pushed us over the edge is that the Ubuntu team and Canonical made it clear that they are supporting the cloud. They see the opportunity to be the OS of choice in the cloud and they are going at it. In addition they are supporting the Eucalyptus project and we have been supporting them as well, so that’s another common point. All that made it clear that it’s to everyone’s benefit for us to roll up our sleeves and unleash Ubuntu on RightScale. This is just the beginning. Stay tuned for more…

It’s very weird how the article quoted above sees Canonical’s support of EC2 as betraying open source. I, of course, hope that Canonical will indeed monetize their cloud efforts by offering paid support services in the cloud environment. I want them to stay around to continue supporting the Ubuntu project! But I see a cloud support offering as being no different than offering paid support services in the datacenter environment, which they do today. Does anyone complain that Canonical offers support on Dell servers because the servers are not free? What’s different in the cloud? We pay Amazon to “lease” the servers and they run them to boot. How different is that with respect to Ubuntu or Canonical? Go figure…

Anyway, I’m looking forward to many Ubuntu users on the RightScale service, whether free or paid, and we’ll keep increasing our Ubuntu offering!

NB: One of the things we’re doing somewhat behind the scenes is running redundant mirrors of the Ubuntu repos within EC2, so if you’re launching hundreds of servers and doing apt-gets then those will all go at lightning speed and succeed. In addition, we’re keeping daily versions of the mirrors, so next year you’ll still be able to apt-get with the state of the mirrors of today to guarantee a successful launch and install.
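
To make the dated-mirror idea concrete, here is a minimal sketch of how a launch script could pin apt to one day's snapshot. The host name and the date-in-the-path layout are purely hypothetical placeholders (our real mirror addresses aren't given here); only the `deb <url> <suite> <components>` line format is standard apt.

```python
from datetime import date

# Hypothetical mirror host; stands in for whatever mirror you actually use.
MIRROR_HOST = "mirror.example.com"


def frozen_sources_line(snapshot, suite, components=("main", "universe")):
    """Build a sources.list line that points at one day's mirror snapshot.

    Assumes (hypothetically) that snapshots live under /ubuntu/YYYY-MM-DD/.
    """
    url = f"http://{MIRROR_HOST}/ubuntu/{snapshot.isoformat()}/"
    return f"deb {url} {suite} {' '.join(components)}"


# Example: pin a server to the state of the repos on a given day.
print(frozen_sources_line(date(2009, 4, 1), "hardy"))
```

The point of the design is that the date is part of the URL, so a server launched a year from now resolves exactly the same package versions as one launched today.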


RightScale Ruby Cloud Gems released

We’ve just re-released all the Ruby gems (libraries) we use to interface to the various cloud providers. These gems are what we use in our RightScale platform in production, not some stripped-down version. They have performance optimizations and extensive error checking as well as retries for failing operations.

As part of the current wave, we’ve released the Amazon Web Services interface, RightAws 1.10.0. The big new features are SDB’s SQL-like query and query_with_attributes support as well as signature v2 support for all services. There are also numerous bug fixes, many of them reported and patched by users and customers. Thank you!

We’ve also released alpha versions of the GoGrid, FlexiScale, and Slicehost gems. These have seen less production use and the APIs at these providers are still seeing changes, so we expect we’ll have some fixing to do. Please report any bugs to us and we’ll fix’em!

We remain committed to contributing open source libraries to the cloud community, more coming soon! Also, by popular demand, we will be moving the development of these gems to a public git repository soon. It’s a bit more tricky than you might expect as we’re often ahead on a private branch with non-public cloud features, so we need to make it all work correctly… But stay tuned!

Oops: I almost forgot to add a link to the gems: http://rightscale.rubyforge.org/ and http://rubyforge.org/projects/rightscale


The Skinny on Cloud Lock-in

The topic of cloud lock-in is getting quite some attention as of late, and it definitely needs to be a primary concern for anyone planning to move business critical applications to the cloud. (And who isn’t planning on that these days?) Given all the different layers of cloud computing the conversation can quickly get more confusing than anything else. At Cloud Connect a few weeks ago the lock-in discussion bounced from Salesforce.com to Google App Engine, and then to Amazon Web Services within a single argument — which just makes no sense. To put it simply, different layers of cloud offerings vary widely when it comes to the dangers of lock-in.

Lock-in hypothesis

Let me state Thorsten’s Lock-in Hypothesis:

The higher the cloud layer you operate in, the greater the lock-in.


[Figure: lock-in increase]

This means that if you use an application in the cloud, such as an all-in-one CRM package, you have the highest chance of getting locked-in. Move one level down to a platform in the cloud and you are somewhat less likely to get locked-in. Google App Engine is one example: you can move a simple Python app off that platform fairly easily, but anything of substance that uses its BigTable storage and other services will end up relying on a lot of proprietary technology. This “black box” effect locks you in more than, for example, a platform like Heroku where apps follow more of a standard Rails code base. When you move down to an infrastructure cloud, such as Amazon Web Services, it becomes even easier to see how you can move your application stack from one provider to another. After all, there’s not much distinguishing the Linux box you get in EC2 from the Linux box you get at GoGrid. But even here, lock-in needs to be thought through because the system behavior, from storage persistence to networking details and on and on, is far from identical.

So where does this leave us? I’ve been talking about lock-in, but what does that really mean? Well, with cloud computing you outsource the operation of compute resources to a cloud vendor who “runs” your application and who “stores” your data. Lock-in occurs with this vendor to the extent it is prohibitively expensive or time-consuming to run your application elsewhere or move your data elsewhere. Whether this “elsewhere” is another vendor or whether it is your own infrastructure is not important: if you can’t move, or it costs a lot or takes a long time to do so, you’re locked-in. We recently asked our customers and prospects what concerned them most about lock-in. Here are the results:

[Figure: lock-in concerns (survey results)]

The layer cake

Lock-in can actually occur at many levels in the stack, and that’s why the cloud layers differ in their effective lock-in risk. The more code that is controlled “behind the curtain” by the cloud, the more you tend to lose freedom. Conversely, the more that is under your control, the easier it is to replicate it elsewhere and retain freedom. Here are a number of different layers at which you could find yourself locked-in:

  • Application: do you own the application that manages your data or do you need to find/write another one to move?
  • Web services: does your app make use of 3rd party web services that you would have to find or build alternatives to (e.g. storage, search, billing, accounting, …)?
  • Development & run-time environment: does your app run in a proprietary run-time environment and/or is it coded in a proprietary development environment? Would you need to retrain programmers and rewrite your app to move to a different cloud?
  • Programming language: does your app make use of a proprietary language, or language version? Would you need to look for new programmers to rewrite your app to move?
  • Data model: is your data stored in a proprietary or hard to reproduce data model or storage system? Can you continue to use the same type of database or data storage organization if you moved or do you need to transform all your data (and the code accessing it)?
  • Data: can you actually bring your data with you and if so, in what form? Can you get everything exported raw, or only certain slices or views?
  • Log files and analytics: do you own your history and/or metrics and can you move it to a new cloud or do you have to start from scratch?
  • Operating system and system software: do your sysadmins control the operating system platform, the versions of libraries and tools so you can move the know-how and operational procedures from one cloud to another?
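
One way to put the list above to work is as a simple checklist: for each deployment, note which layers you control and see what is left. The sketch below is only an illustration; the two example deployments and the layers assigned to them are hypothetical, not an assessment of any real product.

```python
from dataclasses import dataclass, field

# The layers from the list above, in the same order.
LAYERS = [
    "application",
    "web services",
    "development & run-time environment",
    "programming language",
    "data model",
    "data",
    "log files and analytics",
    "operating system and system software",
]


@dataclass
class Deployment:
    name: str
    # Layers you control and could replicate elsewhere.
    controlled: set = field(default_factory=set)


def locked_in_layers(deployment):
    """Return the layers where moving would mean finding or building a substitute."""
    return [layer for layer in LAYERS if layer not in deployment.controlled]


# Hypothetical examples, for illustration only.
hosted_crm = Deployment("hosted CRM", {"data"})
iaas_app = Deployment("app on an infrastructure cloud", set(LAYERS) - {"web services"})

for d in (hosted_crm, iaas_app):
    print(f"{d.name}: {len(locked_in_layers(d))} of {len(LAYERS)} layers at risk")
```

A plain count is of course crude, since the layers don't weigh equally, but even this makes the gap between the top and the bottom of the stack visible.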

All these issues become pertinent when you face questions such as: “How can I move my Force.com application or my web site running in Google App Engine to my own data center?” Or “Can I get the click-stream data for my site out of the platform so I can analyze, for example, last year’s traffic compared to this year’s?” Or “Can I easily move an application between my datacenter and EC2?”

Altitude increases lock-in

The value proposition of the higher cloud layers is appealing and I predict more and more movement in that direction. But lock-in is one of the issues that really gives me pause and that has kept me in the past from adopting some of the services that otherwise have looked compelling.

Let me pick on Google App Engine for a minute. Suppose you develop your site on App Engine and you find yourself having to move away for whatever reason. I don’t know of a good solution for you at that point. While there are ways to port an app from App Engine to Django it’s not clear this is really an answer if you’re running a high volume production app. It’s going to be interesting to see whether we will end up with commercial or perhaps open-source App Engine clones that are “industrial strength” to the point where one can really contemplate moving a big app from one App Engine vendor to another. (Well, first Google App Engine needs to be complete enough to host the types of apps where this is a real concern.)

An example closer to home is Amazon’s SimpleDB. I’ve been interested in SimpleDB since I first heard about it, but we have yet to use it as part of the RightScale service and the #1 reason is lock-in. For example, we store audit entries for everything that happens with our users’ servers and I’d love to get those out of the SQL database they’re in. SimpleDB may be a good solution to the problem from a technical point of view, but we don’t see how we’d be able to move that data out of Amazon without major headaches. In addition, we need to be able to run all pieces of the RightScale service in other clouds and we’d have to build an alternate storage solution there. By the time we do that we might as well only use this alternate solution and forgo SimpleDB altogether.

At the level of infrastructure clouds like Amazon EC2 the questions around lock-in are somewhat different but still pertinent. The cloud vendor provides what I like to think of as the “atoms of computing,” namely processing, storage, and networking. You get to build your infrastructure using virtual machines (EC2), disk block devices (EBS), hashed storage buckets (S3), security groups, etc. This means that the choices of programming language, development environment, runtime environment, database storage and so forth are all yours and can all at least in principle be duplicated in another cloud, at a traditional hosting provider, or in your own datacenter. Where lock-in starts to creep in is in the system architecture and in the operations infrastructure (automation, scripts, procedures) that your sysadmins put in place to manage everything.

Maintaining freedom of choice

One of the principles that I’ve upheld in the design of the RightScale system from the beginning is transparency. Everything happening on your systems should be visible to you. This not only means that you can find out why something happened and who did it, but also that you can replicate it elsewhere. There’s no magic happening behind the curtain to which you’re held hostage. I love it when others can do magic for me and save me a lot of time and effort by providing a pre-built platform. But there are solid reasons — both business and technology-related — to demand the ability to look into the “secret sauce.” That way, I can be enchanted by the magic but not locked-in to the magician. Our users need to be able to enjoy the same capability.

A second principle we follow is to focus as much as possible on standard software, architectures and configurations. This means that our solutions can easily be replicated elsewhere, such as in your own datacenter. This can present more of a challenge when designing for a cloud environment, which is why we provide cloud-ready solutions for various types of scalability, but it also frees you from being tied to a particular cloud.

[Figure: lock-in details]

In the end, there may not exist a zero lock-in option. In fact, certain kinds and degrees of lock-in are probably unavoidable and are actually tolerable. The point is that the lock-in question is an important consideration to take into account when choosing among different cloud computing alternatives, and it’s equally important to keep the differences among cloud layers in mind when you decide what you’re willing to live with. All clouds are not created equal, and all clouds do not create equal lock-in. The key is to know the implications of your cloud choices.
