[This post was updated with sections on the monitoring and auto-scaling services]
Announced late last year, Amazon tonight launched load balancing, monitoring, and auto-scaling for the Elastic Compute Cloud (EC2). These features have been requested many times by EC2 users and with this release Amazon continues to show that it listens and responds to feedback. Read Jeff Barr’s description on the AWS blog and Werner Vogels’ backgrounder on his blog.
At RightScale we’ve been experimenting with a preview version of the new services for a while. We’re pretty excited because they allow us to offer new features and more choices to our customers. In particular, we’ll integrate the load balancing with what we already have: it can be used as an alternative to our HAProxy-based setup or in combination with it for more flexibility. For more complex web sites, and for SSL sites in particular, a more application-specific load balancing layer behind Amazon’s will usually be required.
The new monitoring service will provide additional data sources to our users, as well as the ability to aggregate across many servers. The service can collect data at the hypervisor level and provides a very versatile storage back-end. We’re planning to source data from it in our graphing front-end and to integrate the data into our alerting and escalation system. At scale, Amazon’s monitoring service by itself costs half of what all of RightScale costs, monitoring included, so we offer great value as an integrated solution.
Finally, the auto-scaling service is something many users have found lacking in EC2: we’ve often heard from people looking at EC2 for the first time, “you mean Amazon doesn’t automatically launch more instances when my app is overloaded?” Amazon now has a badly needed answer to that question. However, unless we’re missing something, it adds nothing beyond our current offering, though we’ll keep listening to what our customers tell us. We believe the most difficult part of auto-scaling isn’t the actual launching of servers but lining up all the configuration management and lifecycle management so the new servers go into production successfully, and that dynamic runtime self-configuration is where RightScale really shines.
Let’s take a closer look at the new features introduced tonight, starting with the load balancing. It is now possible to allocate a load balancer and have it distribute requests across multiple EC2 instances running in multiple availability zones within one region. (An availability zone is roughly equivalent to a datacenter; the two current regions are US East and EU West.) The interface to the service is pretty simple. You create a load balancer and define which ports and protocols it should process requests for. Then you launch instances and add them to the load balancer. You also define a health check that the load balancer uses to probe the instances and ensure that they’re operating. With that, the load balancer is in operation and starts passing incoming requests through to healthy instances.
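To make the flow concrete, here is a rough sketch of the three setup steps expressed as the parameters each API call takes. The field names mirror the load balancing API’s structure, but every value (the balancer name, health check path, instance IDs) is hypothetical:

```python
# Sketch of the three setup steps, expressed as the parameters each
# ELB API call takes (names mirror the API; all values hypothetical).

# 1. CreateLoadBalancer: a name, the zones to cover, and the listeners.
create_load_balancer = {
    "LoadBalancerName": "www-lb",
    "AvailabilityZones": ["us-east-1a", "us-east-1b"],
    "Listeners": [
        {"Protocol": "HTTP", "LoadBalancerPort": 80, "InstancePort": 80},
    ],
}

# 2. ConfigureHealthCheck: probe instances with GET /alive, expect 200 OK.
configure_health_check = {
    "LoadBalancerName": "www-lb",
    "HealthCheck": {
        "Target": "HTTP:80/alive",   # or e.g. "TCP:3306" to just check a port
        "Interval": 30,              # seconds between probes
        "Timeout": 5,
        "UnhealthyThreshold": 2,     # failed probes before taking an instance out
        "HealthyThreshold": 3,       # successful probes before putting it back in
    },
}

# 3. RegisterInstancesWithLoadBalancer: add running instances to the rotation.
register_instances = {
    "LoadBalancerName": "www-lb",
    "Instances": [{"InstanceId": "i-12345678"}, {"InstanceId": "i-87654321"}],
}
```

Instances can later be removed from the rotation with the matching deregister call, without interrupting operation.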
The load balancing service is designed to serve as a first level of distributing load across a number of instances, dealing specifically with DNS and handling the failure of an availability zone. Most sophisticated web sites will need an additional level of load balancing that is more customizable and more application specific, for example to map portions of the URL space to different back-end services or to optimize the handling of persistent sessions.
Some details of the load balancing service:
- It supports HTTP and TCP, meaning that it load balances HTTP at the request level and provides TCP switching for all other protocols. In particular, this means it does not terminate HTTPS; instead, HTTPS must be balanced at the TCP level and terminated on the user’s instances.
- It can listen on ports 80, 443, and 1024 through 65535, which means it cannot be used to load balance many standard protocols that use ports below 1024. I’m not sure why this restriction exists, but it’s perhaps an indication that the service is primarily geared towards web sites.
- The health checks can either issue an HTTP GET request to a specific URL and check for a 200 OK response, or they can open a TCP connection to an arbitrary port and check that the connection is accepted.
- Servers can be added to or removed from the load balancer rotation without interrupting operation, and the load balancer can be queried for the status of instances according to the health check.
- The load balancing occurs in two stages: first a client is directed to a specific availability zone using DNS, and then it is directed to an available instance in that zone. The zone selection is equal-weight, which means one had better run an equal number of instances in each zone or the instances in one zone will end up with a higher load than those in the others.
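The equal-weight zone selection in the last point is easy to quantify. A minimal sketch, assuming DNS splits traffic evenly across zones regardless of how many instances each zone runs:

```python
# Why equal-weight zone selection penalizes unbalanced zones: DNS sends
# an equal share of traffic to each zone, no matter the instance count.
def per_instance_load(total_rps, instances_per_zone):
    """Requests/sec each instance sees, per zone, under equal-weight DNS."""
    zone_share = total_rps / len(instances_per_zone)
    return [zone_share / n for n in instances_per_zone]

# Hypothetical: 1000 req/s, 4 instances in zone A but only 2 in zone B.
per_instance_load(1000, [4, 2])  # zone A: 125 req/s each, zone B: 250 req/s each
per_instance_load(1000, [3, 3])  # balanced zones: ~167 req/s everywhere
```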
We’re currently planning to support the load balancing service at multiple levels. We’ll enable our server templates to use Amazon’s load balancing both instead of and in addition to our own. For simple, highly scalable HTTP services Amazon’s can be used on its own, but for more complex configurations a second level of load balancing is needed. In particular, for SSL sites a back-end load balancing layer after SSL termination is often required.
The CloudWatch monitoring service is really a special storage engine designed for time series data. On one end, data collected periodically from servers and from other services is pumped into the monitoring store; at the other end, clients can run queries against the store to extract data from it. This means that while CloudWatch is not a complete monitoring system, it is the central storage piece to which all the other pieces would interface.
On the data input side, CloudWatch is very limited at the moment. There are seven metrics per server that the virtual machine host injects into CloudWatch, and four metrics for each load balancer instance. Not very exciting yet, but don’t be fooled: this is just the beginning. Amazon will add more and more metrics and also provide an API for inputting custom metrics. At that point it becomes really interesting!
On the data output side, the store offers a number of ways to query the data. The result of a query is always an array of data points over time. What’s interesting is that one can get much more than just the raw data points back out, and that’s where CloudWatch shines. For example, it is possible to retrieve the max CPU utilization across a number of servers as a time series. It’s unclear how flexible this aggregation will end up being, because initially the way to name the servers of interest is somewhat limited, but we’ll find out.
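As an illustration, the max-CPU query just described might be parameterized roughly like this. The field names approximate the GetMetricStatistics call’s parameters, and the dimension used to select the servers of interest is a guess, since the naming options are still limited:

```python
from datetime import datetime, timedelta

# Approximate shape of a GetMetricStatistics query: max CPU utilization
# across a group of servers, as a 1-minute time series over the last hour.
# Field names approximate the API's parameters; values are hypothetical.
end = datetime(2009, 5, 18, 12, 0)
query = {
    "Namespace": "AWS/EC2",
    "MetricName": "CPUUtilization",
    "Statistics": ["Maximum"],     # aggregate: max across matching servers
    "Period": 60,                  # 1-minute resolution, the minimum
    "StartTime": end - timedelta(hours=1),
    "EndTime": end,
    "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "www-group"}],
}
```

The result comes back as an array of (timestamp, value) data points, one per 60-second period.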
Some other characteristics of the service:
- data is retained for 2 weeks, so one had better extract it and save it somewhere else for longer-term comparison and trending
- the smallest data resolution is 1 minute, anything input more frequently gets aggregated automatically at minute boundaries
- the service costs $0.015 per server hour and there is no per-query charge
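The per-server-hour pricing is easy to turn into a monthly figure. A quick back-of-the-envelope sketch:

```python
# CloudWatch cost at $0.015 per server-hour, for a 30-day month.
def monthly_cloudwatch_cost(servers, rate_per_hour=0.015, hours=24 * 30):
    """Total monthly monitoring cost in dollars for a fleet of servers."""
    return servers * rate_per_hour * hours

monthly_cloudwatch_cost(1)    # one server: about $10.80/month
monthly_cloudwatch_cost(10)   # ten servers: about $108/month
```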
Overall, CloudWatch looks like a very promising service that will really gain momentum when many more metrics can be input. We’re still on the fence about whether we should modify our graphing and alerting to pull data from CloudWatch directly or whether we should pull data from CloudWatch continuously and re-store it in our monitoring system. In either case, we’ll make the data available to our customers.
The auto-scaling service is something that a lot of first-time users of EC2 have been missing. Everyone expects Amazon to automatically scale a user’s resources, since auto-scaling is what is most often quoted in connection with the cloud. Never mind that it doesn’t really make much sense, since Amazon just provides compute boxes and doesn’t have any information about the application a user is running or how one would make it scale. It’s like the UPS guys arriving at your doorstep with a bunch of Dell boxes: “here, we noticed you need more, I’ll unbox and plug them in for you.” I wish auto-scaling were that simple!
The Auto-Scaling API reflects the fact that a lot of set-up is required for auto-scaling to work. You have to define not just the auto-scaling behavior but also what to launch, captured in a “LaunchConfiguration” data structure which includes the image to launch, security group, SSH key, instance size, kernel id, ramdisk id, block device mappings and user data. If I counted correctly, you get to specify a total of 27 parameters to make auto-scaling work.
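A sketch of the fields the text lists, laid out as a dictionary. The names follow the LaunchConfiguration structure in the Auto-Scaling API, but every value here is hypothetical:

```python
# The "what to launch" half of auto-scaling setup: a LaunchConfiguration.
# Field names follow the Auto-Scaling API; all values are hypothetical.
launch_configuration = {
    "LaunchConfigurationName": "www-launch-v1",
    "ImageId": "ami-12345678",            # the image to launch
    "InstanceType": "m1.small",           # instance size
    "SecurityGroups": ["www"],
    "KeyName": "deploy-key",              # SSH key
    "KernelId": "aki-12345678",
    "RamdiskId": "ari-12345678",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
    ],
    "UserData": "opaque boot-time data",  # passed to the instance at launch
}
```

The scaling behavior itself (group size limits, zones, triggers) is defined separately and references this configuration by name, which is where the rest of the parameters come in.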
What’s really missing for the auto-scaling to fit in is all the context information that would provide much of the configuration. The servers being auto-scaled don’t operate in a vacuum; they work in concert with other servers. For example, most of our customers have two “base servers” that are not auto-scaled. They often have some extra functions, like acting as front-end www servers, but otherwise they’re configured just like the additional auto-scaled servers. The config of those base servers and the auto-scaled ones has a lot in common, so it’s nice to be able to set all this up in one place for the whole deployment, and not individually for each server and for the auto-scaling too. The same goes for other deployment-wide information, from the web site name to the credentials for accessing the database: all of that is context that can be shared across all servers in a RightScale deployment.
What becomes even more painful is that any change to what’s running on the servers, like a bug fix, requires creating a new image and relaunching the servers. This is where our ServerTemplates, which build the server at boot time from a base image, are a lot more flexible. You can make small changes to the server template, test them on an individual server on the side, and then slide the template into the scaling. A neat alternative is being able to run a script that updates existing servers on the fly. In some cases you want a new server array that gradually scales up and takes over the load while the other scales down; other times you just want to run a quick script on the existing servers to patch up a minor config detail. You really do want the whole toolbox so you can pick the right tool for the job.
The good news is that the auto-scaling service itself is free. However, it requires that all launched instances use the CloudWatch monitoring service which costs $0.015 per instance hour.
I won’t repeat what I wrote at the beginning of the blog entry, but it’s great to see Amazon continue innovating at a breakneck pace! At a feature level, what they’ve introduced tonight overlaps with RightScale’s features more than any other part of their offering does. Our focus is to provide an integrated solution on top of their array of infrastructure services. In particular, we have long had support for dynamic configuration management and advanced automation. More recently we have embraced portability across other cloud providers and even hybrid public/private clouds.