Load balancing in general is a complicated process, but there’s some secret sauce in managing DNS along with multiple load balancers in the cloud. It requires that you draw from a few different sets of networking and “cloudy” concepts. In this second article in my best practices series (my first post covered how to use credentials within RightScale for storing sensitive or frequently used values), I’ll explain how to set up load balancers to build a fault-tolerant, highly available web application in the cloud.
Here’s what you’ll need:
- Multiple A records for a host name in the DNS service of your choice
- Multiple load balancers to protect against failure
Before I explain how the two work together, let’s check out how each of them works individually.
Multiple A Records for a Host Name
A records translate friendly DNS names to an IP address. For example, when you type rightscale.com in your browser, behind the scenes your computer is asking a DNS server to translate the name to an address.
I’m working from a Mac and the process is a little different for Windows-based machines, so check out more on nslookup for *nix and Windows, respectively. I’m also using one of Google’s public DNS servers to perform my lookups. Check out the request below and note that when I query DNS, I’m getting a single address back for my test domain of dnsdemo.cloudlord.com:
My test domain has one A record associated and it resolves to the IP noted.
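If you'd rather script the lookup than run nslookup by hand, here is a minimal Python sketch. It is an illustration, not what I ran for the figures: it asks the operating system's configured resolver (not Google's public DNS specifically), and the function name `resolve_a_records` and its `resolver` parameter are my own hypothetical names.

```python
import socket


def resolve_a_records(hostname, resolver=socket.getaddrinfo):
    """Return the unique IPv4 addresses for hostname, in resolver order.

    `resolver` defaults to socket.getaddrinfo, which uses the system's
    configured DNS; it is a parameter only so the lookup can be swapped
    out (for example, in a test).
    """
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve_a_records("dnsdemo.cloudlord.com")` would return a list with one entry for a host that has a single A record, and several entries for a host with multiple A records.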
Let’s check out a more complicated example — Google.com:
Note that as shown in Figure 3 below, Google returned six addresses (at the time I queried it) because Google has six A records registered to serve its main domain. When I ran the same nslookup query again, the same IP addresses were returned in a different order. This reordering of answers between queries is commonly referred to as “DNS load balancing.”
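The reordering can be sketched in a few lines of Python. This is a toy model of a round-robin name server under my own assumptions, not a description of how Google's (or any particular provider's) DNS is implemented; the class name `RoundRobinZone` and the addresses (taken from a documentation-only range) are hypothetical.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS server that rotates A records between answers."""

    def __init__(self, records):
        self._records = deque(records)

    def answer(self):
        """Return all records, then rotate so the next query starts one later."""
        response = list(self._records)
        self._records.rotate(-1)  # move the first record to the end
        return response


zone = RoundRobinZone(["203.0.113.1", "203.0.113.2", "203.0.113.3"])
first = zone.answer()   # starts with 203.0.113.1
second = zone.answer()  # same addresses, now starting with 203.0.113.2
```

Every answer contains the full set of addresses; only the order changes, which is what steers different clients toward different first choices.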
Figure 4 below shows DNS load balancing in action using dnstest.cloudlord.com and a test file that indicates which server is being served up. For this example, I set up my dnstest.cloudlord.com domain with two A records. Note that this first request has one attempt and the content reads that “this is server 1.”
Next, I terminated the first server (to force a failed state) and repeated the request; the results are shown in Figure 5 below. Note that there’s a timeout on the first IP, and then the attempt against the second IP goes through without any issue. You’ll also notice that it’s returning a response of “this is server 2.”

Figure 5 – Curl request to test domain with primary A record in failed state (note timeout and new IP)
The order in which IP addresses are returned varies by the DNS server and provider used but often follows a round-robin or geographically specific algorithm.
The idea here is that different clients will get differently ordered lists of IP addresses corresponding to your domain name. This has the effect of distributing requests across the group of IPs. If an IP address does not respond in an appropriate amount of time, the client will time out on that request and move on to the next IP address until it finds a valid connection or the list is exhausted. Not every client behaves this way, but most modern browsers follow this retry process, as does curl (shown above in Figure 5).
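The retry process described above can be sketched as follows. This is a simplified illustration of the client-side behavior, not the actual logic of curl or any browser; the function name `connect_with_failover`, the default timeout, and the `connect` parameter are assumptions of mine.

```python
import socket


def connect_with_failover(addresses, port, timeout=3.0,
                          connect=socket.create_connection):
    """Try each IP in order; return (ip, connection) for the first that works.

    `connect` defaults to socket.create_connection and is a parameter only
    so the network call can be replaced when experimenting.
    """
    errors = []
    for ip in addresses:
        try:
            conn = connect((ip, port), timeout)
            return ip, conn
        except OSError as exc:  # covers timeouts and refused connections
            errors.append((ip, exc))
    # Every address in the list failed: the list is exhausted.
    raise ConnectionError(f"all {len(addresses)} addresses failed: {errors}")
```

With two A records and the first server down, a call such as `connect_with_failover(ips, 80)` waits out the timeout on the first IP and then connects to the second, which matches the behavior in the curl test.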
There are a few things to remember though:
- DNS failover doesn’t provide any additional features such as “sticky sessions” for your application.
- Upstream DNS caching is unpredictable — client DNS providers may or may not respect your TTL settings.
- This isn’t a replacement for TCP load balancing: because of the upstream DNS caching noted above, you can’t precisely control how traffic is split across the IPs.
Multiple Load Balancers for Redundancy and Scalability
With multiple IP addresses routing to your deployment, each of these addresses can terminate at a load balancer that serves your back-end application (see Figure 6 below). Doing this, you’ll be able to present multiple endpoints to the public to serve your application (I’ll get back to why this is important in a minute).
In Figure 7 below, I go a step further and illustrate how connectivity to the application layer can be set up from multiple TCP load balancers. This allows you to have multiple incoming connections each serving up the same content, providing a redundant load balancing layer as well as a redundant application layer.

Figure 7 – Connection diagram for multiple load balancers connecting redundantly to the same application server tier
DNS Load Balancing: Bringing It All Together
The end result is that by using DNS load balancing, you can achieve a fairly rough balance of traffic between multiple TCP load balancers, which in turn distribute load across your application servers at a more granular level:

Figure 8 – Full incoming connection diagram showing multiple load balancers with their own IP address
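To see how the two tiers combine, here is a small Python simulation. It is an idealized sketch under my own assumptions: DNS hands out load balancers in strict rotation (real-world DNS caching makes the split much rougher), each load balancer uses plain round-robin, and the names `LoadBalancer`, `simulate`, `lb1`, and `app1` are hypothetical.

```python
from collections import Counter
from itertools import cycle


class LoadBalancer:
    """Toy TCP load balancer that round-robins across a shared app tier."""

    def __init__(self, name, app_servers):
        self.name = name
        self._backends = cycle(app_servers)

    def route(self):
        """Return the app server that handles the next request."""
        return next(self._backends)


def simulate(requests, balancers):
    """Send requests through DNS rotation, then through the chosen balancer."""
    dns = cycle(balancers)  # idealized DNS round-robin across load balancers
    lb_hits, app_hits = Counter(), Counter()
    for _ in range(requests):
        lb = next(dns)
        lb_hits[lb.name] += 1
        app_hits[lb.route()] += 1
    return lb_hits, app_hits


apps = ["app1", "app2", "app3"]
lb_hits, app_hits = simulate(12, [LoadBalancer("lb1", apps),
                                  LoadBalancer("lb2", apps)])
```

In this idealized run, each load balancer receives half of the requests and each application server ends up with an equal share. Both load balancers point at the same application tier, so losing one of them removes an entry point without removing any application capacity.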
This is a great way to protect against failure and increase overall throughput, giving you the ability to scale for high availability and high performance. For more information on metrics related to configuration and throughput on HAProxy in the cloud, check out this white paper, Load Balancing in the Cloud: Tools, Tips and Techniques.
Setting up DNS load balancing can be a bit of a hassle, but the Load Balancer with HAProxy ServerTemplate, along with scripts for application servers to attach to load balancers, simplifies the process. The RightScale ServerTemplate™ and scripts use a tag-based, managed solution that will keep your HAProxy config files synchronized and that will automate the deployment, registration, and detachment process for all servers involved. To use the ServerTemplate for setting up DNS load balancing, sign up for a free RightScale trial.






Hey Patrick,
Nice post! You’ve raised some great issues, and the multi-tier scheme you’ve illustrated in figure 8 is a solid design template. Just two quick observations:
1) Regarding TCP load balancer failure, while it’s true that most browsers do have reliable “time out and move on” behavior, that can still result in a suboptimal client experience. An alternative to consider might be a DNS load balancer that does active healthchecking of the TCP lb’s in the first place, to avoid clients having to time out at all.
2) With a DNS load balancer in place that does that healthchecking, it’s possible to deliver single “A” records to clients, instead of multiples. And it’s then possible to maintain a “sticky” association between the client and its TCP load balancer instance. For delivering apps that are more stateful, this is an invaluable tweak.
Cheers,
Jon Braunhut
Chief Scientist
KEMP Technologies
Hi Jon,
I certainly agree with you on the sub-optimal client experience and I don’t think that the goal is to stay in the ‘failed’ state for a long period of time, rather to keep the site/service up and running while recovery is underway. I also worked this out using an existing DNS provider; I generally don’t set up DNS load balancers for cloud deployments, but it’s definitely something worth checking out. My main concern about a single ‘A’ record is that once that server fails, there needs to be a plan to manage the failure of the DNS load balancer, so the other goal of two ingress IP addresses (in this case) was to mitigate the risk of either one of them going down. Setting up a redundant DNS infrastructure across disparate zones/cloud regions or providers could also be an option.
Best,
Patrick
Hi Patrick,
I’m with you on both points. I don’t think many folks are (or should be) engineering for an extended “failed” state. And if someone is going to pursue the single “A” record approach, it’s that much more important to have redundant DNS infrastructure. But most people with serious deployments will already have planned for DNS redundancy, so there’s little marginal cost to going down this road.
Cheers,
Jonathan