Defining Achievement
Load Balancer Bugs
"Vendors don't always get things perfect when writing code for network appliances."
I love a good reason to read RFCs. Nothing pleases me more than learning exactly why and how the Internet works. On this occasion, I got the opportunity to learn something about DNS because of a broken piece of network gear. I'm not Paul Vixie, but I've learned a lot about DNS over the years. It's always exciting to learn more about a technology you (think you) know everything about.
"Your DNS servers in New York, Orlando, and Phoenix are broken, but the ones in San Diego, Chicago, and Redwood City are fine." Installation coordinators are my favorite.
"Okay," I said. "What makes you think they're broken?"
"Well, the engineer in the network provisioning group said so. The global server load balancers for customer X are working fine in some cities, but not in the others."
"Okay, but I'd like to point out that we have thousands of customers in all of those locations that are working fine. If there were something wrong with the DNS servers, wouldn't it be affecting everyone, and not just this one customer?" My logic seemed flawless. "Let me talk to the provisioning engineer and we'll get back to you."
My first thought was that the load balancers weren't configured properly in the "broken" locations. After verifying with my own eyes that the configs were the same everywhere, I had to accept the possibility that the DNS servers were, in fact, the root of the problem.
While my previous point was correct -- we did have plenty of functioning customers in all of the centers mentioned -- something about the break pattern bothered me. All of the sites that were failing had recently had their DNS servers upgraded from BIND version 8 to 9. What could have changed between these two versions of code that only affected a new customer and none of the existing ones? Time to get a better idea of how the load balancers were relying on the DNS servers to perform global load distribution.
Every device I've used to distribute traffic between geographically diverse clusters relies on some form of DNS trickery. Usually this entails delegating authority for a particular DNS zone to the load balancers, which themselves implement a basic level of DNS server functionality to tell clients where to send their traffic. Instead of adding a lot of code to their network OS to implement a basic authoritative DNS server, this vendor chose to take a simpler approach by proxying the DNS queries to a real authoritative DNS server. Had they done this right, it would have been a good choice in my opinion. Simple is always better, and I don't see why most vendors would greatly increase the complexity of their code with DNS server functionality.
The only problem was that none of the servers accused of being broken were authoritative DNS servers. How could servers that weren't responsible for authoritative DNS answers be responsible for breaking this customer's install? After speaking with the provisioning engineer, it turned out that what caused him to think they were broken was an inability to resolve the globally load balanced zone. As a standard part of the provisioning process, he'd been testing the resolution of the load balanced records from different sources within our network. When he tested from a source located in New York, the query failed. From San Diego it worked just fine. When he tried to resolve against other providers' DNS servers, he'd get failures in some cases and success in others. This was back in the day when most DNS administrators hadn't disabled the "version.bind" TXT resource record in the CHAOS class. This made it possible to confirm that every recursive server running BIND 9 was unable to resolve the record, while those running BIND 8 could. Something was different in the way these two versions of BIND were trying to resolve this zone, and whatever it was had to be the answer.
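For readers who haven't seen the version trick on the wire, here is a minimal sketch of what such a query looks like. It only builds the packet; the function name, the fixed query ID, and the choice to leave all header flags clear are assumptions made for illustration, not a record of the tooling used at the time.

```python
import struct

TYPE_TXT = 16      # TXT resource record type
CLASS_CHAOS = 3    # CHAOS (CH) class

def build_query(name, qtype, qclass, query_id=0x1234):
    """Build a single-question DNS query in wire format (hypothetical helper)."""
    # Header: ID, flags (all clear), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, qclass)

packet = build_query("version.bind", TYPE_TXT, CLASS_CHAOS)
```

Sent over UDP to port 53 of a server that still answers it, a query like this typically comes back with a TXT string naming the server's version.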
I could either start reading every change log between the latest versions of BIND 8 and 9, or I could take a more direct, and hopefully less time-consuming, approach. This was definitely a job for tcpdump. What became immediately apparent from the packet captures was the use of something called EDNS in all of the failed queries originating from BIND 9. The successful queries from BIND 8 servers were not using EDNS. I had never heard of these Extension Mechanisms for DNS. Research time!
The particulars of this mechanism are described in RFC 2671. As with everything authored by Mr. Vixie, this document is informative and concise. The standard defines an extension mechanism for DNS so that enhancements and improvements can be added over time. The smart thing about these DNS guys is that they don't make changes that break things. If the presence of EDNS was breaking things, it wasn't going to be BIND's fault. What could the load balancer be doing wrong with these EDNS queries that would cause them to fail? Suppose something that doesn't implement the newer standard gets an EDNS-laden query from something that does. What's the right way to handle this? Thankfully the RFC's author tells us exactly what's supposed to happen:
5.3. Responders who do not understand these protocol extensions are expected to send a response with RCODE NOTIMPL, FORMERR, or SERVFAIL. Therefore use of extensions should be "probed" such that a responder who isn't known to support them be allowed a retry with no extensions if it responds with such an RCODE.
How's that for clear? If you're a DNS server, or in this case acting like one, and you get an EDNS query you don't understand, you have three valid options to respond with: NOTIMPL, FORMERR, or SERVFAIL. Any of these answers will result in a query retry without EDNS. Follow the rules and you'll get another chance to do your job. Unfortunately, the DNS proxy code in this load balancer wasn't following the rules. Instead of responding with any of the three approved answers, this device was answering with a REFUSED RCODE. When the BIND 9 servers got this answer to their EDNS queries they didn't try again, which is exactly what an earlier RFC, 1035 in this case, tells them to do.
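The probing rule from section 5.3 can be sketched in a few lines. This is an illustration of the decision only, not BIND's actual code; `send_query` is a hypothetical callback standing in for the real transport, and the function names are invented here.

```python
# RCODE values from the DNS specifications
NOERROR, FORMERR, SERVFAIL, NOTIMP, REFUSED = 0, 1, 2, 4, 5

# The three RCODEs that RFC 2671 section 5.3 expects from a responder
# that does not understand the extensions
EDNS_FALLBACK_RCODES = {FORMERR, SERVFAIL, NOTIMP}

def should_retry_without_edns(rcode, sent_edns):
    """Retry plain only if the query used EDNS and got a fallback RCODE."""
    return sent_edns and rcode in EDNS_FALLBACK_RCODES

def resolve(send_query, name):
    """send_query(name, use_edns) -> rcode is a hypothetical transport callback."""
    rcode = send_query(name, True)              # probe with EDNS first
    if should_retry_without_edns(rcode, True):
        rcode = send_query(name, False)         # retry with no extensions
    return rcode
```

With this logic a responder that answers REFUSED never gets the second, plain query, which matches the failure seen from the BIND 9 servers.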
Thankfully, when a vendor makes such a big mistake and you have excerpts from RFCs to back you up, a fix is soon to follow. We even managed to get the customer installed by their due date. More importantly, I defended some innocent DNS servers from being thought of as broken, and learned even more about one of my favorite protocols.
-ksp
I was in the industry at the time smurfing became all the rage. I was still pretty inexperienced, though, and didn't completely grasp the gritty details. I knew enough to understand the impact, basic concept, and that everyone on the good team made a concerted effort to fix the problem. Pretty soon you couldn't run a "show conf" without seeing "no ip directed-broadcast" somewhere near the tippy-top of the output.
This experience didn't involve broadcast address traffic or even the ICMP protocol, but we did discover a network OS bug that allowed for fun and simple packet amplification. It all started when a colleague noticed a strange traffic pattern on a server load balancer's Cricket graph. When the SLB should have been doing next to nothing, the graphs showed an extremely high packet rate. More curiously, the rate of ingress packets was exactly half that of the egress packets. What confused everyone, though, was that none of the real servers behind the SLB were doing anything with the traffic. Not only was this odd, but the one-for-two traffic ratio was a clear sign that something was definitely wrong.
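The pattern on the graph is easy to check for mechanically. Below is a minimal sketch of flagging a two-for-one egress ratio from a pair of packet-rate samples; the function names and the 5% tolerance are assumptions for illustration and have nothing to do with how Cricket itself works.

```python
def amplification_ratio(ingress_pps, egress_pps):
    """Egress packets sent per ingress packet received."""
    if ingress_pps <= 0:
        raise ValueError("ingress rate must be positive")
    return egress_pps / ingress_pps

def looks_like_doubling(ingress_pps, egress_pps, tolerance=0.05):
    """True when egress is roughly twice ingress (tolerance is an assumed value)."""
    return abs(amplification_ratio(ingress_pps, egress_pps) - 2.0) <= tolerance
```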
Because we're suspicious and untrusting of our equipment, we started with the assumption that the right kind of wrong traffic directed at the load balancer could entice it to respond with two packets for every request. If the device had been sophisticated enough to provide some kind of traffic sampling mechanism, or if we'd had an upstream entity that could, guessing wouldn't have been necessary. Unfortunately, neither was the case, so we needed to be smarter than the broken gear. Enter hping: better than sliced bread and twice as filling. Command line utilities that allow arbitrarily customized packets to be thrown at a wire should be everyone's favorite toy. Half an hour of blind testing later, we learned something. Send a TCP packet to this device's VIP address with the FIN flag set alongside the SYN, PSH, ACK, or URG flag, and enjoy the equivalent of a buy-one-get-two sale. Spoof the source address on the inbound packet and you've got yourself a nifty traffic doubler to point at any target you like. After all, why use all of your botnet to inundate your enemies with DoS traffic when you can use half and get the same result? It also helps if your partner in crime (my unwitting employer) has multiple geographically diverse nodes with multiple OC48 connections to handle all your nefarious attack traffic.
-ksp