Failing-over DNS Server Addresses

1. Problem Summary

The current implementation of NTP uses DNS to look up the IP addresses for each server name in the configuration file. However, it copies only the first IP address and discards the rest. For servers in the pool (pool.ntp.org), for example, DNS can return quite a large number of addresses, but only the first is used. This is a problem if the NTP server at that address is unavailable, has been removed, or is removed at some later time. The NTP reference implementation never goes back to DNS to look up the name again, and it no longer has the list of IP addresses it previously received, so it keeps retrying the one IP address.

2. Proposed Solution

To fix this problem we will implement the following:
  1. Save all returned IP addresses from the lookup in a structure associated with the server name so that the NTP server can try the next IP address in the returned list if the first one fails.
  2. If there are no more entries in the IP address list, run a new DNS query and get a fresh list.
  3. Allow the configuration to specify how many IP addresses to use simultaneously, avoiding the need to specify the same server name multiple times, which could potentially result in getting the same IP address list.

3. Design Issues

In order to implement this, a number of issues need to be decided. Because of the architecture of NTP, it is expected that packets will be lost during the usage of the server. Indeed, UDP is meant for such situations.
  1. For an IP address that has never responded to an NTP packet, how many times should an NTP packet be sent before the client gives up and tries a different server?
  2. For a server that has been responding to NTP packets, after it stops responding, how many retries of NTP packets should be made before it stops trying and moves on to the next IP address in the list?

Since the TTL of the IP addresses returned by DNS is not easily available to the client, we will not rely on it to decide whether or not we can use the IP addresses.

4. Implementation

In order to address these issues, there will be a number of configurable variables available to be set in the configuration file. The following is a partial list of parameters that will be available to be set:
  1. InitialFailoverRetryLimit
  2. ServerFailureRetryLimit
  3. MaxAddresses

There will be defaults for these in case the configuration files do not specify them. The default values have yet to be decided.

The code will perform a DNS lookup of the name and save a copy of all the IP addresses returned in an association list with the name. The first address on the list will be used to try to form an association.
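As a minimal sketch of the lookup step just described, the following collects every address that getaddrinfo() returns for a name instead of keeping only the first. The struct addr_list type and the addr_list_resolve() and addr_list_free() helpers are hypothetical names introduced for illustration and are not part of ntpd. The call shown here blocks, so a real implementation would have to run it outside the main ntpd loop.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical container: every address returned for one server name. */
struct addr_list {
    struct sockaddr_storage *addrs; /* copied socket addresses */
    size_t count;                   /* number of entries in addrs */
    size_t next;                    /* index of the next untried entry */
};

/*
 * Resolve `name` and copy every returned address into `out`.
 * Returns 0 on success, or a getaddrinfo() error code that can be
 * passed to gai_strerror(). This call blocks while resolving.
 */
int addr_list_resolve(const char *name, struct addr_list *out)
{
    struct addrinfo hints, *res, *ai;
    size_t n = 0;
    int rc;

    assert(name != NULL && out != NULL);
    out->addrs = NULL;
    out->count = 0;
    out->next = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;    /* accept both A and AAAA results */
    hints.ai_socktype = SOCK_DGRAM; /* NTP runs over UDP */

    rc = getaddrinfo(name, "123", &hints, &res); /* 123 is the NTP port */
    if (rc != 0)
        return rc;

    for (ai = res; ai != NULL; ai = ai->ai_next)
        n++;

    out->addrs = calloc(n, sizeof(*out->addrs));
    if (out->addrs == NULL) {
        freeaddrinfo(res);
        return EAI_MEMORY;
    }

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        if (ai->ai_addrlen > sizeof(out->addrs[0]))
            continue; /* defensive: does not happen for IPv4 or IPv6 */
        memcpy(&out->addrs[out->count++], ai->ai_addr, ai->ai_addrlen);
    }

    freeaddrinfo(res);
    return 0;
}

/* Release the copied addresses and reset the list to empty. */
void addr_list_free(struct addr_list *list)
{
    free(list->addrs);
    list->addrs = NULL;
    list->count = 0;
    list->next = 0;
}
```

A caller would form the first association with addrs[0] and advance the next index each time an address is given up on.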

The peer association will need to keep track of both whether or not it's successfully received a packet and whether or not it's stopped receiving packets. It will also need to reset an address-specific not-received count every time it receives a response from the server.

The code will track the number of consecutive times that the server does not respond. When the number of consecutive failures exceeds the limit the association is demobilised and the next unused address is fetched from the previously fetched list of addresses. If there are no more unused addresses in the list then it will perform another DNS lookup to fetch a new address list and start again.

Note that using IP addresses for the server effectively disables the mechanism since only that address can be used. Setting any of the above variables to 0 will disable the failover mechanism that the variable controls.
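The failure tracking described in this section can be sketched as a small state machine. This is only an illustration under stated assumptions: the type and function names are hypothetical, and the mapping of InitialFailoverRetryLimit to addresses that have never responded and ServerFailureRetryLimit to addresses that have stopped responding is inferred from the parameter names and the design issues above, not confirmed by the proposal. No default values are assumed; the caller supplies the limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limits supplied by the caller; field names mirror the parameters above. */
struct failover_limits {
    unsigned initial_retry_limit; /* assumed: InitialFailoverRetryLimit */
    unsigned failure_retry_limit; /* assumed: ServerFailureRetryLimit */
};

enum failover_action {
    KEEP_ADDRESS, /* keep polling the current address */
    NEXT_ADDRESS, /* demobilise and use the next saved address */
    REQUERY_DNS   /* saved list exhausted: look the name up again */
};

struct failover_state {
    size_t addr_count;   /* addresses saved from the last lookup */
    size_t current;      /* index of the address in use */
    unsigned missed;     /* consecutive polls without a response */
    bool ever_responded; /* has the current address ever answered? */
};

/* A response arrived: reset the consecutive not-received count. */
void failover_on_response(struct failover_state *st)
{
    st->ever_responded = true;
    st->missed = 0;
}

/* A poll went unanswered: decide whether to stay, move on, or re-query. */
enum failover_action failover_on_timeout(struct failover_state *st,
                                         const struct failover_limits *lim)
{
    unsigned limit;

    assert(st != NULL && lim != NULL && st->current < st->addr_count);
    limit = st->ever_responded ? lim->failure_retry_limit
                               : lim->initial_retry_limit;
    if (limit == 0)
        return KEEP_ADDRESS; /* a limit of 0 disables this failover case */

    st->missed++;
    if (st->missed <= limit)
        return KEEP_ADDRESS; /* the limit has not been exceeded yet */

    /* Limit exceeded: start afresh on another address. */
    st->missed = 0;
    st->ever_responded = false;
    if (st->current + 1 < st->addr_count) {
        st->current++;
        return NEXT_ADDRESS;
    }
    st->current = 0;
    return REQUERY_DNS;
}
```

When REQUERY_DNS is returned, the caller would perform a fresh lookup, update addr_count, and start again from the first address of the new list.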

-- DannyMayer - 05 Aug 2007


Steve Atkins writes:

There are some problems with the proposed solution that would risk causing operational problems if they were done.

Some DNS servers will randomize the order they return a response in (in order to work around buggy applications that only use the first response) but some don't.

If the response isn't randomised, then every NTP server accessing a peer in the pool will preferentially connect to the first A record returned, until that system falls over or degrades service sufficiently that clients think it's failed. Then they'll try the second one. This isn't good load balancing. (In practice quite a lot of DNS servers randomize the results, so you may not care. But why implement a bad algorithm when it's as easy to implement a good one?)

A better algorithm is to use gethostbyname() (or one of its more modern friends) to get all the A records for the hostname. Try connecting to those in random order, until you decide you have a working peer. If none of them are live, start over.

Just as good, better in some respects, is to use gethostbyname(), pick one response at random, and try to connect. If the peer doesn't seem live, start over. That requires less state, and is slightly more responsive to DNS changes.

The first approach extends to trying multiple connections in parallel a little more easily (shuffle the results, attempt to connect to the first few in parallel, repeat).
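A minimal sketch of the shuffle step suggested above, assuming the resolved addresses are held in an array and visited through a permutation of their indices. The function names are hypothetical. The shuffle is a Fisher-Yates shuffle driven by the standard rand() function, which the caller must seed with srand(); rand() is adequate for spreading load but is not a cryptographic source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Return a uniformly distributed index in [0, n) using rand().
 * Rejection sampling avoids the bias of a plain rand() % n.
 */
static size_t random_below(size_t n)
{
    unsigned long range = (unsigned long)RAND_MAX + 1UL;
    unsigned long limit, r;

    assert(n > 0 && n <= (unsigned long)RAND_MAX);
    limit = range - (range % n); /* largest multiple of n within range */
    do {
        r = (unsigned long)rand();
    } while (r >= limit);
    return (size_t)(r % n);
}

/*
 * Fill order[0..n-1] with a random permutation of 0..n-1.
 * Visiting the address array in this order tries each address
 * exactly once, in random sequence.
 */
void shuffled_order(size_t *order, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        order[i] = i;

    for (i = n; i > 1; i--) {
        size_t j = random_below(i); /* 0 <= j < i */
        size_t tmp = order[i - 1];

        order[i - 1] = order[j];
        order[j] = tmp;
    }
}
```

For the parallel variant, a caller could take the leading entries of order as the addresses to try at the same time and fall back to the remaining entries as those fail.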

Danny Mayer replies:

Most DNS servers do not randomize. BIND, which has something like 90% of the market, uses a rotating order by default (the behavior can be changed). Randomizing after receipt might not be a bad idea once the code is working. Getting things working is the main problem.

gethostbyname() is not used anywhere in the code except in the one place where we need to emulate getaddrinfo() for systems without the real function. We couldn't support IPv6 without getaddrinfo(). More importantly, we cannot wait on the resolver: resolution has to be done asynchronously, and that is the biggest problem, since on Unix at least it means running a separate process and having that process communicate with ntpd. That is done today, but the problem is the way it communicates with ntpd, using private mode 7 packets to configure the server. That will have to change.

-- DannyMayer - 21 Dec 2007

Ronald F. Guilmette says:

In my (rarely humble) opinion, the "Problem Summary" given above fails to fully list all of the real issues here.

Most particularly, I think that the presentation of the "Problem" fails to consider one potentially important issue at all.

I am speaking of load balancing.

Although I do not count myself as a serious or hard-core DNS expert, I am nonetheless a diligent and careful user of DNS, and one who has tried to understand even its less common modes of usage. And one thing that I was taught about DNS (by people more knowledgeable than I) some long time ago is that in cases where a single domain name is associated with more than one `A' type record, that multiple address association is often, usually, and customarily done with the intent being that clients of the service(s) associated with the relevant host name should try, whenever possible, to treat all of the associated IP addresses as being essentially equal priority redundant servers providing the exact same service, and ones that should, ideally, be utilized by the client(s) in a round-robin fashion, so as to distribute the client load among this set of multiple redundant servers.

This method of distributing load is certainly used in conjunction with web (http) service, and also, I believe, with mail (smtp) service. Why should NTP services be any different?

In short, I think that any NTP client that does a DNS query for a given host name (i.e. one designating an NTP server) should, ideally, if it gets back multiple A records, cache all of those A records (along with their respective TTLs) and should, at each and every point in time, treat all of the associated A records that are still "live" (according to their TTLs) as being part of a load-balancing round-robin pool of IP addresses for the relevant server... a pool which the client should make an effort to balance/distribute the load among.

Regards, rfg

P.S. The issue of how/when to declare a given NTP server IP address as "non-functioning" or "non-responding" is almost completely separate from, and orthogonal to, the issue of load balancing that I've mentioned above, except for the fact that the simplest way to handle a non-responding IP address might be to simply delete it from the current round-robin pool for the relevant NTP server... at least until such time as the client does the next DNS lookup on the server hostname. (It is a more complex question of policy as to when, exactly, a fresh DNS query should be performed for a given NTP server hostname. In my own view, the NTP client should perform a fresh DNS query for the (server) hostname whenever the size of the current round-robin pool of IP addresses associated with that server name drops to zero, i.e. because all IP addresses in the pool have been eliminated, either due to TTL expiry of the A records, or because the associated server(s) has/have been deemed "non-responding" by the NTP client.)

Danny Mayer replies:

One of the goals of this is load balancing of servers listed in the DNS records. Steve Atkins's suggestion of randomizing the list of resultant IP addresses is a good one once the service is working. One doesn't want to do this during development since you need to check your results.

While you may not be a DNS expert, I am, having been part of the BIND9 development team. BIND9 at least will by default rotate the order of the returned list of records each time it's asked, so the first record is usually different. Note however that resolving DNS servers are intermediaries and you may go through several layers of DNS servers to get an answer, and each has its own idea of the order in which records are returned.

HTTP servers need A and AAAA records and follow the above rules. SMTP servers use MX records and the rules are different, since the MX records contain priority orders for contacting the SMTP servers for the domain. Also, the contacts with those servers are short-lived (relatively speaking). This is not true of NTP, which wants a long-term association with the NTP server.

The goal of this design is to cache all of the A and AAAA records returned. However, it cannot get the TTL values of these records using standard function calls like getaddrinfo(), since that information is not returned. The only way to get that information is to write our own DNS packets (using libbind, for example) and do everything ourselves. That's just not realistic. In any case there is a more subtle problem: even if we were able to get the TTL and store it (timestamp of receipt + TTL), because this is NTP and we are disciplining the clock, a jump in the system clock makes those TTLs invalid. I much prefer the solution I outlined in the Implementation section above.

-- DannyMayer - 21 Dec 2007
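The clock-jump concern raised in the reply above can be illustrated with a small worked example. This is a sketch with hypothetical names, assuming the expiry is stored as wall-clock receipt time plus TTL, as the reply describes. The times in the usage note are arbitrary illustrative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical cache entry: expiry kept as wall-clock receipt time + TTL. */
struct cached_record {
    time_t expires;
};

/* Record the expiry for a record received at wall-clock time `now`. */
struct cached_record cache_store(time_t now, long ttl_seconds)
{
    struct cached_record r;

    assert(ttl_seconds >= 0);
    r.expires = now + (time_t)ttl_seconds;
    return r;
}

/* The record counts as live while the wall clock is before the expiry. */
bool cache_is_live(const struct cached_record *r, time_t now)
{
    return now < r->expires;
}
```

Suppose a record with a TTL of 300 seconds is stored when the wall clock reads 1000000. Sixty real seconds later the record should still be live, but if the clock has meanwhile been stepped forward by an hour, cache_is_live() sees 1003660 and reports it expired. Conversely, 600 real seconds after receipt the record should have expired, but if the clock was stepped back by an hour, cache_is_live() sees 997000 and still reports it live.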

Topic revision: r5 - 21 Dec 2007, DannyMayer