Leap Smear for NTPv4
Related Topics: UpdatingTheRefidFormat
Actual Implementation
Leap smearing was implemented in ntpd versions 4.2.8p3 and 4.3.47. More information about the actual implementation can be found in the README.leapsmear file in the distribution, or via BitKeeper:
https://bk1.ntp.org/ntp-stable/README.leapsmear?PAGE=anno
(Lightly hacked by Harlan)
It is pretty obvious that we need a way to shoehorn the current offset between a monotonic clock (probably TAI) and the reported timestamp into the packet, right?
Adding an extension field that gives this offset, probably in 32-bit (short) or 64-bit (full) NTP timestamp format, would be very good, but we only really need 24 bits or so: 4 bits for integer seconds and 20 bits for microsecond-level fractional data, right?
Using nothing but the current NTP packet format, that gives us a few options:
- The RefID field, where we've already determined the need for a way to stuff an IPv6 hash into 24+ bits, reserving one or a few leading byte values for this use. We can reserve a similar range for current TAI (or just UTC to save integer bits) offset.
- Stealing low-order bits from the reference timestamp, i.e. grabbing the bottom 24 bits here leaves 8 fractional bits, which is enough to locate the reference time to the nearest 4 ms or so (2^-8 s): I have never seen any NTP code which bothers about the fractional bits in the reference timestamp!
- Steal a few bits each from the Receive and Transmit timestamps: if we grab 12 bits here, the remaining 20 fractional bits are sufficient to handle microsecond precision.
My recommendation is a combination:
- Use a special RefID token (SMRx, with the x containing 6 base64-encoded bits).
- Tweak the Reference timestamp, grabbing 10 low-order bits, leaving 250 ns resolution.
- Take the bottom 8 bits from the Receive timestamp.
- Take the bottom 8 bits from the Transmit timestamp.
The total is 32 bits, sufficient for a signed fixed-point value in 8:24 format, i.e. +/- 127 seconds with 60 ns resolution.
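As a rough illustration of the 6+10+8+8 split, here is a minimal Python sketch. The order in which the four fields carry the smear bits (RefID character highest, Transmit timestamp lowest), the standard base64 alphabet, and all function names are assumptions made for this example; the proposal above does not fix any of them. Timestamps are treated as plain 64-bit integers (32-bit seconds, 32-bit fraction).

```python
# Hypothetical helper names; the bit order across the fields is an assumption.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def smear_to_fixed(smear_seconds: float) -> int:
    """Convert seconds to signed 8:24 fixed point, returned as an unsigned 32-bit word."""
    raw = round(smear_seconds * (1 << 24))
    if not -(1 << 31) <= raw < (1 << 31):
        raise ValueError("smear does not fit in 8:24 format")
    return raw & 0xFFFFFFFF

def fixed_to_smear(word: int) -> float:
    """Inverse of smear_to_fixed: interpret the word as two's complement 8:24."""
    if word & 0x80000000:
        word -= 1 << 32
    return word / (1 << 24)

def embed(word: int, ref_ts: int, rx_ts: int, tx_ts: int):
    """Spread the 32-bit word over RefID (6 bits) and the three timestamps (10+8+8)."""
    refid = ("SMR" + B64[(word >> 26) & 0x3F]).encode("ascii")
    ref_ts = (ref_ts & ~0x3FF) | ((word >> 16) & 0x3FF)  # low 10 bits
    rx_ts = (rx_ts & ~0xFF) | ((word >> 8) & 0xFF)       # low 8 bits
    tx_ts = (tx_ts & ~0xFF) | (word & 0xFF)              # low 8 bits
    return refid, ref_ts, rx_ts, tx_ts

def extract(refid: bytes, ref_ts: int, rx_ts: int, tx_ts: int):
    """Reassemble the word, or return None if the RefID is not an SMRx token."""
    if len(refid) != 4 or refid[:3] != b"SMR":
        return None
    x = B64.index(chr(refid[3]))
    return (x << 26) | ((ref_ts & 0x3FF) << 16) | ((rx_ts & 0xFF) << 8) | (tx_ts & 0xFF)
```

A client that knows the scheme would call extract() and then fixed_to_smear() on a reply; a client that does not simply sees slightly different low-order timestamp bits.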
Yes, the encoding of the fields will cause some cpu overhead, but not very significant, and the resulting packets will be indistinguishable from the current format, with no measurable effect on the control loops: We are currently adding random noise in those same lower bits, now we'll just make the bits non-random during the time smearing takes place.
It should be obvious that we can do this with even fewer bits, but the bits I've identified above should all be freely available!
TerjeMathisen - 2015-06-20
Per RefidFormat we could use a high-order 255 to mean an IPv6-encoded refid, and use a high-order 011110xxx to mean "Leap Smear", which would then give us 3+24 bits for the smear encoding, with no need to mess with other parts of the packet.
- REFID fields do not propagate, but I'm otherwise happy with a single reserved high byte, i.e. 254.x.y.z for a 24-bit offset indication. -- TerjeMathisen - 22 Jun 2015
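The 254.x.y.z variant could look something like the sketch below. It assumes the 24 bits hold a two's complement value with 20 fractional bits (following the "4 bits for integer seconds and 20 bits for fractional data" estimate earlier on this page); that layout, the constant names, and the function names are all assumptions for illustration, not anything that has been agreed.

```python
import struct

SMEAR_PREFIX = 254   # reserved high byte suggested in the comment above
FRACTION_BITS = 20   # assumption: 4 integer bits (signed) + 20 fractional bits

def encode_refid(offset_seconds: float) -> bytes:
    """Pack the smear offset into a 4-byte refid of the form 254.x.y.z."""
    raw = round(offset_seconds * (1 << FRACTION_BITS))
    if not -(1 << 23) <= raw < (1 << 23):
        raise ValueError("offset does not fit in 24 bits")
    return struct.pack("!I", (SMEAR_PREFIX << 24) | (raw & 0xFFFFFF))

def decode_refid(refid: bytes):
    """Return the smear offset in seconds, or None for an ordinary refid."""
    (word,) = struct.unpack("!I", refid)
    if word >> 24 != SMEAR_PREFIX:
        return None
    raw = word & 0xFFFFFF
    if raw & 0x800000:          # sign-extend the 24-bit value
        raw -= 1 << 24
    return raw / (1 << FRACTION_BITS)
```

With this layout the representable range is roughly -8 s to +8 s in steps of about 1 microsecond, which is far more range than a one-second smear needs.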
Notes and Questions
Stratum
For Terje's proposal, this mechanism works great for hosts providing S1 answers. What do we do about intermediate time servers at S2+?
SMRx translates to 83.77.82.x, which is allocated to somebody.
- During smearing, servers configured to smear should probably lie, i.e. claim to be S1 servers even if they are S2. -- TerjeMathisen - 22 Jun 2015
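The clash between an SMRx token and a real IPv4 address is easy to check: at stratum 2 and above, the same four refid bytes are read as an address. A small sketch (the function name is made up for this example):

```python
import ipaddress

def refid_as_ipv4(token: bytes) -> str:
    """Show how a 4-byte ASCII refid looks when interpreted as an IPv4 address."""
    return str(ipaddress.IPv4Address(token))

print(refid_as_ipv4(b"SMRA"))  # the first three octets are always 83.77.82
```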
Who gets smeared time?
Do we provide smeared time on all responses? Only to client time requests? Only to associations of greater stratum?
- Only clients, i.e. smearing only propagates downstream. -- TerjeMathisen - 22 Jun 2015
Precision
For Terje's proposal, should we make sure the precision reported in the packet is such that it's clear that the timestamps are only good to the point where we're stealing bits?
- This doesn't matter as long as we steal at most 12 bits: the remaining 20 bits correspond to microsecond precision, which is the best anyone can ever provide over a network. -- TerjeMathisen - 22 Jun 2015
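The arithmetic behind that answer can be checked quickly. The sketch below computes the timestamp resolution left after reusing some low-order fraction bits, and the corresponding log2-seconds exponent that the packet's precision field would need to carry if we did want to advertise the loss. The function names are hypothetical.

```python
NTP_FRACTION_BITS = 32  # fraction width of a 64-bit NTP timestamp

def remaining_resolution(stolen_bits: int) -> float:
    """Smallest time step, in seconds, still representable after reusing the low bits."""
    return 2.0 ** -(NTP_FRACTION_BITS - stolen_bits)

def precision_exponent(stolen_bits: int) -> int:
    """log2-seconds exponent matching the remaining resolution."""
    return -(NTP_FRACTION_BITS - stolen_bits)

for stolen in (8, 10, 12):
    print(stolen, remaining_resolution(stolen), precision_exponent(stolen))
```

Stealing 12 bits leaves 2^-20 s, a little under one microsecond, and stealing 10 bits leaves 2^-22 s, a little under 250 ns.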
Timestamps
The intent is that clients that don't know what we're doing will be getting smeared timestamps, and will therefore follow smeared time, correct? This means that the "believable" time bits will contain smeared time, and in Terje's proposal the bits we're stealing can be used to determine the amount of smear, correct? For Harlan's proposal, we can determine the amount of smear from the bottom 3 octets of the refid.
- Using REFID is fine as well, but it needs to be propagated if lower level clients should be able to determine that smearing is taking place. -- TerjeMathisen - 22 Jun 2015
Root Dispersion
Should the amount of smear be added to the packet's root dispersion? Harlan thinks so.
- This might cause a client to not steer its local clock until very late, since the root dispersion will always (or at least until the half-second point) be as large as the offset between client and server. -- TerjeMathisen - 22 Jun 2015
8:24
The amount of smear in the bits we're stealing will provide an 8:24 timestamp. Under what circumstances will the integer portion of the smear be non-zero?
- Never unless we want a mechanism to provide UTC<->TAI offset. We only need approximately 20 fractional bits maximum, 24 fractional bits will provide ~60 ns precision. -- TerjeMathisen - 22 Jun 2015
Deleting a leap-second
We've never had to delete a leap second before and it's likely we won't need to.
But it might happen, and while some may say it's fine to just jump the clock forward at that time, others may choose to smear that second, too, in order to have 86400 seconds on that day instead of 86399 seconds.
This means the smear will be negative.
How should we account for this possibility?
- This should work transparently, and AFAIK is already handled by the algorithm Google posted: the smear offset's sign depends on the direction of the leap event. -- TerjeMathisen - 22 Jun 2015
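To illustrate how the sign can fall out of the leap direction, here is a minimal sketch assuming a cosine-shaped ramp over a fixed window, similar in spirit to what Google described publicly. The exact curve, the window handling, the sign convention, and the function name are assumptions for this example and are not taken from any particular implementation.

```python
import math

def smear_offset(t: float, window: float, leap: int) -> float:
    """Accumulated smear in seconds at time t after the smear interval starts.

    leap is +1 for an inserted leap second and -1 for a deleted one, so the
    sign of the result follows the direction of the leap event.
    """
    if leap not in (-1, 1):
        raise ValueError("leap must be +1 (insert) or -1 (delete)")
    if t <= 0:
        return 0.0
    if t >= window:
        return float(leap)
    # Cosine ramp from 0 to 1 across the window, scaled by the leap direction.
    return leap * (1.0 - math.cos(math.pi * t / window)) / 2.0
```

A deleted leap second simply mirrors the inserted case, so an 8:24 signed encoding of the smear covers both without any special handling.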