r46 - 2014-03-02 - 20:55:13 - KenyonRalph
NTP users are strongly urged to take immediate action to ensure that their NTP daemon is not susceptible to use in a reflected denial-of-service (DRDoS) attack. Please see the NTP Security Notice for vulnerability and mitigation details, and the Network Time Foundation Blog for more information. (January 2014)

See KnownOsIssuesDev for discussion of this topic.

9.2. Known Operating System Issues

9.2.1. Lost Interrupts

A system can miss timer interrupts in several ways. All of them cause trouble for timekeeping.

9.2.1.1. Scheduler HZ too high

See below for more discussion of several cases.

9.2.1.2. Disk drivers using non-DMA

Some early Linux distributions shipped with DMA for IDE disks disabled by default. Heavy disk activity would then provoke lost interrupts.

See man hdparm for info on how to change this setting.

http://www.megapathdsl.net/~hmurray/hacks/read.c is a program that generates lots of disk activity to test this case.

9.2.2. Xen, VMware, and Other Virtual Machine Implementations

The NTP server was not designed to run inside a virtual machine. It requires a high-resolution system clock whose clock interrupts are serviced with a high level of accuracy. An NTP client is OK to run in some virtualization solutions.

Run NTP on the base OS of the machine, and then have your various guest OSes take advantage of the good clock that is created on the system. Even that may not be enough, as there may be additional tools or kernel options that you need to enable so that virtual machine clients can adequately synchronize their virtual clocks to the physical system clock.

9.2.2.1. VMware

VMware recommends that you use NTP instead of VMware Tools to sync the time (see the KB article below). You may also sync the time with VMware Tools, a separately installed service that is always highly recommended. Older kernel versions may need boot-time kernel options such as "clock=pit nosmp noapic nolapic" (see the VMware KB article linked below for distribution- and version-specific recommendations), although you may need to experiment to see which particular options work best for you.

Once you've got these changes in place, if and only if you want to sync with the tools, you will need to set "tools.syncTime" to "true" in the vmx file or set the checkmark "Synchronize guest time with host" in VM options. See also the VMware knowledgebase article Clock in a Linux Guest Runs More Slowly or Quickly Than Real Time. Note that tools will only sync the time forward, not backwards, so using NTP from within the guest is probably the preferable method. The guest will also sync the time from the hosts when it boots or is VMotioned, so the host clock should sync to the same source as the guests.

Do NOT set the guest to sync with BOTH NTP and Tools, only one or the other.
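
The "one or the other" rule can be checked mechanically. The following is a minimal sketch, assuming the .vmx file uses the usual key = "value" line layout; the function names and the sample text are illustrative and not part of any VMware tool.

```python
import re

# Matches lines of the form: key = "value" (the usual .vmx layout; assumed here).
VMX_LINE = re.compile(r'^\s*([A-Za-z0-9_.:-]+)\s*=\s*"(.*)"\s*$')


def tools_sync_enabled(vmx_text):
    """Return True if tools.syncTime is set to true in the given .vmx text."""
    enabled = False
    for line in vmx_text.splitlines():
        match = VMX_LINE.match(line)
        if match and match.group(1).lower() == "tools.synctime":
            enabled = match.group(2).strip().lower() == "true"
    return enabled


def check_single_time_source(vmx_text, guest_runs_ntpd):
    """Return a warning if both Tools sync and ntpd are active, else None."""
    if tools_sync_enabled(vmx_text) and guest_runs_ntpd:
        return "tools.syncTime is true and ntpd is running: disable one of them"
    return None


# Hypothetical .vmx content, for illustration only.
sample = 'displayName = "guest1"\ntools.syncTime = "TRUE"\n'
print(check_single_time_source(sample, guest_runs_ntpd=True))
```

Whether ntpd is running in the guest has to be determined separately (for example from the guest's process list); the sketch simply takes it as a parameter.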

When running VMware with RedHat Linux, there are some additional things that need to be done. I quote:

    With VMware 2.5.x and RHEL4 you need to go into the MUI under Advanced
    Options and set "Misc.TimerHardPeriod" to 333.  The default value is
    1000.  You always have to set the host rate faster than the guest.
    That's pretty much the culprit behind all the clock problems is that
    single setting.  By default the host isn't able to keep up with the
    guests requests, thus the guests lose time.

Note that the above quote is for a very old (year 2006?) version of ESX and RHEL and may not apply today (2012).

See also: http://www.vmware.com/pdf/vmware_timekeeping.pdf

See: http://kb.vmware.com/kb/1006427 for VMware recommendations on kernel options on various Linux distributions and their versions.

9.2.2.2. Xen

It appears that Xen just passes time-related system calls to the underlying master domain, and does not require any additional changes to support time sync into the guest domains.

9.2.2.3. Final notes

If your management or your client insists on running an ntpd instance inside of a VM client, even in the face of all the above information, there is a solution. Simply add a "noselect" keyword to each of the "server" definitions, and your VM client ntpd will monitor the defined upstream servers, but it won't actually try to sync with any of them -- leaving that job to the copy of ntpd running on the base hardware, as defined above. This will allow your client applications to confirm that they have good time sync, through the use of the ntpq program and by looking at the offsets reported, while avoiding the problems of actually trying to set the system time on the VM client.
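
As an illustration of the "noselect" approach, here is a small sketch that appends the keyword to every "server" line of an ntp.conf text. The helper name and the host names in the usage note are made up; review the result by hand before installing it.

```python
def add_noselect(conf_text):
    """Append 'noselect' to every 'server' line that does not already have it."""
    out = []
    for line in conf_text.splitlines():
        # Keep any trailing comment separate so the keyword lands before it.
        code, sep, comment = line.partition("#")
        words = code.split()
        if words and words[0] == "server" and "noselect" not in words:
            code = code.rstrip() + " noselect"
            line = code + (" " + sep + comment if sep else "")
        out.append(line)
    return "\n".join(out) + "\n"
```

For example, add_noselect("server ntp1.example.net iburst\n") yields "server ntp1.example.net iburst noselect\n". Lines that are not "server" definitions, including commented-out ones, pass through unchanged. After restarting ntpd, ntpq can still be used to look at the reported offsets.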

However, do keep in mind that the kinds of additional/alternative kernel options you need to enable good time sync within your virtualization system may interfere with the proper operation of certain other types of programs. In that case, you need to make a decision -- do you run those applications under virtualization without good clock sync, or do you run them on a separate non-virtual machine that does have good clock sync?

Our thanks to Seph and Doug Hanks.

-- BradKnowles - 22 Feb 2007

9.2.3. Windows and Sun's Java Virtual Machine

Sun's Java Virtual Machine needs to be started with the -XX:+ForceTimeHighResolution parameter to avoid losing interrupts.

See http://www.macromedia.com/support/coldfusion/ts/documents/createuuid_clock_speed.htm for more information.

9.2.4. Linux

9.2.4.1. Kernel 2.4 (and Earlier)

9.2.4.1.1. Using a Local Refclock

  • First, you need to make sure that the PPSKit mods have been applied to your kernel. See PPSKit Implementation Status to see which is the right version of the kit for your system.
  • Second, if you're still having problems, make sure that the HZ= setting in your kernel configuration is set to 100. Some newer systems have come with this value set to 1000 instead, and that has tended to cause a lot of problems for some people by losing too many interrupts.

9.2.4.1.2. Without a Local Refclock

  • The PPSKit mods may not be necessary if you do not have a locally attached refclock (presumably over a serial line).
    • The issue seems to be primarily one of losing interrupts over a serial line that is very sensitive to delays.
  • You may still find that the PPSKit mods will make your ntpd server considerably more accurate and precise, even without a local refclock, due to the decrease in lost interrupts.
    • If you have a poorly performing ntpd which is not keeping good time on your system, you should seriously consider applying the PPSKit mods, or confirming that they are already applied, before you start assuming more serious hardware problems.

9.2.4.2. Kernel 2.6

9.2.4.2.1. Using a Local Refclock

  • The PPSKit mods have not been ported to kernel 2.6. There is currently no clear indication that the functionality provided by these mods has been subsumed into kernel 2.6.
  • You still have the same HZ= issue as shown above for kernel 2.4.
    • This is a bigger issue with kernel 2.6, since many distributions based on 2.6 are shipping with HZ= defaulting to a value of 1000.
  • Kernel 2.6 is still having problems working correctly with APIC and ACPI on many machines. You may need to disable APIC and/or ACPI at boot time before loading the OS, in order to get anything remotely resembling decent timekeeping.
  • See also the Dev Issues topic LinuxImplementationLinuxPPS
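
To find out which HZ value a 2.6 kernel was built with, you can look at its build configuration. The sketch below assumes the configuration text is available (commonly in /boot/config-<version>, which is a distribution convention, not a guarantee) and that it contains a CONFIG_HZ= line; the function names are illustrative.

```python
import re


def configured_hz(kernel_config_text):
    """Return the CONFIG_HZ value from kernel config text, or None if absent."""
    match = re.search(r"^CONFIG_HZ=(\d+)\s*$", kernel_config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def hz_note(hz, recommended=100):
    """Return a short note comparing the configured HZ with the recommended one."""
    if hz is None:
        return "CONFIG_HZ not found in this configuration"
    if hz > recommended:
        return "HZ=%d is above %d; lost interrupts are more likely" % (hz, recommended)
    return "HZ=%d is at or below %d" % (hz, recommended)


# Illustrative configuration fragment.
sample_config = "CONFIG_HZ_1000=y\nCONFIG_HZ=1000\n"
print(hz_note(configured_hz(sample_config)))
```

The default of 100 for the comparison is the value recommended above for kernel 2.4; pass a different value if another rate suits your system.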

9.2.4.2.2. Without a Local Refclock

  • If you do not have a local refclock, you may find that kernel 2.6 works adequately for you, once any HZ= and APIC/ACPI issues are dealt with.
    • Otherwise, stick with kernel 2.4 until these issues have been resolved.

9.2.4.2.3. Lost ticks causing clock instability

From http://gossamer-threads.com/lists/linux/kernel/494604

In 2.6, some code has been added to watch for "lost ticks" and increment the jiffies counter to compensate for them. A "lost tick" is when timer interrupts are masked for so long that ticks pile up and the kernel doesn't see each one individually, so it loses count.

Lost ticks are a real problem, especially in 2.6 with the base interrupt rate having been increased to 1000 Hz, and it's good that the kernel tries to correct for them. However, detecting when a tick has truly been lost is tricky. The code that has been added (both in timer_tsc.c's mark_offset_tsc and timer_pm.c's mark_offset_pmtmr) is overly simplistic and can get false positives. Each time this happens, a spurious extra tick gets added in, causing the kernel's clock to go faster than real time.

9.2.4.2.4. A problem with the Reiser file system

The addition of the Reiser file system to the kernel caused a problem with ntpd. It was unable to stay synchronized, losing more than 10 minutes per day if allowed to run freely. The stock 2.6.18 kernel from CentOS 5 had no problem. When the kernel interrupt rate (HZ) was reduced from 1000 to 250, the problem was solved. Apparently the Reiser FS produces enough interrupts to break the kernel clock at 1000 Hz. This occurred on a machine with a 2.4 GHz Intel Core Duo CPU.
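
To put "10 minutes per day" into the unit ntpd works with, convert it to parts per million (PPM). The short calculation below is only a worked example; the 500 PPM figure is ntpd's frequency correction limit, and the function name is illustrative.

```python
NTPD_MAX_FREQ_PPM = 500  # largest frequency error ntpd will correct


def drift_ppm(seconds_lost, interval_seconds):
    """Express a clock error over an interval in parts per million."""
    return seconds_lost / interval_seconds * 1_000_000


# 10 minutes lost over one day, as in the Reiser FS report above.
reiser_case = drift_ppm(10 * 60, 24 * 3600)
print(round(reiser_case))               # 6944
print(reiser_case > NTPD_MAX_FREQ_PPM)  # True
```

At roughly 6900 PPM the clock error is more than ten times what ntpd can compensate for, which is why it could not stay synchronized.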

9.2.4.2.5. Running ntpd without root privileges

The Linux Capabilities mechanism allows ntpd to drop all root privileges, except for the one it actually needs (the privilege to set the system clock). How to use this feature:

  • You need the Default Linux Capabilities in your kernel, either as a module (modprobe capability), or statically (under Security Options in the kernel configuration menu)
  • You need a working(!) version of libcap.so (http://www.kernel.org/pub/linux/libs/security/linux-privs)
    If you get cap_set_proc(): failed to drop root privileges errors after a kernel upgrade, you may need to recompile this library!
  • ntpd must be configured with --enable-linuxcaps
  • ntpd must be started as root, but with a -u argument to give it a non-root user id to switch to
  • Optionally, you can use the -i argument to additionally chroot ntpd (in fact, -i without -u should also work: ntpd will then run chrooted, with user id 0 but without root privileges, but this is not recommended)

You can verify your setup by looking at /proc/<PID>/status: For ntpd running without privileges, it should contain the lines

   CapInh: 0000000002000000
   CapPrm: 0000000002000000
   CapEff: 0000000002000000      

while for a root shell, you should see

   CapInh: 0000000000000000
   CapPrm: 00000000fffffeff
   CapEff: 00000000fffffeff

9.2.4.2.5.1. A problem with IPv6 interfaces after chroot

The ifiter_ioctl interface iterator reads IPv6 interface names from /proc/net/if_inet6. If no proc filesystem is mounted in the chroot jail, ntpd drops all IPv6 interfaces after startup.

The easy choices are

  • don't use chroot
  • mount proc in the chroot directory
  • disable interface updates with -U 0. ntpd will not notice any new or dropped interfaces anymore.

It might also work to

  • change ifiter_ioctl to enumerate IPv6 interface by another method. IPv4 interfaces are enumerated through ioctl on a socket.
  • install libinet6 to enable getifaddrs()

9.2.4.2.5.2. Using rsyslog in chroot

rsyslog creates a new /dev/log device on every reload, thus chrooted daemons stop logging.

For a chrooted ntpd

  • create a dev/ subdirectory in the chroot base
  • create /etc/rsyslog.d/ntp.conf:
    $AddUnixListenSocket <yourchrootdir>/dev/log
  • restart rsyslog, then ntpd

9.2.4.2.5.3. Name resolution failures in chroot with getaddrinfo()

Since ntp-dev 4.2.7, ntpd uses getaddrinfo() to resolve hostnames, if available. This function requires several additional files in the chroot directory, otherwise host name resolution fails with error messages like "Servname not supported for ai_socktype (-8)" or "Name or service not known (-2)"

Copy or hard link:

  • /etc/services and /etc/resolv.conf to <yourchrootdir>/etc/
  • /lib/libnss_dns* and /lib/libresolv* to <yourchrootdir>/lib/
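
A quick way to confirm that the files listed above made it into the jail is to check for them before starting ntpd. This is a sketch only. The patterns mirror the list above, and on some systems the libraries live in a different directory (for example a 64-bit library directory), so adjust them as needed.

```python
import glob
import os

# Paths relative to the chroot base, taken from the list above.
REQUIRED_PATTERNS = [
    "etc/services",
    "etc/resolv.conf",
    "lib/libnss_dns*",
    "lib/libresolv*",
]


def missing_chroot_files(chroot_dir, patterns=REQUIRED_PATTERNS):
    """Return the patterns that match nothing below chroot_dir."""
    missing = []
    for pattern in patterns:
        if not glob.glob(os.path.join(chroot_dir, pattern)):
            missing.append(pattern)
    return missing
```

Calling missing_chroot_files("<yourchrootdir>") returns an empty list when every file is in place, and otherwise names what still has to be copied or hard linked.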

9.2.4.2.6. Using udev

Most Linux distributions use udev to manage /dev. To set up a symlink to a refclock device you need a udev rule like this one:

KERNEL=="ttyS0", SYMLINK+="refclock-0"

Very old versions of udev will need this instead:

NAME=="ttyS0", SYMLINK+="refclock-0"

The rule has to be defined after the rule for the device it links to. If your distribution supports udev rules spread over several files, put the refclock rules in a file of their own to ease maintenance.

9.2.4.2.7. Kernel 2.6 Mis-Detecting CPU TSC Frequency

Starting with Linux Kernel 2.6.18, the CPU's Time Stamp Counter is used to keep time, and when booting sometimes the Kernel mis-detects the frequency of this counter. This may result in severe clock drift which is impossible for ntpd to correct.

On some systems, it works well enough to run ntpd, but gets a slightly different calibration each time you boot. That may cause a startup transient when you reboot because the drift file will be off. The size of the transient depends upon your luck.

One solution to this problem is to change back to the old "acpi_pm" clock, which is what was used in earlier kernels. For example, in your grub.conf or /boot/grub/menu.lst file, you can set:

        clocksource=acpi_pm

The final configuration for a given kernel image should then read similar to this:

kernel   /vmlinuz-2.6.XXXX root=XXXX clocksource=acpi_pm ro quiet splash

Then reboot. A similar procedure is apparently possible with earlier versions of kernel 2.6, which use a "clock=" designation instead of "clocksource=".

Depending on your distribution and the update mechanism involved other locations for specifying the option are possible; have a closer look at menu.lst or grub.conf for that and consult the documentation. If you use LILO or another boot loader you will have to consult the documentation about how to specify boot options to the kernel. If in doubt check your boot loader config file(s) again after kernel updates - your new kernel might have been set up without the proper boot option.

Our thanks to Jordan Russell for locating and resolving this issue.

Additional notes:

At least for some AMD processors the TSC frequency depends on the core frequency, which can be changed dynamically according to the load. So power saving can actually influence the clock if the OS fails to closely monitor TSC speed changes; one symptom for this kind of problem is a system clock with a very high drift of a few seconds per hour.

You can use

cat /sys/devices/system/clocksource/clocksource0/available_clocksource

to find out what clock sources are available on your system; 'hpet' seems to be preferable to 'acpi_pm', which in turn seems to be better than 'pit'. Try the available ones in that order until you get a stable clock.
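
The preference order just described can be written down as a small selection helper. This is a sketch: the function name is illustrative, and the order is only the rule of thumb given above, so treat the result as the first candidate to try rather than a guaranteed best choice.

```python
# Rule-of-thumb order from the text: hpet, then acpi_pm, then pit.
PREFERENCE = ["hpet", "acpi_pm", "pit"]


def pick_clocksource(available_text, preference=PREFERENCE):
    """Pick the most preferred clock source from the available_clocksource text."""
    available = available_text.split()
    for name in preference:
        if name in available:
            return name
    return None


# Illustrative content of the available_clocksource file.
print(pick_clocksource("tsc hpet acpi_pm\n"))  # hpet
```

The chosen name is what you would pass as clocksource= on the kernel command line, as shown earlier in this section.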

9.2.4.3. Access control mechanisms causing "permission denied" errors

Access control mechanisms like AppArmor or SELinux have been developed to increase system security and thus prevent systems from being compromised. Those mechanisms control which resources may be accessed by which processes, and the controlled resources may include specific files or hardware devices.

Even if a system comes with a preconfigured ruleset or policy for the NTP daemon, that policy or ruleset may need to be extended if the NTP configuration is modified, e.g. to create additional log files, or to use a hardware clock device as reference time source.

If access to the desired resources is inhibited by the access control mechanism then in most cases "permission denied" errors can be found in the system log after the NTP daemon has started.

Please note that under kernels 2.6.x and newer the symbolic links for hardware devices (e.g. /dev/refclock-0) are re-created after every reboot. If this does not appear to happen, you must create a udev rule for it. See also 9.2.4.2.6. Using udev.

9.2.4.3.1. AppArmor

AppArmor is a security tool which has been developed by Novell and has made its way into the SuSE Linux/openSUSE distribution, and maybe also other distributions.
See: http://en.opensuse.org/AppArmor_Detail

AppArmor uses profiles to control which system devices and resources may be accessed by an application, allowing finer control than the standard Unix rights management. If an application tries to access a resource it has not been granted sufficient rights to then access is prevented, and a "permission denied" error occurs.

AppArmor is shipped with default profiles which work with the standard installation, but if an application's configuration is modified to use some non-standard configuration then the AppArmor profile has to be modified accordingly. This affects any application, not only ntpd.

The AppArmor profile for ntpd may require modification if refclock devices have been configured manually, or even if log files or statistics files shall be generated by ntpd.

To check whether a "permission denied" problem is related to AppArmor, you can temporarily stop AppArmor and see whether the problem persists.

If AppArmor shall be used it must be configured to allow access to the refclock device used by ntpd. Under SuSE Linux/openSUSE this may be done using the configuration tool, yast2. To add an entry for a refclock /dev/refclock-0 which points to /dev/ttyS0:

  Yast2 -> Novell AppArmor -> Edit Profile
  Select profile /usr/sbin/ntpd
  Add entry: /dev/ttyS0
  Mark allow for: Read, Write, Link

This generates a new entry in the AppArmor profile file:

  /dev/ttyS0   rwl

Similar changes may be required to allow log or statistics files to be generated by ntpd under AppArmor control.

9.2.4.3.2. SELinux

SELinux has been developed by the NSA, see:
http://www.nsa.gov/research/selinux/index.shtml
and often comes with RedHat/CentOS, Debian/Ubuntu, and maybe other distributions. It is more powerful than AppArmor but is also harder to configure. Fortunately SELinux itself provides a way to create a new profile from its own log messages.

The procedure described below has been tested on CentOS 5.2 / Red Hat Enterprise Linux 5 (RHEL5).

By default SELinux runs in "enforcing" mode, which inhibits access to resources that have not been explicitly configured. To find out which kinds of access must be granted to a process, SELinux can be temporarily switched to "permissive" mode, which does not inhibit access but logs every access that would be inhibited in "enforcing" mode:

  setenforce Permissive

Now start or restart the NTP daemon so it tries to access the required resources:

  service ntpd restart

Wait some time until the NTP daemon has opened all devices, created all log files, etc. The relevant log messages can then be found at the end of the SELinux logfile, and can be extracted using grep:

  grep ntpd /var/log/audit/audit.log > ntpd-audit.log

Finally set SELinux back to enforcing mode:

  setenforce Enforcing

The relevant log entries are now in our file ntpd-audit.log, and you may edit this file to see whether there are old/duplicate entries which can be removed.
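
Removing duplicate entries by hand gets tedious when ntpd has been restarted several times. The sketch below drops repeated denials from the extracted lines, treating two lines as duplicates when they differ only in the audit timestamp/serial and the process id. The regular expressions assume the common msg=audit(...) and pid=NNN fields; the sample lines in the usage note are invented for illustration.

```python
import re

AUDIT_STAMP = re.compile(r"msg=audit\([^)]*\):")
PID_FIELD = re.compile(r"\bpid=\d+")


def dedupe_audit_lines(lines):
    """Keep the first occurrence of each denial, ignoring timestamp and pid."""
    seen = set()
    kept = []
    for line in lines:
        # Blank out the fields that change between otherwise identical entries.
        key = PID_FIELD.sub("pid=", AUDIT_STAMP.sub("msg=audit():", line)).strip()
        if key and key not in seen:
            seen.add(key)
            kept.append(line)
    return kept
```

Given two lines such as 'type=AVC msg=audit(1.0:1): avc: denied { read } for pid=10 comm="ntpd" name="ttyS0"' and the same text with audit(2.0:2) and pid=20, only the first is kept. The result can be written back to ntpd-audit.log before running the commands below.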

The following three commands are required to parse the log entries and create a .pp file which can be supplied to SELinux. In the example below we assume the basename of the generated files is ntpd, so the final target file is ntpd.pp:

  audit2allow -m ntpd <ntpd-audit.log >ntpd.te
  checkmodule -M -m -o ntpd.mod ntpd.te
  semodule_package -o ntpd.pp -m ntpd.mod

If all the commands above have finished without error, the new profile can be installed and loaded:

  semodule -i ntpd.pp

After this has been done once, ntpd should run without "permission denied" errors.

9.2.5. Mac OS X

To enable ntpd on Mac OS X, open System Preferences... from the Apple menu, go to the Date & Time sheet, select the Date & Time sub-panel, and tick the check-box labeled Set Date & Time automatically. You can then select a time server from the drop-down, or fill in the name of your own preferred time server.

Note that every time you click the check-box, or change the server address, or perhaps just open the panel, the system will re-write your /etc/ntp.conf based on the information you have provided, then kill and restart the ntpd service.

  • In OS X 10.9 Mavericks, Apple includes a daemon called pacemaker that seems to conflict with ntpd (ntpq -p will show low reach and offsets will grow). See https://discussions.apple.com/thread/5604114 for some discussion. Pacemaker can be disabled with launchctl unload -w /System/Library/LaunchDaemons/com.apple.pacemaker.plist, which allows ntpd to work normally.
  • In version 10.5 ("Leopard" introduced in 2007) the menu replaces every "server" line at the top of /etc/ntp.conf, but stops at the first comment line, thus allowing you to add any configuration parameters you want. Synchronization succeeds, but on 5 machines I tested, ntp.drift never got updated, until I replaced /usr/sbin/ntpd (version 4.2.2 according to: strings /usr/sbin/ntpd | grep '^ntpd [0-9]') with e.g. 4.2.4 from ntp.org. Boot-time start is controlled by XML file org.ntp.ntpd.plist in /System/Library/LaunchDaemons/ . Unchecking the box in menu is equivalent to "defaults write /System/Library/LaunchDaemons/org.ntp.ntpd Disabled 1" . To manually update /var/db/ntp.drift , run "ntpdc -c loopinfo" after stabilizing (perhaps a couple of hours) and copy "frequency" number.
  • In the older versions 10.3 and 10.4, it will munge every line, appending "minpoll 12" (i.e. 4096 seconds between polls). At best this slow polling means that synchronization only succeeds for computers with very accurate clocks, and ntp.drift is never updated; at worst the syntax errors will prevent ntpd from starting. The parameter controlling boot-time startup is TIMECONFIG or TIMESYNC in /etc/hostconfig (toggled via menu). A boot-time script to replace /System/Library/StartupItems/NetworkTime/NetworkTime could be placed in e.g. /Library/StartupItems/NTPlocal/NTPlocal along with StartupParameters.plist . Modifications to the Mac OS X script risk being overwritten by system updates, and of course a configuration location other than /etc/ntp.conf should be specified; finally note that /var/run/ is a temporary directory that gets re-created at boot-time, and is unsuitable for ntp.drift .

9.2.6. Sun

9.2.6.1. Sun Device Drivers

9.2.6.1.1. su Driver

An issue with the Sun su driver has surfaced with respect to PPS support. Currently (200508) the su driver does not support PPS correctly in some configurations. Sun is working on a patch for that issue. For more information please refer to Bug #361.
Copyright © 1999-2014 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.