Eight more years of leap-second problems loom as governments punt decision to 2023
Just as adding an extra day in leap years helps us keep our calendars in step with the orbit of the earth around the sun, adding occasional leap seconds to Coordinated Universal Time (UTC) allows us to keep this time reference in step with the earth's gradually slowing rotation. Without adjustment, there would be about a minute's difference between the two by 2100.
Leap seconds are great if you're using your time reference to note exactly when the sun should be directly overhead, or when certain stars should be in view, but for keeping a bunch of servers or Internet routers in sync around the world, continuity matters more than your place in the universe.
For that purpose, a continuous reference timescale with no leap seconds would be easier to live with -- as thousands of unlucky sysadmins and network admins were reminded at the end of June, when the last leap second was added. More than 2,000 networks crashed around the globe.
Proposals to create such a continuous scale were first made to the World Radiocommunication Conference (WRC) in 2012. WRC is a meeting of the International Telecommunication Union, convened every three or four years to review and occasionally modify the Radio Regulations, of which the definition of UTC and other time references is a part. The ITU is a United Nations body, so WRC votes are cast by delegates of national governments.
The 2012 conference, after some debate, voted to postpone the decision, calling on the ITU to conduct some feasibility studies and on WRC-2015 to "consider the feasibility of achieving a continuous reference time-scale, whether by the modification of UTC or some other method."
On Thursday, delegates considered deeply and decided that the decision would be better put off a little longer -- not just (clang) until the next WRC in 2019 but until (clang, clang) the one after, in 2023 -- while the ITU conducts further studies into the impact of tinkering with the definition of UTC.
There will be no shortage of expert opinion for delegates to consider in 2023: The ITU has invited an alphabet soup of organizations to contribute, including the IMO, ICAO, CGPM, CIPM, BIPM, IERS, IUGG, URSI, ISO, WMO and IAU.
Meanwhile, sysadmins will have to deal with the effects of a new leap second every time that changes in the earth's rotation threaten to put UTC more than 0.9 seconds out of step with mean solar time.
What exactly those effects are depends on where you work and what hardware and software you're running, because although the definition of when the leap second occurs is quite precise, the way computers take it into account varies greatly.
Computers keep time by counting the number of seconds since the "epoch" -- midnight on Jan. 1, 1970 for most systems, although some use other years, such as 1900, as the base. Times before the epoch are negative; those after, positive. Given the number of seconds since the epoch, computers can determine the date and time by repeated division by the number of seconds in a day, days in a year and so on. There's a simple rule for identifying leap years (although Microsoft Excel famously gets it wrong by treating 1900 as a leap year) but handling leap seconds requires computers to look things up in a table. There have been 26 leap seconds since 1972, and the table will only get longer with the passing years.
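To make that arithmetic concrete, here is a minimal sketch in Python of the division approach, the leap-year rule and a leap-second lookup. The function names are invented for illustration, the sketch handles only times after a 1970 epoch, and the table is deliberately truncated to three entries (the leap seconds at the end of 2008, mid-2012 and mid-2015); it is not how any particular operating system does it.

```python
import bisect
import calendar

SECONDS_PER_DAY = 86_400
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def is_leap_year(year):
    # Gregorian rule: divisible by 4, except century years not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def from_epoch(seconds):
    """Turn seconds since 1970-01-01 00:00:00 into (Y, M, D, h, m, s).

    Leap seconds are ignored: every day is assumed to last 86,400 seconds.
    """
    if seconds < 0:
        raise ValueError("this sketch only handles times after the epoch")
    days, rem = divmod(seconds, SECONDS_PER_DAY)
    hour, rem = divmod(rem, 3600)
    minute, second = divmod(rem, 60)
    year = 1970
    while days >= (366 if is_leap_year(year) else 365):
        days -= 366 if is_leap_year(year) else 365
        year += 1
    month = 1
    for length in DAYS_IN_MONTH:
        if month == 2 and is_leap_year(year):
            length = 29
        if days < length:
            break
        days -= length
        month += 1
    return (year, month, days + 1, hour, minute, second)


# Illustrative, truncated table: the epoch count of the midnight that comes
# right after each of three leap seconds. A real table lists every one.
LEAP_TABLE = sorted(
    calendar.timegm((y, m, d, 0, 0, 0))
    for (y, m, d) in [(2009, 1, 1), (2012, 7, 1), (2015, 7, 1)]
)


def leap_seconds_between(start, end):
    """Count table entries that fall in the interval (start, end]."""
    return bisect.bisect_right(LEAP_TABLE, end) - bisect.bisect_right(LEAP_TABLE, start)


def elapsed_seconds(start, end):
    # Real elapsed time is the naive difference plus any leap seconds
    # that the epoch counter did not count.
    return (end - start) + leap_seconds_between(start, end)
```

Under these assumptions, is_leap_year(1900) is False, which is the case Excel handles differently, and two epoch counts taken 20 seconds apart on either side of the June 2015 leap second are really 21 seconds apart.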
Leap seconds are supposed to occur at the end of the day (UTC) on which they are added, so the sequence of time will run 23:59:58, 23:59:59, 23:59:60, 00:00:00, 00:00:01. If the leap second occurred at local midnight for everyone, things might not be so bad, but midnight UTC is in the middle of the business day in Asia and on the West Coast of the U.S.
The recommended way of dealing with the leap second is to stop the clock for a second, but on a computer, that's not practical: The computer and its clock will keep running and have to be jumped back, with the same second seemingly occurring twice -- a chance for high-frequency stock traders in Asia and California to make -- or lose -- a fortune.
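The repeated second is easy to see in a short Python sketch. It assumes an epoch count in which every day has exactly 86,400 seconds, and it relies on CPython's calendar.timegm doing plain arithmetic on the fields it is given; the helper names are invented for illustration. Which value actually repeats on a live system depends on the implementation.

```python
import calendar
from datetime import datetime


def epoch_count(label):
    """Map a UTC clock label (Y, M, D, h, m, s) to an epoch count.

    The count assumes 86,400-second days, so it has no spare number
    to give to the label 23:59:60.
    """
    return calendar.timegm(label)


# The sequence of clock labels around the leap second of June 30, 2015.
LABELS = [
    (2015, 6, 30, 23, 59, 58),
    (2015, 6, 30, 23, 59, 59),
    (2015, 6, 30, 23, 59, 60),  # the leap second itself
    (2015, 7, 1, 0, 0, 0),
    (2015, 7, 1, 0, 0, 1),
]
COUNTS = [epoch_count(label) for label in LABELS]


def datetime_accepts_leap_second():
    """Report whether Python's datetime type can represent 23:59:60."""
    try:
        datetime(2015, 6, 30, 23, 59, 60)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for label, count in zip(LABELS, COUNTS):
        print(label, count)
    print("datetime accepts second=60:", datetime_accepts_leap_second())
```

Five distinct clock labels produce only four distinct counts: 23:59:60 and the following 00:00:00 land on the same number, so two events a full second apart can carry identical timestamps.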
Software, then, must deal with the consequences of that repeated second, or find a way to fake it, perhaps by gradually adjusting the clock over the last few minutes of the day.
But the sheer variety of clever ways to handle the extra second is part of the problem, as systems using different methods drift slowly out of sync with one another before slowly realigning once again.
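A rough Python sketch shows both the gradual adjustment and the disagreement it can cause. It assumes a simple linear adjustment that finishes exactly at the leap second; the window lengths (10 minutes and one hour) and all function names are hypothetical, chosen only for illustration, and no vendor's actual scheme is being described.

```python
def smear_offset(t, leap_at, window):
    """Fraction of the leap second already absorbed at time t.

    t and leap_at are seconds on an unadjusted reference clock; the
    adjustment is spread linearly over the `window` seconds before leap_at.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    start = leap_at - window
    if t <= start:
        return 0.0
    if t >= leap_at:
        return 1.0
    return (t - start) / window


def step_offset(t, leap_at):
    # A clock that absorbs the whole second in one jump at the leap second.
    return 1.0 if t >= leap_at else 0.0


def disagreement(t, leap_at, window_a, window_b):
    """Seconds by which two smearing clocks differ at time t."""
    return abs(smear_offset(t, leap_at, window_a) - smear_offset(t, leap_at, window_b))


if __name__ == "__main__":
    leap_at = 10_000
    for t in (6_000, 8_000, 9_400, 9_700, 10_000):
        print(t, round(disagreement(t, leap_at, 600, 3_600), 3))
```

With these made-up windows, the two clocks agree an hour before the leap second, drift apart by up to five-sixths of a second just as the shorter window begins, and agree again once the leap second has passed. A stepping clock compared with either of them would be almost a full second off just before the jump.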
Oracle's Cluster Ready Services turned out not to be ready for the leap second added to the end of 2008. Reddit, LinkedIn and the flight reservation system at Qantas crashed in 2012.
While the ITU is conducting its impact study, sysadmins will be taking the fall -- unless IT vendors can agree on, and implement, a consistent way to mitigate the effects of future leap seconds.
They'll need to find a way to do that not just for servers and routers, but also for the ever-growing number of connected devices, the Internet of Things, that send sensor readings back to time-stamped databases but are unlikely ever to get an update of their onboard software.
If the vendors can do that, though, delegates to WRC-23 will find it easy to kick the leap-second can even further down the road.