Leap Second Bugs Are Coming

A leap second is a one-second adjustment to Coordinated Universal Time (UTC). Irregularities in the Earth's rotation mean that there is slightly more time in a day than 24 hours. But we still use 24 hours (86,400 atomic seconds) to measure a day. Over time this discrepancy would cause clock time to drift away from the daylight hours. Eventually, noon would no longer be midday.

To compensate for this discrepancy, a leap second is sometimes added to a year. This is similar to the extra day added in a leap year. Usually, on a chosen day, an extra second is added between the last second of that day and the first second of the next day. That is, a second (written 23:59:60) is added between 23:59:59 and 00:00:00.

Because the rotation of the Earth is irregular, leap seconds are not added on any regular schedule. It is not possible to draw up a calendar in advance that tells us when a leap second will need to be added to our time system.

A leap second can create bugs in any software that uses time. Remember the Y2K bug? The Y2K bug existed because software was not prepared to deal with a change in how the year is represented.

A leap second bug is similar in that most software is not prepared and does not know how to deal with an additional second being inserted into the day.
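As one small illustration of software that has no place for the extra second, Python's standard `datetime` type only accepts seconds from 0 to 59. The sketch below uses a made-up helper, `try_build`, and the date of the 2012 leap second; it is an example of the general problem, not a claim about how any particular system failed.

```python
from datetime import datetime

def try_build(second: int) -> str:
    """Try to construct 2012-06-30 23:59:<second> and report what happened."""
    try:
        return datetime(2012, 6, 30, 23, 59, second).isoformat()
    except ValueError as exc:
        return f"rejected: {exc}"

print(try_build(59))  # 2012-06-30T23:59:59
print(try_build(60))  # rejected, the leap second has no representation here
```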

Most software tracks time based on UNIX (POSIX) behavior, which defines a day to be exactly 86,400 seconds. It works out the current time of day by performing:

time % 86400

The result of this calculation can never be more than 86399, so there is no slot for an 86,401st second. This leads to an interesting question: how do you add a leap second to a day?
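The modulo arithmetic above can be sketched in Python. The function name `seconds_into_day` is invented for illustration, and the input is assumed to be a POSIX timestamp (seconds since 1970-01-01 00:00:00 UTC, with leap seconds not counted).

```python
SECONDS_PER_DAY = 86_400  # POSIX fixes every day at exactly this length

def seconds_into_day(timestamp: int) -> int:
    """Return how many seconds have elapsed since midnight UTC."""
    return timestamp % SECONDS_PER_DAY

# 2012-06-30 23:59:59 UTC, the second just before the 2012 leap second
last_second = 1_341_100_799
print(seconds_into_day(last_second))      # 86399
print(seconds_into_day(last_second + 1))  # 0, there is no slot for 23:59:60
```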

Some software tries to deal with this issue by counting the last second of a day twice (86399 -> 86399 -> 0).

Counting a second twice leads to problems. In software, the difference between two times is a useful measure. By definition time always moves forward, so the difference between two readings taken at different moments should never be zero, unless the software was written to double count a second. Double counting makes the software think that the same time happened twice, so the difference between two times can now equal zero. This is very unexpected and can lead to the software trying to divide by zero. That's not something you want.
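A minimal sketch of how a repeated second can turn into a division by zero is shown below. The function `events_per_second` and the numbers are hypothetical; the readings are seconds-into-day values taken one real second apart while the clock repeats 86399.

```python
def events_per_second(count: int, start: int, end: int) -> float:
    """Naive rate calculation that assumes end is always later than start."""
    return count / (end - start)

# Clock readings taken one real second apart across a repeated second
before_leap = 86399
during_leap = 86399  # the clock shows the same value again

try:
    rate = events_per_second(500, before_leap, during_leap)
except ZeroDivisionError:
    rate = None  # without this guard the program would crash here
print(rate)  # None
```

One common defense is to measure elapsed intervals with a clock that is not tied to the time of day, such as Python's `time.monotonic()`, which is not affected by adjustments to the system clock.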

A second method for adding an extra second to software is to fudge the definition of a second. You slow down the count through the day. Slowing down the count means that the program's definition of a second is changed to be longer than a normal second. For example, a fudged "second" can be defined as 1.1 real seconds. Then, once the program has counted ten of these "seconds", eleven real seconds have passed and the additional second has been absorbed.

This lengthening of the time between seconds solves the problem of there being two times with the same value. Time is still constantly moving forward. The problem now becomes that, throughout the period where the length of the second has been fudged, the wrong time is given.
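The trade-off can be sketched with the 1.1 figure from the example above: ten fudged "seconds" spread over eleven real seconds. The names `smeared_reading`, `SMEAR_WINDOW_REAL`, and `SMEAR_WINDOW_CLOCK` are made up for this sketch, and the window is far shorter than anything a real system would likely use.

```python
SMEAR_WINDOW_REAL = 11.0   # real seconds the slowdown lasts (ten "seconds" of 1.1 s)
SMEAR_WINDOW_CLOCK = 10.0  # seconds the clock counts during that window

def smeared_reading(real_elapsed: float) -> float:
    """Clock seconds shown after real_elapsed real seconds of slowdown."""
    clamped = min(max(real_elapsed, 0.0), SMEAR_WINDOW_REAL)
    return clamped * SMEAR_WINDOW_CLOCK / SMEAR_WINDOW_REAL

readings = [smeared_reading(t) for t in range(12)]
# Every reading is larger than the one before, so no time value repeats
assert all(a < b for a, b in zip(readings, readings[1:]))

# But the clock falls further behind a count of real seconds as the window goes on
lag_halfway = 5.5 - smeared_reading(5.5)
print(round(lag_halfway, 3))  # 0.5
```

By the end of the window the clock has fallen a full second behind the count of real seconds, which is exactly the leap second being absorbed. In between, any other system that did not slow its clock will disagree with this one by up to that amount.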

A system which gives the wrong time (though only off by milliseconds) will lead to its own problems. There are computers that run code that needs to have the correct time down to milliseconds. If this time is off, the computer will run commands at slightly the wrong time, leading to potential disaster.

Another issue is that if one system is using the fudged seconds and another is using normal seconds, the times on the two systems will not match. Computer systems which are not synced to the same time have serious problems talking to one another.

In 2012, a leap second addition led to two Qantas Airlines systems having different times. This time difference resulted in the two systems not being able to communicate, bringing down flight reservations. This lasted for over two hours.

So while there are methods of dealing with a leap second, they are prone to producing bugs. Because leap seconds are unpredictable, not much effort is put into testing systems to see what happens when a leap second is added. This is especially true because the Earth's rotation may change in such a way that another leap second is never needed.

One may think that because a system has survived a leap second in the past, it will survive the next leap second problem-free. This is not necessarily the case. A second is short enough that, depending on what the software does, the buggy code may simply not have been running during the previous leap second. There is a chance that an upcoming leap second will expose a bug that just happened not to be triggered in the past.

We have never had a negative leap second, that is, a second being taken away from the day. Though this has not happened before, the Earth's rotation is unpredictable, so a negative leap second might one day be a reality. Good luck to the software engineers having to deal with that!
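To see why a negative leap second would be its own kind of trouble, consider a job scheduled for the last second of the day. The sketch below is purely illustrative: `last_minute_labels` is a hypothetical helper that lists the clock labels in the final minute of a day, given an adjustment of +1, 0, or -1 seconds.

```python
def last_minute_labels(leap_adjustment: int) -> list:
    """List the clock labels of the day's final minute for a +1, 0, or -1 adjustment."""
    final_second = 59 + leap_adjustment
    return [f"23:59:{s:02d}" for s in range(final_second + 1)]

normal_day = last_minute_labels(0)
positive_day = last_minute_labels(+1)
negative_day = last_minute_labels(-1)

print(positive_day[-1])            # 23:59:60
print(negative_day[-1])            # 23:59:58
print("23:59:59" in negative_day)  # False, a job set for 23:59:59 never fires
```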

As any computer programmer can tell you, a software bug can be difficult to trace and might need a very specific set of conditions to show up.

Time is more or less stable. There are usually 24 hours and 86,400 seconds in a day. But not always. And sometimes programmers forget to account for those special cases when the usual is not true.