Wednesday, 14 January 2004

Our very own Y2K bug

Listening to:

Schumann, Piano trio in D minor, op. 63.

A time overflow bug in Moscow ML

Some time on 10 January, 230 seconds had elapsed since midnight on 1 January, 1970.

Deep in HOL’s dark internals, there’s code that calls the ML function Time.now(). This function returns the current time in the form of an opaque type that you can only manipulate using the functions defined in the Time API. One such function is Time.toSeconds, which turns a time value into an integral number of seconds since 1 January 1970.

So what? you say, integers are all represented in 32-bit registers these days, aren’t they? That’s more than enough space to represent numbers bigger than 230.

First of all, note that we’re using signed integers, which means that one of those 32 bits is being used to tell us whether or not the number is positive or negative. Then, for more deep and dark reasons, our ML implementation uses another bit of those 32 to “tag” its integers (to distinguish them from other possibilities). This means that our ML has a maximum integer value of 230 − 1, and this was exceeded on 10 January.

And lo, we were much surprised. Code that was working fine before the weekend stopped working on the Monday. Our beautiful programs, pristine in their Platonic perfection, were actually dependent on the environment, the tick-tick-tick of passing time, and this dependency twirled ’round and bit back.

Luckily, we can also convert times into floating point numbers that use more than 32 bits, so we’re safe for the moment. Nonetheless all our users are going to have to upgrade their system. (Roll on the day when our ML has an arbitrary precision integer type built in.)

Comments