Sunday, June 30, 2013

It's a Date

It felt like I'd been working on the spreadsheet for a long time but when Excel crashed and then, on recovery, told me I'd last saved it on New Year's Day 1601 I had to wonder whether it was right. But only for a millisecond because, in fact, I'm an inveterate frequent saver. My fingers have been burned so much over the years by losing data due to crash, keyboard shortcut misadventure, server timeout, accidental browser closure and so on that I'm typing this with blackened stumps.

More interesting to me was that it looked like the Windows epoch time was 1601-01-01T01:00  unlike Unix and Linux which use 1970-01-01T00:00. (Repeat after me: ISO date is great, even if it has its own effective epoch, um, mate.)  

So I poked around a bit and found that not only is this the Windows epoch date with some justifications of varying believability but that Excel 97 formats had their own built-in epoch time, or times:
Both 1900 and 1904 file types are commonly found in the wild, usually depending on whether the file originated on Windows or Mac. Converting from one to another silently can cause data integrity errors, so Excel won’t change the file type for you. To parse Excel files you have to handle both. That’s not just a matter of loading this bit from the file. It means you have to rewrite all of your date display and parsing code to handle both epochs. 
I had planned to write more but I realise I can summarise the remainder in one word: Sigh.

1 comment:

  1. The history behind the "1900 on Windows" but "1904 on Macintosh" feature of Excel (http://support.microsoft.com/kb/180162) is a classic example of early (seemingly trivial) design decisions escalating into nightmares down the line.

    In fact, the problem goes back beyond what the Microsoft Support page indicates. It originates in a bad date library implementation in early Macintosh software that didn't handle the old "centuries except 400-years are NOT leap years" rule. To get around it, they made dates before 1904 "unsupported".

    PS. Repeat after me: ISO dates are better, but NOT as good as they should be.

    ReplyDelete