Parsing BC dates with JavaScript

[Cross-posted from dclure.org]

Last semester, while I was giving a workshop about Neatline at Beloit College in Wisconsin, Matthew Taylor, a professor in the Classics department, noticed a strange bug – Neatline was ignoring negative years and parsing BC dates as AD dates. So, if you entered “-1000” for the “Start Date” field on a Neatline record, the timeline would display a dot at 1000 AD. I was surprised by this because Neatline doesn’t actually do any of its own date parsing – the code relies on the built-in Date object in JavaScript, which is implemented natively in the browser. Under the hood, when Neatline needs to work with a date, it just spins up a new Date object, passing in the raw string value entered into the record form.

Sure enough, though, this doesn’t work – Date just ignores the negative sign and spits back an AD date. And things get even funkier when you drift within 100 years of the year 0. For example, the year 80 BC parses to 1980 AD, bizarrely enough.
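Here is a minimal sketch of the failing approach, assuming the record form hands over strings such as '-1000' or '-80'. The helper name `parseRecordDate` is hypothetical and is not Neatline's actual code. How non-ISO strings like these get parsed is left up to each JavaScript engine, so the sketch prints whatever comes back rather than promising a particular result.

```javascript
// Hypothetical stand-in for the behavior described above: the raw form
// value goes straight into the Date constructor with no preprocessing.
function parseRecordDate(rawValue) {
  return new Date(rawValue);
}

for (const raw of ['-1000', '-80']) {
  const parsed = parseRecordDate(raw);
  // Engines may disagree on strings like these; some return Invalid Date.
  const year = Number.isNaN(parsed.getTime())
    ? 'Invalid Date'
    : parsed.getFullYear();
  console.log(raw, '->', year);
}
```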

Obviously, this is a big problem if you need to work with ancient dates. At first, I was worried that this would be rather difficult to fix – if we really were hitting up against bugs in the native implementation of the date parsing, it seemed likely that Neatline would have to get into the tricky business of manually picking apart the strings and putting together the date objects by hand. It always feels icky to redo functionality that’s nominally built into the programming environment. But I didn’t see any other option – the code was unambiguously broken as it stood, and in a really dramatic way for people working with ancient material.

So, grumbling at JavaScript, I started to sketch in the outlines of a bespoke date parser. Soon after starting, though, I was idly fiddling around with the Date object in the Chrome JavaScript console when I stumbled across an unexpected (and sort of inexplicable) solution to the problem. In reading through the documentation for the Date object over at MDN, I noticed that the constructor actually takes three different configurations of parameters. If you pass in a single integer, it treats it as a Unix timestamp in milliseconds; if you pass a single string, it treats it as a plain-text date string and tries to parse it into a machine-readable date (this was the process that appeared to be broken). But you can also pass three separate integers – a year, a month, and a day. Out of curiosity, I plugged in a negative integer for the year, and arbitrary values for the month and day.
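The three constructor forms can be sketched as follows. The specific values are arbitrary picks for illustration, not the ones from the original experiment. Note that the multi-argument form works in local time and counts months from zero.

```javascript
// Form 1: a single number is milliseconds since 1970-01-01T00:00:00Z.
const epoch = new Date(0);
console.log(epoch.getTime()); // 0

// Form 2: a single string is parsed as a date string.
const fromString = new Date('2014-03-23T00:00:00.000Z');
console.log(fromString.getUTCFullYear()); // 2014

// Form 3: separate numeric parts, interpreted in local time.
// Months are zero-indexed, so 5 means June.
const ancientLocal = new Date(-1000, 5, 15);
console.log(ancientLocal.getFullYear()); // -1000
console.log(ancientLocal.getMonth());    // 5
console.log(ancientLocal.getDate());     // 15

// Caveat: in this form, years 0 through 99 are mapped to 1900 through 1999.
console.log(new Date(80, 0, 1).getFullYear()); // 1980
```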

Magically, this works. A promising start, but not a drop-in solution for the problem – in order to use this, Neatline would still have to manually extract each of the date parts from the plain-text date strings entered in the record forms (or break the dates into three parts at the level of the user interface and data model, which seemed like overkill). Then, though, I tried something else – working with the well-formed, BC date object produced with the year/month/day integer values, I tried casting it back to ISO8601 format with the toISOString method. This produced a date string with a negative year and two leading zeros before the four-digit representation of the year. I had never seen this before. I immediately tried reversing the process and plugging the outputted ISO string back into the Date constructor.
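A small sketch of that round trip, under one assumption of convenience: the date is built with `Date.UTC` rather than the local-time constructor, so the printed string does not shift with the time zone of whoever runs it.

```javascript
// Build 1 January of year -1000 in UTC so the output is the same
// regardless of the local time zone.
const ancientUtc = new Date(Date.UTC(-1000, 0, 1));

// Years outside 0000-9999 are serialized with a sign and six digits.
const iso = ancientUtc.toISOString();
console.log(iso); // "-001000-01-01T00:00:00.000Z"

// Feeding the string back into the constructor recovers the same instant.
const roundTripped = new Date(iso);
console.log(roundTripped.getUTCFullYear());                    // -1000
console.log(roundTripped.getTime() === ancientUtc.getTime());  // true
```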

And, sure enough, this works. And it turns out that it also fixes the incorrect parsing of two-digit years.
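One way to put this to use is to pad a signed integer year out to the six-digit form before handing it to Date. The helpers below, `toExpandedYear` and `parseYear`, are hypothetical names for illustration and are not part of Neatline. The sketch assumes the input is already an integer year and always pins the date to 1 January UTC.

```javascript
// Hypothetical helper: format a signed integer year as a six-digit
// expanded year with an explicit sign.
function toExpandedYear(year) {
  // Year 0 must carry a plus sign; "-000000" is not a valid expanded year.
  const sign = year < 0 ? '-' : '+';
  return sign + String(Math.abs(year)).padStart(6, '0');
}

// Hypothetical helper: parse 1 January (UTC) of the given year.
function parseYear(year) {
  return new Date(toExpandedYear(year) + '-01-01T00:00:00.000Z');
}

console.log(toExpandedYear(-80));             // "-000080"
console.log(parseYear(-80).getUTCFullYear()); // -80
console.log(parseYear(80).getUTCFullYear());  // 80
```

The explicit plus sign matters for positive years: a six-digit year without a sign is not part of the expanded format.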

I am deeply, profoundly perplexed by this. The ISO8601 specification makes cursory note of an “expanded” representation for the year part of the date, but doesn’t go into specifics about how or why it should be used. Either way, though, it works in all major browsers. Mysterious stuff.

Formerly Web Applications Developer on the Scholars' Lab R&D team, David graduated from Yale University with a degree in the Humanities in 2009 and worked as an independent web developer in San Francisco, New York, and Madison, Wisconsin before joining the lab in 2011. David was the lead developer on Neatline and works on research…

7 Comments

  1. Karl,

    You can create a date in the years between 0 and 100 using Date.setFullYear();

    > d = new Date();

    Sun Mar 23 2014 22:20:13 GMT-0400 (EDT)

    > d.setYear(80); // BAD: Sets to 1980

    322716013964

    > d.setFullYear(80); // GOOD: Sets to 80

    -59635431586036

    > d

    Sat Mar 23 80 22:20:13 GMT-0400 (EDT)

    • Oh, that’s fantastic Paul, thanks for posting this! What an interesting API Date has.

  2. Hi David — All well and good for negative years -1 to -99, but not positive (0 – 99). Driving me crazy! Any ideas?


    >>> foo=new Date('-000080');
    Date {Wed Dec 31 -0081 16:00:00 GMT-0800 (Pacific Standard Time)}

    >>> foo=new Date('000080');
    Date {Invalid Date}

    best, Karl

  3. Is the timeline part of Neatline Simile or a homegrown library? If it’s Simile, I was surprised by the bug, as I’ve represented BC dates in it before with no problems. One thing to be mindful of moving forward is documenting the precise format that people should enter into Neatline when they create B.C. dates–whether simply a negative integer or the ISO date proper. I’m not sure if you’re aware, but -0080 is not 80 B.C. It’s 81 B.C. since there’s no year 0 (1 B.C. is thus xs:gYear 0000). I don’t like this. It’s counter-intuitive, but that’s the ISO standard. So if your users enter -80 for a date thinking this represents 80 B.C., you’ll have to change this to -0079 in the backend for the timeline to place the point correctly.

    • Hey Ethan,

      Yeah, the one-year offset for BC dates is definitely weird. I’m never sure in this kind of situation whether it’s better to just stick to the built-in standard or manually adjust things in our code. Intuitiveness is important, but it can also be jarring to see monkey-patched modifications to an ISO spec in the business logic. What do you think?

      I’ll try to get the documentation updated in the next day or so. The docs are actually served directly out of the gh-pages branch on GitHub. If you can beat me to it, feel free to fork and send a pull request! The document in question is style-tab-dates.md.

      Cheers,
      David

  4. Very interesting. Would you have any leads about how to solve the similar problem in ArcGIS? It also will not accept BC dates. Since ArcGIS is mainly used in industry, the people at ESRI don’t have a financial incentive to address the issue. But the program has great potential for mapping history.

    • Hey Peter,

      Hmm, I’m not sure about how this would interact with ArcGIS – I’ve actually never worked with temporal information in ESRI products. My suspicion, though, is that it wouldn’t – just from browsing through the documentation, it looks like they use a format similar to ISO8601, but not identical. So I suspect this particular trick wouldn’t work, since (I think) it’s pretty narrowly specific to the browser-JavaScript implementations of the spec.

      Let me know if you find out anything else about this, though!

      -David
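The one-year offset Ethan describes in the comments above comes down to simple arithmetic: 1 BC is ISO year 0, 2 BC is ISO year -1, and so on. A minimal sketch of the conversion, using hypothetical helper names (`bcToIsoYear`, `isoYearToLabel`) and assuming whole-number years:

```javascript
// Hypothetical helper: convert a historical BC year to the ISO
// (astronomical) year number, where 1 BC is year 0.
function bcToIsoYear(bcYear) {
  if (!Number.isInteger(bcYear) || bcYear < 1) {
    throw new RangeError('BC year must be a positive integer');
  }
  return 1 - bcYear;
}

// Hypothetical helper: turn an ISO year number into a BC/AD label.
function isoYearToLabel(isoYear) {
  return isoYear <= 0 ? `${1 - isoYear} BC` : `${isoYear} AD`;
}

console.log(bcToIsoYear(80));     // -79
console.log(bcToIsoYear(1));      // 0
console.log(isoYearToLabel(-80)); // "81 BC"
console.log(isoYearToLabel(0));   // "1 BC"
```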
