I wonder whether the 1900 spike is just an artifact of the data model. Does he explain why so many items are labeled as 1900?<p>Could it be that, for items with an unknown date in the large collection, the date field contains only "'00", which gets interpreted as 1900? Or did someone decide that the default value for an item's creation date is 1900, and not all items have a date set?<p>Edit: I think the mystery is solved. In <a href="https://raw.githubusercontent.com/MuseumofModernArt/collection/master/Artworks.csv" rel="nofollow">https://raw.githubusercontent.com/MuseumofModernArt/collecti...</a> you will find a lot of date fields like "c. 1900". For example, the date field for this item says "c. 1900": <a href="http://www.moma.org/collection/works/60868?locale=en" rel="nofollow">http://www.moma.org/collection/works/60868?locale=en</a><p>I first assumed "c. 1900" meant "created 1900", but it more likely means "circa 1900". Also, this particular piece was created in 1910: <a href="http://www.davidrumsey.com/amica/amico866157-125545.html" rel="nofollow">http://www.davidrumsey.com/amica/amico866157-125545.html</a><p>So I think the collection contains a lot of items with an unknown creation date that are labeled "circa 1900".
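<p>One way to check the "circa" hypothesis is to count exact and approximate years separately and see whether the 1900 spike sits mostly in the approximate bucket. This is only a rough sketch: the column name "Date", the "c." / "ca." / "circa" prefixes, and the sample rows are my assumptions for illustration, not something verified against the full Artworks.csv.

```python
import csv
import io
import re
from collections import Counter

# Optional "c." / "ca." / "circa" marker followed by a four-digit year.
# The set of markers is an assumption; the real file may use other forms.
YEAR_RE = re.compile(
    r"(?P<circa>\b(?:c\.|ca\.|circa)\s*)?(?P<year>\b(?:1[5-9]\d\d|20\d\d)\b)",
    re.IGNORECASE,
)


def parse_date(text):
    """Return (year, is_circa) for the first year found, or None."""
    match = YEAR_RE.search(text or "")
    if match is None:
        return None
    return int(match.group("year")), match.group("circa") is not None


def count_years(csv_file, column="Date"):
    """Count exact and circa years separately for one CSV column."""
    exact, circa = Counter(), Counter()
    for row in csv.DictReader(csv_file):
        parsed = parse_date(row.get(column, ""))
        if parsed is None:
            continue  # no recognizable year in this field
        year, is_circa = parsed
        (circa if is_circa else exact)[year] += 1
    return exact, circa


# Hypothetical sample rows, standing in for the real CSV.
SAMPLE = """Title,Date
A,c. 1900
B,1900
C,c. 1900
D,1910
E,Unknown
"""

if __name__ == "__main__":
    exact, circa = count_years(io.StringIO(SAMPLE))
    print("exact:", dict(exact))
    print("circa:", dict(circa))
```

To run it on the real data, you would pass an opened copy of Artworks.csv to count_years instead of the sample. If most of the 1900 entries land in the circa counter, that would support the idea that the spike comes from approximate dates rather than from works actually made in 1900.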