Here's how The New York Times is trying to preserve millions of old pages the way they were originally published


Project Kondo has identified and archived over 7 million New York Times web pages that contain news content in outdated and unsupported formats. Readers can report broken links, but there are too many pages to review by hand, so the team built an automated tool called ‘munger’ that identifies JavaScript containing unsupported code and cleans it up into HTML that can be shared widely. To preserve the content exactly as it was originally published, the pages are moved to a different domain, archive.nytimes.com, where readers are notified that they are reading an archived article.
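
The move to a separate domain can be pictured as a simple URL rewrite. The sketch below is illustrative only: the function name `to_archive_url` and the choice to keep the original host as the first path segment are assumptions, not a description of how Project Kondo or ‘munger’ actually maps addresses. Only the domain archive.nytimes.com comes from the article.

```python
from urllib.parse import urlsplit, urlunsplit

ARCHIVE_HOST = "archive.nytimes.com"  # the archive domain named in the article


def to_archive_url(original_url: str) -> str:
    """Return a hypothetical archive address for an original page URL.

    Assumption: the original host is kept as the first path segment so
    that pages from different subdomains do not collide. The mapping the
    Times really uses may differ.
    """
    parts = urlsplit(original_url)
    if parts.netloc == ARCHIVE_HOST:
        # Already on the archive domain: leave the URL untouched.
        return original_url
    new_path = "/" + parts.netloc + parts.path
    return urlunsplit(("https", ARCHIVE_HOST, new_path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Hypothetical example URL, not a real archived page.
    print(to_archive_url("http://www.nytimes.com/interactive/2010/example.html"))
```

Keeping the original host inside the path is one way to make the rewrite reversible, so a broken-link report could in principle be traced back to the page's first address.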
