How to Preserve and to Access Information for Over 1 Million Years

Scientists at the University of Southampton may have figured out how to preserve digital information for over one million years. They've demonstrated high-density laser recording onto a durable quartz disc that could theoretically hold about 360 TB. They're a long way from demonstrating commercial viability for this technology, and commercial success is not a given. Many storage technologies have looked promising but haven't panned out. It's also unclear what quartz discs and their writers and readers will cost.

Further to my "Big Data, Big Tape" post, tape is currently the most popular durable storage medium, but unfortunately magnetic tape doesn't last one million years. Organizations must refresh their tape media every couple of decades or so to maintain reliable readability, and there may be strong economic reasons to refresh more frequently. IBM, for example, only very recently ended hardware support for its famous 7-track and 9-track reel tape drives, officially closing that chapter in tape media history. There are still data recovery specialists who can read reel tapes if necessary, but those tapes aren't getting any younger. If you've got such tapes, now's the time to read them and convert them to newer media.

One-million-year media solve only one long-term data storage problem, and perhaps not even the most interesting one. The real problem is how to make sense of all those bits in the distant future, or even a few years from now. Human civilization is only a few thousand years old, yet many recorded human languages are already incomprehensible. The Digital Age we now live in may end up being the New Dark Ages. Software developers and vendors do not usually prioritize readability of older data formats. Newer versions of Microsoft Word cannot open some of the oldest Microsoft Word file formats, for example. Humanity is losing a little of its collective memory with each new software release.

Mainframe vendors generally aren't so careless, and mainframe customers routinely access decades-old data using decades-old code to interpret those data. That works and works well, and it's one of the important reasons why mainframes are useful and popular as central information hubs and systems of record. Moreover, if for some reason the applications must be retired, that's possible while retaining the ability to retrieve and to interpret the archived data. Contrast that multi-decade mainframe track record with, for example, Apple, which opened for business in the mid-1970s. The sad fact is that every Macintosh application compiled up through 2005 completely stopped working on the version of the Macintosh operating system introduced only 6 years later, in 2011. Consequently, some of the proprietary data formats that those "obsolete" applications know how to access and interpret remain unintelligible on newer Macs. Apple very deliberately broke binary program compatibility within just a few short years: Macintosh models introduced starting in the second half of 2011 cannot run Macintosh applications compiled before 2006. That might be a great technical strategy if you're Apple and you're trying to sell more Macintosh computers (to "wipe the slate clean" periodically), but, I would argue, that's horrible for the human race and its collective memory.

Virtualization offers one possible solution to the problem of interpreting old data, but virtualization isn't a complete solution on its own. Mainframes excel in virtualization. IBM has incorporated many different types of virtualization technologies throughout the zEnterprise architecture, and those technologies continue to evolve and improve. They help preserve backward compatibility. Nonetheless, on rare occasions IBM has dropped support for older application execution modalities despite the availability of virtualization as a possible solution. But then there's someone who thankfully steps in to help and then help again.

Maybe those one-million-year quartz discs all need to include a built-in "Rosetta Stone" that helps future generations interpret the bits they contain. Some scientists have at least thought about the problem.
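
To make the "Rosetta Stone" idea a little more concrete, here is a minimal sketch of a self-describing record: a plain-text description of the data's format travels in front of the data itself. This is only an illustration under my own assumptions, not anything the Southampton researchers have proposed. The `ROSETTA1` marker, the byte layout, and the `pack` and `unpack` function names are all hypothetical, and the sketch quietly assumes that future readers can still make sense of JSON and UTF-8, which is itself part of the problem.

```python
import json
import struct
from typing import Tuple

# Hypothetical marker for this sketch; not an existing standard.
MAGIC = b"ROSETTA1"


def pack(payload: bytes, description: dict) -> bytes:
    """Prefix the payload with a plain-text (JSON, UTF-8) description of its format."""
    header = json.dumps(description, sort_keys=True).encode("utf-8")
    # Assumed layout: magic | 8-byte big-endian header length | header | payload
    return MAGIC + struct.pack(">Q", len(header)) + header + payload


def unpack(blob: bytes) -> Tuple[dict, bytes]:
    """Split a record back into its format description and its raw payload."""
    if blob[:len(MAGIC)] != MAGIC:
        raise ValueError("not a self-describing record")
    length_start = len(MAGIC)
    header_start = length_start + 8
    (header_len,) = struct.unpack(">Q", blob[length_start:header_start])
    header_end = header_start + header_len
    description = json.loads(blob[header_start:header_end].decode("utf-8"))
    return description, blob[header_end:]
```

For example, a payload of EBCDIC text could be packed with a description such as `{"codec": "cp037", "record_length": 80}`. A reader who has never seen the data can call `unpack`, read the description, and then decode the payload with `payload.decode(description["codec"])`. The description here is only as good as whoever writes it, so a real "Rosetta Stone" would need far more than a codec name.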

by Timothy Sipples July 11, 2013 in Future



The postings on this site are our own and don’t necessarily represent the positions, strategies or opinions of our employers.
© Copyright 2005 the respective authors of the Mainframe Weblog.