By Paul Conway
Could it be that an important reason that digital preservation remains an elusive ideal is that we haven’t experienced the digital version of the Florence Flood?
Forty-six years ago on November 4, 1966, the Arno River in Florence, Italy, flooded its banks, breaching the basements of museums, libraries, and private residences, and burying centuries of books, manuscripts, and works of art in muck and muddy water.
This natural disaster spawned a generation of conservation professionals committed to preventing future disasters while focused on triage decision-making in support of cost-effective treatment. The Florence Flood was a clear and unambiguous wake-up call for the entire cultural community and all of its benefactors.
Today, a flood of a different sort is sweeping across the land: a veritable deluge of digital data along with tools and toys designed expressly to tap the flow of digital information for commercial, entertainment, and educational ends. The digital deluge has two streams that converge to give the impression, at least in the more technologically developed places of the world, that we are indeed immersed in an all-digital environment. One flow of digital data is the collectively massive and accelerating conversion of book and nonbook materials from analog to digital form. Spurred by Google’s Book Search partnerships with publishers and libraries and the University of Michigan Library’s decision to contribute its entire book and serial collection to the digitization effort, we are facing the very real prospect that the vast majority of the world’s published record may exist in digital form a decade from now.
The second source of the digital deluge derives from the fact that nearly all new information is created digitally, communicated digitally, used in a digital environment, and stored “for posterity” in digital systems. As technology devices become increasingly feature-rich, user-friendly, and affordable, the proportion of information that ever makes its way to paper or film is declining, along with the proportion of paper that warrants preservation for the long haul.
The quantity of digital information that we create, and keep, is nearly incomprehensible. A decade ago, information scientists at the University of California Berkeley estimated that over 93 percent of all new information was created in digital form. Last year, the Blue Ribbon Task Force on Sustainable Digital Preservation and Access, citing reliable estimates from computer manufacturers, noted that since 2008 the scale of digital creation has far outpaced the capacity to store the data. In the absence of a preservation crisis over the loss of culturally significant digital assets (a digital Florence Flood), the only short-term recourse may be a liberal use of the “delete” key.
In considering the implications of our emerging all-digital society, it is important to establish clear distinctions between the terms “digitization for preservation” and “digital preservation.” Digitization for preservation is an investment in valuable new digital products through the conversion of analog sources such as books, photographs, maps, and archival documents. Digital preservation, on the other hand, is the suite of tools, operations, standards, and policies to prevent this investment from being squandered, regardless of whether the original source is a tangible artifact or data that were born and live digitally. Digitization for preservation and digital preservation are intimately related ideas, but the underlying standards, processes, technologies, costs, and organizational challenges are quite distinct.
Digitization for preservation is reaching a new level of maturity. The Federal Agencies Digitization Guidelines Initiative is establishing next-generation technical standards for digitization that combine two decades of best practices with the insights garnered from research by image scientists. The new guidelines are the product of a coalition of 13 US government organizations that hold important collections of archives, books, and audiovisual resources. The group is also working on guidelines for the quite complicated processes of digitizing sound recordings, video tapes, and motion picture film. The innovative strategy of the new guidelines matches properties of the digital surrogate to the characteristics of the analog source, offering great promise for preservation-oriented digitization.
When any digital resource has long-term value (more than 10 years or so), the technology systems and accompanying policy frameworks that preserve those digital assets must inspire the same level of trust and confidence for users and stakeholders as do traditional preservation and access services. Fortunately, progress is being made to create trustworthy digital repositories. One of these, the HathiTrust Digital Library, is a coalition of 62 universities based at the University of Michigan with pooled resources and a common governance structure. HathiTrust currently contains over 10 million volumes of books and serials digitized by Google, the Internet Archive, and coalition members. HathiTrust demonstrates that creative technical energy can emerge from collaboration around a shared need, in this case the need to protect the value of digitized collections that are becoming essential resources for scholarship and learning.
Education for the new realities of digitization and digital preservation surely must be viewed as an important priority in our emerging all-digital world. The School of Information is stepping up to the educational challenges with its Preservation of Information (PI) specialization. One of nine specializations in the UMSI graduate curriculum, PI presently hosts 13 courses that cover the range of preservation challenges faced by information resources across the analog-to-digital spectrum. Courses engage the opportunities of digitization for preservation and the challenges of digital preservation. The specialization features a robust practical-engagement internship program and creative experimentation with a virtual technology laboratory to deliver software tools to student desktops. The PI specialization offers students a respite from the tensions of our ongoing transition to a digitally mediated society, one that excites and intimidates simultaneously.
Michael Stipe of R.E.M. has something to say about the mixed-up, chaotic world that we live in, capturing in a blur of word images the emotions of the nightly news and the energy of new media, blurting a refrain of change and, ultimately, optimism:
It’s the end of the world as we know it.
It’s the end of the world as we know it.
It’s the end of the world as we know it and I feel fine.
It is the end of preservation as we know it, too. But the future of preservation in the age of Google is precisely where it has always been: transforming artifacts into new forms and extending their useful life.
Note: This article is adapted from: Conway, P. (2010). “Preservation in the Age of Google: Digitization, Digital Preservation, and Dilemmas.” The Library Quarterly 80 (1): 61-79.