Text Patterns - by Alan Jacobs

Tuesday, February 23, 2010

fragility

Jason Epstein writes,

Digitization makes possible a world in which anyone can claim to be a publisher and anyone can call him- or herself an author. In this world the traditional filters will have melted into air and only the ultimate filter — the human inability to read what is unreadable — will remain to winnow what is worth keeping in a virtual marketplace where Keats's nightingale shares electronic space with Aunt Mary's haikus. That the contents of the world's libraries will eventually be accessed practically anywhere at the click of a mouse is not an unmixed blessing. Another click might obliterate these same contents and bring civilization to an end: an overwhelming argument, if one is needed, for physical books in the digital age.

This is of course not true, and one wonders what caused Epstein to make such a claim. Does he think that every book ever digitized is on a single un-backed-up computer? “Digital content is fragile,” he continues. “The secure retention, therefore, of physical books safe from electronic meddlers, predators, and the hazards of electronic storage is essential.” I agree with this, but maybe not for the reasons Epstein has in mind. Paper codices contain a great deal of information — data and metadata — that can't easily be transferred to digital form, and that information is worth preserving. But it’s not clear, to me anyway, that electronic texts are more fragile than books. It’s true that digital media deteriorate, and at rates and under conditions we still don't understand, but steps can be taken and are being taken to keep those media constantly updated. And books are damaged, lost, or destroyed as well. Few objects persist over time unless they are cared for, which is presumably what certain Chinese Buddhist scholars were thinking about when they built a library.

It’s interesting to think about what would happen if certain sources of information we rely on were somehow to disappear, wholly and instantaneously. Losing Wikipedia wouldn't be a big deal, since by design its information comes from other sources, most of which are online elsewhere. Losing the books that Google has scanned would be more problematic, but there are many other sources of digitized texts. We need to be good custodians of all the information we have gathered, but with proper care, I don't think that digital media are any more fragile than any other kind.

8 comments:

  • Is he perhaps thinking of that blog network (the name escapes me) that accidentally erased its entire contents and the contents of all of its members a while back, all because their back-up system was woefully designed? Pretty big disaster for those involved, but it's the digital equivalent of a fire or flood taking out a single physical library, not the "end of civilization."

  • There's something to his argument here, but it seems (just from the excerpt you've posted) like he's talking about data persistence as a way of getting at some argument he wants to make about why it's bad that the Internet levels the old distinctions of quality and authority. This makes for a rather confused argument that can lead the reader to easily forget that there is a whole industry and academic field dedicated to resolving the former problem.

    If all he's worried about is data persistence, then it's a well-established principle that backing up a piece of data in multiple formats and multiple locations is a strong form of protection. Both physical and digital formats have their own peculiar advantages and vulnerabilities. The fact that almost all computers are networked today and network security is a huge problem means that it's quite possible that the entire network could be lost (think large-scale computer virus or EMP attack) or at least that all copies of a piece of data could be compromised (see the previous commenter). At the same time, physical objects are much harder to copy, are subject to fires, etc., so having digital versions that can be copied at nearly zero cost and stored throughout the world makes sense too. Why not do both?

    The question of whether there is something intrinsically valuable about physical media that cannot be replicated in digital formats is a very important one, but I think ought to be considered as mostly distinct from the persistence question.

  • It is true that “electronic texts are more fragile than [printed] books.” See Abby Smith, “Preservation,” in A Companion to Digital Humanities (2004), http://www.digitalhumanities.org/companion/:

    “As described by computer scientists and engineers, the two salient challenges to digital preservation are: [1] physical preservation: how to maintain the integrity of the bits, the 0s and 1s that reside on a storage medium such as a CD or hard drive; and [2] logical preservation: how to maintain the integrity of the logical ordering of the object, that code that makes the bits "renderable" into digital objects.

    “In the broader preservation community outside the sphere of computer science, these challenges are more often spoken of as: [1] media degradation: how to ensure that the bits survive intact and that the magnetic tape or disk or drive on which they are stored do not degrade, demagnetize, or otherwise result in data loss (this type of loss is also referred to as "bit rot"); and [2] hardware/software dependencies: how to ensure that data can be rendered or read in the future when the software they were written in and/or the hardware on which they were designed to run are obsolete and no longer supported at the point of use.

    “Because of these technical dependencies, digital objects are by nature very fragile, often more at risk of data loss and even sudden death than information recorded on brittle paper or nitrate film stock.”

    Older, pre-electronic media often survived through benign neglect (tablets buried under desert sands, vellum manuscripts forgotten on monastic library shelves, personal papers left in filing cabinets). We have had some success with the salvation and conversion of new media over the last 100 years, but there are many unanswered questions about the sufficiency, scalability, and sustainability of current digital curation initiatives.

  • No hyperbole in the phrase "bring civilization to an end," huh? All we are is our ideas and creativity. Pshaw. (I pause to note that some would hasten that end for a variety of reasons.)

    Michael's comments above are excellent regarding the vagaries of long-term storage, especially the digital and magnetic sorts (some overlap there). Lots of broadcast archivists have bemoaned the loss of film and tape canisters to age and degradation.

    On the flip side, one aspect of today's degradation, according to Epstein (with whom I agree), is that the bar to publication is now set so low, through the wondrous efficiencies of our technical creativity, that we're awash in garbage content. The phrase "500 channels and nothing to watch" springs to mind, which applies to books, music, and art as well. So while the democratization of production has obvious benefits, mixed in is the destruction of a business model and network of connections that reward excellence and stem publication of every banal creative utterance.

  • It's also worth remembering that as we shift to a "cloud computing" model, where we all tap into a shared database rather than maintaining copies of files on our individual hard drives, the protections provided by redundant storage diminish. Right now, it seems fairly likely that we may entrust control over the world's digital book archive to one company (Google), and most of our use of that library will be through a browser, without any persistent downloads. Although Google is very sophisticated in protecting data, by, for instance, storing copies in many data centers, one can envision a future scenario (very unlikely but not entirely impossible) involving a catastrophe - natural, technological, or even commercial - that obliterates that central store. Of course, some of the books (particularly popular and/or recently published ones) in that store will probably have been replicated widely on e-book readers and other devices, and hence protected, but most will not have been. If we further assume that this catastrophe happens a long time from now when many physical libraries have jettisoned most of their printed books (having come to take the electronic copies for granted), the loss would be even more devastating. If, furthermore, the cloud computing model has by then become so dominant that local copies of files are no longer used (everybody draws on the central store all the time), the scale of the catastrophe becomes magnified even further.

    Far-fetched? You bet. End of civilization? Doubtful. Worth worrying about? Yes.

  • Thanks to all for the thoughtful comments and critiques. Here’s another helpful account of the possibilities and the problems.

    Nick, if the scenario you imagine were to happen, wouldn't we just be back where we were before Google started digitizing books — that is, dependent on a limited number of codexes, most of them in large university libraries? Unless, of course, those libraries decide that Google’s digitizing project allows them to throw the books away — the Nicholson Baker nightmare. . . . In any event, this all does suggest that we’re not yet in a position, and may never be in a position, to see digital texts simply as replacements for paper codices.

  • if the scenario you imagine were to happen, wouldn't we just be back where we were before Google started digitizing books — that is, dependent on a limited number of codexes, most of them in large university libraries?

    Probably yes, though on a philosophical note I'm not sure that, when it comes to technological progress, you can ever really return to the status quo ante. And, of course, a variable is the extent to which the existence of the digital archive spurs libraries to accelerate their disposal of physical copies.

    The point I was trying to make, though, is that we're moving to a computing model that diminishes the safeguard historically provided by redundant, locally stored copies.

    Nick

  • In our library/archives consortium, we are looking at augmenting—not replacing—local storage with cloud storage as well as distributed offline storage.

    We’re planning to keep old codices, too.
