Intellectual Preservation and Electronic Intellectual Property

by Peter S. Graham

ABSTRACT

Preserving intellectual property means protecting it from easy change in electronic form. Change can be accidental, well-intended or fraudulent; protection must be for terms longer than human lifetimes. Three possible solutions for authenticating electronic texts are described: encryption (least useful), hashing, and (with the most potential) digital time-stamping, which can fix document existence at a point in time using public techniques.

INTRODUCTION

This conference is concerned with means of protecting intellectual property in the networked environment. This paper will focus on the authenticity of electronic information content, that is, on intellectual preservation. [1]

The concern with authentication arises from the concerns of librarianship, which has the imperatives of identifying information on behalf of users and of providing it to them, intact, when they need it. The professional paradigm librarians speak of is that they acquire information, organize it, make it available and preserve it. The paradigm is appropriate for electronic information just as for print over the last several centuries.

For printed texts preservation of the work has meant preservation of the artifact that contains the work. Indeed, for most people there has been no distinction between the book and the text, though the more sophisticated analytical bibliographers and librarians have discussed that distinction for some decades. But now, in the electronic environment, the work (which may be a text or may be graphic, numeric or multimedia information) can migrate from medium to medium and has no necessary residence on any one of them. The preservation of the work independent of its medium takes on importance in its own right.

Librarians have as their professional responsibility the serving up of the information placed in their custody as true to its original intellectual content as they can. This conference’s concern is with protection of intellectual property, a related concern. Such protection must extend not only to intellectual rights over the property, but to the property itself: how can we preserve information content from unauthorized, intentional or accidental change? The exercise of property rights includes purchase and sale. Both the buyer and seller have an interest in the property being what it is said to be, that is, in authenticating the property or text. Authentication is an interest of librarians as well.

Barry Neavill, a professor at the library school at the University of Alabama, wrote presciently almost ten years ago that no one had “addressed the issue of the long-term survival of information. . . . The survival of information in an electronic environment becomes an intellectual and technological problem in its own right.”[2] If we want to assure permanence of the intellectual record that is published electronically, he said, then it will be necessary consciously to design and build the required mechanisms within electronic systems. We are still in need of those mechanisms.

To address this need, this paper is in two parts. First, it will briefly describe some of the issues associated with preservation of the objects containing electronic information: medium preservation. Second, it will discuss the challenge of intellectual preservation, or the protection and authentication of information which exists in electronic form. Several potential methods of electronic preservation will be described, and one will be recommended for further attention.

THE MEDIUM–AND ITS PRESERVATION

In the electronic environment it is unlikely that a focus of critical study will be upon the electronic medium itself. To begin with, there is nothing in an electronic text that necessarily indicates how it was created; and the ease with which electronic texts can be transferred from disk to disk, or networked from computer to computer, means that there is no necessary indication of the source medium or even if the information has been copied at all. We are not likely to see sale catalog references in the future, therefore, which remark on the fine quality of the floppy disk’s exterior label, or which remark on the electronic text’s provenance (“Moby Dick on the original Seagate drive; never reformatted, very fine”). [3]

The preservation of the information will still require the preservation of whatever medium it is contained on at any given time. This is mostly what has been meant up to now when electronic preservation has been discussed. But there is another kind of preservation required for information media: not only the preservation of the physical medium on which the information resides, but the preservation of the storage technology that makes use of that medium.[4]

The physical preservation of media does not need extensive discussion here, for at any given time the physical characteristics of the medium in use are well understood and the problems inherent in preserving it are simply financial and managerial: Who should pay for the necessary equipment and for the properly designed and acclimatized space, how often should backups be made, and who keeps track of backups and sees that they happen? These issues cause expenses for the electronic collection, but they raise only routine technological questions.[5]

The storage obsolescence problem is quite another matter. A brief sequence of storage media many of us have seen in our lifetimes would include:

punched cards*, in at least three formats (80-column, 90-col, 96-col);
7-track half-inch tape* (at densities of 200, 556 and 800 bits per inch);
9-track half-inch tape* for mainframes, with various recording modes and densities up to 3200 bpi and beyond;
9-track half-inch tape cassettes* for mainframes (“square tapes”, as they are known in contradistinction to the earlier “round tapes”);
RAMAC disk storage;
magnetic drum storage;
data cell drives*;
removable disk packs*;
Winchester (sealed removable) disk packs*;
mass storage devices (honeycombs of high-density tape spindles);
sealed disk drives;
floppy disks* of 3 sizes and at least 3 storage densities so far;
cartridge tapes* of very high density (e.g. Exabyte) for use in workstation backups and data storage;
removable disk storage media on PCs;
laser-encoded disks* (CD-ROMs and laser disks);
magneto-optical disks*, both WORM (write-once-read-many) and rewritable.

   * = considered by some to have long-term storage potential

Some of the storage options appearing now and in the near future include new floppy disk sizes and storage densities, and “flash cards” (PCMCIA), or memory cards for use with very small computers. One sees discussion of storage crystals, encoded by laser beams and having the advantage of great capacity without moving parts, and probably even as stable as good paper.

Technologies are superseding each other at a rapid rate. We know that authors and agencies are now storing long-term information on floppy disks of all sizes, but we don’t know for how long we are going to be able to read them. No competent authorities yet express confidence in the long-term storage capabilities or technological life of any present electronic storage medium. CD-ROMs are an example. Their economical use in librarianship derives from their mass market use for entertainment; that mass market may be threatened by DVI (digital video interactive) technology, by DAT technology, or by others now being actively promoted by entertainment vendors. If forms alternative to CDs win out in entertainment, the production of equipment for CDs and therefore CD-ROMs will be quickly curtailed.

There are perhaps three possible long-term solutions for preserving storage media in the face of obsolescence (as opposed to physical decay), and they vary in practicality: preserve the storage technology, migrate the information to newer technologies, or migrate the information to paper or other long-term eye-readable hard copy.

The prospect for the first option, preserving older technologies, is not bright: equipment ages and breaks, documentation disappears, vendor support vanishes, and the storage medium as well as the equipment deteriorates.

The second option is migration. Most character-based data could be preserved by migrating it from one storage medium to another as they become decrepit or obsolete. To do this requires a computer which can read in the old mode and write in the new; with present network capabilities, this is usually not difficult to arrange.

Whether “refreshing” data is practical for large quantities of information over long periods of time is another matter. The present view of the Commission on Preservation and Access is expressed in a report entitled Preservation of New Technology by Michael Lesk (see fn. 4). His view is that “refreshing” is the necessary and essential means of preserving information as media obsolesce; I do not believe it will be possible for more than a fraction of recorded information. The investment necessary to migrate files of data will involve skilled labor, complex record-keeping, physical piece management, checking for successful outcomes, space and equipment. A library task of comparable cost and complexity would be the orderly photocopying of the books in the collection every five years. This is not practical. In any case, this migration solution will only work easily for ASCII text data. Migrating graphic, image, moving or sound data, or even formatted text, will only work as long as the software application can also be migrated to the next computing platform.

The third option — practical but unexciting — is to migrate information from high-technology electronic form to stable hard copy, either paper or microform. In the near term, for certain classes of high-value archival material, this is likely to be the permanent medium of choice. It offers known long life, eye readability and freedom from technological obsolescence. It also, of course, discards the flexibility in use and transport of information in electronic form. But until we have long-term stable electronic storage media, it offers the medium preservation mode most likely to be used.

THE MESSAGE–AND ITS PRESERVATION

The Problem

The more challenging problem is intellectual preservation — preserving not just the medium on which information is stored, but the information itself. Electronic information must be dealt with separately from its medium, much more so than with books, as it is so easily transferable. The great asset of digital information is also its great liability: the ease with which an identical copy can be quickly and flawlessly made is paralleled by the ease with which a flawed copy may be undetectably made. Barry Neavill wrote in 1984 of the “malleability” of electronic information, that is, its ability to be easily transformed and manipulated.[6] For an author or information provider concerned with the integrity of their documents, there are new problems in electronic forms that were not present in print.

The issue may be framed by asking several questions which confront the user of an electronic document (which may be a text or may be graphic, numeric or multimedia information, for the problems are similar). How can I be sure that what I am reading is what I want? How do I know that the document I have found is the same one that you read and made reference to in your bibliography? How can I be sure that the document I am using has not been changed since you produced it, or since the last time I read it? How can I be sure that the information you sell me is that which I wanted to buy? To put it most generally: How can a reader be sure that the document being used is the one intended?

We properly take for granted the fixity of text in the print world: the printed journal article I examine because of the footnote you gave is beyond question the same text that you read, and it is the same one that the author proofread and approved. Therefore we have confidence that our discussion is based upon a common foundation. The present state of electronic texts is such that we no longer can have that confidence.

Taxonomy of Changes

Let us examine three possibilities of change or damage which electronic texts can undergo that confront us with the need for intellectual preservation:

accidental change;
intended change that is well-meant;
intended change that is not well-meant; that is, fraud.

Accidental change

A document can sometimes be damaged accidentally, perhaps by data loss during transfer or through inadvertent mistakes in manipulation. For example, data may be corrupted in being sent over a network or between disks and memory on a computer; this happens seldom, but it is possible.

More likely is the loss of sections of a document, or a whole version of a document, due to accidents in updating. For example, if a document exists in multiple versions, or drafts, the final version might be lost leaving only the previous version; many of us have had this experience. It is easy for the reader or author not to notice that text has been lost in this way.

Just as common in word-processing is the experience of incorrectly updating the original version that was supposed to be retained in pristine form. In such a case only an earlier draft (if it still exists) and the incorrectly updated version remain. Again, a reader or author may not be aware of the corruption. Note that in both cases backup mechanisms and the need for them are not the issue, but rather how we know what we have or don’t have.

Intended change — well-meaning

There are at least three possibilities for well-meaning change. The change might result in a specific new version; the change might be a structural update that is normal and expected; or the change might be the normal outcome of working with an interactive document.

New versions and drafts are familiar to those of us who create authorial texts, for example, or to those working with legislative bills, or with revisions of working papers. It is desirable to keep track bibliographically of the distinction between one version and another.

In the past we have been accustomed to drafts being numbered and edition statements being explicit. We are accustomed to visual cues to tell us when a version is different; in addition to explicit numbering we observe the page format, the typos, the producer’s name, the binding, the paper itself. These cues are no longer dependable for distinguishing electronic versions, for they can vary for identical informational texts when produced in hard copies. It is for this reason that the Text Encoding Initiative Guidelines Project has called for indications of version change in electronic texts even if a single character has been changed.[7]

It is important to know the difference between versions so that our discussion is properly founded. Harvey Wheeler, a professor at the University of Southern California, is enthusiastic about what he calls a “dynamic document,” continually reflecting the development of an author’s thinking.[8] But scholars and readers need to know what the changes are and when they are made. Authors have an interest in their intellectual property. There is a sense in which the scholarly community has an interest in this property as well, at least to the extent of being able properly to identify it.

Structural updates, changes that are inherent in the document, also cause changes in information content. A dynamic data base by its nature is frequently updated: Books in Print, for example, or a university directory (“White Pages”). Boilerplate such as a funding proposal might also be updated often by various authors. In each of these cases it is appropriate and expected for the information to change constantly.[9] Yet it is also appropriate for the information to be shared and analyzed at a given point in time. In print form, for example, BIP gives us a historical record of printing in the United States; the directory tells us who was a member of the university in a given year. In electronic form there is no historical record unless a snapshot is taken at a given point in time. How do we identify that snapshot and authenticate it at a later time?[10]

Another form of well-meaning change occurs in interactive documents. Consider the note-taking capabilities of the Voyager Extended Books, and the interactive HyperCard novels.[11] We can expect someone to want snapshots of these documents, inadequate though they may be. We need an authoritative way to distinguish one snapshot from another.

Intended change — fraud

The third kind of change that can occur is intentional change for fraudulent reasons. The change might be of one’s own work, to cover one’s tracks or change evidence for a variety of reasons, or it might be to damage the work of another. In an electronic future the opportunities for a Stalinist revision of history will be multiplied. An unscrupulous researcher could change experimental data without a trace. A financial dealer might wish to cover tracks to hide improper business, or a political figure might wish to hide or modify inconvenient earlier views.

Imagine that the only evidence of the Iran-Contra scandal was in electronic mail, or that the only record of Bill Clinton’s draft correspondence was in e-mail. Consider the political benefit that might derive if each of the parties could modify their own past correspondence without detection. Then consider the case if each of them could modify the other’s correspondence without detection. We need a defense against both cases.

Solutions

The solution is to fix a text or document in some way so that a user can be sure of the original text when it is needed. This solution is called authentication. There are three important electronic techniques proposed for authentication: encryption, hashing and digital time-stamping. While encryption offers a form of data security, only hashing and digital time-stamping are useful for long-term scholarly communication and for providing protection against change of an intellectual creation.

Encryption

The two best-known forms of encryption are DES and RSA. DES is the Data Encryption Standard, first established about 1975 and adopted by many business and government agencies. RSA is an encryption process developed by three mathematicians from MIT (Rivest, Shamir and Adleman) at about the same time, and marketed privately. It is regarded by many as superior to the Data Encryption Standard.[12]

Encryption depends upon mathematical transformation of a document. The transformation uses an algorithm requiring a particular number as the basis of the computation. This number, or key, is also required to decode the resulting encrypted text; the key is typically many digits long, perhaps 100 or more. Modern encryption depends upon the process being so complex that decoding by chance or merely human effort is impossible. It also depends upon the great difficulty of decoding by brute force. Computational trial-and-error methods would take unreasonably long periods of time, perhaps hundreds or thousands of years even using modern supercomputers.
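The scale of a brute-force search can be made concrete with a small calculation. The sketch below assumes a 56-bit key, which is the DES key length, and a search rate of one million trial keys per second; the rate is purely an illustrative assumption and not a measurement of any machine.

```python
KEY_BITS = 56                    # DES key length in bits
TRIALS_PER_SECOND = 1_000_000    # assumed search rate, for illustration only
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

keyspace = 2 ** KEY_BITS         # number of possible keys
worst_case_years = keyspace / TRIALS_PER_SECOND / SECONDS_PER_YEAR
average_years = worst_case_years / 2   # on average half the keys are tried

print(f"{keyspace:,} possible keys")
print(f"roughly {worst_case_years:,.0f} years to try every key")
print(f"roughly {average_years:,.0f} years on average")
```

Every additional bit in the key doubles these figures, while a faster searcher divides them; the estimate is only as good as the assumed rate.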

Therefore the key is crucial to DES encryption. It is also the problem, for passing the key to authorized persons turns out to be the Achilles heel of the process. How is the key sent to someone — on paper in the mail? By messenger? These introduce the usual vulnerabilities dramatized in thriller literature. Do you send the key electronically? Sending it as plain text doesn’t seem like a good idea, and sending it in encrypted form — well, you see the problem. This is a recognized flaw in the widely-used DES encryption method.

The RSA encryption technique is called public key encryption. The computational algorithm depends upon a specific pair of numbers, a public key and a private key; data encoded by one number cannot be decoded using the same number but can only be decoded by the other number, and vice versa (see Fig. 1). A correspondent B keeps one of the pair of numbers secret as a private key and makes the other number available as a public key. The public key can be used by anyone, for example her friend A, for coding messages which he sends to B; only B can decode them, because only she has the other number of the pair. She sends an encrypted message back to A using not her private key, but A’s public key, and only he can decode it, mutatis mutandis.

Alternatively, B can code a simple message using her private key; anyone can decode it using her public key. This functions as a digital signature, allowing her messages to be authenticated, since only she is able to create such messages. The usefulness is evident in financial transfers, for example, or in authenticating e-mail or electronic purchase orders.
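The relationship between the two keys, in both the privacy and the signature direction, can be illustrated with a toy sketch. The primes, the exponent and the function name below are assumptions chosen to be tiny and readable; they do not come from this paper, and practical RSA uses numbers hundreds of digits long together with padding steps omitted here.

```python
# Toy RSA key pair with textbook-sized numbers: for illustration only, not secure.
p, q = 61, 53                  # two secret primes (assumed values)
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # B's public key exponent
d = pow(e, -1, phi)            # B's private key exponent (needs Python 3.8+)

def apply_key(number, exponent):
    """Transform a number smaller than n using one key of the pair."""
    return pow(number, exponent, n)

message = 65                   # a message already encoded as a number below n

# Privacy: A codes with B's public key; only B's private key decodes.
ciphertext = apply_key(message, e)
recovered = apply_key(ciphertext, d)

# Signature: B codes with her private key; anyone checks with her public key.
signature = apply_key(message, d)
checked = apply_key(signature, e)

print(recovered == message, checked == message)
```

In both directions the number is restored only by the opposite key of the pair, which is the property the text describes.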

Encryption is valuable for security. But neither the DES nor the RSA form is useful as an authentication system. Encryption could perhaps be used to authenticate a text if one considered it as an envelope with contents presumed to be intact, but this would only work if the text had not been changed and re-encrypted. Encryption also has several drawbacks as a long-term authentication means. No matter which method is used, encryption requires keys specific to the reader and writer. If the keys are generally available, as they would need to be for wide document access, then authentication is not possible, for the document could easily be modified and re-encrypted using the same keys. In addition, one of our concerns in librarianship is authentication over periods of time longer than a normal human lifetime. Secret keys may be lost over such periods of time, making encrypted documents useless.

Hashing

Another technique is called hashing; it is a shorthand means by which the uniqueness of a document may be established. Hashing depends upon the assignment of arbitrary values to each portion of the document, and thence upon the resulting computation of specific but contentless values called “hash totals” or “hashes.” They are “contentless” because the specific computed hash totals have no value other than themselves. In particular, it is impossible or infeasible to compute backward from the hash to the original document. The hash may be a number of a hundred digits or so, but it is much shorter than the document it was computed from. Thus a hash has several virtues: it is much smaller than the original document; it preserves the privacy of the original document; and it uniquely describes the original document.

Fig. 2 allows a simplified description of how a hash is created. If each letter is assigned a value from 1 to 26, then a word will have a numeric total if its letters are summed. In the first example, EAT has the value of 26. The problem is, the word TEA (composed of the same letters) has the same value in this scheme. The scheme can be made more complicated, as shown in the second pair of examples, if the letter-values are also multiplied by a place value. In this scheme, the two words composed of the same letters end up with different totals. For the sake of illustration, the numbers at the right are shown as summed to the value 52 at the bottom; in fact the total is 152, but the leftmost digit can be discarded without materially affecting the fact that a specific hash total has been found: contentless, private, and (in this simple example) reasonably distinctive of the particular words in the “document.”
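Since Fig. 2 is not reproduced here, the following sketch reconstructs the idea under stated assumptions: letters are valued A=1 through Z=26, the place value is taken to be the letter's position in the word counting from 1, and only the last two digits of the total are kept. The function names are hypothetical, and the figure's own place values may differ.

```python
def letter_value(ch):
    """A=1, B=2, ..., Z=26."""
    return ord(ch.upper()) - ord("A") + 1

def simple_sum(word):
    """First scheme: add up the letter values."""
    return sum(letter_value(ch) for ch in word)

def place_weighted_sum(word, keep=100):
    """Second scheme: weight each letter by its position (assumed 1, 2, 3, ...),
    then discard all but the low-order digits of the total."""
    total = sum(pos * letter_value(ch) for pos, ch in enumerate(word, start=1))
    return total % keep

print(simple_sum("EAT"), simple_sum("TEA"))                  # 26 26
print(place_weighted_sum("EAT"), place_weighted_sum("TEA"))  # 67 33
```

The plain sum cannot tell the two words apart, while the position-weighted sum can. Neither scheme resists deliberate forgery; that is what cryptographic hash functions add.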

This is a very simplified description of a process that can be made far too complicated for human computation. Using cryptographic techniques, it is easy for current computing technology to compute quite complex hashes for any kind of document; paradoxically, these hashes are beyond the reach of computers to forge or break in the foreseeable future. Hashing as a means of authentication is a topic of interest to the business and governmental communities and there have been several recent mathematical papers on it, including descriptions of recent patents.

How might authors use hashing as an authentication technique? Above all it must be easy to use. It is typical for a document to be mundane at the time of its creation; it is only later that a document becomes important. Therefore an authentication mechanism must be so cheap and easy that documents can be authenticated as a matter of routine. First, there must be an agreement on a hashing algorithm that is generally trusted. Second, the algorithm must be widely distributable in a useful form, perhaps as a menu or hot-key command on a microcomputer or even embedded as a routine operating system option. To be useful, the selected algorithm must be commercially licensed and so cheap that there is no barrier to hashing documents at will.

In such a scheme, each time a document or a draft is created or saved the hash is created and saved with it and is separately retrievable. If the document is electronically published, it is published with its hash; and if the document is cited, the hash is part of the citation. If a reader using the document then wishes to know if she has the unaltered form, she computes the hash easily on her own computer using the standard algorithm and compares it with the published hash. If they are the same, she has confidence she has the correct, untampered version of the document before her.
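The publish-and-compare routine just described might look like the following minimal sketch. SHA-256 from Python's standard library is used only as a stand-in for "the standard algorithm", which this paper does not name; the function names and sample texts are hypothetical.

```python
import hashlib

def document_hash(data: bytes) -> str:
    """Compute the hash that would be published alongside the document."""
    return hashlib.sha256(data).hexdigest()

def is_unaltered(data: bytes, published_hash: str) -> bool:
    """Recompute the hash on the reader's own machine and compare."""
    return document_hash(data) == published_hash

# The author publishes the document together with its hash.
original = b"Call me Ishmael."
published = document_hash(original)

# A reader later checks two copies against the published hash.
faithful_copy = b"Call me Ishmael."
altered_copy = b"Call me Ishmael!"

print(is_unaltered(faithful_copy, published))
print(is_unaltered(altered_copy, published))
```

A change of a single character yields a different hash, so the altered copy fails the comparison. The check is only as trustworthy as the channel by which the published hash reaches the reader.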

Time-stamping

Digital time-stamping takes the process a step further. Time-stamping is a means of authenticating not only a document but its existence at a specific time. It is analogous to the rubber-stamping of incoming mail with the date and time it was received. An electronic technique has been developed by two researchers at Bellcore in New Jersey, Stuart Haber and Scott Stornetta.[13] Their efforts initially were prompted by charges of intellectual fraud made against a biologist, and they became interested in the problem of demonstrating whether or not electronic evidence had been tampered with. In addition, they are aware that their technique is useful as a means for determining priority of thought, for example in the patenting process, so that electronic claims for intellectual priority could be unambiguously made.

Their technique depends on a mathematical procedure involving the entire specific contents of the document, which means they have provided a tool for determining change as well as for fixing the date of the document. A great advantage of their procedure is that it is entirely public, except (if desired) for the contents of the document itself. Thus it is very useful for the library community, which wishes to keep documents available rather than hide them, and which needs to do so over periods of time beyond those it can immediately control. It is also likely to be useful for segments of the publishing community which will want to provide a means for buyers to authenticate what they have purchased.

The time-stamping process envisioned by Haber and Stornetta depends upon hashing as the first step. Assume, in Fig. 3, that Author A creates Document A and wishes to establish it as of a certain time. First he creates a hash for Document A using a standard, publicly-available program. He then sends this hash over the network to a time-stamping server. Note that he has thus preserved the privacy of his document for as long as he wishes, as it is only the hash that is sent to the server. The time-stamping server uses standard, publicly-available software to combine this hash with two other numbers: a hash from the just-previous document that it has authenticated, and a hash derived from the current time and date. The resulting number is called a certificate, and the server returns this certificate to Author A. The author now preserves this certificate, a number, and transmits it with Document A and uses it when referring to Document A (e.g. in a bibliography) in order to distinguish it from other versions of the document.
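The server's step can be sketched as follows. This is not Haber and Stornetta's actual construction: the class, the way the three values are joined, the use of SHA-256, and the use of the previous certificate to stand in for "a hash from the just-previous document" are all assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timezone

def h(text: str) -> str:
    """Stand-in hash function (SHA-256); the paper names no algorithm."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class TimeStampServer:
    """Hypothetical server that links each certificate to the one before it."""

    def __init__(self, seed: str = "start of chain"):
        self.previous = h(seed)          # assumed starting value for the chain

    def certify(self, doc_hash: str, when: datetime) -> str:
        time_hash = h(when.isoformat())  # hash derived from the time and date
        certificate = h(doc_hash + self.previous + time_hash)
        self.previous = certificate      # becomes the link for the next request
        return certificate

# Author A hashes Document A locally; only the hash goes to the server.
author_hash = h("text of Document A")
server = TimeStampServer()
when = datetime(2000, 1, 1, tzinfo=timezone.utc)   # arbitrary example time
certificate_a = server.certify(author_hash, when)
print(certificate_a)
```

Because the previous link is folded into every certificate, the same document submitted after a different predecessor would receive a different certificate.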

The time-stamping server has one other important function: It combines the certificate hash with others for that week into a number which, once a week, is now published in the personals column of The New York Times (“Commercial and Public Notices”), as in Fig. 4. The public nature of this number (what Stornetta calls an example of a “widely-witnessed event”) assures that it cannot be tampered with.

The privacy of the document has been preserved for as long as Author A wishes; there is also no other secrecy in this process. All steps are taken in public using available programs and procedures. Note too that no other document will result in the same certificate, for Document A’s certificate is dependent not only upon the algorithms and the document’s hash total, but also upon the hash of the particular and unpredictable document that was immediately previous. Once Document A has been authenticated, it becomes itself the previous document for the authentication of Document B.

Now let us consider Reader C, who wishes to determine the authenticity of the electronic document before her. Perhaps it is an electronic press release from a senatorial campaign, or an index purchased over the network from an electronic publisher, or perhaps it is the year 2093 and the document is an electronic text of Author A. Reader C has available the certificate for Document A. If she can validate that number from the document she can be sure she has the authenticated contents. Using the standard software, she recreates the hash for the document and sends the hash over the network, with the certificate, to the time-stamping server. The server reports back on the validity of the certificate for that document.

But let us suppose that it is the year 2093 and the server is nowhere to be found. Reader C then searches out the microfilm of The New York Times for the putative date of the document in question and determines the published hash number; using that number and the standard software she tests the authenticity of her document just as the server would.
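One way such an offline check could work is sketched below. It assumes, purely for illustration, that the week's certificates are combined pairwise in a tree of hashes and that the reader has kept, along with her certificate, the few sibling hashes needed to recompute the published number. The paper does not specify how the weekly number is formed, so the tree, SHA-256, and all function names are assumptions. The sketch also assumes at least one certificate per week.

```python
import hashlib

def h(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def weekly_root(certificates):
    """Hash certificates together in pairs, level by level, until one remains.
    Returns the published number and every level of the tree."""
    level = list(certificates)
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                # odd count: repeat the last hash
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return level[0], levels

def proof_for(index, levels):
    """Sibling hashes that the holder of one certificate keeps for later checks."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1               # the neighbor in the same pair
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def recompute_root(certificate, path):
    """What Reader C does: rebuild the published number from her own material."""
    value = certificate
    for sibling, sibling_is_left in path:
        value = h(sibling + value) if sibling_is_left else h(value + sibling)
    return value

week = [h(f"certificate {i}") for i in range(5)]   # made-up certificates
published, levels = weekly_root(week)              # the number in the newspaper
path = proof_for(3, levels)                        # kept with certificate 3

print(recompute_root(week[3], path) == published)
print(recompute_root(h("forged certificate"), path) == published)
```

The reader needs only her certificate, the short path, and the number printed in the newspaper; the server itself plays no part in the check.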

What I have described are simplified forms of methods for identifying a unique document, and for authenticating a document as created at a specific point in time with a specific content. Whether the specific tools of hashing or time-stamping are those we will use in future is open to question. It is however the first time that authors, publishers, librarians and end-users have been offered electronic authentication tools that provide generality, flexibility, ease of use, openness, low cost, and functionality over long periods of time on the human scale. Using such tools (or similar ones yet to be developed), an author can have confidence that the document being read is the one he or she published, and that it has not been altered without the reader being aware of it. Such tools are essential for every player in the chain of scholarly communication.

ROLE OF LIBRARIANS

It may be asked why librarians make such authentication issues their concern. Why do they do this — why do they bother? The short answer is that it is what librarians do. As noted earlier, the basic professional paradigm for librarians is to acquire information, organize it, preserve it and make it available.

It is the preservation imperative that is particularly important for this audience of authors and publishers as well as for librarians. Authors and publishers have an interest in seeing that their works are preserved and provided in uncorrupted form, but neither has taken on the responsibility for doing so; librarians have. Authors have a specific interest in the uncorrupted longevity of their works, and both authors and research libraries have long periods of time as their concern. Librarians have taken on the particular responsibility to see that authors’ works (and the graphic culture in general) are preserved and organized for use, not only by our generation but by succeeding generations of scholars and students. On behalf of future readers, librarians have the general responsibility for preserving against moth, rust and change. If librarians do not preserve works for the long haul, no one else will; once again, it is what librarians do.

Speaking pessimistically for a moment, it is possible that the job cannot be done. We may all — librarians, authors and publishers — be swimming against the tide. Our society is obsessed with the present and is generally uncaring of the past and of its records. Technologically refined tools are now available which allow and encourage the quick and easy modification of text, of pictures, and of sounds. It is becoming routine to produce ad hoc versions of performances, and to produce technical reports in tailored versions on demand. Post-modernist critical theory detaches authorial intention from works, and demeans the importance of historical context. The technology that allows us to interact with information itself inhibits us from preserving our interaction.

However, there is cause for optimism. In our house there are many mansions; there will continue to be people who want history, who care what authors say, and who wish the human record to last. They will support the efforts of librarians to achieve these goals. We are fortunate that electronic preservation is of some interest to other communities for mundane commercial reasons. The financial, publishing and other business communities have a stake in the authenticity of their electronic communications. The business and computing communities wish to protect against the undesired loss of data in the short term. The governmental and business communities profess an interest in the security of systems.

The protection of intellectual property in the internetworked multimedia environment is the concern of this conference. The preservation of the actual information content is a prerequisite to the protection of property rights. Recognizing the need for authenticating and preserving our intellectual productivity is a common ground for authors, publishers and librarians.

NOTES

1. Parts of this paper are drawn from the author’s presentation at the 1992 annual preconference of the Rare Books and Manuscripts Section of the Association of College and Research Libraries, and published in Robert S. Martin, ed., Scholarly Communication in an Electronic Environment: Issues for Research Libraries (Chicago: American Library Association, 1993), as “Preserving the Intellectual Record and the Electronic Environment” (pp. 71-101).

2. Gordon B. Neavill, “Electronic Publishing, Libraries, and the Survival of Information,” Library Resources & Technical Services 28:76-89 (Jan. 1984), p. 78.

3. However, see the recent work by Stuart Moulthrop, Victory Garden (Cambridge, Mass.: Eastgate Systems, 1991 [800K disk (signed and numbered 226/250 by author) for Macintosh + 16 p. brochure with introduction by Michael Joyce and explanatory matter, in plastic casing labeled “first edition”]).

4. There is a third kind, the obsolescence of software designed to read a specific medium. For example, Kathleen Kluegel has pointed out how CD-ROM software updates have left unreadable older disks of the same published data base. She fears CD-ROM ending up “being the 8-track tape of the information industry” in “CD-ROM Longevity,” message on PACS-L (listserv@uhupvm1.bitnet, April 29, 1992).

The best discussion of medium preservation, and the distinctions between the various kinds of obsolescence, is in Michael Lesk, Preservation of New Technology: A Report of the Technology Assessment Advisory Committee to the Commission on Preservation and Access (Washington, DC: CPA, 1992).

5. See especially Lesk, but also Janice Mohlhenrich, ed., Preservation of Electronic Formats: Electronic Formats for Preservation (Fort Atkinson, Wis.: Highsmith, 1993), the proceedings of the 1992 WISPPR preservation conference.

6. Neavill, 1984, p. 77.

7. TEI P1, Guidelines, Version 1.1: Chapter 4, Bibliographic Control, Encoding Declarations and Version Control (Draft Version 1.1, October 1990); sec. 4.1.6, Revision History, p. 55: “…[I]f the file changes at all, even if only by the correction of a single typographic error, the change should be mentioned…. The principle here is that any researcher using the file, including the person who made the changes, should be able to find a record of the history of the file’s contents.”

8. Harvey Wheeler, keynote speech at the October, 1988 LITA conference (Boston, Mass.). The issue arises in a different context in the ESTC note below.

9. A peculiar case is the transportation time-table; theoretically it could be dynamically updated in electronic form, yet it is the timetable’s hard-copy publication that signals to the users that a change has occurred.

10. An electronic catalog is a similar case. Librarians never pretended that card catalogs were static, but the electronic catalogs (particularly when on the network) are so accessible as to raise citation problems. Robin Alston, in Searching the Eighteenth Century (London: British Library, 1983), claimed superiority for the Eighteenth Century Short Title Catalog (ESTC) on the grounds that “machine-readable data…can be always provisional.” Hugh Amory, a Harvard rare books cataloger, responded in a review by noting: “The permanence of print has its own advantages, moreover: who will wish to cite a catalogue that can change without notice?” Papers of the Bibliographical Society of America (PBSA) Vol. 79 (1985), p. 130.

11. See the discussion of hypertext books in Robert Coover, “The End of Books,” The New York Times Book Review (June 21, 1992), p. 1, 23-25. Examples of such works include Moulthrop (n. 3 above), Michael Joyce, Afternoon: A Story (Cambridge, Mass.: Eastgate Systems, 1987), and Carolyn Guyer and Martha Petry, “Izme Pass,” Writing on the Edge Vol. 2, no. 2 (Spring, 1991), attached Macintosh disk.

12. DES is described in FIPS Publication 46-1: Data Encryption Standard, National Bureau of Standards, January 1988. RSA Data Security, from whom information is available about their product, is at 10 Twin Dolphin Drive, Redwood City, California 94065; the original description of RSA’s method is in R. L. Rivest, A. Shamir, and L. Adleman, “A Method for Obtaining Digital Signatures and Public-key Cryptosystems,” Communications of the ACM, Vol. 21, No. 2 (Feb. 1978), p. 120-126.

A few readily available popular articles on the two schemes include John Markoff, “A Public Battle Over Secret Codes,” The New York Times (May 7, 1992), p. D1; Michael Alexander, “Encryption Pact in Works,” Computerworld, Vol. 25, No. 15 (April 15, 1991); G. Pascal Zachary, “U.S. Agency Stands in Way of Computer-security Tool,” The Wall Street Journal (Monday, July 9, 1990); D. James Bidzos and Burt S. Kaliski, Jr., “An Overview of Cryptography,” LAN Times (February 1990). More technical and with many references is W. Diffie, “The First Ten Years of Public-key Cryptography,” Proceedings of the IEEE, Vol. 76, No. 5 (May 1988), p. 560-577.

13. Stuart Haber and W. Scott Stornetta, “How to Time-stamp a Digital Document,” Journal of Cryptology (1991) 3:99-111; also, under the same title, as DIMACS Technical Report 90-80 ([Morristown,] New Jersey: December, 1990). DIMACS is the Center for Discrete Mathematics and Theoretical Computer Science, “a cooperative project of Rutgers University, Princeton University, AT&T Bell Laboratories and Bellcore.” The authors are Bellcore employees.

D. Bayer, S. Haber, and W. S. Stornetta, “Improving the Efficiency and Reliability of Digital Time-stamping,” Sequences II: Methods in Communication, Security, and Computer Science, ed. R. M. Capocelli et al. (New York: Springer-Verlag, 1993), p. 329-334.

A brief popular account of digital time-stamping is in John Markoff, “Experimenting with an Unbreachable Electronic Cipher,” The New York Times (Jan. 12, 1992), p. F9. A better and more recent summary is by Barry Cipra, “Electronic Time-Stamping: The Notary Public Goes Digital,” Science Vol. 261 (July 9, 1993), p. 162-163.

BIOGRAPHY

Peter S. Graham, Associate University Librarian for Technical and Networked Information Services at Rutgers University, co-leads the Working Group on Legislation, Codes, Policies and Practices of the Coalition for Networked Information, and serves on the Council of the American Library Association. Holding an M.L.S., he has been a senior administrator of university libraries and computing centers.

        Peter S. Graham
        Associate University Librarian for Technical 
           and Networked Information Services
        Rutgers University Libraries
        169 College Ave.
        New Brunswick, N.J. 08903
        (908) 932-5908
        fax (908) 932-5888
        e-mail:  psgraham@gandalf.rutgers.edu
