by Prof. Kenneth L. Phillips
Information is present when it enables a more informed decision between two equally probable events. Information loses half its value in an information half-life, which is shortening as the velocity and bandwidth of information flows increase. The tremendous economic incentives to collect and synthesize information about the use of information must be balanced against possible threats to individual privacy.
The nature of information itself has changed fundamentally, as a result of advanced networking technologies, and in ways which will require the development of novel concepts and approaches to the protection of intellectual property. Technologies of telecommunications are never content-neutral, rendering the content/conduit distinction a legal fiction. As a result of these technological changes, new forms of information will develop, and along with them, increased incentives to sell these new forms, often complicating the development and enforcement of privacy and intellectual property concepts.
Although even a cursory review of the trade press will reveal considerable debate over the “future of the network,” I feel secure in setting forth a few planning assumptions which are not contentious:
- The Network is moving away from the dedicated paths typical of circuit-switched technologies, at all bandwidth levels, and in the direction of “virtual” switched environments, routing information on a per-cell or per-packet basis.
- Switching will take place at the packet level, though it is more difficult to predict whether variable length, fixed length, synchronous, asynchronous or isochronous formats will be first choice.
- Packet processing at the baseband level will be at such a rate as to render transmission and switching of cells and packets of compressed video and other multimedia data practicable, on a real-time, low latency basis.
- Dynamic bandwidth allocation will become a fundamental feature of integrated networks of the future, as contrasted with the deterministic time division multiplexing methodologies typical of today’s digital networks, used largely for highly predictable voice traffic. Application sectors experiencing the highest rates of future growth produce traffic characteristics which are intensely “bursty,” where demand fluctuates drastically from millisecond to millisecond, and where peaking is not highly predictable (i.e., not Poisson-like). The interconnection of networks on a global level has resulted in an amplification of the spikes in traffic brought on by natural disasters, political change, and fundamental global financial trends in foreign exchange, rare metals, and international arbitrage.
- The variance in traffic arrival rates will grow further as demand rises for the simultaneous delivery of bit streams ranging from the traditional 56/64 kb/s representative of basic voice telephony to the 155 Mb/s required for high-definition television, as these technologies come on line.
These basic changes in user requirements have dictated the development of cell relay switching methods using fixed-length cells and Asynchronous Transfer Mode (ATM). Using fixed-length cells running at heretofore unencountered speeds enables the network to exploit the Law of Large Numbers to make the arrival distribution more predictable. Fixed-length cell structure allows the pipelining of applications, further smoothing the mean arrival rate curves and ultimately increasing economy.
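The smoothing effect attributed here to the Law of Large Numbers can be illustrated with a small simulation. This is a hypothetical sketch with made-up burst parameters, not drawn from any actual traffic data: aggregating many independent bursty sources shrinks the relative variability of the total arrival rate.

```python
import random

def relative_spread(n_sources, n_slots=5_000, seed=1):
    """Simulate n_sources independent bursty (on/off) traffic sources and
    return the ratio of standard deviation to mean of the aggregate
    cell-arrival count per time slot (coefficient of variation)."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_slots):
        # Each source emits a burst of 10 cells with probability 0.1,
        # otherwise stays silent -- an intensely "bursty" pattern.
        totals.append(sum(10 if rng.random() < 0.1 else 0
                          for _ in range(n_sources)))
    mean = sum(totals) / n_slots
    var = sum((t - mean) ** 2 for t in totals) / n_slots
    return (var ** 0.5) / mean

# Aggregating more sources makes the arrival distribution more predictable:
# the relative spread falls roughly as 1/sqrt(n_sources).
print(relative_spread(1), relative_spread(100))
```

The single-source spread is on the order of 3 (bursts dominate), while one hundred multiplexed sources show a spread an order of magnitude smaller, which is precisely the statistical multiplexing gain the text describes.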
Although these technologies will require solution sets to intellectual property issues having characteristics unlike anything we have developed in the past, the basic problems are surprisingly old.
While the East Coast of the United States was experiencing the “storm of the century” a couple of months ago, I had the good fortune to have been working in Europe. One morning, while taking the train from Zurich to Basel, I purchased a copy of the Herald Tribune and noticed what struck me as a rather odd headline to deserve placement across the lower half of the front page. It chronicled the decommissioning of an “Elite French Army Squad,” which had as its principal duty the transmission of packets of information since nearly the time of Julius Caesar. This battalion saw its most heroic hour at the Battle of Verdun, in 1916, when its members carried messages back to base through poison gas, appealing for assistance. Yet despite such bravery, this most recent announcement was but the fourth time in French history that this unique group has been threatened with dissolution. In the past, fear mounted that the messages would be intercepted and the identities of the senders, as well as the content, disclosed to those who would sell or otherwise pass such information to the enemy.
Back in the late 1970s, while completing graduate school, I worked for the United States District Court for the Southern District of New York assisting the Court in criminal cases having complex backgrounds, often involving conflicting expert testimony. I remember a case in which the FBI, in attempting to locate a terrorist who had allegedly blown up microwave relay towers, visited the local public library and asked the staff to compile a list of patrons who had borrowed books on such subjects as making explosives. The polite ladies refused, arguing that such information about who sought information on a particular subject was private. The federal government sued, arguing that since the library was funded from public monies, its records were as public as the books it loaned. The government initially lost, but appealed and won. The matter was then joined by the ACLU and other groups, and on further appeal the appellate court decision was overturned. The court finally held that, absent a disclosure statement to the contrary, patrons of libraries have a reasonable expectation, in the form of an implicit contract or guarantee, that such information will not be sold or otherwise disclosed without their permission, except where a court of jurisdiction grants a warrant. Strangely, in the instant case, no warrant was sought by the law enforcement organization.
My purpose in telling you these things, which on the surface may strike you as unrelated to the subject of this meeting, is to alert you to a new form of information which, while not entirely new, becomes both more readily available and very much more valuable as a result of, and throughout the digital age: meta-information, or information about the use of information. Indeed, as will be seen shortly, this new form of information has the potential to alter pervasively the nature of some of our largest industries, such as telecommunications, retail, and finance, not to mention the enormous inducement it could provide to breach personal privacy in ways totally unheard of in the past. In addition, while both the legal and regulatory communities will have to revise their statutes and rules significantly in order to provide adequate protection and enforcement of intellectual property rights, history clearly teaches that we should not wait for changes to take place in these areas. Both federal regulatory and intellectual property law lag years behind the introduction of technologies altering the powers of those who use them, regardless of their intentions and motivations.
Elsewhere, I have argued that the proliferation of meta-information, coupled with advanced telecommunications technologies, has profound implications for those whose notion of political sovereignty includes operating so-called “closed societies.” Perhaps the most lucid discussion of this dynamic may be found in The Twilight of Sovereignty, an exceptional volume authored by Walter Wriston, Citicorp’s former Chairman.
In order to understand the dynamics of meta-information, it is first necessary to recognize the basic unit of information, which I like to call the infon, a term first used by Keith Devlin. Though a more formal mathematical definition is possible, for our brief purposes suffice it to say that information is present if and only if it aids one in making a decision between two equally probable choices. Such a definition establishes a distinction between data and information. For example, the statement “We are at the Kennedy School” surely contains data, but not information, since it is reasonable to assume that everyone here knows where they are. An infon, therefore, is a basic unit of information and by definition must have some value, though at this juncture we have not agreed on how information should be valued.
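The decision-between-two-equally-probable-choices criterion can be made quantitative. The sketch below uses Shannon's surprise measure as an analogue; this is my illustration, not Devlin's formal definition of the infon.

```python
from math import log2

def information_bits(p_before, p_after):
    """Bits of information gained when the probability assigned to the
    true outcome moves from p_before to p_after: the drop in surprise,
    -log2(p_before) - (-log2(p_after))."""
    return log2(p_after) - log2(p_before)

# Two equally probable choices: a message that fully resolves the choice
# moves the probability from 1/2 to 1, delivering exactly one bit.
print(information_bits(0.5, 1.0))   # -> 1.0

# A statement everyone already knew ("We are at the Kennedy School")
# leaves the probability at 1: data, but zero information.
print(information_bits(1.0, 1.0))   # -> 0.0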
If we concede that information exists and that its basic unit may be called an infon, and that it has at least some minimal value, then in order to understand what must be done to protect that value, we must first look at the dynamics affecting value. These dynamics have changed significantly at the hands of technology, and telecommunications in particular.
Perhaps the most impressive aspect of what has gone on in the technology of telecommunications in recent years is the increase in both the rate and the bandwidth at which information is transmitted, switched, processed and then sometimes retransmitted. It is generally assumed that the acceleration of information transfer rates to the speed of light minus some ever-decreasing variable is for the good. I shall hold true to my promise to the conference chair to leave the so-called “policy” issues for another time, but would like to remind you, through the use of a riddle, that these questions are more complex than they appear at first blush. The riddle I use in class is, “What do a greengrocer in the days prior to refrigeration and the modern information manager have in common?” The answer, of course, is that both are dealing with a terribly fragile commodity with a very short shelf life. Those who earn their keep from the sale of information in many ways have their lives made more difficult by the acceleration in velocity and bandwidth. For example, not many years ago, one could sell a quotation service offering the spot price of chromium, which is principally traded out of Zaire, on the London, New York or Zurich markets, based on transactions occurring 24 hours earlier. Today, such data has no value, because trading desks are linked to one another via broadband networks operating at SONET rates. Within a couple of seconds the latest spot price appears updated on electronic spreadsheets seen on hundreds of trading screens in over a dozen countries. Not only are traditional opportunities for spread-based arbitrage significantly reduced, but the base prices are subject to drastic fluctuations due to the simultaneous presentation of infons connected with either related metals, industries which are high consumers of chromium, or political events affecting Zaire. All of this sort of information is now available essentially at the speed of light.
The value of an infon in this sort of environment becomes critically related to the amount of time that has elapsed since the receipt of the most recent infon dealing with the same matter. Accordingly, I would argue that it now makes sense to speak of information or infon half-lives: a measure of a quantum of time in which a given infon loses 50% of its value. Indeed, when it has lost 100% of its value it no longer constitutes information, since it can play no role in assisting one with the classical choice between two equally probable outcomes.
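The half-life notion can be written down directly. The exponential form below follows the radioactive-decay analogy; it is my reading of the analogy rather than a formula given in the text, and under it value only approaches zero asymptotically.

```python
def infon_value(initial_value, elapsed, half_life):
    """Value remaining after `elapsed` time units, where the infon
    loses half its value every `half_life` units (decay analogy)."""
    return initial_value * 0.5 ** (elapsed / half_life)

# A spot-price infon worth $1.00 with a 2-second half-life:
print(infon_value(1.00, 2, 2))   # one half-life    -> 0.5
print(infon_value(1.00, 6, 2))   # three half-lives -> 0.125
```

As velocity and bandwidth rise, `half_life` shrinks, and the window in which a quotation service can charge for yesterday's chromium price collapses accordingly.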
These notions are simple and I hope clear, and came to my mind as meaningful analogies to things I learned as a graduate student in physics. Information, it seems to me, suffers from the classical paradox of being considered to behave simultaneously as a wave-like phenomenon, and as discrete entities or particles/commodities of some kind. This is why most businessmen, with a few interesting exceptions, have such a hard time figuring out how to sell it.
What stands to change this somewhat is the advent of such techniques of information transfer as Asynchronous Transfer Mode (ATM), where the advantages of fixed cell structures on network operation render it almost certain that high-level infons will require more than one cell or packet. Indeed, under the current wisdom, information is packaged into fixed-size cells of 53 octets. Cells are identified and switched throughout the network by means of a label in the header. ATM allows bit-rate allocation on demand, so the bit rates can be selected on a connection-by-connection basis. The actual channel mixture at the broadband interface point can change dynamically on very short notice. Theoretically, ATM supports channelization from low kb/sec. up to the entire payload capacity of the interface, minus some small overhead factor.
The ATM header contains the label, which is comprised of a Virtual Path Identifier (VPI) and a Virtual Channel Identifier (VCI), together with an error detection field. Error detection in ATM is limited to the header alone, a mixed blessing. Further content-based error correction takes place at the periphery of the network, within applications running on hosts and their interface nodes. The ATM cell format for user, as opposed to bearer, network interfaces is specified in CCITT Recommendation I.361. The header, as usual, is transmitted first. However, inside each octet, bits are sent in decreasing order, starting with bit 8. But octets are sent in increasing order, beginning with octet 1. (The network node interface “NNI” cell is identical to the layout in Figure 1 except that the VPI occupies the entire first octet rather than just bits 1 through 4, since there is no GFC field at the NNI.)
The ATM Cell Fields consist of the following:
Generic Flow Control (GFC) Field.
This 4-bit field allows encoding of 16 states for flow control. No standardization has yet occurred for coding values. The CCITT is presently considering several proposals.
Routing Field (VPI/VCI)
24 bits are available for routing: 8 bits for the VPI and 16 for the VCI (Virtual Channel Identification). Except for a few reserved VCI codes used for signaling and for indicating general broadcast, the encoding methodology has yet to be set. This is very important, for reasons which will become clear shortly.
Payload Type (PT) Field.
Two bits are available for Payload Type identification, differentiating user information payloads from network information. In user information cells, the payload consists of user information and service adaptation information; in network information cells, the payload does not form part of the user’s information transfer.
Cell Loss Priority (CLP) Field.
If the CLP field is set (CLP = 1), the cell is subject to discard, depending on network conditions. If the CLP is not set (CLP = 0), the cell has a higher priority rating.
Header Error Control Field (HEC).
This field consists of 8 bits and is used for error management of the header itself.
Reserved Field.
This field, consisting of 1 bit, is reserved for further enhancement of cell header functions yet to be specified.
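The field layout above can be made concrete. The sketch below packs and checks a UNI cell header using the layout as described here (4-bit GFC, 8-bit VPI, 16-bit VCI, 2-bit PT, 1-bit reserved, 1-bit CLP); the HEC calculation shown (CRC-8 with generator x^8+x^2+x+1 over the first four octets, XORed with 01010101 per CCITT I.432) is the standard mechanism, but treat the whole thing as an illustration rather than a conformant implementation.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """CRC-8 with generator x^8 + x^2 + x + 1 (0x07), as used for the
    ATM Header Error Control field."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

def pack_uni_header(gfc: int, vpi: int, vci: int, pt: int, clp: int) -> bytes:
    """Pack a 5-octet UNI cell header:
    GFC(4) | VPI(8) | VCI(16) | PT(2) | RES(1) | CLP(1) | HEC(8)."""
    word = ((gfc & 0xF) << 28) | ((vpi & 0xFF) << 20) | ((vci & 0xFFFF) << 4) \
           | ((pt & 0x3) << 2) | (clp & 0x1)          # reserved bit left 0
    first_four = word.to_bytes(4, "big")
    hec = crc8(first_four) ^ 0x55                     # coset shift per I.432
    return first_four + bytes([hec])

def unpack_uni_header(header: bytes):
    """Recover (gfc, vpi, vci, pt, clp) after verifying the HEC."""
    assert len(header) == 5
    assert crc8(header[:4]) ^ 0x55 == header[4], "header error detected"
    word = int.from_bytes(header[:4], "big")
    return ((word >> 28) & 0xF,
            (word >> 20) & 0xFF,
            (word >> 4) & 0xFFFF,
            (word >> 2) & 0x3,
            word & 0x1)

cell = pack_uni_header(gfc=0, vpi=5, vci=32, pt=0, clp=1)
print(unpack_uni_header(cell))   # -> (0, 5, 32, 0, 1)
```

Note what the layout implies for the argument to come: the VPI/VCI label in every cell names a virtual path, and anyone positioned to read headers can therefore observe who is talking to whom without ever touching the payload.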
Consider what such routing and signaling capabilities already make possible. For example, with the implementation of both the Line Information Data Base (LIDB), justified to achieve 800-number portability for customers between long distance carriers, and the CCITT Signaling System VII (SS-VII), it is now possible for inter-LATA carriers to generate lists of customers by the 800 number called.
In a friendly deposition, the Direct Marketing Association (DMA) told the Committee of Corporate Telecommunications Users that its members would “be willing to pay $3 per name and address for a list of telephone subscribers sorted by 800 number destination. For example, an 800 number associated with a hotel charging at least x-amount for a room, or a contributions line to a charity or political party.” Following discovery of this fact, a similar inquiry was made of AT&T: how many calls are processed per day, and could such a list be compiled? AT&T averred that in excess of 100,000 such calls were processed per day, that the exact number was not obtainable on short notice, and that indeed, given SS-VII capabilities, originating station information was captured and could be cross-referenced with customer account files, and address lists printed out.
Aware of the more recent fact that AT&T is now the second largest issuer of consumer credit cards in the United States, processing literally millions of transactions per month, I sought to determine the value of infons consisting of telephone traffic information and credit card purchasing data linked by Boolean operands. In other words, what would the value to the list brokers (or banks, law enforcement agencies, tax collectors, lobbyists, etc.) be of data assembled in the new format of lists of people who, for example, called a hotel reservations 800 number and also spent over $500/month on sports equipment? To my astonishment the DMA indicated that if the list had been generated within one month of their members receiving it, the brokers would pay between the earlier $3 and $7 per name. Given the traffic numbers provided by AT&T earlier, clearly there exists an opportunity of at least $300,000 to $700,000 per day, simply based on the AT&T traffic.
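The Boolean cross-referencing described above, and the arithmetic behind the $300,000 to $700,000 figure, can be sketched directly. Every record, number, and threshold in the toy data below is made up for illustration; only the per-name prices and the 100,000 calls-per-day figure come from the text.

```python
# Hypothetical records -- no real subscribers or transactions.
hotel_800_callers = {"555-0101", "555-0102", "555-0103"}
sports_spend = {"555-0101": 650, "555-0102": 120, "555-0103": 900}

# Boolean linkage: called the hotel reservations 800 number AND
# spent over $500/month on sports equipment.
linked_list = [n for n in hotel_800_callers if sports_spend.get(n, 0) > 500]

# The revenue opportunity cited in the text: 100,000 qualifying calls
# per day at $3 to $7 per linked name.
calls_per_day = 100_000
low, high = calls_per_day * 3, calls_per_day * 7
print(sorted(linked_list))   # -> ['555-0101', '555-0103']
print(low, high)             # -> 300000 700000
```

The point is how little machinery the linkage requires: one set intersection over two databases that were never designed to be read together.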
All of this is just an example, and indeed one which AT&T rightly protests, since none of these practices is taking place at present. However, the writing is on the wall. Citicorp, with a much larger customer base, has used Thinking Machines’ equipment to develop detailed customer purchasing profiles linking telephone numbers to ZIP codes, to SMSA statistics, and to default rates. AT&T has issued letters of intent to purchase and lease similar equipment. Companies will eventually be forced to become far more open about such policies, just as nation states have had to as technology has forced the issue. In so doing, they will also become more profitable, as a greater sphere of potential consumers of meta-information becomes customers. But so far, few have figured this out. Indeed, telephone companies and banks are especially covetous of this sort of information. (Just ask a telephone company for traffic statistics between various parts of a city or state, or a bank for the average number of Automated Teller transactions on a time-of-day, neighborhood-by-neighborhood basis: all useful behavioral data.)
This phenomenon, of infons describing the use of information, constitutes second-order information, what I first termed meta-information several years ago. When such infons are linked to the identity of the user or to other classes of information, both the theft of intellectual property without detection of the act and the invasion of personal privacy become increasingly easy. Indeed, I believe that one might adopt the potentially draconian means of measuring the technological advancement of a given society by counting how many sorts of interconnected data bases, such as those containing meta-information, are required in order to gain the identity of any given citizen. Alternatively, in the case of intellectual property protection, one would simply ask the same question pertaining to detecting the location of some file or piece of unique work, be it art, software, or your latest manuscript. This will all become most interesting as we move towards such future institutions as digital libraries, for-profit image-based archives, high-definition audio recording, and the like.
The solution sets these problems require are not at hand, but they demand careful and thoughtful consideration of just what goes into such things as ATM Cell Fields. In non-dedicated route networks and in packetized environments, where the packet length is finite and small, resulting in a proliferation of transport cells, the identity of owners and users of intellectual property becomes far more accessible to the casual interloper as well as the professional thief.
Incentives to obtain meta-information will increase at least geometrically as the number of interconnected sources goes up arithmetically. Indeed, the value of such information may be expected to approach a log function of the number of sources. Figure 2 (courtesy of Privacy Journal) depicts basic meta-information flows between major categories of data collection in the United States. Clearly this is a booming business poised to take off, once the “Network of the Future” becomes perceived as a meta-information engine. Profound business, policy, and regulatory issues attend all this development. A long-distance carrier may see a contribution to revenue from processing a transcontinental call of only 9.7 cents per minute, while the existence of the virtual path through the network generates $3 to $7 worth of meta-information per call.
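The growth claims in this paragraph can be put side by side numerically. The constants below are arbitrary illustrations, not estimates: the number of pairwise linkages among n interconnected sources (one rough proxy for the incentive to interconnect) grows combinatorially, while the conjectured value of the assembled meta-information grows only as a log function of n.

```python
from math import log2

def pairwise_linkages(n_sources: int) -> int:
    """Number of distinct pairs of data bases that could be
    cross-referenced among n interconnected sources: n(n-1)/2."""
    return n_sources * (n_sources - 1) // 2

def conjectured_value(n_sources: int, unit_value: float = 1.0) -> float:
    """Value of the assembled meta-information under the log-function
    conjecture in the text; unit_value is an arbitrary scale factor."""
    return unit_value * log2(n_sources)

# Linkage opportunities explode while per-source value flattens:
for n in (2, 4, 8, 16):
    print(n, pairwise_linkages(n), conjectured_value(n))
```

The divergence between the two columns is the author's point in miniature: the pressure to interconnect outruns the marginal value of each additional source, which is precisely what makes the incentives so hard to regulate.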
How much is the string of four letters representing the Adenine, Guanine, Cytosine, and Thymine (A, G, C, T) bases of the DNA found on a particular allele of your 18th chromosome worth to you, the police, your bank, or a genetic engineering company attempting to clone antibodies in order to replicate adaptive or otherwise positive immune responses in less healthy individuals? What is the meaning of Justice Brandeis’ prescient equation of privacy with the right to be left alone, in light of these developments? I do not believe that there is cause for panic, but there is cause for pause and serious thought given these matters.
Yet again, these are not new issues. In fact, earlier on, in mentioning the decommissioning of the French Army Division and past concerns over the identity of the senders of data, I told the truth, but not the whole truth. Indeed, in the age of meta-information, lies of commission will become increasingly simple to spot, while the detection of deception by omission, without violating privacy, will present some uniquely vexing problems. And on that note I close, but not before I tell you that all the members of the famous French Guard threatened with extinction are pigeons.
Kenneth L. Phillips, Ph.D. has been Vice President for Telecommunications Policy at Citicorp for 15 years, where he is now Of Counsel. He is presently a Professor of Psychology at the Graduate Interactive Telecommunications Program at the Tisch School of New York University.