Meta-Information, The Network of the Future and Intellectual Property
Protection
by Prof. Kenneth L. Phillips
ABSTRACT
Information is present when a more informed decision between two equally
probable events can be made. Information loses half its value in an information
half-life, which is shortening as the velocity and bandwidth of
information flows increase. The tremendous economic incentives to collect and
synthesize information about the use of information must be balanced against
possible threats to individual privacy.
The nature of information itself has changed fundamentally, as a result of
advanced networking technologies, and in ways which will require the
development of novel concepts and approaches to the protection of intellectual
property. Technologies of telecommunications are never content-neutral,
rendering the content/conduit distinction a legal fiction. As a result of these
technological changes, new forms of information will develop, and along with
them, increased incentives to sell these new forms, often complicating the
development and enforcement of privacy and intellectual property concepts.
Although even a cursory review of the trade press will reveal considerable
debate over the "future of the network," I feel secure in setting forth a few
planning assumptions which I feel are not contentious:
- The Network is moving away from the dedicated paths typical of circuit
switched technologies, at all bandwidth levels, and in the direction of
"virtual" switched environments, routing information on a per-cell or
per-packet basis.
- Switching will take place at the packet level, though it is more difficult
to predict whether variable length, fixed length, synchronous, asynchronous or
isochronous formats will be first choice.
- Packet processing at the baseband level will be at such a rate as to render
transmission and switching of cells and packets of compressed video and other
multimedia data practicable, on a real-time, low latency basis.
- Dynamic bandwidth allocation will become a fundamental feature of integrated
networks of the future, as contrasted with the deterministic time division
multiplexing methodologies typical of today's digital networks, used largely
for highly predictable voice traffic. Application sectors experiencing the
highest rates of future growth produce traffic characteristics which are
intensely "bursty", where demand fluctuates drastically from millisecond to
millisecond, and where peaking is not highly predictable (i.e., not
Poisson-like).
The interconnection of networks on a global level has resulted in an
amplification of the spikes in traffic brought on by natural disasters,
political change, and fundamental global financial trends in foreign exchange,
rare metals, and international arbitrage.
- The variance in traffic arrival rates will grow further as demand rises for
the simultaneous delivery of bit streams of information, ranging from the
traditional 56/64 kb/s representative of basic voice telephony, to 155 Mb/s for
high definition television, and as these technologies come on line.
These basic changes in user requirements have dictated the development of cell
relay switching methods using fixed length cells and Asynchronous Transfer Mode
(ATM). Using fixed length cells running at heretofore unencountered speeds
enables the network to basically utilize the benefits of the Law of Large
Numbers to make the arrival distribution more predictable. Fixed length cell
structure allows the pipelining of applications, further smoothing the mean
arrival rate curves--ultimately increasing economy.
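
The smoothing effect can be illustrated with a minimal sketch. It assumes a deliberately simple model that is not taken from any ATM specification: n independent on/off sources, each emitting one cell per time slot with probability p. The function name aggregate_cv and the parameters are hypothetical. Under that independence assumption, the coefficient of variation (standard deviation divided by mean) of the aggregate cell count falls as 1/sqrt(n). Correlated bursts, such as the global spikes described above, would weaken the effect.

```python
import math


def aggregate_cv(n_sources: int, p_active: float) -> float:
    """Coefficient of variation of cells arriving per slot.

    Model (an assumption for illustration): n_sources independent on/off
    sources, each sending one cell in a slot with probability p_active.
    The cell count per slot is then binomial(n_sources, p_active).
    """
    if n_sources < 1 or not 0.0 < p_active <= 1.0:
        raise ValueError("need n_sources >= 1 and 0 < p_active <= 1")
    mean = n_sources * p_active
    std = math.sqrt(n_sources * p_active * (1.0 - p_active))
    return std / mean


if __name__ == "__main__":
    # Relative burstiness shrinks as more sources share the link.
    for n in (1, 100, 10_000):
        print(n, round(aggregate_cv(n, 0.1), 3))
```

Multiplying the number of multiplexed sources by 100 cuts the relative fluctuation by a factor of 10 in this model, which is the sense in which higher aggregate cell rates make the arrival distribution more predictable.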
Although these technologies will require solution sets to intellectual property
issues having characteristics unlike anything we have developed in the past,
the basic problems are surprisingly old.
While the East Coast of the United States was experiencing the "storm of the
century" a couple of months ago, I had the good fortune to have been working
in Europe. One morning, while taking the train from Zurich to Basel, I
purchased a copy of the Herald Tribune and noticed what struck me as a
rather odd headline to deserve placement across the lower half of the front
page. It chronicled the decommissioning of an "Elite French Army Squad,"
which has had as its principal duty the transmission of packets of information
since nearly the time of Julius Caesar.[1] This battalion
saw its most heroic hour at the Battle of Verdun, in 1916, when its members
carried messages appealing for assistance back to base through poison gas. Yet
despite such bravery, this most recent announcement was but the fourth time in
French history that this unique group has been threatened with dissolution. In
the past, fear mounted that the messages would be intercepted and the
identities of the senders, as well as the content, disclosed to those who would
sell or otherwise pass such information to the enemy.
Back in the late 1970's, while completing graduate school, I worked for the
United States District Court for the Southern District of New York assisting
the Court in criminal cases having complex backgrounds, often involving
conflicting expert testimony. I remember a case in which the FBI, in attempting
to locate a terrorist who had allegedly blown up microwave relay towers,
visited the local public library and asked the staff to compile a list of
patrons who had borrowed books on such subjects as making explosives. The
polite ladies refused, arguing that such information about who sought
information on a particular subject was private. The federal government sued,
arguing that since the library was funded from public monies, its records were
as public as the books it loaned. The government initially lost, but appealed
and won. The matter was then joined by the ACLU and other groups, and again
appealed, overturning the appellate court decision. The court finally held that
absent a disclosure statement to the contrary, patrons of libraries have a
reasonable expectation in the form of an implicit contract or guarantee that
such information will not be sold or otherwise disclosed without their
permission, except where a court of jurisdiction grants a warrant, which
strangely, in the instant case, was not sought by the law enforcement
organization.
My purpose in telling you these things, which on the surface may strike you as
unrelated to the subject of this meeting, is to alert you to a new form of
information which, while not entirely new, becomes both more readily available
and very much more valuable as a result of, and throughout, the digital age:
meta-information, or information about the use of information. Indeed,
as will be seen shortly, this new form of information has the potential to
alter pervasively the nature of some of our largest industries, such as
telecommunications, retail, and finance, not to mention the enormous inducement
it could provide to breach personal privacy in ways totally unheard of in the
past. In addition, while both the legal and regulatory communities will have to
revise their statutes and rules significantly in order to provide adequate
protection and enforcement of intellectual property rights, history clearly
teaches that we should not wait for changes to take place in these areas. Both
federal regulatory and intellectual property law lag years behind the
introduction of technologies altering the powers of those who use them,
regardless of their intentions and motivations.
Elsewhere, I have argued that the proliferation of meta-information, coupled
with advanced telecommunications technologies, has profound implications for
those whose notion of political sovereignty includes operating so-called
"closed societies." Perhaps the most lucid discussion of this dynamic may be
found in The Twilight of Sovereignty,[2] an
exceptional volume authored by Walter Wriston, Citicorp's former Chairman.
In order to understand the dynamics of meta-information, it is first necessary
to recognize the basic unit of information, which I like to call the
infon,[3] a term first used by Keith Devlin.
Though a more formal mathematical definition is possible, for our brief
purposes suffice it to say that information is present if and only if it aids
one in making a decision between two equally probable choices. Such a
definition establishes a distinction between data and
information. For example, the statement "We are at the Kennedy School" surely
contains data, but not information, since it is reasonable to assume that
everyone here knows where they are. An infon, therefore, is a basic unit of
information and by definition must have some value, though at this juncture we
have not agreed on how information should be valued.
If we concede that information exists and that its basic unit may be called an
infon, and that it has at least some minimal value, then in order to understand
what must be done to protect that value, we must first look at the dynamics
affecting value. These dynamics have changed significantly at the hands of
technology, and telecommunications in particular.
Perhaps the most impressive aspect of what has gone on in the technology of
telecommunications in recent years is the increase in both the rate and the
bandwidth at which information is transmitted, switched, processed and then
sometimes retransmitted. It is generally assumed that the acceleration of
information transfer rates to the speed of light minus some ever-decreasing
variable is for the good. I shall hold true to my promise to the conference
chair to leave the so-called "policy" issues for another time, but would like
to remind you, through the use of a riddle, that these questions are more
complex than they appear at first blush. The riddle I use in class is, "What
do a greengrocer in the days prior to refrigeration and the modern information
manager have in common?" The answer, of course, is that both are dealing with
a terribly fragile commodity with a very short shelf life. Those who earn their
keep from the sale of information in many ways have their lives made more
difficult by the acceleration in velocity and bandwidth. For example, not many
years ago, one could sell a quotation service offering the spot price of
chromium, which is principally traded out of Zaire, on the London, New York or
Zurich markets, based on transactions occurring 24 hours earlier. Today, such
data has no value, because trading desks are linked to one another via
broadband networks operating at SONET rates. Within a couple of seconds the
latest spot price appears updated on electronic spreadsheets seen on hundreds
of trading screens in over a dozen countries. Not only are traditional
opportunities for spread-based arbitrage significantly reduced, but the base
prices are subject to drastic fluctuations due to the simultaneous presentation
of infons connected with either related metals, industries which are high
consumers of chromium, or political events affecting Zaire. All of this sort of
information is now available essentially at the speed of light.
The value of an infon in this sort of environment becomes critically related to
the amount of time that has elapsed since the receipt of the most recent infon
dealing with the same matter. Accordingly, I would argue that it now makes
sense to speak of information or infon half-lives: a measure of a
quantum of time in which a given infon loses 50% of its value. Indeed, when it
has lost 100% of its value it no longer constitutes information, since it can
play no role in assisting one with the classical choice between two equally
probable outcomes.
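
The half-life notion can be written down as a short sketch. It assumes, by analogy with radioactive decay, that value decays exponentially; the text fixes only the 50% point, so the exponential form, the function name infon_value, and its parameters are illustrative assumptions.

```python
def infon_value(initial_value: float, elapsed: float, half_life: float) -> float:
    """Remaining value of an infon after `elapsed` time units.

    Assumes exponential decay: the value halves every `half_life` units.
    `elapsed` and `half_life` must be expressed in the same unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if elapsed < 0:
        raise ValueError("elapsed must be non-negative")
    return initial_value * 0.5 ** (elapsed / half_life)


if __name__ == "__main__":
    # A hypothetical quotation worth 100 units with a 2-second half-life.
    for seconds in (0, 2, 4, 60):
        print(seconds, infon_value(100.0, seconds, 2.0))
```

Under this assumed curve the value approaches zero without ever reaching it, so the point at which an infon has "lost 100% of its value" would in practice be a threshold below which it no longer affects the decision.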
These notions are simple and I hope clear, and came to my mind as meaningful
analogies to things I learned as a graduate student in physics. Information, it
seems to me, suffers from the classical paradox of being considered to behave
simultaneously as a wave-like phenomenon, and as discrete entities or
particles/commodities of some kind. This is why most businessmen, with a few
interesting exceptions, have such a hard time figuring out how to sell it.
What stands to change this somewhat is the advent of such techniques of
information transfer as Asynchronous Transfer Mode (ATM), where the advantages
of fixed cell structures on network operation render it almost certain that
high-level infons will require more than one cell or packet. Indeed, under the
current wisdom, information is packaged into fixed-size cells of 53 octets.
Cells are identified and switched throughout the network by means of a label in
the header. ATM allows bit-rate allocation on demand, so the bit rates can be
selected on a connection-by-connection basis. The actual channel mixture at the
broadband interface point can change dynamically on very short notice.
Theoretically, ATM supports channelization from low kb/sec. up to the entire
payload capacity of the interface, minus some small overhead factor.
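
To make concrete why a high-level infon will usually span several cells, the following sketch counts the cells needed to carry a message. It assumes the standard split of the 53-octet cell into a 5-octet header and a 48-octet payload, and it ignores any adaptation-layer overhead carried inside the payload. The function names are hypothetical.

```python
CELL_OCTETS = 53                                # fixed cell size, as stated above
HEADER_OCTETS = 5                               # assumption: standard ATM header size
PAYLOAD_OCTETS = CELL_OCTETS - HEADER_OCTETS    # 48 octets of payload per cell


def cells_needed(message_octets: int) -> int:
    """Number of cells required to carry a message of the given size."""
    if message_octets < 0:
        raise ValueError("message size must be non-negative")
    # Ceiling division: a partly filled final cell still occupies a full cell.
    return -(-message_octets // PAYLOAD_OCTETS)


def octets_on_wire(message_octets: int) -> int:
    """Total octets transmitted, headers included."""
    return cells_needed(message_octets) * CELL_OCTETS


if __name__ == "__main__":
    for size in (48, 49, 1000):
        print(size, cells_needed(size), octets_on_wire(size))
```

Every one of those cells carries its own header, and therefore its own label, which is the point taken up below.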
The ATM header contains the label, which comprises a Virtual Path Identifier
(VPI) and a Virtual Channel Identifier (VCI), and an error detection field.
Error detection in ATM is
limited to the header alone--a mixed blessing. Further content-based error
correction takes place at the periphery of the network, within applications
running on hosts and their interface nodes. The ATM cell format for user, as
opposed to bearer, network interfaces is specified in CCITT Recommendation
I.361. The header, as usual, is transmitted first. Within each octet, bits are
sent in decreasing order, starting with bit 8, while octets are sent in
increasing order, beginning with octet 1. (The network node interface (NNI)
cell is identical to the layout in Figure 1 except that the VPI occupies the
entire first octet rather than just bits 1 through 4.)
The ATM Cell Fields consist of the following:
Generic Flow Control (GFC) Field.
The 4-bit field allows encoding of 16 states for flow control. No
standardization has yet occurred for coding values. The CCITT is presently
considering several proposals.
Routing Field (VPI/VCI).
24 bits are available for routing: 8 bits for the VPI and 16 for the VCI
(Virtual Channel Identifier). Except for two reserved VCI codes, used for
signaling and for indicating general broadcast, the encoding
methodology has yet to be set. This is very important, for reasons which will
become clear shortly.
Payload Type (PT) Field.
Two bits are available for Payload Type identification, differentiating user
information payloads from network information. In user information cells, the
payload consists of user information and service adaptation information; in
network information cells, the payload does not form part of the user's
information transfer.
Cell Loss Priority (CLP) Field.
If the CLP field is set (CLP value is 1), the cell is subject to discard,
depending on network conditions. If the CLP field is not set (CLP value is 0),
the cell has a higher priority rating.
Header Error Control Field (HEC).
This field consists of 8 bits and is used for error management of the header
itself.
Reserved Field.
This field, consisting of 1 bit, is for further enhancement of existing cell
header functions yet to be specified.
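
The field list above adds up to 40 bits (4 + 8 + 16 + 2 + 1 + 1 + 8), that is, a 5-octet header. The sketch below packs and unpacks such a user-network interface header. Two things in it are assumptions rather than statements from this paper: the order of the PT, Reserved, and CLP bits within the fourth octet, and the HEC calculation, which follows the CRC-8 procedure of CCITT Recommendation I.432 (generator x^8 + x^2 + x + 1, result XORed with 01010101). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniHeader:
    gfc: int  # Generic Flow Control, 4 bits
    vpi: int  # Virtual Path Identifier, 8 bits
    vci: int  # Virtual Channel Identifier, 16 bits
    pt: int   # Payload Type, 2 bits
    res: int  # Reserved, 1 bit
    clp: int  # Cell Loss Priority, 1 bit


_WIDTHS = {"gfc": 4, "vpi": 8, "vci": 16, "pt": 2, "res": 1, "clp": 1}


def hec(first_four: bytes) -> int:
    """CRC-8 over the first four header octets, XORed with 0x55 (per I.432)."""
    crc = 0
    for octet in first_four:
        crc ^= octet
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55


def pack(h: UniHeader) -> bytes:
    """Build the 5-octet header; bit 8 of octet 1 is the most significant bit."""
    for name, width in _WIDTHS.items():
        if not 0 <= getattr(h, name) < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Assumed order within octet 4: VCI (low 4 bits), PT, Reserved, CLP.
    word = ((h.gfc << 28) | (h.vpi << 20) | (h.vci << 4)
            | (h.pt << 2) | (h.res << 1) | h.clp)
    body = word.to_bytes(4, "big")
    return body + bytes([hec(body)])


def unpack(header: bytes) -> UniHeader:
    """Check the HEC, then split the first four octets into fields."""
    if len(header) != 5:
        raise ValueError("an ATM header is 5 octets")
    if hec(header[:4]) != header[4]:
        raise ValueError("HEC mismatch: header corrupted")
    word = int.from_bytes(header[:4], "big")
    return UniHeader(
        gfc=(word >> 28) & 0xF,
        vpi=(word >> 20) & 0xFF,
        vci=(word >> 4) & 0xFFFF,
        pt=(word >> 2) & 0x3,
        res=(word >> 1) & 0x1,
        clp=word & 0x1,
    )
```

Note that anyone able to read these five octets at a switch recovers the VPI/VCI label of every cell without touching the payload, which is why the choice of what goes into the header matters for what follows.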

Since large numbers of multiple cells are going to be required in literally all
applications, and ATM and related technologies are not circuit switched,
identification and addressability will have to be handled on a cell-by-cell
basis. Indeed, such addressing information, regardless of whether it references
dedicated virtual circuits or user identification numbers, constitutes in its
own right infons, or what I have recently discovered is information for which
some parties are willing to pay a great deal.
For example, with the implementation of both the Line Information Database
(LIDB), justified to achieve 800-number portability for customers between long
distance carriers, and the CCITT Signaling System VII (SS-VII), it is now
possible for inter-LATA carriers to generate lists of customers by the 800
number called.
In a friendly deposition, the Direct Marketing Association (DMA) told the
Committee of Corporate Telecommunications Users that its members would "be
willing to pay $3 per name and address for a list of telephone subscribers
sorted by 800 number destination. For example an 800 number associated with a
hotel charging at least x-amount for a room, or a contributions line to a
charity or political party." Following discovery of this fact, a similar
inquiry was made of AT&T: How many calls are processed per day, and could
such a list be compiled? AT&T averred that in excess of 100,000 such calls
were processed per day, that the exact number was not obtainable on short
notice, and that indeed, given SS-VII capabilities, originating station
information was captured and could be cross-referenced with customer account
files, and address lists printed out.
Aware of the more recent fact that AT&T is now the second largest issuer of
consumer credit cards in the United States, processing literally millions of
transactions per month, I sought to determine the value of infons consisting of
telephone traffic information and credit card purchasing data linked by Boolean
operands. In other words, what would the value to the list brokers (or banks,
law enforcement agencies, tax collectors, lobbyists, etc.) be of data assembled
in the new format of lists of people who, for example, called a hotel
reservations 800 number and also spent over $500/month on sports equipment? To
my astonishment the DMA indicated that if the list had been generated within
one month of their members receiving it, the brokers would pay between the
earlier $3 and $7 per name. Given the traffic numbers provided by AT&T
earlier, clearly there exists an opportunity of at least $300,000 to $700,000
per day, simply based on the AT&T traffic.
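
The linkage and the arithmetic can be sketched in a few lines. All records, identifiers, and telephone numbers below are invented for illustration. The "Boolean operand" is modeled as a set intersection (logical AND), and the revenue estimate assumes, as the figures above implicitly do, that each of the 100,000 daily calls yields one saleable name.

```python
HOTEL_800 = "800-555-0100"  # hypothetical reservations number

# Hypothetical traffic records: (subscriber id, 800 number dialed).
call_records = [
    ("sub-001", "800-555-0100"),
    ("sub-002", "800-555-0199"),
    ("sub-003", "800-555-0100"),
]

# Hypothetical monthly spending on sports equipment, in dollars.
monthly_sports_spend = {"sub-001": 620.0, "sub-002": 900.0, "sub-003": 40.0}


def callers_of(records, number):
    """Subscribers who dialed the given 800 number."""
    return {sub for sub, dialed in records if dialed == number}


def spenders_over(spend, threshold):
    """Subscribers whose monthly spend exceeds the threshold."""
    return {sub for sub, amount in spend.items() if amount > threshold}


def linked_list(records, number, spend, threshold):
    """Boolean AND of the two lists: called the number AND spent over threshold."""
    return callers_of(records, number) & spenders_over(spend, threshold)


def daily_value_range(names_per_day, low_price, high_price):
    """Low and high daily value of a list sold at a per-name price."""
    return names_per_day * low_price, names_per_day * high_price


if __name__ == "__main__":
    print(linked_list(call_records, HOTEL_800, monthly_sports_spend, 500))
    print(daily_value_range(100_000, 3, 7))
```

The second function simply reproduces the $300,000 to $700,000 per day quoted above from 100,000 calls at $3 to $7 per name.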
All of this is just an example, and indeed one which AT&T rightly protests,
since none of these practices is taking place at present. However, the writing
is on the wall. Citicorp, with a much larger customer base, has used Thinking
Machines' equipment to develop detailed customer purchasing profiles linking
telephone numbers, to ZIP codes, to SMSA statistics and default rates. AT&T
has issued letters of intent to purchase and lease similar equipment. Companies
will eventually be forced to become far more open about such policies, just as
nation states have had to as technology has forced the issue. In so doing, they
will also become more profitable as a greater sphere of potential consumers of
meta-information become customers. But so far, few have figured this out.
Indeed, telephone companies and banks are especially covetous of this sort of
information. (Just ask a telephone company for traffic statistics between
various parts of a city or state, or a bank for the average number of Automated
Teller Transactions on a time of day/neighborhood by neighborhood basis--all
useful behavioral data.)
This phenomenon, of infons describing the use of information, constitutes
second-order information, what I first termed meta-information several
years ago. When linked to the identity of the user or other classes of
information, both the theft of intellectual property without the detection of
the act, and the invasion of personal property become increasingly easy.
Indeed, I believe that one might adopt the potentially draconian means of
measuring the technological advancement of a given society by measuring how
many sorts of interconnected data bases such as those containing
meta-information are required in order to gain the identity of any given
citizen. Alternatively, in the case of intellectual property protection, one
would simply ask the same question pertaining to detecting the location of some
file or piece of unique work, be it art, software, or your latest manuscript.
This will all become most interesting as we move towards such future
institutions as digital libraries, for-profit image-based archives,
high-definition audio recording, and the like.
The solution sets required for these problems are not at hand, but they do call for
careful and thoughtful consideration of just what goes into such things as ATM
Cell Fields. In non-dedicated route networks and in packetized
environments--where the packet length is finite and small, resulting in a
proliferation of transport cells--the identity of owners and users of
intellectual property becomes far more accessible to the casual interloper as
well as the professional thief.
Incentives to obtain meta-information will increase at least geometrically as
the number of interconnected sources goes up arithmetically. Indeed, the
value of such information may be expected to approach a log function of
the number of sources. Figure 2 (courtesy of Privacy Journal) depicts
basic meta-information flows between major categories of data collection in the
United States. Clearly this is a booming business poised to take off, once the
"Network of the Future" becomes perceived as a meta-information engine.
Profound business, policy, and regulatory issues attend all this development. A
long-distance carrier may see a contribution to revenue from processing a
transcontinental call of only 9.7 cents per minute, while the existence of the
virtual path through the network generates $3 to $7 worth of meta-information
per call.
How much is the string of four letters representing the Adenine, Guanine,
Cytosine, and Thymine (A,G,C,T) bases of the DNA found on a particular allele of
your 18th chromosome worth to you, the police, your bank, or a genetic
engineering company attempting to clone antibodies in order to replicate
adaptive or otherwise positive immune responses in less healthy individuals?
What is the meaning of Justice Brandeis' prescient equation of privacy with the
right to be left alone, in light of these developments? I do not believe that
there is cause for panic--but there is cause for pause and serious
thought given these matters.
Yet again, these are not new issues. In fact, earlier on, in mentioning the
decommissioning of the French Army Division and past concerns over the identity
of the senders of data, I told the truth, but not the whole truth.
Indeed, in the age of meta-information, lies of commission will become
increasingly simple to spot while the detection of deception by omission,
without violating privacy, will present some uniquely vexing problems. And on
that note I close, but not before I tell you that all the members of the famous
French Guard threatened with extinction are pigeons.
NOTES
1. International Herald Tribune, No. 34,229, March 18, 1993, page 1.
2. Wriston, Walter B. Twilight of Sovereignty. Scribner's & Sons,
NY, 1992.
3. Devlin, Keith. Logic and Information. Cambridge University Press, 1991,
p. 11 ff.
4. Deposition of J. Rankel, DMA, 8/13/89, by CCTU; Reid & Priest
5. See end notes at conclusion of paper for other related papers by this
author.
BIOGRAPHY
Kenneth L. Phillips, Ph.D. has been Vice President for Telecommunications
Policy at Citicorp for 15 years, where he is now Of Counsel. He is presently a
Professor of Psychology at the Graduate Interactive Telecommunications Program
at the Tisch School of New York University.