1 Introduction

This chapter describes the DOI® Handbook and its updating process, explains the environment which leads to the need for the DOI® System, and outlines the components and use of the DOI System.

1.1 The DOI® Handbook

This Handbook is intended as:
The Glossary of Terms defines selected terms unique to the DOI System, and other terms with meanings used in a specific way within the DOI System which are discussed in the Handbook. Other introductory material may be found in:
The Handbook is regularly updated to reflect progress in DOI System development. The primary publication medium of the Handbook is the DOI.ORG web site; users working with print versions are advised to ensure that they are using the most up-to-date version by checking the version on the DOI.ORG web site. Earlier versions of development documentation are superseded by any later edition.

The numbering system of versions follows the convention of edition.release.update (the most significant digit on the left). Minor changes such as typographical corrections with no substantive effect will be numbered as updates; more substantive changes as releases; major changes as editions. Criteria for numbering are pragmatic: the IDF's aim is to clearly distinguish new versions for users, especially when use of an earlier version may result in error.

Edition 1 of this Handbook was issued in February 2001. Edition 2 was issued in February 2002 and followed by several updated releases throughout 2002. Edition 3 was issued in May 2003 followed by several updated releases throughout 2003. This Edition 4 (first release April 2004) incorporates substantial additional material especially on the DOI® Data Model and DOI System Applications, related appendices and revision of all other chapters.

If you have any questions or suggestions relating to this Handbook, please let us know by contacting contact@doi.org; your input will help to improve future versions of the Handbook.

DOI® and DOI.ORG® are registered trademarks of the International DOI Foundation, Inc., filed with the U.S. Patent and Trademark Office, and granted registration numbers 2,360,527 and 2,360,526, respectively. The "doi>" is a trademark of the International DOI Foundation. indecs™ is a registered trademark of the International DOI Foundation, Inc., and EDItEUR, filed with the U.K. Patent Office, and granted registration number 2257426.
The Handle System® is a registered trademark of the Corporation for National Research Initiatives, Inc. (U.S. registration number 6,135,646) and is used by permission. © International DOI Foundation 2006. The contents of this Handbook are copyright of the International DOI Foundation, Inc.

1.2 Identification and the Internet

One of the key challenges in the move from physical to electronic distribution of content is the rapid evolution of a set of common technologies and procedures to identify and manage pieces of digital content. A widely implemented and well understood approach to naming digital objects is essential if we are to see the development of services that will enable content providers to grow and prosper in an era of increasingly sophisticated computer networking.

The boundaries that currently exist between different types of content, especially at the level of the infrastructure that supports their production and distribution, will be broken down and ultimately eliminated. Instead of different physical formats requiring different content distribution infrastructures, all content will consist of streams of digital data moving over networks. Diverse content industries will increasingly find themselves sharing the same challenges and opportunities in delivering content to their customers, whether direct or through intermediaries.

"A developing trend that seems likely to continue in the future is an information centric view of the Internet that can live in parallel with the current communications centric view. Many of the concerns about intellectual property protection are difficult to deal with, not because of fundamental limits in the law, but rather by technological and perhaps management limitations in knowing how best to deal with these issues. A digital object infrastructure that makes information objects 'first-class citizens' in the packetized 'primordial soup' of the Internet is one step in that direction.
In this scheme, the digital object is the conceptual elemental unit in the information view; it is interpretable (in principle) by all participating information systems. The digital object is thus an abstraction that may be implemented in various ways by different systems. It is a critical building block for interoperable and heterogeneous information systems. Each digital object has a unique and, if desired, persistent identifier that will allow it to be managed over time. This approach is highly relevant to the development of third-party value added information services in the Internet environment." (What Is The Internet (And What Makes It Work) -- Robert E. Kahn and Vinton G. Cerf, 1999).

The International DOI Foundation (IDF) was established in 1998 to address this challenge, assuming a leadership role in the development of a framework of infrastructure, policies and procedures to support the identification needs of providers of intellectual property in the multinational, multi-community environment of the network. The IDF has developed, and continues to evolve, a fully implemented solution to this challenge: the DOI System, using the DOI name, an "actionable identifier" for intellectual property on the Internet. The DOI System is now widely implemented by hundreds of organisations through millions of identified objects.

1.3 What is an Identifier?

For detailed information on concepts of identification and metadata, see the documents referenced in the bibliography. As the use of numbering in digital networks has developed, the use of the word "identifier" in this context has expanded to the point where it is now used interchangeably for several different things, all of which are useful but which actually carry different implications that need to be distinguished. It is not possible to compare two "identifiers" unless it is clear which of the following is implied by each:

(1) A single unambiguous string or "label" that references an entity (e.g. ISBN 0-19-853737-9)

(2) A numbering scheme: a formal standard, an industry convention, or an arbitrary internal system providing a consistent syntax for generating a series of labels (identifiers (1)) denoting and distinguishing separate members of a class of entities (e.g. ISBN, or DOI® Syntax NISO Z39.84). The scheme is a specification for generating a number: this resulting "number" may include alphanumeric characters, but the accepted parlance is to speak of these as numbers (e.g. ISBN=International Standard Book Number). The intention is to establish a one-to-one correspondence between the members of a set of labels (numbers), and the members of the set counted and labelled. The product of the process is enumeration, a cardinality judgement, and assigned numbers for each cardinal member.

The numbering scheme may or may not be accompanied by some policy apparatus -- for example, a registration agency and maintenance agency. An important point is that the resulting number is simply a label string. It does not of itself create a string that is actionable in a digital or physical environment without further steps being taken. It may be used (and probably will be used) in databases; or it may be incorporated into another mechanism later. Common standard numbering schemes of interest in digital content management include those standardised by ISO:
Whilst these ISO TC46 identifiers were originally simple numbering schemes, of late they have also begun to adopt the notion of associating some minimal structured descriptive metadata with the identifier. Also relevant are the ISO-affiliated NISO standards including:
(3) An infrastructure specification: a syntax by which any identifier (1) can be expressed in a form suitable for use with a specific infrastructure, without necessarily specifying a working mechanism (e.g. URI). This is sometimes known as creating an "actionable identifier" -- meaning that in the context of that particular piece of infrastructure, the label can now be used to perform some action: e.g. in an Internet Web browser, it can be "clicked on" and some action takes place. The set of Internet specifications known as Uniform Resource Identifiers (embracing URLs and URNs) provides mechanisms for taking labels and specifying them as actionable within the Internet. The same principles apply in the physical environment -- for example by prefixing an ISBN with the EAN sequence 978 or 979, the ISBN becomes a UPC/EAN identifier expressible as a physical bar code symbol, or a radio-frequency tag, for use in the physical supply chain. Importantly, note here that such "identifiers" do not mandate a way of creating labels, they merely accept any labels: hence if one does not have an existing numbering scheme, it will be necessary to adopt or create one in order to form URIs. A URI specification merely ensures that a label follows the rules to become actionable in an Internet environment: a specification is not an implementation, with all the other aspects that a fully functioning identifier system (see below) may require: URI may for example specify the syntax, and specify a recording registration procedure, but not create a managed environment (e.g. by which registrations are "policed"), or carry any specifications of metadata or policy. Some identifier specifications of this form may have limited rules or requirements for implementation: so far this is limited to the URN specification including a proposed (not implemented) mechanism for resolution. 
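As a minimal sketch of the ISBN example above, the following Python turns a 10-digit ISBN label into a 13-digit EAN by prefixing 978 and computing a new check digit. The function name `isbn10_to_ean13` is an illustrative assumption, and the check-digit rule shown is the standard EAN-13 modulo-10 weighting rather than anything specified in this Handbook.

```python
def isbn10_to_ean13(isbn10: str) -> str:
    """Prefix a 10-digit ISBN with 978 and return the 13-digit EAN string."""
    # Keep only the significant characters; hyphens and spaces are presentation.
    chars = [c for c in isbn10 if c.isdigit() or c in "xX"]
    if len(chars) != 10:
        raise ValueError("expected a 10-character ISBN")
    # Drop the ISBN-10 check character: the EAN carries its own check digit.
    body = "978" + "".join(chars[:9])
    if not body.isdigit():
        raise ValueError("only the final ISBN character may be X")
    # EAN-13 check digit: weights alternate 1, 3 from the left, and the
    # weighted total including the check digit must be a multiple of 10.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)


print(isbn10_to_ean13("0-19-853737-9"))
```

For the ISBN quoted under sense (1), this prints 9780198537373. The label itself is unchanged in substance; it has only been re-expressed in the form a particular infrastructure (here, the bar code supply chain) can act on.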
The acid test one should ask of such a specification is: what does specifying my label in this particular form get me, in practical terms, in a specific infrastructure?

(4) A system for implementing labels (identifiers (1)) through a numbering scheme (identifiers (2)) in an infrastructure using a specification (identifiers (3)) and management policies (e.g. DOI System). The DOI System is an "identifier system" in the digital supply chain, just as the UPC/EAN is an "identifier system" in the physical supply chain; ISBNs for example become implemented in the physical supply chain through UPC/EAN bar codes or RFID tags. This sense of "Identifier" denotes a fully implemented identification mechanism that includes the ability to incorporate labels, conforms to an infrastructure specification, and adds to these practical tools for implementation such as registration processes, structured interoperable metadata, and a policy/governance mechanism. Such a system is necessary for practical DRM applications; since DRM deals with digital entities, structured metadata will be an essential component of such a system. The DOI System is one of the better developed, with several million DOI names currently in use by several hundred organisations.

1.4 What is a DOI® name?

A DOI® (Digital Object Identifier) name is an identifier (not a location) for an entity on digital networks. It provides a system for persistent and actionable identification and interoperable exchange of managed information on digital networks. Unique identifiers are essential for the management of information in any digital environment. The DOI name is an identifier in sense (4) above. One of the components is a syntax specification (identifier (2)). The DOI System conforms to a URI (identifier (3)) specification. It provides an extensible framework for managing intellectual content based on proven standards of digital object architecture and intellectual property management.
It is an open system based on nonproprietary standards. It has the following notable features: 1.4.1 DOI names are persistent identifiers A DOI name differs from commonly used Internet pointers to material such as the URL -- Uniform Resource Locator, the usual means of referring to World Wide Web material -- because it identifies an object as a first-class entity, not simply the place where the object is located. A first-class entity or object in the information infrastructure is stored on one or more servers and is accessible from these servers using a globally accessible identifier (URI). An entity is referred to as first class when it represents an object, not some attribute of an object; e.g. an address is an attribute of a thing, whereas the thing itself is a first class object. The DOI System is not solely designed for use on the World Wide Web; the same functionality can be made available through any digital network and protocol, but the Web demonstrates its advantages well. 1.4.2 DOI names are actionable identifiers The purpose of the DOI System is to make the DOI name an actionable identifier: a user can use a DOI name to do something. The simplest action that a user can perform using a DOI name is to locate the entity that it identifies. In this respect, a DOI name may look superficially like a URL. However, the technology which underlies the DOI System facilitates much more complex applications than simple location; and the DOI name identifies the intellectual property entity itself rather than its location. The ease of assigning URLs was no doubt responsible in part for the expansion of the Web -- but the fact that they are easy to create (and neglect) means they are not strong enough alone for a commercial basis. "Not Found" link messages are a scourge across the Internet: the rate at which once-valid links start pointing at non-existent addresses -- a process called "link rot" -- is reported to be as high as one sixth of all links in six months. 
The fact that URLs change (technically, they are not "persistent") isn't a bad thing in itself: in fact, it is very helpful to separate names from locations -- since location is only one property of (or piece of metadata about) a name which we might want to manage by the process of resolution. We want to be able to move things around -- there are legitimate reasons such as change of ownership. The problem is that using URLs alone we can't track what's changed, or use one name persistently irrespective of where the item is. This does not imply that the DOI name will necessarily resolve to the entity that it identifies -- although that will sometimes be the case. The DOI name, though, can be used to identify classes of intellectual property -- abstract "works", physical "manifestations", performances -- that cannot be directly accessed in a digital file. Even when the DOI name does identify a digital file, this will not always be the most appropriate or useful data for the DOI name to resolve to. Even if there is no current location for a digital file, it might still be useful to know what it represented, or who owned it, or search for it elsewhere. Even if we have a location, we might want to offer other resolution results. Therefore it is very important to distinguish what the DOI name identifies from what the DOI name resolves to. They may be the same thing, but they will often be very different. The technology used to manage the resolution of DOI names is the Handle System®; a description can be found in Appendix 2. The Handle System is unlike most other resolution technologies in supporting multiple resolution. A DOI name may have multiple data values of different types associated with it (email addresses and URLs, for example), and multiple data values of the same type (several URLs). The same DOI name can resolve to different data, depending on the way in which the Handle System is queried. 
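To make the idea of multiple resolution concrete, here is a minimal Python sketch of a resolver that holds several typed values for one name. It is an in-memory illustration only, not the Handle System protocol or its API: the `register` and `resolve` functions, the type labels `URL` and `EMAIL`, the DOI name `10.9999/example.1` and all example.org values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical store: each DOI name maps to a list of (type, value) pairs.
STATE_DATA: Dict[str, List[Tuple[str, str]]] = {}


def register(doi_name: str, value_type: str, value: str) -> None:
    """Associate one more typed value with a DOI name."""
    STATE_DATA.setdefault(doi_name, []).append((value_type, value))


def resolve(doi_name: str, value_type: Optional[str] = None) -> List[str]:
    """Return the values held for a name, optionally filtered by type."""
    values = STATE_DATA[doi_name]  # raises KeyError for an unregistered name
    return [v for t, v in values if value_type is None or t == value_type]


# One name, several values: two of the same type and one of another type.
register("10.9999/example.1", "URL", "http://example.org/article")
register("10.9999/example.1", "URL", "http://mirror.example.org/article")
register("10.9999/example.1", "EMAIL", "rights@example.org")

print(resolve("10.9999/example.1", "URL"))
print(resolve("10.9999/example.1", "EMAIL"))
```

The name stays fixed while the values behind it can be added to or changed, and the form of the query decides which of them comes back.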
Multiple resolution enables the DOI name, and the metadata with which it is associated, to form the foundation for many different services relating to the management of intellectual property in the network environment, to the benefit of intellectual property owners and users alike. In order for the DOI name to be resolved, the Registrant (or the Registration Agency it uses) needs to maintain the data associated with that DOI name in the Handle System; this data is referred to as "state data". The simplest form of state data is a single URL. However, a DOI name can resolve to many other forms of data.

1.4.3 DOI names are interoperable identifiers

The DOI System has been designed to interoperate with past, present and future technologies.
1.4.4 Identifying at the appropriate level An achievement of DOI System work has been a practical implementation of the idea of rethinking the Internet as management of information, not movement of data packets. Managing information on the Internet at the appropriate level is a recurring theme in the vision of the future of the Internet. As will be seen in what follows, the DOI System is not (only) an identifier of digital objects but (more widely) a digital identifier of objects -- that is, it facilitates digital management of any entities (focussing on those involved in intellectual property transactions). Identification of non-digital entities, such as underlying abstractions (the "work") and physical manifestations are also needed in expressing real world transactions, and any technology which considers only "digital representations" is inadequate for digital rights management. There is nothing new in using abstractions or representations in trading -- we do it all the time with physical property: representations such as deeds and mortgages are what alters (not the physical bricks etc.) when a house changes hands. Similarly with intellectual property, representations such as licences and files are traded. Digital trading of these pieces of property requires that each entity be uniquely and persistently identified, and associated with data. The indecs framework recognises the concept of functional granularity ("it should be possible to identify an entity when there is a reason to distinguish it"); this is echoed in the DOI System treatment of an identified entity as a first class object (an object in itself, not some attribute of an object). Whereas URLs are grouped by domain name and then by some hierarchical structure (originally based on file trees), DOI names offer a more finely grained approach to naming, where each name stands on its own, unconnected to any Domain Name System (DNS) or other hierarchy. 
The most common mechanism for resolution on the Internet is DNS (http as used in URL is a use of DNS). The Handle System used by the DOI System uses TCP/IP but avoids the need to use the DNS, and this has significant advantages. One advantage is that names are not implicated in trademark disputes. Another is flexibility over time as the document origins reflected in a hierarchy lose meaning, such as a change in ownership (if acmeco.com sells some assets to newco.com, all URL filenames beginning acmeco.com/ which pertain to the sale need to be changed. This benefit has already been seen in the case of CrossRef, where millions of DOI names identified through the Academic Press IDEAL system were merged into Elsevier's Science Direct system when the companies merged). In order to manage DOI names we have created tools that allow more flexible management of sets of DOI names, in a more useful way than as a fixed sub-domain: a DOI name, DOI® Application Profile and DOI System services can all be thought of as layers of abstraction which allow this. Functionality such as URL partial redirection and relative URLs (which assume as "known" or inherited a part of a URL / domain name address) make a lot of sense in the context of URLs. However since DOI names deliberately have a more finely grained approach to naming things, functionality such as partial redirection is dealt with through tools that capitalise on that finer granularity: precise definition of components and their associated services. Identifying at the appropriate level is key to managing information. Too low a level of granularity makes it impossible to pick out important differences: too high a level of granularity makes it too complex to group similarities. 
Here is a good analogy from Jorge Luis Borges: "Locke, in the seventeenth century, postulated (and rejected) an impossible idiom in which each individual object, each stone, each bird and branch had an individual name; Funes had once projected an analogous idiom, but he had renounced it as being too general, too ambiguous. In effect, Funes not only remembered every leaf on every tree of every wood, but even every one of the times he had perceived or imagined it. He determined to reduce all his past experience to some seventy thousand recollections, which he would later define numerically. Two considerations dissuaded him: he thought the task was interminable and that it was useless. He knew that at the hour of his death he would scarcely have finished classifying even all the memories of his childhood." ("Funes the Memorious")

1.4.5 Identifying copies and versions

A common question is: if I identify entity A with a DOI name, and then I adapt it in some way to create entity B, should I assign a new DOI name to entity B? The answer is: there can be no general rule which applies to all cases and each must be treated in context. If a registrant finds it useful to do so, they may. The rules of Application Profiles, and business rules of Registration Agencies, will help in deciding for DOI names registered in Application Profiles. The key point is that one should precisely specify what A is and what B is; two digital entities are never the same in any absolute sense and can be considered copies of each other only in the context of some defined purpose. For a more detailed explanation of this fundamental topic, see the article "On Making and Identifying a Copy" http://dx.doi.org/10.1045/january2003-paskin.

1.5 Components of the DOI System

The DOI System has four components:
By combining a tool for naming "content objects" as first class objects in their own right with a mechanism to make these names actionable through "resolution", the DOI System offers persistent managed identification for any entity. But that alone is not enough: managing resources interoperably requires appropriate metadata: creating a mechanism to provide a description of what is identified in a structured way allows services about the object to be built for any purpose. The IDF has outlined, and is actively developing in more detail, a standard way of not only doing this, but linking to existing standards such as ONIX, Dublin Core and so on, allowing each community to bring its own identifiers and descriptions into play. Finally, wrapping these tools into a social and policy framework, through the Registration Agency federation, allows the development of DOI names in a consistent quality-assured way across many sectors, opening the possibility of managing multimedia objects seamlessly.

1.6 What can be identified by a DOI name?

A DOI name can be used to identify any resource involved in an intellectual property transaction including, for example, text, audio, images, software, etc., and the agreements and parties involved. While the scope of intellectual property transactions is quite broad, it is unlikely that DOI names would be appropriate for identifying entities such as people, natural objects or trucks unless they are involved in such a transaction. Intellectual property transactions don't necessarily involve money: DOI names can be used to identify free materials and transactions as well as entities of commercial value. While a DOI name can be used like any other URI to identify "anything that has identity", the DOI System is a combination of components (identification, resolution, data model and policies) devised with the specific primary aim of identifying any "intellectual property entity".
The initial focus of DOI System applications was "Creations" -- that is, resources made by human beings, rather than other types of resource (natural objects, people, places, events, etc.). Other types of resource are also necessarily involved in intellectual property transactions, and so may be identified by DOI names where appropriate. As an example, the initial aim of the DOI System was not to be used to identify natural objects (e.g., specimens in a natural history museum, or natural substances used in pharmaceutical research): but if these were involved in intellectual property interactions there may be an application of DOI name to museum artefacts or pharmaceutical components which would be appropriate. Similarly, the DOI System was not initially an identifier for agreements or licences (which in the indecs framework are types of events), but implementers may find it useful to identify these with DOI names alongside the intellectual property that they govern. Critically, a DOI name is a persistent identifier: even if ownership of the entity or the rights in the entity change, the identification of that entity should not (and does not) change. The responsibility for managing the DOI name changes, but not the name itself. 1.6.2 Identification of abstractions Creations may be in both tangible and intangible forms. DOI names can be assigned not only to manifestations of intellectual property (books, recordings, electronic files) but also to performances and to "abstractions" -- the underlying concepts (often referred to as "works") that underlie all intellectual property. This may be necessary for applications such as rights management or citation. These "abstractions" are what enable us to recognize a performance of a song, or the words of a book, entirely separately from any particular performance or specific edition. 
In fact there is nothing new in using abstractions or representations in trading -- we do it all the time with physical property: representations such as deeds and mortgages are what alters (not the physical bricks etc.) when a house changes hands. Similarly with intellectual property, representations such as licences and files are traded. Digital trading of these pieces of property requires that each entity be uniquely identified. DOI names can be used to identify any of the various physical objects that are "manifestations" of intellectual property: for example, printed books, CD recordings, videotapes, journal articles. A DOI name can also be used to identify less tangible manifestations, the digital files that are the common form of intellectual property in the network environment. But the use of a DOI name can go beyond the identification only of "manifestations" -- it can also be used to identify performances of intellectual property or the "abstractions" that underlie the different manifestations, and other types of resources where they are involved in intellectual property transactions. Formally, DOI System scope is defined in terms of a data model, the model underlying the indecs work: a DOI name can be assigned to any entity which is a Resource within the indecs context model. This means the type of entity must be described in terms of attributes in the dictionary (e.g., media, mode, content, subject), and become an entry in the indecs Data Dictionary used by the DOI System. The practical outcome of this is important and provides a pragmatic functional specification: a DOI name can identify any Resource, but the DOI System requires that the Resource is defined (technically and hence precisely) in terms of agreed public (RDD) attributes. This is one role of the DOI Data Model. A DOI name can be applied at any level of granularity; in other words, there is no preset definition of the size or form of an entity that may be identified with a DOI name. 
Rather the decision as to what a DOI name identifies is taken by the Registrant on a purely functional basis -- what is it that I need to be able to identify? This is an application of what the indecs analysis calls Functional Granularity. The principle of functional granularity proposes that "it should be possible to identify an entity whenever it needs to be distinguished". A DOI name can equally be used to identify a complete opera, an individual aria or a single bar of music. In the same way, it can be used to identify a journal, an individual issue of a journal, an individual paper in the journal, or a single table in that paper. However, it is not always possible to identify in advance which specific elements will need to be identified. It has to be possible to identify only those elements where there is a recognized need to do so -- whenever that need is recognized.

Functional granularity should be considered in addressing any question as to application. For example, if a journal publication were to exist in English and Spanish, how many DOI names would there be per article? There is no single answer. This is a "functional granularity" issue, and hence ultimately a decision for the publisher. A publisher could consider the English (E) and Spanish (S) to be different "versions" of the same underlying "work" or "creation" (similar to having both a pdf and html version), in which case there would be one DOI name. Or a publisher could consider them two separate underlying works, in which case there would be two DOI names. These could perhaps be related in one or more applications using the indecs entities and relationships or they could be grouped together under a third DOI name for the work.
This latter approach is envisioned as a possible future evolution of the DOI System involving multiple resolution, in which a single DOI name for the work could be resolved to multiple additional DOI names for versions of the work, e.g., language, and each of those DOI names could further be resolved to multiple locations. Functionally the decision comes down to this: does the publisher wish to distinguish between E and S for any purpose, e.g., to enable certain mirror sites to carry only the Spanish or English versions and not have to carry both? The safe option is always to take granularity down as low as possible (two DOI names), retaining the flexibility to aggregate them in one or more ways at a later date.

1.7 Benefits of the DOI System to Publishers, Intermediaries, and users

The DOI System offers a unique set of functionality:
For users, these features provide the ability to
Some benefits which the DOI name enables:
Some specific benefits of DOI names in various aspects of the supply chain are described below in more detail.

1.7.2 Benefits in internal content management

DOI names and associated metadata ensure that accurate, interoperable and efficient product information is available both externally and internally, reducing costs in many places:
1.7.3 Benefits in the distribution and sales life-cycle
1.7.4 Benefits in the production life-cycle
1.7.5 Quantified benefits: case studies

A white paper, "Enterprise Content Integration with the Digital Object Identifier: a business case for information publishers" (http://dx.doi.org/10.1220/whitepaper5), quantifies the business benefits for information publishers of implementing the DOI System to facilitate internal content management and to enable faster, more scalable product development, by delivering four key advantages in making it easier and cheaper to:
This is illustrated by four examples of cost savings, each of which is supported by a worked actual case study:
1.8 The DOI System as social infrastructure

The implementation of the DOI System adds value, but necessarily incurs some costs. The three principal areas of cost currently lie in the following tasks:
There is a widespread recognition of the advantages of assigning identifiers; and a widespread misconception that an abstract specification (like a URN or URI) actually delivers a working system rather than a namespace that still needs to be populated and managed. A common misperception is that one can have such a system at no cost. It is inescapable that a cost is associated with managing persistence and assigning identifiers and data to the standards needed to ensure long-term stability. This is because of the need for human intervention and support of an infrastructure. Assigning a library catalogue record, for example, will typically cost anything up to $25. Assigning an ISBN or ISSN or National Bibliography Numbers will also have costs, even if these are not paid directly by the assigner. Although a DOI name is free at the point of use, there is a small fee to an assigner for creating a DOI name (a few cents). This is because we have deliberately chosen to make the DOI System a self-funding (though not for profit) system. Our task now is to show that the system offers value for money as a tool which producers of information can use: CrossRef is one proven example of a registration Agency and Application Profile in text publishing; we expect to see other variants on this theme develop. If adding a URL "costs nothing" (which itself ignores some infrastructure costs), why should assigning a name? It is indeed possible to use any string, assigned by anyone, as a name -- but to be useful and reliable any name must be supported by a social as well as technical infrastructure that defines its properties and utilities. URLs for example have a clear technical infrastructure (standards for how they are made), but a very loose social infrastructure (anyone can create them, with the result that they are unreliable alone for long term preservation use as they have no guarantee of stability let alone associated structured metadata). 
Product bar codes, Visa numbers, and DOI names have a tighter social (business) infrastructure, with rules and regulations, costs of maintaining and policing data, and corresponding benefits of quality and reliability. (When a credit card is presented, we can be reasonably certain that the number is valid and has been issued only after careful correlation with associated metadata by the registrant.) This does not necessarily imply a centralised system; it may be a distributed system (like domain names), but it must have some form of regulation.

Such regulation of infrastructure for a community benefits all its members; funding its development is often a problem, and there is no "one size fits all" solution to how this should be done. But finding a workable model for the development of an infrastructure can yield obvious benefits. There are many modern examples (3G telephone networks, railways) which are struggling to find the right model for supporting a common infrastructure. The Internet was largely a creation of central (US) government; the product bar code, a creation of a commercial consortium.

The IDF has chosen as its model the concept of Registration Agencies, based on market models like bar codes and Visa rather than on centralised subsidy. These Agencies effectively hold a "franchise" on the DOI System: in exchange for a fee to the IDF and a commitment to follow the ground rules of the DOI System, they are free to build their own offerings to a particular community, adding value-added services on top of DOI name registration and charging fees for participation.

At the outset of the DOI System development, a very simple model was introduced whereby a prefix assignment was purchased for a one-off fee from the IDF. It was recognized from the start that this fee structure was a starting point but would be insufficiently flexible for the long term.
DOI names allocated using prefixes purchased directly from IDF are registered without structured metadata: they are now defined as being in the zero Application Profile. We are now migrating towards the long-term aim of a wide variety of potential business models, using third-party Registration Agencies, in recognition of the fact that such a simple model is not a "one size fits all" solution. The disadvantage of direct prefix purchase is that IDF cannot offer the level of metadata support and social infrastructure support which a Registration Agency can give. DOI name prefixes obtained directly from IDF may, however, be useful if you wish to experiment or consider developing your own applications. DOI name prefixes will now be issued through this direct route only at the discretion of the Managing Agent.

Our intention is that eventually all DOI names will be registered through one of many Registration Agencies, each of which is empowered to offer much more flexible pricing structures. The pricing structures and business models of the Registration Agencies will not be determined by the IDF; each RA will be autonomous as to its business model. That model could include, but is not limited to, cost recovery via direct charging (based on prefix allocation, numbers of DOI names allocated, numbers of DOI names resolved, volume discounts, usage discounts, stepped charges, or any mix of these) or indirect charging via cross-subsidy from other value-added services, agreed links, etc. DOI names may be made available at "no charge" if the costs of doing so can be met from elsewhere (there is no such thing as "free", only "alternatively funded"). IDF itself is willing to allocate a DOI name prefix free of charge to organizations for limited experimental non-commercial uses, at the discretion of the Managing Agent.
For the longer term, the business model includes two separate steps: a business relationship between IDF and an RA (the "franchise fee"), and a business relationship between an RA and a DOI name registrant (the "registration fee"). The two are not directly connected; this enables the RA to offer registrants whatever business model suits its needs, including assigning DOI names without charge. Hence DOI names can be used in both commercial and non-commercial settings, interoperably. Like any other piece of infrastructure, an identifier system (especially one which adds much value, such as metadata and resolution) must eventually be paid for by someone. So an organization could, if it wished, assign DOI names freely (registration fee zero to registrants) and subsidize this added-value service by paying a franchise fee to IDF from a central fund, as an acceptable cost for supporting the service.

1.9 The DOI System as managed system

Like Domain Name registration, assignment of DOI names requires a fee and agreement to follow the defined standard and rules. This does not make the system closed, or commercial, but it does make it managed. The IDF is a not-for-profit organization, not a commercial operation; however, the system has costs that need to be met. Persistence is a function of organizations, not technology: to support a persistent identifier system, a persistent organization needs to exist. The principal concern of a persistent organization is continuing funding; hence the model selected for the long-term position of a DOI System organization was a body that is not reliant on external sources, such as grants or membership, but is a self-funding system that can be supported in perpetuity from its own resources. The IDF is currently undergoing a controlled migration from its initial member-funded organization (like W3C) to an organization that is operationally funded.
The implementation of the DOI System adds value, but necessarily incurs some resource costs in data management, infrastructure provision and governance, all of which contribute to persistence. The mechanism chosen to recoup those costs is a self-funding "franchise" business model, as used by the physical bar code UCC/EAN system and other proven systems. This is funded by a fee for participation (which may optionally be passed on to registrants, waived, or subsidised by the operating entity), but not by a fee for use of a DOI name once issued.

To make such a system work effectively requires protection of the assets within the system, (1) against illicit exploitation and (2) to assure quality control. Illicit exploitation would include someone calling something a DOI name when it is not part of the system; this could damage the financial health of the system (by avoiding payment of an issuing fee), its quality (through poor data), or both. Preventing this exploitation requires the availability of legal remedies: specifically, the DOI System relies on copyright and trademark law to protect the "DOI" brand and reputation. The DOI System is not a patented system; the IDF has not developed any patent claims on the DOI System and does not rely on patent law for remedy.

Similar considerations apply to the underlying technologies used by the DOI System. The Handle System is used by IDF under licence from the Corporation for National Research Initiatives, which has certain intellectual property claims to protect against misuse of the Handle System; <indecs> intellectual property (IP) is assigned, jointly and solely, to IDF and EDItEUR and is made available freely but under stated terms to others (an example being the <indecs>RDD work contributed to MPEG-21).
There is widespread recognition of the advantages of assigning identifiers, as well as a widespread misconception that an abstract specification (like a URN or URI) actually delivers a working system rather than a namespace that still needs to be populated and managed. URLs, for example, have a clear technical infrastructure (standards for how they are made) but a very loose social infrastructure: anyone can create them once a domain name has been obtained, with the result that they are unreliable, having no guarantee of stability, let alone associated structured metadata. Product bar codes, Visa numbers, and DOI names have tighter social (business) infrastructures, with rules and regulations, costs of maintaining and policing data, and corresponding benefits of quality and reliability.

From this need for management stem some misconceptions about the DOI System's funding and business model. The most common myths are: