Previous Chapter: 1 Introduction Next Chapter: 3 Resolution

2 Numbering

This chapter explains how a DOI® name is constructed and assigned. It discusses the use of the
DOI name prefix as a naming authority, and the DOI name suffix as a mechanism for assigning
individual numbers within that naming authority, incorporating (if required) existing
identifiers. The ability of the DOI name to incorporate existing identifiers and the benefits of that
approach are discussed in detail. Character sets, case sensitivity, uniqueness, and Internet
identifier specifications are also discussed.

2.1 Assigning numbers

Each DOI® name is a unique "number", assigned to identify only one entity. Although the DOI® System will ensure that the same DOI name is not issued twice, it is a primary responsibility of the Registrant (the company or individual assigning the DOI name) to identify each object within a DOI name prefix uniquely. That uniqueness is enforced by the DOI System. It is important for the integrity of the system that the same number is not used twice to identify different things; it is desirable that two DOI names should not be assigned to the same thing (although the same thing may have other, different identifiers applied to it for other applications -- a book may have both an ISBN and a DOI name).

The DOI System is designed in such a way as to make it as simple as possible for anyone to name uniquely any item of intellectual property -- tangible or intangible, in physical or digital form. Existing identifiers -- like the ISBN -- can be used as part of the DOI name, which should make it much easier for registrants to issue DOI names to all their existing "content assets". However, the DOI System goes much further than most existing identifiers, in being able to identify much smaller "fragments" of content -- and types of intellectual property for which no existing identification scheme (or "legacy identifier") exists.

2.2 The structure of a DOI name

The numbering system of the DOI name follows a syntax standardised as ANSI/NISO Z39.84-2000 (see Appendix 1). The DOI System is also an implementation of a URI (Uniform Resource Identifier) and is defined as such in an IETF RFC document. In use, the DOI name is an "opaque string" or "dumb number" -- nothing at all can or should be inferred from the number in respect of its use in the DOI System.
The only secure way of knowing anything about the entity that a particular DOI name identifies is by looking at the metadata that the Registrant of the DOI name declares at the time of registration. This means, for example, that even when the ownership of a particular item changes, its identifier remains the same -- in perpetuity. This is why the DOI name is called a "persistent identifier". DOI names have two components, known as the prefix and the suffix. These are separated by a forward slash. The two components together form the DOI name: 10.1000/123456 In this example, the prefix is "10.1000" and the suffix is "123456". There is no technical limitation on the length of either the prefix or the suffix; in theory, at least, there is an infinite number of DOI names available. The prefix itself has two components. All DOI names start with "10." This distinguishes a DOI name from any other implementation of the Handle System®. The next element of the prefix is the number (string) that is assigned to an organization that wishes to register DOI names. There is no limitation placed on the number of DOI name prefixes that any organization may choose to apply for. For example, a publishing company might have a single prefix, or might have a different one for each of its journals, or one for each of its imprints. This use of different prefixes within one organization may prove administratively convenient. It can help with ensuring that unique numbers are allocated (it is not always easy within a large organization to maintain uniqueness of suffixes unless numbers are centrally allocated). It may also help if some part of an organization (such as a journal) is transferred to the control of another organization. 
If all of the entities that make up that part of the organization share the same prefix, it can make transferring responsibility for the relevant DOI names rather more straightforward.

Blocks of prefixes are allocated to DOI® Registration Agencies for them to allocate to individual user organizations. All prefixes so far issued have been simple numeric strings, but there is nothing to prevent alphabetical characters being used. The prefix may be further divided into sub-prefixes, for example:

10.1000.10/123456

Remember, though, that the DOI name is an opaque string (a dumb number). No definitive information can or should be interpreted from the number in use. In particular, the fact that the DOI name has a prefix issued by a particular organization should not be used to identify the owner of any given intellectual property -- the DOI name remains persistent through ownership changes, and the prefix is unaltered.

Following the prefix (separated by a forward slash) is a unique suffix (unique to a given prefix) to identify the entity. The combination of a prefix for the Registrant and unique suffix provided by the Registrant avoids any necessity for the centralized allocation of DOI names. The suffix can be any alphanumeric string that the Registrant chooses. This can simply be a sequential number, or it can make use of an existing (legacy) identifier (see more on this topic below). There are two possibilities on assigning a suffix: either (1) the entities are already numbered in some way, or (2) they are not yet numbered.
e.g., each of the following would be valid as DOI names:

When a legacy identifier is incorporated into the DOI name in this way it is not intended to be interpretable as such within the DOI System (it may be useful as such outside the system in other applications). The check digit in such a number is not used by the DOI System, but may be retained without any problems arising; see more on this below.
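The prefix/suffix structure described above can be illustrated with a short Python sketch. It is not part of the DOI System or of any official library: the names `DoiName`, `parse_doi_name` and `make_doi_name` are hypothetical, and the sketch assumes only what the text states, namely that the prefix starts with "10.", that the first forward slash separates prefix from suffix, and that the suffix is an opaque string chosen by the Registrant.

```python
from typing import NamedTuple


class DoiName(NamedTuple):
    prefix: str  # naming authority, e.g. "10.1000" or "10.1000.10"
    suffix: str  # opaque string chosen by the Registrant


def parse_doi_name(doi: str) -> DoiName:
    """Split a DOI name into prefix and suffix at the first forward slash."""
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError(f"missing '/suffix' in {doi!r}")
    # All DOI name prefixes start with "10." followed by the assigned string.
    if not prefix.startswith("10.") or len(prefix) <= len("10."):
        raise ValueError(f"prefix must start with '10.' in {doi!r}")
    return DoiName(prefix, suffix)


def make_doi_name(prefix: str, local_id: str) -> str:
    """Join a prefix with a sequential number or legacy identifier."""
    return f"{prefix}/{local_id}"


print(parse_doi_name("10.1000/123456"))
print(make_doi_name("10.2345", "S1384107697000225"))
```

The split is made only to separate the naming authority from the local identifier; in keeping with the opaque-string principle, nothing is inferred from the contents of the suffix.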
It is not essential that all the registrants in a DOI System sector use the same mechanism for generating the suffix. In fact, use of a DOI name obviates the need for such standardization. A good example is the use of DOI names in identifying articles in CrossRef. Publishers use many different schemes which all form DOI names that can then be used together, e.g.:

Publisher A uses PII: S1384107697000225

These three schemes are not at all interoperable, but become so in the DOI System as:

doi:10.2345/S1384107697000225

Each publisher can retain its own scheme and does not need to switch to a new one, though all publishers need to agree on a common metadata set for their DOI names.

2.3 Uniqueness

It is critical that the combination of prefix and suffix is unique, in order to support the integrity of the DOI System. The issuing of unique prefixes to Registrant organizations places the onus on those organizations to ensure that the DOI names that they are registering are indeed unique. A role of Registration Agencies is to provide a service to registrants which facilitates this. However, the DOI System will make internal checks for uniqueness at the time of registration. It is good practice never to reissue any unique identifier that has been once issued in error.

2.4 Case sensitivity

DOI names are case insensitive. 10.123/ABC is identical to 10.123/AbC. All DOI names are converted to upper case upon registration, which is a common practice for making any kind of service case insensitive. The same is true with resolution. If a DOI name were registered as 10.123/ABC, then 10.123/abc will resolve it and an attempt to register 10.123/AbC would be rejected with the error message that this DOI name already exists. Although from a character encoding viewpoint suffixes are case sensitive, e.g.
10.123/ABC is different from 10.123/AbC and the two could be distinguished as different identifiers, the IDF decided to remove case sensitivity, after a detailed review of the consequences. The Handle System is configurable by service so as to be either case sensitive or case insensitive and therefore allows this. This restriction has been implemented from an early stage, and IDF agencies have not introduced any cases of two DOI names distinguishable only by ASCII case resolving to the same thing. This restriction is reflected in the current version of the ANSI/NISO syntax for the Digital Object Identifier, Z39.84 (2005).

The advantages of case sensitivity (librarian and publisher practice, human readability and expectations) were outweighed by considerations of data integrity. Case sensitivity practice across Internet applications varies: DNS is not case sensitive; the rest of a URL is, except where the server treats it otherwise; file names differ between Unix and PC/Mac (Microsoft Windows in general is not case-sensitive, Unix operating systems are always case sensitive); and markup language tags vary as well. All of these can cause unexpected problems, and one cannot guarantee that any particular piece of software will respect case sensitivity and not conflate two DOI names intended to be different. Some search engines and directories are partially case sensitive. Different web browsers may differ in case sensitive handling (Netscape have stated that "authors should not rely on case-sensitivity as a way of creating distinct identifiers, unless they are designing solely for a truly standards-compliant browser"). This argued in favour of case insensitivity being the safer, and more robust, option for future evolution and development of the DOI System.

2.5 Character sets

DOI names may incorporate any printable characters from the Universal Character Set (UCS-2), of ISO/IEC 10646, which is the character set defined by Unicode v2.0.
The UCS-2 character set encompasses most characters used in every major language written today. However, because of specific uses made of certain characters by some Internet technologies (the use of pointed brackets < > in XML for example), there may be some effective restrictions in day-to-day use (see Appendix 1). When thinking about prefixes, suffixes and character sets, it is important to distinguish the DOI System from the underlying technology, the Handle System. The DOI System is a Handle System implementation. Current usage (though not the only possible or potential usage) takes place almost entirely within the context of the World Wide Web (which is not the same as the Internet) and is governed by an evolving set of IDF policies.

Prefix/suffix. Neither the Handle System nor DOI System policies, nor any web use currently imaginable, impose any constraints on the suffix, outside of encoding (see below). Handle syntax imposes two constraints on the prefix -- both slash and dot are "reserved characters", with the slash separating the prefix from the suffix and the dot used to extend sub prefixes. The root administrator for the Handle System has reserved all prefixes starting with "10." (for example 10.1000, 10.1000.1, 10.23) for the IDF to use for DOI names.

Encoding. The Handle System at its core uses UTF-8, which is a Unicode implementation and so in its pure form has no character set constraints at all: any character can be sent to, stored in, and retrieved from a handle server. The IDF imposes no additional character set constraints. In practice, though, there are many character set constraints enforced by the current web environment, depending on the individual user's context -- for example, what kind of browser is being used. (This is something of a moving target -- does your current browser display kanji characters, for example? Do you know?)

Implementation. It is essential to consider standards and the practical realities of implementation together.
So, for example, it is imperative to "hex encode" the character "#" in a URL, since this character is used to indicate the beginning of a URL fragment. The character means nothing special to the Handle System or in DOI name syntax: nonetheless, a handle contained within a URL must have the # character encoded, otherwise a browser will truncate the handle at the # sign. This is true across all web implementations. The need to "hex encode" other characters, for example "<" or ">", varies with a particular browser implementation. Such required encoding in the DOI name syntax is considered within the NISO standard. In a more general sense, any implementation of identifiers in a digital context needs to consider likely encoding issues that may be encountered, and should address character set constraints and the need to move those characters through environments such as the web in such a way that they pass through unaltered.

2.6 Publishing DOI names in print

Since most publication of content is via a mix of digital and print media, there are often requirements for a DOI name to be reproduced in print. A publisher might put the DOI name in the document it names, and ensure that the DOI name appears whenever the item is downloaded or printed. It also might appear in the print version of a digital version. If the DOI name is represented by a button on a Web page, the Web browser will display the full DOI name at the base of the browser window when the cursor is moved over the button. Whereas in a digital context a DOI name might be assumed to be contextualized and updated (the active link it is referencing can be "wired" correctly), a print version cannot be updated or changed once released. Showing DOI names in print for e.g. journal articles tells people what an article's DOI name is, but it doesn't tell people how to access it on the Web; readers will not necessarily know that the DOI name is actionable.
To do that, one may print the DOI name in a readily recognised form such as the http proxy server URL form e.g. http://dx.doi.org/10.1002/prot.9999. There are however a couple of reasons to hesitate before showing the URL form: the URL is not the article's identifier, the DOI name is; and the dx.doi.org form of the URL may not be the most persistent form, keeping in mind that these print copies will be around and immutable for many decades, even centuries. In practice one can feel safe in using the dx.doi.org formulation. It should continue to work for many years even if and when it is common to use DOI names in some other formulation. But if we are talking about centuries we will have moved beyond http:// as the most recognised route of access. So while it may be awkward, we recommend some convention of showing both the plain DOI name and a way to resolve it online (a shorthand way of saying "the DOI name for this article is 10.1002/prot.999 and current information may be found on the web through http://dx.doi.org/10.1002/prot.999" or "...available via http://dx.doi.org/..."). For example:

doi: 10.1002/prot.999

Specific DOI System implementations, such as CrossRef, may make additional recommendations appropriate to the particular applications concerned. DOI names do not replace traditional bibliographical citations but are a very useful addition, especially if articles are published online with volume, issue, and page numbers. For example, in the CrossRef application, a citation of the Science article with a DOI name would be:
A citation to a Nature article published in the Advanced Online Publication process without volume, issue or page number would be:
2.7 DOI System and Legacy Identifiers

An aim of the DOI System is to allow existing numbering systems to be retained, and the functionality of DOI names added to them easily.

2.7.1 Using existing identifiers as a DOI name suffix

An existing standard identification system number may be incorporated into a DOI name, if the registrant finds it convenient to do so (it is of course recommended that precisely the same entity be identified by the two systems). The DOI System is not alone in being a system that can incorporate existing identifiers: for example, physical bar codes can be used to express ISBNs.

For example, the suffix of a DOI name might consist of the ISTC (International Standard Textual Abstraction Code number): the DOI name would then identify the same entity (textual work) as the ISTC itself, with the added value of offering actionable resolution services which may be used to automate relationships (metadata); and interoperability with DOI names identifying related entities, such as manifestations of the textual work, or related textual works, even if these are not identified by ISTCs. The DOI name/ISTC may then be parsed either according to the rules of the DOI System or according to the rules of the ISTC embedded within, depending upon the context.

The same mechanisms can apply equally well to the use of identifiers which are not formal standards. For example, PII (Publisher Item Identifier) is an informal agreed standard among some publishers for simple identification of articles independent of format (it identifies articles at the level of textual abstraction, as does the forthcoming ISTC standard from ISO). PII is used by several scientific publishers as an internal numbering system. (PII and the DOI System are two separate identifier systems. PII is not connected with the International DOI Foundation). So a publisher may use ISTC or PII in identifying article works.
Since any existing legacy identifier can be used within a DOI name, a specific DOI System implementation can create interoperability where none existed before. For example, in the CrossRef implementation of the DOI System, some publishers create their DOI names by incorporating PII as a suffix; others incorporate SICI as a suffix; others may in future use ISTC as a suffix, and yet others may use entirely proprietary internal production numbers as a suffix. By using DOI names, each publisher gains the benefit of interoperability of its data within the CrossRef system yet does not have to "re-number" entities which have already been assigned identifiers in another scheme. Note that the kernel metadata for a DOI name mandates the inclusion of "Identifier": "A unique identifier (e.g. from a legacy scheme) applied to the entity...it is normal to include a legacy identifier if one exists". Consideration of datasets which already include existing (legacy) identifiers shows why this requirement exists: it is so that the existing legacy scheme may be used by any automated processes which pick up structured metadata from a DOI System service, using the kernel declaration of this element. Since, as we have stated earlier, DOI names are inherently opaque non-parsable strings, the legacy identifier will not be securely recovered from the DOI name suffix itself (consider for example the heterogeneous collection of suffixes in the CrossRef application). Yet including the legacy identifier, additionally, as the suffix may be convenient, make the DOI name more easily human readable, and be administratively desirable, even though it is not a requirement of DOI name creation. 2.7.2 Using DOI names to relate existing legacy identifiers Relationships between entities may be expressed via metadata. For example a single chapter of a work is an excerpt (as expressed in ISTC metadata) of that work, and (if it needs to be identified as a work) can also have an ISTC. 
Once a specification is made of the entities, the relationship between them may be expressed as an item of metadata ("a relationship that someone claims exists between two entities").
Figure 1 shows as an example a possible DOI System implementation of textual work identification, where the DOI names correspond to ISTCs (i.e. the ISTC may be incorporated as the DOI name suffix), in this case implemented as pop-up cascaded windows in a web browser (note that the DOI System is not restricted to use in web environments or with windows). Since multiple resolution offers an unlimited number of possible implementation choices for data types, it is possible to express any defined relationship of an ISTC to some associated datum by means of a DOI name resolution (from the DOI name to the current value of the associated datum). The specific choices selected for implementation are not specified by ISTC or DOI System rules, but are a matter for application decisions: the options shown here are illustrative possible choices. Each entity may be shown in a user-friendly way (e.g. "Alice in Wonderland") with the associated DOI name (e.g. "10.1000/ISTC0A9200212B4A1057") embedded in the application, e.g. as hyperlinks embedded in HTML pages. The following examples are shown:
And so on; any desired relationship may be expressed providing the appropriate metadata or specification is available. The DOI System detailed technical architecture of Application Profiles and DOI System Services would be used to instantiate this example application. 2.7.3 Benefits of using legacy identifiers with DOI names In addition to the benefits common to any DOI name there are some benefits specific to the incorporation of an existing standard numbering scheme into a DOI name:
2.8 DOI names and check digits

A check digit is not compulsory or necessary in a DOI name, but if you wish to include one you may. Identifiers such as URL and URI specifications, deriving from an Internet environment, do not have check digits: the underlying TCP/IP protocol they use has an error-correction component. Identifiers such as ISBN and similar bibliographic or documentation identifiers do have check digits: these act as aids to readability or keyboard data entry in the absence of any automated protocol correction.

The DOI System is deliberately designed as an opaque string, so that it is suitable for any use. The DOI System does not itself make use of check digits. However, other applications may make use of them, or may require them: so if you wish to incorporate a checksum digit into a DOI name you may. This could be useful for some other application. You may use as the suffix an existing string with a checksum (e.g. ISBN). You can also calculate the checksum across the whole DOI name if you wish (that would be akin to what the EAN/UPC does when it encapsulates an ISBN). Such a use of checksums in a particular DOI System application could be a rule of the DOI® Application Profile concerned: "your DOI names must include a checksum".

A check digit is usually the last in the sequence within an identifier string, algorithmically derived from the preceding digits, rather than being part of the identifier itself. The aim is to ensure that if one digit is incorrectly transcribed, the check digit will change as an alerting mechanism, and that if two digits are incorrectly transcribed, the chance of their combined effect on the check digit cancelling each other out is minimised. Recalculation of the check digit from the body of the number, followed by comparison with the stated check digit, can be performed automatically at key points in processing. Note that this provides error detection, but not error correction.
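The recalculate-and-compare step can be sketched in Python using the ISBN-10 check digit, chosen here only because ISBN is the legacy identifier the text mentions. The function names are hypothetical, and the check is something an application outside the DOI System might run on a legacy identifier; the DOI System itself does not interpret the suffix.

```python
def isbn10_check_digit(body: str) -> str:
    """Return the ISBN-10 check character for the first nine digits."""
    if len(body) != 9 or not (body.isascii() and body.isdigit()):
        raise ValueError("expected exactly nine ASCII digits")
    # Weights run 10, 9, ..., 2 across the nine body digits; modulus is 11.
    total = sum(w * int(ch) for w, ch in zip(range(10, 1, -1), body))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def isbn10_is_valid(isbn: str) -> bool:
    """Recalculate the check digit and compare it with the stated one."""
    chars = isbn.replace("-", "").upper()
    if len(chars) != 10:
        return False
    body, stated = chars[:9], chars[9]
    if not (body.isascii() and body.isdigit()):
        return False
    return isbn10_check_digit(body) == stated


print(isbn10_is_valid("0-306-40615-2"))  # correctly transcribed
print(isbn10_is_valid("0-306-40616-2"))  # one digit mistyped
```

A failed comparison reports only that something was mistyped, not which digit, which is the sense in which the scheme detects errors without correcting them.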
In a typical check digit algorithm, each digit is assigned a different weighting factor (ideally a prime number). Digits and their corresponding factors are individually multiplied and summed, the resulting sum divided by a prime modulus number, leaving a remainder being the check digit; using prime numbers minimises the chances of internal cancellation. Check digits occur in for example ISBN and ISSN numbers and in other contexts, e.g. bank account numbers; ISO has published a standard ISO 7064 for check digits.

Check digits are typically of importance in an entry step (where identifiers have to be manually transcribed as input) and less important in a transmission step where error correction protocols are already in place, although their original introduction was to ensure consistency in both types of activity. This has led to the assumption that check digits are of less importance, in an Internet-enabled world, than had been assumed in earlier automation phases. Whether or not this is true depends to some extent on the consequences of an error slipping through: whether inputting an incorrect identifier generates an error message, or simply locates the wrong object. A message may be transmitted correctly, but contain incorrect initial input. Omitting check digits in bank account numbers would not provide adequate error protection for most users. So the choice of whether to include a check digit will depend on the nature of the application. DOI names can accommodate them if required.

2.9 The DOI System and Internet identifier specifications

2.9.1 Generic identifier standards

Persistent and actionable object names are required for coherence in the digital realm.
"Persistent and actionable object names" thus necessarily require mechanisms for persistence (provided by social infrastructure); actionability (resolution from a name to some service); specification of an object (either through simple referencing or more formal description); and naming syntax (prescriptive rules for assigning identifiers in a standard format and ensuring uniqueness). The DOI System uses as its naming syntax the NISO standard DOI® syntax Z39.84. The DOI System uses for its name resolution the Handle System (IETF RFCs 3650, 3651, 3652). It uses for its optional object specification a DOI System data model and the indecs Data Dictionary and its subset the ISO MPEG 21 Rights Data Dictionary, ISO/IEC 21000-6. (The data dictionary component is designed to maximise semantic interoperability with existing metadata element sets; the data model allows descriptions to be grouped in meaningful ways so that certain types of DOI names all behave the same way in an application). DOI name persistence is guaranteed through the IDF social infrastructure which provides rules for registration, formal resilience procedures in the event of any single agency failing, etc. A standard represents an agreement by a community to do things in a specified way to address a common problem. Whilst the DOI System community has developed the system, it has also ensured conformance with relevant generic external formal standards. This note discusses those relevant in the Internet communities IETF and W3C. There is currently considerable debate here on the issue of generic standards for naming objects. The system is capable of being used in any specification which may finally be endorsed. Until a clear consensus is reached in the Internet communities on which approach is to be preferred the DOI System remains agnostic as to formal registration as a generic scheme, but useable and widely implemented for millions of objects. 
The DOI System conforms to the functional requirements of the two generic approaches for naming first-class objects on the Internet: the Uniform Resource Name (URN) and the Uniform Resource Identifier (URI). URI and URN specifications deal only with syntax and (in part) associated implementation through resolution, not with description or persistence policy. Broadly, the URN approach is favoured by IETF and the URI approach by W3C, though there is considerable ongoing debate about each; some documentation on these specifications is incomplete. Crucially, widespread practical implementations of these specifications as object naming do not exist: both URI and URN are specifications, not in themselves working implementations. The DOI System is de facto a practical implementation of URI and URN. The DOI System can also be implemented using current URL (http) specifications. The system is also a defined Digital Item Identifier within the ISO MPEG 21 multimedia framework specification. The Uniform Resource Identifier (URI) specification is IETF RFC 2396, URI Generic Syntax, currently under revision as RFC 2396 bis. URIs formally encompass URNs as a sub set. In practice, the URI specification defines (1) an implementation more often called the Uniform Resource Locator, a location on a file server, commonly accessed using the http protocol though other protocols are allowed; (2) a syntax for referencing in XML, through which e.g. ISBNs can be specified as URIs. This provides a single framework which can accommodate any other identifier for referencing, but it is not as such persistent (since persistence is not determined by the specification but by the practical implementation). Conflating these two causes confusion. URLs as currently understood are demonstrably not persistent; redefining them as URIs doesn't fix that. URL implementation. Users may resolve DOI names using the URL syntax through the DOI System proxy server (http://dx.doi.org). 
A DOI name of the form doi:10.123/456 would be resolved from the address: "http://dx.doi.org/10.123/456". Any standard browser encountering a DOI name in this form will be able to resolve it. The use of the proxy server does not interfere with any http requirements, so DOI names may be used with other http-based mechanisms such as OpenURL, PURL, parameter passing, etc. The proxy server is maintained by the IDF and the DOI System community for use by all. URI syntax implementation. In the URI specification, the network path of the URI is implicitly DNS based; there are no real provisions to include systems that are not DNS based. Original URI specifications, and good design practice, assume the URI to be opaque (that is, it is not assumed that software can parse the body of the URI but that it would simply recognize the name of the scheme and hand it off to some other software that understood the scheme). The current URI specification, however, assumes that the initial URI parser will look into every URI, no matter what the scheme, looking for certain meaningful characters such as dot and slash. This version of the URI proposed in RFC 2396 bis is so restrictive that it is difficult to see what system could make use of it. A specification for a DOI name as a URI exists as an Internet Draft: this document defines the 'doi' Uniform Resource Identifier (URI) scheme for DOI names, which allows a DOI name to be referenced by a URI for Internet applications. The current revision of the URI specification, plus ongoing debate within the IETF and W3C communities on several proposed URI specifications, have delayed the processing of this Draft. DOI System implementation does not depend on implementation of this specification. The URN (Uniform Resource Name) specification is RFC 2141 URN Syntax. 
In practice, the URN specification defines (1) a formal registration process as a urn namespace, e.g., urn:doi:10.1000/1 and (2) accompanying specifications to implement a series of functional requirements for such namespaces. Namespace referencing. One may specify any existing identifier as a URN: e.g urn:isbn:123456789, but this has no advantage over the simpler isbn:12345678. Such identifiers may be implemented using a specially written URN plug-in and resolved to URLs: functionally this gives nothing beyond the functionality achieved by coherent management of the corresponding URLs. URN implementation. In order to implement the functional requirements, the URN architecture assumes an additional network service: a DNS-based Resolution Discovery Service (RDS) to allow a client to deal with a previously unknown URN type by finding the specific service appropriate to the given URN scheme. URN resolutions are then delegated to that scheme-specific resolution service. However no such deployed RDS schemes currently exist: browsers cannot action URN strings without some additional programming in the form of a "plug-in". The lack of any wide-spread infrastructural support will require any URN implementation to develop its own resolution mechanisms, such as plug-ins or proxy servers. Resolution mechanisms which require functionality beyond 1 URN to 1 URL also require the creation of data models. Several such implementations have been developed for specific uses where deployment to a closed group of users may be achieved; these carry no guarantee of ready interoperability with other deployments, which may require a different plug in for each implementation and may use conflicting data approaches. DOI names do not require a plug-in but offer this as an option. The Handle System browser plug-in (native resolver, available in binary) delivers URN functional requirements in Windows-based browsers. 
URN plug-ins and Handle plug-ins share the problem of any new functionality of deploying the software to users. The Handle System has significant advantages: (1) the Handle System is a global supported resolution service; (2) the browser plug-in is freely available, widely tested and proven across multiple applications; (3) unlike URN plug-ins, it is part of a suite of freely available and managed supporting software configured to provide coherent server-side support, including the Local Handle Service software and Handle System Client Libraries. These are available across platforms and with several added security features such as trusted resolution and distributed administration. DOI System functionality can therefore be delivered through http, browser plug in, or incorporation of handle software in a dedicated application. The DOI System is not registered as a formal URN, despite fulfilling all the functional requirements, since URN registration appears to offer no advantage to the DOI System. It requires an additional layer of administration for defining DOI names as a URN namespace (the string urn:doi:10.1000/1 rather than the simpler doi:10.1000/1) and an additional step of unnecessary redirection to access the resolution service, already achieved through either http proxy or native resolution. If RDS mechanisms supporting URN specifications become widely available, DOI names will be registered URNs. 2.9.4 DOI System functional requirements The DOI System is designed to fulfil several functional requirements which we believe offer significant advantages in generic naming, notably: Neutral as to implementation. The DOI System allows but does not require http or other protocols. The design principle is that DOI names are not specific to the Web or any other implementation (e.g. information may be delivered in non-Web platforms such as PDAs). 
The DOI System is designed to be applicable in any environment on the Internet (the global information system linked by a globally unique address space based on the Internet Protocol (IP) using the Transmission Control Protocol/Internet Protocol (TCP/IP) suite). Granularity of naming and administration at the object level. Allows but does not mandate coarser level granularity tools such as domain names. Specifically, DOI name resolution in native resolver form does not require the use of the DNS (Domain Name System): the DNS administrative model argues against using it as a general-purpose name system and has well-recognised problems of security and updating. Neutral as to language/character set. Compatible with, but not restricted to, the ASCII character set. DOI names can use the Unicode capability of the Handle System to develop DOI names in Japanese, Chinese, etc. characters. The current DOI name syntax restricts initial implementations to ASCII simply for ease of adoption, but is intended to be widened (backward compatibly) to Unicode at the next revision.
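The character-set points made in this chapter come together when a DOI name is placed in a URL. The Python sketch below builds the proxy form (http://dx.doi.org/, as given in the text) and "hex encodes" the characters that would otherwise be misread in a URL, such as "#", "<" and ">", along with the UTF-8 bytes of non-ASCII characters. The function name is hypothetical, and the decision to leave only the forward slash unencoded is an assumption made for this illustration, not a rule taken from the NISO standard.

```python
from urllib.parse import quote

PROXY = "http://dx.doi.org/"  # proxy server address given in the text


def doi_to_proxy_url(doi: str) -> str:
    """Build the proxy URL for a DOI name, hex encoding unsafe characters."""
    # quote() encodes the string as UTF-8 and percent-encodes everything
    # except letters, digits, "_.-~" and the characters listed in `safe`.
    return PROXY + quote(doi, safe="/")


print(doi_to_proxy_url("10.123/456"))
print(doi_to_proxy_url("10.1000/abc#1"))
```

The DOI name itself is unchanged by this step: encoding applies only to the URL representation, so the plain form is still what should be registered, stored and printed.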