Re: [Ref-Links] DOIs used for reference linking
Hi,
On Thu, 18 Mar 1999, Norman Paskin wrote:
> We have now produced a proposal for the use of DOIs for reference
> linking.
[...]
> Comments are welcome.
I am reading this on the Ref-Links mailing list and I am only replying in
this forum. If this is of sufficient interest for the other lists, as
judged by someone familiar with them, I will forward these comments there. I
apologize upfront for the length of my remarks and their devil's advocate
nature.
I suppose I should take a moment to introduce myself. I am a physicist by
training, but no longer do active research. I worked with Paul Ginsparg
for several years on the Los Alamos e-Print Archive (http://xxx.lanl.gov/)
and now I work for the American Physical Society in our Journal
Information Systems R&D dept. My main responsibilities include our
Physical Review Online Archive (http://prola.aps.org/) and the APS link
manager (http://publish.aps.org/linkfaq.html) and our inter-publisher
linking relationships.
My view towards citing and reference linking is distinctly pragmatic (as
will be amply clear from what follows). Attached below is a PDF file of a
talk I have prepared for the ICCC/IFIP Third Conference on Electronic
Publishing '99 in Ronneby, Sweden in May
(http://www5.hk-r.se/ElPub99.nsf/). The talk is about the pragmatic
approaches the APS has taken to addressing some of the issues being
debated here. It is somewhat pedestrian in nature, but I think it
highlights some important aspects of the problems of citing and linking.
Comments on the paper are welcome.
The main problem that I have with using DOIs as the basis for citing and
linking references to scholarly journal articles is that it needlessly
tries to throw away the current scheme, a scheme that has been in use for
hundreds of years and has proven robust and stable. Namely, an article (I
am going to attempt to use the terms as defined in the proposal) is cited
by a subset of the (usually hierarchical) metadata that derives from the
peer-review process. For instance, Physical Review articles are cited by
(author, journal, volume, page, year). Conspicuously absent are the issue
number, title, ISSN, coden, PII, etc.
I believe the DOI proposal for reference linking errs when it regards this
subset of metadata as only applicable to a physical manifestation of the
work. It is, of course, true that such citations are tied to the printed
book, but they work perfectly well as a way to access an electronic
manifestation of the articles, especially when a "wrapper" is created that
supplies links to all of the alternative electronic manifestations
(availability of which may vary with time). One does face the problem of
how to identify articles independently of the papyrocentric pagination.
This we have solved pragmatically by generalizing the page number to
a six digit "electronic identifier".
This number looks like a page number for the most part, but it works for
both the printed book (if it is even produced) and the online version. In
particular, the identifiers have modest intelligence built in so that the
article can be found on a library book shelf just as easily as if a page
number were given, i.e., the numerical ordering of the identifiers matches
the editorially supplied table of contents ordering, even though the
latter doesn't fully exist until an issue is completed. In a purely online
journal, an "issue" would just be a collection of articles published
during a convenient time period. Researchers, libraries, A&I services, and
other databases have been able to immediately integrate the electronic
identifier into what they do because it is transparently akin to a page
number.
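To make the ordering property concrete, here is a minimal sketch. It
assumes identifiers are stored as zero-padded six-digit strings; the
sample values and function names are invented for illustration and are
not real APS identifiers or code.

```python
def is_electronic_id(value):
    """True if value looks like a six-digit identifier (assumed format)."""
    return len(value) == 6 and value.isdigit()

def shelf_order(locators):
    """Sort page numbers or six-digit identifiers numerically.

    The same rule serves both kinds of locator, which is why a reader can
    find an article on the shelf exactly as with a page number.
    """
    return sorted(locators, key=int)

# Invented sample identifiers, listed in the order they happened to arrive.
arrived = ["054301", "012001", "033102"]
print(shelf_order(arrived))
```

Because the comparison is purely numeric, an article published later can
still take its place in the table of contents ordering.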
The APS link manager provides an interface that is built upon the metadata
subset used to cite the articles we publish. In particular, the
construction of the URL can be accomplished by plugging in the metadata to
a simple template. No lookups are required. The simple generalization of
the page number and the simple link interface (which takes one to the
wrapper for the article) elevates the traditional citation from one for
the physical manifestation to one for the Work itself. This is no small
point - the whole point of the peer-review process is to attach a name to
a Work that conveys quality assurance as well as "brand" (the value added
to an article derived from being part of a larger collection of similarly
selected articles).
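As a rough illustration of the template idea, the sketch below builds a
link from (journal, volume, page) alone. The host name, path layout, and
function name are assumptions made up for this example; they are not the
actual APS link manager interface.

```python
from urllib.parse import quote

# Hypothetical template: the real link manager URL format is not shown here.
TEMPLATE = "http://link.example.org/abstract/{journal}/v{volume}/p{page}"

def citation_url(journal, volume, page):
    """Build a link directly from citation metadata, with no database lookup."""
    return TEMPLATE.format(
        journal=quote(journal, safe=""),  # escape anything unsafe in a path
        volume=int(volume),               # volumes are plain integers
        page=quote(str(page), safe=""),   # page number or six-digit identifier
    )

print(citation_url("PRL", 80, "1121"))
```

Anyone holding the citation can construct the link locally; the publisher
only has to keep the template stable.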
The proposal calls for recommending DOIs as "the declared identifier in a
citation" and for DOIs to remain "dumb". This I think will not work. I do
not believe that researchers will be willing to use long, dumb identifiers
directly in their citations, especially when there are simpler schemes
that will suffice. During the transition period when publishers will have
both digital and physical manifestations, I do not think it wise to
introduce multiple citations for a single work. Some publishers (at least
one in physics) adopted the DOI for articles that are published online
before they are printed. When the article is printed, a second citation
is produced. Researchers will thus cite the article in multiple ways,
making it harder to track and locate the article in its different
manifestations.
The other ingredient that is missing from the DOI proposal is an explicit
interface for accessing the database via the metadata. By far the most
important way of interfacing with the database is going to be querying the
database to get a DOI (and ultimately the URL) associated with the
(traditionally cited subset of) metadata. This interface will have to be
standardized and robust. With such an interface in place though, one must
pause and wonder what the addition of the DOI brings to the table. If the
query can be based on the metadata, why not leave it at that?
Finally (this e-mail is longer than I intended), one should carefully
think about the error handling that is possible when a link is being
resolved. If someone makes a mistake in the metadata (transposes some
digits in a page number or a DOI for example), where does that leave the
person attempting to follow the link? A solution like the APS link manager
is tightly integrated with our definitive manuscript database and this can
be used to the fullest to assist the user in resolving the error. At
worst, the user has found herself at the appropriate site and can follow
links to suitable search engines or tables of contents to locate the
wrongly cited article. A centralized server will need to work harder
to get the user moving on the correct path again.
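One way a publisher-side resolver might recover from transposed digits
is sketched below. The set of known pages stands in for the manuscript
database, and the function names are hypothetical; this is an
illustration of the idea, not a description of how the APS link manager
works.

```python
def adjacent_transpositions(page):
    """Yield every string obtained by swapping two neighbouring characters."""
    for i in range(len(page) - 1):
        if page[i] != page[i + 1]:
            yield page[:i] + page[i + 1] + page[i] + page[i + 2:]

def suggest(page, known_pages):
    """Return [page] if it exists, else known pages one transposition away."""
    if page in known_pages:
        return [page]
    candidates = {c for c in adjacent_transpositions(page) if c in known_pages}
    return sorted(candidates)

# Invented first-page numbers for one volume (stand-in for the database).
known = {"1121", "3456", "7802"}
print(suggest("4356", known))
```

An empty result would be the cue to fall back to the volume's table of
contents or a search page.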
So, from my pragmatic point of view, publishers can straightforwardly
generalize traditional (journal, volume, page) cites into an identifier
for the work and not just a particular manifestation and supply stable and
robust linking based upon this extension. A project like Eric Hellman's
SLinkS would provide a lighter weight centralized way of finding a
particular publisher's URL template that can be used in simple algorithmic
constructions of stable and robust URLs based upon traditional citation
metadata (it is my belief that all publishers will have to supply such an
interface anyway). DOIs will surely find use in other spheres, but in the
limited domain of scholarly journal citations, I think there is a
fundamental mismatch between the generality of the DOI and the problem
domain.
Well, thanks for reading this far. The form of the solutions publishers
choose will have a great impact on the ease of adding the hyperlinking for
which the scholarly literature is so naturally suited. So far, it has
been our experience that simple interfaces based upon the commonly used
metadata have been the easiest to accommodate and result in the rapid
spread of linking.
Regards,
Mark
ICCC/IFIP talk