Anderson, 'Computer Science Technical Report (CS-TR) Project: A Pioneering Digital Library Project Viewed from a Library Perspective', Public Access Computer Systems Review v7n02 URL = http://hegel.lib.ncsu.edu/stacks/serials/pacsr/pr-v7n02-anderson-computer

+ Page 6 +

-----------------------------------------------------------------
Anderson, Greg, Rebecca Lasher, and Vicky Reich. "The Computer Science Technical Report (CS-TR) Project: A Pioneering Digital Library Project Viewed from a Library Perspective." The Public-Access Computer Systems Review 7, no. 2 (1996): 6-26. (Refereed Article)
-----------------------------------------------------------------

1.0 Overview

Be favorable to bold beginnings. --Virgil

In 1992, the Advanced Research Projects Agency (ARPA) awarded a three-year grant to the Corporation for National Research Initiatives (CNRI) and five research universities to build a large-scale, distributed digital library of computer science technical reports produced by project participants. The participating universities were Carnegie Mellon University, Cornell University, the Massachusetts Institute of Technology, Stanford University, and the University of California at Berkeley. CNRI served as a collaborator and agent for the project.

The Computer Science Technical Reports (CS-TR) project was one of the earliest sustained investigations into the system engineering of digital libraries, and it pioneered multi-institutional collaborative research in this increasingly important area. The CS-TR project investigated a broad spectrum of technical, social, and legal issues related to the development and implementation of a very large, heterogeneous, distributed digital library.

The project's main accomplishments can be summarized as follows:

1. The most enduring accomplishments were the mutual respect and the research partnership that developed between the computer scientists and the librarians who worked together to investigate digital library issues.

2.
The project created a prototype digital library service that included a large collection of technical reports; an exchange format for bibliographic data (RFC 1357, which was superseded by RFC 1807); a distributed delivery protocol (Dienst) for information on the World-Wide Web; an information awareness service (Sift); an approach to interoperability (the Kahn and Wilensky paper); and a Web catalog tool (Lycos).

+ Page 7 +

3. The critical issues associated with the evolving concept of digital libraries were articulated through practice and deeper research into key issues.

2.0 Project Planning

CS-TR project planning began in 1990 with discussions among staff from the participating institutions. Computer science technical reports are an important body of knowledge; however, they are often difficult to locate because they are normally published by academic/research departments. The original question posed for the project was straightforward: how can we make computer science technical reports more accessible to researchers? Project participants initially believed that the intellectual property issues associated with distributing the technical reports were not terribly complex.

As a result of these early discussions, a variety of broader issues were identified, such as:

o How do we build technologies that make scholarship more effective?

o What do we really mean by a digital library?

This more comprehensive view was presented to potential funding agencies. ARPA funding was secured in 1992, and CNRI assumed the role of contract administrator.
At this stage, it was apparent that the project had the potential to set the pace for several important aspects of the digital library: distributed, virtual collections spread across the network; sophisticated linking mechanisms that would enable the location and retrieval of information no matter where it was located; tools to handle intellectual property issues; and identification and resolution of service and scholarly productivity issues.

The consortial arrangement of the project enabled each participating institution to pursue separate, but linked, approaches to these issues. Each of the five participants placed its own technical reports online at its site. Through network-based searching and retrieval mechanisms, the project explored the issues involved in sharing, rather than duplicating, online information.

+ Page 8 +

The research goals of the project varied with each participant. In "A Proposal for MIT Participation in an Electronic Library Plan," most of the key points involving technical, organizational, service, and data questions were enumerated:

1. To obtain early experience with a core function of the distributed electronic library of the future.

2. To work with a database that is readily available, has a critical time-sensitive value, and is already well-known and valued by its target audience.

3. To explore the architecture, design, and workflow issues associated with making information available in digital form.

4. To work within the research/prototype domain with a volume of information large enough to be useful, interesting, and scalable to an operational system.

5. To provide an important service to an audience of researchers, faculty, and students who are motivated and likely to have access to appropriately powerful workstations to use the library from their offices. [1]

Each campus pursued its own research questions within the framework of these common goals.
CNRI led the coordination, discussion, and facilitation of the individual efforts and contributed its own research on linking mechanisms and electronic copyright management.

3.0 Design and Development Issues

The project's core design was based upon the construction of a bibliographic records database that described the technical reports and provided links to the page-image representations of the reports. In addition to images, the project obtained the full text of the technical reports from either the reports' source files or OCR conversions. Using this full-text information, the project evaluated different retrieval mechanisms; explored data integrity issues for huge stores of data; and developed citation linking strategies for references across documents (e.g., a link from a footnote or citation in one document to the cited document itself). [2]

+ Page 9 +

3.1 Bibliographic Record Format

Many computer science R&D organizations routinely announce new technical reports by mailing (via the postal service) the bibliographic records for these reports. These bibliographic records are usually produced by secretaries or publications coordinators. This paper alert service has some obvious drawbacks: mailing costs, postal delays, and an inflexible format that is not amenable to convenient filing for later retrieval.

The CS-TR project participants wanted to shift to electronic bibliographic records distribution; however, in order to do so, they needed to use the same bibliographic record exchange format. The project participants wanted a format that was simple (for people and for machines), easy to read, and easy to create. It was recognized that this was likely to be an interim format, because automatic and full-text indexing methods could supersede bibliographic records.

Early in the project, use of the USMARC format was considered and discarded. USMARC is very complex, not easily taught, and not accepted by non-catalogers.
Project staff were concerned that the complexity and the high level of training necessary to catalog in USMARC could cause significant time delays between report publication and bibliographic record creation. For the CS-TR project, the possibility of a delay was unacceptable.

The BibTeX and Refer formats were also considered and rejected. Neither had the required computer science technical report fields (e.g., Computing Reviews category, monitoring, funding, contract organizations, and grant number).

The project participants created their own bibliographic format: "RFC 1357, A Format for Mailing Bibliographic Records" (this format was subsequently superseded by RFC 1807, "A Format for Bibliographic Records"). The basic design principles of the RFC 1357/RFC 1807 formats were:

1. Identification and creation of data elements needed for citation creation, management, and retrieval.

2. Creation of bibliographic records that coincided as closely as possible with the publication of the technical report.

3. Creation of RFC 1357/RFC 1807 records via machine parsing of the report's title page data and/or by staff in the participating computer science departments (library catalogers would not be needed).

+ Page 10 +

4. Provision of core information for more formal library bibliographic records.

Berkeley, Stanford, and Cornell built translators that map the project's record formats into other formats, including USMARC. Once the project participants decided to create a new bibliographic format, the development and implementation of the format proceeded quickly. The project was not constrained by older formats and could add fields as desired. Project participants came to agreement on name authority conventions for institutions; however, use of AACR2 was never discussed as a tool for bibliographic description.

3.2 Centralized Versus Distributed Indexes

Once the bibliographic record format was created, the project considered the issue of centralized versus distributed indexes.
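The simplicity of the RFC 1357/RFC 1807 tag-value syntax described in Section 3.1 can be illustrated with a short script. The parser below is a sketch, not a full implementation of either RFC; the field names follow RFC 1807, but the ID, title, and author in the sample record are invented for illustration.

```python
# Minimal sketch of a parser for RFC 1807-style bibliographic records.
# Fields are "TAG:: value" lines; continuation lines begin with whitespace.
# The sample record is hypothetical -- field names follow RFC 1807, but the
# ID, title, and author values are invented.

def parse_rfc1807(text):
    """Parse one tag-value record into a dict of tag -> list of values."""
    record = {}
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and tag:
            # Continuation line: fold it into the previous field's value.
            record[tag][-1] += " " + line.strip()
        elif "::" in line:
            tag, _, value = line.partition("::")
            tag = tag.strip()
            record.setdefault(tag, []).append(value.strip())
    return record

sample = """\
BIB-VERSION:: CS-TR-v2.1
ID:: EXAMPLE//CS-TR-96-001
ENTRY:: May 1, 1996
TITLE:: A Hypothetical Report on
 Distributed Digital Libraries
AUTHOR:: Doe, Jane
DATE:: April 1996
END:: EXAMPLE//CS-TR-96-001
"""

rec = parse_rfc1807(sample)
print(rec["TITLE"][0])   # continuation line folded into the title
print(rec["AUTHOR"][0])
```

Because records are plain text with one tag per line, departmental staff can create them in any editor, and a translator to USMARC or another format only needs the resulting dictionary.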
Project participants had long discussions where they argued the virtues, value, and scalability of centralized and/or decentralized indexes for very large distributed collections. One of the early goals of the project was to develop an interoperable, distributed collection that would allow each site to develop its own testbed architecture, create consistent content based on the TIFF-B standard, experiment with interoperable systems, and share digitized technical reports across different systems. In the end, no conclusions were reached, and the above goal was not met. The project participants recognized that neither centralized nor decentralized servers would scale up well. Eventually a more complicated, yet to be determined, architecture could emerge that would involve replication of an institution's indexes on several servers around the country.

In order to get started, Cornell developed Dienst--a protocol and an operational system that provided Internet access to the project's distributed collections. Indexes were produced and kept at each institution. Each institution was required to run the Dienst server protocol. Dienst did permit a "single distributed collection model," but it was not an interoperable model running on different software and server platforms. [3] Some institutions implemented a full-text searching capability limited to that site's reports.

+ Page 11 +

There were four classes of Dienst services:

o A Repository Service stored digital documents, each of which had a unique name and could exist in several different formats.

o An Index Service searched a collection and returned a list of documents that matched the search.

o A single, centralized Meta Service (also called a Contact Service) provided a directory of locations of all other services.

o A User Interface Service mediated human access to the digital library.

A group of sites sharing the Dienst protocol formed a single distributed collection.
Each site typically ran Repository, Index, and User Interface Services for documents issued by that site. One site ran a Meta Service, which defined the set of sites that make up the collection. Davis et al. describe Dienst as follows:

     From the standpoint of a Dienst user, a document collection consists of a unified space of uniquely identified documents, each of which may be available in a variety of formats. Using publicly available World Wide Web clients, users may search the collection, browse and read individual documents in any of their available formats, and download or print a document. [4]

With the Dienst system, users could query all or selected institutions using combinations of keywords in fields (e.g., author and title). The search was performed in parallel at user-selected sites. If a server was unavailable, the search would time-out and display a message to the user that the server was down. Davis et al. indicate that "further work needs to be done in two areas: begin replicating index servers to increase availability and response time; add persistent search which continues to attempt to contact non-responsive sites." [5]

+ Page 12 +

3.3 Technical Report File Format

The pros and cons of a standardized technical report file format (e.g., images, SGML, PostScript, and ASCII) were vigorously debated. The TIFF-B image format (also called Group IV fax compression in TIFF format) was selected as the project standard. This decision was supported by the following factors: (1) in 1992, image formats were standard and many commercial image software packages were available on multiple platforms; (2) retrospective paper reports could be easily converted to the image format; (3) project participants were eager to populate servers with both retrospective and prospective reports; and (4) researchers did not want to engage in document markup, convert documents, or develop new standards.
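The parallel, timeout-guarded search described in Section 3.2 can be sketched in a few lines. This is not Dienst itself: a real Index Service was queried over the network, whereas the site names, holdings, and delays below are invented stand-ins used only to show the fan-out-with-timeout pattern.

```python
# Sketch of the parallel, timeout-guarded fan-out search of Section 3.2.
# The servers are simulated locally; a real Dienst Index Service would be
# contacted over HTTP. Site names, holdings, and delays are invented.
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

def make_index_server(name, holdings, delay=0.0):
    """Return a fake Index Service that matches a keyword against titles."""
    def search(keyword):
        time.sleep(delay)  # simulate network latency
        return [(name, title) for title in holdings
                if keyword.lower() in title.lower()]
    return search

servers = {
    "siteA": make_index_server("siteA", ["Scalable Indexing", "OCR Methods"]),
    "siteB": make_index_server("siteB", ["Distributed Indexing"], delay=2.0),
}

def fan_out(keyword, timeout=0.5):
    """Query every site in parallel; report sites that fail to respond."""
    hits, down = [], []
    with ThreadPoolExecutor() as pool:
        futures = {site: pool.submit(fn, keyword)
                   for site, fn in servers.items()}
        for site, fut in futures.items():
            try:
                hits.extend(fut.result(timeout=timeout))
            except TimeoutError:
                down.append(site)  # tell the user this server is down
    return hits, down

hits, down = fan_out("indexing")
print(hits)  # results from the responsive site only
print(down)  # siteB timed out, as a Dienst search would report
```

As in Dienst, an unresponsive site does not block the whole search; the user simply sees which servers were down, and a "persistent search" would retry them later.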
Some project members believed (and continue to believe) that image files were the ultimate version of record, because they provided the simplest exact representation of the document and could be exported to new software and platforms over time. Many of the participating institutions made multiple file formats available on their servers. All formats were available through the Dienst protocol. Use of the TIFF-B format was a requirement for the project, but most institutions also offered PostScript and ASCII files (particularly for the newer reports).

3.4 Scanning and OCR

Project participants conducted an in-depth investigation of scanning and OCR hardware and software. Although there was no dpi requirement, the project participants agreed to scan pages at 300 dpi or greater because use of a lower resolution might require rescanning as more sophisticated systems were developed. Each institution purchased different equipment and software. As long as TIFF-B image files were produced, project participants did not need to use the same equipment. In fact, the project encouraged different scanning and OCR implementations.

MIT conducted the most in-depth research on the high-volume production, archival, and record keeping aspects of the scanning process. The MIT Library 2000 testbed effort focused significant attention on production scanning.

+ Page 13 +

This emphasis was based upon the hypothesis that scanned images of documents will be an important component of any future electronic environment. At its core, the digital library must contain high-quality content, and, for the foreseeable future, much of that content will come from the conversion of paper-format information to scanned images. The creation of a large corpus of quality information provides the testbed content for investigations into system architecture, electronic information management, retrieval, and long-term storage issues.

Basic principles of the MIT scanning effort included:

1.
Materials should only be handled once. The design of the scanning environment should strive to achieve the greatest advantage in terms of price, performance, and quality. Libraries and publishers cannot afford to rescan materials as technological capability increases. To limit potential damage to the original paper artifact, scanning once is preferable. To adhere to these principles, good paper workflow, management, and content selection are important.

2. As much information as possible should be captured in a single scan. Although current technology cannot exploit all of the bits captured, future technologies will be able to do so. The MIT scanners were capable of a resolution of 400 pixels per inch, with eight bits of gray-scale per pixel. This created very large files (about 16 MB per scanned page), which were rendered down to the agreed-upon interchange format for the project: 300 dpi, one bit per pixel, in TIFF-B format.

3. Quality control is critical. In order to achieve the first two principles, quality control methods must assure a high degree of integrity and confidence in the production environment. The MIT Libraries' Document Services department adapted procedures from its microreproduction heritage for this new production scanning effort. Document Services was using test targets from the Association for Information and Image Management (AIIM) and the Institute of Electrical and Electronics Engineers (IEEE) to test calibrations on the scanner. Quality control was checked via file checksums and visual review of selected images.

+ Page 14 +

4. Context of the images is important now and in the future. Because the underlying technologies will change and improve in the future, the scanned images must provide enough context for humans and machines to understand both their content and structure in order to use them effectively.
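Two of the principles above, rendering the rich master down to the interchange form (principle 2) and checksum-based quality control (principle 3), can be sketched together. The tiny 4x4 "page" and the threshold value are toy stand-ins: real masters were about 16 MB per page, and real thresholding for TIFF-B conversion is more sophisticated than a single cutoff.

```python
# Sketch of two MIT scanning steps: thresholding an 8-bit gray-scale master
# down to the project's 1-bit interchange form, and recording a checksum for
# quality control. The 4x4 "page" and the threshold are toy stand-ins.
import hashlib

master = [  # 8-bit gray-scale samples (0 = black, 255 = white)
    [250, 250,  10, 250],
    [250,  12,  12, 250],
    [250, 250,  10, 250],
    [250, 250, 250, 250],
]

def render_down(page, threshold=128):
    """Threshold gray-scale to 1 bit per pixel (1 = black ink)."""
    return [[1 if px < threshold else 0 for px in row] for row in page]

def checksum(page):
    """File-checksum stand-in for QC: hash the packed pixel bytes."""
    data = bytes(px for row in page for px in row)
    return hashlib.sha256(data).hexdigest()

bitonal = render_down(master)
digest = checksum(bitonal)

# Re-computing and comparing the stored digest later detects silent
# corruption of the archived image file.
assert checksum(bitonal) == digest
print(bitonal[1])  # [0, 1, 1, 0]
```

The point of the checksum step is durability: the rich master is scanned once, and integrity of every derived file can be verified mechanically for as long as the digest is kept.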
The MIT scanning effort created a metadata record to provide information about the scanned document and the environment in which it was created. This record specified both the form and content of the information that must be captured when a document is scanned, and it became a component of the scanned form of the document. The record assisted in viewing, displaying, or printing the image correctly; in understanding how to interpret the image; and in meeting contractual or legal requirements.

3.5 Distributed Digital Object Services

The most important design issue for the CS-TR project was to determine an appropriate infrastructure and architecture for a large distributed digital library. The outcome of the lengthy discussions of this issue is captured in a paper by Kahn and Wilensky:

     This document describes fundamental aspects of an infrastructure that is open in its architecture and which supports a large and extensible class of distributed digital information services. Digital libraries are one example of such services; numerous other examples of such services may be found in emerging electronic commerce applications. Here we define basic entities to be found in such a system, in which information in the form of "digital objects" is stored, accessed, disseminated and managed. We provide naming conventions for identifying and locating digital objects, describe a service for using object names to locate and disseminate objects, and provide elements of an access protocol. [6]

The most important contribution of the Kahn and Wilensky paper is the "handle" concept, which seeks to separate document naming issues from network address issues. Handles are not URLs; handles are an approach to a large-scale problem of naming objects that may change location over time. A handle is a unique, permanent identifier for a document, and it is used to name the document on a server. A mechanism called a "handle server" maps the handle to the document's real network address.
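The separation of naming from location can be shown with a toy resolver. This is only an illustration of the idea just described: the handle syntax and addresses below are invented, and CNRI's actual handle system defines its own naming rules and resolution protocol.

```python
# Toy sketch of the handle mechanism: a handle server maps a permanent name
# to a current network address, so a document can move without breaking any
# stored reference to it. Handle syntax and addresses here are invented.

handle_server = {
    # handle (permanent)       -> current location (may change over time)
    "example.edu/TR-96-001": "http://old-host.example.edu/tr/96-001.tif",
}

def resolve(handle):
    """Return the document's current address, or None if unregistered."""
    return handle_server.get(handle)

# The document moves to a new host: only the handle server entry changes;
# every citation that stored the handle remains valid.
handle_server["example.edu/TR-96-001"] = (
    "http://new-host.example.edu/96-001.tif"
)

print(resolve("example.edu/TR-96-001"))
```

Because readers cite the handle rather than the address, updating this one mapping repairs every link at once, which is the property the paragraph above attributes to handle servers.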
A working prototype of the handle server is available at CNRI, and handle functionality is being integrated into World-Wide Web browsers, such as Netscape.

+ Page 15 +

In the future, a Web browser will send a message to a handle server that gives the handle for the desired document. The handle server will send the Web browser the actual network address of the target document, which the browser will then retrieve. Handles and handle servers will be very powerful tools for digital libraries. No longer will Web servers contain false links, because handle servers can update documents' network addresses on a nightly basis.

For libraries to move beyond their physical walls (and campus boundaries) and to leverage the power of the distributed information base of the network to enrich services for their local community of users, a basic architecture for naming, locating, and accessing network information must be well-understood and adopted. The handle concept accomplishes this important goal.

3.6 Copyright

Copyright is a key issue in building digital libraries. At the beginning of this project, participants assumed that there would be few (or no) copyright issues associated with distributing computer science technical reports. They assumed that the reports published at their schools were either in the public domain or that the rights were held by the publishing university. Later, as copyright questions arose, the project participants assumed that a single strategy would work for every institution.

These assumptions proved to be naive. Upon investigation with legal counsel, researchers discovered that each school had different intellectual property policies, and, consequently, five different approaches to the copyright issue evolved.

At Stanford, librarians took on the role of ensuring that these copyright issues did not pose a risk to the university or to the faculty.
Librarians identified scenarios that needed attention, and they began to meet with legal counsel to determine appropriate responses. These efforts helped them to articulate a set of copyright guidelines now used by the CS-TR projects at Stanford and Cornell.

+ Page 16 +

The major findings and recommendations of the Stanford guidelines are presented below. Other institutions may find this information helpful; however, they should not view it as legal advice. The worldwide legal environment is undergoing rapid change, and the project's approach may become obsolete in the face of new laws and treaties.

o At most U.S. academic institutions, the author owns the copyright to any books, articles, or technical reports. Works published prior to March 1989 without a copyright notice are in the public domain (unless steps were taken within 5 years to establish copyright ownership). Works, with or without a copyright notice, published after March 1989 are copyrighted.

o In most cases, reports that are produced by (or that report on work sponsored by) the government are not in the public domain. The government can make copies, but, at most institutions, the author owns the rights.

o Most CS-TR project institutions ask authors to sign a form granting the institution a nonexclusive, revocable, royalty-free license to publish, perform, display, and distribute the works. One author's signature is binding on multiple-author works.

o If an author has signed or plans to sign an exclusive agreement with a publisher for a particular work (or for substantially the same work) in a particular format, that author cannot then sign a nonexclusive agreement with the institution for the same work in the same format.

o If an author signs a nonexclusive agreement with an institution for a technical report and then decides to publish the same work elsewhere, the author should inform the publisher of this previous agreement.
The author should then grant the institution written permission for the nonexclusive rights to publish, perform, and display the works before any works are loaded on that institution's servers. If the author indicates he or she has already signed an exclusive agreement with a publisher, the technical report should probably not be mounted on the server without permission of that publisher.

+ Page 17 +

o At some institutions, the authors do not own the rights to their works. Each institution should be clear about copyright ownership before mounting technical reports on servers.

The CS-TR project did not address the issue of third-party rights in technical reports. When authors sign agreements, it is assumed that the entire work is original or that the author has the rights to include non-original tables, charts, and figures. This is one area that could be pursued by asking authors specifically about the originality of their works.

There are several ways to manage technical reports that are submitted to publishers as articles:

o Ask the authors not to sign exclusive agreements with publishers. Ask them to modify the publisher's standard agreement to allow the institution to keep the work on a server.

o Make special arrangements with the publishers so the technical reports can stay on the servers even if an article is published.

o If the author requests, remove the technical report from the server and point to the printed article.

o Include a notice with the technical reports to inform viewers of their rights (e.g., transmitting documents over the network and viewing them, which may be legally considered a "performance" of the work; making printed copies; distributing copies to others; and selling copies to others). This relieves the users of guessing what restrictions might apply.

Most likely, the user will properly assume fair-use restrictions apply, view the work, and perhaps make a personal copy. But can the user legally send a copy to a colleague?
Cornell has chosen to clarify these issues by explicitly sublicensing rights to the user. [7] This clarifies the user's rights to quote, redistribute, or make copies of the technical report. However, the sublicense must preserve the author's right to withdraw the technical report from further distribution.

+ Page 18 +

3.7 Tension Between Research and Operational Objectives

After a certain point in the CS-TR project's development, the project's prototype systems were used as both experimental and production services. The prototype systems that were available for public use changed constantly. This created a tension between providing reliable operational services while developing new experimental capabilities.

In the CS-TR project, librarians continuously examined the long-term viability of the effort. At each stage of the project, it was important to remember that the project was primarily conducting research and that digital libraries are in a nascent state. Whatever we built would be superseded by more powerful knowledge and services in the future.

Several public systems were implemented with support from the CS-TR project:

o Dienst, a distributed search system for technical reports (Cornell). [8]

o Mercury, a centralized search system for technical reports (Carnegie Mellon).

o GLOSS, a system to help find relevant data sources (Stanford). [9]

o SIFT, a system for performing wide-area information dissemination on USENET newsgroups and computer science technical reports (Stanford). [10]

o Lycos, a searchable catalog of Internet resources (Carnegie Mellon). [11]

o A handle server to maintain unique identifiers to objects in the digital library (CNRI). [12]

+ Page 19 +

During the CS-TR project, these prototype systems were quite successful. The Lycos system had thousands of accesses every day. The SIFT system had over 10,000 subscribers. Fourteen institutions used Dienst as a production system to disseminate their technical reports.
However, using prototype systems as production systems was challenging. Enhancements and changes to the Dienst system were problematic because the institutions using the system all had to implement the upgrades. In a similar fashion, changes to Lycos or the SIFT system affected the Internet users of these systems.

Today, many of the project's prototype systems have evolved into true production systems; however, they will continue to be used as testbeds for digital library experimentation and research. They offer an opportunity to examine a variety of new issues, such as the linkage of large-scale, distributed digital object collections; the cognitive efforts needed to identify and present coherent collections to users; and the effective integration and evaluation of services for all media, examining both content and user issues.

4.0 Collaboration in the CS-TR Project

The CS-TR project involved significant collaboration between the participating institutions. It also required extensive collaboration between librarians and computer scientists.

4.1 Collaboration Between Participating Institutions

As a result of many long discussions and compromises, the CS-TR project created systems that are more logical than they would have been without this collaborative effort. However, collaborations of this kind create tensions. Each institution was primarily funded to study specific areas of the overall digital library research domain. All of the participating institutions wanted to make their technical reports available on their servers as soon as possible so that their research could commence, and they wanted their prototype systems to reach the broadest possible audiences. While project participants had a common overall objective, the above considerations sometimes made multi-institutional collaboration a challenging endeavor.
+ Page 20 +

4.2 Collaboration Between Librarians and Computer Scientists

If we accept that we are living in an information age and that a central challenge for this age is to give people tools with which they can successfully use networked information, then librarians and computer scientists are natural collaborators to address this challenge. Computer scientists and librarians each bring to the discussion complementary technical skills and perspectives. Computer scientists have a broad view of the network, new approaches to information retrieval, and an openness to change. Librarians have content expertise, responsibility for significant collections of scholarly material, a strong service orientation, and a historical commitment to the preservation of our intellectual heritage. Both communities share the academic values of the open sharing of information and the desire to foster the creation of new knowledge.

From the inception of the CS-TR project, librarians worked closely with computer scientists. Both groups brought strengths to the project, and the cooperative results were superior to those that would have occurred if either group had conducted the project alone. Through ongoing discussions and consideration of common problems, such as the proposed handle mechanism, an atmosphere of trust and respect was created. The librarians benefitted from the computer scientists' cultural values of exploration and learning by doing. The computer scientists benefitted from the librarians' broad perspective and integrative skills. The mutual respect of these two groups for each other's professional knowledge and abilities created a productive, dynamic atmosphere.

For example, early in the design stage of the project, the development of bibliographic records for the technical reports was a key discussion topic. The computer scientists wanted a variety of departmental staff to be able to quickly and easily create bibliographic records.
The librarians wanted consistent record content and the ability to make multiple uses of the record. The resultant record structure (RFC 1807) accommodated both sets of requirements in a sustainable, scalable manner. The records can be created by publishing assistants immediately upon acceptance of the technical report. The records have a consistent definition, and the use of record fields is well-understood. There are conversion routines to facilitate MARC record creation (or use of the record in other formats).

+ Page 21 +

Another example is the collaboration of staff in the MIT Libraries' Document Services department with researchers in the MIT Laboratory for Computer Science's Library 2000 project to create an operational scanning service. This collaboration resulted in other opportunities for joint work on scanning issues. The collaborative efforts of librarians and computer scientists created mutual respect that will continue to bear fruit long after the CS-TR project's termination.

5.0 Expanding the CS-TR Project

At the June 1995 CS-TR meeting, the project participants agreed to ask the Computing Research Association (CRA) to endorse and to encourage the dissemination of this technology. A new consortium effort called the Networked Computer Science Technical Report Library (NCSTRL) was created to merge the CS-TR project (sponsored by ARPA) and the WATERS (Wide Area Technical Report Service) project (sponsored by the National Science Foundation). [13]

Institutions interested in participating in NCSTRL should consider the following qualifying criteria:

1. Participating sites are required to adopt, implement, and use RFC 1807 and Dienst. Adoption of these tools allows the site to automate the collection, management, and network availability of its own repository of computer science technical reports. The institution's report collection will become part of an expanding distributed library of technical reports through interoperation with other cooperating sites.

2.
Doctoral-granting U.S. institutions in computer science are invited to participate. Other institutions of higher education (or commercial or government research laboratories) that wish to participate should contact Rebecca Lasher (rlasher@forsythe.stanford.edu) to inquire about their possible involvement.

3. Institutions should join only if they have a long-term commitment to disseminating computer science technical reports using NCSTRL tools.

+ Page 22 +

6.0 Lessons Learned

Over the three years of the project, every participant gained a better understanding of the intellectual, organizational, social, and legal complexities embodied in library services. Building sophisticated digital library services while preserving the enduring values of a traditional library is a difficult endeavor. Among the lessons learned are:

o Providing digital library services raises very difficult issues related to intellectual property and to system scale, content, and use.

o The underlying foundation of the digital library is content, structure, and organization. This foundation must be durable, yet flexible enough to be useful in future environments.

o From the beginning, create good content. Libraries cannot afford to redo the digital library with each new iteration of system design or access method.

o A focus on system openness and interoperability is critical. The digital library is integrally involved with the nature of public and scholarly communication, information formats, and the economic and political environments within which information is created and sought. [14]

7.0 Conclusion

Libraries are operational, production-oriented service organizations. A librarian's evaluation of a research project tends to focus on how successfully the products of the project are integrated with (or replace) existing services and how well they can be supported and renewed in a production environment. The CS-TR project built several new prototypes, which became true production systems.
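The bibliographic records behind these systems use RFC 1807's plain-text layout: one field per line, an upper-case tag followed by "::" and a value (the full field list appears in Appendix A). The following sketch is an informal illustration only, not project code; the sample record's ID, title, and author are invented, and the continuation-line handling is a simplifying assumption.

```python
import re

# An RFC 1807 field line starts with an upper-case tag followed by "::".
FIELD_RE = re.compile(r"^([A-Z][A-Z0-9_-]*)::\s*(.*)$")

def parse_rfc1807(text):
    """Parse an RFC 1807-style record into a list of (tag, value) pairs.

    Pairs (not a dict) are used because some fields, such as AUTHOR,
    may repeat.  A line that does not begin a new field is treated as
    a continuation of the previous field's value.
    """
    fields = []
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            fields.append([m.group(1), m.group(2).strip()])
        elif fields and line.strip():
            fields[-1][1] = (fields[-1][1] + " " + line.strip()).strip()
    return [tuple(f) for f in fields]

# A hypothetical record (identifier and title invented for illustration):
SAMPLE = """\
BIB-VERSION:: CS-TR-v2.1
ID:: STAN//CS-TR-96-0000
ENTRY:: March 1, 1996
ORGANIZATION:: Stanford University, Department of Computer Science
TITLE:: An Example Technical Report
AUTHOR:: Doe, Jane
DATE:: February 1996
END:: STAN//CS-TR-96-0000
"""

record = parse_rfc1807(SAMPLE)
```

A record parsed this way can then be mapped field by field onto other structures, which is the kind of conversion (for example, to MARC) that the project's routines performed.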
During the course of the project, it addressed many key aspects of designing a digital library:

1. Discovery: matching the technology with the service vision.

2. Delivery: nurturing and developing this match in a prototype atmosphere to examine its feasibility and readiness for implementation.

+ Page 23 +

3. Service: the ongoing operations of the service and its continuous improvement.

4. Support: provision of assistance, documentation, and training.

5. Integration: fit of the new service with the organization's overall architecture and services.

The CS-TR project made the most progress in the areas of discovery and delivery. More precise questions for each of the above processes were articulated. The project's discussions about integration issues related to large-scale, distributed digital libraries will have a lasting impact on the field.

The CS-TR project provides a model of a working distributed digital library that will be useful to participants in the NSF Joint Initiative Digital Library Projects and as the conceptual framework for further research by other digital library developers. The NCSTRL system that evolved from the CS-TR and WATERS projects will contribute significantly to the broader digital library community. [15]

From a librarian's perspective, the CS-TR project offered the opportunity to work with and contribute to a world-class effort to transform scholarly communication. The learning experience was intense and gratifying. More questions have been formulated than were answered, but the new questions are better articulated and understood. One key question is whether a "digital library" is a real library as we understand it today or just a metaphor for something entirely different.

Notes

1. Jerome H. Saltzer, "A Proposal for M.I.T. Participation in an Electronic Library Plan" (Cambridge: Massachusetts Institute of Technology, 1992).

2.
A great deal of research was done by the participating institutions that is not mentioned in this article. Detailed descriptions of these activities can be found on each project participant's Web page. See .

+ Page 24 +

3. See .

4. James R. Davis, Carl Lagoze, and Dean B. Kraft, "Dienst: Building a Production Technical Report Server" (Paper delivered at ADL '95: A Forum for Research and Technology Advances in Digital Libraries, Tysons Center, VA, 17 May 1995).

5. Ibid.

6. Robert Kahn and Robert Wilensky, A Framework for Distributed Digital Object Services (Reston, VA: Corporation for National Research Initiatives, 13 May 1995). See .

7. See .

8. See .

9. Gloss is a research system, and the server may be unavailable at times. See .

10. Sift is a research system, and the server may be unavailable at times. See .

11. See .

12. See .

13. See .

14. Adapted from: Sarah M. Pritchard, "Librarians: Real Expertise for a Virtual World," Library Issues: Briefings for Faculty and Administrators 15, no. 5 (1995).

15. Clifford Lynch and Hector Garcia-Molina, Interoperability, Scaling, and the Digital Libraries Research Agenda: A Report on the May 18-19, 1995 IITA Digital Libraries Workshop. See .

+ Page 25 +

Acknowledgments

The research report upon which this article is based was sponsored in part by the Corporation for National Research Initiatives, using funds from the Advanced Research Projects Agency of the United States Department of Defense under CNRI's grant no. MDA-972-92-J-1029. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsement, whether expressed or implied, of ARPA, the U.S. Government, or CNRI.

Appendix A. RFC 1807 Fields

Bibliographic record fields should follow the format described below. Records must include all mandatory fields; optional fields may be omitted. The tags (a.k.a.
the Field IDs) are shown in upper case.

   BIB-VERSION of this bibliographic records format
   ID
   ENTRY date
   ORGANIZATION
   TITLE
   TYPE
   REVISION
   WITHDRAW
   AUTHOR
   CORP-AUTHOR
   CONTACT for the author(s)
   DATE of publication
   PAGES count
   COPYRIGHT, permissions and disclaimers
   HANDLE
   OTHER_ACCESS
   RETRIEVAL
   KEYWORD
   CR-CATEGORY
   PERIOD
   SERIES
   MONITORING organization(s)
   FUNDING organization(s)
   CONTRACT number(s)
   GRANT number(s)
   LANGUAGE name
   NOTES
   ABSTRACT
   END

+ Page 26 +

For the text of the entire RFC 1807 standard, see .

About the Authors

Greg Anderson, Director, IT Discovery Process, MIT Information Systems, 77 Massachusetts Ave., Room E19-324, Cambridge, MA 02139. Internet: ganderso@mit.edu. (During the CS-TR project, Mr. Anderson was the Associate Director for Systems and Planning at the MIT Libraries.)

Rebecca Lasher, Head Librarian, Mathematical and Computer Sciences Library, Stanford University, Stanford, CA 94305-2125. Internet: rlasher@forsythe.stanford.edu.

Vicky Reich, Assistant Director, Highwire Press, and Information Access Analyst, Green Library, Stanford University, Stanford, CA 94305-6004. Internet: vicky.reich@forsythe.stanford.edu.

About the Journal

The World-Wide Web home page for The Public-Access Computer Systems Review provides detailed information about the journal and access to all article files:

Copyright

This article is Copyright (C) 1996 by Greg Anderson, Rebecca Lasher, and Vicky Reich. All Rights Reserved.

The Public-Access Computer Systems Review is Copyright (C) 1996 by the University Libraries, University of Houston. All Rights Reserved.

Copying is permitted for noncommercial, educational use by academic computer centers, individual scholars, and libraries. This message must appear on all copied material. All commercial use requires permission.