Information Retrieval List Digest 069 (June 19, 1991)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-069
=========================================================================
Date: Wed, 19 Jun 91 22:43:13 PST
Reply-To: Information Retrieval List
Sender: Information Retrieval List
From: IRLIST
Subject: IRLIST Digest, Vol. VIII, No. 26, Issue 69

IRLIST Digest          June 19, 1991          Volume VIII, Number 26
                                              Issue 69
**********************************************************

I. NOTICES
   C. Miscellaneous
      1. Wide Area Information Servers Unix Release B1

II. QUERIES
   B. Requests for Information
      1. Multilingual Information Retrieval (MIR)

IV. PROJECT WORK
   B. Bibliographies
      1. Selected IR-Related Dissertation Abstracts

**********************************************************

I. NOTICES

I.C.1.
Fr: Brewster Kahle
Re: Wide Area Information Servers Unix Release B1

New Unix Internet Release (Beta Release) Available

Thank you for the interest in WAIS. The servers on Quake (including the directory of servers) have received 4500 requests from 193 different hosts all over the world in a couple of months. There are now 35 servers including one in Norway (UiO_Publications.src), poetry (poetry.src), as well as a Connection Machine serving all sorts of things.

Through the Alpha release process, many people have helped find and fix bugs; thank you. With Beta, I think we are ready for widespread announcement. Please repost this on any other list you would like to.

There are a few mailing lists on this subject that you might want to be on:

  wais-interest: only announcements like this (1 a month or so)
  wais-discussion: moderated mailings, 1 every 2 weeks. Good stuff including all on wais-interest.
  wais-talk: unmoderated for implementors and interactive discussions.

Requests to wais--request@think.com. Archives available from wais server: wais-discussion or anonymous ftp from quake.think.com.

The Mac release is unchanged and stable.
Jonathan Goldman pulled the most recent release together.

Highlights of modifications (see the release for the full report):

  Works on more architectures (BSD and closer on SysV and Xenix)
  waisindex: parses mail dates + fixes
  waisserver: security feature + first kludge toward relevance feedback + better logging + fixes
  waissearch, waisq, xwais, and xwaisq: fixes
  gmacs wais: can display pictures if on an Xmachine, more commands, fixes

Thank you to all who have contributed bug reports and suggestions.

Overview of components:

This release contains source code for:

 * Server code: There is code to index text and picture files.
 * Protocol code: based on Z39.50-1988 using the internet.
 * Client code: User interfaces for contacting servers
     * GNU emacs interface
     * simple shell interface
     * Mac interface (in separate WAIStation file)
     * tool kit for making your own interfaces
     * X interface
 * Directory of servers: This is a network service that lists existing servers and how to contact them.
 * A Connection Machine server with some patent information, the CIA factbook, and some Biomedical abstracts, info-mac, risks, etc. to serve as example servers.
The public servers that are currently advertised are:

  CM-applications.src
  CM-fortran-manual.src
  CM-paris-manual.src
  CM-star-lisp-docs.src
  CM-tech-summary.src
  CMFS-documentation.src
  MIT-algorithms-bug.src
  MIT-algorithms-exercise.src
  MIT-algorithms-suggest.src
  Molecular-biology.src
  NIH-Guide.src
  US-Gov-Programs.src
  UiO_Publications.src
  bible.src
  cosmic-abstracts.src
  cosmic-programs.src
  directory-of-servers.src
  homebrew.src
  info-mac.src
  internet-resource-guide.src
  internet-rfcs.src
  jargon.src
  patent-sampler.src
  poetry.src
  risks-digest.src
  sample-books.src
  sample-pictures.src
  sun-spots.src
  tmc-library.src
  usenet-cookbook.src
  wais-discussion-archives.src
  wais-docs.src
  wall-street-journal-sample.src
  weather.src
  world-factbook.src

The release is available from Think.com via anonymous FTP in /public/wais/wais-8-b1-dist.tar.Z and WAIStation-0-62.sit.hqx.

Bugs to bug-wais@think.com or to me.

-Brewster and the WAIS crew

"Paper and flesh are fleeting media for the treasures that are ideas."

Brewster Kahle                   Thinking Machines Corporation
Brewster@Think.com               1010 El Camino Real
Project Leader                   Menlo Park, CA 94025
Wide Area Information Servers    415-329-9300x228

**********************************************************

II. QUERIES

II.B.1.
Fr: Xia Lin
Re: Request for Information

We are working on a survey of research on multilingual information retrieval (MIR). We would like to invite researchers involved in MIR research to send us descriptions or publications of their research. We would appreciate any information about persons involved in multilingual computing, projects related to MIR, existing MIR systems, etc. Please respond by E-mail or send your printed materials to the following address. Thank you in advance.

Xia Lin (xialin@apple.com)
20525 Mariani Ave. MS 76-2C
Cupertino, CA 95014

**********************************************************

IV. PROJECT WORK

IV.B.1.
Fr: Susanne Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract. Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided. Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADG90-34709.
AU HIGH, WALTER MARTIN, III.
TI EDITING CHANGES TO MONOGRAPHIC CATALOGING RECORDS IN THE OCLC DATABASE: AN ANALYSIS OF THE PRACTICE IN FIVE UNIVERSITY LIBRARIES.
IN The University of North Carolina at Chapel Hill Ph.D. 1990, 286 pages.
SO DAI V51(07), SecA, pp2186.
DE Library Science.
AB The Online Computer Library Center, Inc. (OCLC) maintains and provides access to a 21,000,000 title electronic bibliographic database which grows rapidly and is used by thousands of librarians daily. It has served as a source of shared cataloging data since its inception and was established for the purpose of reducing the time and cost of cataloging library materials.
The purpose of this study is to measure how often librarians at selected research libraries accepted the cataloging data available in the OCLC database and how often they edited it to suit purposes specific to their institution. The method of ascertaining the desired measures was to select five libraries with membership in the Association of Research Libraries and examine a sample of one thousand monographic records from each, drawn from a four-week period in April 1985. The edited version of the bibliographic records was obtained from each institution's OCLC archive tape (a machine-readable magnetic tape that records all cataloging transactions). The original, unedited version of the records was obtained on magnetic tape from the OCLC database. The two versions were matched by computer algorithm and printed side-by-side for comparative purposes. The algorithm was also programmed to identify cataloging fields where changes had occurred. The changes were manually analyzed and classified to determine the extent of editing in seven areas of the catalog record. The results of the analysis show that the ratio of editing changes on records contributed to the OCLC database from a participating library compared to records contributed from the Library of Congress (LC) is 4.5 to 1.0. A little more than half of the editing changes to records could be classified as "cosmetic" as opposed to "substantive." Catalogers showed fairly high acceptance rates of each other's intellectual work, about 85% for records from participating libraries and 98% for records from LC with the series area being the most problematic. Records contributed to the database by participating libraries did not generally lack substantive data that would make them unacceptable. There is potential benefit from having OCLC staff design and run special computer programs to fix specific, identifiable problems that recur throughout database records.

AN University Microfilms Order Number ADG90-35143.
AU WESTERMAN, JUDY GERTUDE LUDLAM.
TI AN EVALUATION OF DATABASE QUALITY: A CASE STUDY OF SUBJECT ACCESS TO YOUNG ADULT FICTION IN THE ACCESS PENNSYLVANIA CD-ROM DATABASE.
IN University of Pittsburgh Ph.D. 1990, 129 pages.
SO DAI V51(07), SecA, pp2187.
DE Library Science. Computer Science.
AB The purpose of this study was to investigate database quality through user accessibility to the multiple bibliographic records of young adult fiction titles in the ACCESS PENNSYLVANIA CD-ROM Database. The major hypotheses of the study were: the multiple bibliographic records for young adult fiction titles with the most topical subject headings have the most holdings; the bibliographic record for each young adult fiction title with the most holdings is annotated; the bibliographic record for each young adult fiction title with the most topical subject headings will have an annotation; and that the retrospective conversion of young adult fiction titles enhanced subject access to these records held by one school district. The investigation was accomplished by using 30 young adult fiction titles copyrighted between 1975 and 1985 as the purposeful sample. The database contained 162 records for the sample titles which were analyzed in terms of topical subject headings, annotations, and the number of libraries owning each edition. The data was entered onto a computer file using the statistical package SPSS Information Analysis System.
Major findings in the study were: (1) The multiple bibliographic records for young adult fiction titles with the most topical subject headings are held by the most libraries on only 8 percent of the records; (2) The bibliographic record for each young adult fiction title with the most holdings is annotated in only 8.6 percent of the records; (3) The bibliographic record for each young adult fiction title with the most topical subject headings will have an annotation in only 30.8 percent of the records; and (4) The retrospective conversion of young adult fiction titles did enhance subject access to these records held by one school district for 6 percent but also showed a significant decrease in record quality for 42 percent of the records. The findings were related to the need for improved record standardization and database maintenance.

AN University Microfilms Order Number ADG90-30439.
AU YOO, JAE-OK.
TI FIELD DEPENDENCE/INDEPENDENCE AND THE PERFORMANCE OF THE ONLINE SEARCHER.
IN Indiana University Ph.D. 1990, 225 pages.
SO DAI V51(07), SecA, pp2187.
DE Library Science.
AB This study attempted to identify whether online searching performance was affected by FD (Field Dependence) and FI (Field Independence) cognitive differences between searchers and the extent to which searching performance was affected by FD/FI dimension of cognitive style. This study used a quasi-experimental design with 41 student subjects using the Lockheed DIALOG system and ERIC ONTAP database. Cognitive styles of student subjects were measured by using GEFT (Group Embedded Figure Test) and they were divided into two cognitive groups--FD and FI based on the GEFT scores. Each subject was assigned two predetermined searches which had different search goals--a high precision search and a high recall search.
Search performance of two cognitive groups on two different search problems was compared in order to see how these two groups responded to achieving different search goals in terms of search strategy, search inputs, and resulting search outputs. The major findings of this study were: (1) The pattern of approaching a search problem was not significantly different between the two cognitive groups. (2) For the high recall search, the FI group utilized significantly more terms than the FD group but slightly less time than the FD group. (3) For both searches the FI group achieved a significantly higher success rate than the FD group. The FI group were significantly more successful searchers than the FD group. (4) Mean differences of the search performance variables between the FD/FI groups were consistent across the two types of search questions. The FI group seemed to be equally effective for both types of search questions. In conclusion, the differences found in number of terms used and success rate between the two cognitive groups apparently resulted from different cognitive styles.

AN University Microfilms Order Number ADG90-33645.
AU MONTY, MELISSA LEE.
TI ISSUES FOR SUPPORTING NOTETAKING AND NOTE USING IN THE COMPUTER ENVIRONMENT.
IN University of California, San Diego Ph.D. 1990, 187 pages.
SO DAI V51(07), SecB, pp3597.
DE Psychology, Experimental. Information Science. Computer Science.
AB A distributed view of cognition is discussed in terms of its applicability to external information storage and the problem of designing environmental support for notetaking and managing large personal databases. The cognitive issues of representing ideas in notes and using those notes in various tasks are explored in terms of the user's goals and the environmental constraints.
Several methods of study were undertaken: (1) a longitudinal study of an individual taking notes and writing a paper using a computer hypertext interface designed for these tasks; (2) experimental tests for context effects when using notes to cue recall; (3) informal observations of an individual filing index cards and several individuals searching through class notes; and (4) informal analyses of cues encoded in class notes, office files, index cards, and NoteCards. An individual was observed over several months as he collected notes and wrote a paper using the Xerox NoteCards hypertext system. The stages of his task--notetaking, filing, organizing, and writing--are characterized by distinct activities and tools. The four main, overlapping organizations for his notes--source, topic, outline, and paper--are compared. Controlled experiments tested for context effects during note review but showed no effect of context. Short notes were taken while reading passages of text and presented one week later to cue recall for the passages. In one condition, notes from a passage were separated at test and interspersed between notes from other passages. Greater segmentation of separated notes appeared to reduce their ability to cue recall. Browsing and data retrieval strategies for loosely structured personal databases and multiple organizations are discussed. Visual discriminability is important during browsing and review and must be enhanced in computing environments. Different classes of encoding features are described with observations about their use across domains. Cues may be discriminating, attentional, directional, revisional, and reliability determining. Cues from these different classes were combined, sometimes redundantly, to encode such things as relationships, levels of importance, and the usefulness of the material. Directional features (boundaries and connectors) are particularly important during encoding and interpretation of spatial organizations. 
**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
  Clifford Lynch   lynch@postgres.berkeley.edu or calur@uccmvsa.bitnet
  Nancy Gusack     ncgur@uccmvsa.bitnet
  Mary Engle       engle@cmsa.berkeley.edu or meeur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues.

To access back issues presently, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOG ***, where *** is the month and day on which the issue was mailed, to LISTSERV@UCCVMA.BITNET.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack or Mary Engle for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.
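
The two back-issue requests described above can be sketched in a few lines of Python. This is only an illustration, not part of IRLIST or LISTSERV: the function names and the sender address are hypothetical, and the sketch only builds the messages; it does not send them. The value that replaces *** is passed in by the caller exactly as it appears in the Index, since the digest does not spell out its format.

```python
from email.message import EmailMessage

# Address taken from the digest footer.
LISTSERV_ADDRESS = "LISTSERV@UCCVMA.BITNET"


def _command_message(sender: str, command: str) -> EmailMessage:
    """Build a mail message whose body is a single LISTSERV command."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LISTSERV_ADDRESS
    msg.set_content(command)
    return msg


def index_request(sender: str) -> EmailMessage:
    """Message asking for the index of IR-L back issues (hypothetical helper)."""
    return _command_message(sender, "INDEX IR-L")


def issue_request(sender: str, log_token: str) -> EmailMessage:
    """Message asking for one issue (hypothetical helper).

    log_token stands in for the *** placeholder; its exact format is not
    given in the digest, so it is used verbatim as copied from the Index.
    """
    token = log_token.strip()
    if not token:
        raise ValueError("log_token must not be empty")
    return _command_message(sender, f"GET IR-L LOG {token}")


if __name__ == "__main__":
    # reader@example.org is a placeholder sender address.
    print(index_request("reader@example.org"))
```

Keeping the command in the message body follows the wording of the instructions ("send the message INDEX IR-L"); handing the result to a mail transport is left to the reader.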