Information Retrieval List Digest 153 (March 2, 1993)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-153

IRLIST Digest   ISSN 1064-6965   March 2, 1993
Volume X, Number 9   Issue 153

**********************************************************

I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. ASIS SIG/Classification Workshop
   B. Publications
      1. IR-L Digest via FTP
II. QUERIES
   B. Requests for Information
      1. Response to II.B.1., Issue 152
IV. PROJECT WORK
   C. Abstracts
      1. IR-Related Dissertation Abstracts

**********************************************************

I. NOTICES

I.A.1.
Fr: Ray Schwartz
Re: ASIS SIG/Classification Workshop

Call for Participation

The American Society for Information Science Special Interest Group on Classification Research (ASIS SIG/CR) invites submissions for the 4th ASIS Classification Research Workshop, to be held at the 56th Annual Meeting of ASIS in Columbus, Ohio. The workshop will take place Sunday, October 24th, 1993, 8:30 a.m. - 5:00 p.m. ASIS '93 continues through Thursday, October 28th.

The CR Workshop is designed to be an exchange of ideas among active researchers with interests in the creation, development, management, representation, display, comparison, compatibility, theory, and application of classification schemes. Emphasis will be on semantic classification, in contrast to statistically based schemes.
Topics include, but are not limited to:

* Warrant for concepts in classification schemes
* Concept acquisition
* Basis for semantic classes
* Automated techniques to assist in creating classification schemes
* Statistical techniques used for developing explicit semantic classes
* Relations and their properties
* Inheritance and subsumption
* Knowledge representation schemes
* Classification algorithms
* Procedural knowledge in classification schemes
* Reasoning with classification schemes
* Software for management of classification schemes
* Interfaces for displaying classification schemes
* Data structures and programming languages for classification schemes
* Image classification
* Comparison and compatibility between classification schemes
* Applications such as subject analysis, natural language understanding, information retrieval, expert systems

The CR Workshop welcomes submissions from various disciplines. Those interested in participating are invited to submit a short (1-2 page single-spaced) position paper summarizing substantive work that has been conducted in the above areas or other areas related to semantic classification schemes, and a statement briefly outlining the reason for wanting to participate in the workshop. Submissions may include background papers as attachments.

Participation will be of two kinds: presenter and regular participant. Those selected as presenters will be invited to submit expanded versions of their position papers and to speak to those papers in brief presentations during the workshop. All position papers (both expanded and short papers) will be published in proceedings to be distributed prior to the workshop.

The workshop registration fee is $35.00.
Submissions should be made by email, or diskette accompanied by paper copy, or paper copy only (fax or postal), to arrive by May 15, 1993, to: Phil Smith, 210 Baker Systems, 1971 Neil Avenue, Cognitive Systems Engineering Laboratory, The Ohio State University, Columbus, Ohio 43210; Phone: 614-292-4120; Fax: 614-292-7852, Internet: Phil+@osu.edu

**********

I.B.1.
Fr: Nancy Gusack, IR-L Moderator
Re: IR-L Digest Available Via Anonymous FTP

IR-L Digest issues are available via anonymous FTP, via the host dla.ucop.edu. The files are in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Note that lower case is used. All files from 1989 through the present have been loaded onto the dla.ucop.edu host. Please let me know if you have any problems with the system or notice any peculiarities in the transferred texts I need to fix.

**********************************************************

II. QUERIES

II.B.1.
Fr: David Lewis
Re: Full Text Documents

Hans Paijmans asks:

> experience? Could somebody direct me to literature that
> explicitly tackles the differences between (weights of keywords
> in) abstracts and full-texts?

Good question. I do not know of a published comparison of retrieval effectiveness for full text documents vs. abstracts using a modern statistical text retrieval system. It's clear that there's a lot we don't know about weighting methods for full text documents. Later this year the TREC proceedings will be published, and TREC/TIPSTER papers will start showing up in conferences. These will be the first solid data on weighting methods for full text, but probably won't contain a comparison of the type you want.

--Dave

David D. Lewis                   AT&T Bell Laboratories
email: lewis@research.att.com    600 Mountain Ave.; Room 2C409
ph. 908-582-3976                 Murray Hill, NJ 07974; USA
dept. fax. 908-582-7550

**********************************************************

IV. PROJECT WORK

IV.C.1.
Fr: Susanne M. Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADG92-18727.
AU ZHANG, CHENG.
TI A COMPUTER SYSTEM WHICH UNDERSTANDS APARTMENT-FOR-RENT ADVERTISEMENTS IN CHINESE NEWSPAPERS.
IN Columbia University Teachers College Ed.D. 1992, 183 pages.
SO DAI V53(02), SecA, pp485.
DE Language, Linguistics. Computer Science. Language, General. Artificial Intelligence.
AB People have tremendous frustration when attempting to communicate with computers. What they need is to be able to communicate with the computer in a language that is natural to them. This dissertation has focused on creating a pilot computer system which demonstrates a specific ability for understanding lists of apartment-for-rent advertisements in Chinese newspapers.
This understanding is limited to recycling the information taken from the text and categorically mapping it into the database system for detailed inquiry and retrieval. This dissertation explores four aspects of computational linguistic research: (1) a functionally efficient way of entering Chinese characters into the computer, (2) parsing algorithms for written Chinese language, (3) a mapping relationship between the noun phrases of written Chinese and the computer's relational database, (4) employing an artificial intelligence system for problem solving in the area of word sense ambiguities. Such a working computer system is created through the use of the RPG III programming language on an AS/400 IBM mid-range computer. All the programs are tested, debugged and listed in the appendices. The conclusion of the dissertation is that a full-fledged commercial computer system, capable of recycling the meaning of noun phrases in written Chinese can be created, as long as the artificial intelligence system is adequately comprehensive and the parser is sufficiently flexible.

AN University Microfilms Order Number ADGNN-65808.
AU BEGHTOL, CLARE LAWTON.
TI THE CLASSIFICATION OF FICTION: THE DEVELOPMENT OF A SYSTEM BASED ON THEORETICAL PRINCIPLES.
IN University of Toronto (Canada) Ph.D. 1991, 475 pages.
SO DAI V53(02), SecA, pp337.
DE Library Science.
IS ISBN: 0-315-65808-8.
AB This thesis assumes that established and experimental semantic and syntactic subject analysis techniques can be brought to bear on analyzing fiction documents for the purposes of information retrieval.
The thesis: (1) analyzes standard methods of classifying fiction; (2) reviews previous fiction classification theories and systems and uses the systems to classify a novel; (3) analyzes "fictional warrant", attempts to analyze "critical warrant", and identifies special problems that arise from identification of anomalous, fuzzy and/or ambiguous data in fiction; (4) develops a theoretical framework for fiction analysis containing four fiction-specific major data elements ("Characters", "Events", "Spaces" and "Times") and one general data element ("Other"); (5) develops a model for an experimental online fiction analysis system (EFAS) based on the theoretical framework and containing techniques for handling anomalous, fuzzy, and/or ambiguous fictional data; and (6) applies EFAS to 9 novels that were used as examples of novels containing anomalous, uncertain and/or ambiguous data and to the 10 novels that most recently won the Canadian Governor General's Literary Award for Fiction. EFAS assumes that data in fiction can be assumed to be "true" for the purposes of information retrieval; that each work of fiction can be treated as a self-contained information system; and that readers share world knowledge to an extent that would enable classifiers to produce reasonably consistent analyses of fiction. It uses retroactive notation and interpolations of notational and verbal expressions from standard subject analysis and cataloguing tools (Dewey Decimal Classification, Universal Decimal Classification, Library of Congress Subject Headings and Anglo-American Cataloguing Rules). It contains three tables of auxiliary notation that (1) signal ambiguous or non-classifiable data; (2) state relationships between different data elements in the same major category; and (3) state relationships between data elements in different major categories. Preliminary evaluation revealed that EFAS warrants further investigation, and some areas of research are suggested.
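
For readers who want to load these citations into a program, the tagged layout used in this section (AN, AU, TI, IN, SO, DE, IS, AB) lends itself to a simple line-oriented parser. The following is only a minimal sketch in Python, assuming each field begins on its own line with its two-letter tag and that every record opens with an AN line. The function name parse_records is hypothetical and is not part of any UMI or BRS software. An untagged line is treated as a continuation of the preceding field, so a wrapped title line that happens to begin with a tag word such as IN would be misread.

```python
import re

# Two-letter field tags seen in the citations in this issue.
TAGS = ("AN", "AU", "TI", "IN", "SO", "DE", "IS", "AB")
TAG_RE = re.compile(r"^(%s)\s+(.*)$" % "|".join(TAGS))


def parse_records(text):
    """Split tagged citation text into a list of dicts keyed by field tag.

    Assumption: one field per line, tag first; a new record starts at
    every AN line; untagged lines continue the previous field.
    """
    records = []
    current = None   # dict for the record being built
    last_tag = None  # tag of the field that continuation lines extend
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG_RE.match(line)
        if match:
            tag, value = match.groups()
            if tag == "AN":
                current = {}
                records.append(current)
            if current is None:
                continue  # tagged text before the first AN line is ignored
            current[tag] = value
            last_tag = tag
        elif current is not None and last_tag is not None:
            current[last_tag] += " " + line
    return records
```

Calling parse_records on the text of this section would, under those assumptions, yield one dictionary per dissertation, so that for example record["AU"] holds the author line and record["AB"] the abstract.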

AN University Microfilms Order Number ADG92-19635.
AU BLISS, NONIE JANET.
TI INTERNATIONAL LIBRARIANSHIP: A BIBLIOMETRIC ANALYSIS OF THE FIELD.
IN Texas Woman's University Ph.D. 1991, 152 pages.
SO DAI V53(02), SecA, pp337.
DE Library Science.
AB This study is a bibliometric analysis of the literature in the field of international librarianship. The analysis is based on the reference patterns in the materials indexed by Library Literature for the years 1958 to 1990. The study is designed to answer four research questions: (1) Based on the existing literature, what disciplines have contributed to international librarianship. (2) How have the contributions of publications in international librarianship fluctuated over the years. (3) What countries have contributed publications in international librarianship. (4) Who are the key/principal individuals who have authored contributions to the international librarianship literature. Citation analysis was used to collect the data for this study. Descriptive statistics was used to analyze the data and present the results and findings. The major findings of the study are: (1) Examination of the interdisciplinarity of the field of international librarianship revealed that the contribution by other disciplines was only 13.02%, suggesting the field is self-sufficient, (2) Examination of the fluctuations in the number of publications revealed the contributions fluctuate somewhat erratically, (3) Investigation of the geographic distribution of the contributions to the field revealed a dominance by the more industrialized countries, who published the majority of the documents, and (4) identification of key contributors to the literature determined that the field is extremely insular.

AN University Microfilms Order Number ADG92-18829.
AU HUBER, JEFFREY TODD.
TI ASSISTING PERSONS WITH AIDS: A CONTENT ANALYSIS OF INFORMATION SOURCES ON DYING, DEATH, AND BEREAVEMENT FOR GAY MALES WITH AIDS.
IN University of Pittsburgh Ph.D. 1991, 111 pages.
SO DAI V53(02), SecA, pp337.
DE Library Science. Social Work. Psychology, General.
AB Although much research has been done to combat the onslaught of the AIDS epidemic, neither a known cure for the acquired immunodeficiency syndrome nor a vaccine to prevent infection with the human immunodeficiency virus currently exists. While medical and technological advances have proven successful in prolonging the course of the illness, for most individuals an AIDS diagnosis continues to result in death. Yet this subject area within the AIDS arena has repeatedly been noted as lacking adequate documentation. This pilot study analyzed the content of the information available to health care professionals counseling gay men with AIDS as they prepare for death. The criteria used in this content analysis to evaluate the strengths and weaknesses of the materials were timeliness and perceived degree of usefulness. Timeliness was judged by imprint date, and perceived degree of usefulness consisted of specific issues discussed in the general literature concerning the dying process and aspects involving the homosexual community as seen in the AIDS literature. It was found that it is necessary to consult multiple sources of information in order to gain comprehensive coverage of the subject matter, as no one is all-inclusive. Additionally, the information seeker cannot limit the search to traditional sources of information found in a health sciences setting since data has historically, and continues to be, produced by those individuals directly affected by the epidemic--the gay community. Moreover, it is essential to augment and complement available information with data from related areas as significant lacunae exist in AIDS-specific resources. Until a cure is found for the acquired immunodeficiency syndrome, health care professionals must be as knowledgeable as possible concerning all aspects of the illness--including the dying process.
The gay community possesses unique characteristics that differentiate it from other groups perceived to be at risk for infection, and it is this community which continues to comprise the greatest number of AIDS-related deaths in the United States. With increased understanding, this final stage of life is one in which the assistance offered by health care professionals can make a significant difference.

AN University Microfilms Order Number ADG92-19620.
AU PEREZ, ERNEST RAOUL.
TI A STUDY OF TRADITIONAL INFORMATION ACCESS MODELS APPLIED IN A HYPERTEXT INFORMATION SYSTEM.
IN Texas Woman's University Ph.D. 1991, 259 pages.
SO DAI V53(02), SecA, pp338.
DE Library Science. Information Science. Mass Communications.
AB Hypertext is an effective computer interface technology showing promise for information retrieval system applications. The ease of use and totally self-directed nature of navigating a hypertext information system using embedded associative links has drawn favorable reactions from system developers and implementors. However, many information retrieval specialists and librarians are concerned about the absence of planning for effective, consistent, controlled means of information organization and access. This exploratory study investigated the potential for transferring traditional, print-based methods of information retrieval to the hypertext medium. An extensive literature review provided background for construction of a conceptual model of traditional information access methods. The study used the established case study procedure of comparing a subject system against a proposed model, to establish the degree of agreement or disagreement with the hypothesis. The investigator selected an MS-DOS hypertext authoring system which apparently used some traditional information organization and structuring methods.
Data collection methods included interview of the system developer and associates, examination of system documentation, informal interview of hypertext system developers using the system, and use of the subject system authoring tools for construction of a small hypertext system. Multiple means of data collection were used, allowing for cross-checking and verification of information. The data was interpreted by tabulation and examination of results against the traditional information access model. The study conclusively supported the hypothesis of the transferability of nearly all traditional information access and retrieval methods to the hypertext medium. It identified the hierarchical taxonomy matrix tool used in the subject authoring system as an accepted method of organizing information structure, and found it to be an effective hypertext authoring approach. It recommended the use of methods for insuring structured information organization, and tools to provide for consistency and control of access means. The study concluded that it would be productive for hypertext system authors to adopt a highly integrative approach, using a mixture of traditional information access methodologies to complement hypertext associative linking access and sophisticated information retrieval software approaches.

AN University Microfilms Order Number ADG92-17762.
AU SULLIVAN, KATHRYN ANN.
TI USING DIALOG CIP AT WINONA STATE UNIVERSITY TO EDUCATE END-USERS.
IN Nova University D.Sc. 1991, 300 pages.
SO DAI V53(02), SecA, pp338.
DE Library Science. Information Science. Education, Technology.
AB Graduate students need to know the resources of their university library in order to do research and cannot be expected to remember any library training they may have received as undergraduates.
A class offered by the library on how to search databases available through DIALOG's Classroom Instruction Program (CIP) was proposed, in cooperation with existing research classes in the student's field. A study was conducted at Winona State University, Winona, MN, with research classes offered by two professors in the field of special education. The study was to determine whether the information presented in an instruction session based on six learning objectives--choosing a database, choosing search terms and connectors, using search commands, modifying the search online, printing search results, and logging out--would enable graduate students to conduct an online search. Eighteen graduate students were provided with an hour's free searching on DIALOG in order to locate citations on their own choice of topic as part of an assignment from their instructor. Questionnaires were used to gather student assessments of their skills before and after their DIALOG search, while observation of the student during the search and examination of the actual printout of the search were used in conjunction with performance indicators to rate the actual skill in using DIALOG. All of the students were able to search DIALOG and print out citations but, based on performance indicators, none had the skill to be fully independent searchers.

AN University Microfilms Order Number ADG92-18824.
AU CHENG, LEE-JOY.
TI THE NATURE OF POLICY STUDIES: AN EMPIRICAL STUDY OF EDUCATIONAL POLICY RESEARCH IN THE UNITED STATES, 1970-1989.
IN University of Pittsburgh Ph.D. 1991, 179 pages.
SO DAI V53(02), SecA, pp616.
DE Political Science, Public Administration. Education, Administration.
AB The purpose of this study is to investigate the process of knowledge production in educational policy studies, and to study the research quality and orientation of these studies. Specific research questions posed by this study include: What is the growth rate of educational policy-related research.
Do differences exist between published and unpublished studies in terms of their growth of production. What is the research quality of policy-related studies in terms of the methodology employed by analysts. What is the orientation of policy-related studies. What are the determinants of research quality and orientation. In order to prevent publication bias, both published and unpublished studies were included in this study. Published articles were taken from major journals in educational policy studies which were listed in the Current Index to Journals in Education (CIJE). A computerized search of the Educational Resources Information Center (ERIC) system was employed to search for related unpublished reports. One hundred and fifty-five studies out of the total number of population (1445) were determined for this study. Based on the categories obtained from a pilot study conducted by the writer, 155 studies were coded and analyzed by: (1) computing the frequencies and percentages of the studies to answer questions about the growth of educational policy-related studies; (2) grouping all the studies into different quality groups based on their quality scores (factor scores); (3) conducting log-linear analyses to study both research quality and research orientation by examining their relationships to publication types, sources of support, topics of study, number of authors, number of author's disciplines, and institutional affiliations. The major findings are as follows: Published studies tripled in size in about ten years while unpublished studies doubled in size in about a ten year period. The growth rate of published studies was more rapid than that of unpublished studies. Results of factor analysis and cluster analysis indicated that about eighty percent (80.5%) of the educational policy-related studies produced during 1970 to 1989 were scored as low quality. About forty percent (39.36%) of the studies were found to be advocacy-oriented. 
Results of the logit analyses indicated that both publication types and topics of study affected the research quality significantly, and that both number of authors and institutional affiliations affected the research orientation of the studies significantly.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
   Clifford Lynch   calur@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
   Nancy Gusack     ncgur@uccmvsa.bitnet or ncgur@uccmvsa.ucop.edu
   Mary Engle       meeur@uccmvsa.bitnet

The IRLIST Archives is being set up for anonymous FTP, and access information will be provided soon! Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCCVMA (Bitnet) or LISTSERV@UCCVMA.UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.
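
As an illustration of the retrieval instructions above, the short Python sketch below builds the LISTSERV message for the month in which an issue was mailed, and the per-year directory named in notice I.B.1. It is a sketch only: the function names are hypothetical, and it assumes that YY and MM are each written as two digits with a leading zero where needed, which the instructions do not state explicitly. File names inside the FTP directory are not given in this issue, so none are constructed here.

```python
from datetime import date


def listserv_get_command(mailed):
    """Return the LISTSERV message for the month an issue was mailed.

    Assumption: LOGYYMM uses a two-digit year and a zero-padded month.
    """
    return "GET IR-L LOG%02d%02d" % (mailed.year % 100, mailed.month)


def ftp_directory(year):
    """Return the per-year archive directory (lower case, as noted in I.B.1)."""
    return "/pub/irl/%d" % year


# This issue was mailed on March 2, 1993.
print(listserv_get_command(date(1993, 3, 2)))  # GET IR-L LOG9303
print(ftp_directory(1993))                     # /pub/irl/1993
```

The message produced by the first function would be sent to LISTSERV@UCCVMA (Bitnet) or LISTSERV@UCCVMA.UCOP.EDU, and the reply contains every issue mailed in that month, not a single issue.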