Information Retrieval List Digest 111 (May 11, 1992)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-111
=========================================================================
Date: Mon, 11 May 1992 10:50:06 PST
Reply-To: "Information Retrieval List"
Sender: "Information Retrieval List"
From: IRLIST
Subject: IR-L Digest, Vol.IX, No.15, Issue 111

IRLIST Digest    May 11, 1992    Volume IX, Number 15    Issue 111

**********************************************************

I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. ACL, June 28-July 2, 1992, U. Delaware, Newark
II. QUERIES
   B. Requests for Information
      1. E-Mail Conferencing User Survey
IV. PROJECT WORK
   C. Abstracts
      1. IR-Related Dissertation Abstracts

**********************************************************

I. NOTICES

I.A.1.
Fr: Don Walker
Re: Association for Computational Linguistics Annual Meeting

ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
30th Annual Meeting
28 June -- 2 July 1992
Clayton Hall, University of Delaware, Newark, Delaware, USA

The program for the Annual Meeting itself, which will take place from 29 June to 2 July, features papers on all aspects of computational linguistics. Two invited lectures will be given during the meeting: "Natural Language Processing, Information Retrieval, and All That" by Karen Sparck Jones, Cambridge University, and Martin Kay, Xerox PARC and Stanford University; and "Reflections and Projections" by Don Walker, Bellcore and ACL. In addition, there is a special set of Student Sessions featuring papers that describe 'work in progress' so that students can receive feedback from other members of the computational linguistics community.

The Annual Meeting is preceded on 28 June by a set of tutorials: "Statistics for Computational Linguists" by William A. Gale and Joseph B.
Kruskal; "Leading Issues in Tree Adjunction" by Yves Schabes and Stuart Shieber; "Very Large Text Corpora: What You Can Do with Them, and How to Do It" by Mark Liberman and Mitch Marcus; and "Situation Semantics" by Keith Devlin.

The ACL Business Meeting will feature reports on the ACL Special Interest Groups, the ACL Data Collection Initiative, the Consortium for Lexical Research, the Text Encoding Initiative, the new Graduate Directory, the new Computational Linguistics Course Survey, the NLP Software Registry, the Linguistic Data Consortium, and other topics of current interest. There will also be an informal gathering to discuss multimedia language processing.

CONFERENCE INFORMATION: The Program Committee was chaired by Henry Thompson, Edinburgh University. The Tutorials were organized by Bonnie Webber, University of Pennsylvania. For information about exhibits and demonstrations, contact Dan Chester, CIS Department, University of Delaware, Newark, DE 19716, USA, 1-302-831-1955; chester@udel.edu. Local arrangements are being handled by Sandra Carberry, Dan Chester, and Kathleen McCoy, Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA; 1-302-831-2712; acl@udel.edu.

Program and registration brochures are being mailed to all ACL members. To get a brochure and other information on the conference and on the ACL more generally, contact Don Walker (ACL), Bellcore, MRE 2A379, 445 South Street, Box 1910, Morristown, NJ 07960-1910, USA; (+1 201)829-4312; walker@flash.bellcore.com.

**********************************************************

II. QUERIES

II.B.1.
Fr: Diane Kovacs, Jeannie Dixon, Kara Robinson
Re: E-Mail Conference Participation Study

E-Mail Conferencing User Survey

Dear E-mail Conference Participant:

This message describes a survey of E-mail conference participants. E-mail conferences are being established to fulfil perceived needs for professional information sharing between librarians.
Our study will explore participation in e-mail conferences (who is participating) and ask how e-mail conferences are fulfilling information needs and whether they are in fact replacing or enhancing traditional information sources. As a further descriptive activity, this study will determine whether some e-mail conferences with apparently duplicate topics or overlaps in subject matter actually overlap in subscribers. ("E-mail conference" refers to Listserv-based discussion lists as well as Internet interest groups of various kinds.)

We will look at the number of participants, their institutional affiliations and their geographic distribution. Most importantly, we need to ask our subscribers if they are using e-mail conferences as sources of professional information.

Participation in this project is completely voluntary. Should you decide not to respond to the survey there will be no penalty of any kind. You may cease participation in the research at any time without penalty. Your participation, if you elect to take part, is very important to the study.

Total anonymity is not possible because computers identify the senders of all e-mail. However, we will keep your responses completely confidential. Only summary data will be released. If we would like to quote part of your response, we will contact you for permission.

If you want to know more about this research project, AND WOULD LIKE A COPY OF THE SURVEY TO COMPLETE, please feel free to contact Diane Kovacs at dkovacs@kentvm or dkovacs@kentvm.kent.edu.

The project has been approved by Kent State University. If you have questions about Kent State University's rules for research, please call Dr. Adriaan de Vries, telephone (216)672-2070.

Sincerely,

Diane Kovacs, Libres Editor, Kent State University
Jeannie Dixon, Educom-W Moderator, University of Texas-Pan-American
Kara L. Robinson, Libref-L Moderator, Kent State University

dkovacs@kentvm.kent.edu
dkovacs@kentvm
(216)672-3045

**********************************************************

IV. PROJECT WORK

IV.C.1.
Fr: Susanne M. Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution, number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADG91-20950.
AU TURTLE, HOWARD ROBERT.
TI INFERENCE NETWORKS FOR DOCUMENT RETRIEVAL.
IN University of Massachusetts Ph.D. 1991, 211 pages.
SO DAI V52(02), SecB, pp941.
DE Computer Science. Information Science.
AB Information retrieval is concerned with selecting documents from a collection that will be of interest to a user with a stated information need or query.
Research aimed at improving the performance of retrieval systems, that is, selecting those documents most likely to match the user's information need, remains an area of considerable theoretical and practical importance. This dissertation describes a new formal retrieval model that uses probabilistic inference networks to represent documents and information needs. Retrieval is viewed as an evidential reasoning process in which multiple sources of evidence about document and query content are combined to estimate the probability that a given document matches a query. This model generalizes several current retrieval models and provides a framework within which disparate information retrieval research results can be integrated. To test the effectiveness of the inference network model, a retrieval system based on the model was implemented. Two test collections were built and used to compare retrieval performance with that of conventional retrieval models. The inference network model gives substantial improvements in retrieval performance with computational costs that are comparable to those associated with conventional retrieval models and which are feasible for large collections.

AN University Microfilms Order Number ADG91-20481.
AU HILL, LINDA LADD.
TI ACCESS TO GEOGRAPHIC CONCEPTS IN ONLINE BIBLIOGRAPHIC FILES: EFFECTIVENESS OF CURRENT PRACTICES AND THE POTENTIAL OF A GRAPHIC INTERFACE.
IN University of Pittsburgh Ph.D. 1990, 212 pages.
SO DAI V52(02), SecA, pp327.
DE Information Science. Library Science. Geography.
AB The focus of this research was to determine the accuracy and predictability, and hence the effectiveness, of current practices of indexing geographic concepts for retrieval from online bibliographic files within the domain of earth sciences. The methodology was based on the measurement of geographic similarity between pairs of documents.
The geographic study area for each document in a test file of earth science documents was represented in at least three ways: by a map and by the text of bibliographic records from two online bibliographic files. The geographic similarity of the documents to one another was measured spatially using the maps and linguistically using the text, both indexing terminology and free text, under both Boolean and vector retrieval models and with frequency weighting of terms. Correlation analysis of the map-based geographic similarities to the text-based similarities was used to evaluate the effectiveness of geographic representation. Some records also included representation of the geographic concepts with latitude and longitude coordinates which were compared spatially to the map-based representations of the study areas and to the text-based representations. Optimal recall and precision values for three case study areas, using text and coordinates, were also derived, using the overlap of map areas to define the relevant sets. Results indicate only weak correlations between the text-based and the spatially-based geographic representations (with a range of 0.19 to 0.38), related to the imprecise nature of words in representing geographic areas and to the lack of predictability of the terminology used to describe a particular area. Recall and precision values for optimal search strategies for three case studies exhibited a great range of values (both ranged from 15% to 80%), with average values of 50% recall and 41% precision. Free text performed better than index terms in both correlation values to map-based geographic similarities and in search strategies; the advantage was based primarily on individual words in the index term phrases.

AN University Microfilms Order Number ADG91-19232.
AU KIM, MEE JEAN.
TI PLANNING FOR DISSEMINATION OF SCIENTIFIC AND TECHNICAL INFORMATION IN INFORMATION CENTERS IN THE REPUBLIC OF KOREA: A SUGGESTED MODEL.
IN Texas Woman's University Ph.D. 1990, 188 pages.
SO DAI V52(02), SecA, pp327.
DE Information Science. Library Science.
AB An efficient information system for the utilization of national and international information is a prerequisite for the growth of a national economy. In the case of Korea, problems and difficulties exist in scientific and technical information systems. Information centers in Korea do not have effective information systems, and cooperation among special libraries does not exist, even though the libraries do not have enough resources. The purposes of the study were to identify and describe the present status of scientific and technical information networks in Korea and to make recommendations for a model information network. The population surveyed consisted of 34 information centers (10 government-supported and 24 industry-supported) belonging to research and development institutes in the areas of science and technology. From the 24 responses received, six experts were selected to further participate in interviews which focused on gathering information, suggestions, and advice that might be useful in planning a model information network for Korea. Recommendations for change within the information center community and for a model for a scientific and technical information network for Korea were made based on questionnaire responses and interview results. All of the information centers in this study participated in resource sharing networks in a very limited way. Despite this lack of resource sharing, several of them are involved in interlibrary loan service, photocopy services, and construction of union catalogs/lists. Twenty information centers currently participate in one or more types of information networks on the national or international level. The most serious difficulties or barriers in operating ongoing or new cooperative/system activities were "a lack of standardization," followed by "a lack of information technology."
The most highly necessary functions or requirements for a model information network for Korea expressed by respondents were "developing a national policy for science and technology information," "constructing national material databases," and "developing and updating information retrieval languages." Therefore, the functions or requirements associated with information storage and retrieval systems should have high priority in a model information network. The findings also indicate that information centers in Korea have sufficient capabilities to improve the present status of information networks in terms of funding, personnel, information technology, information infrastructure, and support from the parent institutions. The major recommendation for the model information network was the development of science and technology information policy that supports and is consistent with the science and technology policy. A second recommendation was the standardization among information systems, in order to facilitate intersystem communication. A third recommendation was greater cooperation, such as coordinated acquisition and cataloging activities.

AN University Microfilms Order Number ADG91-19754.
AU PINELLI, THOMAS EDWARD.
TI THE RELATIONSHIP BETWEEN THE USE OF U.S. GOVERNMENT TECHNICAL REPORTS BY U.S. AEROSPACE ENGINEERS AND SCIENTISTS AND SELECTED INSTITUTIONAL AND SOCIOMETRIC VARIABLES.
IN Indiana University Ph.D. 1990, 354 pages.
SO DAI V52(02), SecA, pp328.
DE Information Science. Library Science. History of Science.
AB A study was undertaken that investigated the relationship between the use of U.S. government technical reports by U.S. aerospace engineers and scientists and selected institutional and sociometric variables. Two sets of variables were investigated.
The first set, identified as institutional or structural variables, included the following six variables: level of education, academic preparation, years of professional aerospace work experience, type of organization, professional duty, and technical discipline. The second set, identified as sociometric or source selection variables, included the following seven variables: accessibility, ease of use, expense, familiarity or experience, technical quality or reliability, comprehensiveness, and relevance. The goal of the study was to determine which variables explain the use of U.S. government technical reports by U.S. aerospace engineers and scientists. Survey research was the methodology used for the study. Data were collected by means of a self-administered mail questionnaire. The approximately 34,000 members of the American Institute of Aeronautics and Astronautics (AIAA) served as the study population. The response rate for the survey was 70 percent. Eight hypotheses were established for the study. A dependent relationship was found to exist between the use of U.S. government technical reports and three of the institutional variables (academic preparation, years of professional aerospace work experience, and technical discipline). The use of U.S. government technical reports was found to be independent of all of the sociometric variables. The institutional variables best explain the use of U.S. government technical reports by U.S. aerospace engineers and scientists.

AN University Microfilms Order Number ADG91-21448.
AU WANG, XIANHUA.
TI KNOWLEDGE-BASED SELECTION OF DATABASES: AN ALGORITHM AND ITS EVALUATION.
IN University of Maryland College Park Ph.D. 1990, 422 pages.
SO DAI V52(02), SecA, pp328.
DE Information Science. Library Science. Computer Science. Artificial Intelligence.
AB This dissertation addresses database selection, an issue of increasing importance as computerized files proliferate.
Database guides and selection programs are limited in considering the full range of capabilities of databases. This project designed and tested the effectiveness of a prototype system to select databases for queries in business. The system provides extensive descriptions of 26 business-related databases, using the entity-relationship approach as the conceptual schema. Data were derived from vendor descriptions and business thesauri and dictionaries. The knowledge base consists of a conceptual schema, a thesaurus, and the database descriptions. The system is implemented in Prolog. No interface was established for end-user searching. An emphasis in evaluation is the effectiveness of the conceptual schema for representing database information. To test the system effectiveness, the system's response was compared with databases chosen by professional searchers. Four searchers and one judge participated in the experiment. All were experienced business librarians using a range of online databases. Each searcher chose databases for twelve actual reference queries and stated their reasoning. Using search algorithms optimizing recall, the system generated responses for all 24 questions. For each set of questions, databases selected by the system and each searcher were considered relevant. The judge decided the relevance of databases not unanimously chosen. The average number of databases chosen by the system and by the searchers was similar, but the median number was less for the system. Overall recall and precision were similar for both. While the agreement between the searchers was about 44%, the agreement between the system and searchers ranged from 34 to 49%. In the prototype, the system performs almost as well as a searcher. Formative evaluation of the search results, including the searcher's reasoning, provided the basis for determining factors contributing to the success or failure of the prototype.
Reasons for failure included, for example, inability to represent specific data, insufficient database descriptions, problems with question interpretation, and breadth of searching (algorithm problems). Suggestions for improvement of all components of the system are included in the results. Appendices include the formative analysis for each query, the Prolog program, and individual database descriptions, which reflect the conceptual schema.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
   Clifford Lynch   lynch@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
   Nancy Gusack     ncgur@uccmvsa.bitnet
   Mary Engle       engle@cmsa.berkeley.edu or meeur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues.

To access back issues presently, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOG ***, where *** is the month and day on which the issue was mailed, to LISTSERV@UCCVMA.BITNET.

These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.