Information Retrieval List Digest 074 (July 18, 1991)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-074

IRLIST Digest     July 18, 1991     Volume VIII, Number 31     Issue 74

**********************************************************

I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. Hypertext '91

II. QUERIES
   A. Questions and Answers
      1. Knowledge Dictionary Project
   B. Requests for Information
      1. A Spanish Stemmer

IV. PROJECT WORK
   B. Bibliographies
      1. Selected IR-Related Dissertation Abstracts

**********************************************************

I. NOTICES

I.A.1.
Fr: Gary Perlman
Re: Call for Posters for ACM Hypertext'91

A C M   H y p e r t e x t ' 9 1
San Antonio, Texas
December 15-18, 1991

POSTER SUBMISSION INFORMATION
(please circulate and post)

Submission Deadline: Postmarked by August 25, 1991

Contents of this File:
[1] GOALS
[2] SELECTION CRITERIA
[3] SUBMISSION FORMAT
[4] SUBMIT TO
[5] LIMITS
[6] ANSWERS TO COMMONLY ASKED QUESTIONS

Hypertext'91 is an international research conference on hypertext. The ACM Hypertext Conference occurs in the United States every second year in alternation with ECHT, the European Conference on Hypertext.

Hypertext systems provide computer support for locating, gathering, annotating, and organizing information. Hypertext systems are being designed for information collections of diverse material in heterogeneous media, hence the alternate name, hypermedia. Hypertext is by nature multi-disciplinary, involving researchers in many fields, including computer science, cognitive science, rhetoric, and education, as well as many application domains.

This conference will interest a broad spectrum of professionals in these fields ranging from theoreticians through behavioral researchers to systems researchers and application developers. The conference will offer technical events in a variety of formats as well as guest speakers and opportunities for informal special interest groups.

Poster presentations will allow researchers to present late-breaking results, significant work in progress, or work that is best presented in conversation. Poster sessions let conference attendees exchange ideas one-on-one with authors and let authors discuss their work in more detail than in a paper presentation.

[1] GOALS: Posters will be accepted much later than papers and will provide an opportunity to present and get feedback on new or developing ideas. System developers might want to contact Amy Pearl (pearl@eng.sun.com, 415-960-1300) about submitting a demonstration proposal instead of or in addition to a poster.

[2] SELECTION CRITERIA: Posters will be reviewed by a panel of subject-matter experts and will be selected on the basis of their contribution to research or practice.

[3] SUBMISSION FORMAT: Submit a cover page with:
* title of the proposed poster
* name and affiliation of the author(s)
* complete contact information (including phone, fax, and email) for one contact person to whom correspondence will be addressed

Also submit an extended abstract of at most two typewritten pages describing:
* the problem,
* what was done, and
* why the work is important.

Graphic displays can be appended to the two-page abstract. Electronic submission is preferred. Include a "Subject:" line in the form:

    Subject: HT91 POSTER: title of your proposed poster

[4] SUBMIT TO:
Gary Perlman; Department of Computer and Information Science; Room 228, Bolz Hall; The Ohio State University; 2036 Neil Avenue Mall; Columbus, OH 43210-1277 USA
Email: perlman@cis.ohio-state.edu; Phone: 614-292-2566; Fax: 614-292-9021

[5] LIMITS:
* Because of the interactive nature of poster presentations, only one submission will be accepted per author.
* All submissions must be postmarked by August 25, 1991. Overseas submitters should consider express mail if submitting late; overseas mail sometimes takes more than two weeks.
* There is a limit of two typewritten pages for submissions.
Figures can be appended to these pages.

[6] ANSWERS TO COMMONLY ASKED QUESTIONS:

Q: Will the posters be published as part of the proceedings?
A: No, but abstracts of the posters will be available at the conference. Posters will be technical "presentations" but not "publications". Some posters might make good papers for the SIGLINK newsletter, or other outlets.

Q: How many posters will be accepted?
A: The program and conference committees have allocated space for 25-40. The actual number accepted will not exceed the larger number, but we will not feel compelled to accept posters to fill the space. We want to have high quality posters as part of the technical program.

Q: When, where, how, etc. will the posters be displayed?
A: Several large meeting rooms have been reserved in the conference hotel. The conference committee and the posters chair have taken special care to provide ample room for people to walk through the posters. Each poster will be provided with a cork-board, table, and chair.
   cork board: 8' x 4' (2.44 m x 1.22 m)
   table: 8' x 15" (2.44 m x 0.38 m)
   chair: 4 legs (.004 kilolegs)
   Pushpins will be provided, but electrical outlets will not.

Gary Perlman; Department of Computer and Information Science; Room 228, Bolz Hall; The Ohio State University; 2036 Neil Avenue Mall; Columbus, OH 43210-1277 USA
Email: perlman@cis.ohio-state.edu; Phone: 614-292-2566; Fax: 614-292-9021

**********************************************************

II. QUERIES

II.A.1.
Fr: Cris Kobryn
Re: "Knowledge Dictionary" Project

We are investigating intelligent text information retrieval systems and have been verbally referred to NASA's "Knowledge Dictionary" project. Any pointers to literature or persons associated with this project would be greatly appreciated.

Cris Kobryn; Harlequin Limited; Barrington Hall; Barrington; Cambridge CB2 5RG, England; Tele: +44-223-872522; Net: cris@uk.co.harlqn; Fax: +44-223-872519

**********

II.B.1.
Fr: Ross Wilkinson
Re: A Spanish Stemmer

Does anyone know of an algorithm for stemming Spanish words, similar to those used in English to stem "running" to "run" or "addition" to "add"? Thanks for your help.

Ross Wilkinson; Department of Computer Science; R.M.I.T.; GPO BOX 2476 V; Melbourne, 3001; AUSTRALIA; ACSnet: ross@goanna.cs.rmit.OZ; INTERNET: ross%goanna.cs.rmit.OZ.AU; JANET: ross%au.oz.goanna@uk.ac.ukc; BITNET: ross%goanna.cs.rmit.OZ.AU@relay.cs.net; UUCP: ..!uunet!goanna.cs.rmit.OZ.AU!ross; Phone: +61 3 660 3310; Fax: +61 3 660 1617

**********************************************************

IV. PROJECT WORK

IV.B.1.
Fr: Susann Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN This item is not available from University Microfilms International ADGC1-53768.
AU PEREZ SALINAS, ISABEL.
TT THE SPANISH CONTRIBUTION TO THE CIRCULATING LITERATURE IN THE INTERNATIONAL MEDICAL COMMUNITY DURING 1927-1932. INVENTORY, THESAURUS, BIBLIOMETRICS AND PROSOPOGRAPHY.
TI LA APORTACION ESPANOLA A LA LITERATURA CIRCULANTE EN LA COMUNIDAD MEDICA INTERNACIONAL DURANTE EL PERIODO 1927-1932. INVENTARIO, THESAURUS, BIBLIOMETRIA Y PROSOPOGRAFIA.
LG SPANISH.
IN Universitat de Valencia (Spain) Ph.D. 1990, 1403 pages.
DE History of Science.
IS ISBN: 84-370-0605-8.
PU SERVICIO DE PUBLICACIONES, UNIVERSITAT DE VALENCIA, C. DE LA NAVE, 2, 46003 - VALENCIA, SPAIN.
AB The increase of the scientific activity and the appearance of a new dynamics in the communication between scientists are the main factors that have led to the complex situation of information in the contemporary scientific world. The main purpose of the present work is the study of the Spanish contribution to the circulating publications in the international medical community from 1927 to 1932, through the bibliographic repertory Quarterly Cumulative Index Medicus. We have mainly made use of the following four methods: bibliographic description or external documentary analysis, the semantic-documentary analysis, bibliometric methods, and those of prosopography. In the bibliographic inventory we have collected all the works published in Spain, both with Spanish and foreign authors, as well as those published outside Spain by Spanish scientists that have been indexed in the first twelve volumes of the repertory studied. The first part of the inventory collects the papers (in all 5118); the second part collects the books and brochures (80 in all); the third part is an alphabetically classified author index. The thesaurus is, at the moment, the most effective tool in vocabulary control for information retrieval systems. In this work we show an alphabetical list of descriptors and keywords and their hierarchical relationships that forms a subject catalogue.
The bibliometric study contains a first part dedicated to descriptive statistics; a second one with the bibliometric analysis that shows the authors' productivity (LOTKA law), the authors' collaboration in the Spanish publications, and the scattering of the scientific publications (BRADFORD law); a third paragraph in which the subject distribution is compared with the mortality statistics in Spain. A prosopography of authors with twenty or more publications is made.

AN University Microfilms Order Number ADG90-33775.
AU HSIEH-YEE, INGRID P. Y.
TI THE SEARCH TACTICS OF NOVICE AND EXPERIENCED SEARCHERS: A COMPARATIVE STUDY.
IN The University of Wisconsin - Madison Ph.D. 1990, 243 pages.
DE Library Science. Information Science.
AB The study compared how experienced and novice searchers conduct online searches by focusing on their search tactics. Specifically the effects of subject knowledge and search experience on the use of the search tactics were examined. In an experimental setting, thirty-two experienced searchers and thirty novice searchers searched a practice question and two test questions in the ERIC database on the DIALOG system. To address the issue of order effects, half of each subject group received a question in their subject area first, while the other half received a question outside their subject area first. Nine tactics were identified from the literature and ten pretests, and operationalized for the study. Data were collected by protocols, transaction logs, and observation forms. Compared to novice searchers, experienced searchers included more similar concepts, tried out more logical combinations of search terms and used more system devices to expand or reduce retrieval results. When searching an unfamiliar topic, besides using more of the above tactics, they also relied more on the thesaurus for term selection and looked for more terms offline than novice searchers.
When searching a familiar topic, experienced searchers relied more on their own terminology than when searching an unfamiliar topic. But in searching an unfamiliar topic, they included more similar concepts, monitored their search more, manipulated their terms more, checked the thesaurus more, and found more search terms before going online than in searching a familiar topic. When searching a familiar topic, novice searchers monitored their search more than experienced searchers. When searching an unfamiliar topic, they were found to rely more on their own terminology than experienced searchers; knowledge had no effect on novice searchers' use of search tactics.

AN University Microfilms Order Number ADG91-06285.
AU KANE-ESRIG, YANA.
TI INFORMATION RETRIEVAL AND ESTIMATION WITH AUXILIARY INFORMATION.
IN Cornell University Ph.D. 1990, 317 pages.
DE Statistics. Information Science.
AB In the first part of this work we construct a new method for selecting from a document collection documents relevant to a user's query. We assume that the documents and the query are indexed by terms, that the documents and the terms are represented by vectors in a document-term space, and that vectors of closely related documents and terms are close to each other in that space. We model the relevance of terms and documents as a probability density over the document-term space. The density is high over the areas of the space containing the vectors of the documents relevant to the query and low over the areas containing the vectors of the non-relevant documents. We use Bayes's rule to compute the density. The relevance density can incorporate a priori knowledge about the user's interests, and the user's query and feedback. We state the desired behaviors of the method, propose two candidate densities and show that they have the desired properties.
Tests of the proposed method and of the existing method (the vector averaging method) on two collections show that the proposed method requires much less computing. It performs significantly better than vector averaging in some cases and equally well with vector averaging in other cases. In the second part, we discuss estimating the mean of a Gaussian density when the observations in the sample are evaluated by an expert who tells us whether each observation is "typical" or "unusual", i.e. whether the probability density is high or low over each observed value. We start by assuming that the expert's judgements are correct, then extend the model to include a positive probability that the expert makes mistakes. We present an optimal (lowest variance) unbiased weighted average estimator and a trimmed mean-like unbiased estimator. When the expert is reliable, the variances of both estimators are lower than that of the sample mean X. The variance of the optimal weighted average is always less than or equal to that of X. We develop procedures for computing the MLE and prove consistency results. We also discuss convergence of posterior densities and existence of reproducing priors in the case of Bayesian estimation.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
Clifford Lynch   lynch@postgres.berkeley.edu or calur@uccmvsa.bitnet
Nancy Gusack     ncgur@uccmvsa.bitnet
Mary Engle       engle@cmsa.berkeley.edu or meeur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues. To access back issues presently, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET.
To get a specific issue listed in the Index, send the message GET IR-L LOG ***, where *** is the month and day on which the issue was mailed, to LISTSERV@UCCVMA.BITNET. These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST. The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.