Information Retrieval List Digest 090 (November 4, 1991)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-090

IRLIST Digest
November 4, 1991
Volume VIII, Number 47
Issue 90

**********************************************************
I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. Combinatorial Pattern Matching, Tucson, Arizona, April 29-May 1, 1992
II. QUERIES
   B. Requests for Information
      1. TOPIC
IV. PROJECT WORK
   B. Bibliographies
      1. IR-Related Dissertation Abstracts
**********************************************************

I. NOTICES

I.A.1.
Fr: Udi Manber
Re: Call for Papers - Combinatorial Pattern Matching

CALL FOR PAPERS
COMBINATORIAL PATTERN MATCHING
Tucson, Arizona, USA, April 29 - May 1, 1992

The third symposium on Combinatorial Pattern Matching will be held in Tucson, Arizona, USA, on April 29 - May 1, 1992. (This is the end of the week preceding the Theory of Computing conference to be held in Victoria, BC, May 3-5.) The first two meetings were held at the University of Paris in 1990 and at the University of London in 1991.

Papers in all areas related to combinatorial pattern matching and its applications will be considered, including, but not limited to, string algorithms, pattern recognition, applications in molecular biology, text searching, information retrieval, symbolic computing, and data compression.

Two types of submissions will be considered.
   1. Original research unpublished elsewhere.
   2. Survey of important results, especially recent ones.

Proceedings, probably in the Springer-Verlag series on Computer Science, will be published.

The goal is to keep the attendance at about 50-70 participants to allow ample time for informal interaction. Thus, attendance will be by invitation only. If you are interested in attending the meeting without submitting a paper, contact us via the address below. Some financial assistance for students will probably be available.
To submit a paper, please send 10 copies of an extended abstract (5-10 pages) to the address below by December 16, 1991. This is a strict deadline. Papers arriving after that date will not be considered. Please include e-mail address if possible. Authors will be notified by February 10, 1992.

Udi Manber, CPM 92
Dept. of Computer Science
University of Arizona
Tucson, AZ 85721
USA
e-mail address: udi@cs.arizona.edu.

Program Committee: A. Apostolico, M. Crochemore, Z. Galil, G. Gonnet, D. Gusfield, D. Hirschberg, U. Manber, E. Myers, F. Tompa, and E. Ukkonen.

**********************************************************

II. QUERIES

II.A.1.
Fr: Lou Rosenfeld
Re: IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADG91-07195.
AU HINDES, MARY ANN.
TI THE SEARCH PROCESSES AND ATTITUDES OF STUDENTS ACCESSING CD-ROM RESOURCES: A CASE STUDY OF TWO HIGH SCHOOL MEDIA CENTERS.
IN University of Georgia Ed.D. 1990, 175 pages.
DE Education, Technology. Information Science.
AB The purpose of this study was to examine the search processes of students accessing CD-ROM resources and to describe their attitudes concerning this approach to computer-based information retrieval. This study employed a case study research design to examine student use of CD-ROM technology in two high school media centers. Four media specialists and twenty-two students were observed and interviewed during March 1990. The researcher observed fifty-five instances of students accessing the available CD-ROM resources. Three categories emerged from the analysis of the data concerning the students' search processes and two categories emerged from the data concerning students' attitudes towards computer-based searching of CD-ROM resources. The three categories of the Search Process actually reflect a process, viz., initiation of the search, modifications to the search, and selection of citations. The Search Initiation phase of the search process is described by the following properties which reflect the students' approach: the Shotgun Approach, the Educated Guess, and the Resourceful Approach. The second phase of the search process, called Search Modification, addressed that part of the search involving students' manner of Broadening the Search or Narrowing the Search. The third phase of the search process was the students' Selection of Citations. Students selected citations based on the Availability of the full-text articles in their local media centers or solely on the seeming Aptness of the information. From the data concerning students' attitudes towards the use of CD-ROM resources, two major categories developed. Students felt Comfortable using CD-ROM technology and were Enthusiastic about this form of computer-based information retrieval.
Students considered the systems easy to access and developed a personalized relationship with the computer. Enthusiastic responses were generated by the sense of independence students derived from using CD-ROM resources. Two general conclusions of the study were (a) students experienced difficulty performing searches, a difficulty exacerbated by their failure to use the tutorials or help screens included in the search systems; and (b) despite the difficulties encountered and their failure to take advantage of the full capabilities of the systems, students liked this form of information retrieval, preferring it to print-based searching.

AN University Microfilms Order Number ADG91-06437.
AU KIM, CHONGTAI.
TI DEVELOPMENT OF A NATURAL LANGUAGE INTERFACE TO A SLEEP EEG DATABASE.
IN The University of Florida Ph.D. 1990, 175 pages.
DE Engineering, Electronics and Electrical. Artificial Intelligence.
AB A natural language (NL) system called SEEGER (Sleep EEGer) has been developed to retrieve information from a sleep database without requiring any database expertise or computer programming knowledge on the part of the operator. A new knowledge-based structure for building an NL system has been developed and designed into SEEGER. It provides a more convenient method for incorporating domain knowledge than previous approaches to building NL systems. In addition, it provides a more efficient database mapping method for those NL requests that rely on bottom-up parsing more often than top-down parsing. SEEGER handles ungrammatical input, makes inferences, and disambiguates some complicated word senses. It engages in an interactive dialogue to correct spelling errors, to clarify ambiguous or incomplete queries, and to enter a synonym when a word is not in the lexicon. The database is then searched using a retrieval query generated by SEEGER.
The sleep database has been designed to contain information about subject records, channel recordings, epoch summaries of waveform occurrences in EEG/EOG/EMG, sleep stage parameters, and night parameters. It has been implemented with the commercially available dBASE IV database management system. SEEGER is now in the experimental stage, and it was evaluated with four users familiar with sleep data analysis. It gave correct answers at a performance level of 63%, achieving the preliminary design goal. The entire program has been written in about 10,000 lines of LISP. The system performance was analyzed in terms of syntactic variations, semantic complexities, and interactive dialogues, based on the test results. These analyses show that future improvements can lead to a sleep NL system performing at the level of a data processing technician, but it would take at least two man-years of effort, assuming the developers already possess some background in artificial intelligence programming. SEEGER has been implemented on a personal computer; it requires 6 megabytes of system memory and 15 megabytes of hard disk storage for the LISP environment and dBASE IV. Finally, guidelines for future development are suggested.

AN University Microfilms Order Number ADG91-09215.
AU VASSILAS, NIKOLAOS.
TI PERFORMANCE OF NEURAL ASSOCIATIVE MEMORIES FOR CHARACTER RECOGNITION AND ASSOCIATIVE DATABASE RETRIEVAL.
IN University of Minnesota Ph.D. 1990, 105 pages.
DE Engineering, Electronics and Electrical. Computer Science. Artificial Intelligence.
AB Neural associative memories have been extensively used in applications that perform classification or some kind of input-output mapping. The main advantages of neural associative memories over conventional computer memories are their robustness in noisy environments, generalization, and fast retrieval capabilities.
In this work we use a robust associative memory model to investigate the performance, capacity and saturation effects of three of the most popular associative memory paradigms, namely: back-propagation networks, correlation matrix memories and generalized inverse memories. The comparison studies use binary character recognition and spelling correction as application domains. Our analytic and experimental modeling shows that correlation matrix memories used as classifiers exhibit overall superiority over other training rules. The correlation matrix memories are then used as the building block of a new neural model for semantic categorization and concept formation. The proposed model represents a hierarchical two-level associative memory, where the first level memory stores categories (concepts) of semantically related items, and each of the second level memories stores items (prototypes) forming a given concept. Memory construction (or learning) is based on supervised learning. Besides its ability to form and store semantic categories of input patterns, the proposed hierarchical model offers substantial scaling advantages over traditional single-level associative memory systems in terms of software implementations. Experimental results (assuming fixed prototypes) for binary character recognition and spelling correction applications demonstrate excellent quality of associative retrieval and superior scaling properties of the proposed model. Finally, we propose an iterative scheme for adaptive prototype formation from repeated presentations of noisy inputs, and present theoretical analysis and proof of convergence. We also find optimal adaptation parameters that provide the best convergence rate.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.
Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
   Clifford Lynch   lynch@postgres.berkeley.edu or calur@uccmvsa.bitnet
   Nancy Gusack     ncgur@uccmvsa.bitnet
   Mary Engle       engle@cmsa.berkeley.edu or meeur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues.

To access back issues presently, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOG ***, where *** is the month and day on which the issue was mailed, to LISTSERV@UCCVMA.BITNET.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack or Mary Engle for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.