Information Retrieval List Digest 178 (September 7, 1993)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-178

IRLIST Digest ISSN 1064-6965
September 7, 1993 Volume X, Number 34 Issue 178

**********************************************************

I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. Hypertext '93 Conference Advanced Program
IV. PROJECT WORK
   C. Abstracts
      1. IR-Related Dissertation Abstracts

**********************************************************

I. NOTICES

I.A.1.
Fr: Muru Palaniappan
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADGDX-96897.
AU ROLFE, RICHARD.
TI INFORMATION RESOURCE UTILISATION: ACCESSIBILITY BASED ON CONCERNS.
IN University of Kent at Canterbury (United Kingdom) Ph.D. 1991 307 pages.
SO DAI v53(05), SecA, pp1306.
DE Information Science. Computer Science.
AB Available from UMI in association with The British Library. The utilisation of the information resource is a key requirement of an effective information system within the organisation. With computing technology generating increasingly complex applications but with an increasing spread of computing contactability through the organisation, an inherent conflict is presented which can only worsen as hypermedia and multimedia applications are brought to the organisation. This work identified user participation as the key ingredient in improving information resource utilisation. A two pronged approach was taken: initially a field based investigation of the utilisation of a complex information system within a large organisation and secondly the development of a user interface environment to tackle the issues raised. The fieldwork provided evidence to suggest the level of accessibility to Health Service information left wide scope for development. Several variables were identified as having a key relationship with information resource utilisation. Of these complexity predominated, both in terms of the database structures in the application and the functionality provided through the interface itself. The prototype focused on this complexity issue directly, providing a user driven interpretative environment in which both user and organisation based information can be combined to form a dynamic, user based semantic model. Such a graphical representation evolves in parallel with user requirements, enabling the user to manipulate his/her own view of the application for the expression of database queries. The basis for this interpretation is the generalised conceptual model which is captured from the data modeller during the initial application data modelling process.
This view forms the basis for user accessibility, acting as an initial interpretation which the user tailors towards a personal interpretation using 'concerns'. The provision of such an environment promotes user accessibility and hence utilisation, being personalised by the individual to the task at hand.

AN University Microfilms Order Number ADG92-27393.
AU UNNAVA, VASUNDHARA.
TI QUERY PROCESSING IN DISTRIBUTED DATABASE SYSTEMS.
IN The Ohio State University Ph.D. 1992, 128 pages.
SO DAI v53(05), SecA, pp1306.
DE Information Science. Business Administration, Management. Computer Science.
AB During the last decade distributed database management systems (DDBMS) have become important information processing systems supporting business activities of geographically decentralized organizations. Since data files are distributed at several locations in a DDBMS, user queries that reference remote files introduce substantial data communication delays. The efficiency of a DDBMS is determined by the speed with which these queries are processed. This dissertation deals with the optimization of query processing in a relational DDBMS. Our objective is to develop a methodological approach to the design of query processing optimizers. The algorithms developed in this dissertation will be valuable tools in the design of a DDBMS. The first chapter of the dissertation describes a distributed database environment and the importance of query processing in such an environment. The second chapter presents a detailed literature survey. In the third chapter, a special case of queries, star queries, is defined. The requirement for new algorithms to improve system efficiency is demonstrated. Heuristic procedures using a greedy approach and a branch and bound solution procedure are proposed. An efficient lower bounding technique is implemented in the branch and bound procedure.
The results of extensive computational experiments indicate that the proposed procedures process star queries effectively. Also, the greedy algorithm proves to be insensitive to errors in the selectivity estimation procedures. The fourth chapter concentrates on the problem of a generalized star query. The problem, an extension of the star query, is significantly harder than the star query problem because its optimization model includes an additional operation of joining files. Heuristic and branch and bound solution methods are developed. Extensive computational testing supports the practical feasibility of the solution methods. Also, rigorous analysis of the generalized star query algorithm in a dynamic mode shows that the static version is robust to changes in the procedures used to estimate selectivity. The fifth chapter investigates the use of heuristics in the general query processing problem. Algorithms for the problem of general query processing, which consider various methods of selecting a semijoin in producing a query processing strategy, are proposed. Computational experiments are designed to assess the performance of the proposed algorithms relative to existing algorithms. The analyses show that the proposed algorithms outperform the existing algorithms. Chapter six summarizes our work and also discusses future research directions in the field of query processing in DDBMS.

AN University Microfilms Order Number ADGMM-67170. 9212.
AU BALLARTE, SANDRA.
TI EXTENDING A TEXTUAL QUERY LANGUAGE THROUGH PROCEDURES.
IN Queen's University at Kingston (Canada) M.Sc. 1991, 122 pages.
SO MAI V30(04) pp1373.
DE Computer Science.
IS ISBN: 0-315-67170-X.
AB Many retrieval systems have recognized the importance of the hierarchical structure of documents. In fact, considerable interest exists in developing textual databases that capture this structure.
One example is the MAESTRO project which provides a textual query language to retrieve and link information contained in structured documents. Although most of these models provide retrieval capabilities, they require more specialized operations and user-defined functions. Moreover, new trends have emerged from the area of hypertext and these should also be incorporated into these text models. It seems necessary to enhance the database concepts for structured documents. This thesis investigates an extension to a text model based on Stonebraker's extension to a relational database management system for supporting database procedures as full-fledged database objects. To implement this enhancement, the Maestro query language is extended to include procedures attached to objects. This work gives the specifications necessary to incorporate these procedures into the query language and explores the different applications of this extension in the area of hypertext and text retrieval.

AN University Microfilms Order Number ADG92-35535.
AU DOMESHEK, ERIC ANDREW.
TI DO THE RIGHT THING: A COMPONENT THEORY FOR INDEXING STORIES AS SOCIAL ADVICE.
IN Yale University Ph.D. 1992, 520 pages.
SO DAI V53(07), SecB, pp3592.
DE Computer Science. Artificial Intelligence.
AB Within the Artificial Intelligence paradigm and Case-Based Reasoning (CBR), an important problem is specifying how appropriate old cases are to be retrieved so as to assist in coping with new situations. This has been called the indexing problem. This dissertation describes an indexing system supporting retrieval of past cases as advice about everyday social problems; the system has been implemented in the Abby lovelorn advising program. Abby currently contains over 500 indices, each giving access to a story of some social problem. One major result of indexing such a large case-base was the design and validation of a comprehensive vocabulary for describing social situations.
Much of this dissertation can be read as a reference work detailing representational components that have proved useful in this context, and that are likely to serve well in others. In addition, this work helps to clarify and systematize a methodology for constructing component theories in support of indexing. Two other points are emphasized throughout the discussion of Abby's indexing system: (1) indices are descriptions of problems and their causes, couched in a vocabulary centered on intentional causality, and (2) indices fit a fixed format that allows reification of identity and thematic relationships as features, and thus, parallel associative retrieval sensitive to important aspects of input situations. Abby answers several of the central questions that any indexing system must address, and offers some advantages over less restrictive systems.

AN University Microfilms Order Number ADGMM-67125. 9212.
AU FAUSTINO, ANGELIQUE F.
TI TOWARDS A FRAMEWORK FOR ORGANIZING TEXT DOMINATED DATABASES.
IN Queen's University at Kingston (Canada) M.Sc. 1991, 162 pages.
SO MAI V30(04) pp1376.
DE Computer Science. Information Science.
IS ISBN: 0-315-67125-4.
AB The view of text as a complex object has been widely recognized for decades, particularly in recent years since the evolution of the hypertext paradigm. This work describes a unified conceptual framework for organizing structured text, which serves as a general-purpose data model for describing a wide range of text-oriented applications. The framework is unified in the sense that the same language constructs can be used to describe both the structure of text and management related issues of how documents are to be collected and organised. The proposed framework offers several new features, many of which are particularly useful for organizing hypertext space. The data model is based on an extension of the network database model, thereby allowing non-hierarchical links between objects to exist.
A single data structure (i.e. fan) models all objects in the database, thus providing a uniform approach to the treatment of all database objects. An inheritance mechanism for object classes serves both as a tool for organizing information and for ease of database maintenance. Constraint mechanisms for controlling semantic and structural integrity of objects are included. The framework provides an associated set of operators precisely defined in the form of an extended relational algebra. In contrast to other models, the proposed framework preserves the simplicity of the relational model, while providing powerful, semantically rich modelling tools for representing structured text applications.

AN University Microfilms Order Number ADGMM-64619. 9212.
AU MATHESON, STEVEN ALBERT.
TI PERFORMANCE EVALUATION OF A DISTRIBUTED INDEX INFORMATION RETRIEVAL MODEL.
IN Dalhousie University (Canada) M.Sc. 1990, 106 pages.
SO MAI V30(04) pp1383.
DE Computer Science.
IS ISBN: 0-315-64619-5.
AB Information retrieval systems have traditionally used the inverted index model and have been implemented on serial machines. Modern databases are becoming increasingly large and so is the demand to search through them quickly and efficiently. The use of massively parallel computers has been suggested to more efficiently handle these larger databases; however, the effectiveness of such systems has been questioned. A distributed index information retrieval system is proposed as an alternative to parallel machines. Based on the popular inverted index model supporting Boolean queries, the proposed model utilizes distributed processing to improve the query processing time. The database index used by the system is partitioned and distributed over several components of a specialized multicomputer system.
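
[Editorial illustration, not part of the abstract.] For readers unfamiliar with the idea, the following is a minimal Python sketch of a partitioned inverted index answering Boolean queries. It is not the thesis's design: the abstract does not say how the index is partitioned, so the sketch assumes partitioning by document, and the names IndexPartition and PartitionedIndex, the modulo placement rule, and the sample documents are all hypothetical. The partitions are queried one after another here; on a multicomputer each would presumably run on its own node.

```python
from collections import defaultdict


class IndexPartition:
    """One partition: an inverted index over a subset of the documents."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def _posting_sets(self, terms):
        return [self.postings.get(t.lower(), set()) for t in terms]

    def search_and(self, terms):
        """Documents in this partition that contain every query term."""
        sets = self._posting_sets(terms)
        return set.intersection(*sets) if sets else set()

    def search_or(self, terms):
        """Documents in this partition that contain at least one query term."""
        sets = self._posting_sets(terms)
        return set.union(*sets) if sets else set()


class PartitionedIndex:
    """Hypothetical coordinator that spreads documents over partitions."""

    def __init__(self, num_partitions):
        self.partitions = [IndexPartition() for _ in range(num_partitions)]

    def add(self, doc_id, text):
        # Assumed placement rule: document id modulo the partition count.
        self.partitions[doc_id % len(self.partitions)].add(doc_id, text)

    def search_and(self, terms):
        # A document lives wholly in one partition, so each partition can
        # evaluate the AND locally; the coordinator only merges the answers.
        result = set()
        for partition in self.partitions:
            result |= partition.search_and(terms)
        return result

    def search_or(self, terms):
        result = set()
        for partition in self.partitions:
            result |= partition.search_or(terms)
        return result


if __name__ == "__main__":
    index = PartitionedIndex(num_partitions=3)
    sample = {
        0: "parallel information retrieval",
        1: "inverted index retrieval",
        2: "distributed index model",
        3: "boolean retrieval model",
    }
    for doc_id, text in sample.items():
        index.add(doc_id, text)
    print(sorted(index.search_and(["index", "retrieval"])))
    print(sorted(index.search_or(["distributed", "boolean"])))
```

Partitioning by document keeps each Boolean query local to a partition, at the cost of sending every query to every partition; partitioning by term would trade that for cross-partition merging of posting lists.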
This model is intended to provide a faster query processing time, a higher rate of throughput, and the ability to be extended for term weighting and term proximity, allowing the user to form more effective queries. Simulations are used to compare the performance of this proposed model to the serially implemented model for variable and fixed sized queries with different numbers of partitions.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
   Clifford Lynch  calur@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
   Nancy Gusack  ncgur@uccmvsa.bitnet or ncgur@uccmvsa.ucop.edu
   Mary Engle  meeur@uccmvsa.bitnet

The IRLIST Archives is now set up for anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCCVMA (Bitnet) or LISTSERV@UCCVMA.UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack or Mary Engle for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.