Information Retrieval List Digest 017 (April 9, 1990) URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-017 IRLIST Digest April 9, 1990 Volume VII Number 11 Issue 17 ********************************************************** I. NOTICES: A. Meetings announcements/Calls for papers 1. 7th Israeli Conference on Artificial Intelligence and Computer Vision, Tel-Aviv, December 26-27, 1990 B. Publications announcements 1. New Journal Announcement - Call for Papers II. QUERIES: B. Requests for information 1. Wanted: pointers to algorithms and complexity results IV. PROJECTS: B. Bibliographies 1. Selected IR-related dissertation abstracts (This will be an ongoing submission as there are hundreds of abstracts.) ********************************************************** I. NOTICES I.A.1. Fr: Yishai Feldman Re: 7th Israeli Conference on Artificial Intelligence and Computer Vision, Tel-Aviv, December 26-27, 1990 Call For Papers 7th Israeli Conference on Artificial Intelligence and Computer Vision Tel-Aviv, December 26-27, 1990 The conference is the joint annual meeting of the Israeli Association for Artificial Intelligence, and the Israeli Association for Computer Vision and Pattern Recognition, which are affiliates of the Israeli Information Processing Association. Papers addressing all aspects of AI and Computer Vision, including, but not limited to, the following topics, are solicited: * Image Processing and Pattern Recognition * Image Analysis and Computer Vision * Visual Perception * Applications * Robotics * Inductive inference * Knowledge acquisition * AI and education * AI languages * Automated reasoning * Cognitive modeling * Expert systems * Natural language processing * Planning and search * Knowledge theory * Logics of knowledge Submitted papers will be refereed by the program committee, listed below. Authors should submit 4 copies of a full paper or an extended abstract. Accepted papers will appear in the conference proceedings. 
Papers should be received by the conference co-chairmen at the following addresses by June 1st, 1990. Authors will be notified of accepted papers by September 1st, 1990.
Vision: Prof. Alfred Bruckstein, 7th IAICV, Dept. of Computer Science, Technion, 32000 Haifa, Israel, Freddy@cs.technion.ac.il
AI: Dr. Yishai Feldman, 7th IAICV, Dept. of Applied Mathematics, The Weizmann Institute of Science, 76100 Rehovot, Israel, Yishai@wisdom.weizmann.ac.il
Program Committee:
AI:
Mira Balaban, Ben Gurion University
Moshe Ben Bassat, Tel Aviv University
Rina Dechter, Technion
Ehud Gudes, Ben Gurion University
Tamar Flash, The Weizmann Institute
Daniel Lehmann, Hebrew University
Marc Luria, Technion
Yoram Moses, The Weizmann Institute
Uzzi Ornan, Technion
Jeff Rosenschein, Hebrew University
Uri Schild, Bar Ilan University
Ehud Shapiro, The Weizmann Institute
Vision:
Zvi Meiri, IBM
Amnon Meizles, Ben Gurion University
Shmuel Peleg, Hebrew University
Moshe Porat, Technion
Micha Sharir, Tel Aviv University
Shimon Ullman, The Weizmann Institute
Michael Werman, Hebrew University
Haim Wolfson, Tel Aviv University
Yehezkel Yeshurun, Tel Aviv University
**********************************************************
I.B.1. Fr: Elan Moritz <71620.3203@compuserve.com> Re: New Journal Announcement - Call for Papers
please post and distribute to investigators of:
==========================
* human and machine intelligence
* knowledge systems
* computational linguistics
* natural languages
* theoretical biology
* population genetics
* ethology / cultural ecology
* information storage and transfer
* learning and teaching systems
* philosophy of knowledge
* philosophy and history of science
++++++++++++++++++++++++++++++++++++++++++++++++++
NEW JOURNAL ANNOUNCEMENT & CALL FOR PAPERS
...................
. JOURNAL of IDEAS .
...................
IMR, BOX 16327, PANAMA CITY, FLORIDA 32406, USA ++++++++++++++++++++++++++++++++++++++++++++++++++ The Institute for Memetic Research [IMR] is publishing a new journal called 'Journal of Ideas'. The main purpose of the journal is to provide an archival forum for discussion of the genesis, evolution, competition and death of 'ideas' and 'memes'. The term 'idea' is one that requires careful discussion. The original term 'meme' [pronounced: meem] is a conceptual construct introduced by Richard Dawkins to describe units of cultural transmission and imitation. IMR uses the term 'meme' as a point of departure for an area we call 'Memetic Science'. Ultimately, 'meme' requires further definition and clarification. The primary thesis of Memetic Science is that 'ideas' and 'memes' are entities that are functionally similar to biological genes in their ability to replicate, mutate, and undergo natural selection. What are sought in Memetic Science are: rigorous quantitative foundations, theory, and experimental methodology and measurements. The history of the study of 'ideas'-as-entities-by-themselves is ancient. From Plato & Aristotle, through Locke, Hume, Descartes, Kant and modern philosophers, we have a variety of qualitative theories and speculations. Logic theory, philology, modern linguistics, and computer oriented technologies, have provided a start in the area of understanding structures, grammars, and truth conditions of sentences and small collections of sentences. Population geneticists and biologists have provided initial models for spread of 'cultural' constructs. These models incorporate the techniques of dominant/recessive allele spreading in genetic pools and epidemiological approaches. Some models use compound constructs of 'gene + culture' elements as the particulate elements that replicate and propagate. While the contributions from these diverse disciplines are useful, there are needs for systematic, robust and, most importantly, quantitative approaches. 
Present day applications of Memetic Science include both human aspects of replication, mutation, competition, spread and death of ideas and memes, as well as their electronic analogs. The 'electronic memes' are beneficial messages, reusable subroutines, programs that are freely [or surreptitiously] copied and modified, computer viruses, worms, trojan horses, etc. To address the needs stated above the Institute for Memetic Research is launching the Journal of Ideas (first issue printing, September 1990). The detailed statement of scope, pivotal references, subscription information, and instruction for authors is available upon written request from: Elan Moritz, Editor Journal of Ideas The Institute for Memetic Research Box 16327 Panama City, Florida 32406, USA email address: INTERNET: 71620.3203@compuserve.com or INET: 71620.3203@compuserve.com The Journal of Ideas will appear [initially] quarterly, and will contain the following regular sections: 1) Invited papers, 2) Research Contributions, 3) Rapid Publications and 4) Discussion of persistence and spread of existing 'Major Ideas'. Only previously unpublished papers will be accepted. Page charges for invited papers will be waived. Brevity, and jargon accessible to interdisciplinary researchers are encouraged. Standard transfer of copyrights is required prior to printing. To encourage participation and discussion of this new area, IMR/JoI will experiment with two categories of papers. One category will be strictly reviewed and refereed, while another will be reviewed by the editor but not refereed. Non-refereed papers will be so marked; they will have the advantages of rapid publication and possible disadvantages of archival of errors. To expedite processing, authors can immediately submit papers prepared according to a standard professional society [e.g. IEEE, AIP, APS] journal manuscript format. Three copies are required. 
On an experimental basis, authors who would like to submit papers for rapid publication using email may submit papers using the internet address [INTERNET: 71620.3203@compuserve.com]. These papers should consist of ASCII text only, with equations built up carefully using ASCII text. Papers submitted through email should be followed up by submitting a written version via regular postal channels. Readers of this message are encouraged to suggest topics and individuals [including themselves] to be considered for invited papers. ********************************************************** II. QUERIES II.B.1. Fr: Gerard Ellis Re: Pointers to algorithms and complexity results I am interested in any pointers to the literature on the following problems. Problem: Given the set S, a p.o. R on S defined by an oracle, and an element top, give efficient implementations of the following set of operations: (i) member(u, S) (ii) specializations(u, S); returns the ordered set T = {v in S| v R u} An obvious method would be to construct a Hasse diagram representing R on S, and search from top, selecting a node of which u is a specialization from adjacent nodes, at each step, resulting in a deterministic search for u. This solves (i) and also (ii) given u in S. Associated Problem: Given a set S, a p.o. R on S defined by an oracle, and an element top, give efficient implementations of : (i) construct(S) - construct a Hasse diagram representing R on S. (ii) insert(u) - insert u into the Hasse diagram. The second problem is of particular interest since elements of S will arrive interspersed with queries (member, specializations). Of particular interest are examples where elements of S are complex objects (such as graphs) which minimize queries to the oracle. Applications include: pattern-recognition, content-addressable memory, machine learning, AI/DB systems. Thank you, Gerard. 
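The search from top described in the query above can be illustrated with a minimal Python sketch. It is only one naive reading of the problem, not a pointer to a published algorithm: the class name HasseDiagram, the method names, and the divisibility oracle used in the example are all assumptions made for illustration, elements are assumed to be hashable, and insert() spends two oracle calls per stored element, so it does not address the request to minimize oracle queries.

```python
from collections import deque


class HasseDiagram:
    """Cover graph of a finite partial order known only through an oracle."""

    def __init__(self, top, leq):
        self.top = top
        self.leq = leq                # leq(v, u) is True when v R u (v specializes u)
        self.children = {top: set()}  # node -> nodes it covers
        self.parents = {top: set()}   # node -> nodes that cover it

    def insert(self, u):
        """Naive insert: compares u with every stored node (hypothetical baseline)."""
        if u in self.children:
            return
        uppers = {v for v in self.children if self.leq(u, v)}
        lowers = {v for v in self.children if self.leq(v, u)}
        # Immediate parents: nodes above u with no child that is also above u.
        new_parents = {v for v in uppers if not (self.children[v] & uppers)}
        # Immediate children: nodes below u with no parent that is also below u.
        new_children = {v for v in lowers if not (self.parents[v] & lowers)}
        # An old cover p -> c with c below u below p is no longer a cover.
        for p in new_parents:
            for c in new_children:
                self.children[p].discard(c)
                self.parents[c].discard(p)
        self.children[u] = set(new_children)
        self.parents[u] = set(new_parents)
        for p in new_parents:
            self.children[p].add(u)
        for c in new_children:
            self.parents[c].add(u)

    def member(self, u):
        """Descend from top, stepping to any child of which u is a specialization."""
        node = self.top
        if not self.leq(u, node):
            return False
        while node != u:
            step = next((c for c in self.children[node] if self.leq(u, c)), None)
            if step is None:
                return False
            node = step
        return True

    def specializations(self, u):
        """Return {v in S | v R u} for u already in S, by walking down from u."""
        seen = {u}
        queue = deque([u])
        while queue:
            for c in self.children[queue.popleft()]:
                if c not in seen:
                    seen.add(c)
                    queue.append(c)
        return seen


# Example oracle (an assumption): v specializes u when v is a multiple of u,
# so 1 plays the role of top.
d = HasseDiagram(top=1, leq=lambda v, u: v % u == 0)
for x in (12, 2, 6, 3, 4):
    d.insert(x)
print(d.member(6), d.member(5))
print(sorted(d.specializations(2)))
```

The descent in member() never needs to backtrack: if u is in S and lies below the current node, every child that lies above u still has a chain of covers leading down to u, so any such child is a safe next step.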
email: ged@batserver.cs.uq.oz.au (overseas) ged@batserver.cs.uq.oz (within Australia) Gerard Ellis Key Centre for Information Technology University of Queensland, Qld, 4072 Australia ********************************************************** IV. PROJECTS IV.B.1. Fr: Susanne Humphrey Re: Selected IR-Related Dissertation Abstracts Compiled by: Susanne M. Humphrey National Library of Medicine Bethesda, MD 20894 The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using the BRS Information Technologies retrieval service, of the Dissertation Abstracts International (DAI) database produced by University Microfilms International. Included are the UMI order number; author; university, degree, and, if available, number of pages; title; DAI subject category chosen by the author of the dissertation; and abstract. Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided at the end of the abstract. The dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission. AN University Microfilms Order Number ADG88-24752. AU KAMEL, MAGDI NESSIM. IN University of Pennsylvania Ph.D. 1988, 168 pages. TI REDUNDANCY: AN APPROACH TO THE EFFICIENT IMPLEMENTATION OF SEMANTIC INTEGRITY ASSERTIONS AND VIEWS. DE Information Science. 
AB Frequently executed queries are an important concept of database management systems. They include alerters, triggers, semantic integrity assertions and views. Unfortunately, executing a frequently executed query usually requires evaluating a complex expression that requires accessing several database relations, thus causing severe efficiency problems. In this thesis, we propose a procedure to efficiently implement frequently executed queries in the context of evaluating semantic integrity assertions and views. The method is based on storing redundant data that represent an intermediate stage of evaluating the query. These sets are used to efficiently construct the results of the query, yet they are easy to maintain. The method is a two-step technique. The first step identifies the candidate redundant data using heuristics from query processing research. The second step selects the redundant data based on the user's query and update pattern, the profiles of the database relations, the query optimizer of the database management system, and the characteristics of the physical data storage devices used. We also develop an analytical model to compare our approach, called semi-materialization, with other approaches in the context of view processing and to identify conditions when our approach is most attractive. Other approaches include query modification, under which the view is not materialized at all, and full materialization, under which the view is kept materialized at all times. Our model indicates that the results are most sensitive to the frequency of updates (P), the selectivity of the view predicate ($f\sb{v}$), the selectivity of the query predicate ($f\sb{q}$), the number of tuples per update (l), and the size of the relations (N). For select-project-join expressions, and except for high values for P, both full and semi-materialization perform better than query modification. 
Higher values of P, $f\sb{v}$ or l or lower values of $f\sb{q}$ or N favor semi-materialization over full materialization. For general view definitions, as would be found with complex triggers, alerters and integrity assertions, semi-materialization is superior to both full materialization and query modification for a wide range of parameter settings. Finally, the results of the analytical model are verified through an implementation using the database management system INGRES. AN University Microfilms Order Number ADG88-25746. AU LATHROP, ANN. IN University of Oregon Ph.D. 1988, 249 pages. TI ONLINE INFORMATION RETRIEVAL AS A RESEARCH TOOL IN SECONDARY SCHOOL LIBRARIES. DE Information Science. Library Science. Education Technology. AB The rapid expansion of electronic online information databases makes it appropriate to investigate the use of this new research tool in secondary school libraries. Interviews with leaders in the field and a literature review provided a rationale for the 100-item survey completed in May, 1987, by 73 librarians in 19 states. Eight research questions addressed instructional objectives, student and staff training and applications, database selection, funding, and equipment. Instructional goals for the online information retrieval program were categorized as being at an awareness level, a skills level, or a research tool level. Critical thinking skills, location and use of materials cited, and curriculum integration received more emphasis at the research tool level. Librarians provided most student instruction, generally four hours or less for 20% or fewer of all students. Teachers received some training at 27 of the 73 schools, but only 16 librarians reported more than 10% faculty involvement. Most librarians conducted online searches while students observed or actively participated. 
Students, librarians, and teachers perceived the program as being somewhat more effective and successful at the 29 schools where some students conducted independent searches. Content appropriate to the curriculum was the most important database selection criterion. Magazine Index was the most used of 91 databases listed and DIALOG was the primary database vendor. Lack of teacher interest and cooperation was the major difficulty reported in achieving instructional goals. Other difficulties were inadequate library staff time, equipment, and funding. Few schools had written policies and record keeping was minimal. A recommendation was made that librarians, teachers, and administrators cooperatively develop a formal written policy with specific instructional goals and evaluation criteria. Fifty-eight libraries considered the online information retrieval program to be "very useful" or "useful to certain groups of students." Most librarians were enthusiastic and 46 planned to expand the program. Recommendations for further research focused on investigating equal access to online information databases, the impact of CD-ROM technology, and the establishment of criteria to evaluate online information retrieval programs in school libraries. AN University Microfilms Order Number ADG13-34329. AU FORD, VICTORIA. IN University of Nevada, Reno M.A. 1988, 135 pages. TI THE INFORMATION NETWORK: LINKING LIBRARIES, JOURNALISTS AND JOURNALISM SCHOOLS. DE Journalism. Information Science. AB Freedom of the press in America carries the responsibility for providing the information free citizens need for self-government. Yet, the American media has responded to criticism of its news quality by conducting readership surveys--studying what the public wants rather than how to improve reporters' skills. The purpose of this study was to explore one research method used by journalists. A questionnaire was designed to determine whether Nevada newspaper journalists use libraries for research. 
The hypothesis was that reporters with the shortest deadlines would use the library the least. The results were the opposite. Nevada print media journalists with the shortest deadlines use the library the most. This study concludes that a combination of time and distance--or convenience--determines whether or not journalists will use libraries for research. The results were used to design a public relations program which would create a first-of-its-kind information network among a university library, a journalism school and a newspaper. AN This item is not available from University Microfilms International ADG05-64115. AU ALLEN, BRYCE LAVERNE. IN The University of Western Ontario (Canada) Ph.D. 1988. TI BIBLIOGRAPHIC AND TEXT-LINGUISTIC SCHEMATA IN THE USER-INTERMEDIARY INTERACTION. DE Library Science. AB Information systems require input from their users in order to perform information retrieval. In many systems, that input is provided by interaction between users and intermediaries. The way users understand and express their information needs may be affected by the cognitive structures (schemata) by which they have organized their knowledge of the search topic, or by the schemata introduced in the questions which intermediaries ask. The bibliographic and text-linguistic schemata studied in this research are related to two ways of thinking about textual materials. The bibliographic schema leads to an emphasis on elements of bibliographic description: authors, titles, and subject keywords. The text-linguistic schema leads to an emphasis on the structural components of texts: in this case the Purpose, Methodology, Findings and Discussion found in scientific report articles. The first experiment introduced these two schemata at the point of knowledge acquisition, and in the user-intermediary interaction. 
When presented through intermediary questions, the bibliographic schema led to short responses with small numbers of subject keywords, while the text-linguistic schema led to long responses with large numbers of subject keywords. Open questions, which presented no specific schema, produced responses which were longer than responses to bibliographic questions but shorter than responses to text-linguistic questions. There was no evidence that the schema introduced at the time of knowledge acquisition had an effect on statements of information need. The second experiment introduced the text-linguistic schema through questions posed on supplementary online search forms in a working information retrieval environment. Responses replicated the findings from the first experiment in terms of overall length of responses. In the case of one searcher, searches based on the information supplied in response to text-linguistic questions used significantly more words in the search expression, and achieved lower precision. Questions posed by intermediaries introduce cognitive structures which affect the details contained in statements of information need presented by users of information systems. A schema based on text-linguistic categories can be useful in eliciting more details from users, but these additional details do not necessarily result in better information retrieval. AN University Microfilms Order Number ADG13-33478. AU HILL, HELEN KATHERINE. IN Texas Woman's University M.A. 1987, 103 pages. TI METHODS OF ANALYSIS OF INFORMATION NEEDS. DE Library Science. Information Science. AB Librarians hold diverse opinions about the appropriate methods to use in the analysis of information needs. This thesis defines nine types of methods used: "ought-to-need" statements, potential need statements, demand studies, goal-oriented analyses, demographic studies, user studies, lifestyle investigations, required output studies, and effectiveness studies. 
"Ought-to-need" statements, demand studies, and goal-oriented analyses are analyzed in detail to answer the following questions: What is the justification for its use? What applications does it have? What are its strengths and weaknesses? When is it appropriately used? Finally, this paper provides a comparison of the appropriate application of these methods. AN University Microfilms Order Number ADGDX-83250. AU IBRAHIM, FARID MOHAMMED SELIM. IN Loughborough University of Technology (United Kingdom) Ph.D. 1988, 439 pages. TI A SYNTACTICALLY-BASED PREPROCESSOR FOR A LIMITED EXPERIMENTAL ARABIC DOCUMENT RETRIEVAL SYSTEM. DE Library Science. Language, Linguistics. Information Science. AB Available from UMI in association with The British Library. The research reported in this thesis is about the description and discussion of an experimental document retrieval system for Arabic texts, using linguistic methods of analysis. Specifically, Arabic presents difficulties for the efficient retrieval of information because it is an agglutinative language, thus rendering the stop list method (as commonly used for English texts) near to useless. The system has two stages: the creation of the retrieval lexicon and the search program. The latter is done using a limited on-line searching which allows for partial matching. The former has four stages. Texts in the form of abstracts are processed by morphological analysis, syntactic analysis, term extraction and term manipulation modules. Each stage produces a new representation of the source text. The morphological analyser attempts to recognise any prefixes and/or suffixes attached to the words in the corpus being processed. It also assigns grammatical labels specifying the part of speech using a contextual analysis of individual words (assuming that the inflectional features of a word are indicative of its syntactic role). An augmented transition network grammar and parser have been built for this purpose. 
The same parser has been developed and used in the second stage which is syntactic analysis. It takes as its input the representation of the text created by the morphological analysis, and uses a separate grammar file defined as a recursive transition network. The aim of syntactic analysis is the definition of the relations of the different constituents in the individual sentences being processed. The information added by the morphological and syntactic analysers is used in the term extraction module. This module uses a traversal algorithm to negotiate the structure built by syntax, utilising a set of rules, kept on a file, specifying the type of constructs needing to be selected. The manipulative module generates new entries for each term selected by rotating its elements. The system has been implemented using the Hull V-mode Pascal compiler available on the L.U.T. Prime System. It has been tested using 40 abstracts selected from a conference proceedings in the field of computer applications. (Abstract shortened by UMI.) AN University Microfilms Order Number ADGDX-83967. AU WAKELIN, A. W. IN Council for National Academic Awards (United Kingdom) Ph.D. 1988, 222 pages. TI A DATABASE QUERY LANGUAGE FOR OPERATIONS ON GRAPHICAL OBJECTS. DE Computer Science. AB Available from UMI in association with The British Library. The motivation for this work arose from the recognized inability of relational databases to store and manipulate data that is outside normal commercial applications (e.g. graphical data). The published work in this area is described with respect to the major problems of representation and manipulation of complex data. A general purpose data model, called GDB, that successfully tackles these major problems is developed from a formal specification in ML and is implemented using the PRECI/C database system. 
This model uses three basic graphical primitives (line segments, plane surfaces--facets, and volume elements--tetrons) to construct graphical objects and it is shown how user designed primitives can be included. It is argued that graphical database query languages should be designed to be application specific and the user should be protected from the relational algebra which is the basis of the database operations. Such a base language (an extended version of DEAL) is presented which is capable of performing the necessary graphical manipulation by the use of recursive functions and views. The need for object hierarchies is established and the power of the DEAL language is shown to be necessary to handle such complex structures. The importance of integrity constraints is discussed and some ideas for the provision of user defined constraints are put forward. AN University Microfilms Order Number ADGD--83706. AU WOOD, MURRAY IAN. IN University of Strathclyde (United Kingdom) Ph.D. 1988, 222 pages. TI COMPONENT DESCRIPTOR FRAMES: A REPRESENTATION TO SUPPORT THE STORAGE AND RETRIEVAL OF REUSABLE SOFTWARE COMPONENTS. DE Computer Science. AB Available from UMI in association with The British Library. Requires signed TDF. There is a growing interest in the reuse of previously designed, coded, tested and documented software, primarily for reasons of economy and reliability. Although the potential of software reuse has been acknowledged for decades its practice has been inhibited by cultural, managerial and technical problems. It would seem that now, in the 1980's, improvements in software technology combined with the high costs of software production have made the widespread reuse of software a real possibility. This thesis concentrates on one of the technical problems associated with widespread software reuse, the effective storage and retrieval of reusable software components. 
In particular it presents a representation to support software component storage and retrieval termed Component Descriptor Frames. These are stereotypical structures that represent software components in terms of the relationships between the conceptual function that the software component performs and the conceptual objects that occur in the context of that function. The thesis reviews software reuse in general, placing software component reuse clearly in its context. Information Retrieval techniques are then considered in terms of their appropriateness to support software component storage and retrieval. Following this review of related work the original contribution of the thesis, the Component Descriptor Frame representation, is discussed. The thesis describes how it derived from Conceptual Dependency, a theory used to represent the 'meaning' of natural language. The representation is justified in terms of its capability to satisfy the fundamental requirements of such a software component representation. A component retrieval system that has been developed using Component Descriptor Frames as a basis is then described. The important role of a supportive user-interface in component retrieval, and information retrieval in general, is emphasised. The thesis concludes with an argued evaluation of the representation and the capabilities it offers. It is suggested that Component Descriptor Frames strike a more appropriate balance than feasible alternatives between the identified needs for meaningful representation, partial matching, ease of use and general applicability. ********************************************************** IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550. 
Send subscription requests to: LISTSERV@UCCVMA.BITNET Send submissions to IRLIST to: IR-L@UCCVMA.BITNET Editorial Staff: Clifford Lynch lynch@postgres.berkeley.edu calur@uccmvsa.bitnet Mary Engle engle@cmsa.berkeley.edu meeur@uccmvsa.bitnet Nancy Gusack ncgur@uccmvsa.bitnet The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues. These files are not to be sold or used for commercial purposes. Contact Mary Engle or Nancy Gusack for more information on IRLIST. The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.