Information Retrieval List Digest 207 (April 4)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-207

IRLIST Digest   ISSN 1064-6965   April 4, 1994
Volume XI, Number 14   Issue 207
**********************************************************
I. QUERIES
   1. Comparison of Two Noun Phrases
III. NOTICES
   A. Publications
      1. Philosophy HyperTextBook
   B. Meetings
      1. COMPMED '94
      2. ASIS SIG/CR
      3. SS on Advanced Broadband Communications '94
      4. IFIP: High Performance Networking '94
IV. PROJECTS
   A. Abstracts
      1. IR-Related Dissertation Abstracts
**********************************************************
I. QUERIES

I.1.
Fr: Y. Shum
Re: Comparison of 2 Noun Phrases

Hi there,

I would truly appreciate it if anyone could tell me of any methods that could be used to determine whether 2 noun phrases are equivalent, given that the words in the phrases have been tagged with their part of speech.

I'm interested in this because I've written a crude program that extracts noun phrases from a text. However, too many phrases are being extracted, and I'm trying to get a 'frequency of noun phrases'. To do this I will need a comparison algorithm. The naive way to do it is to consider 2 noun phrases equal if any of the nouns in the phrases are equal. This easily leads to the situation where phrase A is equal to phrase B, phrase A is equal to phrase C, but phrase B is not equal to phrase C.

Thanks a lot!

**********************************************************
III. NOTICES

III.A.1.
Fr: Jeff Iverson
Re: Philosophy HyperTextBook

MINNEAPOLIS, MN -- March 21, 1994 -- Jeff Iverson, a developer of educational software and tools for developers, is now shipping Philosophy HyperTextBook 3.0.

The HyperTextBook represents a new way of presenting information for educational purposes. By clicking on keywords (either in a special keyword list or in the body copy of the text) the user can instantly jump to a related piece of information.
The HyperTextBook allows a user to type in a word or string and search for any other linked articles. Iverson's proprietary linking technology has been in use since early 1988, but these new products represent the first major use of this technology in information publishing.

HyperTextBooks are intended to be used as a complement to an existing curriculum, rather than to function as a replacement for other resources. "A typical HyperTextBook is made up of information 'nuggets' on a particular topic," says Jeff Iverson. He continues, "the user can explore many different paths between these information nuggets and, in the process, learn about related items that they might not have searched for in the first place."

The first HyperTextBook available is "Philosophy," which Iverson calls "a collection of information nuggets on different philosophers and schools of thought, dynamically linked so the user can navigate through the information in any way they choose via hypertext links." The current version of Philosophy incorporates text and pictures, although Iverson says, "One of the capabilities inherent in the HyperTextBook is the ability to link to other media in the future, such as video discs, animation files and other forms of presentation technology."

The product is initially offered in HyperCard 2.2 Stack or Stand-Alone format, and Iverson is exploring the possibilities of developing Windows-compatible versions. The software will run on any Macintosh running System 6.0 or above.

FOR COMPLETE INFORMATION AND ORDERS, CONTACT:
Orders may be sent to Jeff Iverson, 2800 Selkirk Dr., C-104, Burnsville MN 55337-5662, or purchase orders may be faxed to (612) 890-8166. For more information, Jeff can be reached via e-mail at j5rson@aol.com or at (612) 890-8292.

**********
III.B.1.
Fr: mwitten@chpc.utexas.edu
Re: COMPMED '94

FINAL PROGRAM ANNOUNCEMENT
FIRST WORLD CONGRESS ON COMPUTATIONAL MEDICINE AND PUBLIC HEALTH
24-28 April 1994
Hyatt on the Lake, Austin, Texas

Over 200 speakers will be presenting work in a variety of application areas related to medicine and public health. Registration is still open for attendees. Registration details and/or a copy of the schedule-at-a-glance or schedule-in-detail may be requested by sending an email request to compmed94@chpc.utexas.edu, by calling 512-471-2472, or by faxing 512-471-2445. We will be happy to fax/send a copy to anyone who requests it.

The conference proceedings will appear as a series of volumes published by World Scientific. If you are interested in submitting a paper for the proceedings, please contact mwitten@chpc.utexas.edu or call 512-471-2457.

The overwhelming response to this congress has already justified having a second world congress in the future. The tentative schedule is to have it in 3 years. If you are interested in participating in the 2nd World Congress on Computational Medicine and Public Health, please contact:

Dr. Matthew Witten
Congress Chair
mwitten@chpc.utexas.edu

**********
III.B.2.
Fr: Ray Schwartz
Re: ASIS SIG/CR's 5th Classification Research Workshop

5th ASIS SIG/CR Classification Research Workshop
QUESTIONS, CONTROVERSIES AND CONCLUSIONS IN CLASSIFICATION RESEARCH
October 15, 1994

The American Society for Information Science Special Interest Group on Classification Research (ASIS SIG/CR) invites submissions for the 5th ASIS Classification Research Workshop, to be held at the 57th Annual Meeting of ASIS in Alexandria, VA. The workshop will take place Sunday, October 16th, 1994, 8:30 a.m. - 5:00 p.m. ASIS '94 continues through Thursday, October 20th.
The CR Workshop is designed to be an exchange of ideas among active researchers with interests in the creation, development, management, representation, display, comparison, compatibility, theory, and application of classification schemes. Emphasis will be on semantic classification, in contrast to statistically based schemes. Topics include, but are not limited to:

- Warrant for concepts in classification schemes
- Concept acquisition
- Basis for semantic classes
- Automated techniques to assist in creating classification schemes
- Statistical techniques used for developing explicit semantic classes
- Relations and their properties
- Inheritance and subsumption
- Knowledge representation schemes
- Classification algorithms
- Procedural knowledge in classification schemes
- Reasoning with classification schemes
- Software for management of classification schemes
- Interfaces for displaying classification schemes
- Data structures and programming languages for classification schemes
- Image classification
- Comparison and compatibility between classification schemes
- Applications such as subject analysis, natural language understanding, information retrieval, expert systems.

The CR Workshop welcomes submissions from various disciplines. Those interested in participating are invited to submit a short (1-2 page single-spaced) position paper summarizing substantive work that has been conducted in the above areas or other areas related to semantic classification schemes, and a statement briefly outlining the reason for wanting to participate in the workshop. Submissions may include background papers as attachments.

Participation will be of two kinds: presenter and regular participant. Those selected as presenters will be invited to submit expanded versions of their position papers and to speak to those papers in brief presentations during the workshop.
All position papers (both expanded and short papers) will be published in proceedings to be distributed prior to the workshop.

Submissions should be made by email, or diskette accompanied by paper copy, or paper copy only (fax or postal), to arrive by May 15, 1994.

COMPLETE INFORMATION AVAILABLE FROM (or send submissions to):
*Raya Fidel, Graduate School of Library and Information Science, University of Washington, FM-30, Seattle, WA 98195; Internet: fidelr@u.washington.edu; Phone: 206-543-1888; Fax: 206-685-8049*

**********
III.B.3.
Fr: Juan Quemada Vives
Re: 2nd Int'l. Summer School on Advanced Broadband Communications

SS'94
Second International Summer School on Advanced Broadband Communications
July 11-15 1994

This year the School will be distributed to at least four different and geographically distant sites and will constitute a unique event joining a thorough presentation of ABC (Advanced Broadband Communications) with its real use and demonstration. It will include tutorials, in-depth lectures, panels and active syndicate sessions covering the most relevant topics of ABC, including:

- Cell based technologies (ATM, Frame Relay, SMDS)
- Access Networks: Mobility, LANs, ATM
- Network interconnection
- Corporate and Virtual Private Networks
- Cost Modeling
- Systems Engineering and ABC
- Management of ABC Systems
- Multimedia Interfaces and Applications
- Entertainment and ABC.

Most important, the 1994 Summer School itself will be a demonstration of ATM-based broadband communications, applications and services, where the results from different RACE projects will be shown in real operation.

The 1994 Summer School will be a distributed event where a multimedia CSCW (Computer Supported Cooperative Work) tele-education application will join the lecture rooms of the different physical sites into a single virtual lecture room, such that lecturers and participants lose the sense of physical separation and work together with full interaction.
At least the following sites will join SS'94:

- ETSI Telecomunicacion, Madrid-Spain (Central Site)
- University of Aveiro, Aveiro-Portugal
- CET, Aveiro-Portugal
- TIDSA, Madrid-Spain.

FOR COMPLETE INFORMATION, CONTACT:
Dept Ing. Telematica (SS'94)
ETSI Telecomunicacion
E-28040 Madrid, Spain
tf: +34 1 3367332, fax: +34 1 3367333, email: SS94@dit.upm.es

**********
III.B.4.
Fr: Christophe Diot
Re: High Performance Networking (HPN) '94 program

5th International IFIP Conference on HIGH PERFORMANCE NETWORKING
ADVANCED PROGRAM
June 27 - July 1, 1994
Museum of Art, Grenoble (France)

FOREWORD:
This workshop belongs to the series started in 1987 in Aachen, followed by Liege in 1988, Berlin in 1991 and Liege in 1992. HPN '94 is the fifth event of this series sponsored by IFIP WG 6.4. It aims to be an international forum where researchers from industry and universities present and discuss the evolution of high-speed networking and computing in private and public environments. The conference targets new mechanisms, protocols, services and architectures derived from the needs of emerging distributed multimedia applications, as well as from the requirements of the new communication environment.

TUTORIAL A: Host Interface Design for High Speed Networks, by Bruce Davie (Bellcore-USA).
TUTORIAL B: Multimedia in Operating and Communication Systems, by Ralf Steinmetz and Ralf Guido Herrtwich (IBM Heidelberg-D).
TUTORIAL C: LOTOS-Based Protocol Engineering, by Guy Leduc (University of Liege-B).
TUTORIAL D: High Speed Networks and Multimedia Communications, by Fouad Tobagi (Starlight Inc., Stanford University-USA).

CONFERENCE PRELIMINARY PROGRAM:

OPENING SESSION:
Chairpersons: Guy Pujolle, PRiSM, France; Jean-Pierre Verjus, IMAG, France; Serge Fdida, MASI, France; Christophe Diot, INRIA Sophia Antipolis, France.
KEYNOTE ADDRESS: Radu Popescu-Zeletin, GMD-FOKUS, Germany

Session A: High-Performance LANs and MANs
Session B: MAC Performance in High-Speed Networks
Session C: Routing Issues in High-Performance Networks
Session D: Enhanced Transport and Synchronization
Session E: Quality of Service and Architecture
Session F: Resource Management
Session G: Traffic Analysis and Performance
Session H: Internetworking
Session I: Multimedia Communication Systems
Session J: Performance Tools

TECHNICAL PROGRAM INQUIRIES CAN BE ADDRESSED TO:
Professor Serge FDIDA
Laboratoire MASI - CNRS
4 Place Jussieu - 75252 PARIS cedex 05 - FRANCE
Ph : +33 1 44 27 30 58 - Fx : +33 1 44 27 62 86 - e.mail : fdida@masi.ibp.fr

FOR COMPLETE INFORMATION, CONTACT:
HPN '94 Conference Administrative Office
Catherine HICTER-PLOTTIER and Martine RETTER
DESTINATION CONGRES
BP 56 - 38242 MEYLAN cedex - FRANCE

INRIA
2004 route des Lucioles
BP 93, 06902 Sophia Antipolis FRANCE
Ph: (33) 93 65 77 56 Fx: (33) 93 65 77 65
E-mail: christophe.diot@sophia.inria.fr

**********************************************************
IV.A.1.
Fr: Susanne M. Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090.
Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADG93-10319.
AU CLOYD, C. BRYAN.
TI THE EFFECTS OF KNOWLEDGE AND INCENTIVES ON INFORMATION SEARCH IN TAX RESEARCH TASKS.
IN Indiana University Ph.D. 1992, 148 pages.
SO DAI v54(01), SecA, pp232.
DE Business Administration, Accounting. Education, Psychology.
AB A large portion of the behavioral accounting literature attempts to identify the determinants of performance in accounting-related tasks. For the most part, these studies investigate simple relationships between performance and various predictor variables (e.g., experience, knowledge, abilities, or incentives). However, behavioral accounting researchers have increasingly noted that task difficulty, knowledge, ability, and incentives all interact to affect task performance. The goal of the present study is to enhance our understanding of these interactive relationships. This goal is addressed by presenting a general model of the determinants of task performance and then testing the implications of that model in an experiment involving the information search phase of a tax research task. In this study, 64 tax professionals engaged in a computerized tax research task involving a complex partnership tax issue. The computer created a comprehensive record of each subject's information search activity, from which measures of subjects' effort, search effectiveness, and search efficiency were obtained. The results of this experiment generally support the predictions of the model.
The data show that experience is positively related to knowledge, and that knowledge is positively related to search effectiveness and search efficiency. Support for the prediction that knowledge is positively related to effort depends upon which of two measures of effort is considered. The data also indicate that the incentive manipulation had a significant, positive effect on subjects' effort levels and search effectiveness scores. Finally, the data support the interaction hypotheses that the effects of incentives on effort and search effectiveness are positively related to knowledge.

AN University Microfilms Order Number ADG93-14192.
AU ANWAR, TAREK M.
TI ON THE APPLICATION OF CONCEPTUAL CLUSTERING FOR KNOWLEDGE DISCOVERY IN DATABASE SYSTEMS.
IN The University of Florida Ph.D. 1992, 130 pages.
SO DAI v54(01), SecB, pp321.
DE Computer Science. Artificial Intelligence.
AB Knowledge discovery is the nontrivial and efficient extraction of high-level patterns from databases. In this thesis we propose an approach to knowledge discovery in terms of discovery of class formations from a set of existing instances of data. The emphasis in this approach is on reasoning at the instance level, with which we are able to generate classes and a schema that more accurately and precisely reflect the actual data stored, rather than ad hoc class formations. Specifically, three attribute-based purpose-directed conceptual clustering techniques are presented. These techniques are not confined to any particular semantic data model but can easily be adapted to any data model. A conceptual clustering technique for database integration is introduced by which schema generation occurs by conceptually clustering the underlying data instances of several (possibly heterogeneous) databases. This process is guided by specifying a context in the form of a clustering seed. By modifying the clustering seed we are able to vary the schema generated to accommodate different user groups' needs.
A second conceptual clustering technique is utilized for schema evolution and exception handling. This technique determines instance similarity through a graph-matching procedure. A class description is subsequently generated and is incorporated into the schema. Moreover, this technique provides a class taxonomy and thereby an intensional answer to a query that is conceptually more informative than a set of instances. A key issue in schema evolution is that of exception detection, in which this algorithm aids. Finally, a conceptual clustering technique for imprecise querying is detailed. In contrast to numeric or fuzzy-set approaches, which ultimately rely on some distance metric and threshold to process such queries, conceptual clustering retrieves instances which are structurally, semantically, and pragmatically similar to the query even though they may not match the requirements exactly. The query processor has both a deductive and an inductive component. The deductive component finds precise matches in the traditional sense, and the inductive component identifies ways in which imprecise matches may be considered similar. Ranking on similarity is done using the database taxonomy, by which similar instances become members of the same class. Relative similarity is determined by depth in the taxonomy. Overall, this thesis has applied machine learning techniques (learning from observation) that are based on psychological principles of category formation to the difficult problems of schema design, evolution and integration, and imprecise querying in database systems.

**********************************************************
IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.
Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
Clifford Lynch  calur@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
Nancy Gusack    ncgur@uccmvsa.ucop.edu or nancy.gusack@ucop.edu
Mary Engle      meeur@uccmvsa.ucop.edu or mary.engle@ucop.edu

The IRLIST Archives is now set up for anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCCVMA (Bitnet) or LISTSERV@UCCVMA.UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.