Information Retrieval List Digest 265 (July 24, 1995)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-265

IRLIST Digest   ISSN 1064-6965
July 24, 1995
Volume XII, Number 28   Issue 265
**********************************************************
I. QUERIES
   1. Q- Text Retrieval XCMDs
III. NOTICES
   A. Publications
      1. AI: Special Issue on Relevance
   B. Meetings
      1. How We Do User-Centered Design & Evaluation of Digital Libraries: A Methodological Forum
IV. PROJECTS
   A. Abstracts
      1. IR-Related Dissertation Abstracts
**********************************************************
I. QUERIES

I.1.
Fr: Michael Hutchins
Re: Q- Text Retrieval XCMDs

I'm part of a team that has been working for several years developing a prototype in SuperCard to demonstrate the capabilities of hypermedia techniques for supporting access to technical reference manuals. One of the access paradigms that our system offers is Text Retrieval (just simple Boolean search, with wildcards, proximity, etc.). We get that functionality by incorporating a set of XCMDs from Knowledge Set Corp. (called SuperIndexer + SuperKRS).

We are looking either to buy several licenses to the K'Set product (even though it is now no longer on the market) or to replace its functionality with another product. We would very much appreciate any information either about the availability of licenses to the K'Set product or about other products (on the market, hopefully!) that provide similar functionality with an XCMD interface.

Michael Hutchins
pmh@draper.com
**********************************************************
III. NOTICES

III.A.1.
Fr: Russell Greiner
Re: AI: Special Issue on Relevance

Artificial Intelligence: An International Journal
Call for Papers
Special Issue on Relevance

Guest Editors: Russell Greiner, Devika Subramanian, Judea Pearl

With too little information, reasoning and learning systems cannot work effectively.
Surprisingly, too much information can also cause the performance of these systems to degrade, in terms of both accuracy and efficiency. It is therefore important to determine what information must be preserved, i.e., what information is "relevant."

There has been a recent flurry of interest in explicitly reasoning about relevance in a number of different communities, including the AI fields of knowledge representation, probabilistic reasoning, machine learning and neural computation, as well as communities that range from statistics and operations research to cognitive science. Members of these diverse communities met at the 1994 AAAI Fall Symposium on Relevance, to seek a better understanding of the various senses of the term "relevance," with a focus on finding techniques for improving the performance of embedded agents by ignoring or de-emphasizing irrelevant and superfluous information. Such techniques will clearly be of increasing importance as knowledge bases, and learning systems, become more comprehensive to accommodate real-world applications.

To help consolidate leading research on relevance, the "Artificial Intelligence" journal is devoting a special issue to this topic. We are now seeking papers on (but not restricted to) the following topics:

[Representing and reasoning with relevance:] reasoning about the relevance of distinctions to speed up computation, relevance reasoning in real-world KR tasks including design, diagnosis and common-sense reasoning, use of relevant causal information for planning, theories of discrete approximations.

[Learning in the presence of irrelevant information:] removing irrelevant attributes and/or irrelevant training examples, to make feasible induction from very large datasets; methods for learning action policies for embedded agents in large state spaces by explicit construction of approximations and abstractions.
[Relevance and probabilistic reasoning:] simplifying/approximating Bayesian nets (both topology and values) to permit real-time reasoning; axiomatic bases for constructing abstractions and approximations of Bayesian nets and other probabilistic reasoning models.

[Relevance in neural computational models:] methods for evolving computations that ignore aspects of the environment to make certain classes of decisions, automated design of topologies of neural models guided by relevance reasoning based on task class.

[Applications of relevance reasoning:] Applications that require explicit reasoning about relevance in the context of IVHS, exploring and understanding large information repositories, etc.

We are especially interested in papers that have strong theoretical analyses complemented by experimental evidence from non-trivial applications.

Authors are invited to submit manuscripts conforming to the AIJ submission requirements by 11 Sept 1995 to

   Russell Greiner
   Siemens Corporate Research
   755 College Road East
   Princeton, NJ 08540-6632
   (609) 734-3627

or

   Devika Subramanian
   Department of Computer Science
   5141 Upson Hall, Cornell University
   Ithaca, New York 14853
   (607) 255-9189

Papers will be subject to standard peer review. The first round of reviews will be completed and decisions mailed by 11 December 1995. The authors of accepted and conditionally accepted manuscripts will be required to send revised versions by 1 March 1996. The special issue is tentatively scheduled to appear around the end of 1996.

SIGNIFICANT DATES:
   11/Sep/95: Manuscripts due
   11/Dec/95: First round decisions
    1/Mar/96: Revised manuscripts due
   end of 96: Special issue appears (tentative)

**********

III.B.1.
Fr: Ed Fox
Re: How We Do User-Centered Design & Evaluation of Digital Libraries: A Methodological Forum

HOW WE DO USER-CENTERED DESIGN AND EVALUATION OF DIGITAL LIBRARIES: A METHODOLOGICAL FORUM

Thirty-Seventh Allerton Institute
Allerton Park and Conference Center
Monticello, IL
October 29-31, 1995

CHAIRPERSON: Ann Bishop, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

COORDINATOR: Emily Ignacio, University of Illinois at Urbana-Champaign

THE GROWTH OF DIGITAL LIBRARIES: Improvements in information technologies and increased support directed towards our national information infrastructure have led to the development of a wide range of digital libraries. Academic, special, and public libraries are implementing online systems that provide their patrons with electronic access to catalogs and a variety of information resources. NASA is developing online collections of images and data for scientists and engineers. Museums are digitizing their collections and making them available on the Internet. Individual scientific communities are building collaboratories to support their work and communication. Publishers are experimenting with the creation of digital archives of their journals and books. And individuals and groups from all walks of life are providing network access to information they create.

USER-CENTERED DESIGN AND EVALUATION: THE CHALLENGE: Digital libraries pose fascinating socio-technical challenges for understanding and supporting use. Design and evaluation are complicated by the newness of the systems, their ability to integrate a range of functions that were previously designed and evaluated separately, the heterogeneity of their user population, the physically distributed nature of usage, the ability to fragment and rearrange previously integrated documents and images, and the rapid versioning available digitally.
Appropriate user-centered research objectives, measures, and methods for the digital library are just beginning to emerge.

PURPOSE AND NATURE OF THE FORUM: "How We Do User-Centered Design and Evaluation of Digital Libraries: A Methodological Forum" will bring together an interdisciplinary group of experts in the design and study of information systems, experts in user-centered research in traditional libraries, and researchers currently involved in a wide range of digital library projects. The goal of the forum is to present both the range of user-centered methods available for studying digital libraries and rationales for choosing amongst them; we also want to look ahead to new methods and developments and map out what the challenges are.

This methodological forum will give all attendees an opportunity to share their expertise, experiences, and ideas with their peers in a relaxed environment. In order to facilitate informal discussion and extensive interaction, attendance will be limited to about 70 invited participants. All attendees must submit brief discussion documents in advance of the forum and will be expected to participate actively in panels and small group working sessions.

Forum activities will be devoted to issues such as:

* What are appropriate measures for gauging digital library outcomes at the individual, group, institutional, and global levels?
* How can we best incorporate knowledge of user needs and behavior in designing digital library interactions and interfaces?
* What do we need to know about how people use electronic texts and how can we gain this knowledge and apply it to the development of digital libraries?
* What can we learn from studies of traditional library use?
* How can we develop an understanding of the computerization of library work that will help as digital systems are incorporated into current institutional practices?
* How can we deal with the ethical, practical, and conceptual issues that arise in the remote observation of online (and offline) behavior on a very large scale?
* How do we foster effective communication among digital library designers, users, and social science researchers?

WHO WILL BE THERE?: Several individuals have already signed on to participate in the forum. They will present and discuss their work in sessions devoted to some of the issues noted above:

* William L. Anderson and Susan Anderson (Xerox Corporation): Socially Grounded Engineering for Digital Libraries
* John M. Carroll (Virginia Tech - VPI&SU): Exploring Scenario-Based Design in the Context of Digital Libraries
* Andrew Dillon (Indiana University): Studying the Use of Electronic Texts
* F. W. Lancaster (University of Illinois): A Consideration of Traditional Library and Information Science Approaches to Evaluation
* Gary Marchionini and Ben Shneiderman (University of Maryland): User-Centered Methods for Library Interface Design
* Annelise Mark Pejtersen (RISO National Laboratory, Denmark): Designing for Retrieval in Library Collections: Lessons from Book House
* S. Leigh Star (University of Illinois) and John R. Garrett (Corporation for National Research Initiatives): The Narrative as Information Tool
* Michael Twidale (Lancaster University, England): How to Study and Design for Collaborative Browsing in the Digital Library

IF YOU'D LIKE TO PARTICIPATE: To apply for the 1995 Allerton Institute, submit your discussion document by July 25, 1995. Email submissions of the discussion document, in ASCII format, are preferred; please include your name, job title, address, telephone and fax numbers, and email address. The Institute steering committee will review the discussion documents and respond to all applicants by August 15.

The discussion document should be from one to three pages in length and should describe your current research and interests: What has struck you the most in your work?
What are the most difficult problems you've run into? Comment also on your particular interests in attending this methodological forum: What do you hope to find out? What can you contribute?

The discussion documents will be circulated among participants prior to the forum and will be used to help focus and structure forum activities. Some participants may be invited to make presentations or facilitate working sessions. Participants who wish to contribute to a publication based on the forum will have the opportunity to submit revised versions of their discussion documents, copies of their forum presentations, individual or group commentaries on topics and issues that arose during the forum, or other relevant material (papers, research instruments, resource lists, etc.).

Accepted participants must cover their own travel costs and pay a registration fee of $120 to cover their room, meals, and materials. A registration form and further information will be supplied upon acceptance. A limited amount of funding is available for students or others who might require financial assistance. Include with your discussion document a brief explanation of your funding requirements if you would like to apply for support; we will try to help in the most urgent cases.

Discussion documents and requests for additional information should be addressed to:

   Allerton Institute
   Graduate School of Library and Information Science
   University of Illinois at Urbana-Champaign
   LIS Building
   501 E. Daniel St.
   Champaign, IL 61820-6211
   allerton@alexia.lis.uiuc.edu
   1-217-333-3281 (V)
   1-217-244-3302 (F)

IMPORTANT DATES:
   JULY 25: Applications (in the form of discussion documents) due
   AUGUST 15: Notification of acceptance sent to applicants; registration forms distributed to participants
   SEPTEMBER 15: Registration forms and conference payment due from all participants
   OCTOBER 29-31: Allerton Institute

FOR COMPLETE INFORMATION, CONTACT: Allerton Conference Administrator, Graduate School of Library and Information Science, 501 E. Daniel Street, Champaign, IL 61820. Telephone: (217) 333-3281. E-mail: allerton@alexia.lis.uiuc.edu
**********************************************************
IV. PROJECTS

IV.A.1.
Fr: Susanne M. Humphrey
Re: Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being of potential interest to the Information Retrieval (IR) community, resulting from a computer search, using the CDP/Online system, of the Dissertation Abstracts International (DAI) database produced by University Microfilms International (UMI).

Included are accession number (AN); author (AU); title (TI); degree, institution, year, number of pages (IN); UMI order number (DD); reference to the published DAI (SO); abstract (AB); one or more DAI subject descriptors chosen by the author (DE); thesis adviser (AR); and dates associated with the monthly update file (UP).

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-343-5299; fax: 313-973-1540. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.
Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN AAIMM88823
AU Jiang, Helen Hua.
TI A SAMPLE SELECTION METHOD FOR AN ADAPTIVE RETRIEVAL MODEL.
IN Masters Thesis (M.SC.)--THE UNIVERSITY OF REGINA (CANADA), 1993, 63p.
DD Order Number: AAIMM88823.
SO Masters Abstracts International. Volume: 33-01, page: 0213.
AB The main objective of an information retrieval system is to find the information items which are relevant to the user's request. An important problem in the design of an adaptive retrieval system is sample selection. In this study, based on the notion of partially supervised learning, a sample selection algorithm is used to choose appropriate samples for the construction of the query vector. Experiments were carried out to demonstrate the effectiveness of the proposed method.
DE Computer Science.
AR Wong, S K M.
IB 0-315-88823-7
UP 9501. Revised: 950127.

AN AAI9434202
AU Frank, Steven Merrill.
TI CATALOGING PARADIGMS FOR SPATIAL METADATA (DATA CATALOGING).
IN Thesis (PH.D.)--UNIVERSITY OF MAINE, 1994, 142p.
DD Order Number: AAI9434202.
SO Dissertation Abstracts International. Volume: 55-08, Section: B, page: 3427.
AB The transfer of information from paper based to electronic forms raises questions about how this information might be most easily and reliably found and accessed in a vast network of interconnected computers. This thesis examines specifically how spatial (geographic) information might be handled in such a network. It first examines the current state of spatial data cataloging.
It then looks at how spatial resources (in the forms of archived data sets, active databases, analytical software, and automated services) might be arranged in the National Information Infrastructure by developing limited scenarios of spatial resource interactions. Five possible paradigms (based on trends in current electronic access mechanisms) to catalog these scenarios are compared. Service protocols, conceptual models for developing intelligent computer to computer exchanges, were found to be the most pragmatic choice of possible cataloging paradigms. Two examples of spatial resource service protocols are developed and presented.
DE Engineering, General. Computer Science.
AR Onsrud, Harlan.
UP 9501. Revised: 950127.

AN AAI9433901
AU Osgood, Richard Earle.
TI THE CONCEPTUAL INDEXING OF CONVERSATIONAL HYPERTEXT (HYPERTEXT).
IN Thesis (PH.D.)--NORTHWESTERN UNIVERSITY, 1994, 267p.
DD Order Number: AAI9433901.
SO Dissertation Abstracts International. Volume: 55-08, Section: B, page: 3415.
AB Linear text limits an author's ability to satisfy the variety of knowledge needs and diverse interests of readers. To solve this problem, hypertext presents text in a non-linear arrangement linked by key phrases in the text so that readers can more easily find passages suited to their needs and interests. However, non-linear reading via hypertext creates two additional problems well-known to researchers in information science. Chief among them is the loss of coherence in reading hypertext linked passages. Also, typical hypertext indexing methods are overly syntactic and atheoretical. To address the coherence problem, this dissertation presents conversational reading and its ASK Michael implementation as a new way to structure hypertext. It describes how text questions can replace imbedded text phrases to better label links between passages and how the categories of a model of conversational coherence can better group these questions for easy reader selection.
To address the unprincipled indexing problem of hypertext, this dissertation describes a step-by-step method for conceptual indexing of hypertext. The question-based method employs a working representation of anticipated reader questions raised by a passage and questions for which the passage supplies answers. Links are generated by a computer assisted, manual matching process of questions raised with questions answered. The 2,000 indices of ASK Michael were generated by the question-based method. In situations where question matching might be impractical, a second conceptual indexing approach is proposed. Based on the AI techniques of frame representation, classification models, and simple inference procedures, a semi-automated indexing tool was developed and tested in several application environments. This research demonstrates the utility of combining the non-linear design of hypertext with a conversational model and principled conceptual indexing methods to create workable solutions to the problems of structuring and accessing large bodies of information.
DE Computer Science.
UP 9501. Revised: 950127.

AN AAI9502021
AU Carlyle, Allyson.
TI THE SECOND OBJECTIVE OF THE CATALOG: AN EVALUATION OF COLLOCATION IN ONLINE CATALOG DISPLAYS (INFORMATION RETRIEVAL).
IN Thesis (PH.D.)--UNIVERSITY OF CALIFORNIA, LOS ANGELES, 1994, 254p.
DD Order Number: AAI9502021.
SO Dissertation Abstracts International. Volume: 55-08, Section: A, page: 2192.
AB Research is presented that tests the extent to which the second of Charles Cutter's objects, or functions, of the library catalog is fulfilled in online catalogs. The second objective stipulates that a catalog should collocate the works of an author and the editions of a work. It is, in essence, a standard for display and arrangement of bibliographically related catalog records.
A survey of online catalogs is conducted in which "worst-case" searches are used to measure the extent to which records for the works of an author and the editions of a work are collocated. The main research question is: What is the effect of online catalog computer system variables such as match type, match extent, and filing order on collocation of bibliographically related items? Computer system variables that have a potential impact on arrangement of multiple records are defined. The effect of system variables is measured by various dependent variables that show how closely multiple-record displays collocate the works of authors and the editions of works. Worst-case searches are selected that represent real-life searches presenting various obstacles to an intelligently structured display. Results of the survey show that a cluster of independent variable values associated with character-string match type, including single-field match location, record grouping, consistent arrangement, and word-by-word filing order, collocate related records better than a cluster associated with keyword match type, including multiple-field match location, no record grouping, inconsistent arrangement, and numerical filing order. Catalog size has an effect on collocation in only about half the searches, in which case small catalogs collocate related work records better than medium catalogs, and medium catalogs collocate better than large catalogs. Author record sets are collocated in online catalogs more successfully than either work or superwork record sets.
DE Library Science. Information Science.
AR Svenonius, Elaine.
UP 9501. Revised: 950127.

AN AAI9500346
AU Welsh, Thomas M.
TI HYPERMEDIA INTERFACE DESIGN: THE EFFECTS OF LINK FILTERING AND LINK INDICATOR DIFFERENTIATION ON LEARNER EXPLORATION, PERFORMANCE AND PERCEPTIONS OF USABILITY.
IN Thesis (PH.D.)--INDIANA UNIVERSITY, 1994, 148p.
DD Order Number: AAI9500346.
SO Dissertation Abstracts International.
Volume: 55-08, Section: A, page: 2355.
AB This study investigated the effects of two hypermedia interface design strategies on learner exploration, perceptions of usability, and performance. These effects were compared in relation to two tasks involved in reading to prepare for writing an essay--comprehending information and locating information. The interface design strategies investigated were: (1) ability to manually filter link indicators (by hiding indicators leading to specific link types), and (2) ability to visually filter link indicators (by giving indicators leading to specific link types common visual appearances). Dependent measures were: learner exploration (including number of annotations accessed, amount of time spent reading annotations, and use of manual filtering capability); usability (including perceptions of how the interface design affected performance and using the computer program for the task); and performance (including amount of information from the hypermedia system integrated into an essay and essay cohesiveness and authoritativeness). Learners who could manually filter link indicators accessed fewer annotations and spent a greater amount of time reading annotations than those who could not during the information locating task only. Learners who could visually filter link indicators accessed fewer annotations and spent a greater proportional amount of time reading annotations than those who could not during both the comprehension and information locating tasks. Of those who could manually filter link indicators, those who could also visually filter link indicators used the manual filtering capability less than those who had no visual filtering capability during the information locating task. Those who could manually filter link indicators had more positive assessments of how the interface design affected their performance than those who could not.
Likewise, those who could visually filter link indicators had more positive assessments than those who could not. No significant differences were found in student perceptions of using the computer program for the task. No significant differences were found in the amount of information from the hypermedia system integrated into essays or in essay cohesiveness and authoritativeness. Findings indicated that hypermedia systems for academic computing should include interface configuration tools that allow the learner to specify attributes reflecting task-specific needs.
DE Education, Technology. Information Science.
AR Duffy, Thomas M.
UP 9501. Revised: 950127.
**********************************************************
IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests and submissions to: NCGUR@UCCMVSA.UCOP.EDU

Editorial Staff:
   Clifford Lynch   calur@uccmvsa.ucop.edu
   Nancy Gusack     ncgur@uccmvsa.ucop.edu

The IRLIST Archives is now set up for anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCOP.EDU. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.