Information Retrieval List Digest 255 (May 15, 1995)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-255

IRLIST Digest ISSN 1064-6965
May 15, 1995 Volume XII, Number 18 Issue 255
**********************************************************
I. QUERIES
   1. Seeking Information
III. NOTICES
   B. Meetings
      1. SIGIR '95 Pre-Conference Course
      2. SIGIR '95 Workshops
IV. PROJECTS
   A. Abstracts
      1. IR-Related Dissertation Abstracts
**********************************************************
I. QUERIES

I.1.
Fr: Sisco Meredith A
Re: Seeking information

Pardon. I'm de-lurking to ask what may be a very stupid question, but I figure you'll know the answer if anyone does. I'm a retired investigative journalist now developing a murder mystery series with two sleuths, one of which is a retired librarian who knows the net. I have some skills in retrieval of information from my journalism days but am in over my head when it comes to ferreting out some kinds of info. Anyone feel inclined to help flesh out my picture with stories about how something arcane and/or obscure was found while prowling the net?

Unless you all (yes, I am from the south) are into cluttering up the list with war stories, you may reply to my address instead of IR-L. Your help, and patience, much appreciated.

Backing into lurk mode,
Yarnspinner

**********************************************************
III. NOTICES

III.B.1.
Fr: ERASMUS@ntuvax.ntu.ac.sg
Re: Latest addition to SIGIR'95 program

REUSABILITY, INTERCHANGEABILITY, AND COMPATIBILITY:
ANSWERING THE QUESTIONS OF TEXT ENCODING STANDARDS

Lou Burnard, Oxford University
Judith Klavans, Columbia University
C. M. Sperberg-McQueen, University of Illinois at Chicago

A PRE-CONFERENCE COURSE
to be held in association with
SIGIR '95: 18th International Conference on Research and Development in Information Retrieval
Seattle, WA, USA
Saturday, July 8, 1995
8:30 a.m. - 3:30 p.m.

SIGIR '95, an international research conference on information retrieval theory, systems, practice and applications, will be held in Seattle, WA, from July 9-13. On the Saturday prior to the conference, a one-day course will be offered covering the theory and practice of markup languages for the representation of textual and other data, such as SGML and the Text Encoding Initiative. The course will be taught by Lou Burnard, Judith Klavans, and C. M. Sperberg-McQueen.

COURSE DESCRIPTION:

The representation of textual data has raised serious problems since the early days of digital technology. Incompatibilities between representations range from simple formatting issues, such as word delimitation, to data encoding schemes, such as 7-bit encoding for English, 8-bit for accented languages, and up to 32-bit for Asian languages. Furthermore, the complications seem to be growing as the amount of digital data increases.

Recognizing the predicament these complications cause in the information age, a group of researchers and practitioners, sponsored by the Association for Computational Linguistics, the Association for Computers and the Humanities, and the Association for Literary and Linguistic Computing, joined in 1988 to explore ways to resolve the serious emerging incompatibilities in the representation of text. The Text Encoding Initiative has addressed these problems by developing detailed SGML Document Type Definitions (DTDs) to achieve comprehensive and generalizable encoding standards for a range of data types, from verse to syntactic analyses, from spoken language to hypertext, from terminological data to multilingual corpora.

This one-day course will consist of three parts: the first will describe the challenges raised by the three ``abilities'' which concern effective text representation: reusability, interchangeability, and compatibility.
The next section of the course will present the types of data handled so far by the TEI encoding scheme, some of the problems already solved, some ongoing projects, and some unsettled questions. If hands-on is possible, we will provide a session to experience the strengths of using the TEI for building intelligent text data bases from existing on-line texts. Otherwise, we will demonstrate widely available software and discuss practical issues in using the TEI for building intelligent text data bases from existing on-line texts.

COMPLETE INFORMATION CAN BE FOUND AT:
http://www.columbia.edu/~klavans/home.html
http://www-tei.uic.edu/pub/tei/sigir.html

Questions re workshop content should be directed to C.M. Sperberg-McQueen, u35395@uicvm.cc.uic.edu; addresses for queries re registration and accommodation are given below.

MATERIALS AND PRESENTERS:

All participants will be provided with a printed introductory summary guide to the TEI scheme and supporting materials on PC disks, including full versions of the TEI DTDs, public domain SGML software and sample TEI texts. The electronic version of the Guidelines will also be provided.

Lou Burnard, of Oxford University Computing Services, is the European editor of the TEI project.
Judith Klavans is the Director of the Center for Research on Information Access (CRIA) at Columbia University.
C. M. Sperberg-McQueen is a senior research programmer at the academic computer center at the University of Illinois at Chicago.

REGISTRATION:

Cost of the course is $50 before May 29 and $65 after May 29, which includes a box lunch and course documentation. The attached registration form covers this course only. The course venue will depend on enrollment but at present it is expected that it will be at the SIGIR conference hotel, the Seattle Sheraton Hotel & Towers, 1400 Sixth Avenue, Seattle, WA 98101. Details of conference accommodation are available from the ftp and www addresses above.

**********

III.B.2
Fr: ERASMUS@NTUVAX.NTU.AC.SG
Re: Research Workshops in IR at SIGIR '95

RESEARCH WORKSHOPS IN INFORMATION RETRIEVAL:

Information Retrieval and Databases
VIRI: Visual Information Retrieval Interfaces
IR and Automatic Construction of Hypermedia
Curriculum Development in Computer Information Science
Z39.50 and the IR Research Community

to be held in association with
SIGIR '95: 18th International Conference on Research and Development in Information Retrieval
Seattle, WA, USA
July 13, 1995
8:30 a.m. - 3:30 p.m.

SIGIR '95, an international research conference on information retrieval theory, systems, practice and applications, will be held in Seattle, WA, from July 9-13. The final day of the conference will be devoted to five post-conference research workshops on topics of great current and general interest: IR and databases; visual information retrieval interfaces; curriculum development for IR; automatic construction of hypermedia; and Z39.50. Full descriptions and instructions for attendees are given below.

All SIGIR research workshops will run concurrently from 8:30 a.m. to 3:30 p.m. on Thursday, July 13. Intending participants should follow the instructions given below to contact the program committee for the workshop they wish to attend. Attendance at SIGIR '95 is not required, though it is necessary to register for the workshops using the conference registration form. Cost of each workshop is $45 (before May 29) and $55 (after May 29), which includes a box lunch and workshop documentation.

A copy of the registration form plus full information on SIGIR '95, including descriptions of tutorials, all technical sessions, and accommodation, etc. is available via anonymous ftp from: ftp.u.washington.edu (^public^sigir95^program) or via WWW at URL: http://info.sigir.acm.org/sigir/conferences/SIGIR_95_adv.pgm.html; or contact sigir95@u.washington.edu to request a copy of the program by mail.

WORKSHOP DESCRIPTIONS:

INFORMATION RETRIEVAL AND DATABASES
A Research Workshop

Program Committee:
David Harper, The Robert Gordon University, UK
Peter Schauble, Swiss Federal Institute of Technology (ETH), Zurich.

Submissions or requests for further information should be addressed to: .
Deadline for submission: May 31, 1995
Notification: June 15, 1995.

-*-*-*-*-*-*-

VIRI: VISUAL INFORMATION RETRIEVAL INTERFACES
A Research Workshop

Program Committee:
Robert R. Korfhage, University of Pittsburgh
Xia Lin, University of Kentucky
David S. Dubin, University of Pittsburgh.

Requests for further information or a copy of the questionnaire should be sent to korfhage@lis.pitt.edu.

-*-*-*-*-*-*-

IR AND AUTOMATIC CONSTRUCTION OF HYPERMEDIA
A Research Workshop

Program Committee:
Maristella Agosti, Padua University
James Allan, University of Massachusetts at Amherst.

For complete information, please contact Maristella Agosti at agosti@ipdunivx.unipd.it.

-*-*-*-*-*-*-

CURRICULUM DEVELOPMENT IN COMPUTER INFORMATION SCIENCE:
A FRAMEWORK FOR DEVELOPING A NEW CURRICULUM IN IR

The Workshop Leaders: Doris Lidtke and Michael Mulder.

Program committee:
Edward A. Fox, Virginia Tech;
Doris K. Lidtke, Towson State University, Maryland;
Michael C. Mulder, University of Southwestern Louisiana;
Edie M. Rasmussen, University of Pittsburgh;
Kazem Taghva, University of Nevada Las Vegas.

-*-*-*-*-*-*-

Z39.50 AND THE IR RESEARCH COMMUNITY
A Research Workshop

Program Committee:
Clifford Lynch, University of California (clifford.lynch@ucop.edu);
Ray Larson, University of California at Berkeley (ray@sherlock.berkeley.edu).

**********************************************************
IV. PROJECTS

IV.A.1.
Fr: Susanne M. Humphrey
Re: IR-Related Dissertation Abstracts - November 1994

Selected IR-Related Dissertation Abstracts
Compiled by: Susanne M.
Humphrey, National Library of Medicine, Bethesda, MD 20894

The following are citations selected by title and abstract as being of potential interest to the Information Retrieval (IR) community, resulting from a computer search, using the CDP/Online system, of the Dissertation Abstracts International (DAI) database produced by University Microfilms International (UMI). Included are accession number (AN); author (AU); title (TI); degree, institution, year, number of pages (IN); UMI order number (DD); reference to the published DAI (SO); abstract (AB); one or more DAI subject descriptors chosen by the author (DE); thesis adviser (AR); and dates associated with the monthly update file (UP).

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-343-5299; fax: 313-973-1540. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN AAI9427810
AU Strittholt, James Richard.
TI A REGIONAL NATURE RESERVE DESIGN USING GEOGRAPHIC INFORMATION SYSTEMS FOR THE EDGE OF APPALACHIA, ADAMS COUNTY, OHIO.
IN Thesis (PH.D.)--THE OHIO STATE UNIVERSITY, 1994, 425p.
DD Order Number: AAI9427810.
SO Dissertation Abstracts International. Volume: 55-06, Section: B, page: 2065.
AB As traditional approaches directed at stopping what has been termed the "extinction crisis" have proven to be inadequate, alternative holistic approaches are being explored.
This study examined the use of modern computer mapping technologies, specifically remote sensing and geographic information systems (GIS), to address key nature reserve design issues for a 146 square mile area of south-central Ohio known as the Edge of Appalachia (EOA). After an extensive GIS database was constructed (1:24,000 scale), a series of conservation biology questions pertaining to the long-term preservation of the region were addressed. Landscape structure and change analyses were conducted based on low altitude aerial photograph interpretation. Between 1938 and 1988, the EOA underwent considerable landscape change, including a dramatic increase in forest cover and a reduction in forest fragmentation. By combining Landsat TM imagery with other data layers (i.e., bedrock geology), a rule-based model was developed to differentiate and map 8 plant community types within the study area, for both the current and pre-settlement vegetation. These maps were then compared to the existing preserve boundaries. This modified gap analysis identified some important preservation deficiencies using the historical vegetation model, particularly wet and mesic hardwood forest communities. The conservation biology issues on which the GIS modeling focused included plant community representation, habitat fragmentation, watershed protection, prairie management and restoration, rare plant species protection, and endangered animal habitat protection. Desirability maps were created pertaining to each criterion using the land parcel as the fundamental mapping unit. Finally, land acquisition scenarios were generated for the EOA study area emphasizing different ecological factors (i.e., plant community representation). Each of the 4 plans were evaluated in terms of their ecological value and economic cost. 
While expensive to implement, the plan which considered all of the ecological criteria proved to offer the best hope for the long-term maintenance of the ecosystem integrity for the EOA. The database and modeling results are to be used by The Nature Conservancy and other regional land conservators in helping them adopt long range conservation plans for this biologically diverse and rich landscape.
DE Biology, Ecology. Geography. Environmental Sciences. Remote Sensing.
AR Boerner, Ralph E. J.
UP 9411. Revised: 941202.

AN AAI9429930
AU Goyal, Nita.
TI A FRAMEWORK FOR REASONING PRECISELY WITH VAGUE CONCEPTS (KNOWLEDGE REPRESENTATION).
IN Thesis (PH.D.)--STANFORD UNIVERSITY, 1994, 146p.
DD Order Number: AAI9429930.
SO Dissertation Abstracts International. Volume: 55-06, Section: B, page: 2273.
AB Many knowledge-based systems need to represent vague concepts such as "old" and "tall". The practical approach of representing vague concepts as precise intervals over numbers (e.g., "old" as the interval (70,110)) is well-accepted in Artificial Intelligence. However, there have been no systematic procedures, but only ad hoc methods to delimit the boundaries of intervals representing the vague predicates. A key observation is that the vague concepts and their interval boundaries are constrained by the underlying domain knowledge. Therefore, any systematic approach to assigning interval boundaries must take the domain knowledge into account. Hence, in the dissertation, we present a framework to represent the domain knowledge and exploit it to reason about the interval boundaries via a query language. This framework is comprised of a constraint language to represent logical constraints on vague concepts, as well as numerical constraints on the interval boundaries; a query language to request information about the interval boundaries; and an algorithm to answer the queries.
The algorithm preprocesses the constraints by extracting the numerical information from the logical constraints and combines them with the given numerical constraints. We have implemented the framework and applied it to the medical domain to illustrate its usefulness.
DE Computer Science. Artificial Intelligence.
AR Shoham, Yoav.
UP 9411. Revised: 941202.

AN AAIC376130
AU Li, Zhuoxun.
TI INFORMATION RETRIEVAL FOR AUTOMATIC LINK CREATION IN HYPERTEXT SYSTEMS.
IN Thesis (PH.D.)--UNIVERSITY OF SOUTHAMPTON (UNITED KINGDOM), 1993
DD Not available from UMI.
SO Dissertation Abstracts International. Volume: 55-04, Section: C, page: 1294.
AB Hypertext systems have become popular in recent years although the ideas behind them were proposed nearly fifty years ago. Hypertext links, which are connections between information items, provide the possibility for non-sequential reading and writing, and enable related pieces of information to be connected together no matter where they are stored in a system. Traditionally, a hypertext link is created by specifying its two ends manually. Links created in this way can be excellent but suffer from limitations in that they require manual work. In a system where information is frequently changed or a huge amount of information is stored, these links may become inadequate. Methods that can create links automatically are needed. In this thesis, two kinds of computer created link, retrieval-links and doc-links, are proposed and investigated. Discussions are focused on how to control link creation and how to design proper full text retrieval procedures for link creation. Dynamically dividing documents into sections is an example of enabling the user to control link creation, and the use of break words and source stems are products of exploring effective retrieval methods.
With efforts to ensure effectiveness and efficiency in link creation, the proposed methods can provide extra support for users and require virtually no extra human effort or user interface skills. In this thesis, key issues in link creation are identified, problems and solutions are discussed, and systematic experiments and their results are presented and analysed. Based on these results, the proposed methods are implemented and integrated with a hypermedia system, Microcosm. Practice has proved that these methods are effective and useful. Possible improvements and issues for developing better information systems are discussed as well.
DE Engineering, Electronics and Electrical.
UP 9411. Revised: 941202.

AN AAIMM87485
AU Myers, Troy Gordon.
TI USER FEES FOR INFORMATION SERVICES: AN EXPLORATION IN NORTH AMERICAN PUBLICLY FUNDED LIBRARIES.
IN Masters Thesis (M.L.S.)--DALHOUSIE UNIVERSITY (CANADA), 1993, 115p.
DD Order Number: AAIMM87485.
SO Masters Abstracts International. Volume: 32-06, page: 1492.
AB This thesis deals with the topical issue of user fees for publicly funded library services. The central objection by librarians to the imposition of user fees has related to their alleged conflict with basic principles of librarianship. The development of these principles is traced with the historical myths surrounding usage and social consequences exposed thereby casting doubt on the absolute validity of some of these principles. Also, user fees are shown not to be the recent phenomena that they are sometimes portrayed as being, but a regular feature of the library since the last century. A survey of selected Canadian and American fee-based information services in libraries is included to illustrate the contemporary nature of fees for library service. Another issue explored is the dilemma posed by the realities of public funding and the costs, e.g. online searching, for purists who oppose user fees in all circumstances.
Rather than promoting the dissemination of information this deprives libraries of a source of revenue which will result in less access by less people to less information. Finally, two case studies--Arizona State University's FIRST service and Cleveland Public Library's CRC service--are included to illustrate the nature, as well as the problems, of fee-based information services in publicly funded libraries.
DE Information Science. Library Science.
AR Dykstra, Mars.
IB 0-315-87485-6
UP 9411. Revised: 941202.

**********************************************************
IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests and submissions to: NCGUR@UCCMVSA.UCOP.EDU

Editorial Staff:
Clifford Lynch calur@uccmvsa.ucop.edu
Nancy Gusack ncgur@uccmvsa.ucop.edu

The IRLIST Archives is now set up for anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCOP.EDU. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.
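
For readers who would rather script the archive requests described above, the following is a minimal sketch of how the LISTSERV message text and the FTP directory name could be assembled from a mailing date. The function names are hypothetical and only for illustration. The sketch assumes that LOGYYMM means a two-digit year followed by a two-digit month, as the wording above suggests. It only builds strings and does not send mail or open an FTP connection.

```python
from datetime import date

# Address and host as given in the digest footer.
LISTSERV_ADDRESS = "LISTSERV@UCOP.EDU"
FTP_HOST = "dla.ucop.edu"


def index_message() -> str:
    """Message body that asks LISTSERV for the index of IR-L issues."""
    return "INDEX IR-L"


def get_message(mailed: date) -> str:
    """Message body that requests every issue mailed in the given month.

    Assumption: LOGYYMM is a two-digit year followed by a two-digit month.
    """
    return f"GET IR-L LOG{mailed:%y%m}"


def ftp_directory(year: int) -> str:
    """Archive subdirectory for a given year, e.g. /pub/irl/1993."""
    return f"/pub/irl/{year}"


if __name__ == "__main__":
    issue_date = date(1995, 5, 15)  # mailing date of this issue
    print(f"To: {LISTSERV_ADDRESS}")
    print(get_message(issue_date))
    print(f"ftp://{FTP_HOST}{ftp_directory(issue_date.year)}")
```

As the footer notes, a GET request returns all issues for the requested month, so a single message covers every digest mailed in that month.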