Information Retrieval List Digest 117 (June 24, 1992)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-117
=========================================================================
Date: Tue, 23 Jun 1992 16:12:05 PST
Reply-To: "Information Retrieval List"
Sender: "Information Retrieval List"
From: IRLIST
Subject: IR-L Digest, Vol.IX, No.21, Issue 117

IRLIST Digest           June 24, 1992
Volume IX, Number 21    Issue 117

**********************************************************

I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. 6th Conference of the European Chapter of the ACL
         University of Utrecht, The Netherlands, April 21-23, 1993
   C. Miscellaneous
      1. IR-L Submission Length
II. QUERIES
   B. Requests for Information
      1. Basic Interface for Telneting
III. JOB ANNOUNCEMENTS
   1. PhD Fellowships at Drexel University
IV. PROJECT WORK
   C. Abstracts
      1. IR-Related Dissertation Abstracts

**********************************************************

I. NOTICES

I.A.1.
Fr: Louis des Tombe
Re: 6th Conference of the European Chapter of the ACL

FIRST NOTIFICATION AND CALL FOR PAPERS

Sixth Conference of the European Chapter of the
Association for Computational Linguistics

21-23 April 1993

Onderzoeksinstituut voor Taal en Spraak (OTS)
Research Institute for Language and Speech
University of Utrecht, The Netherlands

PURPOSE: This conference is the sixth in a series of biennial conferences on computational linguistics sponsored by the European Chapter of the Association for Computational Linguistics. Previous conferences were held in Pisa (September 1983), Geneva (March 1985), Copenhagen (April 1987), Manchester (April 1989) and Berlin (April 1991). Although hosted by a regional chapter, these conferences are global in scope and participation. The European Chapter represents a major subset of the ACL. The conference is open to both members and nonmembers of the Association.

SCOPE: Papers are invited on all aspects of computational linguistics, including, but not limited to: morphology, syntax, semantics, discourse analysis, pragmatics, grammar formalisms, formal languages, software tools, knowledge representation, AI-methods in computational linguistics, analysis and generation of language, computational lexicography and lexicology, lexical databases, machine translation, computational aids to translation, speech analysis and synthesis, natural language interfaces, dialogue, computer-assisted language learning, corpus analysis and corpus-based language modelling, and information retrieval and message understanding.

SPECIAL SESSIONS/TUTORIALS: The Programme Committee plans special sessions around the following themes:

   - logic and computational linguistics
   - data-oriented methods in computational linguistics

This thematic orientation will be further developed in a tutorial programme to be held the day preceding the conference (20 April 1993). Details will be provided in the circular of October 1992.

SUBMISSION: Authors should submit an extended abstract of their paper (six copies if submitting in hardcopy) to the Programme Committee at the following address:

   EACL-93 Programme Committee
   OTS
   Trans 10
   NL-3512 JK Utrecht
   The Netherlands
   Phone: (+31) 30-392531
   Fax: (+31) 30-333380
   Email: eacl93@let.ruu.nl

The first page should include the title, the name(s) of the author(s), complete addresses (including e-mail), a specification of the topic area (one or two keywords, preferably from the list above), and an indication of whether the paper addresses one of the themes of the Special Sessions.

The extended abstract should not exceed 5 pages A4. It should contain sufficient information to allow the referees and the Programme Committee to determine the scope of the work and its relation to relevant literature. Contributions should report on original research that has not been presented elsewhere.
Electronic submission is preferred, using standard LaTeX or plain ASCII. In case of problems with this, contact the organizers at the above address. For future final versions, hardcopy or LaTeX files will be accepted.

SCHEDULE: The deadline for submission is 1 December 1992. Authors will be notified of acceptance by 1 February 1993. Camera-ready copies of the final papers must be postmarked before 5 March 1993, and received by 12 March 1993, along with a signed copyright release statement. Papers not received by the due date will not be included in the conference proceedings, which will be published in time for distribution to everyone attending the conference.

PROGRAMME COMMITTEE: The Programme Committee will be co-chaired by Louis des Tombe, Steven Krauwer and Michael Moortgat (OTS, Utrecht).

LOCAL ARRANGEMENTS: Contact Nadine Buenen or Joke Dorrepaal at the above address. More information on local arrangements will be provided in the next circular.

OTHER ACTIVITIES: A programme of demonstrations and exhibits is planned. For information, contact the EACL address above.

**********

I.C.1.
Fr: Nancy Gusack, IR-L Moderator
Re: IR-L Submission Length

Dear IR-L Subscribers:

Let me remind you that I'd like to receive submissions of fewer than 150 lines if possible. I'm trying to keep each issue at or below 400 lines, and a long call for papers takes up too much space. When I receive long submissions, I will try to edit them, emphasizing the "for further information" address/number. Huge submissions (300-600 lines) I'll most likely discard. The obvious exception to this rule will be papers submitted for review by subscribers.

Thanks for your cooperation.

Nancy Gusack, IR-L Moderator
ncgur@uccmvsa

**********************************************************

II. QUERIES

II.B.1.
Fr: Nicholas Rosselli (219)980-6929
Re: Basic interface for Telnet-ing

Well, I was fairly flabbergasted (what a strange word) today.
We've finally got our campus network up - and there, sitting on my desk, was a workstation offering me telnet access to the world of the internet. Visions of remote computing danced in my head - more specifically, visions of being able to do on-line searching without relying on a 1200 baud modem and a noisy line.

To my dismay, I discovered that there is no capability for doing what I thought were simple things: echoing to a printer and/or echoing to a log file. I mean, the simplest shareware can do that much. And so, girding up my loins, I ventured forth to the computer center - only to be told that *they* have no idea of how to do it.

So I'm asking you all ... is it possible, is there a way (even one that costs money) to achieve this goal?

And, sorry for the duplication, but I'm posting this to Libref-l, Pacs-l, and IR-l.

Nicholas Rosselli
Electronic Reference Services
Indiana Univ. Northwest
Rosselli@IUBACS

**********

II.B.2.
Fr: Bonnie Dorr
Re: Request for Information: Directory of Computational Linguistics

SURVEY OF COMPUTATIONAL LINGUISTICS COURSES
URGENT NEED FOR INFORMATION

As a follow-on to the Directory of Computational Linguistics Courses recently compiled by Martha Evens, the Association for Computational Linguistics will publish a new edition of the Survey of Computational Linguistics Courses. (See Computational Linguistics, volume 12 (1986) for the previous version compiled by Robin Cohen.) We are eager to include two types of courses: those that teach computational linguistics as the sole topic and those that teach computational linguistics as one of many topics.

The survey will allow us to share with colleagues ideas on how to teach computational linguistics. It will also provide an idea of how the field of computational linguistics is being portrayed to potential new researchers.

Our listing will include the name and address of the University and Department(s) offering the course, the name and number of the course, the type of course, and information about the syllabus (e.g., topics, texts used, format, workload, enrollment, duration, and assistance). In addition we will include some statistics on the responses (i.e., total number of courses having particular characteristics) and a bibliography of the most frequently cited references.

Please request guidelines as to content and format and send information to:

   Ms. Sandy Tsue
   UMIACS
   A.V. Williams Building
   University of Maryland       E-mail: cl-survey@umiacs.umd.edu
   College Park, MD 20742       Tel: (+1-301)405-6722
   Re: CL-SURVEY                Fax: (+1-301)314-9658

Note: e-mail is preferred. If your institution was listed in the 1986 compilation, you may request a copy of your previous entry.

Thank you for your participation in this endeavor.

Professor Bonnie Dorr
Department of Computer Science and UMIACS
University of Maryland

**********************************************************

III. JOB ANNOUNCEMENTS

III.1.
Fr: Kate McCain
Re: PhD fellowships available

As coordinator of the PhD program in the College of Information Studies at Drexel University, I am pleased to distribute the following announcement of the availability of fellowships to support doctoral research.

Katherine W. McCain
Associate Professor
College of Information Studies
Drexel University
Philadelphia, PA 19104

_ANNOUNCEMENT_

Drexel University's College of Information Studies has received a $118,400 grant from the U.S. Department of Education to fund doctoral fellowships. The fellowships will provide complete tuition remission and stipends for up to eight full-time doctoral students during the period September 1992 to September 1993. The college will select the recipients from among newly applying students and those currently enrolled.

Individuals may apply for a fellowship by enclosing a letter of request with their application to the college's doctoral program. A fellowship will cover tuition and fees, as well as provide a $6,400 stipend for the period September to June, and a $1,000 stipend for the following summer quarter (summer enrollment is optional). Fellowships may be renewable, depending on a student's academic performance and the availability of further federal funding.

Through its fellowship grants, the Department of Education seeks to increase excellence in library education by encouraging study of the principles and practices of library and information science at the doctoral level. Such study may include investigations of the collection, organization, storage, retrieval and dissemination of information; the design, management and evaluation of libraries and information centers; and the use and users of information centers and their resources.

For information on the CIS doctoral program and these fellowship opportunities, contact: Associate Dean, College of Information Studies, Drexel University, Philadelphia, Pa. 19104 (215) 895-2474.

**********************************************************

IV. PROJECT WORK

IV.C.1.
Fr: Susanne M. Humphrey

Selected IR-Related Dissertation Abstracts

The following are citations selected by title and abstract as being related to Information Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of the Dissertation Abstracts Online database produced by University Microfilms International (UMI). Included are UMI order number, title, author, degree, year, institution; number of pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen by the author, and abstract.

Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International, Dissertation Copies, Post Office Box 1764, Ann Arbor, MI 48106; telephone for U.S.
(except Michigan, Hawaii, Alaska): 1-800-521-3042, for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided.

Dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright by University Microfilms International), and may not be reproduced without their prior permission.

AN University Microfilms Order Number ADG91-23827.
AU JONES, JULIAN LLOYD.
TI ITERATIVE DEVELOPMENT, SYSTEM DESIGN, AND PSYCHOLOGICAL INVESTIGATION.
IN University of York (United Kingdom) Ph.D. 1991, 168 pages.
SO DAI V52(03), SecB, pp1546.
DE Computer Science. Psychology, Experimental. Psychology, Industrial.
AB This thesis examines the relationship between system design and psychology. The perspective taken is from the point of view of a psychologist, working in the field of Human-Computer Interaction (HCI), who is intent on contributing most effectively to the design of better computer systems. A conclusion that is reached very early on in the thesis is that the psychologist can contribute most effectively by integrating the activities of system design and psychology and pursuing both activities equally intently. This conclusion is formulated on the basis of the debate between Newell and Card (1985; 1986) and Carroll and Campbell (1986) on the role of technical theories in HCI, and in the light of Long's (1986; 1987; 1989) framework for HCI research. Hitherto, arguments in favour of a parity between system design and psychology were general enough that if they were true, they were true for any applied science. In order to identify those arguments most pertinent to the field of HCI, three experiments were conducted.
These experiments manipulate the interplays made possible between system design and psychology when brought together in the redesign of the user-interface to an Online Public Access Catalog. What were two disparate domains found commonality through the shared use of behavioural data. It is concluded that the primary benefits afforded to the psychologist working in the field of HCI, and obtained by integrating system design and psychology, arise as a result of an increase in: the power of an experiment; the speed with which one is able to identify the major and minor obstacles to perfect performance; the precision of the mapping possible between the worlds of system design and psychology; the opportunities available for paradigm shifts. In addition, it is noted that iterative development provides the capability to accentuate each one of these primary benefits.

AN University Microfilms Order Number ADG91-20228.
AU KANAI, JUNICHI.
TI KNOWLEDGE-BASED DOCUMENT IMAGE ANALYSIS SYSTEM.
IN Rensselaer Polytechnic Institute Ph.D. 1990, 235 pages.
SO DAI V52(03), SecB, pp1546.
DE Computer Science.
AB A knowledge-based document layout analysis system has been designed and implemented to convert paper documents into an electronic form effectively. The system extracts characters, words, and text lines from a given text column. The domain knowledge includes generic typographical conventions and features of printed symbols in a given document, but excludes publication-specific layout features. A domain-specific knowledge-representation scheme called character prototype has been introduced to represent printed symbols. The character prototypes of each font were generated from font definitions, interactively, and automatically from training data. The system was coded using C and Common Lisp. Experimental results showed that it correctly extracted an average of 96 percent of text-lines from digitized text-columns written in English.
Almost all word blocks were properly generated from the character boxes in the extracted text-lines. The character prototype scheme accurately represented the English, French and Japanese alphabets, Chinese characters, and Bengali words, and text lines were correctly extracted from documents written in these languages. The basic representation scheme for printed documents is the X-Y tree. The term X-Y tree is used to describe a family of hierarchical data structures. Their common property is that they represent a recursive decomposition of space (isothetic rectangles). Additional properties depend on the selected local segmentation method, and these properties, which lead to a classification of X-Y trees, have been identified. Various utility algorithms have been developed: queries, insertions, deletions, and compression. A significant advantage of this structure is that it can represent hierarchically both logical and layout structure of a document without using additional pointers.

AN University Microfilms Order Number ADG91-22617.
AU KEENE, CAROL A.
TI DOCUMENT RETRIEVAL USING STATISTICAL WORD DECOMPOSITION.
IN University of Colorado at Boulder Ph.D. 1990, 266 pages.
SO DAI V52(03), SecB, pp1547.
DE Computer Science.
AB Information may be defined as a resource used to solve problems or make decisions. Document retrieval (DR) systems locate documents containing needed information. Retrieving potentially relevant documents depends upon the user's ability to accurately express information needs in the input language of the DR system and the system's ability to process the query. Determining the relevance of a document depends upon the current task and the user's previous knowledge. Preliminary work studied linguistic behavior of knowledgeable users retrieving documents related to CSIM (the Colorado Simulator), a VLSI testing and verification tool. A typical query was 3 words long and was not a complete sentence.
The Chaucer document retrieval system retrieves relevant documents for queries expressed in unrestricted English. Document content is limited to a technical domain, and individual documents contain information about a single concept. Full-text vocabulary is extracted and statistically decomposed into high-frequency variable-length word fragments which are then used to build an inverted index. Vocabulary recognition is achieved in time proportional to query length. Quorum-level implicit Boolean processing provides dynamic term-phrase formation. A prototype implementation using on-line documentation for CSIM was built and tested. Users, who exhibited varying levels of domain knowledge and English fluency, were able to locate relevant documents for more than 70% of the problems. Linguistic characteristics of queries were confirmed: average length of 3 words and telegraphic constructs. One significance of the Chaucer system is its ability to process both well-formed and ill-formed input and to retrieve relevant documents for all users. Another is the compact index provided by word decomposition (25% of the size of the document collection). This approach is applicable to systems involving search of collections of technical documents.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
   Clifford Lynch   lynch@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
   Nancy Gusack     ncgur@uccmvsa.bitnet
   Mary Engle       engle@cmsa.berkeley.edu or meeur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues. To access back issues presently, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET.
To get a specific issue listed in the Index, send the message GET IR-L LOG ***, where *** is the month and day on which the issue was mailed, to LISTSERV@UCCVMA.BITNET.

These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.