Information Retrieval List Digest 202 (February 28, 1994)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-202

IRLIST Digest ISSN 1064-6965
February 28, 1994
Volume XI, Number 9
Issue 202

**********************************************************

I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. Workshop on Combining Statistical & Symbolic Approaches to Languages
      2. Gigabit Networking Workshop '94
   B. Publications Announcements
      1. RLG Proceedings
II. QUERIES
   A. Questions/Answers
      1. One-Way String-Matching Algorithms
      2. Publication Frequency
   B. Requests for Information
      1. Managing Gigabytes
IV. PROJECT WORK
   D. Miscellaneous
      1. Error Reduction in Pattern Recognition

**********************************************************

I. NOTICES

I.A.1.
Fr: Judith Klavans
Re: Workshop on Combining Statistical and Symbolic Approaches to Language

THE BALANCING ACT:
Combining Symbolic and Statistical Approaches to Language

1 July 1994
New Mexico State University
Las Cruces, New Mexico, USA

A workshop in conjunction with the 32nd Annual Meeting of the Association for Computational Linguistics (27-30 June 1994)

A renaissance of interest in corpus-based statistical methods has rekindled old controversies -- rationalist vs. empiricist philosophies, theory-driven vs. data-driven methodologies, symbolic vs. statistical techniques. The aim of this workshop is to set aside a priori biases and explore the balancing act that must take place when symbolic and statistical approaches are brought together. We plan to accept papers from authors having a wide range of perspectives, and to initiate a discussion that includes philosophical, theoretical, and practical issues.

Submissions to the workshop must describe research in which both symbolic and statistical methods play a part. All research of this kind requires that the researcher make choices: What knowledge will be represented symbolically and how will it be obtained? What assumptions underlie the statistical model?
What is the researcher gaining by combining approaches? Questions like these, and the metaphor of the balancing act, will provide a unifying theme to draw contributions from a wide spectrum of language researchers.

ORGANIZERS: Judith Klavans, Columbia University; Philip Resnik, Sun Microsystems Laboratories, Inc.

REQUIREMENTS: Papers should describe original work; they should clearly emphasize the type of paper to be presented (e.g. implementation, philosophical, etc.) and the state of completion of the research. A paper accepted for presentation cannot be presented or have been presented at any other meeting. In addition to the workshop proceedings, plans for publication as a book require that papers not have been published in any other publicly available proceedings. Papers submitted to other conferences will be considered, as long as this fact is clearly indicated in the submission.

FORMAT FOR SUBMISSION: Following guidelines for the ACL meeting, authors should submit preliminary versions of their papers, not to exceed 3200 words (exclusive of references). Papers outside the specified length and formatting requirements are subject to rejection without review. Papers should be headed by a title page containing the paper title, a short (5 line) summary and a specification of the subject area(s). If the author wishes reviewing to be blind, a separate page with author identification information must be submitted.

SUBMISSION MEDIA: Papers may be submitted electronically or in hard copy to either organizer at the addresses given below. Electronic submissions should be either self-contained LaTeX source or plain text. LaTeX submissions must use the ACL submission style (aclsub.sty) retrievable from the ACL LISTSERV server (access to which is described below) and should not refer to any external files or styles except for the standard styles for TeX 3.14 and LaTeX 2.09.
A model submission modelsub.tex is also provided in the archive, as well as a bibliography style acl.bst. Note that the bibliography for a submission cannot be submitted as a separate .bib file; the actual bibliography entries must be inserted in the submitted LaTeX source file. Be sure that e-mail submissions have no lines longer than 80 characters to avoid mailer problems. Hard copy submissions should consist of four (4) copies of the paper. A plain text version of the identification page should be sent separately by electronic mail if possible, giving the following information: title, author(s), address(es), abstract, content areas, word count.

Schedule: Papers must be received by 15 March 1994. Late papers will not be considered. Notification of receipt will be mailed to the first author (or designated author) soon after receipt. Authors will be notified of acceptance by 10 April 1994. Camera-ready copies of final papers prepared in a double-column format, preferably using a laser printer, must be received by 10 May 1994, along with a signed copyright release statement. The ACL LaTeX proceedings format is available through the ACL LISTSERV.

FOR COMPLETE INFORMATION CONTACT:

Judith L. Klavans
Department of Computer Science
500 W 120th Street
New York, NY 10027, USA
(212) 939-7120
klavans@cs.columbia.edu

**********

I.A.2.
Fr: Bryan Lyles
Re: Gigabit Networking Workshop '94

GIGABIT NETWORKING WORKSHOP GBN'94 - CALL FOR PARTICIPATION
12 June 1994 - Toronto, Ontario, Canada

Sponsored by the IEEE ComSoc Technical Committee on Gigabit Networking in conjunction with INFOCOM'94

FORMAT: The workshop will take place from 9:00 AM until 4:00 PM with lunch provided. The morning will consist of short presentations and discussion; the afternoon will consist of the presentation of full papers describing applications which will drive the deployment of Gigabit networks.
There will be an open business meeting of the Technical Committee on Gigabit Networking following the workshop from 4:00 to 5:00 PM.

SHORT PRESENTATIONS AND DISCUSSION: The first part of the workshop will consist of a number of short informal presentations and discussion on current research and implementation, hot topics, position statements, and controversial issues relating to high bandwidth networking. End-to-end issues including transport and higher layer protocols, host and network interface architecture, operating systems, applications, economic and regulatory issues, and other societal impacts will be of particular interest. A one page abstract of the presentation is due on 1 April 1994; all reasonable proposals will be accepted (and possibly some outrageous ones). The length of the presentations will be 5 - 15 min. each depending on the number of accepted submissions, and it is expected that this part of the workshop will be highly interactive and informative.

FOCUS PAPERS: APPLICATIONS ENABLING THE LARGE SCALE DEPLOYMENT OF GIGABIT NETWORKS: The second part will consist of full paper presentations. We seek papers on a broad range of applications driving the wide scale deployment of Gigabit networks covering the consumer, business, and industry markets from application designers, system vendors, and network infrastructure providers.
Current Gigabit applications will likely not be numerous enough to force the provision of a new infrastructure, and we are looking for new applications which must meet a set of criteria for using and enabling Gigabit network technology:

   o Realistic consumer or business application (current or future)
   o Minimum bandwidth per user of many Mbps
   o Minimum potential base of 1000s of simultaneous users
   o Number of users x application bandwidth product in excess of 1 Tbps
   o Consumer video applications must be more sophisticated than broadcast or simple video-on-demand multicast

Papers must include justification in terms of specific network requirements and/or system architecture that allows the delivery of the required bandwidth to the application. A two page extended abstract of the proposed paper is due 1 April 1994, with the full text (20 double spaced pages, excluding figures) of accepted papers due 12 June 1994 at the workshop. After open email discussion by the general membership of the Technical Committee on Gigabit Networking as well as further review of the full paper text, selected papers along with the summary of the discussion generated will be published in a forthcoming issue of IEEE JSAC.

SUBMISSION: The submission deadline for both the 1 page presentation abstracts and the 2 page paper extended abstract is 1 April 1994. Submission by email to the program chair at jpgs@acm.org is encouraged; please include the text "GBN'94 Submission" in the Subject: field. Submission of eight copies by surface mail is also acceptable, to the program chair at the address below. All submissions will be quickly acknowledged; the lack of an acknowledgment indicates that the author should contact the program chair to confirm the receipt of the proposal. Notification of accepted presentations and papers will be made by 1 May 1994, and all accepted presenters are expected to register in advance for the workshop.
REGISTRATION: Registration for the workshop will be handled as part of INFOCOM'94 registration; to receive the INFOCOM'94 advance program, request by email to infocom@ee.upenn.edu, or by fax to Mark J. Karol at +1 908 949 9118.

PROGRAM CHAIR
James P. G. Sterbenz
IBM Research H3-D27
30 Saw Mill River Rd.
Hawthorne, NY 10532 USA
+1 914 784 6489
jpgs@acm.org
jpgs@watson.ibm.com

**********

I.B.1.
Fr: RLG Sales Associates
Re: News Release from RLG

The following is a news announcement from the Research Libraries Group. It is being posted to other library-related LISTSERVs.

Available now from the Research Libraries Group (RLG) is "Electronic Access to Information: A New Service Paradigm", the proceedings from an RLG symposium held in July 1993 that explored the service issues facing academic and research institutions as they make the transition from a print-based information environment to an increasingly electronic one.

The publication presents the six formal papers given at the symposium; summaries of the small group discussions that followed; three cases in point, which exemplified strategies in improving access to electronic information; and a profile of the over 60 information professionals who attended the event.

Paper presenters included:

Keynote speaker Douglas E. Van Houweling, vice provost for information technology at the University of Michigan, who challenged attendees to consider fundamental questions about the nature of their "business," who their customers are, and what they must do in the future.

Nancy M. Cline, dean of university libraries at Pennsylvania State University, who, in focusing on the needs of the user, considered the choices and issues involved in selecting different modes of access to information and suggested strategies for making choices.

Jerry D.
Campbell, vice provost for library affairs and university librarian at Duke University, who discussed a recent survey of faculty and students at Duke University and suggested ways to revitalize library service and create the "new library paradise."

Kathryn M. Downing, president and CEO of Lawyers Cooperative Publishing, who discussed copyright and the fair use doctrine, and cited the highly successful ASCAP copyright/licensing program as a possible model for electronic publishing.

Robert C. Berring, professor and law librarian at the University of California at Berkeley's School of Law, who declared that librarianship was in peril and that librarians must act swiftly and strategically to recapture the leadership role in information management.

Kathleen Price, law librarian at the Library of Congress, who summarized the two-day event, highlighting the key issues raised and solutions proposed, and discussed recent efforts by the law library of the Library of Congress to preserve and expand access to information.

"Electronic Access to Information: A New Service Paradigm" is 84 pages, paperbound, 8 1/2 x 11 inches.

FOR COMPLETE INFORMATION, CONTACT the Distribution Services Center, The Research Libraries Group, Inc., 1200 Villa Street, Mountain View, CA 94041-1100, or send e-mail to bl.dsc@rlg.bitnet or bl.dsc@rlg.stanford.edu, or fax to 415-964-0943.

**********************************************************

II. QUERIES

II.A.1.
To: IR-L Readers
Fr: Nancy Gusack, IR-L Moderator
Re: Publication Frequency

IR-L frequently receives too much material to fit into one issue. Some of the submissions have deadline information, and we might consider publishing more often than weekly so deadlines aren't missed. Do you all have opinions? Shall IR-L stick to a weekly publication schedule? Shall it be published more often (and perhaps in shorter issues), depending on the amount of material? Let me know!

**********

II.A.2. Fr: Jim S.
Mochel Re: Managing Gigabytes

Hi there,

I am trying to track down a text that is supposedly due out soon: March 94. The text is called "Managing Gigabytes" and is a comparative discussion of inversion and compression (among other topics). I would like to find the name of the author and the publisher, if possible.

Jim Mochel
jmochel@world.std.com

**********

II.B.1.
Fr: Peter van der Post
Re: WANTED: One-Way String-Matching Algorithms

I'm looking for a fast algorithm to test how well string A matches (part of) string B, possible permutations included. The amount of unmatched material in B is unimportant. For example, the following comparisons should all return a high match score:

   A-string      B-string
   ADAR          ADAR INTERNATIONAL
   BRACADAB      ABRACADABRA SOFTWARE INC.
   ZYLOG         FUZZY LOGIC SYSTEMS
   VIEW DATA     DATAVIEW CORP.
   VERASCO       VARESCO
   SOFT HOUSE    SOFT WAREHOUSE, INC.

Any pointers?

Peter van der Post
professional computer scientist
P.O.Box 1375
1400 BJ Bussum
The Netherlands
Europe
e-mail: pepo@knoware.nl
specialist in: OO, GUI and KBS system development
current area of interest: CSCW

**********************************************************

IV. PROJECT WORK

IV.D.1.
Fr: Elliot Davis
Re: Error Reduction in Pattern Recognition

I would greatly appreciate your thoughts on the:

ERROR TEMPLATE TECHNIQUE

The "Error Template" technique (patent 4,802,231) provides an alternative method for reducing false alarms in pattern recognition systems. In this approach, a pattern representing a mismatched pattern is stored in the reference lexicon. It is a reference pattern to an error rather than to what is desired. THIS IS DONE WITH THE EXPECTATION THAT IF THE ERROR PATTERN OR A VARIATION OF IT IS REPEATED IT WILL TEND TO BE CLOSER TO ITSELF THAN TO THE PATTERN THAT IT FALSED OUT TO.

Preferential matching to an Error Template can result in the system deciding that:

1. the test pattern was outside its reference vocabulary or
2.
the test pattern falsely matched a desired reference pattern but should be compared for possible matches to other desired reference patterns and Error Templates. In this case the Error Template is linked to specific desired reference patterns and is called a Linked Error Template.

This technique should be tried on any pattern recognition system that needs improvement in error reduction or speed. Pattern recognition systems may be streamlined by reducing the number of desired reference templates while keeping the false alarm rate down by the use of Error Templates. It is, relative to most other techniques, extremely easy to implement. The Error Template can be created and positioned in the comparison process in the same way as a normal reference pattern. As all pattern recognition systems must have a means of characterizing parameters to be stored and compared, the applicability of the Error Template technique should be widely and easily testable.

Elliot Davis, Ph.D.
E-Mail: edavis@ubvms.cc.buffalo.edu
Phone: (716) 691-7235

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
   Clifford Lynch   calur@uccmvsa.ucop.edu or calur@uccmvsa.bitnet
   Nancy Gusack     ncgur@uccmvsa.ucop.edu or nancy.gusack@ucop.edu
   Mary Engle       meeur@uccmvsa.ucop.edu or mary.engle@ucop.edu

The IRLIST Archives is now set up for anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993). Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET.
To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCCVMA (Bitnet) or LISTSERV@UCCVMA.UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.