Information Retrieval List Digest 094 (December 2, 1991)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-094

IRLIST Digest   December 2, 1991   Volume VIII, Number 51   Issue 94

**********************************************************
I. NOTICES
   A. Meeting Announcements/Calls for Papers
      1. Text Retrieval Conference, February 1992 - November 1992
      2. 4th Message Understanding System Evaluation & Message Understanding Conference, December 1991 - June 1992
**********************************************************

I. NOTICES

I.A.1
Fr: Donna Harman X3569
Re: Text Retrieval Conference, February 1992 - November 1992

                    CALL FOR PARTICIPATION
                  TEXT RETRIEVAL CONFERENCE
                February 1992 - November 1992

Conducted by: National Institute of Standards and Technology (NIST)
Sponsored by: Defense Advanced Research Projects Agency
              Software and Intelligent Systems Technology Office (DARPA/SISTO)

A new conference for the examination of text retrieval methodologies (TREC) will be held in early November 1992 at Gaithersburg, Md. (near Washington, D.C.). The goal of this conference is to encourage research in text retrieval from large document collections by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. Both ad-hoc queries against archival data collections and routing (filtering or dissemination) queries against incoming data streams will be tested. There are plans to publish a conference proceedings. This announcement serves as a call for participation from groups interested in this forum.

Participants will be expected to work with approximately half a million documents (2 gigabytes of data), retrieving lists of documents that could be considered relevant to each of 100 topics (50 training and 50 test topics). NIST will distribute the data and will collect and analyze the results. There will be some minimal support distributed to selected participants in an effort to maximize the number of participants and to attract the widest possible variety of technical approaches and system architectures. This funding is intended only as a supplement to other support. Non-U.S. as well as U.S. participants are eligible for this funding.

Schedule:

- Jan. 1, 1992   -- deadline for participation applications, including funding requests
- Feb. 1, 1992   -- acceptances announced
- March 1, 1992  -- first data (~ 1 gigabyte) to be distributed via CD-ROM, with the first group of topics (50), including relevance judgments for those topics, to be distributed shortly thereafter
- June 1, 1992   -- second gigabyte of data distributed via CD-ROM, after trial results and routing queries (see below) are received at NIST
- July 15, 1992  -- 50 test topics distributed
- Aug. 1, 1992   -- results from 50 routing queries and 50 test topics due at NIST
- Sept. 1, 1992  -- relevance judgments released to participants
- Oct. 1, 1992   -- individual evaluation scores due back to participants
- Nov. 4-6, 1992 -- TREC conference in Gaithersburg, Md.

Task Description:

Participants will receive the initial half of the test collection (or a subset for Category B participants, see below) to use for training of their systems, including development of appropriate algorithms or knowledge bases. The topics will be in the form of a highly formatted user need statement (see attachment 1). Queries can either be constructed automatically from this topic description or constructed manually. Participants are strongly encouraged to submit at least one run where queries are constructed automatically.
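[Editor's illustration] For readers unfamiliar with automatic query construction, the following is a minimal sketch of one naive way a bag-of-words query might be derived from a topic statement: simple term extraction with stopword removal. The stopword list, sample topic text, and term limit are illustrative assumptions only, not part of the TREC topic format or any prescribed method.

# Illustrative only: a naive automatic query constructor for a topic
# statement. The stopword list, sample topic, and term limit are
# assumptions for this sketch, not the actual TREC topic format or a
# prescribed technique.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "on", "for", "and", "or",
             "to", "with", "that", "is", "are", "be", "by", "as",
             "between", "discussing"}

def build_query(topic_text, max_terms=20):
    """Extract the most frequent non-stopword terms as a bag-of-words query."""
    terms = re.findall(r"[a-z]+", topic_text.lower())
    counts = Counter(t for t in terms if t not in STOPWORDS and len(t) > 2)
    return [term for term, _ in counts.most_common(max_terms)]

if __name__ == "__main__":
    sample_topic = ("Documents discussing joint ventures between U.S. and "
                    "foreign companies in the area of international finance.")
    print(build_query(sample_topic))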
Two types of retrieval operations will be tested: a routing or filtering operation against new data, and an ad-hoc query operation against archival data. The 50 topics initially distributed as training topics will be used by each participating group to create formalized routing or filtering queries to be run against the second gigabyte of data. The 50 new test topics will be used against the full 2 gigabytes of data as ad-hoc queries. Results from both types of queries (routing and ad-hoc) will be submitted to NIST as the top X documents (X to be determined at a later date) retrieved for each query, although systems not providing ranking are not specifically excluded. Participants creating queries both automatically and manually may submit both sets for evaluation. A scoring technique using traditional recall/precision measures will be run for all systems, and individual results will be returned to each participant (a sketch of these measures follows the category descriptions below).

Conference Format:

The conference itself will be used as a forum both for presentation of results (including failure analyses and system comparisons) and for more lengthy system presentations describing the retrieval techniques used, experiments run using the data, and other issues of interest to researchers in information retrieval. As there is a limited amount of time for these presentations, some presentations may be in the form of expanded poster sessions. Additionally, some organizations may not wish to describe their proprietary algorithms, and these groups may choose to participate in a different manner (see Category C). To allow a maximum number of participants, the following three categories have been established.

Category A: Full participation. Participants will be expected to work with the full data set and to present full details of system algorithms and the various experiments run using the data. In addition to algorithms and experiments, some information on time and effort statistics should be provided. This includes time for data preparation (such as indexing, building a manual thesaurus, or building a knowledge base), time for construction of manual queries, query execution time, etc. More details on the desired content of the presentation will be provided later.

Category B: Exploratory groups. Because small groups with novel retrieval techniques might like to participate but may have limited research resources, a category has been set up to work with only a subset of the data. This subset (see the data description below) will consist of about 1/4 gigabyte of training data (and 25 training topics) and 1/4 gigabyte of test data (and 25 test topics). Participants in this category will be expected to follow the same schedule as Category A, except with less data, and will be expected to present full details of system algorithms, experiments, and time and effort statistics, either in an expanded poster session or as a regular presentation.

Category C: Evaluation only. Participants in this category will be expected to work on the full data set, submit results for common scoring and tabulation, and present their results in a poster session, including the time and effort statistics described in Category A. They will not be expected to describe their systems in detail. It is not anticipated that any supplemental funding will be available for this category.
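[Editor's illustration] As a rough illustration of the traditional recall/precision measures mentioned above, the sketch below scores a single query's ranked list against a set of relevance judgments at a fixed cutoff. The document identifiers and cutoff value are invented for the example; this is not the scoring software NIST will use.

# Illustrative only: recall and precision for one query's ranked list,
# given a set of documents judged relevant. Identifiers and cutoff are
# made up for the example; this is not the official NIST scoring code.

def recall_precision(ranked_docs, relevant_docs, cutoff):
    """Compute recall and precision over the top `cutoff` retrieved documents."""
    retrieved = ranked_docs[:cutoff]
    hits = sum(1 for doc in retrieved if doc in relevant_docs)
    recall = hits / len(relevant_docs) if relevant_docs else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

if __name__ == "__main__":
    ranked = ["WSJ-001", "AP-017", "WSJ-044", "ZF-203", "AP-090"]
    relevant = {"WSJ-001", "ZF-203", "FR-112"}
    r, p = recall_precision(ranked, relevant, cutoff=5)
    print(f"recall = {r:.2f}, precision = {p:.2f}")  # recall = 0.67, precision = 0.40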
Data (Test Collection):

The test collection (documents, topics, and relevance judgments) will be the same collection (English only) being used for the DARPA TIPSTER project. The collection is being assembled from the ACL/DCI text initiative, and an ACL/DCI User Agreement will be required from all participants. The documents and topics will cover two domains, international finance and science and technology, with half of the topics in each area. The documents will be an assorted collection of newspapers (including the Wall Street Journal), newswires, journals, technical abstracts, and email newsgroups. The test set will be of approximately the same composition as the training set, and all documents will be typical of those seen in a real-world situation (i.e., there will not be arcane vocabulary, but there may be missing pieces of text or typographical errors). The subset for Category B will use the topics from the domain of international finance and will run on a document collection consisting mostly of newspapers. The format of the documents should be relatively clean and easy to use as is (see attachment 2). Most of the documents will consist of a text section only, with no titles or other categories.

The relevance judgments against which each system's output will be scored will be made by experienced relevance assessors based on the output of five different retrieval methods (from the TIPSTER project), using a pooled relevance methodology (see the sketch following the submission details below).

Response format and submission details:

By Jan. 1, 1992 organizations wishing to participate at any of the category levels should respond to this call for participation by submitting a summary of their text retrieval approach and a system architecture description, not to exceed five pages in total. The summary should include the strengths and significance of their approach to text retrieval and highlight differences between their approach and other retrieval approaches. These summaries will serve as the basis for any published proceedings. Opportunity to revise the summaries and add explanations of the results will be provided before publication.

Each organization should indicate in which category it wishes to participate and which other categories are acceptable if the desired category is filled. Please indicate clearly the persons responsible for the summary statement and to whom correspondence should be directed. A full regular address, telephone number, and an email address should be given. EMAIL IS THE PREFERRED METHOD OF COMMUNICATION, although it is realized that diagrams and figures will need to be sent by regular mail or FAX. Organizations with no email connection must provide a viable alternative for communicating data and results between themselves and NIST.

Those organizations wishing to apply for funding to supplement their own resources must provide a second statement (not to exceed two pages). This statement should include an estimate of the amount of funding available from other sources to support participation in this work, and a specification of the amount of funding desired. Please clearly indicate whether the organization is interested in participating in TREC even if no funding is available.

All responses should be submitted by Jan. 1, 1992 to the Program Chair, Donna Harman: harman@magi.ncsl.nist.gov, or Donna Harman, NIST, Building 225/A216, Gaithersburg, Md. 20899; FAX: 301-975-2128. AS NOTED ABOVE, EMAIL IS THE DESIRED FORM OF COMMUNICATION.
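[Editor's illustration] The pooled relevance methodology mentioned under the data description can be pictured roughly as follows: the top-ranked documents from several systems' runs are merged into a single pool, and assessors judge each pooled document once. The run names, document identifiers, and pool depth below are invented for the sketch; the actual TIPSTER/TREC pooling parameters are not given in this announcement.

# Illustrative only: forming a pool of documents for relevance
# assessment by merging the top-ranked output of several retrieval
# runs. Run names, identifiers, and depth are invented for the sketch.

def build_pool(runs, depth=100):
    """Union of the top `depth` documents from each system's ranked list."""
    pool = set()
    for ranked_list in runs.values():
        pool.update(ranked_list[:depth])
    return sorted(pool)  # assessors judge each pooled document once

if __name__ == "__main__":
    runs = {
        "system_1": ["WSJ-001", "AP-017", "ZF-203"],
        "system_2": ["ZF-203", "WSJ-044", "WSJ-001"],
    }
    print(build_pool(runs, depth=2))  # ['AP-017', 'WSJ-001', 'WSJ-044', 'ZF-203']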
Any questions about conference participation, response format, etc. should also be sent to the same address, as should requests for the text of attachments 1 and 2 (deleted here for space conservation).

Selection of participants:

As the goal of TREC is to further research in large-scale text retrieval, the program committee will be looking for as wide a range of text retrieval approaches as possible, and will select the best representatives of these approaches as participants for Categories A and B. Category C participants must be able to demonstrate their ability to work with the full data collection. The program committee has been chosen from a broad area of information retrieval researchers and government users, and will both select the participants and provide guidance in the planning of the conference. The funding requests will be handled by a group outside the program committee that will not contain any participants in the conference.

Program Committee:

Donna Harman, NIST, chair
Ed Addison, Synchronetics, Inc.
Chris Buckley, Cornell University
Darryl Howard, U.S. Department of Defense
David Lewis, University of Chicago
Jan Pedersen, Xerox PARC
John Prange, U.S. Department of Defense
Alan Smeaton, Dublin City University, Ireland
Richard Tong, Advanced Decision Systems

**********

I.A.2
Fr: Donna Harman X3569
Re: 4th Message Understanding System Evaluation & Message Understanding Conference (MUC-4), December 1991 - June 1992

                    CALL FOR PARTICIPATION
       FOURTH MESSAGE UNDERSTANDING SYSTEM EVALUATION AND
            MESSAGE UNDERSTANDING CONFERENCE (MUC-4)
                  December 1991 - June 1992

Sponsored by: Defense Advanced Research Projects Agency
              Software and Intelligent Systems Technology Office (DARPA/SISTO)

The fourth in a series of text analysis system evaluations will be conducted over the coming months, concluding with the Fourth Message Understanding Conference in June 1992. These evaluations are intended to provide a sound basis for understanding the merits of current text analysis techniques as applied to the performance of a realistic information extraction task. The conference agenda will include reports and analyses of the systems and test results by the system developers, and descriptions and critiques of the evaluation design by the test designers. The conference will be held on 16-18 June 1992, preceded by a two-week testing period. It will be held in McLean, VA, and hosted by PRC, Inc., a longtime MUC participant. Attendance at the conference is limited to evaluation participants, test designers and advisers, and representatives of government agencies.

THE MUC-3 CHALLENGE:

The MUC-3 experience is documented in the conference proceedings, which are available through Morgan Kaufmann Publishers at a cost of $40 plus shipping. If you are located in North America, the best way to place your order is to phone them toll free at (800) 745-7323; if you are located outside North America, you may request information by email to morgan@unix.sri.com. When ordering, please refer to "Proceedings Third Message Understanding Conference (MUC-3)" or to ISBN 1-55860-236-4.

THE MUC-4 OPPORTUNITY:

The results of MUC-3 support the belief that the systems could do significantly better on the information extraction task if given more time for development. Only then would we begin to see what the upper limits are for a given system and for current systems as a whole. However, the MUC-3 results also showed that there is not only plenty of room for improvement, but also plenty of room for innovation.
Thus, the performance of systems not tested for MUC-3 is also of interest, even if that performance does not match that of the MUC-3 systems being retested for MUC-4. Also of high interest are the results of experiments conducted by the participants to explore ways of obtaining various tradeoffs among the measures of performance. Organizations pursuing pure and hybrid approaches using pattern matching, information retrieval, machine learning, syntax-driven natural language processing, semantics-driven natural language processing, or other techniques are invited to participate.

A corpus that can be used for system development is available in electronic form. It consists of 1400 texts and corresponding templates. Systems must be able to cope with the fact that substantial portions of some texts, and even entire texts, are irrelevant to the template generation task.

Controlled changes to the MUC-3 evaluation design will be made to produce an improved evaluation for MUC-4 and to maintain continuity with MUC-3 so that progress can be measured. The changes are intended to reveal the text analysis capabilities of the various systems more accurately and more thoroughly.

Participants will be required to conduct a dry-run test a couple of months prior to the conference to ensure that the systems are functional and that the test procedures are well understood. The dry-run test results will be submitted to the MUC-4 program chair at the Naval Ocean Systems Center, who will compile the summary score reports, label each site's score report anonymously, and distribute the compilation to the other participants.

Two representative texts and corresponding templates are attached. (The fill requirements are covered in reference documents, which are available along with the corpus in electronic form.) The sample templates follow the guidelines used for MUC-3. Assistance from participants is expected in updating the answer key templates for the training corpus to reflect the changes that are being planned for MUC-4.

Participants are expected to follow the prescribed test procedures rigorously, and the name, description, and performance of each system will be published. All systems will be evaluated based on performance on the template fill task in a blind test. The primary metrics will be recall and precision. Individual slot scores and composite scores will be calculated for the test set as a whole, for individual texts in the test set, and for some subsets of the texts (such as just the "relevant" texts). A semi-automated scoring program will be distributed to participants to facilitate and standardize scoring. (A sketch of slot-level recall and precision appears below, following the discussion of adjunct tests.)

The templates produced by the systems may also be evaluated in other ways, based on cross-system tests designed and conducted by persons responding to this call for participation. Submission of candidate adjunct test plans is invited, and a very limited amount of funding is available. These tests are expected to reveal something about language analysis performance that cannot be obtained from the usual score reports, or to reveal issues in measuring the accuracy of information extraction technology; they are also expected to be designed in such a way that no special outputs have to be generated by the systems. When the sites submit their output templates (and other test materials) for the test set, the templates will be forwarded to the adjunct test designers, who will carry out the adjunct tests and report the results.
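[Editor's illustration] As a rough illustration of the recall and precision metrics for the template fill task, the sketch below compares a system's response template against an answer key at the slot level. The slot names and fills are invented, and the official MUC scoring program handles many cases (partial credit, optional slots, multiple fills) that this toy version ignores.

# Illustrative only: slot-level recall and precision for a template
# fill task, comparing a response template against an answer key.
# Slot names and fills are invented; this is not the MUC scoring program.

def score_slots(key, response):
    """Count correct fills; recall = correct/possible, precision = correct/actual."""
    correct = sum(1 for slot, fill in response.items() if key.get(slot) == fill)
    possible = len(key)        # fills required by the answer key
    actual = len(response)     # fills generated by the system
    recall = correct / possible if possible else 0.0
    precision = correct / actual if actual else 0.0
    return recall, precision

if __name__ == "__main__":
    key = {"INCIDENT TYPE": "BOMBING", "PERP": "TERRORISTS", "TARGET": "EMBASSY"}
    response = {"INCIDENT TYPE": "BOMBING", "TARGET": "BANK"}
    r, p = score_slots(key, response)
    print(f"recall = {r:.2f}, precision = {p:.2f}")  # recall = 0.33, precision = 0.50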
Sites participating in the MUC-4 evaluation will be informed of the nature of the tests in advance.

A limited amount of funding will be available to encourage participation by particularly worthy, needy organizations. The amount of each award will depend on the number of good applications received and the need of each applicant. Since the MUC-4 evaluation will take place over an extended period of time, it is expected that even those organizations that receive this support will need to identify other resources to supplement it. Organizations that can demonstrate their commitment to the evaluation process by identifying other funds to supplement an award will be considered more favorably than organizations that would have no other source of funding to apply to the effort.

DEADLINES AND CHECKLIST

27 December 1991   Application for Participation (required)
27 December 1991   Request for Funding of Participation (optional)
17 January 1992    "Adjunct Test" Plan (optional)
17 January 1992    Request for Funding of Adjunct Test (optional)

NOTE: Notification of acceptance/rejection of the submissions listed above will be made within 4 weeks of the deadlines.

27 December 1991: DEADLINE FOR APPLICATIONS FOR PARTICIPATION

Participation in the MUC-4 evaluation will be limited to those who have already invested a substantial amount of time and energy in software research and development. The software should be capable of accepting texts without manual preprocessing and should be readily applicable to the template generation task. Organizations wishing to participate must respond by submitting a summary of their text processing/interpretation approach and a system architecture description, not to exceed five pages in total. Acceptance or rejection will be determined on the basis of a technical assessment of each application. The summary and description will be published in the conference proceedings. Participants will have the opportunity to make revisions prior to publication.

NOTE: Those who are interested in participating in MUC-4 need not wait until their application is submitted before obtaining the training data, which is available for electronic transfer. Please contact the program chair (sundheim@nosc.mil) for further information.

MUC-4 participants are expected to comply with all prescribed test procedures, to make presentations at the conference, and to prepare papers for inclusion in a conference proceedings. Please note that participation implies agreement to permit the organization name, system description, and performance to be published.

27 December 1991: DEADLINE FOR PARTICIPATION FUNDING REQUESTS

Those wishing to apply for partial funding of their MUC-4 participation must provide a second statement, not to exceed two pages. This statement should identify:

* the particular strengths and significance of your approach to extracting and deriving useful information from text;
* an estimate of the amount of funding available from other sources to support your participation in the evaluation;
* a description of any software to be used for MUC-4 that you are willing to deliver to the Government and MUC participants for possible redistribution;
* a specification of the amount of funding desired and the minimum acceptable amount.

Evaluators of funding requests will not include MUC system developers. Please note that fewer than two-thirds of the MUC-3 participants received similar financial support; the amounts ranged from $10,000 to $45,000, with most awards in the middle of the range. The total amount available for MUC-4 is more limited than it was for MUC-3.
17 January 1992: DEADLINE FOR ADJUNCT TEST PLANS

Organizations wishing to carry out a cross-system experimental test using the MUC-4 templates must respond by submitting a preliminary test plan, not to exceed five pages in total, including a clear statement of the hypothesis, objective, and planned approach, and at least a preliminary plan for implementing and conducting the test. Acceptance or rejection will be determined on the basis of technical merit and feasibility.

17 January 1992: DEADLINE FOR ADJUNCT TEST FUNDING REQUESTS

Those wishing to apply for funding to partially cover the cost of conducting an adjunct test should provide a second statement, not to exceed two pages, to be evaluated. This statement should identify:

* which aspects of the work are essential;
* any optional portions of the effort;
* an estimate of the amount of funding available from other sources to support the effort;
* a specification of the amount of funding desired and the minimum acceptable amount.

Evaluators of funding requests will not include MUC system developers. Carrying out an adjunct test for MUC-4 will entail a commitment to design and conduct the adjunct test, to analyze the results and present them at the conference, and to write up the experiment for the conference proceedings. The total amount of funding available for the adjunct tests is extremely limited, and it may not be possible to fund more than one or two.

POINT OF CONTACT FOR MUC-4:

All responses to this call for participation should be directed to the program chair, Beth Sundheim, via email to sundheim@nosc.mil or via US mail to: Beth Sundheim, Naval Ocean Systems Center, Code 444, San Diego, CA 92152-5000. EMAIL IS HIGHLY PREFERRED FOR ALL COMMUNICATIONS.

PROGRAM COMMITTEE:

Laura Blumer Balcom, Advanced Decision Systems
Nancy Chinchor, Science Applications International
Ralph Grishman, New York University
Jerry Hobbs, SRI International
David D. Lewis, University of Chicago
Lisa Rau, General Electric R&D Center
Beth Sundheim, Naval Ocean Systems Center (program chair)
Carl Weir, Unisys Center for Advanced Information Technology

For further information and sample texts and templates, contact Beth Sundheim, sundheim@nosc.mil.

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA 94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to:  IR-L@UCCVMA.BITNET

Editorial Staff:
Clifford Lynch   lynch@postgres.berkeley.edu or calur@uccmvsa.bitnet
Nancy Gusack     ncgur@uccmvsa.bitnet
Mary Engle       engle@cmsa.berkeley.edu or meeur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues. To access back issues presently, send the message INDEX IR-L to LISTSERV@UCCVMA.BITNET. To get a specific issue listed in the index, send the message GET IR-L LOG ***, where *** is the month and day on which the issue was mailed, to LISTSERV@UCCVMA.BITNET. These files are not to be sold or used for commercial purposes. Contact Nancy Gusack or Mary Engle for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.