Information Retrieval List Digest 258 (June 5, 1995)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-258

IRLIST Digest ISSN 1064-6965
June 5, 1995
Volume XII, Number 21 Issue 258

**********************************************************

I. QUERIES
   1. Stop Words
II. JOBS
   1. CompuServe: Lead Analyst; Programmer/Analyst
   2. Harvard Medical School: Countway Library of Medicine
III. NOTICES
   A. Publications
      1. Comp.theory.info-retrieval Reborn
   B. Meetings
      1. ASIS European Chapter/Call for Papers
      2. ASIS Annual Meeting 1996
      3. ACL-95 Corpus-based NLP Workshop

**********************************************************

I. QUERIES

I.1.
Fr: Venu Dasigi
Re: List of Function Words (Stop Words)

I would like to obtain an on line list of function words (of English, for now). I believe they are also generally referred to as "stop words". I was almost sure a list is available somewhere, and even that I came across one some time in the cyberspace, but can't locate one now. I tried both LDC and CLR, but didn't find it there. I admit my search was perhaps not very thorough. Any help will be appreciated as to an ftp site or a web link or whatever.

Thanks very much.

Dr. Venu Dasigi (203) 371-7792 dasigi@shu.sacredheart.edu
Dept. of Computer Science and Information Technology
Sacred Heart University, 5151 Park Avenue, Fairfield, CT 06432-1000

Summer, 1995 at:
Oak Ridge National Laboratory
P.O. Box 2008 MS-6364 Bldg. 6025
Oak Ridge, TN 37831
vdasigi@plato.epm.ornl.gov

**********************************************************

II. JOBS

II.1.
Fr: Debbie Roark <70003.5163@compuserve.com>
Re: CompuServe: Lead Analyst; Programmer/Analyst

CompuServe, a world leader in the communications and information services industries, offers technical professionals the kind of environment that will provide long-term professional and personal satisfaction and the latest in proven technology. We're looking for technical professionals to join our team and provide aggressive analysis and development.
These positions are immediately available in Columbus, Ohio.

The Database Engineering team in CompuServe's Information Delivery Platforms department has two positions available, Lead Analyst and Programmer/Analyst. The Database Engineering team evaluates information retrieval products and technologies, implements "first-of-a-kind" applications, and provides tools and support for effective use of full-text, object-oriented, and relational databases in online applications.

Qualified candidates will possess:
- Familiarity with common text indexing methods
- Experience in analysis of software systems
- Software development skills in C or C++, and Windows NT or UNIX

We offer competitive salaries and attractive benefits including relocation assistance and an on-site fitness center. For immediate and confidential consideration, please forward your resume by mail to our World Headquarters at 5000 Arlington Centre Blvd., Columbus, Ohio 43220. Attention: Debbie Roark. Resumes may also be forwarded via the Internet to d.roark@csi.compuserve.com.

CompuServe is an equal opportunity employer.

**********

II.2.
Fr: blike@WARREN.MED.HARVARD.EDU
Re: Harvard Medical School: Countway Library of Medicine

POSITION DESCRIPTION:
Title: Knowledge & Consultation Services Librarian
Dept: Countway Library of Medicine
      Harvard Medical School, Boston, Massachusetts
Unit: Knowledge & Consultation Services

RESPONSIBILITIES:
The staff of the Knowledge & Consultation Services division of the Countway Library are responsible for facilitating the access to electronic resources by research and clinical faculty, students and staff of the Harvard schools of medicine, dentistry and public health, and their associated institutions. As an outgrowth of a traditional library reference department, KCS provides users with in-depth research consultation and instruction in the use of electronic tools and resources.
Emphasis is on developing access to resources via the Library's World Wide Web and designing integrated searching interfaces which permit easy integration and manipulation of hypermedia resources, including the development of electronic publications. The Library "collects" and maintains access to textual, data, structural and graphical resources and works collaboratively with faculty in the evaluation, design and use of new and innovative electronic resources.

Candidates should desire the challenge of working in a fast paced, team managed organization, the creativity of designing new services and products based on sophisticated information needs, especially in the basic sciences, and the excitement and personal growth of employing highly technical and expert interpersonal skills in collaborative roles with faculty, students and researchers.

QUALIFICATIONS:
ALA-accredited MLS degree with relevant biological sciences experience or second degree in science, or advanced degree in biological science and/or computer science with significant biological sciences experience; database design and development experience; technical expertise and competency with computers; excellent oral and written communications skills; team player.

Grade: 57; $36,200 - 57,800
Hours: Monday-Friday. Occasional weekend hours as needed.
Reports to: Susan Whitehead, Assistant Director for Knowledge & Consultation Services

SEND RESUME/COVER LETTER TO:
Elaine Pridham
Harvard Medical School Employment Office
25 Shattuck Street
Boston, MA 02115.

Salary: depending upon experience, minimum $36,000
EOE

**********************************************************

III. NOTICES

III.A.1.
Fr: Art Pollard
Re: Comp.theory.info-retrieval Reborn

Comp.theory.info-retrieval (C.T.I.R.) is alive once again. C.T.I.R. is a moderated newsgroup devoted to the discussion of the theoretical aspects of information retrieval. C.T.I.R.
has been in existence for some time; however, the former moderator disappeared, making it virtually impossible for articles to be posted. Now C.T.I.R. has a new moderator and a reasonable amount of traffic, making it a very informative newsgroup for those who are involved in the information retrieval community.

Typical topics of conversation within the newsgroup include:
* Index Construction & Data Structures
* Query Methods/Performance Figures
* Relevance Ranking
* Document Storage
* Automated Thesaurus Generation
* Document Routing

Discussions are typically highly theoretical and have a very high signal to noise ratio. I would like to extend an invitation to those on IR-LIST to participate in this newsgroup and help make it a success.

Thank you,
Art Pollard
Pollarda@Uhunix.uhcc.hawaii.edu
Comp.Theory.Info-Retrieval Moderator

**********

III.B.1.
Fr: V. Cano
Re: ASIS European Chapter/Call for Papers

INFORMATION PRODUCTS, MARKETS, AND SERVICES IN A NETWORKED ENVIRONMENT

ASIS-EC sponsors a session at the Nord-IoD Conference in Oslo, Norway. Nord-IoD will take place from the 6th-9th of September 1995. The ASIS-EC session will take place on the pre-conference meetings and seminar day, September 6th. The language of the session will be English.

The title of the ASIS-EC session is: Information Products, Markets, and Services in a Networked Environment. The session proposes to examine issues concerning the commodification of information, and the development of information products and services based on a networked environment. Papers addressing the accountability of networked information products are encouraged.

Please submit an extended abstract in English by June 15th, 1995 to:
Dr. V. Cano
Queen Margaret College, Clerwood Terrace
Edinburgh EH12 8TS
Scotland, UK
Fax. +44-1-31-3173256

Notification of acceptance: July 15th, 1995.

ASIS-EC in Nord-IoD Organizing Committee:
Dr. Sinikka Koskiala, Helsinki University of Technology, Finland.
Dr. V. Cano.
Queen Margaret College, Edinburgh, Scotland, UK
Dr. Julian Warner. Queen's University, Belfast, UK

Nord-IoD's central theme is Information Power.
- Information Power - from Economical Power to Information and Knowledge Power. Chair: Karl Kalseth, Norsk Hydro a.s. Chairman, Organizing Committee.
- Information as Basis for Value Addition. Chair: Helge Clausen. Statsbibliotecket Arhus.
- Information Forcing Change. Chair: Cecilie Butenschen, President NFF.
- IoD in a Nordic and International Perspective. Chair: Cunnhildur Manfredsdottir, University of Iceland
- The New Information Specialist - How the Information profession itself is Changing. Chair: Prijo Kalnu, PL-Consulting Ltd. Finland.
- The Global Information Market. Chair: Hans I. Holm, Astra Hassle.

**********

III.B.2.
From: Richard Hill
Re: ASIS Annual Meeting 1996 Call for Participation

CALL FOR PARTICIPATION

GLOBAL COMPLEXITY: INFORMATION, CHAOS AND CONTROL
ASIS 1996 Annual Meeting
October 21-26 1996
Baltimore, Maryland

Research in chaotic systems has uncovered order in the midst of disorder -- information hidden in noise -- and spawned complexity as a field of study. Complexity theory explores interconnectedness, coevolution, structure and order that produce spontaneous self-organizing and adaptive systems that balance precariously on the edge of chaos. From Mandelbrot sets and fractals to economics, there is a tantalizing similarity to evolutionary patterns and emergent phenomena.

As an emergent and interdisciplinary field, information science should profit by exploring complexity. From the bits transmitted via an information channel to the less well understood transfer of knowledge and wisdom, there are patterns. Are they global?

The ASIS 1996 Annual Meeting will consider the complexity of the working world of information professionals as well as theoretical perspectives involving the nature and use of information.
Topics to be addressed will include:

* Generation and dissemination of information
  How do individuals and organizations produce and recognize informative materials using multiple technologies and myriad, networked resources? What can be learned from parallels with the incunabula period of printing, when proliferation of documents led to higher literacy?

* Information organization and access.
  It has been said that traditional publishing guarantees some quality precisely because of its time lag. With information being provided instantaneously, can we assure quality without tacitly endorsing censorship? How can multiple organizations be created, maintained, and made useful? If interfaces evolve to cope with complexity, what will be the roles of intermediaries?

* Social implications of complex information systems.
  When anyone with a file server on the Internet can look like a multinational conglomerate, will Davids slay Goliaths? What will promote innovation, and how will it be recognized? Who will own what, and how can information producers protect themselves? Will traditionally underserved groups find access to complex information resources?

CONTRIBUTED PAPERS:
Contributed papers report results of completed research or research in progress. Papers should be scholarly in nature and will be refereed. Those accepted will be published in full in the conference Proceedings. Authors of accepted papers will be expected to attend the conference and will be given 15-20 minutes to present their work.

To submit a contributed paper, send an intent consisting of the title and a 250 word abstract with complete addresses of author(s) to the Contributed Papers Coordinator, Linda C. Smith, at the address below by December 15, 1995. Preliminary approval will be made by January 15, 1996. Three copies of the complete paper will be due on February 15, 1996.
Notification of acceptance will be made no later than April 1, 1996, and camera-ready copy for the Proceedings will be due June 1, 1996.

PANEL SESSIONS:
Panel sessions and other technical programs are developed by ASIS Special Interest Groups (SIGs) either individually or in collaboration with other SIGs or with organizations and individuals outside ASIS. Initial proposals for panel sessions should include: session title, sponsoring SIG(s), name and address of session organizer (contact person), brief description (500 words), and names and affiliations of presenters and moderators.

Proposals should be sent to the SIG Sessions Coordinator, Merri Beth Lavagnino, at the address below by December 15, 1995. Notification of acceptance will be sent by February 1, 1996. Final program copy, including speakers, titles, and abstracts, will be due March 15, 1996, and camera-ready copy of abstracts for the Proceedings will be due June 1, 1996. Panel session papers that are submitted to the Contributed Papers Coordinator by February 15 and follow the schedule described for contributed papers may be published in full in the Proceedings.

SUBMISSION INFORMATION:

Contributed Papers
  Proposals/abstracts (mail, fax, e-mail) due December 15, 1995
  Complete papers (1500 - 3500 words) for review due February 15, 1996
  Camera-ready copy of accepted papers due June 1, 1996

Panel Sessions
  Proposals/abstracts due December 15, 1995
  Final program descriptions due March 15, 1996
  Camera-ready copy due June 1, 1996

FOR COMPLETE INFORMATION, CONTACT:
Richard Hill
Executive Director, American Society for Information Science
8720 Georgia Avenue, Suite 501
Silver Spring, MD 20910
(301) 495-0900
FAX: (301) 495-0810
rhill@cni.org

**********

III.B.3.
Fr: David Yarowsky
Re: ACL-95 Corpus-based NLP Workshop

THE THIRD WORKSHOP ON VERY LARGE CORPORA
Friday, 30 June 1995 8:45 AM - 5:25 PM
MIT, Cambridge, Massachusetts, USA at ACL-95
(Sponsored by ACL's SIGDAT and SIGNLL)

The workshop will present original research in corpus-based and statistical natural language processing. Topics will include sense disambiguation, grammar induction, part-of-speech tagging, information retrieval, language modeling, and machine translation.

This year's theme is: SUPERVISED TRAINING VS. SELF-ORGANIZING METHODS

Historically, annotated corpora have made a significant contribution to tasks such as part-of-speech tagging and sense disambiguation. But annotated corpora are expensive and generally unavailable for languages other than English. Self-organizing methods offer the hope that annotated corpora might not be necessary. Can we achieve comparable performance using little or no tagged training data? What are the tradeoffs?

ORGANIZERS: Ken Church and David Yarowsky

INDUSTRIAL SPONSOR: LEXIS-NEXIS, Division of Reed and Elsevier, Plc.

REGISTRATION:
Registration fees are $35 for participants who register by 19 May 1995, $40 for payment received by 15 June 1995, and $45 at the door. Registration includes a copy of the proceedings, catered lunch and refreshments during the day. Acceptable forms of payment are US$ cheques payable to "ACL" or credit card (VISA/Mastercard) payment. E-mail registrations are encouraged.

FOR COMPLETE INFORMATION, CONTACT:
David Yarowsky
Dept. of Computer and Information Science
University of Pennsylvania
200 S. 33rd St.
Philadelphia, PA 19104-6389 USA
email: yarowsky@unagi.cis.upenn.edu

More Information: http://www.cis.upenn.edu/~yarowsky/wvlc3.html
ACL-95 Homepage: http://www.ai.mit.edu/people/cgdemarc/acl/acl-info.html

PRELIMINARY PROGRAM

INVITED TALK: Mark Liberman
Eric Brill: Unsupervised Learning of Disambiguation Rules for Part of Speech Tagging
Carl de Marcken: Lexical Heads, Phrase Structure and the Induction of Grammar
Michael Collins and James Brooks: Prepositional Phrase Attachment through a Backed-off Model
Andrew Golding: A Bayesian Hybrid Method for Context-sensitive Spelling Correction
Philip Resnik: Disambiguating Noun Groupings with Respect to Wordnet Senses
Dekai Wu: Trainable Coarse Bilingual Grammars for Parallel Text Bracketing
Lance Ramshaw and Mitch Marcus: Text Chunking Using Transformation-Based Learning
INVITED TALK: Henry Kucera and Nelson Francis
Fernando Pereira, Yoram Singer, and Naftali Tishby: Beyond Word N-Grams
Jing-Shin Chang, Yi-Chung Lin, and Keh-Yih Su: Automatic Construction of a Chinese Electronic Dictionary
Kenneth Church and William Gale: Inverse Document Frequency (IDF): A Measure of Deviations from Poisson
Joe Zhou and Pete Dapkus: Automatic Suggestion of Significant Terms for a Predefined Topic
Ellen Riloff and Jay Shoen: Automatically Acquiring Conceptual Patterns without an Annotated Corpus

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.

Send subscription requests and submissions to: NCGUR@UCCMVSA.UCOP.EDU

Editorial Staff:
Clifford Lynch calur@uccmvsa.ucop.edu
Nancy Gusack ncgur@uccmvsa.ucop.edu

The IRLIST Archives is now set up for anonymous FTP, as well as via the LISTSERV. Using anonymous FTP via the host dla.ucop.edu, the files will be found in the directory pub/irl, stored in subdirectories by year (e.g., /pub/irl/1993).
Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCOP.EDU. To get a specific issue listed in the Index, send the message GET IR-L LOGYYMM, where YY is the year and MM is the numeric month in which the issue was mailed, to LISTSERV@UCOP.EDU. You will receive the issues for the entire month you have requested.

These files are not to be sold or used for commercial purposes.

Contact Nancy Gusack for more information on IRLIST.

THE OPINIONS EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.