Information Retrieval List Digest 034 (October 4, 1990)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-034

IRLIST Digest           October 4, 1990
Volume VII Number 28    Issue 34
**********************************************************
I. NOTICES
   A. Meetings announcements/Calls for papers
      1. JIPDEC - CID International Symposium on Trends of
         Intelligent Hypermedia
         October 25-26, 1990 Tokyo, Japan
      2. ACM SIGIR '91 14th International Conference on R&D in IR
         October 13-16, 1991 Chicago, Illinois
II. QUERIES
   A. Questions and answers
      1. Detecting identical or near-identical texts
      2. Mac FTP
**********************************************************
I. NOTICES

I.A.1.
Fr: Gregory Grefenstette
Re: JIPDEC - CID International Symposium on Trends of
    Intelligent Hypermedia
    October 25-26, 1990 Tokyo, Japan

Organizers
 . Japan Information Processing Development Center (JIPDEC)
 . Centre de Hautes Etudes Internationales d'Informatique
   Documentaire (CID).

AGENDA

October 25

Opening Remarks
    9:00 am - 9:30 am
 . Mr. Eiji Kageyama, President of JIPDEC.
 . Prof. A. Licherowciz, scientific director of C.I.D.
 . Minister of International Trade and Industry.

(1) Keynote Speech
    9:30 am - 11:00 am
    Mr. Theodor Nelson (U.S.A.)
    Founding Designer, Project Xanadu, Autodesk, Inc.

(2) Hypermedia as an Aid to Learning
    11:00 am - 12:10 pm
    Dr. Peter Zorkoczy (U.K.)
    Director, Center for Electronics Education

LUNCH

(3) Dynamic Media Toolkit System
    1:10 pm - 2:20 pm
    Prof. Dr. Yuzura Tanaka (Japan)
    Electrical Engineering Dept., Hokkaido University.

(4) Dynamic Hypertext Links and Computer Aided Linkage
    2:20 pm - 3:30 pm
    Prof. Christian Fluhr (France)
    I.N.S.T.N. (C.E.A.) Paris XI University

COFFEE BREAK

(5) Associative Retrieval Method for Image Database
    3:30 pm - 5:00 pm
    Mr. Masahiro Shibata (Japan)
    NHK (Japan Broadcasting Corp.)

5:00 pm - 6:30 pm Exhibition
6:30 pm - 8:30 pm Party

October 26

(6) Comparative Analysis of Hypertext and Neural Networks Models: Mind Extension vs.
Mind Simulation
    9:30 am - 10:40 am
    Mr. Daniel Gross (U.S.A.)
    Chairman of Magnetic Press.

(7) Very Large Knowledge Base and Hypertext
    10:40 am - 11:50 am
    Mr. Toshio Yokoi (Japan)
    General Manager, Japan Electronic Dictionary Research
    Institute, Ltd.

LUNCH

(8) Application of Hypermedia and Information Retrieval for
    Technical Documentation
    12:50 pm - 2:00 pm
    Mr. Philippe Flichy (France)
    Chairman of Better Way.

(9) Concept Browser for a Personal Information Base
    2:00 pm - 3:10 pm
    Dr. Hironichi Fujisawa (Japan)
    Hitachi, Ltd.

COFFEE BREAK

Panel Session
    3:30 pm - 5:30 pm
    Coordinator : Prof. Dr. Tanaka
    Panelists   : Prof. C. Fluhr
                  Mr. Ph. Flichy
                  Dr. P. Zorkoczy
                  Mr. T. Yokoi
                  Dr. H. Fujisawa

Registration fees: $420.
(account) 'CASIS' nx 02 51 46 99 / 00
Societe Generale Bank
50 Rockefeller Plaza, New York, U.S.A.

Special travel fares are available.

**********

I.A.2.
Fr: Abraham Bookstein
Re: ACM SIGIR '91 14th International Conference on R&D in IR
    October 13-16, 1991 Chicago, Illinois

Sponsored by: ACM SIGIR

In co-operation with:
    AICA - GLIR (Italy)
    BCS - IRSG (UK)
    GI (Federal Republic of Germany)
    INRIA (France)

The events of SIGIR '91 are being coordinated with the Centennial
celebrations of the University of Chicago. The Center for Information
and Language Studies (CILS) is responsible for the coordination.

INFORMATION RETRIEVAL

Problems relating to the effective storage, access and manipulation of
textual information are among the most challenging to current computer
science. Information is continuing to grow exponentially and is
increasingly becoming available in machine readable form; computer
networks are making communicating information easier; new computer
architectures and inexpensive, powerful hardware are making feasible
the introduction of sophisticated, computer intensive algorithms for
efficiently storing and retrieving information.
Research in information retrieval touches on fields as diverse as the
design and analysis of algorithms, natural language processing,
artificial intelligence, hypertext, multimedia data management, and
software engineering.

The Annual ACM SIGIR Conference is the premier forum for presentation
and discussion of current research in Information Retrieval. The 14th
Annual Conference will continue this multidisciplinary tradition, but
will focus especially on the problems of full text databases. The
program will consist of contributed research papers and panel
presentations. There will also be a program of tutorials on Sunday,
October 13.

TOPICS FOR SIGIR '91

Original research papers and panel proposals are solicited on topics
including, but not limited to, the following:

Information retrieval theory: Retrieval models and algorithms,
    Evaluation, Document and query presentation, extension to full
    text databases.
Artificial Intelligence Applications: Knowledge representation,
    Connectionism, Expert Systems.
Natural Language Processing: Application of lexicons, parsing
    algorithms to IR.
Interface Issues: Human-computer interaction, design considerations.
Hypertext and Multimedia Systems: Software reuse, Office information
    systems, Case-based retrieval.
Implementation issues: New computer architectures, Retrieval hardware,
    Storage devices, Data structures, Compression methods.

INSTRUCTIONS FOR CONTRIBUTORS

CONTRIBUTED PAPERS

Persons wishing to contribute original research papers should send
four copies of a full paper to the appropriate program chair, as
indicated below. Papers or (if the author chooses) extended, 10-12
page, abstracts will be published in the conference proceedings;
authors will be required to sign an ACM copyright release form. The
program committee may select papers for journal publication, in which
case an abstract will be published in the proceedings. Submissions are
due March 25, 1991.
PANEL PRESENTATIONS

Suggestions for panels should consist of descriptions of the topics to
be covered, the names of proposed speakers and moderator, brief
abstracts of the proposed presentations, and the desired length of
time for the panel. Four copies of proposals, of no more than three
pages, should be sent to the appropriate program chair. Proposals are
due March 25, 1991.

TUTORIALS

Proposals for tutorials should consist of the topic to be discussed,
the name(s) and brief biographies of the presenter(s), and an outline
of the tutorial. Four copies of proposals, of no more than three
pages, are due April 25, 1991. Email may be used for tutorial
proposals, but must be backed up by hard copy. Proposals should be
sent to the tutorial chair:

    Dr. Donna K. Harman
    Building 225/A216
    National Institute of Standards and Technology
    Gaithersburg, MD 20899
    harman@dsys.ncsl.nist.gov

IMPORTANT DATES

March 25, 1991: Papers and panel proposals due to program chairs
April 25, 1991: Tutorial proposals due to tutorial chair
June 3, 1991:   Authors informed of acceptance of papers and proposals
July 15, 1991:  Final versions of papers due to program chairs

CONFERENCE CHAIR

    Prof. Abraham Bookstein
    1100 E. 57th, CILS
    University of Chicago
    Chicago, IL 60637, USA
    bkst@tira.uchicago.edu
    Telephone: (312) 702-8268
    FAX: (312) 702-0775

PROGRAM CHAIRS

Americas and Asia

    Prof. Gerard Salton
    Department of Computer Science
    Cornell University
    Upson Hall
    Ithaca, NY 14853, USA
    gs@gvax.cs.cornell.edu

Europe, Africa, Australia

    Prof. Yves Chiaramella
    LGI-IMAG
    B.P. 53 X
    38041 Grenoble CEDEX
    France
    chiara@imag.imag.fr

PROGRAM COMMITTEE

Maristella Agosti, Universita di Padova, Italy
Nick Belkin, Rutgers University, USA
Abraham Bookstein, University of Chicago, USA
Christine Borgman, University of California, Los Angeles, USA
Giorgio Brajnik, Universita degli Studi di Udine, Italy
Yves Chiaramella (Chair, European Committee), University of Grenoble, France
S. Christodoulakis, University of Waterloo, Canada
M.
Crehange, CRIN, France
Bruce Croft, University of Massachusetts, USA
Christian Fluhr, CEN-SACLAY, France
Ed Fox, Virginia Polytechnic Institute, USA
Norbert Fuhr, Technische Hochschule Darmstadt, Germany
Paul Jacobs, General Electric Research, USA
Gary Marchionini, University of Maryland, USA
V. Quint, INRIA, France
Fausto Rabitti, IEI-CNRS, Italy
Vijay Raghavan (Co-Chair, USA), University of Southwestern Louisiana, USA
Edie Rasmussen, University of Pittsburgh, USA
Gerard Salton (Co-Chair, USA), Cornell University, USA
Craig Stanfill, Thinking Machines Corporation, USA
Jean-Luc Vidick, Universite Libre de Bruxelles, Belgium
Peter Willett, University of Sheffield, UK
Michael Wong, University of Regina, Canada
Clement Yu, University of Illinois, Chicago, USA
Keith van Rijsbergen, Glasgow University, UK

CONFERENCE COMMITTEE

Conference Chair: Abraham Bookstein, Center for Information and
    Language Studies, University of Chicago
Program Chairs: Gerard Salton, Cornell University
    Vijay Raghavan, University of Southwestern Louisiana
    Yves Chiaramella, Institut de Mathematiques Appliques de Grenoble
Tutorials Chair: Donna Harman, National Bureau of Standards, USA
Local Arrangements Chair: Michael Koenig, Rosary College
Publicity Chair: Edward A. Fox, Virginia Polytech
Local Publicity: Scott Deerwester, Center for Information and
    Language Studies, University of Chicago
Treasurer: Clement Yu, University of Illinois, Chicago Campus

**********************************************************
II. QUERIES

II.A.1.
Fr: Dave Lewis
Re: Detecting identical or near-identical texts

Hi,

Does anyone know of a simple and efficient technique for detecting
that two pieces of text are identical or nearly identical? I have a
moderate size (approx 30,000 documents) corpus of texts (ranging from
one sentence to a couple of pages in length) where some of the texts
are known to be duplicates or near-duplicates of each other, and I'd
like to detect with high probability all the pairs of duplicates and
near-duplicates.
I am willing to look at a moderate number of false pairs. This could
be done by document clustering techniques, but I'd like something that
avoids the overhead, such as building indexed versions of the corpus,
that clustering would require in order to run efficiently. One
approach that occurred to me is forming a bit vector for each text
(using signature file techniques) and doing all pairwise comparisons.
But I'd be interested in hearing from anyone who's attacked this
problem before or knows of literature on it.

Best,
Dave Lewis
lewis@cs.umass.edu

**********

II.A.2.
Fr: Luis Morin
Re: Mac FTP

At the campus we have an AppleTalk network with Kinetics FastPaths
connected to the Internet. We are having several problems trying to
use Mac FTP to communicate with other nodes. Do you know if a special
version is required on Mac FTP or on the Kinetics? We have the full
range of Macs, from 512s to CIs. There are 8 Kinetics, and the number
of Macs on the network is 120. As a comment, we can emulate 3270 from
those Kinetics. Any suggestions?

Thanks in advance,
Luis Morin.

**********************************************************
IRLIST Digest is distributed from the University of California,
Division of Library Automation, 300 Lakeside Drive, Oakland, CA.
94612-3550.

Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET

Editorial Staff:
    Clifford Lynch   lynch@postgres.berkeley.edu
                     calur@uccmvsa.bitnet
    Mary Engle       engle@cmsa.berkeley.edu
                     meeur@uccmvsa.bitnet
    Nancy Gusack     ncgur@uccmvsa.bitnet

The IRLIST Archives will be set up for anonymous FTP, and the address
will be announced in future issues.

These files are not to be sold or used for commercial purposes.
Contact Mary Engle or Nancy Gusack for more information on IRLIST.

The opinions expressed in IRLIST do not represent those of the editors
or the University of California. Authors assume full responsibility
for the contents of their submissions to IRLIST.