Information Retrieval List Digest 260 (June 19, 1995)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-260

IRLIST Digest                               ISSN 1064-6965
June 19, 1995
Volume XII, Number 23
Issue 260
**********************************************************
 II. JOBS
        1. Int'l. U. of Japan: Information Access Librarian
III. NOTICES
     B. Meetings
        1. Information Retrieval & Automatic Construction
           of Hypermedia
        2. Text Encoding for Information Interchange
     C. Miscellaneous
        1. Rutgers U.: Experimental Interface to the Library
           Catalog
IV. PROJECTS
     D. Initiatives & Proposals
        1. Human Language Resources (NSF/ARPA)
**********************************************************
II. JOBS
II.1.
Fr: Kazuto Shibuya <SHIBUYA%JPNIUJ00@VMD.CSO.UIUC.EDU>
Re: International University of Japan: Information Access Librarian

International University of Japan, a two-year graduate program
offering MAs and MBAs, is seeking an individual who combines
computer expertise, public service skills and the desire to have
an impact as part of the Reference/Internet service team working
to build a library for the 21st century. The available position
is the post of Information Access Librarian at the Matsushita
Library and Information Center.

The Information Access Librarian is responsible for helping
library patrons to use the library effectively and find the
information they require for their study and research needs.

A significant portion of this Librarian's time is spent providing
a high level of service to library users - students and faculty.
The main working language is ENGLISH; however, candidates
with Japanese language skills are encouraged to apply. Good
English communication and consultation skills are required. This
position demands expertise in the use of electronic (e.g., online
database services, CD-ROM systems, and networks) as well as
traditional information resources.

The successful candidate will be able to conduct library
orientations, including classroom sessions, tours, individual
training and preparation of user guides designed to increase user
success with electronic and traditional information resources.

The position requires: information librarianship; a minimum of 3
years' experience in public-oriented reference or information
access services; demonstrated commitment to public service;
awareness of SGML and HTML; broad interest in the organization
and retrieval of electronic information via the Internet; some
online searching experience with Dialog, LEXIS/NEXIS and others;
network information retrieval and resource discovery; English
oral and written communication skills; and a working knowledge
of DOS, Windows and Macintosh hardware and software (e.g.,
communications, CD-ROM and LANs).

The successful candidate must be flexible, a self-starter, an
effective communicator, and an enthusiastic participant in a
team-oriented environment. A background in humanities or social
sciences is desired.

The International University of Japan is a certified graduate
institution with 200 students. It offers both an MA in
international relations and an MBA. Courses are taught in English
to students from some 35 countries, whose average age is 28.
Forty percent of the student body and the majority of the staff
are Japanese nationals. Faculty come from
all over the world. The Matsushita Library and Information Center
houses 100,000 volumes and over 1,200 periodical titles in
several languages, but predominantly English and Japanese. The
campus is preparing to install a campus-wide LAN, and the latest
library automation technology. Located in rural Niigata
Prefecture, the IUJ campus is just 1.5 hours from downtown Tokyo
and is close to skiing, tennis, hot springs and hiking.

To apply for this unique position, send a CV with cover letter and
references to Mr. Shinichiro Oda, Deputy Manager of the
Matsushita Library and Information Center, International
University of Japan, Yamato-machi, Niigata, 949-72 JAPAN, or
send them via e-mail to MLICJOB@JPNIUJ00.BITNET. Interviews will
be scheduled to suit the applicant's location as far as
possible.

The application deadline is June 30, 1995. The position
begins September 1, 1995. The initial contract is for one year
and is renewable; for the right candidate, it holds the
potential of becoming a permanent position.
Compensation is attractive and commensurate with experience and
skills.  Benefits include health care coverage.
**********************************************************
III. NOTICES
III.B.1
Fr: James Allan <allan@cs.umass.edu>
Re: SIGIR '95 Workshops (Current Information)

                          Research Workshop
    INFORMATION RETRIEVAL AND AUTOMATIC CONSTRUCTION OF HYPERMEDIA
                    to be held in association with
                              SIGIR '95:
      18th International Conference on Research and Development
                       in Information Retrieval
                           Seattle, WA, USA
                            July 13, 1995
                        8:30 a.m. - 3:30 p.m.

The workshop will address IR methods and tools that can be used
in the automatic construction of a hypermedia collection, to
produce an informative set of documents (nodes) and links that
can be searched and browsed by content. For example, typical IR
measures of document similarity can provide a motivation for
linking documents. Also, recent work with passage retrieval shows
that it can be used to structure a collection of "flat" documents
for use in a hypermedia.
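
As a rough illustration of how a document-similarity measure can
motivate links, the sketch below scores every pair of documents by
the cosine similarity of their term-frequency vectors and proposes
a link when the score reaches a threshold. It is a minimal sketch
and not any workshop participant's method: the whitespace
tokenizer, the 0.3 threshold, the function names, and the sample
documents are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propose_links(docs, threshold=0.3):
    """Return (id1, id2, score) for each document pair at or above threshold.

    The threshold value is an arbitrary assumption for illustration.
    """
    vectors = {doc_id: term_vector(text) for doc_id, text in docs.items()}
    links = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            links.append((a, b, round(score, 3)))
    return links


# Hypothetical sample collection.
docs = {
    "d1": "automatic hypertext construction",
    "d2": "automatic hypertext link generation",
    "d3": "library catalog interface",
}
links = propose_links(docs)
print(links)
```

Here only d1 and d2 share terms, so they are the only pair
proposed as a link. A real system would likely add stemming,
stop-word removal, and term weighting before comparing documents.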

These and other methods for the automatic authoring of hypermedia
collections will be presented and discussed in the workshop. Both
techniques that construct a hypertext from an unlinked set of
data and those that can be applied to an existing hypertext/media
(augmenting its set of links) are relevant to the workshop. The
workshop will also discuss issues such as static links, dynamic
links, automatically assigning types to the generated links, and
evaluation of link quality.

The following researchers will talk about their work toward
automatically constructing or evaluating hypermedia. (This list
is accurate as of June 19, 1995, but is subject to change.)
    * INVITED SPEAKER: Gerard Salton, Cornell University
      "Text Structure Analysis and its Use for Text Retrieval, Text
      Traversal, and Text Summarization"
    * Maristella Agosti, Universita di Padova
      "Automatic authoring and construction of hypermedia for IR"
    * James Allan, University of Massachusetts
      "Automatic Hypertext Construction"
    * Niels K. Bauer, Texas A&M University
      "AutoLink: An Automated Link Generator for Building Hypertext"
    * James Blustein, University of Western Ontario
      "Using LSI to evaluate the quality of hypertext links" (with
      Robert E. Webber)
    * Paul Thistlewaite, Australian National University
      "The PASTIME project: Hypermedia in the Australian Parliament"

The workshop will also include time for general discussion and
some small group discussion about specific subtopics of interest.

Attendance at SIGIR '95 is not required, though it is necessary
to register for the workshop using the conference registration
form.  Cost of the workshop is $55, which includes a box lunch and
workshop documentation.

A copy of the registration form plus full information on SIGIR
'95, including descriptions of other workshops, several
tutorials, all technical sessions, and accommodation, etc. is
available via anonymous ftp from: ftp.u.washington.edu
(/public/sigir95/program) or via WWW at URL:
http://info.sigir.acm.org/sigir/conferences/SIGIR_95_adv.pgm.html;
or contact sigir95@u.washington.edu to request a copy of the
program by mail.
**********
III.B.2.
Fr: Eric Dahlin <hcf1dahl@UCSBUXA.UCSB.EDU>
Re: TEI Workshop

TEXT ENCODING FOR INFORMATION INTERCHANGE
A Tutorial Introduction to the Text Encoding Initiative
A workshop to be held at ACH/ALLC '95 in Santa Barbara

The organizers of ACH/ALLC '95 are pleased to announce a
pre-conference workshop on the Text Encoding Initiative Guidelines.

TITLE: Text Encoding for Information Interchange: A Tutorial
       Introduction to the Text Encoding Initiative
DATE:  10 July 1995, 9 a.m. to 4 p.m.
PLACE: UCSB Microcomputer Laboratory
INSTRUCTORS: C.M. Sperberg-McQueen, Lou Burnard, David Chesnutt
REGISTRATION FEE: $50

This workshop will introduce the encoding scheme recommended by
the Text Encoding Initiative (TEI) in its Guidelines for Text
Encoding and Interchange.  The main focus will be on introducing
the tag set defined in the Guidelines, but the context within
which the TEI Guidelines were developed and general problems of
text markup will also be addressed.

TOPICS:
1. General Principles of Text Markup:  What is markup for?
Varieties of markup; effect of markup.  What are electronic texts
for?  Markup and interpretation.  Markup as a means of enabling
intelligent retrieval.
2.  Basics of SGML:   What it is and isn't; the case for using
it.  Basic SGML syntax for the document instance (tags, entity
references, comment declarations).  Examination and explication
of simple examples.
3.  Document Analysis:  What document analysis is, and why it is
an essential part of any e-text project.  Phases of document
analysis.  Group document analysis of a sample text.
4.  Basics of the TEI:  origins and goals of the TEI, overall
organization of the TEI encoding scheme, basic structural notions
of the TEI DTD and the pizza model:  the base, additional, and
core tag sets, and how they may be extended, modified, and
documented; group tagging of the sample document.
5.  Hands-on Session:  introduction to a standard commercial or
public-domain SGML-aware editor.
6. Putting the TEI into Practice:  types of software available
for SGML, how the adoption of TEI encoding affects the practical
work of an e-text project, and a review of where to go for
further information.

THE TEXT ENCODING INITIATIVE:  The Text Encoding Initiative (TEI)
is an international cooperative research effort, the goal of
which is to define a set of generic Guidelines for the
representation of all kinds of textual materials in electronic
form, in such a way as to enable researchers in any discipline to
interchange texts and datasets in machine readable form,
independently of the software or hardware in use, and also
independently of the particular application for which such
electronic resources are used.  The first full version of the TEI
Guidelines was published in May, 1994, after six years of
development in Europe and the US.  It takes the form of a
substantial reference manual, documenting a modular and
extensible SGML document type definition (DTD), which can be used
to describe electronic encodings of all kinds of texts, of all
times and in all languages.  It is sometimes said that the
Standard Generalized Markup Language (SGML:  ISO 8879) provides
only the syntax for text markup; the TEI aims to provide a
semantics.

Computer-aided research now crosses many political, linguistic,
temporal, and disciplinary boundaries;  the TEI Guidelines have
been designed to be applied to texts in any language, from any
period, in any genre, encoded for research of any kind.  As far
as possible, the Guidelines eschew controversy; where consensus
has not been established, only very general recommendations are
made.  The object is to help the researcher make his or her
position explicit, not to dictate what that position should be.

Viewed as a standard, the TEI scheme attempts to occupy the
middle ground.  It offers neither a single all-embracing encoding
scheme, solving all problems once for all, nor an unstructured
collection of tag sets.  Rather it offers an extensible framework
containing a common core of features, a choice of frameworks or
bases, and a wide variety of optional additions for specific
application areas.  Somewhat light-heartedly, we refer to this as
the Chicago Pizza model (in which the customer chooses a
particular base -- say deep dish or whole crust -- and adds the
toppings of his or her choice), by contrast with both the Chinese
menu or laissez-faire approach (which allows for any combinations
of dishes, even the ridiculous) and the set meal approach, in
which you must have the entire menu.
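
The pizza model can be pictured as a small composition rule: the
core is always present, exactly one base is chosen, and any number
of additions may be layered on top. The sketch below is only an
illustration of that rule under stated assumptions. The tag-set
names and the function assemble_tag_sets are hypothetical
placeholders for the example, not the identifiers used in the TEI
DTDs.

```python
# Illustrative placeholder names only; not the TEI's own identifiers.
CORE = "core"
BASES = {"prose", "verse", "drama"}
ADDITIONS = {"linking", "figures", "analysis"}


def assemble_tag_sets(base, additions=()):
    """Combine the core, exactly one base, and optional additions."""
    if base not in BASES:
        raise ValueError(f"unknown base: {base!r}")
    unknown = set(additions) - ADDITIONS
    if unknown:
        raise ValueError(f"unknown additions: {sorted(unknown)}")
    # Core first, then the chosen base, then the toppings in sorted order.
    return [CORE, base] + sorted(set(additions))


selection = assemble_tag_sets("prose", ["linking", "figures"])
print(selection)
```

Requiring one base while leaving additions open mirrors the
contrast drawn above with the "any combination" and "entire menu"
alternatives.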

MATERIALS AND PRESENTERS:  All participants will be provided with
a printed introductory summary guide to the TEI scheme, and
supporting materials on PC disks, including full versions of the
TEI DTDs, public domain SGML software and sample TEI texts.
Subject to availability, participants may be able to acquire the
CD-ROM of the TEI Guidelines at a discounted price.

The tutorial will be taught by three instructors:  C. M.
Sperberg-McQueen (Computer Center, University of Illinois at
Chicago), Lou Burnard (Oxford University Computing Services), and
David Chesnutt (Dept. of History, University of South Carolina).

Please register before July 1, 1995
FOR COMPLETE INFORMATION, CONTACT:
     Sally Vito
     Phone: (805) 893-3072
     E-mail: hr03vito@ucsbvm.ucsb.edu
**********
III.C.1.
Fr: kantorp@bimacs.cs.biu.ac.il
Re: Rutgers U.: Experimental Interface to the Library Catalog

An experimental interface to the library catalog at Rutgers
University is available over the Internet. This system, called
the Adaptive Network Library Interface (ANLI), permits users of
the catalog to record, and to browse, anonymously contributed
links between items in that catalog. Thus it puts a kind of
hypertext layer on top of the existing catalog. In use it appears
as a transparent interface to the Rutgers IRIS (a GEAC system).
When it recognizes that you are considering a unique
bibliographic item it invites you to browse the related items,
and to offer suggestions.
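
One way to picture such a link layer is as a table, kept beside
the catalog, that maps each item to the set of items users have
linked to it. The sketch below is an assumption-laden illustration
and not a description of how ANLI itself is built: the class name,
the record identifiers, and the choice to treat links as symmetric
are all hypothetical.

```python
from collections import defaultdict


class LinkLayer:
    """Hypothetical store of user-contributed links between catalog items."""

    def __init__(self):
        # Maps an item identifier to the set of items linked to it.
        self._links = defaultdict(set)

    def contribute(self, item_a, item_b):
        """Record an anonymous link; symmetry is an assumption of this sketch."""
        if item_a == item_b:
            return  # ignore self-links
        self._links[item_a].add(item_b)
        self._links[item_b].add(item_a)

    def related(self, item):
        """Return the items linked to the given one, in sorted order."""
        return sorted(self._links.get(item, ()))


layer = LinkLayer()
layer.contribute("rec-001", "rec-002")
layer.contribute("rec-001", "rec-003")
print(layer.related("rec-001"))
```

No contributor identity is stored, which matches the anonymous
contributions described above; the underlying catalog records are
left untouched.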

All interested persons are invited to experiment with it. To
access the ANLI over the Internet, follow these steps:

telnet mozart.rutgers.edu
login:  anli
password: anli
anli ID: (optional; you may use your own initials)

To complete your session, type

end <cr>

There is a brief four-question exit interview.  To skip a question,
enter <CR> until the cursor leaves the reply box.

The interface was developed by S. Zhao, T. Badics, R. Settergen,
L. Nordmann and R. Schwartz working under the direction of Prof.
Paul Kantor at the Rutgers SCILS Alexandria Project Laboratory.
The development was supported in part by a grant from the US
Department of Education.  For further information about the ANLI
project, contact Lorene Reba at lreba@scils.rutgers.edu.
**********************************************************
IV. PROJECTS
IV.D.1.
Fr: Maria Zemankova <mzemanko@nsf.gov>
Re: Human Language Resources Initiative (USA)

THIS NOTICE IS HEAVILY CUT FOR SPACE PURPOSES.
FOR COMPLETE INFORMATION, CONTACT:
    Gary W. Strong, Program Director
    Interactive Systems
    (703) 306-1928
    gstrong@nsf.gov

                       HUMAN LANGUAGE RESOURCES
                         Program Solicitation
                        A JOINT INITIATIVE OF:
                     NATIONAL SCIENCE FOUNDATION
     COMPUTER AND INFORMATION SCIENCE AND ENGINEERING DIRECTORATE
                                 and
                  ADVANCED RESEARCH PROJECTS AGENCY
          SOFTWARE AND INTELLIGENT SYSTEMS TECHNOLOGY OFFICE

DEADLINE:   JULY 14, 1995
INTRODUCTION:  The Information, Robotics and Intelligent Systems
Division (IRIS) and the Cross-Disciplinary Activities Office
(CDA) of the Computer, Information Science and Engineering
Directorate (CISE) of the National Science Foundation (NSF) and
the Software and Intelligent Systems Technology Office (SISTO) of
the Advanced Research Projects Agency (ARPA) plan to jointly
support research and development devoted to developing linguistic
resources for use in human language technology.

The aim of this joint initiative between NSF and ARPA is to
accelerate the progress in human language technology by
supporting the research and development of widely-accessible and
affordable language resources and closely related data resources.
It is also of interest to encourage access to these resources by
exploring alternative delivery mechanisms that the research
community may incorporate as requested resources in their
proposals.

TOPICS OF INTEREST:  This initiative has three main foci: (1) the
continued improvement and extension of speech, text, and closely
related language resources to support research and development in
human language technology and associated areas, such as
interlanguage communications; (2) focused experimental research
and data collection involving multimodal types of human language
data resources; and (3) innovative ways to make these resources
widely available to potential users for both research and
education. The last two foci are described in Type II awards
below.

TYPE I AWARD. Improvement in Basic Speech and Text Data
Resources.  Resources of interest are those created, maintained,
and distributed to provide broad training and evaluation data for
basic research and technological advances in the following areas:
- Speech recognition, including the transcription of high-quality
continuous speech and other contextual information from talkers
unknown to the system.
- Speech understanding, in which the focus is primarily on
domain-specific database query and update by voice.
- Information retrieval, in which the retrieval request is made
in terms of speech, text, or other closely associated modalities.
- Machine translation, including computer-aided human translation
and interlanguage dialog.

TYPE II AWARDS. New Approaches and Means of Data Collection and
Distribution.  While the primary interest of this initiative is
resource support for research in speech and text recognition and
understanding, related support on a smaller scale is also
available for the following areas of innovation:
- Development of innovative resources. Examples include: The
collection and annotation of video, involving facial gestures and
hand movements while speaking to advance research on multi-modal
communication using kinesics.  Dialogue data collection and
annotation to serve as a foundation for the advancement of
research on natural language understanding in realistic
situations of human-to-human communication.
- Novel methods of delivery for multimedia resources to support,
for example, such areas as the study of prosody, facial
expression understanding, multi-agent dialogues, or others.
- Transportable software tools for speech and written language
data access and analysis.
- Novel mechanisms for language data capture. Means to capture
and make available samples such as contrived on-line speech
understanding experiments or scenarios for public access and data
collection.  Experiments using such data to advance language
research on speech recognition in noisy environments over
telephones by ordinary users.

SCOPE OF SUPPORT:  This initiative is expected to provide a
total of approximately $3.5 million, depending on funding
availability, to one or more awardees in the following two
categories:
- One large, standard award in the broad area of data collection,
archival and distribution of speech, text, and closely related
modalities or supportive annotations (Type I Award above). This
award may be in the form of an NSF grant or cooperative
agreement, depending on the structure of the project. Funding for
this award will begin in late FY95. The total budget should not
exceed $2 million over a 30-month period. Its duration may
depend on the proposer's method for achieving self-sufficiency.
- Several smaller grants in the range of $150K to $250K per year
for up to three years toward one or more innovative approaches to
language data or its delivery (Type II Awards above). Funding for
these awards will be made when FY96 funds are available.
**********************************************************
IRLIST Digest is distributed from the University of California,
Division of Library Automation, 300 Lakeside Drive, Oakland, CA
94612-3550.

Send subscription requests and submissions to:
NCGUR@UCCMVSA.UCOP.EDU

Editorial Staff:
 Clifford Lynch calur@uccmvsa.ucop.edu
 Nancy Gusack ncgur@uccmvsa.ucop.edu

The IRLIST Archives is now set up for anonymous FTP, as well as
via the LISTSERV.

Using anonymous FTP via the host dla.ucop.edu, the files will be
found in the directory pub/irl, stored in subdirectories by year
(e.g., /pub/irl/1993).

Using LISTSERV, send the message INDEX IR-L to LISTSERV@UCOP.EDU.
To get a specific issue listed in the Index, send the message GET
IR-L LOGYYMM, where YY is the year and MM is the numeric month in
which the issue was mailed, to LISTSERV@UCOP.EDU. You will
receive the issues for the entire month you have requested.
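
The LOGYYMM pattern and the per-year FTP directories can be
generated mechanically. The helpers below are a minimal sketch;
the function names are invented for this example, and the code only
builds the strings described above without contacting any server.

```python
def listserv_get_command(year, month):
    """Build the 'GET IR-L LOGYYMM' message for a given year and month."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # YY is the two-digit year, MM the two-digit numeric month.
    return f"GET IR-L LOG{year % 100:02d}{month:02d}"


def ftp_directory(year):
    """Directory holding one year's issues on the anonymous FTP host."""
    return f"/pub/irl/{year}"


command = listserv_get_command(1995, 6)
directory = ftp_directory(1995)
print(command)
print(directory)
```

For an issue mailed in June 1995, this yields the message
GET IR-L LOG9506 and the directory /pub/irl/1995.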

These files are not to be sold or used for commercial purposes.
Contact Nancy Gusack for more information on IRLIST. THE OPINIONS
EXPRESSED IN IRLIST DO NOT REPRESENT THOSE OF THE EDITORS OR THE
UNIVERSITY OF CALIFORNIA. AUTHORS ASSUME FULL RESPONSIBILITY FOR
THE CONTENTS OF THEIR SUBMISSIONS TO IRLIST.