Information Retrieval List Digest 019 (May 3, 1990)
URL = http://hegel.lib.ncsu.edu/stacks/serials/irld/irld-019

IRLIST Digest May 3, 1990 Volume VII Number 13 Issue 19
**********************************************************
I. NOTICES:
   A. Meetings announcements/Calls for papers
      1. 2nd International Conference on Chemical Structures and Chemical Information, June 3-7, 1990, The Netherlands
**********************************************************

I. NOTICES

I.A.1.
Fr: David Johnson
Re: 2nd International Conference on Chemical Structures and Chemical Information, June 3-7, 1990, The Netherlands

SECOND INTERNATIONAL MEETING ON CHEMICAL STRUCTURES
Sunday 3rd June to Thursday 7th June 1990
LEEUWENHORST CONGRESS CENTER
NOORDWIJKERHOUT
THE NETHERLANDS

Technical Program and Abstracts

PREFACE

Planning for the second international conference on chemical structure handling to be sponsored jointly by the American Chemical Society's Chemical Information Division, the Chemical Structure Association, the Royal Netherlands Chemical Society and the Chemical Information Groups of the Gesellschaft Deutscher Chemiker and the Royal Society of Chemistry has been a lengthy process. The response to our information leaflets and publicity for the planned meeting and the large numbers of you present at the Leeuwenhorst Congress Center demonstrate the ever increasing interest in the topics to be covered in the conference.

We have tried to anticipate all your needs during your stay in Noordwijkerhout. If we have failed in any way please do not hesitate to contact any one of us and we will try to rectify the problem. We hope you will enjoy the conference.

The Organizing Committee
Mr Charles L Citroen, CID-TNO, The Netherlands
Mr David K Johnson, Exxon Research and Engineering, USA
Dr Reiner Luckenbach, Beilstein Institute, FRG
Dr Peter Rhodes, Royal Society of Chemistry, UK
Mr Maarten de Hoog, Royal Netherlands Chemical Society, The Netherlands
Dr Wendy A Warr, ICI Pharmaceuticals, UK

TECHNICAL PROGRAM

Sunday June 3
14.00 Meeting of the Standard Molecular Data File Group
16.00 Tea
16.15 Business Meeting ACS-CINF
17.15 1 Keynote address: 'Chemistry in Three Dimensions', Ernest Eliel, University of North Carolina, Chapel Hill
18.00 Rijsttafel Dinner courtesy of Springer-Verlag

Monday June 4
Wendy Warr presiding
9.00 2 'Optical Recognition of Chemical Graphics', Stephen Boyer, IBM Almaden Research Center
9.25 3 'A Personal Computer Program System for NMR Database Construction', Hidetsugu Abe, Toyohashi University of Technology
9.50 4 'Integrating Chemical Nomenclature Interfaces to Structure-Based Information Systems', Mark Lord, University of Hull
10.15 5 'AUTONOM - A Chemist's Dream: System for (Micro)Computer Generation of IUPAC-Compatible Names from Structural Input', Janusz Wisniewski, Beilstein Institute
10.40 Break
Martyn Wilkins presiding
11.10 6 'Searching a Full Generics Database', Valerie Gillet, University of Sheffield
11.35 7 'Automatic Translation of GENSAL Representations of Markush Structures into GREMAS Fragment Codes at IDC', G Stiegler, IDC Internationale Dokumentationsgesellschaft fur Chemie
12.00 8 'Chiral Structure Database for Saccharides', Nancy L Porter, Maxwell Online
12.25 9 'Addition of Node/Bond Stereochemistry to the CAS Registry File', Paul Blower, Chemical Abstracts Service
13.00 Lunch
16.00 Poster Sessions and Exhibition
Posters chaired by David Johnson
Exhibition will be Monday afternoon and evening only
18.00 Buffet Meal in Exhibition Hall
22.30 Close of Exhibition

Tuesday June 5
David Johnson presiding
8.35 10 'Stereochemical Substructure Searching: Handling of Relative Configurations', Paul Blower, Chemical Abstracts Service
9.00 11 'The Du Pont Global Technical Information System', Jean Marcali, E.I. du Pont de Nemours & Co
9.25 12 'Structure Registration for Beilstein Online', Steve Welford, Beilstein Institute
9.50 13 'A New Structure Search System', Peter Rusch, Dialog Information Service
10.15 14 'The DARC Inhouse Packages as a Library of Standalone Functions for Building Applications in Handling Chemical Information', Pascal Huguet, Questel
10.40 Break
Gerry Vander Stouw presiding
11.10 15 'Rings - The Importance of Being Perceived', Geoff Downs, Barnard Chemical Information Ltd
11.35 16 'Computer Representation and Searching of Chemical Substances', Doug Hounshell, Molecular Design Ltd
12.00 17 'Information Integration: Distributed Chemical Information Management Systems', Dennis Smith, Molecular Design Ltd
12.25 18 'Integrating Chemical Structures into an Extended Relational Database System', Tom Hagadone, The Upjohn Company
13.00 Lunch
17.00 Annual General Meeting of the Chemical Structure Association
18.30 Dinner

Wednesday June 6
Peter Nichols presiding
8.35 19 'Representation and Searching of 3-D Protein Structures', Peter Willett, University of Sheffield
9.00 20 'Conformational Freedom in 3-D Databases', Nick Murrall, Chemical Design Limited
9.25 21 'Using 3-D Similarity Searching to Develop Synthetic Targets', Charles Eyermann, E.I. Du Pont de Nemours Experimental Station
9.50 22 'Strategies for the Evaluation of Hits from 3-D Substructure Searching', Yvonne Martin, Abbott Laboratories
10.15 23 'Chemical Structure Handling Using the Distributed Array Processor', Peter Willett, University of Sheffield
10.40 Break
Bill Town presiding
11.10 24 '3D Search and Numerical Analyses Applied to Files of Crystallographic Data: Methodologies, Examples, and Integration with 1D and 2D Techniques', Frank Allen, Cambridge Crystallographic Data Center
11.35 25 'An Integrated Approach to 2D and 3D Similarity Searching for the Cambridge Structural Database (CSD)', Eleanor Mitchell, Cambridge Crystallographic Data Center
12.00 26 'Molecular Dissimilarity in Chemical Information Systems', David Bawden, Pfizer Central Research
12.25 27 'Similarity and Analogy Based on Discrimination Net', Takashi Okada, Kwansei Gakuin University
13.00 Lunch
14.00 Outing
19.00 Conference Dinner

Thursday June 7
Reiner Luckenbach presiding
9.00 28 'Similarity Criteria for Chemical Structures and Reactions', Johnny Gasteiger, Technical University Munich
9.25 29 'The Path Matrix, A Useful Tool for Coding Cyclization Reactions', Josef Brandt, Institut fur Organische Chemie
9.50 30 'The Computer Aided Design of Organic Reactions', Rainer Herges, University of Erlangen-Nurnberg

ABSTRACTS

1. Chemistry in Three Dimensions
E.L. Eliel, University of North Carolina
Prior to van't Hoff and Le Bel, chemistry was two-dimensional. Since 1874, however, we have had to deal with the third dimension in molecular models, projection formulae, configurational descriptors and, most recently, computer algorithms used to describe and specify configuration. The problem is complicated because chirality - an important aspect of three-dimensional structure - is an attribute of the molecule as a whole whereas the commonly used Cahn-Ingold-Prelog configurational descriptors require factorization of chirality into individual chiral elements.
The address will deal with the history of chirality and the present status of describing it.

2. Optical Recognition of Chemical Graphics
S.K. Boyer, R. Casey, A. Miller and K. Zilles, IBM Almaden Research Center
We have developed a program for the optical recognition of chemical graphics. The program allows documents containing chemical structures to be optically scanned so that both the text and the chemical diagrams are recognized. The structures are converted directly into molfiles (a type of connectivity table) suitable for direct input into chemical databases, molecular modeling programs, image rendering programs and programs that perform real time manipulation of structures.

3. A Personal Computer Program System for NMR Database Construction
H. Abe, E. Kouno, T. Okuyama, K. Yoshida and S. Sasaki, Toyohashi University of Technology
During the construction of a proton NMR database, several utility programs have been developed to improve the efficiency of the compilation work. The personal computer program system presented here is one such utility. This program serves three major functions. The first function is to input structural formulae interactively by drawing them on a CRT screen. The second is to convert and process the raw spectral data from a spectrometer into a proper set of information. The last function is to assign signal groups in the given spectral data to hydrogen-containing substructures in the corresponding structural formula. These functions are executed interactively with graphical representation of spectrum and structures drawn on a CRT screen.

4. Integrating Chemical Nomenclature Interfaces to Structure-Based Information Systems
G.H. Kirby, M.R. Lord and J.D. Rayner, University of Hull
This paper derives from experience gained in using the Hull Chemical Nomenclature Translator as a front-end to various structure-based software packages.
The benefits of nomenclature input facilities for chemical structure software present a need to link stand-alone packages into multi-process systems running on PCs under MS DOS, using connection tables for data exchange. Techniques are described that have aided the production of multi-process systems capable of running within the memory and single process limitations of MS DOS. The advantages of more powerful desktop computers with multi-tasking operating systems are outlined. The attention of software writers and users is drawn to current problems in redirecting all data input to be read from files and in allowing alternative entry points which avoid repeated initialization. They are urged to make more use of the Standard Molecular Data (SMD) format for transfer of structure data.

5. AUTONOM - A Chemist's Dream: System for (Micro)Computer Generation of IUPAC-Compatible Names from Structural Input
J.L. Wisniewski, Beilstein Institute
The rules for assigning the systematic name to a structure are complex and frequently lead to ambiguous names. It is this difficulty in assigning names that can be overcome by a program which uniquely translates graphic structures into text names and is readily available as a personal computer tool. The algorithm developed for AUTONOM analyzes the compound's structural diagram, input via a graphic interface, and generates the name purely on the basis of the resulting molecular connection table. This paper describes the general design of AUTONOM, presents a detailed analysis of software and chemical nomenclature solutions adopted during the work on the system, and discusses the system's current accuracy and reliability. The degree of compatibility between AUTONOM and IUPAC nomenclature is discussed and illustrated by numerous examples.

6. Searching a Full Generics Database
V.J. Gillet, G. Downs, J. Holliday, M.F. Lynch, W. Dethlefsen, Department of Information Studies, University of Sheffield
A hierarchy of screening methods applicable to full generic structures, including generic radical terms, is described. The most general level of description is fragment screening, which includes atom and bond centred fragments, 'bubbled-up' from full generic structures, where the logical relationships between fragments are retained in MUST and POSS screens. Ring systems are characterized and represented as a section of the bit string by a similar logical procedure. A second screening method which retains the topology of a generic structure as an AND/OR tree is the reduced graph method, where the ring and non-ring components of structures are distinguished. Non-ring components are further distinguished as aggregates of carbon atoms and non-ring nodes consisting of connected heteroatoms. The nodes of a reduced graph are further characterized by a hierarchy of descriptors, ranging from an indication of the status of structural features within nodes to the most detailed description, i.e., the constituent atoms for nodes derived from specific partial structures and parameter lists for nodes derived from generic partial structures. Finally the integration of these two screening methods is described, i.e., the inclusion of fragment and ring screens within the nodes of a reduced graph.

7. Automatic Translation of GENSAL Representations of Markush Structures into GREMAS Fragment Codes at IDC
G. Stiegler, B. Maier, H. Lenz, IDC Internationale Dokumentationsgesellschaft fur Chemie
The IDC file of generic chemical structures and reactions in patents is based on the GREMAS search system. GREMAS allows very rapid search of structures, substructures and other information in this large database. Until now, the GREMAS fragment codes of generic chemical structures had to be derived manually.
Using the formal language GENSAL for the description and graphical input of generic chemical structures, we are able to process them by a computer program. In addition to the research work on generic chemical structures in Sheffield (Prof. M.F. Lynch), we have developed a method of translating the GENSAL expressions into GREMAS fragments automatically. We can therefore use the well established GREMAS search software, and the GENSAL input is compatible with the large and valuable IDC file. In this paper, the way of processing Markush structures at IDC using GENSAL will be outlined. The generation of GREMAS fragments from an extended connection table representation using graph reduction methods will be described.

8. Chiral Structure Database for Saccharides
N.L. Porter, Maxwell Online
Locating a group of organic compounds sharing a common string of chiral centers with the same stereochemistry is currently a tedious and time-consuming exercise. After a substructure search has been performed, dictionary terms and three-dimensional structural diagrams must be used to eliminate compounds with the incorrect stereochemistry. The more chiral centers there are in the substructure, the more of the retrieved compounds must be eliminated from the final answer set. This low precision is due to the absence of absolute and relative orientation information available in connection tables, even where three-dimensional structures can be displayed. In order to achieve chiral substructure searching capability, atom pair information for chiral centers must be included in connection tables. A small chemical structure database using monosaccharides and disaccharides was created to develop and test this concept further. The results of this project may ultimately be applicable for searching any compound with one or more chiral centers.

9. Addition of Node/Bond Stereochemistry to the Chemical Abstracts Service Registry File
P.E. Blower, Jr., D.H. Lillie, A.H. Lipkus and C. Qian, Chemical Abstracts Service
CAS registers stereoisomers using text descriptors derived from the corresponding chemical names. This system works well for the unique registration of stereoisomers, but it is difficult to relate the text descriptor to the atoms and bonds of the connection table. This limits its usefulness for substructure search or display of stereochemistry in the structure diagram. CAS is currently preparing to augment the Registry connection table with atom/bond-specific stereodescriptors. This presentation will focus on two aspects of this work: the representation of stereochemistry and techniques for converting the Registry structure file to the stereo-augmented format.

10. Stereochemical Substructure Searching: Handling of Relative Configurations
P.E. Blower, Jr. and A.H. Lipkus, Chemical Abstracts Service
Incorporation of stereochemically augmented connection tables into the CAS Registry File will make stereochemical substructure searching possible. A stereochemical search capability could be implemented by extending the present search system to determine the stereochemical validity of a topological substructure match. However, much of the stereochemistry in the Registry is described only in terms of relative configurations. When relative stereochemistry is involved, it may not be obvious whether a topological substructure match is stereochemically valid, since each relative set of stereoatoms may be viewed as having either of two chiralities. This type of problem can be represented as a weighted graph. By analyzing the corresponding graph, one is able to determine whether a topological substructure match is stereochemically valid. Several alternative algorithms for performing this analysis are described.

11. The Du Pont Global Technical Information System
J.G. Marcali, F.H. Kvalnes, J.A. Patterson, E.S. Wilks, E.I. du Pont de Nemours & Co.
Du Pont and Chemical Abstracts Service (CAS) designed and implemented an integrated private database on STN International. This database consists of a chemical file and a document file. A unique feature of this system is that both the Du Pont proprietary files and the publicly available files on STN can be searched using the same command language, Messenger. The chemical file contains structures, Du Pont accession numbers, molecular formulae, systematic names, synonyms, descriptors, and CAS Registry Numbers for equivalent organic, inorganic and polymeric substances. The document file comprises bibliographic data, abstracts, subject indexing via controlled vocabulary terms and an online hierarchical thesaurus of these controlled terms. Custom features, system capabilities, and preliminary reaction of Du Pont users to the online database are discussed.

12. Structure Registration for Beilstein Online
S.M. Welford, Springer-Verlag London Ltd, and C.J. Jochum, Beilstein Institute
Beilstein Online is a comprehensive online database of organic chemical compounds and their reported chemical and physical properties. The database, which corresponds to the Beilstein Handbook of Organic Chemistry and covers the chemical literature from 1830, is available online on STN International and Dialog. The paper describes the structure registration system, which has been developed at the Beilstein Institute and used in the construction of this database. The paper concentrates on the treatment of tautomerism and stereochemistry, the connection table format in which structures are processed and delivered to online hosts, and the range of alternative delivery formats which are available for inhouse use. Extension of the registration system and data structures for metallo-organic and inorganic substances in support of the Gmelin online database is also described.

13. A New Structure Search System
P.F. Rusch, Dialog Information Service
Development of the computer-readable version of the Beilstein database prompted an extended review of factual and substructure search techniques. In order to provide appropriate access to the three-dimensional structure information in combination with the extensive range of physical property information, new methods of online access had to be developed. The DIALOG Information Retrieval software is a long-established text searching system that had undergone some modifications to provide more extensive full-text and limited numeric range searching. Access to the Beilstein database required that the substructure and full-structure search techniques be fully integrated into the existing capabilities to provide a maximum of user convenience and accessibility to the combination of factual and structure data. What has evolved is a mix of components that provides an entirely new system for structure query formulation and transmissions, a new structure search process, and enhancements to numeric searching. This paper will describe the components, their integration, and the resulting service. Evaluation of the service and possible extensions to it will be reviewed.

14. The DARC In-house Packages as a Library of Standalone Functions for Building Applications in Handling Chemical Information
P. Huguet and O. Sultan, Questel
The variety of applications aimed at handling Chemical Information require the use of common tools corresponding to specific functions, for instance: structure editing, molecular formula or substructure searching, displaying, printing, communicating with other software. The presentation will refer to the building of such standalone tools on one hand, and to how these tools can be assembled to form specific applications on the other hand. The implementation of such capabilities is obtained through an optimal use of the operating system, especially VMS on VAX.
Some combinations will be described such as the linking of the structure editing capability and the VAX Packetnet System Interface (P.S.I.) allowing X25 connection for query transfer to public hosts. Another example is the use of VMS interprocess communication and shared memory for the mixed DARC/DBMS display or print. A third example is the use of event flags for security when cross-registration of DARC structures and of DBMS data is undertaken.

15. Rings - The Importance of Being Perceived
G.M. Downs, Barnard Chemical Information Ltd
In areas such as retrieval, QSAR, synthesis design, reaction indexing, and structure display, ring analysis is required as a descriptive utility and to complement other structural analyses. Finding those rings necessary and sufficient for unambiguous representation, in an efficient manner and for the worst cases, is not trivial. Issues include the 2-D representation of 3-D structures, the definition and perception of a ring set, and whether vertex and cut-vertex graphs simplify the analysis. Generic and partial/sub structures cause particular problems. Substructure queries can be the worst defined, introducing problems for structural conventions and ring/chain differentiation. After ring perception, it is necessary to select and represent the information relevant to a particular application. This can be used as a condensation, to enable more efficient matching, and an expansion, to give more detail to reduced graphs.

16. Computer Representation and Searching of Chemical Substances
J.G. Nourse, W.D. Hounshell, B.A. Leland, A.J. Gushurst and D.G. Raich, Molecular Design Ltd
Chemical structures are typically represented in computer programs as simple graphs, where atoms are represented by a list of nodes and the bonds by a list of non-directional edges.
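As an editorial illustration only (not taken from the paper), the simple-graph representation described in the preceding sentence can be sketched as follows. The class name `Molecule` and its methods are hypothetical, chosen here for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Hypothetical minimal structure record: a node list plus undirected edges."""
    atoms: list = field(default_factory=list)   # node list: element symbols
    bonds: set = field(default_factory=set)     # edge list: frozensets of two atom indices

    def add_atom(self, symbol: str) -> int:
        """Append an atom and return its index in the node list."""
        self.atoms.append(symbol)
        return len(self.atoms) - 1

    def add_bond(self, i: int, j: int) -> None:
        """Record a non-directional bond between two distinct existing atoms."""
        n = len(self.atoms)
        if i == j or not (0 <= i < n and 0 <= j < n):
            raise ValueError("a bond must join two distinct existing atoms")
        self.bonds.add(frozenset((i, j)))  # frozenset makes (i, j) and (j, i) the same edge

    def neighbors(self, i: int) -> list:
        """Return the sorted indices of atoms bonded to atom i."""
        return sorted(next(iter(b - {i})) for b in self.bonds if i in b)


# Heavy-atom skeleton of ethanol: C-C-O
ethanol = Molecule()
c1 = ethanol.add_atom("C")
c2 = ethanol.add_atom("C")
o = ethanol.add_atom("O")
ethanol.add_bond(c1, c2)
ethanol.add_bond(c2, o)
```

Storing each bond as a `frozenset` is one way to make edges non-directional; an adjacency list would serve equally well.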
We will describe an extension to such a representation which allows properties to be identified with a defined subgraph in a structure; these properties are fully searchable at a substructure level. A description of the representation will be given and examples of searching capabilities will be illustrated. An implementation of these techniques has been applied to "superatoms", salts, non-stoichiometric mixtures, formulations, homopolymers, and copolymers. Other potential uses will also be discussed. These extensions to chemical representation allow us to represent and search a much broader class of chemical substances than the set of discrete chemical structures which has previously been handled.

17. Information Integration: Distributed Chemical Information Management Systems
D.H. Smith, J. Barstow, R.E. Carhart and J. Laufer, Molecular Design Ltd
The new generation of distributed computing systems offers exciting new ways to work with chemical information, including chemical structures and reactions, data, text and graphics. The connection of high performance workstations to networks of geographically distributed computers rather than to single hosts provides the connectivity to support information integration. The next challenge is to construct software systems that actually deliver the potential functionality. We will discuss software architectures designed to achieve integration at both the workstation and host. End user control over how information is accessed and presented, independent of its geographical location, will be shown to be an essential part of such systems. Emerging standards for transport of information among diverse applications will be discussed as the "glue" that makes distribution the enabling technology for integration.

18. Integrating Chemical Structures into an Extended Relational Database System
T. Hagadone, The Upjohn Company
This paper proposes extensions to the relational database model that allow chemical structure and other complex data types to be included in a relational database. It is argued that this approach provides benefits over the common practice of storing chemical structures in a chemical database system and associated research data in a relational or other general database system. The design, implementation, and usage patterns of an extended relational system are discussed in the context of the Upjohn Cousin compound information system. Emerging extensibility features that support the proposed approach within commercially available database systems are reviewed.

19. Representation and Searching of 3-D Protein Structures
P.J. Artymiuk, H.M. Grindley, E.M. Mitchell, D.W. Rice, E.C. Ujah and P. Willett, University of Sheffield
Work in Sheffield over the last four years has resulted in the development of techniques for the representation of and searching for patterns in the 3-D macromolecular structures in the Protein Data Bank. The work has focused on the use of subgraph isomorphism and maximal common subgraph isomorphism algorithms for processing information about the helix and strand secondary structure elements in proteins. The use of these graph-theoretic methods provides novel computational tools for investigating the 3-D structures of proteins.

20. Conformational Freedom in 3-D Databases
N.W. Murrall and E.K. Davies, Chemical Design Ltd
With the advent of databases capable of storing full 3-D information on chemical structures, it is now feasible to search such databases for particular arrangements of atoms or functional groups. This is of particular importance in drug design, where known 3-D dispositions of atoms or groups (pharmacophores) are believed to be responsible for the physiological action of the drug.
When designing a 3-D database system, it is necessary to include information on the conformational flexibility of a molecule, since a given molecule can exist in a variety of different conformations. This paper presents a novel way of storing this information, in which disk storage requirements and search times are independent of the number of conformations stored. A system based on this approach thus enables a true search to be performed, finding all possible matches to a given pharmacophore, not just those corresponding to the particular conformation stored. This paper describes the architecture of such a system and the processes that are carried out during a search, with special emphasis on the 3-D keys used to store and retrieve data.

21. Using 3-D Similarity Searching to Develop Synthetic Targets
W.C. Ripka and C.J. Eyermann, E. I. Du Pont de Nemours Experimental Station
One of the key challenges facing the medicinal chemist is converting information obtained from enzyme/receptor mechanisms, known active compounds, and 3-D structural data, into ideas for new synthetic targets. To develop these new targets two problems must be solved. First, a pharmacophore model representing the 3-D arrangement of the functional groups required for biological activity must be developed. Then, a molecular framework must be found which can position the functional groups in the proper 3-D orientation. This talk will review our approaches to developing pharmacophore models and how these models can be used as input for searching 3-D databases to find desirable molecular "frameworks". Our 3-D searching is done using the GEOSTAT software from the Cambridge Crystallographic Data Center and Molecular Design Limited's MACCS-3D software. Results for several pharmacophores will be used to illustrate how the technique can be used to develop novel synthetic targets.

22. Strategies for the Evaluation of Hits from 3-D Substructure Searching
Y.C. Martin and J.H. Van Drie, Abbott Laboratories
Frequently when one performs a 3-D substructure search (as with ALADDIN) one finds hundreds of hits that meet the search criteria. The problem is to organize these hits. We have written a program MODSMI that changes the 2-D structure of the hit (in SMILES notation) as directed by the user. For the problem at hand, MODSMI is used to remove substituents not involved in the recognized 3-D substructure. The result is that compounds with the same connectivity within the substructure can be recognized by a MERLIN substructure search of the list of hits using the pruned structure as a target. The use of these techniques in the design of dopamine agonists will be illustrated.

23. Chemical Structure Handling Using the Distributed Array Processor
E.M. Rasmussen, P. Willett and T. Wilson, Department of Information Studies, University of Sheffield
The Distributed Array Processor, or DAP, is a massively parallel SIMD array processor containing either 1K or 4K bit-serial processing elements. We have used the DAP for the clustering of databases of 2-D chemical structures, for the ranking of output in an experimental system for substructure searching of the 3-D macromolecules in the Protein Data Bank and for conventional, 2-D substructure searching. Our studies show that the DAP can achieve processing rates that are substantially in excess of those obtainable using conventional mainframe computers; however the precise degree of speed-up is often crucially dependent upon the characteristics of the particular data that are being processed.

24. 3-D Search and Numerical Analyses Applied to Files of Crystallographic Data: Methodologies, Examples, and Integration with 1D and 2D Techniques
F.H. Allen, O. Kennard, J.J. Galloy, O. Johnson, J.E. Davies, C.F. Macrae, Cambridge Crystallographic Data Center
Search queries in 3-D chemical information systems are usually formulated in terms of geometrical data items derived from an underlying 3-D coordinate set.
A very wide range of distances, angles, puckering parameters, etc., are possible. The CSD program GSTAT will locate a fragment, calculate user specified geometry (and linear combinations thereof, if required) and select fragments on the basis of limiting values supplied for any derived parameter(s). Even then, statistical analyses of the multivariate data set G(Nf,Np) (Nf = no. of fragments retained, Np = no. of parameters specified) may be required to answer the 3-D query completely. Simple descriptive statistics, cluster analyses, principal component methods, correlation, regression, etc. are essential tools within GSTAT. Integration of 3-D searching with the 1D and 2D capabilities of the CSD program QUEST is being effected via a 1:1 graph matching of chemical and crystallographic connection tables, and the careful generation of 3-D screening mechanisms. An improved statistics package is also being developed.

25. An Integrated Approach to 2D and 3D Similarity Searching for the Cambridge Structural Database (CSD)
E.M. Mitchell, F.H. Allen and G.F. Mitchell, Cambridge Crystallographic Data Center; R.S. Rowland, Department of Biochemistry, University of Alabama
Similarity searching in chemical databases depends crucially upon the chosen molecular attribute sets. The current 2D implementation in CSD uses the substructural bit screens. These primarily contain chemical information and a restricted connectivity level around each node. They contain little pattern information apart from details of chemical rings. Gross pattern attributes can be assigned in terms of inter-nodal bond separation frequencies which can be used alone (or in combination with the chemical data) to provide alternative (or enhanced) 2D capabilities. For 3D structures, the distance partition frequencies, which can also serve as 3D screens, are a logical basis for similarity calculations.
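As an editorial illustration only, one possible reading of "distance partition frequencies" is a histogram of interatomic distances, compared between two structures with a min/max overlap coefficient. The bin width, the number of bins, the choice of coefficient, and the function names below are all assumptions made for this sketch and are not taken from the abstract or from CSD software.

```python
import math
from itertools import combinations


def distance_partition(coords, bin_width=1.0, n_bins=10):
    """Count interatomic distances per bin (hypothetical bin settings)."""
    counts = [0] * n_bins
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)                           # Euclidean distance between two atoms
        k = min(int(d // bin_width), n_bins - 1)      # long distances fall into the last bin
        counts[k] += 1
    return counts


def min_max_similarity(u, v):
    """Overlap of two count vectors: sum of minima over sum of maxima (assumed coefficient)."""
    shared = sum(min(x, y) for x, y in zip(u, v))
    total = sum(max(x, y) for x, y in zip(u, v))
    return shared / total if total else 1.0


# Two three-atom fragments with made-up coordinates (angstroms)
linear = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]

linear_profile = distance_partition(linear)
bent_profile = distance_partition(bent)
score = min_max_similarity(linear_profile, bent_profile)
```

Because the profile depends only on distances, it is unchanged by rotation or translation of the coordinates, which is one reason such counts are convenient as screens.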
The pattern recognition ability of this approach can be improved by providing an additional distance partition based on bond separations. Substructural similarity searching in 3D is also important in CSD, e.g., to compare calculated fragment geometry with experimental data. This is accomplished by a symmetry-modified Minkowski metric.

26. Molecular Dissimilarity in Chemical Information Systems
D. Bawden, Pfizer Central Research
The concept of molecular similarity is now well-established within chemical information systems, for similarity searching and browsing, and for file clustering. Molecular dissimilarity is the converse of similarity, and may be calculated in exactly the same ways. It gives direct insight into the concepts of structural variation and structural diversity. Dissimilarity ranking gives a file ordering so that the compounds at the top of the list will encompass the whole of the structural diversity within that file, and the first N structures will be the most diverse set of N that it is possible to choose from the file. This has advantages over a clustering procedure, since it avoids the necessity to specify the number of structures in advance, and to repeat the analysis when this changes. Applications include ranking of search output, to assess structural variation, selection of structural representatives, and file screening and validation. It can also be used to give a quantitative measure of structural diversity. Dissimilarity measures can be combined with similarity, to give a full picture of the inter-relationships within a structural dataset.

27. Similarity and Analogy Based on Discrimination Net
T. Okada, Kwansei Gakuin University
The DNET/MS software has been developed to search for similar structures among the molecules registered in a discrimination net. The retrieved results are passed on to the subsequent analogical estimation of molecular properties. The basic idea is to set an anchor position in the structure as the viewpoint of similarity judgement.
Then the molecular structures are ordered in a discrimination net, and a query molecule can be located in the net uniquely. The molecules adjacent to the query location in the net are regarded to have similar structures. The system is applied to the analogical estimation of pKa's of organic oxyacids and to the qualitative structure activity recognition of hypoglycemic pyrimidine derivatives. The scope of DNET/MS system is also discussed. 28. Similarity Criteria for Chemical Structures and Reactions J. Gasteiger and W.D. Ihlenfeldt, Technical University Munich Synthesis design asks for the development of strategies to find efficient pathways from the target to available starting materials. This search can be guided by perceiving structural similarities between the target or a synthesis precursor and a starting material. Several similarity criteria have been defined and their usefulness in synthesis design has been explored. Hash coding algorithms allow one to make this search on a file of commercially available compounds rapid and efficient. 29. The Path Matrix, A Useful Tool for Coding Cyclization Reactions J. Brandt, Technical University Munich The path matrix is proposed as a new concept for coding, storing and retrieving ring changes in the CASTOR-system of reaction documentation. The path matrix contains entries of reaction-invariant paths between members of the reaction core that exist in the molecular graph outside the reaction core. It does not add or alter any information contained in the atom vector and the BE-matrices of the ensemble of molecules of educt(s) and product(s), but it presents relevant pieces of that information in a form more suitable for mathematical treatment, and for storage and retrieval operations in practical applications. Adaptations of the details of its definition under the aspect of practical implementation are being investigated. 30. The Computer-Aided Design of Organic Reactions R. 
Herges, University of Erlangen-Nurnberg
The current paper presents a formal, systematic and deductive approach for searching and designing unprecedented reactions. With the aid of an expert system the whole set of conceivable reactions complying with defined preconditions is generated. Usually most of the generated reactions are known and some are unknown. The unknown reactions can be predicted. Quantum mechanical calculations are performed in order to select the most suitable candidates for the solution of the given problem (e.g. those reactions with the lowest barrier of activation). Using this approach, we not only predicted new reactions, but have also been able to confirm our predictions in the laboratory and "designed" five new reactions. Two recently developed reactions serve as examples.

31. Chemical Reaction Retrieval Using Citation-Based Relationships
D.E. Meyer, N.F. Abdul-Malik, G.E. Vladutz, ISI
It has been reported that a novel way for retrieving reactions similar to a given query reaction is to use relationships created by literature citations. Later publications citing an article dedicated to a new reaction or synthetic method are likely to contain information about either non-trivial modifications of the original reaction(s) or further new reactions similar to the original one, although the nature of such similarity may be difficult to express in purely structural formats. Publications bibliographically coupled (i.e., sharing common citations) with the publication describing a new specific reaction or synthetic method may also contain information relevant to the development of such methods. This paper will analyze the utility of searching for chemically-related reactions via citation coupling relationships using a sample of reactions from ISI's Current Chemical Reactions database.

32. Chemical Reaction Sequence Searching
G.A. Hopkinson, T.P. Cook and D.
Williams, ORAC Reaction database systems provide users with access to an ever increasing body of chemical knowledge. A major limitation has been the inability to return answers to reaction queries involving a sequence of reaction steps which perform the desired transformation. The ORAC system is currently being extended to allow reaction sequences to be searched and displayed to the user. This paper explores the technical issues relating to storing reaction sequence information and efficient searching in large reaction databases. 33. Multistep Reaction Schemes in the Reaction Access System B. Christie and T. Moock, Molecular Design Limited Although reaction databases with structural searching capabilities have been available for several years, they are primarily based on single-step reactions. Conventions have been adopted for many of these for the representation of multistep reactions; however, there does not exist at present a system which uses a general approach for searching multistep schemes both within a single document, and for schemes spanning multiple documents. A comparison of methods for how this can be achieved within the architecture of the Reaction Access System (REACCS) is described. 34. Computer Invention of Molecular Structures W.T. Wipke, M. Pittman, University of California at Santa Cruz Computers are frequently used in assisting chemists to design molecular structures, with the chemist proposing structures, and the computer simulating properties or modeling them. The best candidates are then selected by the chemist after viewing the computed results. The same procedure is found in other fields: computer chip design, automotive body design, etc. In these cases, the human being is clearly the inventor and the computer, the assistant. In fact it is generally thought that the computer could not invent anything new on its own. That viewpoint leads to a self-fulfilling prophecy! POSTER ABSTRACTS 1. 
A Hierarchy of the Structure of Matter, from the Viewpoint of Information Retrieval and Structure-Property Correlations
S. Barcza, Sandoz Research Institute
A (regrettably only) two-dimensional chart will be presented, systematizing and summarizing the structure of matter. Within this, the areas relevant to chemical information retrieval and structure-property correlations, especially those related to the design of bioactive molecules, are shown at higher magnification. The chart covers small to large scale and low to higher resolution. It points out some of the parallels between man-made (e.g., drug, transistor) and natural (hormone, receptor, ribosome) assemblies of matter. The need to create integrated systems of this type is especially indicated by newer developments in: Molecular Biology, Computer Systems and Systems Analysis. The scientist needs to find information and correlations over a sliding scale, with continuously variable window and magnification.

2. The Standard Molecular Data (SMD) Format
J.M. Barnard, Barnard Chemical Information
During the past few years interest has grown in the development of standard formats for the machine-readable representation of chemical structures, and a number of proposals have been published. One of these is the Standard Molecular Data (SMD) Format, developed by a group of European chemical companies. Under the auspices of the Chemical Structure Association, a series of technical working groups has examined the original version of SMD Format and proposed a number of revisions and extensions. This poster will describe the revised version of the format using annotated examples, and will discuss the areas where further extension is required.

3. Similarity Searching in the Development of New Bioactive Compounds. An Application
G. Grethe, Molecular Design Ltd
The potential of applying both molecule and reaction similarity searching for the development of a bioactive compound will be discussed.
The theoretical aspects of similarity searching in MACCS and REACCS will be described briefly. The target (lead compound) is then developed by utilizing similarity searching with compounds of known activity over a database of bioactive compounds. This search will involve super- and sub-similarity searching and the use of intersections. Similarity planning to develop potential synthetic pathways to the desired target. Suitable derivatives of the lead compound for SAR-studies are proposed based on the similarity of available starting materials and by utilizing reaction similarity searching in reaction classification. 4. Macromolecules: Structure Representation and Nomenclature W.V. Metanomski, J.E. Merritt, Chemical Abstracts Service, K.L. Loening, Topterm Macromolecules, including biopolymers, represent a challenge in generating meaningful structure representations and corresponding names, because frequently complete structural information is not known. Current handling of such structures by Chemical Abstracts Service (CAS) as well as some planned enhancements are illustrated. Similarly, illustrations are shown from existing recommendations of the International Union of Pure and Applied Chemistry (IUPAC). In addition, examples from drafts under consideration by the IUPAC Commission on Macromolecular Nomenclature are presented. 5. Calculation of Three-Dimensional Structural Similarity C.A. Pepperrell and P. Willett, Department of Information Studies, University of Sheffield The last few years have seen a large amount of interest in the use of similarity searching methods for databases of 2-D chemical structures, using fragment occurrence data to determine the degree of similarity between a pair of molecules. This poster will describe work that is being carried out in collaboration with ICI Agrochemicals on similarity searching methods for databases of 3-D chemical structures. 
We are studying four different ways of quantifying the degree of 3-D similarity between a pair of molecules, where the molecules are defined by connection tables that contain the constituent non-hydrogen atoms and the corresponding inter-atomic distances. These are as follows: 1. a comparison of the frequency distributions of the inter-atomic distances in the two molecules; 2. a comparison of the actual inter-atomic distances to identify those in common (in a manner analogous to the fragment matching procedures used in 2-D similarity searching); 3. a mapping procedure that identifies atoms in the two molecules that appear to be in the same structural environments; 4. a maximal common subgraph isomorphism algorithm that carries out an exhaustive mapping of one molecule onto the other. The last of these approaches is well understood and forms the basis for reaction indexing systems; it is, however, far more demanding of computational resources than the other methods, which may also be more hospitable to structurally heterogeneous datasets such as are encountered in large-scale biological screening programs. The main aim of the present work is to determine whether the simpler procedures, numbers 1-3, can provide a more cost-effective means of calculating inter-molecular similarities. To date, we have considered only inter-atomic distances, but we intend to include valence and torsion angles in the near future and then to compare the utilities of these two types of structural information for 3-D similarity searching.

6. Multiplatform Chemical Structure Management
M. Peeters and D. Verbinnen, Janssen Research Foundation
In search of excellence ... In chemical information systems we lack a multiplatform chemical structure representation. We felt that this representation should not be limited to one platform - the Macintosh. Therefore we defined the Chemical Structure Metaformat.
The METAFORMAT is a universal, hardware/software independent description of chemical structures. Any structure available to us from either a DARC-SMS or a CHEMDRAW source should be usable in revisable form on Macintosh, Vax VT-terminal, Vaxstation, X-windows terminal, IBM 3270 terminal, etc. The METAFORMAT converters take care of this task. Every structure is either stored in its Metaformat form, or passes through the metaformat filter before display or printing. The Metaformat is a cornerstone of our systems: it is being used as an intermediate between ChemDraw, Disspla, Darc, Postscript, Regis, DDIF, etc. It makes it possible, e.g., to obtain editable full function documents in ChemDraw on Macintosh with structures from a central Vax SMS DARC database, in a way that is completely transparent to the user. Together with open-ended standard chemical software packages it provides the complete toolbox for multiplatform chemical computing.

7. Selection of Screens for 3-D Substructure Searching
A.R. Poirrette and P. Willett, Department of Information Studies, University of Sheffield
This poster will describe a new algorithm that has been developed for the selection of sets of screens for substructure searching. The sets are chosen so that each of the screens will occur approximately equifrequently in the database that is being searched. The algorithm operates on a sorted dictionary of fragment occurrences and is derived from a previous algorithm that was developed for sorting word dictionaries (1,2). Each entry in the fragment dictionary contains a particular fragment substructure together with its frequency of occurrence in a sample set of compounds characteristic of the file that is to be screened. The dictionary is partitioned into a series of sub-dictionaries, each of which contains an approximately equal number of fragment occurrences. A screen will then correspond to the range of fragment types that is contained within one of these sub-dictionaries.
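The partitioning step described above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in the abstract; it is a simple greedy split on cumulative frequency, and the function name, fragment labels and frequency counts are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def partition_equifrequent(
    dictionary: List[Tuple[str, int]], n_screens: int
) -> List[List[str]]:
    """Split a sorted (fragment, frequency) dictionary into at most
    n_screens contiguous ranges of roughly equal total frequency.

    Assumption: a greedy cut on cumulative frequency, used here only
    to illustrate the idea of equifrequent sub-dictionaries.
    """
    if n_screens < 1:
        raise ValueError("n_screens must be at least 1")
    total = sum(freq for _, freq in dictionary)
    screens: List[List[str]] = []
    current: List[str] = []
    running = 0
    for fragment, freq in dictionary:
        current.append(fragment)
        running += freq
        # Close the current screen once the cumulative count reaches the
        # next equal-share boundary (integer arithmetic avoids rounding).
        boundary_reached = running * n_screens >= total * (len(screens) + 1)
        if len(screens) < n_screens - 1 and boundary_reached:
            screens.append(current)
            current = []
    if current:
        # Whatever remains forms the last screen.
        screens.append(current)
    return screens


# Hypothetical toy dictionary, already sorted by fragment label.
toy_dictionary = [
    ("C-C-C", 40),
    ("C-C-N", 10),
    ("C-C-O", 30),
    ("C-N-C", 5),
    ("C-O-C", 15),
    ("N-C-O", 20),
]
toy_screens = partition_equifrequent(toy_dictionary, 3)
```

In this toy case each of the three resulting ranges accounts for 40 of the 120 fragment occurrences. Real frequency distributions are usually skewed, so the ranges will only be approximately equal, as the abstract notes.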
The algorithm was devised specifically for the selection of valence angle and torsion angle screens as part of an ongoing project, funded by the British Library and the Cambridge Crystallographic Data Center, to investigate the use of angular information for 3-D substructure searching. However, the algorithm may be applied to any type of dictionary that contains associated frequency of occurrence data; to date, we have used the algorithm to select not only angle screens but also inter-atomic distance screens, topological fragment screens and sets of words for text retrieval. (1) Cooper, D., Dicker, M.E. and Lynch, M.F. (1980). Sorting of textual databases: a variety generation approach to distribution sorting. Information Processing and Management, 16, 49-56. (2) Cringean, J.K., Pepperrell, C.A., Poirrette, A.R. and Willett, P. (1990). Selection of screens for three-dimensional substructure searching. Submitted for publication. 8. "B-Base" - A Structure-Oriented Numerical Factual Database for 11B-NMR Spectroscopy and Related Information About Other Nuclei H. Nuth, E. Striedl, Institut fur Anorganische Chemie The structure oriented numerical factual database, B-Base, is described as an application of the database management system ChemBase. The constructive work within the last two years resulted in a database, B-Base, which at present incorporates approximately 9,200 document units. Today the database covers primary literature up to the middle of 1988 and will be updated continuously. In general, data concerning boron compounds with 1-4 B atoms of coordination number 2-4 are stored. Each unit includes chemical name, CA Registry Number (both only if handled within an article), authors, bibliographic reference, 11B chemical shifts, linewidth, solvent and coupling constants. 
Structures with atom-related 11B and heteroatom chemical shifts are interactively searchable via ChemBase software, enabling the input and output of (sub)structures with atom-related shift values or ranges. A concept for Lewis adduct compounds has been developed. In addition, this concept has been expanded for metal complexes and cluster compounds.

9. Inter-Molecular Similarity and Clustering Methods for Prediction of Toxicity
P.T. Walsh, Health and Safety Executive, and P. Willett, Department of Information Studies, University of Sheffield
Only a very limited number of the wide range of substances used in industry can be adequately assessed in the laboratory for toxicity because of cost and time constraints and the limited number of test laboratories. Hence computerized toxicity/property prediction techniques based on structure-activity/property relationships have received much attention as effective prediction tools. This paper describes the application of structure-related techniques based on explicit measures of inter-molecular structural similarity for the prediction of toxicity. The degree of structural similarity between a pair of compounds is expressed in terms of the substructural fragments that are common to the two molecules being compared. The molecules are characterized by fragment vectors denoting: (a) the presence or absence of augmented atom, atom sequence and bond sequence fragment substructures; and (b) various ring screens. Only the presence or absence of a particular fragment descriptor was considered and the Tanimoto coefficient was used to quantify the degree of similarity between a pair of molecules. The predicted toxicity value for each molecule Qi was then set equal to the observed toxicity value of the compound that is most similar to Qi, i.e. its nearest neighbor. Odd numbers (>1) of nearest neighbors were also employed; here the predicted toxicity was assigned on the basis of a majority vote principle for counted data (e.g.
mutagen/non-mutagen) or the average value for continuous data (e.g. rat oral LD50). The structures were also clustered using the Jarvis-Patrick clustering method based on the Tanimoto similarity measure. The predicted toxicity value of each molecule Qi was calculated by identifying the cluster containing Qi and assigning a value based on the majority vote or the average of the toxicities of the other molecules in the cluster. These methods were tested on those compounds extracted from the RTECS database which had assigned WLNs and any of the following toxicity endpoints: carcinogenicity, mutagenicity, skin irritancy, threshold limit value and rat oral LD50. The five datasets were reasonably heterogeneous in composition and ranged in size from 269 to 5964 compounds. Significant correlations, i.e. not attributable to chance association, between the observed toxicity of a compound and that predicted on the basis of the structural similarity of its nearest neighbor(s) were obtained for all the datasets studied. While the levels of agreement were probably not as high as those of other more sophisticated structure-toxicity prediction methods, the simplicity and efficiency of the similarity-based methods suggest potential application for screening large chemical databases.

10. GEMINI: A Generalized Connection Table Language and Interpreter
David Weininger and Arthur Weininger, Daylight Chemical Information Systems
The design of a generalized program for interpreting connection tables is presented, including a language for describing the external encoding of structural information. Such a program is useful for format conversion or as a front-end to chemical application software. This program has been implemented (GEMINI), and is discussed with examples. There are only three fundamental ways in which connection table formats differ: what information is stored, how that information is represented, and how those representations are encoded.
GEMINI achieves a high degree of generality by dividing its task into these three parts. Internally, GEMINI uses a rich set of data types with well-defined relationships which can include redundancy or ambiguity. Such information includes atomic number, connectivity, bond order, chirality, charge, coordinates, etc. Fortunately, there are very few different representations that are used for each kind of information, and algorithms are readily available for their interconversion. A string-processing language is used to specify a connection table format. This language allows succinct specification of both the content and encoding of most external formats that we have encountered, but is currently limited to character-oriented files.

LIST OF SPEAKERS

1. Prof E L Eliel, Department of Chemistry, University of North Carolina, Chapel Hill, NC 27599-3290, U.S.A.
2. Dr S Boyer, IBM Almaden Research, 650 Harry Road, San Jose, CA 95120-6099, U.S.A.
3. Mr H Abe, Associate Professor of Chemistry, Research Center for Chemometrics, Toyohashi University of Technology, 1-1, Hibarigaoka, Tempaku, Toyohashi 440, Japan
4. Mr M R Lord, University of Hull, Dept. of Computer Science, Hull HU6 7RX
5. Dr J L Wisniewski, Beilstein Institute, Varrentrappstr. 40-42, 6000 Frankfurt/Main 90, West Germany
6. Dr V J Gillet, Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN
7. Dr G Stiegler, IDC Internationale Dokumentationsgesellschaft fur Chemie m.b.H., Otto-Volger-Strasse 19, D-6231 Sulzbach, West Germany
8. N L Porter, ORBIT Search Service, 8000 Westpark Drive, McLean, Virginia 22102, U.S.A.
9. P E Blower, Jr., Chemical Abstracts Service, 2540 Olentangy River Road, P.O. Box 3012, Columbus, Ohio 43210, U.S.A.
10. P E Blower, Jr., Chemical Abstracts Service, 2540 Olentangy River Road, P.O. Box 3012, Columbus, Ohio 43210, U.S.A.
11. J G. Marcali, E.I. du Pont de Nemours & Co., Central Research & Development Department, BMP14-1200, Wilmington, DE 19880-0014, U.S.A.
12. Dr S M Welford, Springer-Verlag London Ltd, 8 Alexandra Road, Wimbledon SW19 7JZ
13. Dr P F Rusch, Dialog Information Services, Inc, 3460 Hillview Avenue, Palo Alto, California 94304, U.S.A.
14. P Huguet, Questel International, 83-85, boulevard Vincent Auriol, 75013 Paris
15. Dr G M Downs, Barnard Chemical Information Ltd, 15 Manor Farm Close, Aughton, Sheffield S31 0XY
16. Dr W D Hounshell, Molecular Design Limited, 2132 Farallon Drive, San Leandro, CA 94577, U.S.A.
17. Dr D H Smith, Molecular Design Limited, 2132 Farallon Drive, San Leandro, CA 94577, U.S.A.
18. Dr T Hagadone, Information Scientist, The Upjohn Company, Kalamazoo, Michigan 49001, U.S.A.
19. Dr P Willett, The University of Sheffield, Department of Information Studies, Sheffield S10 2TN
20. Dr N W Murrall, Chemical Design Ltd, Unit 12, 7 West Way, Oxford OX2 0JB
21. Dr C J Eyermann, E.I. Du Pont de Nemours, Experimental Station E320/113, Central Research & Development Dept., Scientific Computing Division, Wilmington, DE 19880-0320, U.S.A.
22. Dr Y C Martin, D-47E, AP9, Abbott Laboratories, Abbott Park, Illinois 60064, U.S.A.
23. Dr P Willett, The University of Sheffield, Department of Information Studies, Sheffield S10 2TN
24. Dr F H Allen, Cambridge Crystallographic Data Centre, University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW
25. Dr E M Mitchell, Cambridge Crystallographic Data Centre, University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW
26. Dr D Bawden, Pfizer Central Research, Sandwich, Kent CT13 9NJ
27. T Okada, Information Processing Research Center, Kwansei Gakuin University, Uegahara, Nishinomiya 662, Japan
28. Dr J Gasteiger, Institute of Organic Chemistry, Technical University Munich, D-8046 Garching, West Germany
29. Dr J Brandt, Institut fur Organische Chemie, Technische Universitat Munchen, Lichtenbergstrasse 4, D-8046 Garching, West Germany
30. Dr R Herges, Universitat Erlangen-Nurnberg, Institut fur Organische Chemie, Henkestrasse 42, D-8520 Erlangen, West Germany
31. Dr D Meyer, ART, Inc., P.O. Box 556, Wayne, PA 19087, U.S.A.
32. Dr G A Hopkinson, Software Development Manager, ORAC Ltd, 175 Woodhouse Lane, Leeds, West Yorkshire
33. Dr T Moock, Molecular Design Ltd, 2132 Farallon Drive, San Leandro, CA 94577, U.S.A.
34. Prof W T Wipke, Department of Chemistry, University of California, Santa Cruz, California 95064, U.S.A.

LIST OF POSTER PRESENTERS

S Barcza, Sandoz Pharmaceuticals 403-3, East Hanover, NJ 07936, U.S.A.
J M Barnard, BCI Ltd, Knoll Cottage, 46 Uppergate Road, Stannington, Sheffield S6 6BX
G Grethe, Molecular Design Ltd, 2132 Farallon Drive, San Leandro, CA 94577, U.S.A.
W V Metanomski, Chemical Abstracts Service, P.O. Box 3012, Columbus, Ohio 43210, U.S.A.
C A Pepperrell, Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN
M Peeters, Janssen Research Foundation, Turnhoutseweg 30, 2340 Beerse, Belgium
A R Poirrette, Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN
H Nuth, Institut fur Anorganische Chemie, Universitat Munchen, Meiserstrasse 1, D-8000 Munchen 2, West Germany
P T Walsh, Research and Laboratory Services Division, Health and Safety Executive, Broad Lane, Sheffield S3 7HQ
D Weininger, Daylight Chemical Information Systems Inc., 111 Rue Iberville, Suite 610, New Orleans, Louisiana 70130, U.S.A.

LIST OF EXHIBITORS

Hampden Data Services
Polygen
Springer-Verlag
Oxford Electronic Publishing
Molecular Design Limited
Questel
Autoscribe/Hawk Scientific
Dialog Information Services
Institute for Scientific Information
Evans and Sutherland
Oxford Molecular
FIZ Chemie
Fraser Williams (Scientific Systems)

THE CHEMICAL STRUCTURE ASSOCIATION
CSA is active in the handling of information about chemical structures. The CSA organizes seminars and a training course and produces a lively, quarterly Newsletter. Membership details from: Mr. L. A.
McArdle Information Services Section Intellectual Property Department ICI Pharmaceuticals Mereside, Alderley Park Macclesfield, Cheshire SK10 4TG England THE ROYAL SOCIETY OF CHEMISTRY CHEMICAL INFORMATION GROUP The RSC-CIG organizes regular meetings of interest to chemists, information scientists and their managers. There is a semiannual Newsletter. Details of membership of RSC or RSC-CIG from: Ms J. Deschamps Department of the Environment Library Room P3/008d 2 Marsham Street London SW1P 3EB England THE AMERICAN CHEMICAL SOCIETY DIVISION OF CHEMICAL INFORMATION The Division has about 1400 members and associates anxious to learn about the latest research and development in producing and using chemical information. Members receive the Chemical Information Bulletin and are entitled to reduced rate subscriptions to certain other publications. Details from: Dr. Rosemarie F. Parker Exxon Biomedical Services, Inc. Mettlers Road CN 2350 East Millstone New Jersey 08873-2350 U.S.A. THE "CHEMISTRY-INFORMATION-COMPUTER (CIC) GROUP" OF THE GERMAN CHEMICAL SOCIETY (GDCh) The group organizes meetings and publications, develops courses and encourages interest in chemical information and documentation, computer applications in chemical information, software development in chemistry, database systems for compounds, factual data, spectra, structures, etc., reaction databases, synthesis planning, reaction modeling, molecular modeling, structure/activity relations, expert systems; artificial intelligence, teaching of courses, in academia, covering all aspects of chemical information and computer applications. The Group maintains contact with other Organizations in the field and publishes a regular Newsletter ("Mitteilungsblatt"). Details from: Dr. Reiner Luckenbach Beilstein-Institut Varrentrappstr. 
40-42 6000 Frankfurt am Main 90 Federal Republic of Germany

THE ROYAL NETHERLANDS CHEMICAL SOCIETY DIVISION FOR COMPUTER APPLICATIONS
The division started in 1985 and now has almost 1000 members. Meetings are organized regularly on current topics in the applications of computers in chemistry. In 1989 the division started a one-year sponsored project to take inventory of the use of computers in chemistry in the Netherlands and to describe trends in 7 specialized areas for the coming years. The division publishes the 'Chemistry Bytes' newsletter for its members. Information about the Royal Netherlands Chemical Society can be obtained from: Ir. E.J. de Ryck van der Gracht, KNCV, P.O. Box 90613, 2509 LP Den Haag, The Netherlands

Registration information may be obtained from: David K. Johnson, Exxon Research and Engineering Company, US Route 22 East, Clinton Township, Annandale, NJ 08801, USA. Telephone (201) 730-3095, FAX (201) 730-3042, BITNET DKJOHNS@ERENJ

**********************************************************

IRLIST Digest is distributed from the University of California, Division of Library Automation, 300 Lakeside Drive, Oakland, CA. 94612-3550.
Send subscription requests to: LISTSERV@UCCVMA.BITNET
Send submissions to IRLIST to: IR-L@UCCVMA.BITNET
Editorial Staff:
Clifford Lynch lynch@postgres.berkeley.edu calur@uccmvsa.bitnet
Mary Engle engle@cmsa.berkeley.edu meeur@uccmvsa.bitnet
Nancy Gusack ncgur@uccmvsa.bitnet
The IRLIST Archives will be set up for anonymous FTP, and the address will be announced in future issues. These files are not to be sold or used for commercial purposes. Contact Mary Engle or Nancy Gusack for more information on IRLIST. The opinions expressed in IRLIST do not represent those of the editors or the University of California. Authors assume full responsibility for the contents of their submissions to IRLIST.