________________________________________________________________________
________________________________________________________________________
________________________________________________________________________

PROTEIN DATA BANK QUARTERLY NEWSLETTER

Release #83 - January 1998

Published by Brookhaven National Laboratory Protein Data Bank
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________

Internet Sites
  WWW  http://www.pdb.bnl.gov
  FTP  ftp.pdb.bnl.gov
-------------------------------------------------------------------------
January 1998 CD-ROM Release

6947 Released Atomic Coordinate Entries

Molecule Type
  6151 proteins, peptides, and viruses
   268 protein/nucleic acid complexes
   516 nucleic acids
    12 carbohydrates

Experimental Technique
   168 theoretical modeling
  1089 NMR
  5690 diffraction and other

1673 Structure Factor Files
 400 NMR Restraint Files

The total size of the atomic coordinate entry database is 3.0 GB uncompressed.
--------------------------------------------------------------------------
Table of Contents

What's New at the PDB
Archive Management
EBI Now Accepting AutoDep Submissions
The `Intelligent' Search Engine Behind the 3DB Browser(TM)
Request for a Revision of IUCr Policy on Publication and Deposition of Crystallographic Data
PDB Computer Services
Energy Department Announces New BNL Contractor
Exceptional Science Fair Project Uses the PDB
Writing Structure Factors in mmCIF using CCP4
Protein Topology WWW Site
SARF2 - a Program for Comparison of Protein Structures
Molecular Docking by Fourier Correlation with FTDOCK
MolView and MolView Lite
OLDERADO: Extracting Single Structures, Core Atoms and Domains from a NMR-derived Ensemble
Notes of a Protein Crystallographer - FRODO, the Electronic Hobbit
Web Sites Referenced in the January 1998 PDB Newsletter
Affiliated Centers and Mirror Sites
Related WWW Sites
PDB(TM) Order Form
PDB Access, FTP Directory Structure, Consultants, Staff, Support and Instructions to Authors
-------------------------------------------------------------------------
What's New at the PDB

Joel L. Sussman

Structure factors are the observed experimental data from X-ray crystallographic experiments. They are the basis of the X-ray coordinate entries and as such need to be readily available to and usable by researchers. To facilitate the exchange of these data between scientists, as well as their deposition in and retrieval from the PDB, it was decided to set up a standard format, i.e., a Lingua Franca, for structure factors. The PDB and a number of macromolecular crystallographers, including the Chairperson of the IUCr Working Group on Macromolecular CIF, Dr. Paula Fitzgerald, and other members of this committee, developed a standard interchange format for structure factors. This standard is in mmCIF format, i.e., the IUCr-developed `macromolecular Crystallographic Information File'.
It was chosen for simplicity of design and for being clearly self-defining. The format is also easy to extend, simply by adding tokens as new crystallographic experimental methods or concepts are developed (see the January 1996 PDB Newsletter and ftp://ftp.pdb.bnl.gov/structure_factors/cifSF_dictionary). The entire mmCIF crystallographic dictionary has recently been ratified by the IUCr's COMCIFS committee (http://ndb.rutgers.edu/NDB/mmcif/).

We have been strongly urging our depositors to submit structure factors with their entries (Baker et al., 1996). We are pleased to report that, since the release of the PDB's Web-based deposition tool, AutoDep, in October 1996, almost two-thirds of the depositions of X-ray structures to the PDB have been accompanied by their structure factors. Dr. Jiansheng Jiang, at the PDB, has converted these recently-deposited structure factors, as well as virtually all the previously-deposited ones, to the standard mmCIF format. The structure factors are available through the PDB's Web-based 3DB Browser(TM) (http://www.pdb.bnl.gov/pdb-bin/pdbmain), as can be seen on the Browser's `Atlas' page for each structure.

Over the years, the PDB has observed that one of the most useful reasons for storing structure factors is that the crystallographer who did the experiment can retrieve his/her own data when they have been misplaced in the laboratory. In parallel, the fact that these data are now easily available, and in a standard format, has already begun to foster new community-wide efforts at improvements in validation techniques based on the experimental data, e.g., SFCHECK by Vagin, Richelle & Wodak (http://www.sdsc.edu/Xtal/IUCr/CC/School96/), the Uppsala Electron Density Server by Taylor (http://alpha2.bmc.uu.se/valid/density/form1.html), and others.

References:

Baker, E. N., Blundell, T. L., Vijayan, M., Dodson, E., Dodson, G., Gilliland, G. I. & Sussman, J. L. (1996). Crystallographic Data Deposition. Nature 379, 202.
-------------------------------------------------------------------------- Archive Management Enrique E. Abola Layered Release In the October 1997 PDB Quarterly Newsletter, we discussed the layered-release that will allow for a virtually immediate release of entries (to be referred to as the 1st layer) without staff intervention. A PDB ID code will be issued only after the depositor gives approval to release his/her entry either immediately or as soon as it comes off hold. Following this, PDB staff will process the entry as done presently. This processing will include standardization of nomenclature, other annotation, and more importantly, data representation. Most of this work covers issues not now fully delegated to software. The resulting entry will be loaded on our servers as the 2nd layer. The set of mandatory items to be required before data are accepted was discussed in the October 1997 PDB Newsletter. The complete list may be found at our Web site (http://www.pdb.bnl.gov). Listed below are a series of checks that will be done on an entry as part of the submission process. The first checks will be used to ensure that entries with obvious deficiencies are not released (e.g., duplicate atom records). The other checks will be used to add annotations to the entry. When the coordinates are loaded on the PDB server, a file containing the results of the diagnostic runs will be loaded as well. The following tests will be done on the entry as part of the submission process. Results of the tests will be provided to the depositor who can then take the appropriate action given the options outlined below before a PDB ID is issued. 1. Diagnostics requiring corrections and re-submission of coordinate data. 
  * More than one polypeptide or nucleotide chain assigned the same chain name
  * Heterogen group specified by HET and FORMUL records not present in the ATOM/HETATM records
  * More than 10% of the atoms involved in unusually close crystal packing interactions (this check will also cover the case for which a non-standard space group setting was used and the correct set of symmetry operators was not provided)
  * Violation of atom nomenclature for standard amino and nucleic acids
  * Duplicate ATOM or HETATM records in the same residue with the same atom name or the same coordinates
  * ATOM/HETATM records not correctly formatted
  * Heterogen ID provided in the coordinate file conflicts with the PDB Het Dictionary

2. Diagnostics requiring annotations and/or comments to be provided by the depositors if the data are not corrected. The PDB will insert a CAVEAT record before release.

  * For polypeptides, phi-psi angles for more than 20% of the residues outside the allowable region
  * Unexpected chirality at C-alpha center

3. The following diagnostics are normally used by PDB staff to take a closer look at the data because, by experience, we have found that they may be indicative of unusual structures or possible problems. We will present this information to the depositor and the list will also be included in a file containing the output of our checking runs to be made available to the users. The depositor may, of course, modify the coordinate file during the submission process to correct for possible errors before giving final approval to release the data.
  * RMSD of bond lengths greater than 0.08 Angstroms from ideal values
  * RMSD of bond angles greater than 5.0 degrees
  * Breaks in the chain (e.g., due to disorder)
  * Differences between amino acid sequences given by the ATOM records and those given in the appropriate sequence database entry
  * Amino acid sequences not reported in any sequence database
  * CIS-peptides and peptide bonds that deviate significantly from the expected trans conformation
  * Individual bond lengths differing by more than 0.1 Angstroms from standard values
  * Individual bond angles differing by more than 15 degrees from standard values
  * Atoms too close to symmetry axes
  * Atoms involved in unusually close crystal packing interaction
  * Atom occupancies less than or equal to 0.0 or occupancies greater than 1.0
  * Atom occupancies less than 1.0 and for which no alternate location ATOM record is provided
  * Missing residues, missing atoms
  * Thermal factors greater than 100 A**2
  * Unexpected deviations from planarity
  * Non-standard SCALE matrix
  * OXT atom record in the middle of a chain (flagged as extra atom), typically occurring before a gap in the coordinates
  * R value greater than 30%
  * Free-R value greater than 35%
  * Free-R and R value differ by more than 10%
  * RMSD between atoms related by NCS MTRIX records is greater than 3.0 Angstroms

Tests which are valid only for diffraction experiments will not be applied to entries reporting NMR experiments or model building studies.

Heterogen groups will be checked against the current PDB Het Dictionary. The only check that will be done at this stage of the processing is to see if the HET ID and the atom nomenclature used for a group are consistent with the dictionary (e.g., is the GLC group in the coordinate file a glucose molecule as given in the PDB Het Dictionary and are the atoms properly named?).
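As an illustration only, the purely numeric limits in the third group of diagnostics above can be written as a small screening function. This is a minimal sketch in Python, not the PDB's checking software: the `EntryStats` container, its field names and the `screen_entry` function are hypothetical, R values are assumed to be stored as fractions (0.30 for 30%), and "differ by more than 10%" is assumed to mean an absolute difference of 0.10.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EntryStats:
    """Hypothetical per-entry summary; field names are illustrative only."""
    rmsd_bond_lengths: float             # Angstroms, from ideal values
    rmsd_bond_angles: float              # degrees
    occupancies: List[float]
    thermal_factors: List[float]         # A**2
    r_value: Optional[float] = None      # fraction; None for NMR or models
    free_r_value: Optional[float] = None


def screen_entry(stats: EntryStats) -> List[str]:
    """Return one message per numeric limit (from the list above) exceeded."""
    found = []
    if stats.rmsd_bond_lengths > 0.08:
        found.append("RMSD of bond lengths greater than 0.08 Angstroms")
    if stats.rmsd_bond_angles > 5.0:
        found.append("RMSD of bond angles greater than 5.0 degrees")
    if any(occ <= 0.0 or occ > 1.0 for occ in stats.occupancies):
        found.append("Atom occupancy <= 0.0 or > 1.0")
    if any(b > 100.0 for b in stats.thermal_factors):
        found.append("Thermal factor greater than 100 A**2")
    # Diffraction-only tests are skipped when no R values are reported.
    if stats.r_value is not None and stats.r_value > 0.30:
        found.append("R value greater than 30%")
    if stats.free_r_value is not None and stats.free_r_value > 0.35:
        found.append("Free-R value greater than 35%")
    if stats.r_value is not None and stats.free_r_value is not None:
        # Assumption: "differ by more than 10%" is an absolute difference.
        if abs(stats.free_r_value - stats.r_value) > 0.10:
            found.append("Free-R and R value differ by more than 10%")
    return found
```

Leaving `r_value` and `free_r_value` unset mirrors the statement that diffraction-only tests are not applied to NMR or model-building entries. Checks that need geometry, sequence databases or symmetry operators are deliberately left out of this sketch.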
Groups that are not in the dictionary and for which there is no conflict on the HET ID code will be accepted as is and will be checked and standardized as part of the regular processing to be done after the first layer is loaded. Complete descriptions of these tests along with a more precise definition of values such as those defining allowable Ramachandran plot regions are provided on our Web pages (http://www.pdb.bnl.gov). Please send your comments and suggestions regarding these tests and/or on the layered-release to abola1@bnl.gov. Summary of Data Processing Activities for 1997 In 1997 we received 1,844 coordinate sets and released 1,631. This averages out to 153 entries deposited per month which is about 27% more than the 1996 submission rate. On average it took us 119 days to release an entry, which is significantly improved from the 173 days that it took us to release an entry in 1996. A plot giving the growth in data deposition as well as the turn-around time is provided on the back cover of this Newsletter. The plot is accessible via our Web Home Page, and is updated after every load. There were several changes in our procedures that have allowed us to handle the increased rate of deposition while at the same time reducing the amount of time required for processing. Most significant was the release of our AutoDep program in October 1996 that has greatly simplified processing of entries. More than 70% of the entries submitted in 1997 were done through AutoDep. Starting in November 1997 we initiated a new procedure in which entries are released every Tuesday night. This was done at the request of several Mirror Sites, most of which have programs that automatically generate indices relating PDB entries to other databases. Users wishing to check data loads can visit our Home Page for a list of recently released ID codes. 
-------------------------------------------------------------------------- EBI Now Accepting AutoDep Submissions The following announcement was posted on several listservers and newsgroups on December 23, 1997, including the PDB Listserver, X-PLOR Listserver, and the O Listserver. Deposition of 3D Structural Studies of Biological Macromolecules We are pleased to announce the inauguration of a new deposition site for 3D structural studies of biological macromolecules. Starting on January 5, 1998, authors using the Web-based tool, AutoDep, can submit data either to the European Bioinformatics Institute (EBI), UK or to the Protein Data Bank (PDB) at Brookhaven National Laboratory (BNL), USA. The additional site is expected to significantly facilitate the submission procedure, especially for European researchers. AutoDep is a Web-based tool originally designed at PDB for automatic submission of macromolecular data into the PDB. Extensive collaboration between EBI and PDB has produced significant changes to the original system allowing for the seamless operation of multiple deposition sites. This includes EBI specifications for standards and protocols to be used in making the code portable and generally more robust. AutoDep is accessible from the following URLs: * BNL-PDB http://www.pdb.bnl.gov * EBI-MSD http://autodep.ebi.ac.uk Those wishing to submit data using the electronic version of the Deposition Form must continue to deposit directly to BNL using e-mail or FTP. The submission procedure will be identical, and equivalent, at both sites, but PDB ID codes will be issued by BNL. Data submitted at EBI will be forwarded automatically to PDB after depositors have reviewed the AutoDep-generated entry and diagnostics. Final preparation for archiving and release will be done by PDB staff. 
We encourage depositors to submit not only the structural results, but also their experimental data, i.e., for crystallographers, X-ray structure factors, and for NMR spectroscopists, constraints lists and statistical data describing the calculated NMR conformers and constraints.

Important Notes:

(i) Submissions can be completed only at the site at which they were started.
(ii) The option "Based on a previous submission" may be used to simplify submissions by using an earlier AutoDep session as a template. However, depositors will only have access to their earlier submissions at the site where those submissions were originally made.
(iii) Existing PDB entries may be used as templates by choosing the option "Based on an existing PDB entry". The full set of entries will be available at either site, irrespective of where the original deposition was made.
(iv) The date of submission for data deposited at EBI will be the corresponding U.S. Eastern Time of the date when submission is completed at EBI.
(v) EBI staff will offer assistance (via e-mail: pdbhelp@ebi.ac.uk) up to the point of submission. Once BNL has issued an ID code, correspondence should be directed to BNL (via e-mail: pdbhelp@pdb.bnl.gov).

Please note that authors should continue to deposit crystal structures of nucleic acids to the Nucleic Acid Database (NDB) at Rutgers, the State University of New Jersey, USA at URL: http://ndbserver.rutgers.edu:80/NDB/deposition/index.html.

Experimental data related to NMR studies will also be transferred electronically to the BioMagResBank (BMRB) at the University of Wisconsin-Madison, USA for further processing and inclusion into the database (http://www.bmrb.wisc.edu).

Joel L. Sussman
Head, Protein Data Bank
Biology Department
Brookhaven National Laboratory
Upton, NY, USA

Phil McNeil
Head, Macromolecular Structure Group
EMBL Outstation
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge, UK

--------------------------------------------------------------------------
The `Intelligent' Search Engine Behind the 3DB Browser(TM)

Jaime Prilusky
Bioinformatics Unit, Weizmann Institute of Science, Rehovot, Israel
(lsprilus@weizmann.weizmann.ac.il)

The new 3DB Browser(TM) allows the user to rapidly search through the contents of the entire PDB Archive for entries matching certain constraints. A full text search can be made for any string appearing in the text of a PDB entry, excluding the coordinate records. Many specific records can be searched for regular expressions or numerical limits.

3DB Browser gives you the option of saving object sets resulting from queries. A saved set can be used as a starting point for further database operations or as a reference for your work. Every saved set includes the date of the search and the query from which it was generated.

The Search Fields of the 3DB Browser

The main source of information for the 3DB Browser is the data from the Protein Data Bank. These data are highly structured and most crystallographers are used to thinking of a piece of data from a PDB entry as belonging to a particular "record" or "field". It makes sense to use these fields to constrain the search. Searching for `rich' as a keyword has a different meaning than searching for the author Rich.
Search Field       PDB Entry
PDB ID code        Four-character accession code
Keyword            Molecule name, class or family, or related term
                   (HEADER, TITLE, KEYWDS and COMPND fields)
Author             Family name of depositor or author of associated
                   publication (AUTHOR and JRNL fields)
Text query         Any word in the complete PDB text, excluding the
                   field names
FASTA Search       Fasta search of the sequence
Experiment         Method of structure determination
Resolution         A unique value or range of values, in Angstroms
                   (REMARK 2 field)
Space group        Both extended and standard Hermann-Mauguin symbols
                   (CRYST1 field)
Organism           Trivial name, systematic name or expression system
                   (SOURCE field)
Date (lower)       Date entry was released or updated
Date (upper)       Date entry was released or updated
Associated group   Prosthetic group, metal ion, ligand or substrate, or
                   its three letter PDB abbreviation (HET and HETNAM
                   fields)

Examples and Boolean-style Searches

The simplest operation with the browser is to enter one or more words in the "Text query" field and press the "Search" button. The browser engine will come back with those entries from the database that contain or are related to the provided words.

The symbol `*' can be used as a wild card, to denote a sequence of any number (including 0) of arbitrary characters. Just add a star `*' at the beginning or end of a word (or both) to `extend' the search. For example, enter *tox* in keywords to retrieve those entries with keywords like neurotoxic and toxin. Wild cards have no meaning in number-only fields, like Resolution and Date.

The Boolean operator AND is the default for 3DB Browser, and mandatory (you cannot change it) between fields. If you enter `ATP' in the Associated group field and `kinase' in the Keyword field, only those entries matching both constraints are returned. Inside a given field, you may apply Boolean logical operators at will to the words you enter. The available Boolean logical operators are AND, OR and NOT. The case is unimportant.
The operator AND can be represented by `+' and the operator NOT represented by `-'. For example, `zinc and (torpedo or snake)' in the Text query field will return those entries that contain either the word torpedo or the word snake, but only where the word zinc is also present.

To Err is Human

One of the main concerns for us, as database-interface developers, is "false negatives", that is, failing to return data for a query even when the data are available in the database. Frequently this happens because the user was unable to express the query in a way compatible with the search engine, or used words or keywords unknown to the search engine. 3DB Browser deals with this problem by incorporating several automatic and semi-automatic mechanisms to help the user in retrieving the requested data. The request from the user gets filtered and transformed by one or more of the following engines. At the end, the resulting query is the one used for the search.

Engine             Example
american-british   `amoeba' and `ameba' are equivalent
synonyms           `protease' is equivalent to `proteinase'
spelling           search based on a dictionary built from the current
                   PDB data; the spelling engine will produce words
                   that are close to the entered one. As an example,
                   entering `imune' will offer `immune' as a valid
                   alternative.
soundex            search based on the soundex algorithm that
                   approximates the sound of the word when spoken by
                   an English speaker. Looking for author `weich' will
                   offer as alternatives: Weiss, Wess, Wyss ...

In the same spirit of understanding what the user is looking for, the search on the CRYST1 record has been improved to accept both the short and extended Hermann-Mauguin symbols. You may enter either `P 1 21 1' or `P 21' in the Space group field and get the same result.

3DB is Just the Starting Point

A search in 3DB brings up a rich Atlas page summarizing additional knowledge related to the entry of interest. The links in this Atlas page carry you to the original sources of information.
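As an aside on the soundex engine described under "To Err is Human": the sketch below implements the classic American Soundex code in Python and shows why `weich' and Weiss, Wess and Wyss are treated as alike. It is an assumption that the browser uses this particular variant; the function name `soundex` and the lookup table are illustrative, not taken from the 3DB Browser code.

```python
# Classic American Soundex: first letter plus three digits.
_GROUPS = (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
)
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or '' if no ASCII letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        # Adjacent letters with the same digit are coded only once.
        if digit and digit != prev:
            code += digit
        # H and W do not separate equal digits; vowels do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


for name in ("weich", "Weiss", "Wess", "Wyss"):
    print(name, soundex(name))   # each prints the code W200
```

A search engine could store the code of every author name and offer all names sharing the code of the entered word, which matches the behaviour described in the table.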
The number of external sources that 3DB searches and dynamically incorporates into the Atlas pages increases daily. The following table summarizes the external sources currently referenced by 3DB.

Source Name     Short Description
BioMagResBank   Relational Database for Sequence-Specific Protein NMR Data
BLOCKS          Database of conserved regions in groups of proteins
CATH            Protein Structure Classification
Dali/FSSP       Families of Structurally Similar Proteins
EMBL            European Molecular Biology Laboratory
Entrez          NCBI's Documentation database
ENZYME          Enzyme nomenclature database
ESTHER          ESTerases and alpha/beta Hydrolase Enzymes and Relatives
GenBank         NIH genetic sequence database
GDB             Genome Data Base
Kinase          Protein Kinase Database Project
KineMage        Protein Science's Kinemage server
LPFC            Library of Protein Family Cores
MacroMolecule   EBI's Crystal MacroMolecule Files
MMDB            Molecular Modelling Database
NDB             Nucleic Acid Database
OLDERADO        Core, Domain and Representative Structure Database
PDBOBS          Archive of obsolete PDB entries at SDSC
PDBREPORT       Structure verification reports for X-ray structures
PIR             Protein Information Resource
PROSITE         Dictionary of protein sites and patterns
ProtMotDB       Protein Motions Database
scop            Structural Classification of Proteins
SWISS-3DIMAGE   3D images of proteins and other biological macromolecules
SWISS-PROT      Annotated protein sequence database
TREMBL          TRanslation from EMBL

If you know of other sources of information related to PDB that can be incorporated into 3DB's Atlas page, please send an e-mail message to lsprilus@weizmann.weizmann.ac.il.

Support your Local Store

The Protein Data Bank has several mirror sites across the world. These sites have the same data and facilities as the central PDB server. They are just closer to you, and, frequently, faster to access on the Internet.
To help you know your neighborhood, the 3DB Browser incorporates "closer-site", an automatic script that detects your location and offers alternative sites that are closer to you (in the network sense). Drop an e-mail to lsprilus@weizmann.weizmann.ac.il if you are interested in getting the "closer-site" script for your own application. -------------------------------------------------------------------------- Request for a Revision of IUCr Policy on Publication and Deposition of Crystallographic Data Alex Wlodawer Macromolecular Structure Laboratory, ABL-Basic Research Program, Frederick Cancer Research and Development Center, National Cancer Institute, Frederick, MD, USA (wlodawer@ncifcrf.gov) Dear Colleagues, For the last two years, I have been working on trying to change the policies of journals and funding agencies which allow hold periods of up to one year for the coordinates resulting from crystallographic and NMR studies. (See also Sussman 1997.) It is now becoming clear that the best way to accomplish such a change would be to induce IUCr to change their official recommendations (International Union of Crystallography, 1989). Several of us have recently written a letter to Science, which appeared in the January 16th issue (Wlodawer, 1998), suggesting that their policy be modified. It is necessary, however, to involve the largest possible segment of the structural community in this endeavor. For that purpose, we are circulating a petition which will be presented to IUCr. If you agree with the text of the petition below, please send a brief message to me at the e-mail address wlodawer@ncifcrf.gov. You might also wish to send a message if you disagree with the petition and would like to keep the current policy in place. The results of this vote will be reported to the community before any further action is taken. References: International Union of Crystallography. Commission on Biological Macromolecules. 
(1989) Policy on publication and the deposition of data from crystallographic studies of biological macromolecules. Acta Crystallogr., Sect. A, 45, 658. (Policy also in section 11.3 of http://hobbes.gh.wits.ac.za/iucr-top/journals/acta/actaa_notes.html).

Sussman, J. L. (1997) What's new at the PDB. PDB Q. Newsl., No. 82, 1.

Wlodawer, A., Davies, D., Petsko, G., Rossmann, M., Olson, A. & Sussman, J. L. (1998) Immediate release of crystallographic data: a proposal. Science, 279, 306.

Petition:

To: Commission on Biological Macromolecules, IUCr

We, the undersigned, would like to request a revision of the IUCr policy on publication and deposition of data from crystallographic studies of biological macromolecules (Acta Cryst. A45, 658 (1989)). It is our intention that if the policy gets revised, the new rules will be communicated to granting agencies and to scientific journals, in order to be universally accepted.

The current policy has been implemented on the basis of the discussions which had taken place a decade ago. In the meantime, there has been an incredibly rapid increase in the rate of determination of 3D structures of biomacromolecules, as reflected by the deposition of a new structure in the Protein Data Bank (PDB), on average, every five hours. Unfortunately, in parallel, an increasing proportion of depositors take advantage of the PDB's policy of allowing structures to be kept `on hold' for up to a year after coordinate deposition. Consequently, as many as 45% of newly deposited structures are not available when the relevant papers are published.

When the issue of deposition was debated by the community ten years ago, the time needed to solve a macromolecular structure was often measured in years, and was rarely less than one year. The time needed for detailed analysis of such structures was also fairly long. The one-year hold on coordinates was therefore instituted to allow the authors to reap the fruit of their tremendous investment of time and effort.
Due to recent advances in protein expression and purification, crystallization procedures, X-ray instrumentation, and computer software, the time needed to solve a structure is often shorter than the allowed hold period. In light of such developments, it is very difficult to justify withholding coordinates for any period once the paper has been published. Biomolecular structure analysis has indeed succeeded in bringing 3D structures to the forefront of molecular biological research. This success has expanded both the interest in and utility of the information being deposited in the PDB. The molecular modeling community has grown and evolved considerably due to the expansion of this source of experimental data. The value of the data rests in their availability to the broader community. Methods are continuously being developed to analyze new structures and their relationships to the collection of existing structures. New uses for these data, such as statistical potentials for folding and threading calculations, and interface recognition tools, are evolving rapidly. No single research group can fully exhaust this wealth of information. The value of the resource grows proportionally to the timeliness of the data and to the number of scientists who have access to them. 3D structural information is also a crucial link elucidating the role of a translated region of a DNA sequence of unknown function. We feel most strongly that the time has come to change the rules of deposition so as to ensure that the coordinates are released concomitantly with publication of the paper(s) describing the structure. We are convinced that without access to the coordinates, the structures cannot be utilized for comparison with other proteins, for theoretical analysis or, more and more importantly, for drug design. We propose that coordinates deposited at the PDB should be marked as either "for immediate release" or "to be released upon publication". 
We also recommend that the maximum hold for primary data, i.e., X-ray structure factors, and NMR-based restraints, be reduced from 4 years to 1 year. These changes would bring macromolecular crystallography into line with the requirements of other fields, such as gene sequencing, which have never allowed extended hold periods. -------------------------------------------------------------------------- PDB Computer Services John McCarthy PDB's WWW Browser Discontinued During the final six months of 1997, usage of the PDB's WWW Browser had dropped significantly. Additionally, in December of 1997 the PDB released the latest version of the 3DB Browser(TM). It has all the features of the WWW Browser plus many more (see "The `Intelligent' Search Engine Behind the 3DB Browser(TM)" in this Newsletter). For these reasons, the PDB discontinued the WWW Browser in December of 1997. Please remove any bookmarks to it that you might still have. PDB CD-ROM Files Compression As was reported in the July 1997 PDB Quarterly Newsletter, the PDB started compressing files on its October 1997 CD-ROM release. The Structure Factor files were compressed allowing the full CD-ROM release to fit on six CDs. The January 1998 CD-ROM release, most likely, will still fit on six CDs by compressing the Structure Factor files, but in the near future, coordinate entry files will be compressed as well. As was stated in the July 1997 PDB Newsletter, an effect of compression will be that the filenames will be different. Files that had the PDB ".ent" suffix will have the ".gz" suffix when compressed. Any scripts that read coordinate entry files directly from the CD-ROM will have to be modified to use the new filenames and perform the uncompression as necessary. The PC-based browser PDB-Shell has been updated to be able to read compressed entry files. 
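For scripts that read entries directly from the CD-ROM, one way to cope with the change is to stop relying on the file suffix and test the file itself. The following is a minimal sketch in Python, not a PDB-supplied tool: the function names `open_entry` and `count_atom_records` are hypothetical, and it assumes only that compressed files are in standard gzip format, which always begins with the two bytes 0x1f 0x8b.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip file


def open_entry(path):
    """Open a coordinate entry as text, compressed or not.

    The content is sniffed instead of trusting the suffix, so the same
    call works for old ".ent" files and new ".gz" files.
    """
    path = Path(path)
    with open(path, "rb") as raw:
        compressed = raw.read(2) == GZIP_MAGIC
    if compressed:
        return gzip.open(path, "rt", encoding="ascii", errors="replace")
    return open(path, "r", encoding="ascii", errors="replace")


def count_atom_records(path) -> int:
    """Example consumer: count ATOM and HETATM records in an entry."""
    with open_entry(path) as entry:
        return sum(
            1 for line in entry if line.startswith(("ATOM  ", "HETATM"))
        )
```

A script would then call, for example, `count_atom_records("pdb1abc.gz")`, where the file name is purely illustrative. Decompressing on the fly in this way avoids writing an uncompressed copy to disk.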
The PDB is using the Gnu gzip package to perform the compression and is distributing the Gnu Gunzip package in the CD-ROM set to allow CD-ROM users to perform uncompression. A questionnaire was sent to all users receiving the July 1997 CD-ROM release requesting their views regarding compression of files on the CD-ROM release. The responses were overwhelmingly in favor of compression. -------------------------------------------------------------------------- Energy Department Announces New BNL Contractor Based on Department of Energy press releases (http://apollo.osti.gov/doe/whatsnew/pressrel/pr97130.html and http://apollo.osti.gov/doe/whatsnew/pressrel/pr98001.html) Completing a major step in its ongoing effort to improve management and restore confidence at Brookhaven National Laboratory, the Department of Energy announced on November 25, 1997, the selection of Brookhaven Science Associates (BSA) as the new contractor to manage and operate its Long Island, NY, research facility. The BSA team is led by the Research Foundation of the State University of New York on behalf of the State University of New York at Stony Brook and Battelle Memorial Research Institute of Columbus, Ohio. Secretary of Energy Federico Pena said, "Brookhaven Science Associates has demonstrated leadership at their institutions, and I will look to them to fully integrate safety and environmental protection into scientific research, to accelerate and intensify recent efforts to rebuild community trust, and to achieve overall excellence. Working together, we will make it possible for the laboratory to carry out its mission as a world-class research facility and prove itself a good neighbor to Suffolk County and Long Island." "This is the fastest competition we have ever held for a management and operating contract, and reflects a new way of doing business at the Department of Energy," he added. 
Two proposals from nonprofit-led teams were submitted in response to a July 18, 1997, Request for Proposals. One proposal was from BSA. The other competing team was led by the IIT Research Institute of Chicago, Illinois. Both teams provided excellent proposals and demonstrated their ability to manage scientific, environmental, safety and health, and community involvement initiatives at the laboratory. The selected team demonstrated the best total capability to improve the laboratory's performance. Stony Brook is a national leader in high energy and nuclear physics. Battelle Memorial Research Institute has operated the department's Pacific Northwest National Laboratory in Richland, Washington, for the last 32 years and has long been a leader in applied science and technology, including environmental, safety and health management. Battelle manages environmental, safety and health activities at the department's Pantex Plant. BSA committed to exceeding the requirements of the department's Request for Proposal in several areas, including: the department's safety management program that integrates safety into employees' daily work activities; implementing ISO 14001, an international standard for environmental management systems; and instituting a Voluntary Protection Program whereby the lab will subscribe to meeting proven industrial standards for worker safety. Other immediate organizational commitments include retention of existing salary, benefits and tenure systems for employees and appointment of separate deputy directors for science and operations. In addition, offices for environment, safety and health, environmental management, reactor operations and community involvement will report directly to the laboratory director. The BSA proposal identified Dr. John Marburger as the new Brookhaven Laboratory director. Dr. Marburger is a distinguished science administrator and served as State University of New York at Stony Brook's president for 14 years. 
He has committed to integrate laboratory safety with scientific excellence and to regain community trust and confidence. The department's streamlined process for selecting a new contractor, limited to nonprofit organizations or teams led by nonprofit organizations, was completed in six months, rather than the usual 18, and was developed with extensive community, industrial and academic involvement. The department initially planned to award the contract in mid-November 1997. However, the award was not signed until January 5, 1998, in order to comply with recently enacted federal legislation requiring 60-day notice to the United States Congress before awarding certain contracts. BSA will assume responsibility for laboratory operations following a transition period of 55 days after the contract award. Until that time, Associated Universities Inc., the current contractor for Brookhaven, will continue to manage the laboratory. An introduction to BSA is available at: http://www.pubaf.bnl.gov/pr/BSA.htm. The source selection statement is available at: http://www.ch.doe.gov/bnlseb. -------------------------------------------------------------------------- Exceptional Science Fair Project Uses the PDB Reprinted below is a letter from a father about his son's use of the PDB. The PDB entry used was 1FKS: FK506 AND RAPAMYCIN-BINDING PROTEIN (FKBP12). The molecular dynamics program referred to was PMD, Parallel Molecular Dynamics (http://tincan.bioc.columbia.edu/pmd/pmd-summary.html), written by Dr. Andreas Windemuth. 25 Nov 1997 Hello Dr. Sussman, I was at the SuperComputing 97 show last week in San Jose, CA, and I stopped by the Brookhaven National Laboratory booth. I talked about my son who did his high school science fair project last year using the PDB at BNL, and the folks in the booth strongly suggested that I write you a short note describing his work. 
My son had a kidney transplant and one result of this is that he takes immunosuppressive drugs to prevent organ rejection. While searching the PDB, he found a few entries that were the binding proteins for a commonly used immunosuppressive. He copied the structure from the PDB server and then ran the structure through a molecular dynamics program available from Columbia University to generate the full 3D structure of the protein. He experimented with the protein by making single atom changes in one of the 107 amino acids, and had the MD program recompute the structure. He then compared the resulting shape of the new protein to the original protein to estimate the ability of his new protein to bind to the immunosuppressive drug. He found that there were several changes he could make to the protein that would appear to have little impact on its binding capabilities, while other simple one-atom changes resulted in a very different 3D structure. He was able to conclude, based upon residual displacements and by using RasMol to visualize the structures, which proteins would still be active in binding to the immunosuppressive and which would not. His science fair project was well received at both his school, Marian High School in Framingham, MA, as well as at the regional science fair at Worcester Polytechnic Institute and at the Massachusetts state science fair at MIT. This was a wonderfully educational experience for him, and gave him a positive experience in rational drug design. The relevance of this type of science to his daily life was made quite clear to him. The educational capabilities of having this type of data available on the Internet should not be overlooked. Regards, Dr. Don Dossa Digital Equipment Corp The entire PDB sends best wishes to Dr. Dossa's son for good health in the future. 
-------------------------------------------------------------------------- Writing Structure Factors in mmCIF using CCP4 Peter Keller European Bioinformatics Institute, Hinxton Hall, Cambridge, CB10 1SD, UK (keller@ebi.ac.uk) Depositing your MTZ-formatted structure factor data to the PDB using the Web-based AutoDep procedure is quite straightforward: all you need to do is use the `CIF' output option of the CCP4 program `mtz2various'. The output file can then be uploaded along with your coordinate file, when you start your submission. The PDB strongly encourages use of this procedure to prepare your structure factor file for submission. Unlike the other output formats of mtz2various, the mmCIF output will contain every reflection which is present in the MTZ file, even if the structure factor amplitude is the missing number flag, or the reflection is systematically absent for the space group which was finally assigned to the data. Each reflection is flagged in the output according to its status (see the CCP4 documentation on mtz2various for more information). Other ways to indicate the status of reflections in the output file:

* The `FREEVAL' option can be used to indicate the test set that was excluded from refinement for calculation of the free R factor.
* If, for some reason or other, you have performed your final refinement against a subset of the data, you can indicate the resolution limits with the `RESO' option, and a sigma cutoff with `EXCLUDE SIGP'.

A simple example might look like this:

    mtz2various hklin sf.mtz hklout sf.cif
    OUTPUT CIF data_sf
    LABI FP=FNAT SIGFP=SIGFNAT
    END

A more complicated example:

    mtz2various hklin sf.mtz hklout sf.cif
    OUTPUT CIF data_sf
    LABI FP=FNAT SIGFP=SIGFNAT I(+)=INAT SIGI(+)=SIGINAT FREE=FREEFLAG
    FREEVAL 2
    RESO 15.0 2.1
    END

In this case, the reflections for which the FREEFLAG column is 2 have been used as the free R test set, and only data between 15 and 2.1 Angstroms were used in the refinement. 
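Before uploading, it can be reassuring to check how the reflections in the output file have been flagged. The following is a minimal sketch in Python, not a general mmCIF parser: it assumes the reflections are written as a single `loop_' of whitespace-separated rows that includes the mmCIF dictionary item `_refln.status', and the sample data in the usage note are invented for illustration. The status codes actually written by mtz2various should be checked against the CCP4 documentation.

```python
from collections import Counter


def refln_status_counts(cif_text):
    """Tally the values of _refln.status in a simple mmCIF reflection loop.

    Assumes one row per reflection with whitespace-separated fields;
    quoted or multi-line values are not handled in this sketch.
    """
    counts = Counter()
    tags = []
    reading_tags = False
    reading_rows = False
    for raw in cif_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "loop_":
            tags, reading_tags, reading_rows = [], True, False
            continue
        if reading_tags:
            if line.startswith("_"):
                tags.append(line.split()[0])
                continue
            # First non-tag line: the loop header is complete.
            reading_tags = False
            reading_rows = "_refln.status" in tags
        if reading_rows:
            if line.startswith(("#", "_", "data_")):
                reading_rows = False
                continue
            fields = line.split()
            if len(fields) == len(tags):
                counts[fields[tags.index("_refln.status")]] += 1
    return counts
```

For example, given a (hypothetical) file containing the items `_refln.index_h', `_refln.index_k', `_refln.index_l', `_refln.status', `_refln.F_meas_au' and `_refln.F_meas_sigma_au', the function returns the number of reflections carrying each status code, so the size of the free R test set can be compared with what was used in refinement.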
Also, the merged intensities which were input to `truncate' have been retained in the data file, and are being written to the output mmCIF. The parameter following `OUTPUT CIF' must begin with `data_'. The characters which follow identify the data, and are to some extent arbitrary: they will be changed by PDB staff as appropriate when your submission is processed. If you are unsure what to put here, choose some alphanumeric string which means something to you, such as the MTZ filename or the name of the protein. The name of the original MTZ file appears within the first few lines of the output file - look for `_audit.creation_method'. Anomalous data are handled with the DP/SIGDP (and I(-)/SIGI(-)) column assignments. -------------------------------------------------------------------------- Protein Topology WWW Site David R. Westhead1,4, Daniel C. Hatton1 and Janet M. Thornton1,2,3 1 European Bioinformatics Institute, EMBL outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK 2 Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College, London, WC1E 6BT, UK 3 Laboratory of Molecular Biology, Department of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK 4 E-mail: westhead@ebi.ac.uk We have recently set up a WWW site (http://tops.ebi.ac.uk/tops) devoted to protein structural topology. The central service offered at this site is an "atlas" of protein topology cartoons in which each PDB entry has a representative topology cartoon. Also available is a "server" facility to which protein structures can be submitted (in PDB file format) for cartoon calculation, and a good deal of information about protein structural topology. Protein topology cartoons are simple two-dimensional schematic diagrams of protein folds. 
They represent a fold as a sequence of secondary structure elements (helices and strands) and show the relative position and direction of these elements in the fold. An example cartoon is shown in figure 1. Protein three-dimensional folds can be complicated and difficult to interpret. The aim of topology cartoons is to simplify them so that they can be more easily understood and compared. The simplification afforded by the cartoon in figure 1 is clear. [figure not available as text] Figure 1. The topology cartoon and 3D structure of superoxide dismutase (1jcv). The topology cartoon is displayed by software available on the WWW site. In the cartoon, triangles represent beta strands and circles helices. The direction of the strands is implied by the orientation of the triangles: "up" (out of the plane of the page) strands are drawn as up triangles and "down" strands as down triangles. The peptide chain runs from N1 to C2. The structure of the fold as a sandwich made of two anti-parallel beta sheets is clear from the cartoon. The atlas of topology cartoons was generated from the version of the PDB current on July 1, 1997. In order to avoid the generation of many duplicate cartoons, the chains present in the database were first clustered at a sequence similarity threshold of 95%. Chains consisting of nucleic acid sequences were removed, as were protein chains of less than 30 residues. From an original total of 10534 chains this produced 2144 clusters of near-identical sequences. From each cluster a representative TOPS diagram was produced from a single structure. This was chosen to be the highest resolution X-ray structure in the cluster, or an NMR structure if no X-ray structures were available. Within a chain, each structural domain was plotted separately using domain definitions taken from the CATH protein structure classification (Orengo et al., 1997). The cartoons were generated automatically, in the first instance, using a substantially modified version of the program TOPS (Flores et al., 1994). 
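The selection rules just described (discard nucleic acid chains and protein chains of fewer than 30 residues, then take the highest resolution X-ray structure in each cluster, or an NMR structure if there is none) can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary keys, the method labels and the rule for choosing among several NMR structures are hypothetical and are not taken from the TOPS software.

```python
def keep_chain(chain):
    """Keep protein chains of at least 30 residues; drop everything else."""
    return chain["kind"] == "protein" and chain["length"] >= 30


def pick_representative(cluster):
    """Choose one chain to represent a cluster of near-identical sequences.

    "Highest resolution" means the smallest resolution value in Angstroms.
    If no X-ray structure is present, the first NMR structure is returned
    (an arbitrary choice made for this sketch).
    """
    xray = [c for c in cluster
            if c["method"] == "xray" and c["resolution"] is not None]
    if xray:
        return min(xray, key=lambda c: c["resolution"])
    nmr = [c for c in cluster if c["method"] == "nmr"]
    return nmr[0] if nmr else None
```

Applied to a cluster holding X-ray structures at 2.5 and 1.8 Angstroms and one NMR structure, the sketch returns the 1.8 Angstrom entry; applied to a cluster with NMR structures only, it returns the first of them.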
While the original version of TOPS would produce satisfactory cartoons for simpler protein folds, it was found to be unable to do so for many more complicated folds. The modifications were necessary in order to increase the success rate of the program sufficiently to make automatic generation of a large number of cartoons a viable proposition. The generation of the atlas of cartoons was viewed as a test of the new version of the program. Each cartoon in the atlas was checked manually with the 3D structure of the protein, and the success rate in producing satisfactory cartoons was found to be 82%. Among the failures were many cartoons which were correct but not aesthetically pleasing, but there were still some complicated folds for which the program failed. The cartoons judged to be failures were corrected by hand editing and included in the atlas. The atlas is viewed using an applet (a program written in the Java programming language, delivered over the WWW, and run on the client machine). A basic applet using Java version 1.0 simply allows the user to view the cartoons, while users with a WWW browser supporting Java version 1.1 can use a much more functional applet which allows editing and printing of the cartoons. The same applets are used for viewing, editing, and printing cartoons generated at the request of the user by the server facility. Some users with older machines and/or browsers have experienced difficulties with the Java technology and for this reason a purely HTML/GIF version of the atlas will be provided in the near future. We hope to keep the atlas up to date as new structures arrive in the PDB. However, because updates to the atlas require significant effort, we anticipate that there will always be a time lag between structures arriving in the PDB and cartoons being put into the atlas. In this case users will be able to use the server to generate their own cartoons for the new structures. 
The software used in the generation of the atlas will be made available in some form, and details will be posted on the Web site. Acknowledgements We are grateful to Dr. T. P. Flores for giving us the source code for TOPS and allowing us to modify it without restriction. We are also grateful to Dr. C. A. Orengo for providing us with the domain boundary file associated with the CATH protein structural domain classification (Orengo et al., 1997). References Orengo, C. A., Michie, A. D., Jones, S., Jones, D. T., Swindells, M. B. & Thornton, J. M. (1997). CATH--a hierarchic classification of protein domain structures. Structure, 5, 1093-1108. Flores, T. P., Moss, D. S., & Thornton, J. M. (1994). An algorithm for automatically generating protein topology cartoons. Prot. Eng. 7, 31-37. -------------------------------------------------------------------------- SARF2 - a Program for Comparison of Protein Structures Nickolai N. Alexandrov Amgen, Thousand Oaks, CA, USA (nicka@amgen.com, http://www-lmmb.ncifcrf.gov/~nicka/info.html) Discovering new similarities in protein structures is an extremely exciting process. It is especially interesting if proteins are not sequentially related and so the structural similarity is completely unexpected. Obviously, when you find a structural resemblance, you have two problems: first, you need to prove that the similarity is significant, and, second, you need to explain the biological meaning of this similarity. Traditionally the significance of the match is demonstrated by an unusually large number of C-alpha atoms which can be superimposed with a small root mean square distance (rmsd). Biological meaning can be explained by the evolutionary relationship of the proteins, similar functional properties, and/or energetic stability of the 3D motif. There are several programs for protein structure comparison recently reviewed by Gibrat et al. (1996). However, finding common motifs in 3D structures is not a trivial problem. 
Probably the most difficult part here is to think up a measure of similarity between two structures which correlates with biological sense. Usually a similarity between two structures is described in terms of the number of C-alpha atoms and the rmsd between them. Yet, these numbers do not provide an adequate measure of structural similarity. For example, isolated residues with a small rmsd are likely to form a less significant match than a spatial arrangement of continuous backbone fragments. The program SARF2 (Alexandrov, 1996) detects common motifs in protein structures which consist of similarly-arranged backbone fragments. (The abbreviation SARF stands for Spatial ARrangement of backbone Fragments and was first used by Alexandrov et al., 1992.) There are two kinds of spatial resemblance: topological and non-topological similarities. Topological equivalence assumes that the fragments in both proteins are connected in the same sequential order. Non-topological similarities are relatively rare. An example of non-topological structural similarity is the four-helical bundle motif, in which helices can be differently connected, but still remain within the same protein architecture. SARF2 is able to detect both kinds of similarities. The Web site for SARF2 in the Laboratory of Experimental and Computational Biology at the National Cancer Institute (http://www-lmmb.ncifcrf.gov/~nicka/info.html) allows you to compare just two structures. If you want to compare many protein structures, you can download the program from the ftp site (ftp://ftp.ncifcrf.gov/pub/SARF2/) and run it on your SGI or DEC Alpha machine. There are mirror Web sites for SARF2 at the Baylor College of Medicine (http://defrag.bcm.tmc.edu:9503/lpt.html), at the GMD/SCAI in Germany (http://cartan.gmd.de/nick/run2.html), and at the Sanger Centre in England (http://genomic.sanger.ac.uk/). 
An important and still open question in protein structure comparison is an evaluation of the significance of the match. One way to solve this problem is to compare the structure of interest with all of the PDB and plot the distribution of the number of matched residues for each structure. The significance of the match can then be measured in the units of standard deviation from the mean of this distribution. This approach has been used by Alexandrov and Fischer (1996) to make a classification of the representative list of protein structures. A knowledge of the mechanism of protein function is sometimes a useful argument for the significance of the match. Frequently, active sites are surrounded by a similar structural environment, although the protein function can be different. And, vice versa, detection of an unexpected statistically significant structural similarity can lead to new speculations on the mechanism of protein function. The most interesting structural similarities are those between proteins with low amino acid identities. Understanding the origin of these similarities provides a deeper insight into the mystery of protein folding. The fact that the number of structural classes is smaller than the number of different sequence families encouraged many researchers to apply a variety of sequence-structure compatibility (threading) methods. One of these methods (program 123D), based on the contact capacity potentials, is also presented on the same NCI web site: http://www-lmmb.ncifcrf.gov/~nicka/info.html. References: Alexandrov, N. N. (1996). SARFing the PDB. Protein Eng. 9, 727-732. Alexandrov, N. N., Fischer, D. (1996). Analysis of topological and nontopological structural similarities in the PDB: new examples with old structures. Proteins, 25, 354-365. Alexandrov, N. N., Takahashi, K., & Go, N. (1992). Common spatial arrangements of backbone fragments in homologous and non-homologous proteins. J. Mol. Biol. 225, 5-9. Gibrat, J. 
F., Madej, T., & Bryant, S. H. (1996). Surprising similarities in structure comparison. Curr. Opinion in Struct. Biol. 6, 377-385. -------------------------------------------------------------------------- Molecular Docking by Fourier Correlation with FTDOCK Henry A. Gabb and Michael J.E. Sternberg Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, London, UK (gabb@ibm.wes.hpc.mil, m.sternberg@icrf.icnet.uk) The ability to predict the binding geometries of biomolecular complexes is becoming increasingly important with the growing number of individual structures deposited in the Protein Data Bank because experimental determination of the structure of biomolecular complexes remains a difficult problem. FTDOCK was developed to address the problem of docking unbound molecules when the structure of the complex is unavailable (i.e., predictive docking). FTDOCK implements the geometric surface recognition algorithm of Katchalski-Katzir and coworkers (Katchalski-Katzir et al., 1992) to dock two macromolecules. The method takes advantage of the fast Fourier transform (FFT) to rapidly search the translational space of two rigidly rotated molecules. An electrostatic function amenable to the Fourier correlation algorithm has been developed in this laboratory that improves the final rank of correctly docked molecules (Gabb et al., 1997). Possible docking orientations are scored for surface complementarity and favourable electrostatics using Fourier correlation theory. For docking starting with unbound coordinates, we have shown that in some systems inclusion of electrostatics is critical to success. We have used FTDOCK in our laboratory to dock several protein systems for which the coordinates of the complex and the individual subunits are available (Gabb et al., 1997). The test set was comprised of six enzyme-inhibitor and four antibody-antigen complexes. 
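The idea behind the translational search can be illustrated with a one-dimensional toy example in Python. The sketch below is not FTDOCK code: it uses a naive discrete Fourier transform in place of an FFT, a periodic 1-D grid in place of 3-D molecular grids, and invented grid values. It shows only that the correlation score for every shift can be obtained from one product of Fourier transforms instead of a separate sum for each shift.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (an FFT would be used in practice)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * x / n)
                    for x, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def correlate_direct(a, b):
    """Score every cyclic shift s as sum over x of a[x] * b[x + s]."""
    n = len(a)
    return [sum(a[x] * b[(x + s) % n] for x in range(n)) for s in range(n)]


def correlate_fourier(a, b):
    """Same scores from the correlation theorem: IDFT(conj(DFT(a)) * DFT(b)).

    Valid for real-valued grids a and b of equal length.
    """
    fa, fb = dft(a), dft(b)
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [c.real for c in dft(product, inverse=True)]


# Two toy "shapes" on an 8-point periodic grid (illustrative values only).
grid_a = [0, 1, 1, 0, 0, 0, 0, 0]
grid_b = [0, 0, 0, 0, 1, 1, 0, 0]
scores = correlate_fourier(grid_a, grid_b)
best_shift = max(range(len(scores)), key=lambda s: scores[s])
```

In this toy case the best-scoring shift is the one that brings the two shapes into register. With an FFT the cost of scoring all translations on a grid of N points falls from order N squared to order N log N, which is what makes a complete search practical for each rotation.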
In all but one of our test cases, correctly docked geometries (interface C-alpha root-mean-square deviation less than or equal to 2.5 Angstroms) are found during a complete search of binding space in a list that was always less than 250 complexes and often less than 30. At this point, biochemical information is still necessary to remove incorrect predictions. We found that knowledge of at least one binding site further improved rankings for correct solutions. For six out of nine test cases, a correctly docked complex was placed in the top five predictions. For the other three test cases, two had a correctly docked complex in the top fifteen predictions and the other had a correct answer in the top fifty. Considering that 10^10 geometries are screened during the global search of binding space, these results are encouraging. When information about the binding site on both molecules is available, a correctly docked complex scored in the top five for eight out of nine test cases. Many of these had the correct answer ranked first in the list of predictions. Even the worst test case had a correctly docked complex ranked 27th. FTDOCK was developed under Irix 5.3 and 6.2, but the program should run on any UNIX computer. The current version of FTDOCK uses either the fast Fourier transform from Numerical Recipes Software (Press et al., 1986) or the Silicon Graphics CHALLENGEcomplib(TM) (Silicon Graphics Inc.) to take advantage of the SGI shared memory multiprocessor. However, the latter FFT will also run efficiently on SGI serial computers. A typical docking experiment takes about six hours of CPU time using eight processors in parallel on a SGI Power Challenge. This assumes a rotational increment of 15 degrees (6385 nondegenerate rotations). A typical docking attempt takes 3-4 days on a SGI Indy using the Numerical Recipes FFT rather than the SGI library routine. 
In some cases, however, a larger increment can be used for the rotational search (Katchalski-Katzir et al., 1992). Using an angular deviation of 20 degrees (2629 nondegenerate rotations), for example, reduces the computational time to less than one day on a SGI Indy workstation. FTDOCK can also be used to dock non-protein systems like nucleic acids or small molecules. Our experiments with non-protein systems have not yet been published. The program can be obtained via our WWW site (http://www.icnet.uk/bmm/software.html). References Gabb, H. A., Jackson, R. M. & Sternberg, M. J. E. (1997). Modelling protein docking using shape complementarity, electrostatics, and biochemical information. J. Mol. Biol., 272, 106-120. Katchalski-Katzir, E., Shariv, I., Eisenstein, M., Freisen, A. A., Aflalo, C. & Vakser, I. A. (1992). Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation techniques. Proc. Natl. Acad. Sci. USA 89, 2195-2199. Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. (1986). Numerical Recipes in Fortran, Cambridge University Press. Available from Numerical Recipes Software (http://cfata2.harvard.edu/nr/). Silicon Graphics Inc. (1995) CHALLENGEcomplib(TM) Science and Math Library. (http://www.sgi.com/Products/Challengecomplib.html). -------------------------------------------------------------------------- MolView and MolView Lite Thomas J. Smith Department of Biological Sciences, Purdue University, West Lafayette, IN 47907 (tom@bragg.bio.purdue.edu) MolView is a program to display and analyze atomic structures and MolView Lite is a simple rendering program using the new QuickDraw3D technology. Both freeware applications are currently limited to the Macintosh personal computer, but work is underway to create a version compatible with WIN95. MolView has a wide variety of options to examine and display atomic structures. 
Atomic structures can be read into MolView as several types of text files: PDB, O plot files, ChemDraw 3D, and MolView files. Mono and stereo images can be interactively rotated with the mouse, tool palette buttons, or numerical input. Key aspects of the structures can be highlighted by mixing the available display modes: CPK, ribbon, ball&stick, line, and surface stippling. Emphasis has been placed on the user interface so that users unfamiliar with atomic structures can easily create figures and perform analysis while still being able to customize the object's attributes. Users can customize atomic labels and choose atoms for labeling by clicking on the atom or by picking them from a scrolling list. When creating ribbon diagrams, the secondary structure elements are either automatically determined from the structure using psi-phi values and hydrogen bonding patterns, or read from headers of PDB files. The user can color and toggle elements of the ribbon diagrams using a palette of buttons that display the current color, identify the residue number of the element, and show the type of secondary structure in that segment of the protein. The types of analyses that can be performed include distance measurements, 3D structural alignments, Ramachandran plots, Edmundson wheels, hydropathy plots, distance diagrams, B-value figures, hydrogen bonding patterns, and surface plots. More advanced users can also display crystallographic and non-crystallographically related molecules and unit cell boundaries. There are several options when displaying nucleic acids. A ribbon can be drawn along the phosphoribose backbone and the ring structures can be color coded by filling the rings with colored planes. For presentation and educational purposes, there are several types of files that can be read or written by MolView. When MolView files are updated, colors, some objects, and the new orientation matrix are saved in the file. 
Line drawings, MOL objects, can be written as separate objects with the color information stored in the header. Using these objects, students can drag and drop files into MolView to view prepared structural lessons. Three different types of QuickTime movies can be created to enhance display performance on older machines or when creating multi-media resources. The various objects and plots can all be written to object-oriented PICT files for publication quality images. Simple line drawings can be saved as DXF files for import into other rendering applications. Finally, all of the various types of molecular objects can be written to QuickDraw3D (3DMF) files where they can be read and interactively rendered by a growing number of applications that run under either MacOS or Windows95 operating systems. MolView Lite is a simple rendering application that reads and interactively renders the MolView 3DMF output. The image can be written out as a PICT image or QuickTime movie. There are also several other resources available at the MolView WWW site (http://bilbo.bio.purdue.edu/~tom). Files demonstrating crystallographic and non-crystallographic symmetry are included in the application package. An interactive tutorial can be downloaded that takes the user through examples of the major options and explains many of the terms used in the write-up. The write-up is available in Word (5.1 and 6.0) and HTML formats. Example images and movies are also available at this site. -------------------------------------------------------------------------- OLDERADO: Extracting Single Structures, Core Atoms and Domains from a NMR-derived Ensemble Lawrence A. Kelley and Michael J. Sutcliffe Department of Chemistry, Leicester University, Leicester, UK (L.Kelley@icrf.icnet.uk, sjm@le.ac.uk). 
We have recently developed a WWW server, OLDERADO (On-Line Database of Ensemble Representatives and Domains; http://neon.chem.le.ac.uk/olderado/) (Kelley & Sutcliffe, 1997), which identifies the "best" single structure in a NMR-derived ensemble (Sutcliffe, 1993; Kelley et al., 1996), and determines the "core" atoms across the ensemble and the domain(s) (or rigid body(ies)) to which these belong (Kelley et al., 1997). The database component of OLDERADO has been integrated into the "Atlas" page resulting from a PDB 3DB Browser(TM) search for a NMR-derived ensemble, and in addition, individual representative MODEL structures for a PDB entry can be downloaded via the European Bioinformatics Institute (EBI) (http://www2.ebi.ac.uk/msd/nmr_search.shtml). OLDERADO consists of two components: (i) a database of NMR-derived ensembles deposited in the PDB, and (ii) the functionality to upload and analyse a user's own ensemble of structures. Generation of the OLDERADO database, and processing of uploaded structures, is performed by two analysis tools: NMRCORE (Kelley et al., 1997) and NMRCLUST (Kelley et al., 1996). NMRCORE automatically defines the core atoms and the domains in which these lie. This is achieved using a sorted list of dihedral angle order parameters (Hyberts et al., 1992) to define the core, followed by the definition of the domain(s) which comprise the core using automatic clustering of the variances in inter-atom distances. NMRCLUST automatically clusters ensemble members into conformationally-related sub-families. All structures are superimposed in a pairwise manner and the resulting RMS distance between each pair calculated. These distances are used as a similarity score on which to base the clustering. Average linkage cluster analysis is used in conjunction with a novel penalty function to determine a cut-off in the clustering hierarchy automatically. 
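The clustering step can be sketched in Python as follows. This is an illustration only and is not the NMRCLUST code: the penalty function that chooses the cut-off automatically is not reproduced, so a fixed cut-off is passed in as a parameter, and the rule used to pick a representative (the member with the smallest summed distance to the others in its cluster) is an assumption made for this sketch. The input is a symmetric matrix of pairwise RMS distances between ensemble members.

```python
def average_linkage(dist, cutoff):
    """Agglomerative average-linkage clustering of a pairwise distance matrix.

    Repeatedly merges the two clusters with the smallest mean inter-member
    distance, stopping when that distance exceeds the (fixed) cutoff.
    Returns clusters of member indices, largest cluster first.
    """
    clusters = [[i] for i in range(len(dist))]

    def mean_distance(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, p, q = min((mean_distance(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        if d > cutoff:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return sorted((sorted(c) for c in clusters), key=len, reverse=True)


def representative(cluster, dist):
    """Member with the smallest summed distance to the rest of its cluster."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))
```

For a five-member ensemble in which models 0, 1 and 2 lie within about 0.5 Angstroms of one another, models 3 and 4 lie within 0.6 Angstroms of each other, and the two groups are about 3 Angstroms apart, a cut-off of 1.0 Angstrom gives two clusters, [0, 1, 2] and [3, 4], listed in order of size.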
At the top of the results page, there is a summary which defines the largest domain and the "most representative" MODEL entry. Under this are two tables - the first detailing (in order of domain size) the core and domain(s), and the second (in order of cluster size) the representative structure(s) and cluster membership. These domains and clusters can be viewed interactively in three dimensions via the "View Domains" and "View Clusters" buttons, respectively. OLDERADO has also been integrated into the PDB 3DB Browser - it is accessed via the Atlas page if the result of a search is a NMR-derived ensemble. The link gives direct access to the OLDERADO database entry for this PDB entry; the information available is described in the preceding paragraph. Additionally, the OLDERADO methodology enables users to download via the EBI (with the aid of Kim Henrick in the Macromolecular Structure Group) an individual MODEL (by default, the "most representative"), rather than the entire ensemble, from an existing PDB entry. In cases where a user requires only a single MODEL, or a set of "representative" models, this reduces the bandwidth required for download, reduces the disk space required on the local machine, and eliminates the need to split a downloaded file. References Hyberts, S. G., Goldberg, M. S., Havel, T. F. & Wagner, G. (1992). The solution structure of eglin c based on measurements of many NOEs and coupling constants and its comparison with X-ray structures. Protein Sci. 1, 736-751. Kelley, L. A. & Sutcliffe, M. J. (1997). OLDERADO: on-line database of ensemble representatives and domains. On Line Database of Ensemble Representatives And DOmains. Protein Sci. 6, 2628-2630. Kelley, L. A., Gardner, S. P. & Sutcliffe, M. J. (1996). An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally related subfamilies. Protein Eng. 9, 1063-1065. Kelley, L. A., Gardner, S. P. & Sutcliffe, M. J. (1997). 
An automated approach for defining core atoms and domains in an ensemble of NMR-derived protein structures. Protein Eng. 10, 737-741.
Sutcliffe, M. J. (1993). Representing an ensemble of NMR-derived protein structures by a single structure. Protein Sci. 2, 936-944.

--------------------------------------------------------------------------

Notes of a Protein Crystallographer - FRODO, the Electronic Hobbit

Cele Abad-Zapatero
Department of Structural Biology, Abbott Laboratories, Abbott Park, IL, USA (abad@abbott.com)

From early childhood, John Ronald Reuel Tolkien (J.R.R. Tolkien: 1892-1973) was fascinated with languages. When he was five, his mother - who was fluent in Latin, French, and German - taught him to read in all three languages plus her native English. Fatherless since 1896, the family lived in a small rented cottage in the hamlet of Sarehole by the Cole River, far from the smokestacks and soot of Birmingham. The quiet meadows and streams of Sarehole were a haven for Ronald and his younger brother Hilary. There his mother introduced them to botany and inspired in them a love for plants, trees and the beauty of natural landscapes. Nonetheless, change again came to his life abruptly. His mother died in 1904 and the brothers were left under the guardianship of a Catholic priest, Father Francis Morgan, who had a tremendous influence on his education and his life. Tolkien graduated from King Edward VI's school in Birmingham and won an award to attend Oxford University. His interest and passion for languages led him to study philology, specializing in the literary and linguistic tradition of the English West Midlands with extensive knowledge of Anglo-Saxon (or Old English as in Beowulf), Middle English (the language of Chaucer), and Finnish, Icelandic, Norse and Germanic mythologies and folklore.
He was Professor of Anglo-Saxon at Oxford and a Fellow of Pembroke College from 1925 to 1945, and Professor of English Language and Literature and a Fellow of Merton College from 1945 until his retirement in 1959. It is impossible to separate Tolkien's academic achievements from his creation of two multi-faceted, highly imaginative, epic stories which had a tremendous influence on the youth of the 1960's all over the world and whose effect still reverberates today. In 1937 he published The Hobbit (Tolkien, 1994), which received high acclaim as a fascinating children's story in which he introduced as main characters a `hobbit' named Bilbo Baggins and a magician of sorts named Gandalf. Tolkien later wrote that the origin of the word hobbit seems to be: "a worn-down form of a word preserved more fully in the language of Rohan: holbytla or `hole-builder'" (Tolkien, 1993b). What is a hobbit? In his own words: " [..] They are (or were) a little people, about half our height, and smaller than the bearded dwarves. Hobbits have no beards. There is little or no magic about them, except the ordinary everyday sort which helps them to disappear quietly and quickly when large stupid folk like you and me come blundering along, making a noise like elephants which they can hear a mile off. They are inclined to be fat in the stomach; they dress in bright colours (chiefly green and yellow); wear no shoes, because their feet grow natural leathery soles and thick warm brown hair like the stuff on their heads (which is curly); have long clever brown fingers, good-natured faces, and laugh deep fruity laughs (especially after dinner, which they have twice a day when they can get it). Now you know enough to go on with." (Tolkien, 1994, p.3) The illusion of hobbits as calm, simple people capable of heroic feats caught on quickly and Tolkien was asked to write more adventures of Bilbo Baggins.
The Hobbit had ended with Bilbo keeping a ring that he had found during his encounter with Gollum, and living happily in the Shire: the idyllic part of Middle-earth where the hobbits lived and that scholars have related to the Sarehole of Tolkien's childhood (Neimark, 1996). The author had no desire to write a sequel. Instead, The Fellowship of the Ring, the first volume of the epic trilogy The Lord of the Rings, was published in 1954. Soon after, the next two volumes appeared: The Two Towers and The Return of the King. The completed work was a mythological world of monumental proportions in which Tolkien had given life to creatures, kingdoms, wars, calendars, climates, places, landscapes, and seasons to give flesh and blood to the languages spoken by the people of Middle-earth: humans, elves, trolls, goblins, giants, dragons, ents, balrogs, orcs. The hero was Frodo, heir and nephew of Bilbo Baggins, who together with his friend Sam and other companions of the fellowship undertakes a quest to destroy the master evil ring of Sauron that Frodo had inherited from his uncle. The appeal of an innocent, gentle creature succeeding in destroying the forces of evil against all odds, in an unspoiled landscape of pristine forests, mountains and lakes was enormous. By 1967, The Lord of the Rings had been translated into nine languages with an estimated readership of fifty million people. The graffito: FRODO lives! (Tolkien, 1993a), appeared in the New York subway as testimony to a cultural phenomenon that had opened a magic wonderland of places, characters and events unhindered by the prosaic incidents of our everyday lives. Tolkien had transcended the arcana of scholarly research in obscure languages to create a universal allegory of the constant struggle of good against evil, with strong environmental overtones. FRODO, the electronic hobbit, had its origins in 1976. Whether the younger generations believe it or not, at that time all protein models were built starting from a C-alpha
tracing obtained from markings on an electron density map drawn on small plexiglass sheets stacked up as "mini-maps" (Jones, 1985). From these guide coordinates, detailed atomic models were built at a much larger scale on a Richards optical comparator known in the trade as the "Richards Box" or "Fred's Folly" using Kendrew model parts (Richards, 1985). Glass or plastic windows had to be drawn by hand with tracings of the electron density contours at the appropriate scale (2 cm = 1 A). Atomic coordinates were laboriously extracted from this wire model by tedious and often inaccurate protocols (Salemme, 1985). There was an immediate need for a computerized method that would allow the fitting of an atomic model to the experimental electron density map, and which would remove the tedium and inaccuracies from macromolecular structure determination and refinement (Editorial, 1997). The idea was floating in the community and several laboratories had initiated projects to achieve that goal. Drs. J. Gassman and R. Huber found a bright young Welsh would-be crystallographer who was interested in living in Munich to develop such a tool, and encouraged him to make it a program useful for the routine operation in a protein crystallography laboratory. Tradition has it that the original program sent data back and forth between a PDP11 and a SIEMENS4004, in a computing environment where many of the programs were named after different hobbits. It was only natural that the central program would be named after the most famous of all the hobbits in Tolkien's trilogy. For obvious reasons, the test version used most of the computing cycles and was called initially SAURON. FRODO made his appearance in the protein crystallography community twenty years ago in 1978 (Jones, 1978). As for myself, I got to know FRODO very well in 1981 during three beautiful weeks of immersion during the incomparable Swedish spring.
Our friendship developed during many nocturnal model-building sessions at the old Wallenberg Laboratory next to the ancient city castle in Uppsala. I must confess that we had our crises, but he was certainly a very friendly hobbit. I was the one to blame for every crisis. Quite often, I failed to understand his prompts or suggestions, and many times his cues made no sense to me. He was always patient, effective and obedient. You could CHAT (actual FRODO commands in capital letters) with him via a keyboard but the most effective way to communicate was with a tablet and a pen which would allow you to pick and identify atoms, and select different commands from a MENU on the screen. Obedient to the GO command, FRODO would display for you a certain volume of electron density and using well designed commands you could tell him to BREAK certain bonds and cut the protein chain into pieces. These pieces could then be moved with six degrees of freedom (FBRT) to make them fit into the three-dimensional electron density maps which could be rotated at will with dials. FRODO did not know any protein chemistry, or if he did, he would not explicitly tell you so. It was you who would organize those constellations of points in space into a meaningful protein chain by using the REFInement command. He would faithfully apply the rules of chemistry to certain ZONEs of your spatial points which were covered by your electron density contours. This was a tremendous help when trying to fit those old electron density maps. FRODO was also very handy at modeling exercises by allowing you to create MOLecular objects that you could use either as background while fitting electron density or as objects of study in their own right. For some time, the rumor (joke) floated in the community that the only documentation for FRODO was "The Lord of the Rings". 
This might have been true, but in his own humble way FRODO proved to be a very useful hobbit and was the ancestor of many other electronic hobbits that are now well settled in our computer underworld. In addition, his faithful friend SAM was always available to insert or delete residues, create a sequence and do all the necessary bookkeeping so that in the end everything was SAVEd on the disk with `amazing speed' and accuracy. During my visit, FRODO lived in an independent VAX750 computer and his commands were translated into a Vector General VG3400. Later he lived inside many other boxes or hobbit-holes in many other countries. His performance improved as his electronic eyes and hands improved, permitting us to view unimaginable shapes and forms and to examine atomic continents, islands and landscapes of indescribable complexity and beauty. Following his original insights, we can now see atomic crevasses and caves, canyons, rivers, mountain ridges and valleys in different and vivid colours, and subtle hues and shades. FRODO opened for us an atomic underworld that was beyond our reach before. He introduced us to an atomic Middle-world that we could not have imagined without his assistance and that we are just beginning to explore, appreciate and understand. One could argue that there are no malicious villains in our atomic Middle-world: no Dark Riders or Ringwraiths trying to prevent FRODO from destroying the evil ring. Yet, we routinely encounter, examine, and study molecules with pathogenic and curative properties in our crystals, and a major part of our time is spent trying to understand their interactions with themselves and with others. We are trying to defeat the evil forces of disease, pain and deformity and our operational domain is the atomic Middle-world that FRODO unveiled for us.
There are parts of these atomic creatures that we cannot see or cannot fit well in our electron density maps, and that chase us in our sleep like the Dark Riders chased after FRODO and his friends. However, our true Gollum, Shelob and Sauron are uncertainty, lack of knowledge, and especially bias and disorder. Those restrictive forces will always be with us. In the meantime, FRODO will live on in the heart of those of us who -once upon a time- built protein models using mechanical parts and read the coordinates of our structures using a two-dimensional grid and a plumb line. He did so many things for us; he was such a good friend....

References

Editorial (1997). String and sealing wax. Nature Struct. Biol. 4, 961-964.
Jones, T. A. (1978). A graphics model building and refinement system for macromolecules. J. Appl. Cryst. 11, 268-272.
Jones, T. A. (1985). Diffraction methods for biological macromolecules. Interactive computer graphics: FRODO. Methods Enzymol. 115, 157-171.
Neimark, A. E. (1996). Myth Maker: J. R. R. Tolkien, pp. 85-86, Harcourt Brace & Co., New York.
Richards, F. M. (1985). Optical matching of physical models and electron density maps: early developments. Methods Enzymol. 115, 145-154.
Salemme, F. R. (1985). Some minor refinements on the Richards optical comparator and methods for model coordinate measurement. Methods Enzymol. 115, 154-157.
Tolkien, J. R. R. (1993a). The Lord of the Rings Trilogy. Part One: The Fellowship of the Ring. Intro by Peter Beagle, Authorized edition of the fantasy classic by Ballantine Books, New York.
Tolkien, J. R. R. (1993b). The Lord of the Rings Trilogy. Part Three: The Return of the King. Appendix F. Authorized edition of the fantasy classic by Ballantine Books, New York.
Tolkien, J. R. R. (1994). The Hobbit. 2nd Ed. Houghton Mifflin Company, New York.
--------------------------------------------------------------------------

Web Sites Referenced in the January 1998 PDB Quarterly Newsletter

BioMagResBank (BMRB) NMR database
    http://www.bmrb.wisc.edu
FTDOCK
    http://www.icnet.uk/bmm/software.html
IUCr Policy on Publication
    http://hobbes.gh.wits.ac.za/iucr-top/journals/acta/actaa_notes.html
mmCIF
    http://ndb.rutgers.edu/NDB/mmcif/
MolView
    http://bilbo.bio.purdue.edu/~tom
New BNL Contractor
    http://www.pubaf.bnl.gov/pr/BSA.htm
    http://apollo.osti.gov/doe/whatsnew/pressrel/pr97130.html
    http://apollo.osti.gov/doe/whatsnew/pressrel/pr98001.html
    http://www.ch.doe.gov/bnlseb
Nucleic Acid Database Submissions
    http://ndbserver.rutgers.edu:80/NDB/deposition/index.html
Numerical Recipes
    http://cfata2.harvard.edu/nr/
OLDERADO
    http://neon.chem.le.ac.uk/olderado/
PDB AutoDep Submissions
    http://www.pdb.bnl.gov
    http://autodep.ebi.ac.uk
PDB 3DB Browser(TM)
    http://www.pdb.bnl.gov/pdb-bin/pdbmain
PMD, Parallel Molecular Dynamics
    http://tincan.bioc.columbia.edu/pmd/pmd-summary.html
Program 123D
    http://www-lmmb.ncifcrf.gov/~nicka/info.html
Representative MODEL Structures for a PDB Entry
    http://www2.ebi.ac.uk/msd/nmr_search.shtml
SARF2
    http://www-lmmb.ncifcrf.gov/~nicka/info.html
    ftp://ftp.ncifcrf.gov/pub/SARF2/
    http://defrag.bcm.tmc.edu:9503/lpt.html
    http://cartan.gmd.de/nick/run2.html
    http://genomic.sanger.ac.uk/
SFCHECK
    http://www.sdsc.edu/Xtal/IUCr/CC/School96/
SGI CHALLENGEcomplib(tm)
    http://www.sgi.com/Products/Challengecomplib.html
Structure Factor mmCIF Dictionary
    ftp://ftp.pdb.bnl.gov/structure_factors/cifSF_dictionary
TOPS
    http://tops.ebi.ac.uk/tops
Uppsala Electron Density Server
    http://alpha2.bmc.uu.se/valid/density/form1.html

-------------------------------------------------------------------------

Affiliated Centers and Mirror Sites

Forty affiliated centers offer the Protein Data Bank database archives for distribution. These centers are members of the Protein Data Bank Service Association (PDBSA).
Centers designated with an asterisk (*) may distribute the archives both on-line and on magnetic or optical media; those without an asterisk are on-line distributors only. Official PDB Mirror Sites are marked with a grey bar and are listed with their sponsoring center.

ARGENTINA

UNIVERSIDAD NACIONAL DE SAN LUIS
Facultad de Ciencias Fisico Matematicas y Naturales
Universidad Nacional de San Luis
San Luis, Argentina
Jorge A. Vila (54-652-22803)
vila@unsl.edu.ar
http://linux0.unsl.edu.ar/fmn
PDB Mirror Site: http://pdb.unsl.edu.ar
Fernando Aversa (aversa@unsl.edu.ar)

AUSTRALIA

WEHI
The Walter and Eliza Hall Institute
Melbourne, Australia
Tony Kyne (61-3-9345-2586)
tony@wehi.edu.au
http://www.wehi.edu.au
PDB Mirror Site: http://pdb.wehi.edu.au/pdb
Tony Kyne (tony@wehi.edu.au)

BRAZIL

UNIVERSIDADE FEDERAL DE MINAS GERAIS
Instituto de Ciencias Biologicas
Belo Horizonte, MG - Brazil
Marcelo M. Santoro (55-31-441-5611)
santoro@icb.ufmg.br
Ari M. Siqueira (55-31-952-7470)
siqueira@icb.ufmg.br
http://www.1cc.ufmg.br/
PDB Mirror Site: http://www.pdb.ufmg.br
Ari M. Siqueira (siqueira@cenapad.ufmg.br)

CANADA

NATIONAL RESEARCH COUNCIL OF CANADA
Institute for Marine Biosciences
Halifax, N.S., Canada
Christoph W. Sensen (902-426-7310)
sensencw@niji.imb.nrc.ca
http://cbrmain.cbr.nrc.ca

CHINA

PEKING UNIVERSITY
Molecular Design Laboratory
Institute of Physical Chemistry
Beijing 100871, China
Luhua Lai (86-10-62751490)
lai@ipc.pku.edu.cn
http://www.ipc.pku.edu.cn
PDB Mirror Site: http://www.ipc.pku.edu.cn/pdb
Li Weizhong (liwz@csb0.ipc.pku.edu.cn)

FINLAND

CSC
CSC Scientific Computing Ltd.
Espoo, Finland
Erja Heikkinen (358-9-457-2433)
erja.heikkinen@csc.fi
http://www.csc.fi

TURKU CENTRE FOR BIOTECHNOLOGY
University of Turku and Abo Akademi University
Turku, Finland
Adrian Goldman (358-2-3338029)
goldman@btk.utu.fi
http://www.btk.utu.fi

FRANCE

IGBMC
Laboratory of Structural Biology
Strasbourg (Illkirch), France
Frederic Plewniak (33-8865-3273)
plewniak@igbmc.u-strasbg.fr
http://www-igbmc.u-strasbg.fr

LIGM
Laboratoire d'ImmunoGenetique Moleculaire
Montpellier, France
Marie-Paule LeFranc (33-04-67-61-36-34)
Lefranc@ligm.crbm.cnrs-mop.fr
http://imgt.cnusc.fr:8104

GERMANY

DKFZ
German Cancer Research Center
Heidelberg, Germany
Otto Ritter (49-6221-42-2372)
o.ritter@dkfz-heidelberg.de
http://www.dkfz-heidelberg.de

EMBL
European Molecular Biology Laboratory
Heidelberg, Germany
Hans Doebbeling (49-6221-387-247)
hans.doebbeling@embl-heidelberg.de
http://www.EMBL-Heidelberg.DE

GMD
German National Research Center for Information Technology
Sankt Augustin, Germany
Theo Mevissen (49-2241-14-2784)
theo.mevissen@gmd.de
http://www.gmd.de
PDB Mirror Site: http://pdb.gmd.de
Theo Mevissen (theo.mevissen@gmd.de)

ISRAEL

WEIZMANN INSTITUTE OF SCIENCE
Rehovot, Israel
Jaime Prilusky (972-8-9343456)
lsprilus@weizmann.weizmann.ac.il
http://www.weizmann.ac.il
PDB Mirror Site: http://pdb.weizmann.ac.il
Marilyn Safran (pdbhelp@pdb.weizmann.ac.il)

ITALY

ICGEB
International Centre for Genetic Engineering and Biotechnology
Trieste, Italy
Sandor Pongor (39-40-3757300)
pongor@icgeb.trieste.it
http://www.icgeb.trieste.it

JAPAN

FUJITSU KYUSHU SYSTEM ENGINEERING LTD.
Computer Chemistry Systems
Fukuoka, Japan
Masato Kitajima (81-92-852-3131)
ccs@fqs.fujitsu.co.jp
http://www.fqs.co.jp/CCS

*JAICI
Japan Association for International Chemical Information
Tokyo, Japan
Hideaki Chihara (81-3-5978-3608)

*OSAKA UNIVERSITY
Institute for Protein Research
Osaka, Japan
Masami Kusunoki (81-6-879-8634)
kusunoki@protein.osaka-u.ac.jp

THE NETHERLANDS

CAOS/CAMM
Dutch National Facility for Computer Assisted Chemistry
Nijmegen, The Netherlands
Jan Noordik (31-80-653386)
noordik@caos.caos.kun.nl
http://www.caos.kun.nl

POLAND

WARSAW UNIVERSITY
Interdisciplinary Centre for Modelling
Warszawa, Poland
Wojtek Sylwestrzak (48-22-874-9100)
W.Sylwestrzak@icm.edu.pl
http://www.icm.edu.pl
PDB Mirror Site: http://pdb.icm.edu.pl
Wojtek Sylwestrzak (W.Sylwestrzak@icm.edu.pl)

SWEDEN

UPPSALA UNIVERSITY
Department of Molecular Biology
Uppsala University
Uppsala, Sweden
Alwyn Jones (46-18-174982)
alwyn@xray.bmc.uu.se
http://pdb.bmc.uu.se or http://alpha2.bmc.uu.se

TAIWAN

NATIONAL TSING HUA UNIVERSITY
Department of Life Science
HsinChu City, Taiwan
J.-K. Hwang (+886 3-5715131, extension 3481) or
lshjk@life.nthu.edu.tw
P.C.
Lyu (+886 3-5715131 extension 3490)
lslpc@life.nthu.edu.tw
http://life.nthu.edu.tw
PDB Mirror Site: http://pdb.life.nthu.edu.tw/
Tony Wu (mirror@life.nthu.edu.tw)

NCHC
National Center for High-Performance Computing
Hsinchu, Taiwan, ROC
Jyh-Shyong Ho (886-35-776085; ext: 342)
c00jsh00@nchc.gov.tw

UNITED KINGDOM

BIRKBECK
Crystallography Department
Birkbeck College, University of London
London, United Kingdom
Ian Tickle (44-171-6316854)
tickle@cryst.bbk.ac.uk
http://www.cryst.bbk.ac.uk

*CCDC
Cambridge Crystallographic Data Centre
Cambridge, United Kingdom
David Watson (44-1223-336394)
watson@ccdc.cam.ac.uk
http://www.ccdc.cam.ac.uk
PDB Mirror Site: http://pdb.ccdc.cam.ac.uk/
Ian Bruno (mirror@ccdc.cam.ac.uk)

EMBL OUTSTATION: THE EUROPEAN BIOINFORMATICS INSTITUTE
Wellcome Trust Genome Campus
Hinxton, Cambridge, United Kingdom
Philip McNeil (44-1223-494-401)
mcneil@ebi.ac.uk
http://www.ebi.ac.uk
PDB Mirror Site: http://www2.ebi.ac.uk/pdb
Philip McNeil (pdbhelp@ebi.ac.uk)

*OML
Oxford Molecular Ltd.
Oxford, United Kingdom
Kevin Woods (44-1865-784600)
kwoods@oxmol.co.uk
http://www.oxmol.co.uk or http://www.oxmol.com

SEQNET
Daresbury Laboratory
Warrington, United Kingdom
User Interface Group (44-1925-603351)
uig@daresbury.ac.uk
http://www.seqnet.dl.ac.uk

UNITED STATES

*APPLIED THERMODYNAMICS, LLC
Hunt Valley, Maryland, USA
George Privalov (410-771-1626)
George_Privalov@classic.msn.com
http://www.mole3d.com

BMRB
BioMagResBank
University of Wisconsin - Madison
Madison, Wisconsin, USA
Eldon L.
Ulrich (608-265-5741)
elu@bmrb.wisc.edu
http://www.bmrb.wisc.edu

BMERC
BioMolecular Engineering Research Center
College of Engineering, Boston University
Boston, Massachusetts, USA
Nancy Sands (617-353-7123)
sands@darwin.bu.edu
http://bmerc-www.bu.edu

CMU
Carnegie Mellon/Pittsburgh Supercomputing Center
Pittsburgh, Pennsylvania, USA
Hugh Nicholas (412-268-4960)
nicholas@psc.edu
http://pscinfo.psc.edu/biomed/biomed.html

MAG
Molecular Applications Group
Palo Alto, California, USA
Margaret Radebold (415-846-3575)
bold@mag.com
http://www.mag.com

*MSI
Molecular Simulations Inc.
San Diego, California, USA
Stephen Sharp (619-799-5353)
ssharp@msi.com
http://www.msi.com

NCBI
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
Bethesda, Maryland, USA
Stephen Bryant (301-496-2475)
bryant@ncbi.nlm.nih.gov
http://www.ncbi.nlm.nih.gov

NCSA
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Champaign, Illinois, USA
Allison Clark (217-244-0768)
aclark@ncsa.uiuc.edu
http://www.ncsa.uiuc.edu/Apps/CB

NCSC
North Carolina Supercomputing Center
Research Triangle Park, North Carolina, USA
Linda Spampinato (919-248-1133)
linda@ncsc.org
http://www.mcnc.org

*PANGEA SYSTEMS, INC.
Oakland, CA 94612
Greg Thayer (510-628-0100)
gregt@pangeasystems.com

SAN DIEGO SUPERCOMPUTER CENTER
San Diego, California, USA
Philip E. Bourne (619-534-8301)
bourne@sdsc.edu
http://www.sdsc.edu

*TRIPOS
Tripos, Inc.
St. Louis, Missouri, USA
Akbar Nayeem (314-647-1099; ext: 3224)
akbar@tripos.com
http://www.tripos.com

UNIVERSITY OF GEORGIA
BioCrystallography Laboratory
Department of Biochemistry and Molecular Biology
University of Georgia
Athens, Georgia, USA
John Rose or B.C.
Wang (706-542-1750)
rose@BCL4.biochem.uga.edu
http://www.uga.edu/~biocryst
PDB Mirror Site: http://BCL10.bmb.uga.edu
John Rose (rose@BCL4.biochem.uga.edu)

-------------------------------------------------------------------------

Related WWW Sites

Databases

Archive of Obsolete PDB Entries
    http://pdbobs.sdsc.edu/
BMRB (BioMagResBank)
    http://www.bmrb.wisc.edu
CCDC (Cambridge Crystallographic Data Centre)
    http://www.ccdc.cam.ac.uk
EBI (European Bioinformatics Institute)
    http://www.ebi.ac.uk
EMBL (European Molecular Biology Laboratory)
    http://www.embl-heidelberg.de
ExPASy Molecular Biology Server
    http://www.expasy.ch
GDB (Genome Data Base)
    http://gdbwww.gdb.org
GenBank (NIH Genetic Sequence Database)
    http://www.ncbi.nlm.nih.gov/Web/Genbank/index.html
HIC-Up (Hetero-compound Information Centre Uppsala)
    http://alpha2.bmc.uu.se/hicup/
HIV Protease Database
    http://www-fbsc.ncifcrf.gov/HIVdb/
Klotho: Biochemical Compounds Declarative Database
    http://www.ibc.wustl.edu/klotho/
Library of Protein Family Cores
    http://WWW-SMI.Stanford.EDU/projects/helix/LPFC/
Crystal MacroMolecule Files at EBI
    http://www2.ebi.ac.uk/msd/macmol_doc.shtml
NCBI (National Center for Biotechnology Information)
    http://www.ncbi.nlm.nih.gov
NDB (Nucleic Acid Database)
    http://ndbserver.rutgers.edu
PDB (Protein Data Bank)
    http://www.pdb.bnl.gov
PIR (Protein Information Resource)
    http://www-nbrf.georgetown.edu/pir
Prolysis: A Protease and Protease Inhibitor Web Server
    http://delphi.phys.univ-tours.fr/Prolysis/
Protein Kinase Database Project
    http://www.sdsc.edu/kinases/
Protein Motions Database
    http://hyper.stanford.edu/~mbg/ProtMotDB/
RELIBase
    http://pdb.pdb.bnl.gov:8081/home.html
SCOP: Structural Classification of Proteins
    http://scop.mrc-lmb.cam.ac.uk/scop/
    Mirrored at Protein Data Bank
    http://www.pdb.bnl.gov/scop/
Swiss-Prot Sequence Database
    http://expasy.hcuge.ch/sprot/sprot-top.html
CATH Protein Structure Classification
    http://www.biochem.ucl.ac.uk/bsm/cath
Enzyme Structures Database
    http://www.biochem.ucl.ac.uk/bsm/enzymes/
PDBsum
    http://www.biochem.ucl.ac.uk/bsm/pdbsum

Software-Related Sites

CCP4
    http://www.dl.ac.uk/CCP/CCP4/main.html
    ftp://ccp4a.dl.ac.uk/pub/ccp4
mmCIF
    http://ndbserver.rutgers.edu/NDB/mmcif
O Home Page
    http://imsb.au.dk/~mok/o/
OPM (Object-Protocol Model) Data Management Tools
    http://gizmo.lbl.gov/DM_TOOLS/OPM/OPM.html
RasMol Home Page
    http://www.umass.edu/microbio/rasmol/
SHELX Home Page
    http://linux.uni-ac.gwdg.de/SHELX
Squid: Analysis and Display of Data from Crystallography and Molecular Dynamics
    http://www.yorvic.york.ac.uk/~oldfield/squid/
VMD - Visual Molecular Dynamics
    http://www.ks.uiuc.edu/Research/vmd/
X-PLOR Home Page
    http://xplor.csb.yale.edu/

Other Resources

Crystallography Worldwide
    http://www.unige.ch/crystal/w3vlc/crystal.index.html
BioMoo
    http://www.cco.caltech.edu/~mercer/htmls/BioMOOHomePage.html
DALI - Comparison of Protein Structures in 3D
    http://www.embl-heidelberg.de/dali/dali.html
NCSA Biology Workbench
    http://biology.ncsa.uiuc.edu/
MOOSE (Macromolecular Structure Database at San Diego Supercomputer Center)
    http://db2.sdsc.edu/moose
PDB_select: Representative PDB Structures
    ftp://ftp.embl-heidelberg.de/pub/databases/protein_extras/pdb_select/recent.pdb_select
PROCHECK - To Submit a PDB File for Analysis
    http://www.cryst.bbk.ac.uk/PPS/procheck/test.html
Protein Structure Verification-Biotech Server
    http://biotech.embl-heidelberg.de:8400/
    Mirrored at Protein Data Bank
    http://biotech.pdb.bnl.gov:8400/
Resources for Macromolecular Structure Information
    http://www.ucmb.ulb.ac.be/StructResources.html
The Virtual School of Molecular Sciences
    http://www.vsms.nottingham.ac.uk/vsms/
Weizmann Institute, Genome and Bioinformatics
    http://bioinfo.weizmann.ac.il/

-------------------------------------------------------------------------------

PDB (TM) Order Form

Name of User                    Date
Organization                    Phone
Address                         Fax
E-mail

- Price is valid through September 30, 1998
- Price is per CD-ROM set released -- releases occur
four times per year - Facsimile and phone orders are not acceptable The Protein Data Bank MUST receive all three of the following items before shipment can be completed (please send all required items together via postal mail -- facsimile and phone orders are NOT acceptable): 1. Completed order form; 2. Mailing label indicating exact shipping address; and 3. Payment (using one of the two options below): * Check payable to Brookhaven National Laboratory in U.S. dollars and drawn on a U.S. bank. Foreign checks cannot be accepted and will be returned. * Original purchase order payable to Brookhaven National Laboratory. After your order is processed, you will be invoiced by Brookhaven National Laboratory. Please indicate exact address to which invoice should be sent: A wire transfer is acceptable only AFTER we have received an original purchase order from your organization and you have been invoiced by Brookhaven. After receiving Brookhaven's invoice, your bank may send a wire transfer to: Bank name: Morgan Guaranty Trust Co. of New York Account name: Brookhaven National Laboratory Account number: 076-51-912 Please send all three required items together via postal mail to: PDB(TM) Orders Biology Department, Building 463 Brookhaven National Laboratory P.O. Box 5000 Upton, NY 11973-5000 One (1) release of the PDB(TM) on CD-ROM -- ISO 9660 Format $362.45 Total for four (4) releases $1449.80 (tax and shipping charges not applicable) For Order Information: Telephone... +1-516-344-5752 * Fax... +1-516-344-1376 * Email... 
orders@pdb.pdb.bnl.gov

--------------------------------------------------------------------------

Access to the PDB

Main Telephone              +1-516-344-3629
Help Desk Telephone         +1-516-344-6356
Fax                         +1-516-344-5751
Help Desk                   pdbhelp@bnl.gov
General Correspondence      pdb@bnl.gov
WWW Home Page               http://www.pdb.bnl.gov
FTP Server                  ftp.pdb.bnl.gov
Network Services            sysadmin@pdb.pdb.bnl.gov
Entry Error Reports         errata@pdb.pdb.bnl.gov
Order Information           orders@pdb.pdb.bnl.gov
User Group                  PDBusrgrp@suna.biochem.duke.edu
Listserver Postings         pdb-l@pdb.pdb.bnl.gov
Listserver Subscriptions    listserv@pdb.pdb.bnl.gov

To subscribe, the text of your message should be:
    subscribe PDB-L Your Name

-----------------------------------------------------------------

FTP Directory Structure for Entries

The PDB FTP server is updated weekly. Files are available by anonymous ftp to ftp.pdb.bnl.gov. Entry files are found under the directory pub/pdb/

all_entries/          coordinate entry files in compressed and uncompressed format
biological_units/     generated coordinates for the biomolecules
current_release/      current database, with entries removed or added since the last CD-ROM
fullrelease/          static copy of the database as found on the last CD-ROM
latest_update/        entries added or removed in the most recent FTP update
newly_released/       entries released since the last CD-ROM
nmr_restraints/       compressed NMR restraint files
obsolete_entries/     withdrawn and/or replaced entries
structure_factors/    compressed structure factor files

fullrelease, newly_released, and current_release are divided into multiple subdirectories.

--------------------------------------------------------------------------

Scientific Consultants

John P. Rose, University of Georgia, Athens, Georgia, USA

Sasha Faibusovich
Clifford Felder
Kurt Giles
Jaime Prilusky
Mia Raves
Marilyn Safran
Vladimir Sobolev
Yehudit Weisinger
Weizmann Institute of Science
Rehovot, Israel

--------------------------------------------------------------------------

PDB Staff

Joel L.
Sussman, Head
Enrique E. Abola, Deputy Head and Head of Scientific Content/Archive Management
Otto Ritter, Head of Informatics

Frances C. Bernstein
Betty R. Deroski
Arthur Forman
Sabrina Hargrove
Jiansheng Jiang
Mariya Kobiashvili
Jiri Koutnik
Patricia A. Langdon
Michael D. Libeson
Dawei Lin
Nancy O. Manning
John E. McCarthy
Christine Metz
Michael J. Miley
Regina K. Shea
Janet L. Sikora
S. Swaminathan
Dejun Xue

--------------------------------------------------------------------------

Statement of Support

The PDB is supported by a combination of Federal Government Agency funds (work supported by the U.S. National Science Foundation; the U.S. Public Health Service, National Institutes of Health, National Center for Research Resources, National Institute of General Medical Sciences, and National Library of Medicine; and the U.S. Department of Energy under contract DE-AC02-76CH00016) and user fees.

-------------------------------------------------------------------------

Instructions to Authors

Contributions to the PDB Quarterly Newsletter may be sent by e-mail or diskette to:

Nancy O. Manning, Editor
oeder@bnl.gov

References should be in the format used by the Journal of Molecular Biology. Deadlines for contributions are: March 1, June 1, September 1, and December 1.

--------------------------------------------------------------------------

Protein Data Bank
Biology Department, Bldg. 463
Brookhaven National Laboratory
P.O.
Box 5000
Upton, NY 11973-5000 USA
Telephone +1-516-344-3629
Fax +1-516-344-5751

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

Number of Entries Deposited (Bar) and Average Time to Release (Line)
Accumulated and Averaged on a Quarterly Basis

[image not available as text]

Bar Graph - Number of Entries in the Following Categories:
    OnHold - (light blue) On-hold per depositor request
    Processing - (white) Being processed
    Released - (black) Released
Line Graph - Average Number of Days to Release

The data were accumulated and averaged on a quarterly basis. The average turn-around times for entries now being processed are estimated based on the average of the last 12 months. Data for the last quarter are accumulated until the date specified on the graph. See http://www.pdb.bnl.gov/pdb-docs/EntryTurnAround.html for regularly updated plot.