RCSB PDB Newsletter Number 31 -- Fall 2006

Published quarterly by the Research Collaboratory for Structural Bioinformatics Protein Data Bank

Weekly RCSB PDB news is published at www.pdb.org

To change your subscription options, please visit lists.sdsc.edu/mailman/listinfo.cgi/rcsb-news

-----------------------------------------
TABLE OF CONTENTS

Message from the RCSB PDB

Data Deposition and Processing
  Next Generation of ADIT and ADIT-NMR Available for Depositions
  Tips for Depositing Multiple Related Structures using ADIT
  The Searchable PDB Exchange Dictionary
  Deposition Statistics

Data Query, Reporting, and Access
  Protein Workshop: A Visualization Tool
  Advanced Search Tutorial
  Website Statistics

Outreach and Education
  Art of Science Exhibitions
  Meetings and Exhibits
  2006 RCSB PDB Poster Prize
  Molecules of the Quarter
  New RCSB PDB Flyers Available in Print and Online

PDB Education Corner: Robert J. Warburton, Shepherd University
PDB Community Focus: Wah Chiu, Baylor College of Medicine
RCSB PDB Job Listing: Biochemical Information & Annotation Specialist
Statement of Support, Partners, Leadership Team
Snapshot

--------------------------------------------
MESSAGE FROM THE RCSB PDB

In August, the wwPDB announced that PDB depositions will be restricted to atomic coordinates that are substantially determined by experimental measurements on specimens containing biological macromolecules, effective October 15, 2006. This policy was recommended by a working group of structural and computational biologists and endorsed by the wwPDB advisory committee. Thus, theoretical model depositions (such as models determined purely in silico using, for example, homology or ab initio methods) will no longer be accepted. Theoretical models that have been available from the PDB archives will continue to be publicly available via the existing models FTP directory.
A paper describing the outcome of the working group's Workshop on Archiving Structural Models of Biological Macromolecules was published in Structure:

H.M. Berman, S.K. Burley, W. Chiu, A. Sali, A. Adzhubei, P.E. Bourne, S.H. Bryant, R.L. Dunbrack, Jr., K. Fidelis, J. Frank, A. Godzik, K. Henrick, A. Joachimiak, B. Heymann, D. Jones, J.L. Markley, J. Moult, G.T. Montelione, C. Orengo, M.G. Rossmann, B. Rost, H. Saibil, T. Schwede, D.M. Standley, and J.D. Westbrook (2006) Outcome of a workshop on archiving structural models of biological macromolecules. Structure 14: 1211-1217.

Questions about this transition should be sent to info@wwpdb.org.

--------------------------------------------
DATA DEPOSITION AND PROCESSING

NEXT GENERATION OF ADIT AND ADIT-NMR AVAILABLE FOR DEPOSITIONS

When depositing your next structure, try using either beta-ADIT or beta-ADIT-NMR.

Beta-ADIT (deposit-beta.rcsb.org/adit/) has been designed to make your deposition more complete and error-free. This tool offers a number of advantages over the current version of ADIT, including:
* Consistency checking between sequence and coordinates
* Indication of format errors, with suggestions for solutions
* Easier options for entering author information

Depositions made with beta-ADIT are real deposition sessions that will be processed by annotators. Beta-ADIT will become the only version of ADIT after a period of testing. Please help us improve this tool by sending your feedback to deposit@deposit.rcsb.org.

Beta-ADIT-NMR (batfish.bmrb.wisc.edu/bmrb-adit/) can be used to create individual or combined NMR depositions to the BMRB and PDB archives. This new deposition system accepts multiple NMR data files: structural (e.g., coordinates) and experimental (e.g., constraints, chemical shifts, coupling constants, relaxation data, pKa). The RCSB PDB and BMRB have developed this single tool so that depositors would not be required to use two different tools to deposit these data.
Once data are deposited using beta-ADIT-NMR, they will be processed (structural data at the RCSB PDB, experimental data at the BMRB) and made available for download from their respective public archives. Questions and suggestions about beta-ADIT-NMR should be sent to bmrbhelp@bmrb.wisc.edu.

RCSB PDB FOCUS: TIPS FOR DEPOSITING MULTIPLE RELATED STRUCTURES USING ADIT

When depositing many structures that are related to one another, there are a few ways of making the ADIT process simpler:

* Structures solved using X-ray crystallography or NMR should be prepared using pdb_extract before using ADIT. This will minimize manual typing and save time during the deposition process. pdb_extract takes information about data collection, phasing, density modification, and the final structure refinement from the output files and log files produced by the applications used for structure determination. The collected information is organized into a file ready for deposition using ADIT. Information duplicated in all entries (author name, citation information, protein names, etc.) can be included in a text file that is prepared once and used when running pdb_extract for each entry. After pdb_extract has combined all the available information into a single file for each structure, ADIT can be used for quick deposition.

* A similar tool is being developed for structures solved by other experimental methods. For these structures, deposit one representative structure following the instructions provided at deposit.pdb.org. Then write to help@deposit.rcsb.org to let us know about the other related entries. Once the first entry has been annotated, processed and finalized, it can be used as a template for your subsequent depositions. For each structure, replace the coordinates and update the information in the header section of the PDB file as necessary to prepare the related files for deposition.
* If the structures have ligands, drugs or inhibitors bound to them, please check Ligand Depot and match the three-letter code in the file to the one used in the chemical component dictionary. If the ligand is not present in the dictionary, please email detailed information (complete chemical name, 2D figure showing connectivity, bond order and stereochemistry) along with the RCSB and PDB IDs of the associated entries to expedite the processing of these files.

THE SEARCHABLE PDB EXCHANGE DICTIONARY

The information collected, processed and distributed by the wwPDB is all defined in the PDB Exchange Dictionary.(1) This dictionary, along with several other dictionaries, can be searched using the text box at the top of the Dictionary Resources page, and browsed using the HTML version. The XML Schema for the PDB Exchange Data Dictionary is also available for download.

The PDB Exchange Dictionary includes definitions for X-ray crystallography, NMR, 3D EM, and protein production. These definitions were developed and reviewed by discipline experts and by the member organizations of the wwPDB. There are currently 3395 definitions in the PDB Exchange Dictionary that are divided into 283 categories. The categories are organized into groups of related definitions analogous to the organization of related columns in a table.

The PDB Exchange Dictionary uses the dictionary language developed for the macromolecular Crystallographic Information File (mmCIF) dictionary.(2) The dictionary includes textual definitions and examples as would be found in any language dictionary, as well as data types, boundary conditions and controlled vocabularies that can be used by software applications to validate and maintain uniformity of usage in data files. Since the dictionary is fully software accessible, it can also be translated into alternative formats, as has been done in the case of the eXtensible Markup Language (XML) to provide a PDBML dictionary.(3)

(1)Westbrook, J. et al. (2005) In Hall, S.
R. and McMahon, B. (eds.), International Tables for Crystallography. Springer, Dordrecht, The Netherlands, Vol. G. pp. 195-198.
(2)Fitzgerald, P.M.D. et al. (2005) In Hall, S. R. and McMahon, B. (eds.), International Tables for Crystallography. Springer, Dordrecht, The Netherlands, Vol. G. Definition and exchange of crystallographic data, pp. 295-443.
(3)Westbrook, J. et al. (2005) Bioinformatics, 21, 988-992.

DEPOSITION STATISTICS

As of October 1, 5376 structures have been deposited to the PDB this year. The entries were processed by the wwPDB teams at RCSB PDB, MSD-EBI, and PDBj.

Of the structures deposited, 69.3% were deposited with a release status of "hold until publication"; 17.8% were released as soon as annotation of the entry was complete; and 12.9% were held until a particular date. 81.9% of these entries were determined by X-ray crystallographic methods; 12.5% were determined by NMR methods; and 82.6% of all of these depositions were deposited with experimental data.

--------------------------------------------
DATA QUERY, REPORTING, AND ACCESS

PROTEIN WORKSHOP: A VISUALIZATION TOOL

Protein Workshop is a new molecular viewer available from the RCSB PDB from every structure summary page. Its simple interface lets users quickly and easily select structural elements to change the coloring, labeling, and representation style (ribbons, cylinders, and more). Users can also color specific structural features based on conformation type, hydrophobicity, and residue type.

Protein Workshop is an excellent tool for generating high-resolution images in JPG, BMP, TIFF, WBMP, and PNG formats. A tutorial for creating these images is available.

This Java tool uses the Molecular Biology Toolkit (mbt) and JOGL technology. It requires no installation other than the most recent version of Java. Tutorials are available from the RCSB PDB website.
ADVANCED SEARCH TUTORIAL

The majority of simple searches of the RCSB PDB website are performed using the keyword box at the top of each page. More specific and complex searches are possible using the "Advanced Search" that appears at the top of each page. The Advanced Search fully unleashes the power of the RCSB PDB query engine.

The "Quick Feature Guide" launches the Advanced Search tutorial. This short animated and narrated feature requires Flash software. The tutorial describes the major features of the Advanced Search; at its conclusion, the viewer can go to a help page with full details on using all of the Advanced Search features.

If you have any problems using Advanced Search, email info@rcsb.org for help. Suggestions for improvements are always welcome.

WEBSITE STATISTICS

Access statistics for www.pdb.org are given below for the third quarter of 2006.

MONTH   UNIQUE VISITORS   NUMBER OF VISITS   BANDWIDTH
JUL              95,899            237,200   549.54 GB
AUG              84,884            216,782   542.96 GB
SEP             109,812            267,619   570.53 GB

--------------------------------------------
OUTREACH AND EDUCATION

Art of Science Exhibitions

The Art of Science exhibit was on display at City College, The City University of New York from August 1-11, 2006. Sponsored by the Pathways Bioinformatics and Biomolecular Center, the exhibit was opened with a presentation by RCSB PDB Director Helen M. Berman. The exhibit and talk coincided with the Center's bioinformatics workshop for high school students.

The exhibit then traveled to the Hostetter Arts Center at The Pingry School in Martinsville, NJ. This exhibit also featured models from 3D Molecular Designs. For the past few years, Pingry students in Tommie Hata's and Deidre O'Mara's science classes have been interested in structural biology (see Spring 2004's Education Corner).
Their SMART teams (Students Modeling a Research Topic) have visited the RCSB PDB at Rutgers, built three-dimensional models of structures, such as RNA polymerase, and presented their work at Experimental Biology conferences. During September, Pingry students (grades 7-12) were able to explore the structures found in the PDB. Biology and art classes were held in the gallery to look at this interesting intersection of art and science, and to inspire students to create works of their own.

If you would be interested in sponsoring this exhibit at your institution, please let us know at info@rcsb.org.

Meetings and Exhibits

* American Crystallographic Association (ACA) Annual Meeting: The RCSB PDB met with many depositors and users at the ACA's Annual Meeting (July 22 - July 27, 2006; Honolulu, Hawaii). Demonstrations of the RCSB PDB website and deposition software tools were given in the exhibit hall.

RCSB PDB Director Helen M. Berman received the M.J. Buerger Award from the ACA. The award recognizes her lifetime work in the development of information services for the global community of researchers who both produce and use macromolecular structural data. The triennial award was established in 1983 in honor of the crystallographic contributions of Martin J. Buerger, Institute Professor Emeritus of M.I.T. and University Professor Emeritus of the University of Connecticut. The award recognizes established scientists who have made contributions of exceptional distinction in areas of interest to the ACA.

The award was presented by ACA President Robert Bau at the Buerger Symposium. The session featured a few of Berman's collaborators, and focused on new and emerging technologies for determining biological macromolecular structures, on how the resultant data are used to further the understanding of molecular function, and on the underpinnings of the bioinformatics framework that makes many of these studies possible.
* In Silico Analysis of Proteins: The 20th Anniversary of Swiss-Prot
As part of the celebration of Swiss-Prot's 20 years of service to the scientific community, RCSB PDB Co-Director Philip E. Bourne presented the lecture "The RCSB PDB - Teaching an Old Dog New Tricks" (July 30 - August 4; Fortaleza, Brazil).

* 14th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB)
At the ISMB meeting (August 6-10, 2006; Fortaleza, Brazil), the RCSB PDB presented new website features and answered questions regarding new tools and services. At the exhibit booth, visitors saw full demonstrations and sample code for accessing data and resources through the RCSB PDB's web services framework.

* The RCSB PDB Poster Prize was awarded for best student posters related to macromolecular crystallography at ACA and the European Crystallographic Meeting (ECM; August 6 - 11; Leuven, Belgium), and for best student poster in the "Structural Bioinformatics" category at ISMB this year. The prize will also be awarded at the Asian Crystallographic Association meeting later this year. Winners received a subscription to Science and their choice of a volume of the International Tables for Crystallography (International Union of Crystallography in conjunction with Springer) for the crystallographic prizes, and Bioinformatics: The Machine Learning Approach (Baldi and Brunak, 2001) for the structural bioinformatics prize. Many thanks to all of the participants, judges, and organizers. The winning posters are described at http://www.pdb.org/pdb/static.do?p=general_information/about_pdb/poster_prize_2006.html

Molecules of the Quarter

The Molecule of the Month series explores the function and significance of selected biological macromolecules for a general audience. The molecules featured this quarter were amyloid-beta precursor protein, AAA+ proteases, and elongation factors. The complete Molecule of the Month features are accessible from the RCSB PDB home page.
New RCSB PDB Flyers Available in Print and Online

Two new brochures are available for RCSB PDB users. The General Information trifold provides an overview of the RCSB PDB project, and includes information about data deposition, data query and reporting, Molecule of the Month, structural genomics, wwPDB, and outreach and education resources. 5 Easy Steps for Structure Deposition describes the tools that facilitate NMR and X-ray crystal structure deposition and validation for depositors.

To receive printed copies of these flyers, please send your postal address and brochure request to info@rcsb.org. Requests can be made for multiple copies.

--------------------------------------------
PDB EDUCATION CORNER: Robert J. Warburton, Shepherd University

Robert J. Warburton earned his Ph.D. in Biochemistry from Duquesne University. He is currently a Professor of Biochemistry at Shepherd University teaching courses in Biochemistry, Protein Chemistry and the non-majors course, Chemistry in Society.

There's a story about a group of blind men inspecting an elephant. Each can only see a part of the whole and each is convinced they know the identity of the animal. Each is, of course, wrong. This was my feeling in the 1980s as I worked in the lab of Dr. Dave Seybert at Duquesne University. My Ph.D. thesis involved the attempt to discern the three-dimensional shape of bovine mitochondrial adrenodoxin reductase (AR) by limited proteolytic cleavage. The structure of the enzyme was, at that time, unknown. A number of cDNA sequences had been determined and some information, with respect to glycosylation sites, had been proposed. An initial experiment using limited tryptic cleavage had been designed based on the functional similarities between a spinach ferredoxin oxidoreductase and the bovine mitochondrial enzyme. The cleavage produced fragments of approximately 30 kDa and 20 kDa and indicated the possibility of a two-domain structure in the 55 kDa AR.
So began a series of experiments attempting to characterize the structure and function of the two fragments within the whole -- I felt that I was the blind man with an elephant.(1) The experience of my graduate work cemented my interest in the relationship between subtle changes in primary structure and function.

I left Duquesne in 1990 and traveled south to the lab of Dr. Jeff Frelinger at the University of North Carolina at Chapel Hill. Here was a new story for me to read. Here we actually had a photograph of the elephant to study in detail. The people in the lab were concerned with trying to determine how the triad of heavy chain, light chain and peptide of the Class I Major Histocompatibility Complex (MHC) molecule HLA-A*0201 would be affected by point mutations. Multiple variants of the heavy chain primary sequence, containing single and multiple point mutations, had been produced by saturation mutagenesis. Not only did we have the picture of the elephant, but having moved some of the parts of the beast around, could it still walk?

The first crystal structure of HLA-A*0201 had been solved by Dr. Pam Bjorkman and co-workers, working in the lab of the late Dr. Don Wiley at Harvard, and submitted to the PDB as 3HLA.(2) The issue of Nature that contained the first images of "A2" was pored over by the folks in the Frelinger lab. As is always the case, many questions were answered, and many more raised, by the structure presented. It was at this time that I was introduced to the Evans and Sutherland workstation. Many hours were spent in a darkened room slowly moving the structure back and forth, zooming in and out, and drinking coffee -- I still remember the thrill of seeing a dynamic image on the screen before me.

One of my projects was to determine the effect of two point mutations on HLA-A*0201 that had disrupted a disulfide bridge.
This bridge, between cysteine 101 and cysteine 164, held an extended section of alpha-helix to the edge of a beta-sheet platform that made up one side of a peptide binding cleft. The question asked was: could such a mutation in the primary sequence, and the consequent loss of rigidity in the tertiary structure, disrupt the ability of the protein to function ... could the elephant still walk? The answer was, yes it could walk, but it didn't leave the house much!(3)

For me, the difference between seeing and not seeing the protein of interest was enormous, as fundamental to understanding as building models is to making sense of stereochemistry in organic chemistry.

For the full Education Corner: A Journey Out of Darkness, including references and images, please see the HTML or PDF version of the newsletter.

--------------------------------------------
PDB COMMUNITY FOCUS: Wah Chiu, Baylor College of Medicine

Dr. Wah Chiu is the Alvin Romansky Professor of Biochemistry at Baylor College of Medicine. He is a leading investigator in the structural determination of biological nanomachines using cryo-electron microscopy (cryoEM) towards atomic resolution. His laboratory has pioneered various experimental and computational methods in biological cryoEM. He has determined cryoEM structures of filament bundles, ion channels, viruses and chaperonins at subnanometer resolutions.

He is the founding director of two NIH-supported research centers: the National Center for Macromolecular Imaging (ncmi.bcm.edu) and the Center for Protein Folding Machinery (proteinfoldingcenter.org). Both involve investigators from diverse disciplines in biology, medicine, physics, chemistry, engineering and computing from different institutions and industries across the U.S.
He is the founding director of the Graduate Program in Structural and Computational Biology and Molecular Biophysics at Baylor College of Medicine (scbmb.bcm.tmc.edu), whose 68 faculty members from multiple academic institutions in the greater Houston area train future scientists at the interface between biomedicine and physical, chemical, mathematical, computational and engineering sciences. He is also the co-founder of the Gulf Coast Consortia for Collaborative Research and Training in the Houston-Galveston Area with faculty and trainees from Baylor College of Medicine, Rice University, University of Houston, MD Anderson Cancer Center, University of Texas Houston Medical School and University of Texas Galveston Medical Branch.

Q: What was your path into the field of cryoEM?

A: I entered the field of electron microscopy while I was a graduate student. The field of cryoEM was started in the lab at Berkeley where I did my PhD thesis.

Q: What is cryoEM? Why do you think the number of cryoEM structures is increasing?

A: CryoEM uses a transmission electron microscope with frozen, hydrated specimens kept at low temperature (below liquid nitrogen temperature). The number of cryoEM structures is increasing partly because the technology has become simpler for biologists to use and partly because biologists are interested in studying large complexes that are too difficult for conventional crystallography or that are complementary to the crystal structures of the molecular components or the entire complex in one crystalline state.

Q: How far can single particle cryoEM technology be pushed -- will it eventually be possible to attain truly atomic resolution?

A: The best single particle cryoEM study is now capable of producing a density map of a large macromolecular assembly at ~4 Angstrom resolution.
I expect that, by combining bioinformatics with the available PDB structures of component homologs, the cryoEM map will be interpretable in terms of a polypeptide backbone trace and bulky side chains in the near future. To determine single particle structure at truly atomic resolution (i.e., 2 Angstrom or better), numerous technical hurdles have to be overcome.

Q: What is the next frontier in terms of structures that will be observable by cryoEM?

A: CryoEM is currently focused on the study of biological assemblies which are composed of multiple molecular components and have multiple conformations at different functional states. There is also tremendous enthusiasm to pursue cryo-electron tomography of cells and organelles.

Q: Currently, cryoEM maps can be deposited at the EBI, and then coordinates fitted into the maps are deposited into the PDB. How can the processes of depositing and archiving cryoEM data be improved?

A: We would like to deposit both the cryoEM density map and the associated models to one site. Currently, we lack both uniform standards for data representation and tools for visualizing low resolution cryoEM maps. In addition, validation of the observed map and model requires more technical development. To make a successful repository site, we need collaborations among the cryoEM specialists, biology end-users, computational and mathematical specialists and the experienced staff at the RCSB PDB and MSD-EBI. Equally important, steady federal support to initiate and maintain such an infrastructure is necessary.

Q: What is the work of the National Center for Macromolecular Imaging (NCMI)?

A: NCMI is a national facility for macromolecular cryoEM supported by the National Center for Research Resources of the NIH. Our missions are core technology research and development, collaboration, service, training and dissemination. Our Center serves the biological community in a manner similar to synchrotron beam lines in that users can apply to use our facility.
The approved projects will be carried out by the users or in collaboration with our experienced staff. We also engage in development of data processing and structure interpretation software, all of which are freely available through our web site. NCMI also sponsors annual workshops to train users to use the newest cryoEM technologies.

----------------------------------------
RCSB PDB JOB LISTING: BIOCHEMICAL INFORMATION & ANNOTATION SPECIALIST

The RCSB Protein Data Bank (www.pdb.org) is a publicly accessible information portal for researchers and students interested in structural biology. At its center is the PDB archive -- the sole international repository for the 3-dimensional structure data of biological macromolecules. These structures hold significant promise for the pharmaceutical and biotechnology industries in the search for new drugs and in efforts to understand the mysteries of human disease.

The primary mission of the RCSB PDB is to provide accurate, well-annotated data in the most timely and efficient way possible to facilitate new discoveries and scientific advances. The RCSB processes, stores, and disseminates these important data, and develops the software tools needed to assist users in depositing and accessing structural information.

The RCSB Protein Data Bank at Rutgers University in Piscataway, NJ has an opening for a Biochemical Information & Annotation Specialist to curate and standardize macromolecular structures for distribution in the PDB archive. The annotation specialist communicates daily with members of the deposition community, and annotates, releases, and updates entries in the PDB archive. Annotation is performed on a Linux system using BLAST, PubMed, and other tools.

A background in biological chemistry (PhD, MS, BS or BA) is required. Experience with Linux computer systems, biological databases, crystallography, and/or NMR spectroscopy is a strong advantage.
The successful candidate should be self-motivated, pay close attention to detail, possess strong written and oral communication skills, and meet deadlines. This position offers the opportunity to participate in an exciting project with significant impact on the scientific community.

Please send a resume to Dr. Helen M. Berman at pdbjobs@rcsb.rutgers.edu.

----------------------------------------
STATEMENT OF SUPPORT

The RCSB PDB is supported by funds from the National Science Foundation, the National Institute of General Medical Sciences, the Office of Science, Department of Energy, the National Library of Medicine, the National Cancer Institute, the National Center for Research Resources, the National Institute of Biomedical Imaging and Bioengineering, the National Institute of Neurological Disorders and Stroke, and the National Institute of Diabetes & Digestive & Kidney Diseases.

The RCSB PDB is managed by two partner sites of the Research Collaboratory for Structural Bioinformatics:

RUTGERS
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
610 Taylor Road
Piscataway, NJ 08854-8087

SDSC/UCSD
San Diego Supercomputer Center and the Skaggs School of Pharmacy and Pharmaceutical Sciences
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093-0537

RCSB PDB LEADERSHIP TEAM

Dr. Helen M. Berman - Director
Rutgers University
berman@rcsb.rutgers.edu

Dr. Philip E. Bourne - Co-Director
SDSC/SKAGGS/UCSD
bourne@sdsc.edu

A list of current RCSB PDB Team Members is available from the website.

The RCSB PDB is a member of the Worldwide PDB (www.wwpdb.org)

-----------------------------------------
SNAPSHOT
October 1, 2006

39051 released atomic coordinate entries

* Molecule Type
  35767 proteins, peptides, and viruses
   1671 nucleic acids
   1579 protein/nucleic acid complexes
     34 other

* Experimental Technique
  33126 X-ray
   5707 NMR
    134 electron microscopy
     84 other

21163 structure factor files
 3014 NMR restraint files
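
For readers who want to cross-check the snapshot figures, the following is a minimal sketch in Python. It is illustrative only: the variable and function names are made up for this example and are not part of any RCSB PDB tool, and the counts are simply copied from the snapshot above. It sums each breakdown, compares the result with the stated number of released entries, and computes each category's share.

```python
# Counts copied from the October 1, 2006 snapshot.
# All names below are hypothetical, chosen only for this illustration.
TOTAL_RELEASED = 39051

by_molecule_type = {
    "proteins, peptides, and viruses": 35767,
    "nucleic acids": 1671,
    "protein/nucleic acid complexes": 1579,
    "other": 34,
}

by_technique = {
    "X-ray": 33126,
    "NMR": 5707,
    "electron microscopy": 134,
    "other": 84,
}


def breakdown_matches(breakdown, total):
    """Return True when the category counts add up to the stated total."""
    return sum(breakdown.values()) == total


def percentages(breakdown, total):
    """Return each category's share of the total in percent, rounded to 0.1."""
    return {name: round(100.0 * count / total, 1)
            for name, count in breakdown.items()}


if __name__ == "__main__":
    print("molecule types sum to total:",
          breakdown_matches(by_molecule_type, TOTAL_RELEASED))
    print("techniques sum to total:",
          breakdown_matches(by_technique, TOTAL_RELEASED))
    for name, share in percentages(by_technique, TOTAL_RELEASED).items():
        print(f"{name}: {share}%")
```

Both breakdowns add up to the 39051 released entries. Note that these shares describe released entries in the archive, so they need not match the percentages quoted under Deposition Statistics, which describe this year's depositions.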