RCSB PDB Newsletter
Number 39 -- October 2008

Published quarterly by the Research Collaboratory for Structural Bioinformatics Protein Data Bank

Weekly RCSB PDB news is published at www.pdb.org

To change your subscription options, please visit lists.sdsc.edu/mailman/listinfo.cgi/rcsb-news

-----------------------------------------

TABLE OF CONTENTS

Message from the RCSB PDB

Data Deposition and Processing
  Announcement: Comprehensive Format Guide Version 3.2
  Ligand Expo: A Resource for Depositing Structures
  PDB Focus: What is the Smallest Polymer Structure That Can Be Deposited to the PDB?
  ADIT Focus: Session Restart IDs
  Deposition Statistics

Data Query, Reporting, and Access
  Exploring Structures Through PubMed Abstracts
  Website Statistics
  Enhanced RSS Feed from the RCSB PDB

Outreach and Education
  Meetings and Presentations
  RCSB PDB Poster Prizes
  DOIs for PDB Structures and the Molecule of the Month
  wwPDB Paper: Representation of Viruses in the Remediated PDB

Education Corner: Molecular Visualization in Your Pocket by Brad Larson, Ph.D., Sunset Lake Software

PDB Community Focus: Paul D. Adams, Ph.D., Lawrence Berkeley Laboratory

Statement of Support, Partners, Leadership Team

Snapshot

--------------------------------------------

MESSAGE FROM THE RCSB PDB

An increasing number of novel structures in the PDB archive are being deposited by structural genomics centers located worldwide. The RCSB PDB now links to a resource that makes all the structural genomics products generated by the Protein Structure Initiative (PSI) available to the greater scientific community. First established in the spring of 2008, the PSI Structural Genomics Knowledgebase (PSI SGKB; kb.psi-structuralgenomics.org) is an entry point to all of the protein structure and production resources created by the PSI.
From the home page, researchers can enter the sequence or PDB ID of a protein to find the corresponding structure and others like it, structural and functional annotations from key external databases, comparative homology model structures available through the Protein Models Portal, experimental progress of similar protein targets through TargetDB, protocols for protein production through PepcDB, and availability of those DNA clone materials through the PSI Materials Repository. Keyword searches return other useful information including descriptions of new technologies and methods, a list of publications detailing key findings, and links to related resources provided by the PSI centers. The PSI SGKB makes it possible for researchers to access a wealth of information from one site.

In September, the PSI SGKB was re-launched as a Gateway in collaboration with the Nature Publishing Group (NPG). The expansion of the PSI SGKB has added a research library, RSS feeds, editorials about new research advances, news, and an events calendar to present a broader view of research activities in structural biology and structural genomics.

The PSI SGKB is funded by the NIGMS.

--------------------------------------------

DATA DEPOSITION AND PROCESSING

Announcement: Comprehensive Format Guide Version 3.2

During the past year, wwPDB annotators have collaborated on a project to clarify the details and procedures related to data processing and annotation. The result is a PDB Contents Guide Version 3.2 that more fully describes the PDB file format. This document is available either as a PDF or via HTML from the wwPDB website, and is accompanied by a document highlighting these clarifications. In the coming months, all files released by the wwPDB will follow the format as described in this document. Details will be made available at www.pdb.org and at www.wwpdb.org.

Ligand Expo: A Resource for Depositing Structures

The Chemical Component Dictionary archives chemical and structural information about all residue and small molecule components found in PDB entries. Ligand Expo is a tool that can access, visualize, and build reports about these data. It can also be used to prepare a file for deposition through the following process:

* Search Ligand Expo for a chemical component that matches your ligand
* If a match is found, use that corresponding 3-character code for the ligand in your coordinates
* If the ligand is not found, choose a new 3-character code for the ligand
* When depositing your structure with ADIT, upload the chemical name and a file showing the chemical image for the new ligand into the Ligand Information section

Ligand Expo (ligand-expo.rcsb.org) is an update of the Ligand Depot resource.

PDB Focus: What is the Smallest Polymer Structure That Can Be Deposited to the PDB?

The PDB contains biomolecular polymers including polypeptides, polynucleotides, polysaccharides, and their complexes.

Polypeptide structures containing 24 or more residues can be deposited to the PDB. Smaller peptides that are complexed with a larger polymer (greater than the minimum length defined above) may be deposited to the PDB. Crystal structures of peptides with fewer than 24 residues, such as antibiotics, should be sent to the Cambridge Crystallographic Data Centre (CCDC; www.ccdc.cam.ac.uk).

Polynucleotide structures with 4 or more residues are accepted at the PDB. Smaller oligonucleotides (dinucleotides and trinucleotides) can be deposited at the Nucleic Acid Database (NDB; ndbserver.rutgers.edu).

Coordinates for the repeating unit of fibrous polymers and polysaccharide structures with 4 or more sugar residues may be deposited at the PDB archive.

Molecules that do not conform to these guidelines but have been previously deposited in the archive will not be removed. Structures may be deposited to the archive via the wwPDB.
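
As a rough illustration, the size guidelines above can be written as a small lookup. This is only a sketch and not wwPDB software: the function name suggest_archive, its parameters, and the returned labels are hypothetical. It does not check the experimental method, and it does not model the repeating-unit case for fibrous polymers.

```python
# Hypothetical sketch of the size guidelines described above; the names
# and return labels are illustrative and not part of any wwPDB tool.
MIN_POLYPEPTIDE_RESIDUES = 24     # polypeptides: 24 or more residues
MIN_POLYNUCLEOTIDE_RESIDUES = 4   # polynucleotides: 4 or more residues
MIN_POLYSACCHARIDE_RESIDUES = 4   # polysaccharides: 4 or more sugar residues


def suggest_archive(polymer_type, residue_count, complexed_with_larger_polymer=False):
    """Return "PDB", "CCDC", "NDB", or None when the guidelines name no archive."""
    if residue_count < 1:
        raise ValueError("residue_count must be at least 1")

    if polymer_type == "polypeptide":
        if residue_count >= MIN_POLYPEPTIDE_RESIDUES or complexed_with_larger_polymer:
            return "PDB"
        # The text directs crystal structures of short peptides to the CCDC;
        # this sketch does not check the experimental method.
        return "CCDC"

    if polymer_type == "polynucleotide":
        if residue_count >= MIN_POLYNUCLEOTIDE_RESIDUES:
            return "PDB"
        # Only dinucleotides and trinucleotides are mentioned for the NDB.
        return "NDB" if residue_count in (2, 3) else None

    if polymer_type == "polysaccharide":
        return "PDB" if residue_count >= MIN_POLYSACCHARIDE_RESIDUES else None

    raise ValueError(f"unknown polymer type: {polymer_type!r}")


if __name__ == "__main__":
    for case in [("polypeptide", 30), ("polypeptide", 12), ("polynucleotide", 3)]:
        print(case, "->", suggest_archive(*case))
```

A return value of None means only that the guidelines quoted above do not name an archive for that case.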

ADIT Focus: Restarting Deposit Sessions

A structure can be deposited over a period of time by using ADIT's "Session Restart ID" feature. This identifier appears in red in the center of the browser window when ADIT's "deposit" step is first started. It is also seen in the title of the browser throughout the deposition session.

The case-sensitive restart ID should be entered in the space provided on the ADIT home page to return to the deposition session. Any data entered in a category are stored every time the user selects the SAVE button. All entered data associated with a particular entry can be accessed using the restart ID until the "DEPOSIT NOW" button is selected, for up to six months after the session was last updated.

ADIT is available at the RCSB PDB and PDBj. ADIT-NMR can be used to deposit data to both the PDB and BMRB. A tutorial guide to using ADIT is available in English and Japanese. Simulated "in progress" deposition sessions for practicing with ADIT are available at rcsb-deposit-demo-1.rutgers.edu.

Deposition Statistics

In the third quarter of 2008, 1924 experimentally-determined structures were deposited to the PDB archive. The entries were processed by wwPDB teams at the RCSB PDB, PDBe, and PDBj.

Of the structures deposited, 76.9% were deposited with a release status of "hold until publication"; 18.1% were released as soon as annotation of the entry was complete; and 5.0% were held until a particular date. 92.2% of these entries were determined by X-ray crystallographic methods; 6.8% were determined by NMR methods.

During the same time period, 1928 structures were released in the PDB.

--------------------------------------------

DATA QUERY, REPORTING, AND ACCESS

Website Statistics

Access statistics for www.pdb.org and ftp://ftp.wwpdb.org for the third quarter of 2008 are given below.

Month      Unique     Number of   Bandwidth   HTTP        FTP
           Visitors   Visits                  Downloads   Downloads
JUL 2008   161567     355065      636.25 GB   3735223     12876019
AUG 2008   133412     296024      514.27 GB   3157620     11587553
SEP 2008   168114     366983      631.13 GB   4614066     10354741

Exploring Structures Through PubMed Abstracts

PubMed (www.pubmed.gov) abstracts for the primary citations of PDB entries are integrated into the RCSB PDB website.

When reviewing query results of multiple structures at the RCSB PDB site, the Citations Tab provides a PubMed-like list of the corresponding primary citations. This list can be downloaded in Medline format for use with bibliographic programs such as Endnote and RefWorks by selecting the "Medline Format" option from the lefthand menu. The Citations Tab also links to all structures that share a primary citation, and to related articles in PubMed.

For each PDB structure, the "Abstract" link on a Structure Summary page provides information downloaded from PubMed about the citation. The keywords, title, and abstract on this page can be used to query the PDB for other entries that have the same words in their abstracts. The citation information can be downloaded in Medline format by selecting the "Medline Format" option from the lefthand menu. Selecting the PubMed icon takes users to the abstract in PubMed.

Using the Advanced Search, PubMed keyword searches can be combined with any other query option.

--------------------------------------------

OUTREACH AND EDUCATION

Meetings and Presentations

The RCSB PDB and the wwPDB have been participating in several meetings:

* At the 22nd Annual Symposium of The Protein Society (July 19-23; San Diego, CA), Peter Rose presented the poster "Effective Mining of the Protein Data Bank" which explored the many different search functions available from www.pdb.org.
* Many PDB users stopped by the RCSB PDB's exhibit booth at the 16th Annual International Conference for Intelligent Systems for Molecular Biology (ISMB; July 19-23; Toronto, Canada), the official conference of the International Society for Computational Biology (ISCB). Associate Director Phil Bourne was involved in many presentations and discussions, and delivered a 3DSig Keynote Lecture at the Structural Bioinformatics and Computational Biophysics satellite meeting.
* At the XXI Congress & General Assembly of the International Union of Crystallography (IUCr; August 23-31; Osaka, Japan) wwPDB members from around the globe hosted a joint exhibition stand for demonstrations, met with users, and participated in a variety of sessions and commission meetings. Also at the IUCr meeting, RCSB PDB Director Helen M. Berman presented a keynote lecture entitled "What the Protein Data Bank tells us about the past, present, and future of structural biology," and John Westbrook presented "Data Quality in the PDB Archive." Professor Berman also participated in a Question and Answer session at the Commission on Biological Macromolecules.
* As part of the International Structural Genomics Organization's Conference on Structural Genomics (Sept. 20-24; Oxford, UK), RCSB PDB Director Helen Berman described the PSI Structural Genomics Knowledgebase.
* At the EMBO 08 Practical Course on Computational Aspects of the Protein Target Selection, Protein Production Management and Structure Analysis Pipeline (Sept. 22-26; Hinxton, UK), Helen Berman and John Westbrook (RCSB PDB), Haruki Nakamura (PDBj), and Kim Henrick, Dimitris Dimitropoulos, Eugene Krissinel, and Tom Oldfield (PDBe) led tutorial sessions about various resources and tools from wwPDB sites.
* At the European Conference in Computational Biology (ECCB 08; Sept.
22-26; Sardinia, Italy), Martha Quesada and Andreas Prlic (RCSB PDB) gave an overview of the multifaceted wwPDB collaboration and demonstrated features found at the RCSB PDB, PDBe, PDBj, and BMRB websites.
* Upcoming RCSB PDB events include the New Jersey Science Convention for science teachers (October 14-15; Somerset, NJ), eCheminfo Community of Practice Meeting on Advances in Drug Discovery and Development (October 13-17; Philadelphia, PA), the meeting of the Association of Science and Technology Centers (October 18-21; Philadelphia, PA) and the Pittsburgh Diffraction Conference (October 30-November 1; Pittsburgh, PA).

DOIs for PDB Structures and the Molecule of the Month

PDB structures can be cited using their PDB ID and related published citation. Structures may also be referenced using their Digital Object Identifier (DOI). PDB DOIs are formatted as 10.2210/pdbXXXX/pdb, where XXXX is the PDB ID. For example, the DOI for entry 4hhb is "10.2210/pdb4hhb/pdb". DOIs can be used in a URL (dx.doi.org/10.2210/pdb4hhb/pdb) or entered in a DOI resolver (such as www.crossref.org) to automatically link to file pdb4hhb.ent.gz on the main PDB FTP archive.

DOIs are also available for RCSB PDB Molecule of the Month features in the format: 10.2210/rcsb_pdb/mom_YYYY_MM (where YYYY is the year and MM the number of the month, using one or two digits). For example, the DOI for the May 2003 feature on hemoglobin by Shuchismita Dutta and David S. Goodsell is "10.2210/rcsb_pdb/mom_2003_5". These features are referenced with the DOI and the author/s of the article.

A page describing policies & references for using and citing PDB data and RCSB PDB resources is available at www.pdb.org.

RCSB PDB Poster Prizes

The RCSB PDB Poster Prize for best student poster related to structure and function prediction at the ISCB's ISMB meeting went to Dariya Glazer for "Clustering Across Space and Time" (Dariya Glazer, Randy Radmer, Russ Altman; Stanford University).
At the IUCr meeting, the judges found that the best student poster related to macromolecular crystallography was "Structural insights into the mitochondrial import complex, TIM9.10" by Chaille T. Webb(1,2), Michael Baker(1), Michael T. Ryan(1), Peter M. Colman(1), and Jacqueline M. Gulbis(1); (1)The Walter and Eliza Hall Institute of Medical Research and (2)The University of Melbourne, Australia.

The winners will receive a subscription to Science and a related reference book. Special thanks to our judges and the conference organizers.

wwPDB Paper: Representation of Viruses in the Remediated PDB

The 2007 release of remediated data improved the representation of deposited and experimental coordinate frames, symmetry, and frame transformations in the archive. A paper describing the scheme used by the wwPDB to represent viruses and other biological assemblies with regular noncrystallographic symmetry has been published:

C.L. Lawson, S. Dutta, J.D. Westbrook, K. Henrick and H.M. Berman (2008) Representation of viruses in the remediated PDB archive. Acta Cryst. D64: 874-882

--------------------------------------------

EDUCATION CORNER: Molecular Visualization in Your Pocket
by Brad Larson, Ph.D., Sunset Lake Software

Brad Larson is the sole proprietor of Sunset Lake Software, a software company devoted to the development of applications aimed at engineering and the sciences. Molecules for the iPhone and iPod Touch is the first public product of this company, with others planned. His day job is Chief Technology Officer of SonoPlot, Inc., a company he co-founded as a spinoff of research at the University of Wisconsin-Madison. SonoPlot (www.sonoplot.com) manufactures high-precision fluid dispensers, called Microplotters, that can print microelectronic circuits and high-density protein microarrays. He holds a B.S. in Chemical Engineering, with a minor in Computer Science, from the Rose-Hulman Institute of Technology in Terre Haute, IN, and an M.S. and Ph.D.
in Materials Science from the University of Wisconsin-Madison. Additional background information and publication PDFs are at www.sunsetlakesoftware.com/about.

In recent years, we've seen an explosion in processing power within portable devices such as cell phones and media players. Current devices more closely resemble general-purpose computers, especially Apple Inc.'s new iPhone and iPod Touch. Both use a form of the OS X operating system found on Macintosh computers, and can run custom applications designed using a desktop-caliber software development kit (SDK). Additionally, these applications are available worldwide through Apple's iTunes Store and can be downloaded with a single click.

Many high school and college students will carry these devices into classes this fall, providing an opportunity for educational software to be used in unique ways. Molecules is one such application that will hopefully introduce students to the fascinating world of biomolecules and their 3D structures.

One of the inspirations for Molecules was a conversation I had with my brother, Matt Larson, who is using X-ray crystallography to determine the structure of a protein as part of his Ph.D. work at the University of Alabama at Birmingham. He was preparing to present a poster with an early structure for this protein, and I couldn't help but think that a static image wasn't the best means of displaying that structure. After witnessing the capabilities of the iPhone and iPod Touch, I imagined what it would be like if he could take an iPod to a conference and pull it out of his pocket every time he wanted to show off the full 3D structure of his protein, or even download and research on the spot any PDB molecule mentioned in a presentation.

For the full article, please go to www.pdb.org.

--------------------------------------------

PDB Community Focus: Paul D. Adams, Ph.D., Lawrence Berkeley Laboratory

Paul D.
Adams is a Senior Scientist and Deputy Director of the Physical Biosciences Division at Lawrence Berkeley Laboratory, Head of the Berkeley Center for Structural Biology, Vice President for Technology at the Joint BioEnergy Institute, and an Adjunct Professor in the Department of Bioengineering at the University of California Berkeley. He studied biochemistry at Edinburgh University where, in 1992, he also received his Ph.D. in structural biology for work on rodent pheromone binding proteins using crystallographic and molecular modeling methods. During this time he became involved in parallel computing and spent a year working at the Edinburgh Parallel Computing Center. In 1992 he moved to Yale University for a postdoctoral position developing crystallographic and computational modeling methods. Together with Axel T. Brunger and others, he developed the Crystallography and NMR System package that has been a mainstay of structural biology for the last decade. In 1999 he moved to the Lawrence Berkeley Laboratory to start a new group developing tools for structural biology. His current research interests span computation, structural biology and biofuels. Much of his research is focused on the development of new algorithms and computational methods for addressing problems in structural biology. He leads the NIH-funded PHENIX collaboration developing new software for the automated solution of macromolecular structures using crystallographic methods. He also collaborates with researchers at Los Alamos National Laboratory and Baylor College of Medicine to incorporate methods for neutron diffraction and analysis of high-resolution structures from single particle cryo-electron microscopy. He has collaborated with experimentalists, most recently Judith Frydman at Stanford, to understand the structure and function of chaperonins. He is the author of over 80 papers, book chapters and review articles. 
As Head of the Berkeley Center for Structural Biology at the Advanced Light Source, he oversees the development, maintenance and operation of five synchrotron beamlines for macromolecular crystallographic data collection. Recent upgrades to the beamlines have improved automation and X-ray flux. He is also involved in the growing field of biofuels research, leading programs at both the Joint BioEnergy Institute in Emeryville and the Energy Biosciences Institute in Berkeley.

He has had a long and fruitful association with the Protein Data Bank and is a member of the wwPDB X-ray Validation Task Force. He also leads the development of the Technology Portal for the PSI Structural Genomics Knowledgebase.

Q: You have had a long history in the development of crystallographic software. Your latest project is PHENIX. What can you tell us about it?
A: At the beginning of this century, the field of structural genomics really took off and it was clear that there would be an increased emphasis on automated crystallographic structure solution. The development of PHENIX was a response to this. A number of us got together, including Randy Read, Tom Terwilliger, Tom Ioerger, and more recently Jane and David Richardson, and decided that it was the right time to do something new. PHENIX has been designed with automation in mind and is based on the Python scripting language. There are tools for automated structure solution using experimental phasing and molecular replacement, structure refinement, ligand coordinate and restraint generation, and structure validation. Although PHENIX has been created with automated structure solution in mind, it is pleasing to see that it is also being used for a number of very challenging structures. PHENIX can be downloaded on the web by academic researchers at www.phenix-online.org.

Q: A recent article in Science highlighted the Technology Portal of the PSI SGKB site. What is your vision for this resource?
A: Although there is currently much debate about the best future direction for structural genomics research in the US, I think most would agree that this branch of research has generated a large number of tools that are of benefit to all structural biologists. However, one of the problems, not unusual in science, is the communication of these innovations to the scientific community. NIH very wisely decided to put some funding into developing a web-based resource for dissemination of information generated in the US by PSI-funded structural genomics projects. I see this PSI Knowledgebase, led by Helen Berman, as an essential vehicle for making the broader scientific community aware of these advances. The Technology Portal of the Knowledgebase lets anyone read about new technologies that might be helpful to their own research. Our goal is to provide information so that these researchers can easily get in touch with the technology developers and thus expand the adoption of new technologies. We are also going to set up forums to promote more discussions about technology development in the general scientific community.

Q: Most recently, you have taken on a leadership role in the Joint BioEnergy Institute. What are your goals for this organization?
A: The long-term availability of fossil fuels and concerns about global warming have made the development of carbon-neutral and renewable sources of energy a priority. The conversion of cellulosic (plant) material to transport fuels has the potential to provide a significant fraction of available fuel in the future. The Joint BioEnergy Institute (JBEI) is one of three Department of Energy (DOE) funded research centers developing the basic science and technology for biofuels production. JBEI is a collaborative effort between Lawrence Berkeley Laboratory, Sandia National Laboratory, Lawrence Livermore Laboratory, UC Berkeley and UC Davis, and the Carnegie Institution at Stanford.
We're located in Emeryville, California, about three miles from UC Berkeley and Lawrence Berkeley Laboratory. There are four Divisions that are addressing different parts of the biofuels problem: Feedstocks, Deconstruction, Fuels Synthesis and Technology. As the Vice President leading the Technology Division, my goal is to create new technologies to support the development of biofuels. In the next five to ten years, JBEI researchers will have developed new feedstocks optimized for biofuels production, new methods to break down lignocellulose, and microbes that are optimized for the conversion of sugars to fuels.

Q: You have been involved in many aspects of structural biology. Where do you see the future?
A: I feel fortunate to have experience in both experimental and computational structural biology: my B.Sc. and Ph.D. are in biochemistry, but I got involved in crystallography, molecular modeling, molecular dynamics, and parallel computing as a graduate student and postdoc. The field has changed greatly over the last twenty years and I think the future of structural biology is already happening. In the past it was only practical for researchers to specialize in one main experimental technique, such as X-ray crystallography. Now more researchers are embracing the idea that answering a biological problem often requires multiple experimental methods, and that it is feasible to bring these methods in house. I anticipate that many of the groundbreaking structural biology research projects in the future will combine multiple biophysical techniques, such as crystallography (X-ray and neutron), electron microscopy, and small angle scattering. Hopefully, the use of computational techniques, such as molecular simulation, will increase as those methods improve. Ultimately new experimental techniques will be developed, such as single particle X-ray diffractive imaging, which will undoubtedly open up new research possibilities.

----------------------------------------

STATEMENT OF SUPPORT

The RCSB PDB is supported by funds from the National Science Foundation, the National Institute of General Medical Sciences, the Office of Science, Department of Energy, the National Library of Medicine, the National Cancer Institute, the National Center for Research Resources, the National Institute of Biomedical Imaging and Bioengineering, the National Institute of Neurological Disorders and Stroke, and the National Institute of Diabetes & Digestive & Kidney Diseases.

The RCSB PDB is managed by two partner sites of the Research Collaboratory for Structural Bioinformatics:

RUTGERS
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
610 Taylor Road
Piscataway, NJ 08854-8087

SDSC/Skaggs/UCSD
San Diego Supercomputer Center and the Skaggs School of Pharmacy and Pharmaceutical Sciences
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093-0537

RCSB PDB LEADERSHIP TEAM

Dr. Helen M. Berman - Director
Rutgers University
berman@rcsb.rutgers.edu

Dr. Martha Quesada - Deputy Director
Rutgers University
mquesada@rcsb.rutgers.edu

Dr. Philip E. Bourne - Associate Director
SDSC/Skaggs/UCSD
bourne@sdsc.edu

A list of current RCSB PDB Team Members is available from the website.

The RCSB PDB is a member of the Worldwide PDB (www.wwpdb.org)

--------------------------------------------

SNAPSHOT
October 1, 2008

53384 released atomic coordinate entries

* Molecule Type
49279 proteins, peptides, and viruses
1918 nucleic acids
2154 protein/nucleic acid complexes
33 other

* Experimental Technique
45587 X-ray
7502 NMR
195 electron microscopy
100 other

34726 structure factor files
4196 NMR restraint files