RCSB PDB Newsletter
Number 32 -- Winter 2007

Published quarterly by the Research Collaboratory for Structural Bioinformatics Protein Data Bank

Weekly RCSB PDB news is published at www.pdb.org

To change your subscription options, please visit lists.sdsc.edu/mailman/listinfo.cgi/rcsb-news

-----------------------------------------
TABLE OF CONTENTS

Message from the RCSB PDB

Data Deposition and Processing
  2006 Statistics
  DOIs Available for Released Entries in the PDB Archive
  PDB Focus: The ADIT Help System
  Annotation at the RCSB PDB
  RCSB PDB Focus: Depositing New Chemical Components (Ligands)

Data Query, Reporting, and Access
  PDB Focus: Sorting Search Results and Tabular Reports
  RCSB PDB Focus: Exploring Domains in Protein Structure
  Searching for Sequence Variants
  Website Statistics
  RCSB PDB Focus: External Links

Outreach and Education
  PDB Structures on Exhibit at the Birch Aquarium
  wwPDB Paper Published
  RCSB PDB Poster Prize Awarded at AsCA
  Meeting News
  Molecules of the Quarter

PDB Education Corner: Methods in Structural Biology Course at CSHL

PDB Community Focus: Julian Voss-Andreae, Protein Sculptor

PDB Community Focus: Statement of Support, Partners, Leadership Team

Snapshot

--------------------------------------------
MESSAGE FROM THE RCSB PDB

Covering the period of July 1, 2005 – June 30, 2006, the RCSB Protein Data Bank's Annual Report documents the database and website released during this period. The Annual Report provides background information about the RCSB PDB resource and describes current progress and accomplishments. Available online as a PDF, this snapshot also explores the RCSB PDB's different activities in data deposition, structural genomics, and education.

The cover highlights the structures featured in the Molecule of the Month during the report period. Each installment introduces readers to the structure and function of a particular molecule, and discusses the relevance of the molecule to human health and welfare.
Written and illustrated by David S. Goodsell (The Scripps Research Institute), the series is a great place for users of all levels to start exploring the RCSB PDB resource. This cover, along with information about the proteins, is also available as a downloadable PDF flyer.

The report is distributed to the diverse community of PDB users in academia, industry, and education. If you would like a printed copy of this report, please send your postal address to info@rcsb.org.

--------------------------------------------
DATA DEPOSITION AND PROCESSING

2006 STATISTICS

In 2006, 6911 experimentally determined structures were deposited to the PDB archive. The entries were processed by wwPDB teams at the RCSB PDB, MSD-EBI, and PDBj.

Of the structures deposited in 2006, 71.8% had a release status of "hold until publication"; 16.1% were released as soon as annotation of the entry was complete; and 12.1% were held until a particular date. 86.7% of these entries were determined by X-ray crystallographic methods; 12.8% were determined by NMR methods. 87.5% of these depositions included experimental data.

During the same period of time, 6910 structures were released into the archive.

DOIS AVAILABLE FOR RELEASED ENTRIES IN THE PDB ARCHIVE

Structures released by the wwPDB into the PDB Archive are now being assigned a Digital Object Identifier (DOI). The DOI System is used to identify content objects (such as journal articles, books, and figures) in the digital environment.

The DOIs for PDB structures all have the same format – 10.2210/pdbXXXX/pdb – where XXXX should be replaced with the desired PDB ID. For example, the DOI for PDB entry 4HHB is "10.2210/pdb4hhb/pdb". This links directly to the entry in the PDB file format on the FTP server.
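The DOI pattern described here can be sketched in a few lines of Python. This is only an illustration, not an RCSB PDB tool: the function names are hypothetical, the check that a PDB ID is a digit followed by three alphanumeric characters is an assumption, the lowercasing is inferred from the 4HHB example, and the "http://" scheme in front of the dx.doi.org resolver named in this section is also an assumption.

```python
import re

DOI_PREFIX = "10.2210"  # prefix taken from the 10.2210/pdbXXXX/pdb pattern


def pdb_doi(pdb_id: str) -> str:
    """Build the DOI for a released PDB entry (hypothetical helper)."""
    pdb_id = pdb_id.strip().lower()  # the 4HHB example uses lowercase in the DOI
    # Assumption: a PDB ID is one digit followed by three alphanumeric characters.
    if not re.fullmatch(r"[0-9][a-z0-9]{3}", pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"{DOI_PREFIX}/pdb{pdb_id}/pdb"


def doi_url(pdb_id: str, resolver: str = "http://dx.doi.org") -> str:
    """Prefix the DOI with a resolver address (the scheme is an assumption)."""
    return f"{resolver}/{pdb_doi(pdb_id)}"


if __name__ == "__main__":
    print(pdb_doi("4HHB"))
    print(doi_url("4HHB"))
```

Because DOIs are registered only at release, a DOI built this way for an unreleased entry would not be expected to resolve.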
The DOI can be used as part of a URL to obtain this data file (dx.doi.org/10.2210/pdb4hhb/pdb), or can be entered in a DOI resolver (such as www.crossref.org) to automatically link to pdb4hhb.ent.Z on the main PDB FTP archive (ftp://ftp.rcsb.org).

DOIs are automatically registered by the wwPDB when entries are released after the weekly update. They will not be available before a structure's release. Along with the FTP location, the DOIs for PDB entries also include the entry title, the authors, and the deposition date.

PDB FOCUS: THE ADIT HELP SYSTEM

The deposition tool ADIT (at RCSB or PDBj) includes examples and definitions provided in the PDB Exchange Dictionary as guides for users depositing their structures.

An explanation for each piece of information requested by ADIT can be obtained by selecting the Help button located next to the named data item. This information will appear in the bottom frame. Pressing the Help button in the top frame will display these instructions.

Examples for these items can be obtained by selecting the Example button within the table. This information will appear in the bottom frame.

At any time during deposition, you may view the current state of the entire entry by pressing the PREVIEW ENTRY button.

ANNOTATION AT THE RCSB PDB

The question "So, what does an 'annotator' do?" has been answered with the article:

A Biocurator Perspective: Annotation at the Research Collaboratory for Structural Bioinformatics Protein Data Bank
Kyle Burkhardt, Bohdan Schneider, Jeramia Ory (2006)
PLoS Comput Biol 2(10): e99

The typical day of an annotator and the challenges facing PDB curators are described.

The RCSB PDB is looking for Biochemical Information & Annotation Specialists to curate and standardize macromolecular structures for distribution in the PDB archive. A background in biological chemistry (PhD, MS, BS or BA) is required.
Experience with Linux computer systems, biological databases, crystallography, and/or NMR spectroscopy is a strong advantage. The successful candidate should be self-motivated, pay close attention to detail, possess strong written and oral communication skills, and meet deadlines. This position offers the opportunity to participate in an exciting project with significant impact on the scientific community.

Please send resume to Dr. Helen M. Berman at pdbjobs@rcsb.rutgers.edu.

RCSB PDB FOCUS: DEPOSITING NEW CHEMICAL COMPONENTS (LIGANDS)

To deposit new ligands, please first check Ligand Depot to see whether the ligand, drug, ion, non-standard residue, modified residue, group, etc. is present in our chemical component dictionary. If the ligand is present, please make sure that the 3-letter code for the ligand in your file matches the one used in the chemical component dictionary.

If the ligand is not present in the dictionary, users can now upload a 2-D figure of the structure as part of the ADIT deposition process in PostScript, TIFF, or GIF formats. From the left-hand categories menu, scroll down and select "Upload Supplemental Information: Ligand Information" to provide this information.

Although the 3-letter code used for the ligand has no specific significance, you may check Ligand Depot and select a code for your ligand that has not been taken; otherwise, one will be selected for you.

--------------------------------------------
DATA QUERY, REPORTING, AND ACCESS

RCSB PDB FOCUS: SORTING SEARCH RESULTS AND TABULAR REPORTS

The RCSB PDB offers many ways of looking at the information contained in the database. After searching for a set of structures, users can explore individual structures or examine the whole set by creating reports.
For examples, please see www.rcsb.org/pdb/general_information/news_publications/newsletters/2006q4/query.html

RCSB PDB FOCUS: EXPLORING DOMAINS IN PROTEIN STRUCTURE

Domains can be thought of as the smallest structural units from which proteins are assembled that retain properties of the whole protein, such as a hydrophobic core. In certain cases, domains can also function independently from the rest of the structure. Any given protein structure is composed of one or more domains from which the overall properties of the protein are derived. Analyzing a protein structure from the point of view of its composite domains is an important, yet not fully solved, problem.

The RCSB PDB offers various ways of exploring domains in protein structures:

* PDOMAINS (pdomains.rcsb.org/pdomains) is a resource centered around the definition and assignment of structural domains in proteins. It offers analysis of existing approaches to domain definition and provides a benchmark dataset to evaluate and cross-compare automatic domain assignment methods.

* The Browse Database option of the ‘Search’ tab on the left-hand menu offers a tool to explore the hierarchically organized and curated domain definitions produced by SCOP (scop.mrc-lmb.cam.ac.uk/scop) and CATH (www.cathdb.info).

* Structure Summary pages for individual structures provide links to sets of all structures containing domains similarly categorized by SCOP and CATH. Each category is a link to a result.

* Sequence Details pages for each structure illustrate domains aligned with sequence and secondary structure. The colored domain definition links in the SCOP Domains section coincide with the colored bars underneath the sequence to indicate the location of the domains. Boundaries between domains are highlighted further with vertical dashed lines. Users can choose domain definitions according to either SCOP or CATH and the programs DomainParser (compbio.ornl.gov/structure/domainparser) and PDP (123d.ncifcrf.gov/pdp.html).

SEARCHING FOR SEQUENCE VARIANTS

Protein structure sequences are assigned UniProt/SwissProt IDs (UNP/SWS). The new 'Sequence Variants/Non-variants' RCSB PDB feature lets users retrieve all structures with a particular UNP/SWS ID, grouped by the presence or absence of sequence variations (variants or non-variants). Searches for variants will provide structures with post-translational modifications, whereas searching for non-variants will provide occurrences of structures that have at least one identical polypeptide chain.

This query can be run using the Advanced Search option.

* Click on the 'Search' tab on the left-hand menu, and expand the 'Search Database' menu option.
* Click on 'Search Database'>>'Advanced Search'
* Select 'Choose a Query Type'>>'Sequence Features'>>'Sequence (Non/)Variants'
* Enter a structure ID
* Select the desired chain from the pull-down menu
* Select 'No' to retrieve all structures whose sequences do not vary from the reference sequence due to point mutations/insertions/deletions.

For example, using 1LJ3 chain A, variant = 'No' will pull up structures of lysozyme with no sequence variations. Variant = 'Yes' retrieves all structures whose sequences vary from the reference UNP/SWS sequence due to point mutations/insertions/deletions.

Variants and non-variants can also be retrieved from the Structure Summary page for any structure having such variants with the same UNP/SWS ID. From the left-hand menu, select Structure Analysis-> Sequence Variants. The resulting page displays the pairwise alignment of the structure sequence and the UNP/SWS sequence for each structure.

2006 WEBSITE STATISTICS

Access statistics for the year 2006 are given below for the RCSB PDB website at www.pdb.org.

Month     Unique      Number of   Bandwidth
          Visitors    Visits
Jan-06     82372      183705      488.54 GB
Feb-06     91159      195783      625.25 GB
Mar-06    104495      230746      527.27 GB
Apr-06    145465      303735      614.87 GB
May-06    184052      416301      654.34 GB
Jun-06    218440      497835      628.79 GB
Jul-06     95899      237200      549.54 GB
Aug-06     84884      216782      542.96 GB
Sep-06    109812      267619      570.53 GB
Oct-06    121301      278533      542.23 GB
Nov-06    146272      348661      727.24 GB
Dec-06    112310      259043      637.95 GB

RCSB PDB FOCUS: EXTERNAL LINKS

Structure Summary pages for all structures in the PDB now offer a set of external links. Accessible from the left menu, these links provide further information about the structure under study, such as biochemical pathway information, stereochemistry and ligand binding data.

--------------------------------------------
OUTREACH AND EDUCATION

PDB STRUCTURES ON EXHIBIT AT THE BIRCH AQUARIUM

The Sea of Genes exhibit at the Birch Aquarium (La Jolla, CA) helps to unravel the genetic secrets of life in the ocean through interactive displays that highlight the exciting discoveries of Scripps researchers.

One feature of this exhibition is an interactive kiosk that invites the user to display information about specific proteins found in marine organisms – and in the PDB. This animation is also available (in Flash format) from the RCSB PDB’s online Educational Resources page.

This kiosk is the result of a collaboration between the RCSB PDB and the Birch Aquarium (Scripps Institution of Oceanography at University of California, San Diego). The Sea of Genes exhibit will be on display until Spring 2007.

wwPDB PAPER PUBLISHED

A paper describing the wwPDB – background, data deposition and access information, data uniformity efforts, and more – has been published:

The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data.
Helen Berman, Kim Henrick, Haruki Nakamura, and John L. Markley (2007).
Nucleic Acids Research 35(Database issue):D301-3; doi: 10.1093/nar/gkl971

RCSB PDB POSTER PRIZE AWARDED AT ASCA

Thanks to everyone who participated in the recent RCSB PDB Poster Prize competition for best student poster related to macromolecular crystallography at the Joint Conference of the Asian Crystallographic Association and the Crystallographic Society of Japan (AsCA; November 20-23 in Tsukuba, Japan).

The award went to:

Structural studies on the SUF proteins involved in the biogenesis of iron-sulfur clusters.
Norika Sumi (1), Kei Wada (1), Shintaro Kitaoka (1), Kei Suzuki (1), Yuko Hasegawa (1), Yoshiko Minami (2), Yasuhiro Takahashi (1), and Keiichi Fukuyama (1)
(1) Department of Biology, Graduate School of Science, Osaka University, Toyonaka, Osaka 560-0043 Japan
(2) Department of Biochemistry, Faculty of Science, Okayama University of Science, Okayama 700-0005 Japan

Many thanks to our judges and organizers.

Judges: Anders Liljas (Chair; Lund University), K. Byrappa (University of Mysore), Mitchell Guss (University of Sydney), Chwan-Deng Hsiao (Academia Sinica), and Genji Kurisu (University of Tokyo)

Organizer: Soichi Wakatsuki (Institute of Materials Structure Science, KEK, Japan)

MEETING NEWS

* Director Helen Berman described "wwPDB: An International Collaboratory for Structural Bioinformatics" at the 20th CODATA International Conference in Beijing, China (October 23-25).
* Berman also gave the talk "Resources for Structural Genomics from the RCSB PDB" at the International Structural Genomics Organization Conference in Beijing (October 22-26). Zukang Feng presented the poster "The RCSB PDB structural genomics portal" at this meeting.
* Co-director Phil Bourne traveled to Cuba to discuss "Protein Structural Data Reveals How Environmental Pressures Shape Evolution" at the 27th Latin-American Conference on Chemistry (October 16-20).
* Bourne then presented "Assigning DOIs to data objects" at the Crossref Annual Member Meeting in Boston, on November 1, 2006.
* The RCSB PDB recently exhibited at the annual conference of the Association of Science-Technology Centers (October 28-31 in Louisville, KY).

MOLECULES OF THE QUARTER

The Molecule of the Month series explores the function and significance of selected biological macromolecules for a general audience. The molecules featured this quarter were cytochrome P450, fibrin, and transposase.

The complete Molecule of the Month features are accessible from the RCSB PDB home page.

--------------------------------------------
PDB EDUCATION CORNER: Gary L. Gilliland, X-Ray Methods in Structural Biology Course at CSHL

The 2006 class of sixteen students has successfully completed the sixteen-day course X-Ray Methods in Structural Biology. Since 1988, students and instructors have descended upon Cold Spring Harbor Laboratory (CSHL) in New York to study the principles of X-ray crystallography. They then apply these concepts to actual structure determinations through extensive laboratory practicals. Each year, students are selected from a large international pool of applicants from academic, private, and commercial research laboratories. Their previous research experience ranges from that of graduate student to laboratory director. During the course, class begins at 9:00 a.m., with the instructors retiring late in the evening while many of the students work on into the night.

As the PDB user community knows, macromolecular crystallography yields a wealth of unique structural information that is used to further our understanding of biological systems. But learning the skills needed for structure determinations using diffraction methods is not easy. Crystallographic training is most often done in a research laboratory through what could be considered an apprenticeship approach, often with little formal training.
Designed for scientists with a working knowledge of protein structure and function, the X-Ray Methods in Structural Biology course provides those new to the field an opportunity to learn from practicing instructors who are contributing significantly to the current methods of structure determination. The course curriculum was developed to emphasize a hands-on approach that includes crystallizing several proteins and determining one or more structures using state-of-the-art software.

This laboratory and computational course focuses on the major techniques used to determine the three-dimensional structures of macromolecules. Topics covered include basic diffraction theory, crystallization (proteins, nucleic acids and complexes), crystal characterization, X-ray sources and optics, crystal freezing, data collection, synchrotron and home-source X-ray data reduction, multiple isomorphous replacement, multiwavelength anomalous diffraction, molecular replacement, solvent flattening, non-crystallographic symmetry averaging, electron density interpretation, molecular graphics, structure refinement, structure validation, coordinate deposition in the PDB and structure presentation.

To read the rest of this article, please see www.rcsb.org/pdb/general_information/news_publications/newsletters/2006q4/education_corner.html

Applications for the next course are due June 15, 2007; please see meetings.cshl.edu/courses/c-crys07.shtml for more information.

--------------------------------------------
PDB COMMUNITY FOCUS: Julian Voss-Andreae, Protein Sculptor

Julian Voss-Andreae is a German-born sculptor based in Portland, Oregon. In his youth he painted for a number of years, but then changed course and studied physics at the universities of Berlin and Edinburgh. After participating in a seminal quantum physics experiment in Vienna as part of his graduate research, Voss-Andreae moved to the US in 2000 with his passion for art rekindled.
He graduated from the Pacific Northwest College of Art (PNCA) in 2004 with a BFA in sculpture. While still at PNCA, Voss-Andreae developed a novel kind of sculpture based on the structure of proteins, the building blocks of life. Voss-Andreae’s work has been commissioned internationally and has been highlighted in journals such as Leonardo and Science. For more information, please see www.julianvossandreae.com.

Q: Your sculptures are amazing depictions of three-dimensional proteins. How do you decide upon the scale and the materials to use? Are you trying to tell a story about the molecule? Is there a connection between the proteins shown and the wood or metal?

A: I want to make sculptures that work as metaphors. When I use the structure of a protein as a starting point, it is not only about the beauty of that structure. That can be accomplished equally well with a protein model. I want to create something meaningful beyond that.

To give you an example, I created a sculpture based on the structure of collagen (using the coordinate file for 1BKV). I designed cutouts on each piece (a peptide unit) to reveal the dominant force lines, which is something we see in steel bridges and other steel constructions every day. That way the piece subtly alludes to its role in our body as a structural component. At some point I decided to depart from the molecular structure and opened up the intertwining helices toward the top. This expanded the meaning of the sculpture towards another important aspect most people think of in connection with collagen, aging. Collagen is responsible for our skin’s elasticity – its degradation famously leads to the wrinkles that accompany aging. Together with the title “Unraveling Collagen” the piece now has the potential to function as a metaphor for our growth physically as well as mentally, and for our intertwined paths through life.

In another piece, titled “Heart of Steel”, I was interested in using a literal connection between the chemistry in the protein and the chemistry on the sculpture’s surface. I made a complete human hemoglobin (1A3N) out of a certain kind of steel known as ‘weathering steel’. This special alloy initially rusts like ordinary steel but eventually stops because the special oxide layer it builds up is not water soluble and thus protects it from further corrosion. I finished the piece with a shiny surface and installed it. Upon its unveiling it was still gleaming, but after a few rain showers the color started to change and within half a year it was dark red. What had completely changed the look of the sculpture is of course the same chemical reaction that occurs when we breathe: Iron binds to oxygen.

Another sculpture is based on a light-harvesting complex (1NKZ, the protein scaffolding containing a subunit of the photosynthesis apparatus in plants). So I displayed the piece on the floor of a dark empty room with a candle in the center casting shadows of the structures on the wall. That way the sculpture in its environment conveys a feeling of sacredness; it somehow resembles an altar. The flickering shadows of the eighteen alpha helices arranged in two concentric circles are intriguing, because they look a bit like moving plants. It is as if the light-harvesting complex still originates flora, but now with exchanged roles: The macroscopic plants of our world become ephemeral shadows, whereas the microscopic, and ordinarily not perceivable, basis for their existence becomes a tangible object.

That is the kind of story I am interested in telling. It is about triggering associations and emotions. I want to make objects imbued with meaning of a poetic kind. My objects should have the potential to evoke metaphors in a scientific context which sometimes seems irreconcilable with the non-rational nature of poetry.

To read the rest of this article, please see: www.rcsb.org/pdb/general_information/news_publications/newsletters/2006q4/community.html

----------------------------------------
STATEMENT OF SUPPORT

The RCSB PDB is supported by funds from the National Science Foundation, the National Institute of General Medical Sciences, the Office of Science, Department of Energy, the National Library of Medicine, the National Cancer Institute, the National Center for Research Resources, the National Institute of Biomedical Imaging and Bioengineering, the National Institute of Neurological Disorders and Stroke, and the National Institute of Diabetes & Digestive & Kidney Diseases.

The RCSB PDB is managed by two partner sites of the Research Collaboratory for Structural Bioinformatics:

RUTGERS
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
610 Taylor Road
Piscataway, NJ 08854-8087

SDSC/Skaggs/UCSD
San Diego Supercomputer Center and the Skaggs School of Pharmacy and Pharmaceutical Sciences
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093-0537

RCSB PDB LEADERSHIP TEAM

Dr. Helen M. Berman - Director
Rutgers University
berman@rcsb.rutgers.edu

Dr. Philip E. Bourne - Co-Director
SDSC/Skaggs/UCSD
bourne@sdsc.edu

A list of current RCSB PDB Team Members is available from the website.

The RCSB PDB is a member of the Worldwide PDB (www.wwpdb.org)

--------------------------------------------
SNAPSHOT
January 1, 2007

40933 released atomic coordinate entries

* Molecule Type
  37537 proteins, peptides, and viruses
   1687 nucleic acids
   1674 protein/nucleic acid complexes
     35 other

* Experimental Technique
  34672 X-ray
   6035 NMR
    142 electron microscopy
     84 other

23871 structure factor files
 3282 NMR restraint files
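
As a quick arithmetic check on the snapshot, the two breakdowns can be summed and compared against the total number of released entries. The sketch below copies the counts from the snapshot; the variable and function names are illustrative only and do not correspond to any RCSB PDB software.

```python
# Counts copied from the January 1, 2007 snapshot.
TOTAL_ENTRIES = 40933

by_molecule_type = {
    "proteins, peptides, and viruses": 37537,
    "nucleic acids": 1687,
    "protein/nucleic acid complexes": 1674,
    "other": 35,
}

by_technique = {
    "X-ray": 34672,
    "NMR": 6035,
    "electron microscopy": 142,
    "other": 84,
}


def breakdown_matches(total: int, breakdown: dict) -> bool:
    """Return True when the categories add up to the stated total."""
    return sum(breakdown.values()) == total


def shares(breakdown: dict) -> dict:
    """Express each category as a percentage of its breakdown, to one decimal."""
    total = sum(breakdown.values())
    return {name: round(100 * count / total, 1) for name, count in breakdown.items()}


if __name__ == "__main__":
    print(breakdown_matches(TOTAL_ENTRIES, by_molecule_type))
    print(breakdown_matches(TOTAL_ENTRIES, by_technique))
    print(shares(by_technique))
```

Both breakdowns add up to the 40933 released entries, and X-ray structures account for roughly 85% of the archive at this snapshot.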