
Data management in the modern structural biology and biomedical research environment.

Matthew D Zimmerman, Marek Grabowski, Marcin J Domagalski, Elizabeth M Maclean, Maksymilian Chruszcz, Wladek Minor.

Abstract

Modern high-throughput structural biology laboratories produce vast amounts of raw experimental data. The traditional method of data reduction is very simple: results are summarized in peer-reviewed publications, which are hopefully published in high-impact journals. By their nature, publications include only the most important results derived from experiments that may have been performed over the course of many years. The main content of the published paper is a concise compilation of these data, an interpretation of the experimental results, and a comparison of these results with those obtained by other scientists.

Due to an avalanche of structural biology manuscripts submitted to scientific journals, in many recent cases descriptions of experimental methodology (and sometimes even experimental results) are pushed to supplementary materials that are only published online and sometimes may not be reviewed as thoroughly as the main body of a manuscript. Trouble may arise when experimental results contradict the results obtained by other scientists, which requires (in the best case) the reexamination of the original raw data or independent repetition of the experiment according to its published description. There are reports that a significant fraction of experiments performed in academic laboratories cannot be repeated in an industrial environment (Begley CG & Ellis LM, Nature 483(7391):531-3, 2012). This is not an indication of scientific fraud but rather reflects the inadequate description of experiments performed on different equipment and on biological samples that were produced with disparate methods.
For that reason, the goal of a modern data management system is not only the simple replacement of the laboratory notebook by an electronic one but also the creation of a sophisticated, internally consistent, scalable data management system that will combine data obtained by a variety of experiments performed by various individuals on diverse equipment. All data should be stored in a core database that can be used by custom applications to prepare internal reports, compute statistics, and perform other functions that are specific to the research pursued in a particular laboratory.

This chapter presents a general overview of the methods of data management and analysis used by structural genomics (SG) programs. In addition to reviewing the existing literature on the subject, it presents experience gained in developing two SG data management systems, UniTrack and LabDB. The description is targeted to a general audience, as some technical details have been (or will be) published elsewhere. The focus is on "data management," meaning the process of gathering, organizing, and storing data, but "data mining," the process of analysis ideally leading to an understanding of the data, is also briefly discussed. In other words, data mining is the conversion of data into information. Clearly, effective data management is a precondition for any useful data mining. If done properly, gathering details on millions of experiments on thousands of proteins and making them publicly available for analysis, even after the projects themselves have ended, may turn out to be one of the most important benefits of SG programs.

Year:  2014        PMID: 24590705      PMCID: PMC4086192          DOI: 10.1007/978-1-4939-0354-2_1

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745


