Marni J Falk (1), Lishuang Shen (2), Xiaowu Gai (2)
(1) Division of Human Genetics, Department of Pediatrics, The Children's Hospital of Philadelphia and University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA; (2) Center for Personalized Medicine, Children's Hospital Los Angeles, Los Angeles, California 90027, USA.
Mitochondrial disease is now recognized to represent a highly heterogeneous group of genetic disorders that impair energy metabolism and can potentially involve every organ system, with more than 250 causative genes already confirmed across both the nuclear and mitochondrial genomes (Koopman et al. 2012). Novel variants in a recently recognized nuclear disease gene are reported in this issue to cause an autosomal-recessive multisystemic mitochondrial disease affecting two cousins from a Lebanese family (Joshi et al. 2016). As has now become commonplace in the highly heterogeneous classes of neurologic and mitochondrial diseases, massively parallel genomic sequencing in a single family with a complex phenotype proved to be an effective means to identify potentially pathogenic novel mutations in a nuclear gene encoding a mitochondrial protein that was recently linked to cerebellar ataxia (Choquet et al. 2015; Jobling et al. 2015). Protein modeling, functional analyses in patient fibroblasts, and wild-type cDNA rescue were then used to conclusively demonstrate the causality of the novel PMPCA mutations in their patients’ more expansive phenotype, which extended beyond isolated cerebellar atrophy to include multisystemic mitochondrial disease. In this iterative fashion, which relies directly on continual publication of case reports and data sharing within community resources, community knowledge of new gene disorders, new mutations within known genes, and their expanded phenotypes steadily grows, further improving community recognition of the functional significance of the up to 1500 proteins that localize within mitochondria (Calvo et al. 2016). However, given the large number of mitochondrial disease genes of low prevalence, their wide spectrum of clinical presentations, and their often overlapping clinical phenotypes (Koopman et al. 2012), it is increasingly challenging for any one clinician or researcher to recognize or stay current on all potential gene causes for a patient's disease presentation. Large public databases such as ClinVar serve as centralized warehouses for annotation assertions on potential gene causes for diverse disease presentations but remain dependent on user contributions (Landrum et al. 2016).
To aid the community of mitochondrial disease researchers in curating and accessing the rapidly expanding knowledge base of mitochondrial disease genetics, a Web-based resource has been built: the Mitochondrial Disease Sequence Data Resource (MSeqDR; https://mseqdr.org). This multidimensional data resource, conceived and built by the MSeqDR Consortium (Falk et al. 2015), aims to support both clinicians and researchers in the mitochondrial disease community. The consortium is a grass-roots effort begun in 2012 by mitochondrial disease researchers, clinicians, and bioinformaticians with the support of the United Mitochondrial Disease Foundation (UMDF) and North American Mitochondrial Disease Consortium (NAMDC). Thus, the focus of MSeqDR is to provide a centralized community resource in which to deposit, curate, annotate, analyze, and share accurate genomic knowledge on the many different mitochondrial diseases, genes, and variants that are continually being recognized to cause mitochondrial disease (McCormick et al. 2013). MSeqDR includes distinct domains in an interactive Web interface designed to intuitively support the diverse genomic data needs of clinicians, researchers, and diagnostic laboratories working in the field of mitochondrial disease. 
Major domains include (1) MSeqDR Gbrowse, a genome browser with custom tracks specific to diverse aspects of mitochondrial biology and disease; (2) MSeqDR-LSDB, a locus-specific database that provides curated data on more than 1360 nuclear and mitochondrial genes linked to mitochondrial biology and disease, as well as their transcripts and variants (MSeqDR currently synchronizes with ClinVar by periodically extracting variant annotations relevant to mitochondrial diseases from ClinVar [Landrum et al. 2016]); (3) a suite of public and custom bioinformatics tools to support direct genomic analyses by MSeqDR users on their own data sets of either nuclear or mitochondrial genomes; (4) phenome-based data and analysis tools to catalog features of mitochondrial diseases using defined terms from the Human Phenotype Ontology (HPO) (Groza et al. 2015); (5) cloud- and Web-based exome and/or genome analysis in Genesis 2.0 (formerly GEM.app) (Gonzalez et al. 2013), which enables discovery of additional patient cases for rare genetic disorders via the Matchmaker Exchange (Gonzalez et al. 2015); and (6) submission tools to facilitate deposition of genomic data sets and individual variants into MSeqDR. Additionally, variant annotation tools have been implemented within MSeqDR to simplify the work required to format variants for ClinVar submission. Collaboration with the Clinical Genome Resource (ClinGen) project team and the National Center for Biotechnology Information (NCBI) is underway to enable coordinated and automated ClinVar submission via MSeqDR. In this way, there will be a synergistic relationship between the community-specific MSeqDR resource and the more general ClinVar resource. Specifically, it is our goal that the mitochondrial disease community use MSeqDR as their primary system for deposition and analysis of genomic data, with subsequent sharing of interpreted variants into ClinVar in a way that will be directly managed by MSeqDR. 
This might include a reference in ClinVar variant entries that connects a submitted interpretation with the supporting observations housed in MSeqDR, accessible to all authenticated users. A detailed explanation of each MSeqDR component has recently been published (Shen et al. 2016). In addition, step-by-step user tutorials and practice exercises are provided on the MSeqDR website to familiarize users with the diverse functionality of MSeqDR. Conveniently, a central search portal is accessible from every page within MSeqDR, allowing users to efficiently survey all relevant data within the resource on any given disease, gene, or variant of interest. All sites within MSeqDR are internally linked and directly link out to many external databases. Although accounts are provided free to all academic users, login is required to access all of the data and different domain capabilities within MSeqDR. By default, public access is provided only to public data sets. Through the account system, users gain access to data sets shared by other MSeqDR consortium members. However, MSeqDR users retain the option to keep their own data sets private, share them with other MSeqDR users, or share them with the general public.
As was done for the PMPCA mutations identified by Joshi et al. (2016), novel disease genes and variants can be readily added to MSeqDR-LSDB. A custom MSeqDR accession number is assigned to all annotated pathogenic variants that are either submitted by users or batch extracted from other databases such as ClinVar or Ensembl, each containing a unique “MSCV” (MSeqDR clinically related variant) identifier with a seven-digit code. Users can readily submit variants, in either VCF (variant call format) or HGVS (Human Genome Variation Society) format, using the custom MSeqDR “pathogenic variant submission tool” (https://mseqdr.org/submission.php). 
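
To make the two accepted submission formats concrete, the following is a minimal Python sketch that distinguishes a tab-delimited VCF data line from an HGVS-style expression and checks the shape of an "MSCV" accession. It is not MSeqDR code: the function names, the loose format checks, and the exact accession layout (the letters "MSCV", an optional underscore, then seven digits) are assumptions made for illustration, and the accession values shown are hypothetical.

```python
import re

# Assumption: an HGVS-style string is "<reference>:<type>.<change>",
# e.g. "NC_012920.1:m.3243A>G". This is a loose shape check, not a full parser.
HGVS_PATTERN = re.compile(r"^[A-Za-z0-9_.()]+:[cgmnpr]\.\S+$")

# Assumption: an MSCV accession is "MSCV", an optional "_", then seven digits.
# The source states only that the identifier carries a seven-digit code.
MSCV_PATTERN = re.compile(r"^MSCV_?\d{7}$")


def classify_variant(text):
    """Return 'vcf', 'hgvs', or 'unknown' for a single variant string."""
    text = text.strip()
    fields = text.split("\t")
    # Loose VCF check: tab-delimited, at least CHROM, POS, ID, REF, ALT,
    # with an integer POS in the second column.
    if len(fields) >= 5 and fields[1].isdigit():
        return "vcf"
    if HGVS_PATTERN.match(text):
        return "hgvs"
    return "unknown"


def is_mscv_accession(identifier):
    """Check whether a string has the assumed MSCV accession shape."""
    return MSCV_PATTERN.match(identifier) is not None


if __name__ == "__main__":
    print(classify_variant("chrM\t3243\t.\tA\tG"))
    print(classify_variant("NC_012920.1:m.3243A>G"))
    print(is_mscv_accession("MSCV_0000001"))  # hypothetical accession
```

A check of this kind would only catch gross formatting mistakes before submission; it does not validate coordinates, reference alleles, or HGVS syntax in full.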
Authenticated users can add variants, as well as logged comments on any variant annotation, with assistance from the semiautomated variant annotation tool that is available within MSeqDR. Expert review panels are now being organized within the MSeqDR community to review the levels of evidence for pathogenicity of all reported variants within all MSeqDR-LSDB curated genes, following the guidelines set by the ClinGen Consortium (Rehm et al. 2015). Active development is also underway to incorporate phenotype entry and analysis tools in MSeqDR, to provide deeper knowledge for individual variants of the often complex, multisystem phenotypes seen in affected individuals (Shen et al. 2016). Specifically, MSeqDR is actively working to capture standardized phenotype data in controlled vocabularies, including HPO and MeSH (Medical Subject Headings). PhenoTips is being used for phenotype data input (Girdea et al. 2013) and combined with the associated phenotype-driven exploration and analysis tools in MSeqDR that support HPO, OMIM (Online Mendelian Inheritance in Man), and MeSH terminology.
In summary, MSeqDR provides a centralized Web portal in which to deposit, organize, curate, analyze, discover, educate, and share genomic knowledge necessary for the global community to better diagnose and understand the complex and ever-growing cadre of mitochondrial diseases. It is our hope that MSeqDR will prove to be a widely used and useful resource for data deposition and analysis by the diverse community of genomic researchers, mitochondrial disease clinicians, and genetic diagnostic laboratories who continue to discover and evaluate new mitochondrial disease patients, genes, and variants.
ADDITIONAL INFORMATION
Acknowledgments
The server for MSeqDR has been hosted since August 2015 at Children's Hospital Los Angeles (CHLA) with support from the CHLA IT department. It was previously located at Massachusetts Eye and Ear Infirmary (MEEI) with support from the Ocular Genomics Institute MEEI Bioinformatics Center (MBC) group and MEEI IT department. We are grateful to the leadership and staff of the United Mitochondrial Disease Foundation, including Chuck Mohan, Dan Wright, Patrick Kelley, Philip Yeske, Cliff Gorski, and Janet Owens for their organizational and financial support for MSeqDR Consortium activities. This work was also supported in part by the National Institutes of Health (U54-NS078059 and U41-HG006834). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
MSeqDR Consortium Participants: Marcella Attimonelli, Renkui Bai, Sherri Bale, Jirair Bedoyan, Doron Behar, Richard G. Boles, Penelope Bonnen, Virginia Brilhante, Lisa Brooks, Michael Brudno, Claudia Calabrese, Sarah Calvo, Patrick F. Chinnery, John Christodoulou, Deanna Church, Rosanna Clima, Bruce Cohen, IFM de Coo, William C. Copeland, Zarazuela Zolkipli-Cunningham, Jeana T. DaRe, Olga Derbenevoa, Maria Angela Diroma, Johan T. den Dunnen, David Dimmock, Gregory Enns, Giuseppe Gasparre, Rebecca Ganetzky, Amy Goldstein, Daniel Navarro-Gomez, Michael Gonzalez, Katrina Gwinn, Sihoun Hahn, Richard Haas, Hakon Hakonarson, Michio Hirano, Douglas Kerr, Danuta Krotoski, Austin Larson, Jeremy Leipzig, Dong Li, Marie T. Lott, Maria Lvova, Finley Macrae, Donna Maglott, Elizabeth McCormick, Grant Mitchell, Vamsi Mootha, Colleen Clarke Muraresku, Iris Gonzalez, Yasushi Okazaki, Melissa Parisi, Juan Carlos Perin, Eric Pierce, Vincent Procaccio, Holger Prokisch, Aurora Pujol, Shamima Rahman, David Ralph, Honey Reddi, Heidi Rehm, Erin Riggs, Richard Rodenburg, Yaffa Rubinstein, Russell Saneto, Mariangela Santorsola, Curt Scharfe, Claire Sheldon, Eric Shoubridge, Domenico Simone, Bert Smeets, Jan Smeitink, Christine Stanley, Fons Stassen, Anu Suomalainen-Waartiovaara, Mark Tarnopolsky, Isabelle Thiffault, David Thorburn, Johan Van Hove, Mannis van Oven, Lynne Wolfe, Lee-Jun Wong, Philip Yeske, Douglas C. Wallace, Zhe Zhang, and Stephan Zuchner.
Authors: Karine Choquet; Olga Zurita-Rendón; Roberta La Piana; Sharon Yang; Marie-Josée Dicaire; Kym M Boycott; Jacek Majewski; Eric A Shoubridge; Bernard Brais; Martine Tétreault Journal: Brain Date: 2015-12-10 Impact factor: 13.501
Authors: Rebekah K Jobling; Mirna Assoum; Oleksandr Gakh; Susan Blaser; Julian A Raiman; Cyril Mignot; Emmanuel Roze; Alexandra Dürr; Alexis Brice; Nicolas Lévy; Chitra Prasad; Tara Paton; Andrew D Paterson; Nicole M Roslin; Christian R Marshall; Jean-Pierre Desvignes; Nathalie Roëckel-Trevisiol; Stephen W Scherer; Guy A Rouleau; André Mégarbané; Grazia Isaya; Valérie Delague; Grace Yoon Journal: Brain Date: 2015-03-25 Impact factor: 13.501
Authors: Heidi L Rehm; Jonathan S Berg; Lisa D Brooks; Carlos D Bustamante; James P Evans; Melissa J Landrum; David H Ledbetter; Donna R Maglott; Christa Lese Martin; Robert L Nussbaum; Sharon E Plon; Erin M Ramos; Stephen T Sherry; Michael S Watson Journal: N Engl J Med Date: 2015-05-27 Impact factor: 91.245
Authors: Marta Girdea; Sergiu Dumitriu; Marc Fiume; Sarah Bowdin; Kym M Boycott; Sébastien Chénier; David Chitayat; Hanna Faghfoury; M Stephen Meyn; Peter N Ray; Joyce So; Dimitri J Stavropoulos; Michael Brudno Journal: Hum Mutat Date: 2013-05-24 Impact factor: 4.878
Authors: Marni J Falk; Xiaowu Gai; Lishuang Shen; Maria Angela Diroma; Michael Gonzalez; Daniel Navarro-Gomez; Jeremy Leipzig; Marie T Lott; Mannis van Oven; Douglas C Wallace; Colleen Clarke Muraresku; Zarazuela Zolkipli-Cunningham; Patrick F Chinnery; Marcella Attimonelli; Stephan Zuchner Journal: Hum Mutat Date: 2016-03-21 Impact factor: 4.878
Authors: Marni J Falk; Lishuang Shen; Michael Gonzalez; Jeremy Leipzig; Marie T Lott; Alphons P M Stassen; Maria Angela Diroma; Daniel Navarro-Gomez; Philip Yeske; Renkui Bai; Richard G Boles; Virginia Brilhante; David Ralph; Jeana T DaRe; Robert Shelton; Sharon F Terry; Zhe Zhang; William C Copeland; Mannis van Oven; Holger Prokisch; Douglas C Wallace; Marcella Attimonelli; Danuta Krotoski; Stephan Zuchner; Xiaowu Gai Journal: Mol Genet Metab Date: 2014-12-04 Impact factor: 4.797
Authors: Tudor Groza; Sebastian Köhler; Dawid Moldenhauer; Nicole Vasilevsky; Gareth Baynam; Tomasz Zemojtel; Lynn Marie Schriml; Warren Alden Kibbe; Paul N Schofield; Tim Beck; Drashtti Vasant; Anthony J Brookes; Andreas Zankl; Nicole L Washington; Christopher J Mungall; Suzanna E Lewis; Melissa A Haendel; Helen Parkinson; Peter N Robinson Journal: Am J Hum Genet Date: 2015-06-25 Impact factor: 11.025
Authors: Melissa J Landrum; Jennifer M Lee; Mark Benson; Garth Brown; Chen Chao; Shanmuga Chitipiralla; Baoshan Gu; Jennifer Hart; Douglas Hoffman; Jeffrey Hoover; Wonhee Jang; Kenneth Katz; Michael Ovetsky; George Riley; Amanjeev Sethi; Ray Tully; Ricardo Villamarin-Salomon; Wendy Rubinstein; Donna R Maglott Journal: Nucleic Acids Res Date: 2015-11-17 Impact factor: 16.971
Authors: Kathryn M Camp; Danuta Krotoski; Melissa A Parisi; Katrina A Gwinn; Bruce H Cohen; Christine S Cox; Gregory M Enns; Marni J Falk; Amy C Goldstein; Rashmi Gopal-Srivastava; Gráinne S Gorman; Stephen P Hersh; Michio Hirano; Freddie Ann Hoffman; Amel Karaa; Erin L MacLeod; Robert McFarland; Charles Mohan; Andrew E Mulberg; Joanne C Odenkirchen; Sumit Parikh; Patricia J Rutherford; Shawne K Suggs-Anderson; W H Wilson Tang; Jerry Vockley; Lynne A Wolfe; Steven Yannicelli; Philip E Yeske; Paul M Coates Journal: Mol Genet Metab Date: 2016-09-20 Impact factor: 4.797
Authors: Elizabeth M McCormick; Marie T Lott; Matthew C Dulik; Lishuang Shen; Marcella Attimonelli; Ornella Vitale; Amel Karaa; Renkui Bai; Daniel E Pineda-Alvarez; Larry N Singh; Christine M Stanley; Stacey Wong; Anshu Bhardwaj; Daria Merkurjev; Rong Mao; Neal Sondheimer; Shiping Zhang; Vincent Procaccio; Douglas C Wallace; Xiaowu Gai; Marni J Falk Journal: Hum Mutat Date: 2020-11-10 Impact factor: 4.878
Authors: Xiomara Q Rosales; John L P Thompson; Richard Haas; Johan L K Van Hove; Amel Karaa; Danuta Krotoski; Kristin Engelstad; Richard Buchsbaum; Salvatore DiMauro; Michio Hirano Journal: J Transl Genet Genom Date: 2020-04-28