Cole A Deisseroth1, Johannes Birgmeier1, Ethan E Bodle2, Jennefer N Kohler3, Dena R Matalon2, Yelena Nazarenko4, Casie A Genetti5, Catherine A Brownstein5, Klaus Schmitz-Abe5, Kelly Schoch6, Heidi Cope6, Rebecca Signer7, Julian A Martinez-Agosto7,8,9, Vandana Shashi6, Alan H Beggs5, Matthew T Wheeler3,10, Jonathan A Bernstein11, Gill Bejerano12,13,14,15.
1. Department of Computer Science, Stanford University, Stanford, CA, USA.
2. Department of Pediatrics, Stanford School of Medicine, Stanford, CA, USA.
3. Stanford Center for Undiagnosed Diseases, Stanford, CA, USA.
4. Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
5. The Manton Center for Orphan Disease Research, Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.
6. Department of Pediatrics, Duke University School of Medicine, Durham, NC, USA.
7. Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA.
8. Department of Pediatrics, Division of Medical Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA.
9. Department of Psychiatry, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA.
10. Department of Medicine, Stanford School of Medicine, Stanford, CA, USA.
11. Department of Pediatrics, Stanford School of Medicine, Stanford, CA, USA. Jon.Bernstein@stanford.edu.
12. Department of Computer Science, Stanford University, Stanford, CA, USA. bejerano@stanford.edu.
13. Department of Pediatrics, Stanford School of Medicine, Stanford, CA, USA. bejerano@stanford.edu.
14. Department of Biomedical Data Science, Stanford University, Stanford, CA, USA. bejerano@stanford.edu.
15. Department of Developmental Biology, Stanford University, Stanford, CA, USA. bejerano@stanford.edu.
Abstract
PURPOSE: Diagnosing monogenic diseases facilitates optimal care, but can involve the manual evaluation of hundreds of genetic variants per case. Computational tools like Phrank expedite this process by ranking all candidate genes by their ability to explain the patient's phenotypes. To use these tools, busy clinicians must manually encode patient phenotypes from lengthy clinical notes. With 100 million human genomes estimated to be sequenced by 2025, a fast alternative to manual phenotype extraction from clinical notes will become necessary.
METHODS: We introduce ClinPhen, a fast, high-accuracy tool that automatically converts clinical notes into a prioritized list of patient phenotypes using Human Phenotype Ontology (HPO) terms.
RESULTS: ClinPhen shows superior accuracy and 20× speedup over existing phenotype extractors, and its novel phenotype prioritization scheme improves the performance of gene-ranking tools.
CONCLUSION: While a dedicated clinician can process 200 patient records in a 40-hour workweek, ClinPhen does the same in 10 minutes. Compared with manual phenotype extraction, ClinPhen saves an additional 3-5 hours per Mendelian disease diagnosis. Providers can now add ClinPhen's output to each summary note attached to a filled testing laboratory request form. ClinPhen makes a substantial contribution to improvements in efficiency critically needed to meet the surging demand for clinical diagnostic sequencing.
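The throughput comparison in the conclusion can be made concrete with a short calculation. The sketch below uses only the figures quoted in the abstract (200 records, a 40-hour workweek, 10 minutes); the variable names are illustrative, and the resulting ratio describes ClinPhen against manual extraction, which is a separate comparison from the 20× speedup reported against other phenotype extractors.

```python
# Figures quoted in the abstract; variable names are illustrative only.
records = 200            # patient records processed
clinician_hours = 40     # one workweek of manual phenotype extraction
clinphen_minutes = 10    # ClinPhen run time for the same records

# Average time spent per record, in seconds.
clinician_sec_per_record = clinician_hours * 3600 / records
clinphen_sec_per_record = clinphen_minutes * 60 / records

# Ratio of manual time to ClinPhen time for the same workload.
speedup_vs_manual = clinician_sec_per_record / clinphen_sec_per_record

print(f"Clinician: {clinician_sec_per_record / 60:.0f} min per record")
print(f"ClinPhen:  {clinphen_sec_per_record:.0f} s per record")
print(f"Ratio:     {speedup_vs_manual:.0f}x")
```

Under these figures a clinician averages 12 minutes per record and ClinPhen about 3 seconds, a ratio of roughly 240 to 1 for the extraction step alone.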
Keywords:
Mendelian disease diagnosis; medical genetics; natural language processing; prioritized disease phenotypes