Alden King-Yung Leung [1], Melissa Chun-Jiao Liu [2], Le Li [3], Yvonne Yuk-Yin Lai [4,5], Catherine Chu [4,5], Pui-Yan Kwok [4,5], Pak-Leung Ho [2], Kevin Y Yip [3,6], Ting-Fung Chan [1,7,6].

1. School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong.
2. Carol Yu Center for Infection and Department of Microbiology, The University of Hong Kong, Queen Mary Hospital, Pok Fu Lam, Hong Kong.
3. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong.
4. Cardiovascular Research Institute, University of California, San Francisco, CA 94153, USA.
5. Institute of Human Genetics, University of California, San Francisco, CA 94153, USA.
6. Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, Hong Kong.
7. State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong.
Abstract
BACKGROUND: Optical mapping is an emerging technology that complements sequencing-based methods in genome analysis. It is widely used to improve genome assemblies and detect structural variations because it provides information from much longer reads (up to 1 Mb). The current standard approach assembles optical maps into contigs and aligns them to a reference; this is limited to pairwise comparison and becomes prone to bias when multiple samples are analyzed.

FINDINGS: We present a new method, OMMA, that extends optical mapping to the study of complex genomic features by simultaneously interrogating optical maps across many samples in a reference-independent manner. When applied to 154 human samples across the 26 populations sequenced in the 1000 Genomes Project, OMMA captures and characterizes complex genomic features such as multiple haplotypes, copy number variations, and subtelomeric structures. For small genomes such as those of pathogenic bacteria, OMMA accurately reconstructs the phylogenomic relationships and identifies functional elements across 21 Acinetobacter baumannii strains.

CONCLUSIONS: With the increasing data throughput of optical mapping systems, the use of this technology in comparative genome analysis across many samples will become feasible. OMMA is a timely solution that addresses this computational need. The OMMA software is available at https://github.com/TF-Chan-Lab/OMTools.