De novo assembly and phasing of a Korean human genome.

Jeong-Sun Seo, Arang Rhie, Junsoo Kim, Sangjin Lee, Min-Hwan Sohn, Chang-Uk Kim, Alex Hastie, Han Cao, Ji-Young Yun, Jihye Kim, Junho Kuk, Gun Hwa Park, Juhyeok Kim, Hanna Ryu, Jongbum Kim, Mira Roh, Jeonghun Baek, Michael W Hunkapiller, Jonas Korlach, Jong-Yeon Shin, Changhoon Kim.

Abstract

Advances in genome assembly and phasing provide an opportunity to investigate the diploid architecture of the human genome and reveal the full range of structural variation across population groups. Here we report the de novo assembly and haplotype phasing of the Korean individual AK1 (ref. 1) using single-molecule real-time sequencing, next-generation mapping, microfluidics-based linked reads, and bacterial artificial chromosome (BAC) sequencing approaches. Single-molecule sequencing coupled with next-generation mapping generated a highly contiguous assembly, with a contig N50 size of 17.9 Mb and a scaffold N50 size of 44.8 Mb, resolving 8 chromosomal arms into single scaffolds. The de novo assembly, along with local assemblies and spanning long reads, closes 105 and extends into 72 out of 190 euchromatic gaps in the reference genome, adding 1.03 Mb of previously intractable sequence. High concordance between the assembly and paired-end sequences from 62,758 BAC clones provides strong support for the robustness of the assembly. We identify 18,210 structural variants by direct comparison of the assembly with the human reference, identifying thousands of breakpoints that, to our knowledge, have not been reported before. Many of the insertions are reflected in the transcriptome and are shared across the Asian population. We performed haplotype phasing of the assembly with short reads, long reads and linked reads from whole-genome sequencing and with short reads from 31,719 BAC clones, thereby achieving phased blocks with an N50 size of 11.6 Mb. Haplotigs assembled from single-molecule real-time reads assigned to haplotypes on phased blocks covered 89% of genes. The haplotigs accurately characterized the hypervariable major histocompatibility complex region and demonstrated the allele configuration of clinically relevant genes such as CYP2D6. This work presents the most contiguous diploid human genome assembly so far, with extensive investigation of unreported and Asian-specific structural variants, and high-quality haplotyping of clinically relevant alleles for precision medicine.
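
The abstract reports contig, scaffold and phased-block sizes as N50 values. As a reminder of what that statistic means, here is a minimal sketch of the standard N50 definition: the length L such that sequences of length L or longer contain at least half of the total assembled bases. The function name `n50` and the toy lengths are hypothetical illustrations, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together account for at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length
    # until half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy contig lengths (hypothetical values, arbitrary units).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
```

With these toy lengths the total is 54, and the three longest contigs (10, 9, 8) already cover half of it, so the N50 is 8. The same calculation applies to contigs, scaffolds or phased blocks; only the input lengths change.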

Year:  2016        PMID: 27706134     DOI: 10.1038/nature20098

Source DB:  PubMed          Journal:  Nature        ISSN: 0028-0836            Impact factor:   49.962


References: 32 in total

1.  Consed: a graphical tool for sequence finishing.

Authors:  D Gordon; C Abajian; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

2.  STAR: ultrafast universal RNA-seq aligner.

Authors:  Alexander Dobin; Carrie A Davis; Felix Schlesinger; Jorg Drenkow; Chris Zaleski; Sonali Jha; Philippe Batut; Mark Chaisson; Thomas R Gingeras
Journal:  Bioinformatics       Date:  2012-10-25       Impact factor: 6.937

3.  Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

Authors:  Chen-Shan Chin; David H Alexander; Patrick Marks; Aaron A Klammer; James Drake; Cheryl Heiner; Alicia Clum; Alex Copeland; John Huddleston; Evan E Eichler; Stephen W Turner; Jonas Korlach
Journal:  Nat Methods       Date:  2013-05-05       Impact factor: 28.547

4.  Tools and best practices for data processing in allelic expression analysis.

Authors:  Stephane E Castel; Ami Levy-Moonshine; Pejman Mohammadi; Eric Banks; Tuuli Lappalainen
Journal:  Genome Biol       Date:  2015-09-17       Impact factor: 13.583

5.  A highly annotated whole-genome sequence of a Korean individual.

Authors:  Jong-Il Kim; Young Seok Ju; Hansoo Park; Sheehyun Kim; Seonwook Lee; Jae-Hyuk Yi; Joann Mudge; Neil A Miller; Dongwan Hong; Callum J Bell; Hye-Sun Kim; In-Soon Chung; Woo-Chung Lee; Ji-Sun Lee; Seung-Hyun Seo; Ji-Young Yun; Hyun Nyun Woo; Heewook Lee; Dongwhan Suh; Seungbok Lee; Hyun-Jin Kim; Maryam Yavartanoo; Minhye Kwak; Ying Zheng; Mi Kyeong Lee; Hyunjun Park; Jeong Yeon Kim; Omer Gokcumen; Ryan E Mills; Alexander Wait Zaranek; Joseph Thakuria; Xiaodi Wu; Ryan W Kim; Jim J Huntley; Shujun Luo; Gary P Schroth; Thomas D Wu; HyeRan Kim; Kap-Seok Yang; Woong-Yang Park; Hyungtae Kim; George M Church; Charles Lee; Stephen F Kingsmore; Jeong-Sun Seo
Journal:  Nature       Date:  2009-07-08       Impact factor: 49.962

6.  Mutations and common polymorphisms in ADAMTS13 gene responsible for von Willebrand factor-cleaving protease activity.

Authors:  Koichi Kokame; Masanori Matsumoto; Kenji Soejima; Hideo Yagi; Hiromichi Ishizashi; Masahisa Funato; Hiroshi Tamai; Mutsuko Konno; Kei Kamide; Yuhei Kawano; Toshiyuki Miyata; Yoshihiro Fujimura
Journal:  Proc Natl Acad Sci U S A       Date:  2002-08-14       Impact factor: 11.205

7.  The diploid genome sequence of an individual human.

Authors:  Samuel Levy; Granger Sutton; Pauline C Ng; Lars Feuk; Aaron L Halpern; Brian P Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul A Kravitz; Dana A Busam; Karen Y Beeson; Tina C McIntosh; Karin A Remington; Josep F Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin E Frazier; Stephen W Scherer; Robert L Strausberg; J Craig Venter
Journal:  PLoS Biol       Date:  2007-09-04       Impact factor: 8.029

8.  A global reference for human genetic variation.

Authors:  Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal:  Nature       Date:  2015-10-01       Impact factor: 49.962

9.  Detailed analysis of 15q11-q14 sequence corrects errors and gaps in the public access sequence to fully reveal large segmental duplications at breakpoints for Prader-Willi, Angelman, and inv dup(15) syndromes.

Authors:  Andrew J Makoff; Rachel H Flomen
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583

10.  ClinVar: public archive of relationships among sequence variation and human phenotype.

Authors:  Melissa J Landrum; Jennifer M Lee; George R Riley; Wonhee Jang; Wendy S Rubinstein; Deanna M Church; Donna R Maglott
Journal:  Nucleic Acids Res       Date:  2013-11-14       Impact factor: 16.971

Cited by: 132 in total

1.  Reconstruction and evolutionary history of eutherian chromosomes.

Authors:  Jaebum Kim; Marta Farré; Loretta Auvil; Boris Capitanu; Denis M Larkin; Jian Ma; Harris A Lewin
Journal:  Proc Natl Acad Sci U S A       Date:  2017-06-19       Impact factor: 11.205

Review 2.  Detecting Somatic Mutations in Normal Cells.

Authors:  Yanmei Dou; Heather D Gold; Lovelace J Luquette; Peter J Park
Journal:  Trends Genet       Date:  2018-05-03       Impact factor: 11.639

3.  LR_Gapcloser: a tiling path-based gap closer that uses long reads to complete genome assembly.

Authors:  Gui-Cai Xu; Tian-Jun Xu; Rui Zhu; Yan Zhang; Shang-Qi Li; Hong-Wei Wang; Jiong-Tang Li
Journal:  Gigascience       Date:  2019-01-01       Impact factor: 6.524

4.  Microfluidic long DNA sample preparation from cells.

Authors:  Paridhi Agrawal; Kevin D Dorfman
Journal:  Lab Chip       Date:  2019-01-15       Impact factor: 6.799

Review 5.  New technologies to uncover the molecular basis of disorders of sex development.

Authors:  Hayk Barseghyan; Emmanuèle C Délot; Eric Vilain
Journal:  Mol Cell Endocrinol       Date:  2018-04-13       Impact factor: 4.102

6.  A 12-kb structural variation in progressive myoclonic epilepsy was newly identified by long-read whole-genome sequencing.

Authors:  Takeshi Mizuguchi; Takeshi Suzuki; Chihiro Abe; Ayako Umemura; Katsushi Tokunaga; Yosuke Kawai; Minoru Nakamura; Masao Nagasaki; Kengo Kinoshita; Yasunobu Okamura; Satoko Miyatake; Noriko Miyake; Naomichi Matsumoto
Journal:  J Hum Genet       Date:  2019-02-13       Impact factor: 3.172

7.  Sequencing and de novo assembly of 150 genomes from Denmark as a population reference.

Authors:  Lasse Maretty; Jacob Malte Jensen; Bent Petersen; Jonas Andreas Sibbesen; Siyang Liu; Palle Villesen; Laurits Skov; Kirstine Belling; Christian Theil Have; Jose M G Izarzugaza; Marie Grosjean; Jette Bork-Jensen; Jakob Grove; Thomas D Als; Shujia Huang; Yuqi Chang; Ruiqi Xu; Weijian Ye; Junhua Rao; Xiaosen Guo; Jihua Sun; Hongzhi Cao; Chen Ye; Johan van Beusekom; Thomas Espeseth; Esben Flindt; Rune M Friborg; Anders E Halager; Stephanie Le Hellard; Christina M Hultman; Francesco Lescai; Shengting Li; Ole Lund; Peter Løngren; Thomas Mailund; Maria Luisa Matey-Hernandez; Ole Mors; Christian N S Pedersen; Thomas Sicheritz-Pontén; Patrick Sullivan; Ali Syed; David Westergaard; Rachita Yadav; Ning Li; Xun Xu; Torben Hansen; Anders Krogh; Lars Bolund; Thorkild I A Sørensen; Oluf Pedersen; Ramneek Gupta; Simon Rasmussen; Søren Besenbacher; Anders D Børglum; Jun Wang; Hans Eiberg; Karsten Kristiansen; Søren Brunak; Mikkel Heide Schierup
Journal:  Nature       Date:  2017-07-26       Impact factor: 49.962

8.  Simulations of knotting of DNA during genome mapping.

Authors:  Aashish Jain; Kevin D Dorfman
Journal:  Biomicrofluidics       Date:  2017-04-11       Impact factor: 2.800

9.  A golden goat genome.

Authors:  Kim C Worley
Journal:  Nat Genet       Date:  2017-03-30       Impact factor: 38.330

Review 10.  Reference standards for next-generation sequencing.

Authors:  Simon A Hardwick; Ira W Deveson; Tim R Mercer
Journal:  Nat Rev Genet       Date:  2017-06-19       Impact factor: 53.242
