RecoverY: k-mer-based read classification for Y-chromosome-specific sequencing and assembly.

Samarth Rangavittal, Robert S Harris, Monika Cechova, Marta Tomaszkiewicz, Rayan Chikhi, Kateryna D Makova, Paul Medvedev.

Abstract

Motivation: The mammalian Y chromosome is usually under-represented in genome assemblies because of its high repeat content and the low sequencing depth that results from its haploid state. One strategy to ameliorate the low coverage of Y sequences is to experimentally enrich Y-specific material before assembly. Because the enrichment process is imperfect, algorithms are needed to identify putative Y-specific reads prior to downstream assembly. A strategy that uses k-mer abundances to identify such reads was used to assemble the gorilla Y. However, that strategy required key parameters to be set manually, a time-consuming process that led to sub-optimal assemblies.
Results: We develop a method, RecoverY, that selects Y-specific reads by automatically choosing the abundance level at which a k-mer is deemed to originate from the Y. This algorithm uses prior knowledge about the Y chromosome of a related species or known Y transcript sequences. We evaluate RecoverY on both simulated and real data, for human and gorilla, and investigate its robustness to important parameters. We show that RecoverY leads to a vastly superior assembly compared to alternative strategies of filtering the reads or contigs. Compared to the preliminary strategy used by Tomaszkiewicz et al., we achieve a 33% improvement in assembly size and a 20% improvement in the NG50, demonstrating the power of automatic parameter selection.
Availability and implementation: Our tool RecoverY is freely available at https://github.com/makovalab-psu/RecoverY.
Contact: kmakova@bx.psu.edu or pashadag@cse.psu.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
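
The idea summarized in the Results, classifying reads by k-mer abundance with a threshold derived from trusted Y sequence, can be illustrated with a minimal Python sketch. This is not RecoverY's implementation: the function names, the `keep_fraction` rule for choosing the threshold, and the `min_fraction` rule for accepting a read are all assumptions made for illustration, and reverse complements are ignored for brevity.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only, for simplicity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def choose_threshold(counts, trusted_kmers, keep_fraction=0.95):
    """Pick an abundance threshold from k-mers believed to be Y-derived.

    Hypothetical rule (an assumption, not the published one): return the
    largest abundance that still retains `keep_fraction` of the trusted
    k-mers that were observed in the reads.
    """
    observed = sorted(
        (counts[km] for km in trusted_kmers if counts[km] > 0), reverse=True
    )
    if not observed:
        raise ValueError("no trusted k-mer was observed in the reads")
    idx = max(0, math.ceil(keep_fraction * len(observed)) - 1)
    return observed[idx]


def select_reads(reads, counts, k, threshold, min_fraction=0.5):
    """Keep reads in which enough k-mers reach the abundance threshold.

    `min_fraction` is an illustrative acceptance rule, also an assumption.
    """
    selected = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read shorter than k
        hits = sum(1 for km in read_kmers if counts[km] >= threshold)
        if hits / len(read_kmers) >= min_fraction:
            selected.append(read)
    return selected
```

In an enriched library, k-mers from the Y would be expected to appear at higher abundance than k-mers from the rest of the genome, so a call such as `select_reads(reads, counts, k, choose_threshold(counts, trusted))` would retain mostly the high-abundance reads. Here `trusted` stands for k-mers taken from a related species' Y or from known Y transcripts, as the abstract describes.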

Year:  2018        PMID: 29194476      PMCID: PMC6030959          DOI: 10.1093/bioinformatics/btx771

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  27 in total

1.  The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes.

Authors:  Helen Skaletsky; Tomoko Kuroda-Kawaguchi; Patrick J Minx; Holland S Cordum; LaDeana Hillier; Laura G Brown; Sjoerd Repping; Tatyana Pyntikova; Johar Ali; Tamberlyn Bieri; Asif Chinwalla; Andrew Delehaunty; Kim Delehaunty; Hui Du; Ginger Fewell; Lucinda Fulton; Robert Fulton; Tina Graves; Shun-Fang Hou; Philip Latrielle; Shawn Leonard; Elaine Mardis; Rachel Maupin; John McPherson; Tracie Miner; William Nash; Christine Nguyen; Philip Ozersky; Kymberlie Pepin; Susan Rock; Tracy Rohlfing; Kelsi Scott; Brian Schultz; Cindy Strong; Aye Tin-Wollam; Shiaw-Pyng Yang; Robert H Waterston; Richard K Wilson; Steve Rozen; David C Page
Journal:  Nature       Date:  2003-06-19       Impact factor: 49.962

2.  A fast, lock-free approach for efficient parallel counting of occurrences of k-mers.

Authors:  Guillaume Marçais; Carl Kingsford
Journal:  Bioinformatics       Date:  2011-01-07       Impact factor: 6.937

3.  Sequencing the mouse Y chromosome reveals convergent gene acquisition and amplification on both sex chromosomes.

Authors:  Y Q Shirleen Soh; Jessica Alföldi; Tatyana Pyntikova; Laura G Brown; Tina Graves; Patrick J Minx; Robert S Fulton; Colin Kremitzki; Natalia Koutseva; Jacob L Mueller; Steve Rozen; Jennifer F Hughes; Elaine Owens; James E Womack; William J Murphy; Qing Cao; Pieter de Jong; Wesley C Warren; Richard K Wilson; Helen Skaletsky; David C Page
Journal:  Cell       Date:  2014-10-30       Impact factor: 41.582

4.  Evolution of X-degenerate Y chromosome genes in greater apes: conservation of gene content in human and gorilla, but not chimpanzee.

Authors:  Hiroki Goto; Lei Peng; Kateryna D Makova
Journal:  J Mol Evol       Date:  2009-01-14       Impact factor: 2.395

5.  Efficient counting of k-mers in DNA sequences using a bloom filter.

Authors:  Páll Melsted; Jonathan K Pritchard
Journal:  BMC Bioinformatics       Date:  2011-08-10       Impact factor: 3.169

6.  Strict evolutionary conservation followed rapid gene loss on human and rhesus Y chromosomes.

Authors:  Jennifer F Hughes; Helen Skaletsky; Laura G Brown; Tatyana Pyntikova; Tina Graves; Robert S Fulton; Shannon Dugan; Yan Ding; Christian J Buhay; Colin Kremitzki; Qiaoyan Wang; Hua Shen; Michael Holder; Donna Villasana; Lynne V Nazareth; Andrew Cree; Laura Courtney; Joelle Veizer; Holland Kotkiewicz; Ting-Jan Cho; Natalia Koutseva; Steve Rozen; Donna M Muzny; Wesley C Warren; Richard A Gibbs; Richard K Wilson; David C Page
Journal:  Nature       Date:  2012-02-22       Impact factor: 49.962

7.  SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors:  Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal:  Gigascience       Date:  2012-12-27       Impact factor: 6.524

8.  Chimpanzee and human Y chromosomes are remarkably divergent in structure and gene content.

Authors:  Jennifer F Hughes; Helen Skaletsky; Tatyana Pyntikova; Tina A Graves; Saskia K M van Daalen; Patrick J Minx; Robert S Fulton; Sean D McGrath; Devin P Locke; Cynthia Friedman; Barbara J Trask; Elaine R Mardis; Wesley C Warren; Sjoerd Repping; Steve Rozen; Richard K Wilson; David C Page
Journal:  Nature       Date:  2010-01-13       Impact factor: 49.962

9.  The khmer software package: enabling efficient nucleotide sequence analysis.

Authors:  Michael R Crusoe; Hussien F Alameldin; Sherine Awad; Elmar Boucher; Adam Caldwell; Reed Cartwright; Amanda Charbonneau; Bede Constantinides; Greg Edvenson; Scott Fay; Jacob Fenton; Thomas Fenzl; Jordan Fish; Leonor Garcia-Gutierrez; Phillip Garland; Jonathan Gluck; Iván González; Sarah Guermond; Jiarong Guo; Aditi Gupta; Joshua R Herr; Adina Howe; Alex Hyer; Andreas Härpfer; Luiz Irber; Rhys Kidd; David Lin; Justin Lippi; Tamer Mansour; Pamela McA'Nulty; Eric McDonald; Jessica Mizzi; Kevin D Murray; Joshua R Nahum; Kaben Nanlohy; Alexander Johan Nederbragt; Humberto Ortiz-Zuazaga; Jeramia Ory; Jason Pell; Charles Pepe-Ranney; Zachary N Russ; Erich Schwarz; Camille Scott; Josiah Seaman; Scott Sievert; Jared Simpson; Connor T Skennerton; James Spencer; Ramakrishnan Srinivasan; Daniel Standage; James A Stapleton; Susan R Steinman; Joe Stein; Benjamin Taylor; Will Trimble; Heather L Wiencko; Michael Wright; Brian Wyss; Qingpeng Zhang; En Zyme; C Titus Brown
Journal:  F1000Res       Date:  2015-09-25

10.  A time- and cost-effective strategy to sequence mammalian Y Chromosomes: an application to the de novo assembly of gorilla Y.

Authors:  Marta Tomaszkiewicz; Samarth Rangavittal; Monika Cechova; Rebeca Campos Sanchez; Howard W Fescemyer; Robert Harris; Danling Ye; Patricia C M O'Brien; Rayan Chikhi; Oliver A Ryder; Malcolm A Ferguson-Smith; Paul Medvedev; Kateryna D Makova
Journal:  Genome Res       Date:  2016-03-02       Impact factor: 9.043

  9 in total

Review 1.  Satellite DNAs and human sex chromosome variation.

Authors:  Monika Cechova; Karen H Miga
Journal:  Semin Cell Dev Biol       Date:  2022-05-27       Impact factor: 7.499

2.  Correcting palindromes in long reads after whole-genome amplification.

Authors:  Sven Warris; Elio Schijlen; Henri van de Geest; Rahulsimham Vegesna; Thamara Hesselink; Bas Te Lintel Hekkert; Gabino Sanchez Perez; Paul Medvedev; Kateryna D Makova; Dick de Ridder
Journal:  BMC Genomics       Date:  2018-11-06       Impact factor: 3.969

3.  DiscoverY: a classifier for identifying Y chromosome sequences in male assemblies.

Authors:  Samarth Rangavittal; Natasha Stopa; Marta Tomaszkiewicz; Kristoffer Sahlin; Kateryna D Makova; Paul Medvedev
Journal:  BMC Genomics       Date:  2019-08-09       Impact factor: 3.969

Review 4.  How to identify sex chromosomes and their turnover.

Authors:  Daniela H Palmer; Thea F Rogers; Rebecca Dean; Alison E Wright
Journal:  Mol Ecol       Date:  2019-10-10       Impact factor: 6.185

5.  Sequencing Red Fox Y Chromosome Fragments to Develop Phylogenetically Informative SNP Markers and Glimpse Male-Specific Trans-Pacific Phylogeography.

Authors:  Benjamin N Sacks; Zachary T Lounsberry; Halie M Rando; Kristopher Kluepfel; Steven R Fain; Sarah K Brown; Anna V Kukekova
Journal:  Genes (Basel)       Date:  2021-01-14       Impact factor: 4.096

6.  The assembly of caprine Y chromosome sequence reveals a unique paternal phylogenetic pattern and improves our understanding of the origin of domestic goat.

Authors:  Changyi Xiao; Jingjin Li; Tanghui Xie; Jianhai Chen; Sijia Zhang; Salma Hassan Elaksher; Fan Jiang; Yaoxin Jiang; Lu Zhang; Wei Zhang; Yue Xiang; Zhenyang Wu; Shuhong Zhao; Xiaoyong Du
Journal:  Ecol Evol       Date:  2021-05-04       Impact factor: 2.912

7.  Shared and Species-Specific Patterns of Nascent Y Chromosome Evolution in Two Guppy Species.

Authors:  Jake Morris; Iulia Darolti; Natasha I Bloch; Alison E Wright; Judith E Mank
Journal:  Genes (Basel)       Date:  2018-05-03       Impact factor: 4.096

8.  Sc-ncDNAPred: A Sequence-Based Predictor for Identifying Non-coding DNA in Saccharomyces cerevisiae.

Authors:  Wenying He; Ying Ju; Xiangxiang Zeng; Xiangrong Liu; Quan Zou
Journal:  Front Microbiol       Date:  2018-09-12       Impact factor: 5.640

9.  Schistosome W-Linked Genes Inform Temporal Dynamics of Sex Chromosome Evolution and Suggest Candidate for Sex Determination.

Authors:  Marwan Elkrewi; Mikhail A Moldovan; Marion A L Picard; Beatriz Vicoso
Journal:  Mol Biol Evol       Date:  2021-12-09       Impact factor: 16.240
