Alignment-free approaches for predicting novel Nuclear Mitochondrial Segments (NUMTs) in the human genome.

Wentian Li, Jerome Freudenberg, Jan Freudenberg.

Abstract

The nuclear human genome harbors sequences of mitochondrial origin, indicating an ancestral transfer of DNA from the mitogenome. Several Nuclear Mitochondrial Segments (NUMTs) have been detected by alignment-based sequence similarity search, as implemented in the Basic Local Alignment Search Tool (BLAST). Identifying NUMTs is important for the comprehensive annotation and understanding of the human genome. Here we explore the possibility of detecting NUMTs in the human genome by alignment-free sequence similarity search, such as comparing k-mer (k-tuple, k-gram, oligo of length k) distributions. We find that when k=6 or larger, the k-mer approach and BLAST search produce almost identical results, e.g., they detect the same set of NUMTs longer than 3 kb. However, when k=5 or k=4, certain signals are detected only by the alignment-free approach, and these may indicate as yet unrecognized, and potentially more ancestral, NUMTs. We introduce a "Manhattan plot" style representation of NUMT predictions across the genome, which are calculated based on the reciprocal of the Jensen-Shannon divergence between the nuclear and mitochondrial k-mer frequencies. Further inspection of the k-mer-based NUMT predictions, however, shows that most of them contain long-terminal-repeat (LTR) annotations, whereas BLAST-based NUMT predictions do not. Thus, a similarity of the mitogenome to LTR sequences is recognized, which we validate by finding that the mitochondrial k-mer distribution is closer to those of transposable sequences and, specifically, close to some types of LTR.
Copyright © 2019 Elsevier B.V. All rights reserved.
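
The score described in the abstract (the reciprocal of the Jensen-Shannon divergence between nuclear and mitochondrial k-mer frequencies) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `kmer_freqs`, `js_divergence` and `numt_score` are hypothetical, the logarithm base (2) is an assumption, and details the abstract does not give, such as window size, strand handling and treatment of ambiguous bases, are simplified here (k-mers containing characters other than A, C, G, T are ignored).

```python
from collections import Counter
from itertools import product
from math import log2


def kmer_freqs(seq, k):
    """Return relative frequencies of all 4**k k-mers over A, C, G, T.

    Assumption: k-mers containing any other character (e.g. N) are skipped.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


def numt_score(window, mito, k=5):
    """Reciprocal of the JS divergence between a nuclear window and the
    mitochondrial sequence; larger values mean more similar k-mer usage."""
    d = js_divergence(kmer_freqs(window, k), kmer_freqs(mito, k))
    return float("inf") if d == 0 else 1.0 / d
```

Under these assumptions, computing `numt_score` for consecutive windows along each chromosome and plotting the values against genomic position would give a Manhattan-plot-like profile, with peaks marking windows whose k-mer usage resembles that of the mitogenome.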

Keywords:  Alignment-free; Jensen-Shannon divergence; Manhattan plot; Mitochondria; NUMT; k-mer

Year:  2019        PMID: 30630097     DOI: 10.1016/j.gene.2018.12.040

Source DB:  PubMed          Journal:  Gene        ISSN: 0378-1119            Impact factor:   3.688


  2 in total

1.  Phylogenetic Analysis of HIV-1 Genomes Based on the Position-Weighted K-mers Method.

Authors:  Yuanlin Ma; Zuguo Yu; Runbin Tang; Xianhua Xie; Guosheng Han; Vo V Anh
Journal:  Entropy (Basel)       Date:  2020-02-23       Impact factor: 2.524

2.  Variant calling: Considerations, practices, and developments. (Review)

Authors:  Stepanka Zverinova; Victor Guryev
Journal:  Hum Mutat       Date:  2021-12-16       Impact factor: 4.700

