Lineage-associated underrepresented permutations (LAUPs) of mammalian genomic sequences based on a Jellyfish-based LAUPs analysis application (JBLA).

Le Zhang, Ming Xiao, Jingsong Zhou, Jun Yu.

Abstract

Motivation: This study addresses several important questions related to naturally underrepresented sequences: (i) Are there permutations of real genomic DNA sequences of a defined length (k-mer) and in a given lineage that do not actually exist or are underrepresented? (ii) If there are such sequences, what are their characteristics in terms of k-mer length and base composition? (iii) Are they related to the CpG or TpA underrepresentation known for human sequences? We propose that the answers to these questions are of great significance for the study of sequence-associated regulatory mechanisms, such as cytosine methylation and chromosomal structures, in physiological or pathological conditions such as cancer.
Results: We empirically defined sequences that were not included in any well-known public databases as lineage-associated underrepresented permutations (LAUPs). We then developed a Jellyfish-based LAUPs analysis application (JBLA) to investigate LAUPs for 24 representative species. The present discoveries include: (i) the shortest LAUPs range from 10 to 14 in length and collectively constitute a low proportion of the genome; (ii) common LAUPs show higher CG content over the analysed mammalian genome and possess distinct CG*CG motifs; (iii) neither CpG-containing LAUPs nor CpG island sequences are randomly structured and distributed over the genomes; some LAUPs and most CpG-containing sequences exhibit an opposite trend within the same k and n variants. In addition, we demonstrate that the JBLA algorithm is more efficient than the original Jellyfish for computing LAUPs.
Availability and implementation: We developed a Jellyfish-based LAUP analysis (JBLA) application by integrating Jellyfish (Marçais and Kingsford, 2011), MEME (Bailey et al., 2009) and the NCBI genome database (Pruitt et al., 2007) applications, which are listed as Supplementary Material.
Supplementary information: Supplementary data are available at Bioinformatics online.
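The notion of a permutation of length k that never occurs in a sequence can be illustrated with a small Python sketch. This is not the JBLA implementation and does not call Jellyfish; it is a brute-force toy that only works for short k, and the function names (`observed_kmers`, `absent_kmers`, `shortest_absent_length`, `cg_fraction`) and the example sequence are hypothetical, introduced here for illustration.

```python
from itertools import product

ALPHABET = "ACGT"


def observed_kmers(seq, k):
    """Return the set of k-mers present in seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):
            found.add(kmer)
    return found


def absent_kmers(seq, k):
    """Return, in sorted order, every one of the 4**k possible k-mers that never occurs in seq."""
    present = observed_kmers(seq, k)
    candidates = ("".join(p) for p in product(ALPHABET, repeat=k))
    return [kmer for kmer in candidates if kmer not in present]


def shortest_absent_length(seq, max_k=8):
    """Return (k, absent k-mers) for the smallest k with at least one absent k-mer."""
    for k in range(1, max_k + 1):
        missing = absent_kmers(seq, k)
        if missing:
            return k, missing
    return None, []


def cg_fraction(kmers):
    """Fraction of C and G bases over all bases in a collection of k-mers."""
    total = sum(len(kmer) for kmer in kmers)
    if total == 0:
        return 0.0
    return sum(kmer.count("C") + kmer.count("G") for kmer in kmers) / total


if __name__ == "__main__":
    toy = "ACGTTGCAAGGCTTAACCGGATAT"  # made-up sequence, not real genomic data
    k, missing = shortest_absent_length(toy)
    print("shortest absent length:", k)
    print("number of absent k-mers:", len(missing))
    print("CG fraction of absent k-mers:", cg_fraction(missing))
```

Enumerating all 4^k candidates is only feasible for small k; for the lengths reported in the abstract (10 to 14) on whole genomes, a dedicated k-mer counter such as Jellyfish is the practical choice.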

Year:  2018        PMID: 29762634     DOI: 10.1093/bioinformatics/bty392

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  7 in total

1.  2019nCoVAS: Developing the Web Service for Epidemic Transmission Prediction, Genome Analysis, and Psychological Stress Assessment for 2019-nCoV.

Authors:  Ming Xiao; Guangdi Liu; Jianghang Xie; Zichun Dai; Zihao Wei; Ziyao Ren; Jun Yu; Le Zhang
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2021-08-06       Impact factor: 3.702

2.  Comprehensively benchmarking applications for detecting copy number variation.

Authors:  Le Zhang; Wanyu Bai; Na Yuan; Zhenglin Du
Journal:  PLoS Comput Biol       Date:  2019-05-28       Impact factor: 4.475

3.  Exploring the dynamics and interplay of human papillomavirus and cervical tumorigenesis by integrating biological data into a mathematical model.

Authors:  Wenting Wu; Lei Song; Yongtao Yang; Jianxin Wang; Hongtu Liu; Le Zhang
Journal:  BMC Bioinformatics       Date:  2020-05-05       Impact factor: 3.169

Review 4.  Artificial intelligence in cancer target identification and drug discovery.

Authors:  Yujie You; Xin Lai; Yi Pan; Huiru Zheng; Julio Vera; Suran Liu; Senyi Deng; Le Zhang
Journal:  Signal Transduct Target Ther       Date:  2022-05-10

5.  An integrated platform for Brucella with knowledge graph technology: From genomic analysis to epidemiological projection.

Authors:  Fubo Ma; Ming Xiao; Lin Zhu; Wen Jiang; Jizhe Jiang; Peng-Fei Zhang; Kang Li; Min Yue; Le Zhang
Journal:  Front Genet       Date:  2022-09-14       Impact factor: 4.772

Review 6.  Exploring the computational methods for protein-ligand binding site prediction.

Authors:  Jingtian Zhao; Yang Cao; Le Zhang
Journal:  Comput Struct Biotechnol J       Date:  2020-02-17       Impact factor: 7.271

7.  Developing the novel bioinformatics algorithms to systematically investigate the connections among survival time, key genes and proteins for Glioblastoma multiforme.

Authors:  Yujie You; Xufang Ru; Wanjing Lei; Tingting Li; Ming Xiao; Huiru Zheng; Yujie Chen; Le Zhang
Journal:  BMC Bioinformatics       Date:  2020-09-17       Impact factor: 3.169
