DeepTFactor: A deep learning-based tool for the prediction of transcription factors.

Gi Bae Kim, Ye Gao, Bernhard O Palsson, Sang Yup Lee.

Abstract

A transcription factor (TF) is a sequence-specific DNA-binding protein that modulates the transcription of a particular set of genes and thus regulates gene expression in the cell. TFs have commonly been predicted by analyzing sequence homology with the DNA-binding domains of already characterized TFs. As a result, TFs that show no homology to the reported ones are difficult to predict. Here we report the development of a deep learning-based tool, DeepTFactor, that predicts whether a given protein is a TF. DeepTFactor uses a convolutional neural network to extract features of a protein. It showed high performance in predicting TFs of both eukaryotic and prokaryotic origin, achieving F1 scores of 0.8154 and 0.8000, respectively. Analysis of the gradients of the prediction score with respect to the input suggested that DeepTFactor detects DNA-binding domains and other latent features for TF prediction. DeepTFactor predicted 332 candidate TFs in Escherichia coli K-12 MG1655. Among them, 84 belong to the y-ome, a collection of genes that lack experimental evidence of function. We experimentally validated the DeepTFactor predictions by further characterizing the genome-wide binding sites of three predicted TFs: YqhC, YiaU, and YahB. Furthermore, we have made available a list of 4,674,808 TFs predicted from 73,873,012 protein sequences in 48,346 genomes. DeepTFactor will serve as a useful tool for predicting TFs, which is necessary for understanding the regulatory systems of organisms of interest. We provide DeepTFactor as a stand-alone program, available at https://bitbucket.org/kaistsystemsbiology/deeptfactor.
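
The F1 scores quoted in the abstract are the harmonic mean of precision and recall over binary TF / non-TF calls. As a minimal sketch of how such a score is computed, the snippet below uses a hypothetical helper `f1_score` and made-up labels; neither the function nor the data comes from DeepTFactor or its evaluation sets.

```python
def f1_score(y_true, y_pred):
    """Return the F1 score for binary labels (1 = TF, 0 = non-TF)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive call: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for ten proteins (illustrative only, not from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(f"F1 = {f1_score(y_true, y_pred):.4f}")
```

Here three of the four true TFs are recovered and one non-TF is called a TF, so precision and recall are both 0.75 and so is the F1 score.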

Keywords:  ChIP-exo; deep learning; transcription factor; transcription regulation; y-ome

Year:  2021        PMID: 33372147      PMCID: PMC7812831          DOI: 10.1073/pnas.2021171118

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


  9 in total

1.  Complete Genome Sequencing Analysis of Deinococcus wulumuqiensis R12, an Extremely Radiation-Resistant Strain.

Authors:  Zijie Dai; Zhidong Zhang; Liying Zhu; Zhengming Zhu; Ling Jiang
Journal:  Curr Microbiol       Date:  2022-08-16       Impact factor: 2.343

2.  Synthetic Biology Meets Machine Learning.

Authors:  Brendan Fu-Long Sieow; Ryan De Sotto; Zhi Ren Darren Seet; In Young Hwang; Matthew Wook Chang
Journal:  Methods Mol Biol       Date:  2023

3.  Membrane contact probability: An essential and predictive character for the structural and functional studies of membrane proteins.

Authors:  Lei Wang; Jiangguo Zhang; Dali Wang; Chen Song
Journal:  PLoS Comput Biol       Date:  2022-03-30       Impact factor: 4.475

4.  Programming living sensors for environment, health and biomanufacturing.

Authors:  Xinyi Wan; Behide Saltepe; Luyang Yu; Baojun Wang
Journal:  Microb Biotechnol       Date:  2021-05-07       Impact factor: 6.575

5.  PredicTF: prediction of bacterial transcription factors in complex microbial communities using deep learning.

Authors:  Lummy Maria Oliveira Monteiro; João Pedro Saraiva; Rodolfo Brizola Toscan; Peter F Stadler; Rafael Silva-Rocha; Ulisses Nunes da Rocha
Journal:  Environ Microbiome       Date:  2022-02-08

6.  Integrated Analysis of Single-Molecule Real-Time Sequencing and Next-Generation Sequencing Reveals Insights into Drought Tolerance Mechanism of Lolium multiflorum.

Authors:  Qiuxu Liu; Fangyan Wang; Yang Shuai; Linkai Huang; Xinquan Zhang
Journal:  Int J Mol Sci       Date:  2022-07-18       Impact factor: 6.208

7.  Deep learning explains the biology of branched glycans from single-cell sequencing data.

Authors:  Rui Qin; Lara K Mahal; Daniel Bojar
Journal:  iScience       Date:  2022-09-19

8.  Unraveling the functions of uncharacterized transcription factors in Escherichia coli using ChIP-exo.

Authors:  Ye Gao; Hyun Gyu Lim; Hans Verkler; Richard Szubin; Daniel Quach; Irina Rodionova; Ke Chen; James T Yurkovich; Byung-Kwan Cho; Bernhard O Palsson
Journal:  Nucleic Acids Res       Date:  2021-09-27       Impact factor: 16.971

9.  Biolayer interferometry for DNA-protein interactions.

Authors:  John K Barrows; Michael W Van Dyke
Journal:  PLoS One       Date:  2022-02-02       Impact factor: 3.240
