BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data.

Jacob L Steenwyk, Thomas J Buida, Carla Gonçalves, Dayna C Goltz, Grace Morales, Matthew E Mead, Abigail L LaBella, Christina M Chavez, Jonathan E Schmitz, Maria Hadjifrangiskou, Yuanning Li, Antonis Rokas.

Abstract

Bioinformatic analysis (such as genome assembly quality assessment, alignment summary statistics, relative synonymous codon usage, file format conversion, and processing and analysis) is integrated into diverse disciplines in the biological sciences. Several command-line programs have been developed to conduct some of these individual analyses, but unified toolkits that conduct all of them are lacking. To address this gap, we introduce BioKIT, a versatile command-line toolkit that has, upon publication, 42 functions, several of which were community-sourced, that conduct routine and novel processing and analysis of genome assemblies, multiple sequence alignments, coding sequences, sequencing data, and more. To demonstrate the utility of BioKIT, we conducted a comprehensive examination of relative synonymous codon usage across 171 fungal genomes that use alternative genetic codes, showed that the novel metric of gene-wise relative synonymous codon usage can accurately estimate gene-wise codon optimization, evaluated the quality and characteristics of 901 eukaryotic genome assemblies, and calculated alignment summary statistics for 10 phylogenomic data matrices. BioKIT will help facilitate and streamline sequence analysis workflows. BioKIT is freely available under the MIT license from GitHub (https://github.com/JLSteenwyk/BioKIT), PyPI (https://pypi.org/project/jlsteenwyk-biokit/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/jlsteenwyk-biokit). Documentation, user tutorials, and instructions for requesting new features are available online (https://jlsteenwyk.com/BioKIT).
© The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. All rights reserved. For permissions, please email: journals.permissions@oup.com.
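
As an illustration of one metric named in the abstract, the following is a minimal Python sketch of relative synonymous codon usage (RSCU) under its conventional definition: a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. It is not taken from BioKIT's source code. The function name `rscu`, the use of the standard genetic code (NCBI translation table 1), and the exclusion of stop codons are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
# Assumption for this sketch: genomes with alternative codes need another table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(coding_sequences):
    """Return {codon: RSCU} for an iterable of in-frame coding sequences.

    Hypothetical helper for illustration; stop codons are left out as a
    simplifying choice, and codons with ambiguous bases are skipped.
    """
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input; RSCU undefined
        expected = total / len(codons)  # count under equal synonymous usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    for codon, value in sorted(rscu(["ATGTTTTTTTTC"]).items()):
        print(codon, round(value, 3))
```

Values above 1 indicate a codon used more often than expected under equal usage of its synonymous codons, and values below 1 indicate one used less often.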

Keywords: bioinformatics; codon; gene-wise relative synonymous codon usage; genetic code; genome assembly quality; multiple sequence alignment

Year: 2022; PMID: 35536198; PMCID: PMC9252278; DOI: 10.1093/genetics/iyac079

Source DB: PubMed; Journal: Genetics; ISSN: 0016-6731; Impact factor: 4.402


