Jessica P Hekman, Jennifer L Johnson, Anna V Kukekova.
Abstract
Domesticated species occupy a special place in the human world due to their economic and cultural value. In the era of genomic research, domesticated species provide unique advantages for investigation of diseases and complex phenotypes. RNA sequencing, or RNA-seq, has recently emerged as a new approach for studying transcriptional activity of the whole genome, changing the focus from individual genes to gene networks. RNA-seq analysis in domesticated species may complement genome-wide association studies of complex traits with economic importance or direct relevance to biomedical research. However, RNA-seq studies are more challenging in domesticated species than in model organisms. These challenges are at least in part associated with the lack of quality genome assemblies for some domesticated species and the absence of genome assemblies for others. In this review, we discuss strategies for analyzing RNA-seq data, focusing particularly on questions and examples relevant to domesticated species.
Keywords: NGS; RNA-seq; domestication; transcriptomics
Year: 2016 PMID: 26917953 PMCID: PMC4756862 DOI: 10.4137/BBI.S29334
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
RNA-seq bioinformatic workflow for calling differentially expressed genes.
| STEP | TOOLS | CHALLENGES |
|---|---|---|
| 1. Remove low-quality reads, barcodes, and adapters | Fastx-toolkit, FLEXBAR, or Trimmomatic | Follow recommended protocol |
| 2. Remove mitochondrial and ribosomal sequences | Bowtie2 | Sequences from the same or related species should be used |
| 3. Align to reference genome | TopHat2 | Incomplete or nonexistent reference genome |
| 4. Call differentially expressed genes | DESeq2, edgeR, or limma | Incomplete or nonexistent reference genome annotation |
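The first three steps of the workflow above can be sketched as an ordered list of command lines. This is only an illustration under stated assumptions: the file names, index names, and trimming thresholds are hypothetical, the `trimmomatic` wrapper executable is assumed to be on the PATH (some installations invoke it as `java -jar` instead), and the options shown are a small subset of what each tool accepts. Step 4 (DESeq2, edgeR, or limma) runs in R on counts derived from the alignments and is not shown.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    description: str
    command: List[str]


def build_workflow(reads: str, adapters: str, contaminant_index: str,
                   genome_index: str, annotation_gtf: str) -> List[Step]:
    """Build (but do not run) command lines for steps 1-3 of the table."""
    trimmed = "trimmed.fastq"    # hypothetical intermediate file
    filtered = "filtered.fastq"  # hypothetical intermediate file
    return [
        Step("Remove low-quality reads and adapters (Trimmomatic, single-end)",
             ["trimmomatic", "SE", reads, trimmed,
              f"ILLUMINACLIP:{adapters}:2:30:10",
              "SLIDINGWINDOW:4:15", "MINLEN:36"]),
        # Reads that align to mitochondrial/rRNA sequences are discarded;
        # --un writes the reads that did NOT align, which are kept.
        Step("Remove mitochondrial and ribosomal reads (Bowtie2)",
             ["bowtie2", "-x", contaminant_index, "-U", trimmed,
              "--un", filtered, "-S", "contaminants.sam"]),
        Step("Align to reference genome (TopHat2)",
             ["tophat2", "-G", annotation_gtf, "-o", "tophat_out",
              genome_index, filtered]),
    ]


for step in build_workflow("reads.fastq", "adapters.fa", "mito_rrna",
                           "genome", "genes.gtf"):
    print(f"# {step.description}")
    print(shlex.join(step.command))
```

Keeping the commands as data rather than executing them directly makes it easy to inspect or log the exact invocation before running it, which helps when a reference genome or annotation has to be swapped for that of a related species.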
Properties of assemblies of human, mouse, and domesticated species.
| SPECIES | ASSEMBLY | SEQUENCING TECHNOLOGY | SCAFFOLDS | SCAFFOLD N50 (BP) | TOTAL GAP LENGTH (BP) |
|---|---|---|---|---|---|
| Human | GRCh38.p5 | Sanger | 797 | 59,364,414 | 161,368,151 |
| Mouse | GRCm38.p4 | Sanger | 293 | 52,589,046 | 79,356,756 |
| Dog | CanFam3.1 | Sanger | 3,310 | 45,876,610 | 18,261,639 |
| Chicken | Gallus_gallus-4.0 | Sanger | 16,847 | 12,877,381 | 14,074,301 |
| Cow | Btau_4.6.1 | Sanger | 13,387 | 2,599,288 | 176,429,395 |
| Pig | Sscrofa10.2 | Sanger and NGS | 9,906 | 576,008 | 289,397,178 |
| Turkey | Turkey 5.0 | NGS | 233,806 | 3,801,642 | 35,294,427 |
| Yak | BosGru 2.0 | NGS | 41,192 | 1,407,960 | 120,154,638 |
| Ferret | MusPutFur1.0 | NGS | 7,783 | 9,335,154 | 132,851,443 |
Note: All data were downloaded from ncbi.nlm.nih.gov on December 2, 2015.
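The scaffold N50 column above uses the standard definition: the length L such that scaffolds of length L or longer together contain at least half of all assembled bases. A minimal sketch of that calculation follows; the scaffold lengths in the usage line are made-up toy values, not data from any assembly in the table.

```python
from typing import Iterable


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # assembly defines the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


# Toy example with hypothetical scaffold lengths.
print(scaffold_n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A larger N50 with fewer scaffolds, as for the human, mouse, and dog assemblies in the table, indicates a more contiguous assembly than a small N50 spread over many scaffolds.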
Numbers of curated and uncurated transcripts annotated in the RefSeq database for human, mouse, and domesticated species.
| SPECIES | SPECIES NAME USED IN REFSEQ SEARCH | TOTAL REFSEQ TRANSCRIPTS | CURATED REFSEQ TRANSCRIPTS | UNCURATED REFSEQ TRANSCRIPTS |
|---|---|---|---|---|
| Human | | 100,068 | 39,623 | 60,445 |
| Mouse | | 78,241 | 29,900 | 48,341 |
| Dog | Canis lupus familiaris | 47,095 | 1,675 | 45,420 |
| Chicken | | 32,244 | 6,197 | 26,047 |
| Cow | | 70,342 | 13,329 | 57,013 |
| Pig | | 47,445 | 4,154 | 43,291 |
| Turkey | | 26,450 | 93 | 26,357 |
| Yak | | 28,868 | 7 | 28,861 |
| Ferret | | 48,113 | 61 | 48,052 |
Note: All data were downloaded by searching “Species name”[porgn] AND refseq[filter] AND biomol_mrna[PROP] (eg, “Canis lupus familiaris”[porgn] AND refseq[filter] AND biomol_mrna[PROP]) at http://www.ncbi.nlm.nih.gov/nuccore/ on December 10, 2015.
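The search pattern in the note and the gap between curated and uncurated annotation can be made concrete with a short sketch. The function and variable names below are illustrative only; the counts are copied from the table above, and the query builder simply reproduces the pattern given in the note (with straight quotes) without contacting NCBI.

```python
# (total, curated, uncurated) RefSeq mRNA counts copied from the table above.
REFSEQ_COUNTS = {
    "Human": (100_068, 39_623, 60_445),
    "Mouse": (78_241, 29_900, 48_341),
    "Dog": (47_095, 1_675, 45_420),
    "Chicken": (32_244, 6_197, 26_047),
    "Cow": (70_342, 13_329, 57_013),
    "Pig": (47_445, 4_154, 43_291),
    "Turkey": (26_450, 93, 26_357),
    "Yak": (28_868, 7, 28_861),
    "Ferret": (48_113, 61, 48_052),
}


def refseq_mrna_query(species_name: str) -> str:
    """Build the nuccore search string described in the note."""
    return f'"{species_name}"[porgn] AND refseq[filter] AND biomol_mrna[PROP]'


def curated_fraction(total: int, curated: int) -> float:
    """Fraction of RefSeq transcripts that are curated."""
    return curated / total


print(refseq_mrna_query("Canis lupus familiaris"))
for species, (total, curated, _uncurated) in sorted(
        REFSEQ_COUNTS.items(),
        key=lambda item: curated_fraction(item[1][0], item[1][1]),
        reverse=True):
    print(f"{species:8s} {curated_fraction(total, curated):.2%} curated")
```

Sorting by curated fraction separates human and mouse from the domesticated species, several of which have only a handful of curated transcripts; this is the annotation challenge listed for step 4 of the workflow.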