Carolina N Correia1, Kirsten E McLoughlin1, Nicolas C Nalpas1, David A Magee1, John A Browne1, Kevin Rue-Albrecht1, Stephen V Gordon2,3, David E MacHugh1,3.
Abstract
RNA-seq has emerged as an important technology for measuring gene expression in peripheral blood samples collected from humans and other vertebrate species. In particular, transcriptomics analyses of whole blood can be used to study immunobiology and develop novel biomarkers of infectious disease. However, an obstacle to these methods in many mammalian species is the presence of reticulocyte-derived globin mRNAs in large quantities, which can complicate RNA-seq library sequencing and impede detection of other mRNA transcripts. A range of supplementary procedures for targeted depletion of globin transcripts have, therefore, been developed to alleviate this problem. Here, we use comparative analyses of RNA-seq data sets generated from human, porcine, equine, and bovine peripheral blood to systematically assess the impact of globin mRNA on routine transcriptome profiling of whole blood in cattle and horses. The results of these analyses demonstrate that total RNA isolated from equine and bovine peripheral blood contains very low levels of globin mRNA transcripts, thereby negating the need for globin depletion and greatly simplifying blood-based transcriptomic studies in these two domestic species.
Keywords: RNA-seq; blood; cattle; globin; horses; pigs; reticulocyte; transcriptome
Year: 2018 PMID: 30154823 PMCID: PMC6102425 DOI: 10.3389/fgene.2018.00278
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Schematic of the bioinformatics workflow for RNA-seq data acquisition, quality control, analysis, and interpretation.
Status of current human, porcine, equine, and bovine haemoglobin gene annotations in the Ensembl, NCBI RefSeq, and UCSC databases.
| | Ensembl | NCBI RefSeq | UCSC |
| Annotation release | Human release 92 (April 2018) | NCBI | hg38.refGene annotation track (last updated in May 2018) |
| Genome assembly used to derive annotation | GRCh38.p12, GCA_000001405.27, December 2017 | GRCh38.p12, GCF_000001405.38, December 2017 | GRCh38, GCF_000001405.15, December 2013 |
| | Annotated with gene ID ENSG00000206172 | Annotated with Entrez Gene ID 3039 | Annotated with Entrez Gene ID 3039 |
| | Annotated with gene ID ENSG00000188536 | Annotated with Entrez Gene ID 3040 | Annotated with Entrez Gene ID 3040 |
| | Annotated with gene ID ENSG00000244734 | Annotated with Entrez Gene ID 3043 | Annotated with Entrez Gene ID 3043 |
| Annotation release | Pig release 92 (April 2018) | NCBI | susScr3.refGene annotation track (last updated in May 2018) |
| Genome assembly used to derive annotation | Sscrofa11.1, GCA_000003025.6, February 2017 | Sscrofa11.1, GCF_000003025.6, February 2017 | Sscrofa11.1, GCF_000003025.6, February 2017 |
| | Absent from current annotation release, accessible via online search with ID 110259958.1 | Annotated with Entrez Gene ID 110259958 | Absent |
| | Absent from current annotation release, accessible via online search with ID 100737768.1 | Annotated with Entrez Gene ID 100737768 | Absent |
| | Annotated with gene ID ENSSSCG00000014725 | Annotated with Entrez Gene ID 407066 | Annotated with Entrez Gene ID 407066 |
| Annotation release | Horse release 92 (April 2018) | NCBI | equCab2.refGene annotation track (last updated in May 2018) |
| Genome assembly used to derive annotation | EquCab 2, GCA_000002305.1, September 2007 | EquCab3, GCF_002863925.1, May 2018 | EquCab 2, GCA_000002305.1, September 2007 |
| | Absent | Annotated with Entrez Gene ID 100036557 | Annotated with Entrez Gene ID 100036557 |
| | Absent | Annotated with Entrez Gene ID 100036558 | Annotated with Entrez Gene ID 100036558 |
| | Annotated with gene ID ENSECAG00000010020 | Annotated with Entrez Gene ID 100054109 | Annotated with Entrez Gene ID 100054109 |
| Annotation release | Cow release 92 (April 2018) | NCBI | bosTau8.refGene annotation track (last updated in May 2018) |
| Genome assembly used to derive annotation | UMD3.1, GCA_000003055.3, November 2009 | ARS-UCD1.2, GCF_002263795.1, April 2018 | UMD3.1.1, GCA_000003055.4, June 2014 |
| | Annotated as | Annotated with Entrez Gene ID 100140149 | Absent |
| | Annotated as | Annotated with Entrez Gene ID 512439 | Annotated with Entrez Gene ID 512439 |
| | Annotated with gene ID ENSBTAG00000038748 | Annotated with Entrez Gene ID 280813 | Annotated with Entrez Gene ID 280813 |
(Zerbino et al.)
(O'Leary et al.)
(Kent et al.)
(Karolchik et al.)
(Tyner et al.)
Summary of RNA-seq filtering/trimming and mapping statistics.
| Undepleted | Inward unstranded | PE | 40,218,886 | 8,203,217 | 20.4% | 32,015,669 | 25,593,239 | 80.9% | NCBI RefSeq | |
| Globin depleted | Inward unstranded | PE | 36,874,759 | 10,704,088 | 29.0% | 26,170,671 | 20,371,624 | 75.8% | NCBI RefSeq | |
| Undepleted | Inward unstranded | PE | 39,036,515 | 4,613,991 | 11.8% | 28,685,437 | 25,129,140 | 87.7% | NCBI RefSeq | |
| Globin depleted | Inward unstranded | PE | 31,339,886 | 3,899,427 | 12.4% | 22,867,049 | 19,959,995 | 87.3% | NCBI RefSeq | |
| Undepleted | Unstranded | SE | 24,271,141 | 38,892 | 0.2% | 14,850,797 | 11,387,774 | 76.5% | NCBI RefSeq | |
| Undepleted | Inward stranded forward | PE | 20,495,983 | 3,474,597 | 17.0% | 17,021,386 | 12,353,147 | 72.6% | NCBI RefSeq |
The Salmon tool treats each fragment as a single read (for single-end, SE, RNA-seq libraries) or as a read pair (for paired-end, PE, RNA-seq libraries).
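The header row of the table above did not survive, so the following is only a sketch of how its figures appear to relate. It assumes (this is an inference from the arithmetic, not something stated in the text) that the first three numeric columns are reads before filtering, reads removed by filtering/trimming, and the percentage removed. The names `rows` and `percent` are illustrative.

```python
# Values copied from the table above. The column meanings are an ASSUMPTION
# inferred from the numbers, because the header row is missing.
rows = [
    # (treatment, layout, reads_before_filtering, reads_removed, stated_pct_removed)
    ("Undepleted", "PE", 40_218_886, 8_203_217, 20.4),
    ("Globin depleted", "PE", 36_874_759, 10_704_088, 29.0),
    ("Undepleted", "PE", 39_036_515, 4_613_991, 11.8),
    ("Globin depleted", "PE", 31_339_886, 3_899_427, 12.4),
    ("Undepleted", "SE", 24_271_141, 38_892, 0.2),
    ("Undepleted", "PE", 20_495_983, 3_474_597, 17.0),
]


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Recompute the removal percentage for every row.
recomputed = [percent(removed, before) for _, _, before, removed, _ in rows]

for (treatment, layout, _, _, stated), value in zip(rows, recomputed):
    print(f"{treatment:<16} {layout}  stated={stated:>5}%  recomputed={value:>5}%")
```

Only the removal percentage is recomputed here. The mapping percentages in the table are not checked the same way, because they may be averages of per-sample values rather than ratios of the averaged counts.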
Figure 2. Ridge plots showing density of sample gene-level transcripts per million (TPM). Results are shown from undepleted (purple) or globin-depleted (green) treatments.
Figure 3. Average proportions of haemoglobin genes to total expressed genes from peripheral blood RNA-seq data in humans, pigs, horses, and cattle.
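To make the quantity in Figure 3 concrete, the sketch below shows one way a haemoglobin proportion could be derived from gene-level TPM values. It is not the authors' code: the counts and effective lengths are invented toy values, `GENE_X` and `GENE_Y` are placeholder names, and the assumption that the proportion is haemoglobin TPM divided by total TPM is ours. Salmon already reports TPM, so the `tpm` function is included only to show the standard definition.

```python
def tpm(counts, effective_lengths):
    """Standard TPM: length-normalised rate, scaled so that all genes sum to 1e6."""
    rates = {gene: counts[gene] / effective_lengths[gene] for gene in counts}
    total_rate = sum(rates.values())
    return {gene: 1e6 * rate / total_rate for gene, rate in rates.items()}


def haemoglobin_fraction(tpm_values, haemoglobin_genes):
    """Share of total TPM contributed by the listed haemoglobin genes (assumed definition)."""
    hb_total = sum(v for gene, v in tpm_values.items() if gene in haemoglobin_genes)
    return hb_total / sum(tpm_values.values())


# Hypothetical toy sample: the numbers are illustrative, not taken from the study.
counts = {"HBA1": 5000, "HBB": 9000, "GENE_X": 1200, "GENE_Y": 300}
effective_lengths = {"HBA1": 500.0, "HBB": 600.0, "GENE_X": 2000.0, "GENE_Y": 1500.0}

sample_tpm = tpm(counts, effective_lengths)
fraction = haemoglobin_fraction(sample_tpm, {"HBA1", "HBB"})
print(f"Haemoglobin share of total TPM: {fraction:.1%}")
```

Averaging this per-sample fraction within each species and treatment group would give a figure comparable in spirit to the average proportions plotted.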
Summary statistics for haemoglobin gene-level transcripts per million (TPM).
| Undepleted | 12 | 191,209 | 16,601 | ||
| Globin depleted | 12 | 66,718 | 23,557 | ||
| Undepleted | 12 | 300,000 | 29,523 | ||
| Globin depleted | 12 | 79,818 | 31,259 | ||
| Undepleted | 12 | 200,000 | 43,262 | ||
| Globin depleted | 12 | 20,770 | 6,706 | ||
| Undepleted | 12 | 86 | 30 | ||
| Globin depleted | 12 | 13 | 13 | ||
| Undepleted | 12 | 243,864 | 31,605 | ||
| Globin depleted | 12 | 84,021 | 86,095 | ||
| Undepleted | 12 | 476,284 | 52,939 | ||
| Globin depleted | 12 | 136,172 | 128,232 | ||
| Undepleted | 37 | 443 | 560 | ||
| Undepleted | 37 | 653 | 789 | ||
| Undepleted | 37 | 1,024 | 1,144 | ||
| Undepleted | 10 | 21 | 29 | ||
| Undepleted | 10 | 1,101 | 1,102 | ||
| Undepleted | 10 | 532 | 469 |
Figure 4. Distribution of haemoglobin gene-level transcripts per million (TPM). Results are shown from undepleted (purple) or globin-depleted (green) treatments.