Shuhua Zhan, Cortland Griswold, Lewis Lukens.
Abstract
BACKGROUND: Genetic variation for gene expression is a source of phenotypic variation for natural and agricultural species. The common approach to map and to quantify gene expression from genetically distinct individuals is to assign their RNA-seq reads to a single reference genome. However, RNA-seq reads from alleles dissimilar to this reference genome may fail to map correctly, causing transcript levels to be underestimated. Presently, the extent of this mapping problem is not clear, particularly in highly diverse species. We investigated whether mapping bias occurred and whether chromosomal features were associated with mapping bias. Zea mays is a model species for addressing these questions, given that it has genotypically distinct and well-studied genetic lines.
Keywords: Gene coexpression analysis; Genetic diversity; Maize; Mapping bias; RNA-Seq; Sequence divergence; Transcriptome variation; eQTL analysis
Year: 2021 PMID: 33874908 PMCID: PMC8056621 DOI: 10.1186/s12864-021-07577-3
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Numbers of genes with B73 and Mo17 positively acting cis regulatory alleles using B73 and Mo17 reference genomes revealed read mapping preferences for reference alleles
| | B73 reference (a) | Mo17 reference (b) |
|---|---|---|
| Total cis-eQTL (c) | 9306 | 7985 |
| B73 positive cis-eQTL | 6341 (68.1%) | 2816 (35.3%) |
| Mo17 positive cis-eQTL | 2965 (31.9%) | 5169 (64.7%) |
(a, b) RNA-seq reads were mapped to the B73 (a) or Mo17 (b) reference using STAR with the default criteria as given in Supplemental Information.
(c) The numbers of genes with cis-eQTL were recorded excluding genes with both cis and trans eQTL.
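The percentages in the table follow directly from the gene counts. As a minimal sketch (the dictionary layout and the function name `percent_by_allele` are illustrative choices, not part of the study's pipeline), the reference preference can be recomputed like this:

```python
# Counts copied from the table above (genes with cis-eQTL only).
counts = {
    "B73 reference": {"B73 positive": 6341, "Mo17 positive": 2965},
    "Mo17 reference": {"B73 positive": 2816, "Mo17 positive": 5169},
}


def percent_by_allele(by_allele):
    """Return each allele class as a percentage of the column total (1 d.p.)."""
    total = sum(by_allele.values())
    return {allele: round(100 * n / total, 1) for allele, n in by_allele.items()}


percentages = {ref: percent_by_allele(c) for ref, c in counts.items()}

for ref, pct in percentages.items():
    # Print the reference genome, the total cis-eQTL count, and the split.
    print(ref, sum(counts[ref].values()), pct)
```

With either reference, the allele matching the reference genome accounts for roughly two thirds of the positive cis-eQTL, which is the read mapping preference the table caption describes.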
Fig. 1 Depiction of a correlation test to determine if preferential alignment affected gene transcript estimates. a A chromosomal region whose genes' transcripts are all inferred to be correlated across RILs indicates preferential read alignment. A hypothetical region has five linked genes (A-E) that are in complete linkage disequilibrium within the B73 x Mo17 RIL population. Individual members of the RIL population are arbitrarily labelled 1–97. Preferential read alignment causes the B73 allele to have a positive effect on all genes' transcript abundance estimates relative to the Mo17 allele, leading to one group of coexpressed genes. b A chromosomal region with two groups of genes coexpressed across RILs indicates unbiased read alignments across the region's genes. Relative to Mo17, B73 alleles increase transcript abundances for genes A, C, and D and decrease transcript abundances for genes B and E. Genes A, C, and D are coexpressed, and genes B and E are also coexpressed.
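The logic of the two panels can be illustrated with a toy example. This is a simplified sketch, not the authors' test: the expression values are invented, only six RILs are used instead of 97, and genes are grouped simply by the sign of their Pearson correlation with the first gene. The names `pearson` and `coexpression_groups` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_groups(expr):
    """Split genes by the sign of their correlation with the first gene."""
    names = list(expr)
    anchor = expr[names[0]]
    same = [g for g in names if pearson(anchor, expr[g]) > 0]
    opposite = [g for g in names if g not in same]
    return [group for group in (same, opposite) if group]


# Invented values: RILs 1-3 carry the B73 haplotype, RILs 4-6 the Mo17 haplotype.
high_in_b73 = [9, 10, 11, 4, 5, 6]
low_in_b73 = [2, 3, 4, 7, 8, 9]

# Panel a: the B73 allele raises every estimate (preferential alignment).
biased = {gene: high_in_b73 for gene in "ABCDE"}

# Panel b: B73 raises A, C, D and lowers B, E (unbiased alignment).
unbiased = {
    "A": high_in_b73, "B": low_in_b73, "C": high_in_b73,
    "D": high_in_b73, "E": low_in_b73,
}

print(coexpression_groups(biased))
print(coexpression_groups(unbiased))
```

In this toy setting the biased region collapses into a single coexpressed group, while the unbiased region splits into two groups with opposite allelic effects.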
Fig. 2 The relationship between the proportion of cis-eQTL that had positive B73 effects when aligned to the B73 reference genome and the number of SNPs between B73 and Mo17 genes. The line plots are a lowess smooth of the number of SNPs per gene per 2 Mb of chromosome 3 sequence and the proportion of B73 positive cis-eQTL per gene per 2 Mb of chromosome 3 sequence. The x-axis is the physical location of intervals in Mb along chromosome 3. The left y-axis is the average number of SNPs within cis-eQTL genes in the intervals. The right y-axis is the proportion of B73 positive cis-eQTL out of the total cis-eQTL in the interval. An arrow indicates the centromere location.
Fig. 3 For genes positively affected by B73 (a) and Mo17 (b) cis-eQTL, the number of genic SNPs and expression level differences of homozygous B73 and Mo17 lines were plotted. Exon sequence divergence was calculated as the number of SNPs matching exons in a gene divided by the total exon length and multiplied by 1000. The numbers in parentheses represent the numbers of genes in each group. The y-axis is the absolute value of the allelic expression level difference of B73 minus Mo17 in FPKM, or two times the expected additive effect of the locus. a Expression differences between B73 and Mo17 were highly correlated with numbers of SNPs per 1 kb exon for the 5559 B73 positive cis-eQTL genes with SNP information. b Expression level differences between Mo17 and B73 were weakly correlated with numbers of SNPs per 1 kb exon for the 2561 Mo17 positive cis-eQTL genes.
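The exon sequence divergence measure in the caption is a simple rate: SNPs per kilobase of exon. A minimal sketch follows, assuming exon lengths are given in base pairs; the function name `snps_per_kb_exon` and the example gene are hypothetical.

```python
def snps_per_kb_exon(n_exonic_snps, exon_lengths_bp):
    """Number of exonic SNPs divided by total exon length, multiplied by 1000."""
    total_bp = sum(exon_lengths_bp)
    if total_bp <= 0:
        raise ValueError("total exon length must be positive")
    return 1000 * n_exonic_snps / total_bp


# Hypothetical gene: 6 SNPs across three exons totalling 1500 bp.
example = snps_per_kb_exon(6, [300, 450, 750])
print(example)  # 4.0 SNPs per 1 kb of exon
```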
Median percentages of reads from B73 and Mo17 that mapped to UTRs relative to exons among genes upregulated by B73 or Mo17 eQTL
| | B73 reads | Mo17 reads | All reads (c) |
|---|---|---|---|
| B73 + eQTL | 17.1 (a) | 14.3 | 15.7 |
| Mo17 + eQTL | 13.7 (b) | 13.1 | 13.4 |
(a) Among genes upregulated by B73 alleles at cis eQTL, the percentage of reads that mapped to UTRs was significantly higher for reads derived from B73 alleles than for reads derived from Mo17 alleles. Wilcoxon signed rank test, V = 7,544,600, P < 2.2e-16
(b) Among genes upregulated by Mo17 alleles at cis eQTL, the percentage of reads that mapped to UTRs was also significantly higher for reads derived from B73 alleles than for reads derived from Mo17 alleles. Wilcoxon signed rank test, V = 1,324,300, P = 1.1e-8
(c) The percentage of all reads, regardless of allelic origin, that mapped to UTRs was significantly higher for genes upregulated by B73 alleles than for genes upregulated by Mo17 alleles. Wilcoxon rank sum test, W = 25,439,952, P < 2.2e-16
Median percentages of reads from B73 and Mo17 that mapped to splice junctions relative to all exons for genes with B73 or Mo17 positive eQTL
| | B73 reads | Mo17 reads | All reads (c) |
|---|---|---|---|
| B73 + eQTL | 15.2 (a) | 15.3 | 15.2 |
| Mo17 + eQTL | 15.6 (b) | 15.8 | 15.7 |
(a) Among genes upregulated by B73 alleles, the percentage of reads that mapped to splice sites was significantly higher for Mo17 derived reads than for B73 derived reads. Wilcoxon signed rank test, V = 5,474,100, P = 4.947e-12
(b) Among genes upregulated by Mo17 alleles, the percentage of reads that mapped to splice sites was significantly higher for Mo17 derived reads than for B73 derived reads. Wilcoxon signed rank test, V = 1,269,200, P = 0.019
(c) The percentage of all reads that mapped to splice sites was significantly lower in genes upregulated by B73 alleles than in genes upregulated by Mo17 alleles. Wilcoxon rank sum test, W = 23,696,329, P = 4.518e-4
Fig. 4 Comparison of alignment criteria on cis-eQTL effect size. Each cis-eQTL for which the B73 allele increased a target gene’s expression was labelled “yes” if its effect estimated using the most relaxed alignment criteria was lower than its effect using the default alignment criteria. The proportion of “yes” genes was calculated. The cis-eQTL effect was estimated as the average of the mean expression of RILs with the B73 genotype minus the mean expression of RILs with the Mo17 genotype. A large proportion of B73 positive cis-eQTLs had greater effects using default alignment criteria (binomial test, P < 2.2e-16).
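The effect estimate and the "yes" labelling described in the caption can be sketched as follows. This is an illustration under stated assumptions: the effect is read here as the mean expression of B73-genotype RILs minus the mean expression of Mo17-genotype RILs, the expression values are invented, and the function names `cis_eqtl_effect` and `proportion_reduced` are hypothetical.

```python
from statistics import mean


def cis_eqtl_effect(expression, genotypes):
    """Mean expression of B73-genotype RILs minus that of Mo17-genotype RILs."""
    b73 = [x for x, g in zip(expression, genotypes) if g == "B73"]
    mo17 = [x for x, g in zip(expression, genotypes) if g == "Mo17"]
    return mean(b73) - mean(mo17)


def proportion_reduced(default_effects, relaxed_effects):
    """Fraction of cis-eQTL whose relaxed-criteria effect is below the default one."""
    flags = [relaxed < default
             for default, relaxed in zip(default_effects, relaxed_effects)]
    return sum(flags) / len(flags)


# Invented example: one gene measured in four RILs under two alignment settings.
genotypes = ["B73", "B73", "Mo17", "Mo17"]
effect_default = cis_eqtl_effect([10, 12, 4, 6], genotypes)  # 11 - 5
effect_relaxed = cis_eqtl_effect([9, 11, 4, 6], genotypes)   # 10 - 5

# Invented effects for two cis-eQTL: the first shrinks, the second does not.
share_yes = proportion_reduced([6.0, 3.0], [5.0, 3.5])
print(effect_default, effect_relaxed, share_yes)
```

If alignment criteria had no influence, about half of the B73 positive cis-eQTL would be expected to show a smaller effect under relaxed criteria, which is the null proportion a binomial test would compare against.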
Number of genes with B73 and Mo17 positively acting, cis-regulatory alleles detected in different analyses
| Methods (cis-eQTL only) | STAR-StringTie, default criteria | STAR-StringTie, relaxed criteria | STAR-StringTie, most relaxed criteria | TopHat2-Cufflinks, default criteria | TopHat2-Cufflinks, relaxed criteria | TopHat2-Cufflinks, most relaxed criteria | Salmon, default criteria |
|---|---|---|---|---|---|---|---|
| Total | 9306 | 9256 | 9260 | 11,908 | 11,720 | 11,643 | 7855 |
| B73 positive cis-eQTL | 6341 (68.1%) | 6203 (67.0%) | 6172 (66.6%) | 9905 (83.2%) | 9639 (82.1%) | 9504 (81.6%) | 4959 (63.1%) |
| Mo17 positive cis-eQTL | 2965 (31.9%) | 3053 (33.0%) | 3088 (33.3%) | 2003 (16.8%) | 2081 (17.8%) | 2139 (18.4%) | 2896 (36.9%) |