Yanjun Zan, Xia Shen, Simon K G Forsberg, Örjan Carlborg.
Abstract
An increased knowledge of the genetic regulation of expression in Arabidopsis thaliana is likely to provide important insights into the basis of the plant's extensive phenotypic variation. Here, we reanalyzed two publicly available datasets with genome-wide data on genetic and transcript variation in large collections of natural A. thaliana accessions. Transcripts from more than half of all genes were detected in the leaves of all accessions, and from nearly all annotated genes in at least one accession. Thousands of genes had high transcript levels in some accessions, but no transcripts at all in others, and this pattern was correlated with the genome-wide genotype. In total, 2669 eQTL were mapped in the larger population, and 717 of them were replicated in the other population. A total of 646 cis-eQTL-regulated genes that lacked detectable transcripts in some accessions were found, and for 159 of these we identified one or several common structural variants in the populations that were shown to be likely contributors to the lack of detectable RNA transcripts for these genes. This study thus provides new insights into the overall genetic regulation of global gene expression diversity in the leaf of natural A. thaliana accessions. Further, it also shows that strong cis-acting polymorphisms, many of which are likely to be structural variations, make important contributions to the transcriptional variation in the worldwide A. thaliana population.
Keywords: Arabidopsis thaliana; RNA sequencing; eQTL mapping; gene expression; structural variation
Year: 2016 PMID: 27226169 PMCID: PMC4978887 DOI: 10.1534/g3.116.030874
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Overlap of RNA-seq scored transcripts in the leaf of 140 natural A. thaliana accessions (Schmitz ) (SCHMITZ-data; Normalized FPKM > 0) and 107 Swedish natural A. thaliana accessions (Dubin ) (DUBIN-data; RPKM > 0). The numbers of detected transcripts in all accessions of the respective datasets are shown in yellow. The numbers of detected transcripts in at least one, but not all, of the accessions in the respective datasets are shown in purple.
Figure 2. (A) Distribution of the number of genes with transcripts in the leaf of 140 natural A. thaliana accessions (Schmitz ) scored by RNA-seq (Normalized FPKM > 0). (B) Relationship between the ranks of the average transcript levels for all genes with transcripts detected in at least one accession (y-axis), and the number of accessions in (Schmitz ) where transcripts for the gene are found (x-axis). Each dot in the plot represents one of the 33,265 genes with FPKM > 0 in at least one accession of (Schmitz ). The transcript-level rank is based on average transcript levels in the accessions where transcripts for a particular gene are detected. Because of this, the ranks are less precise for transcripts present in fewer accessions. (C) Distribution of the number of genes with detected transcripts in the leaf of 107 Swedish natural A. thaliana accessions (Dubin ) scored by RNA-sequencing (RPKM > 0). (D) Relationship between the ranks of the average transcript levels for all genes with detected transcripts in at least one accession and the number of accessions in (Dubin ) where the gene is expressed. Each dot in the plot represents one of the 25,382 genes with RPKM > 0 in at least one accession of (Dubin ).
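The detection criterion used in Figures 1 and 2 (a gene counts as detected in an accession when its normalized FPKM or RPKM is above 0) can be illustrated with a minimal sketch. The function names and the toy expression values below are hypothetical and are not taken from either dataset.

```python
from typing import Dict, List


def detection_counts(expr: Dict[str, List[float]]) -> Dict[str, int]:
    """For each gene, count the accessions in which its transcript level is > 0."""
    return {gene: sum(1 for v in values if v > 0) for gene, values in expr.items()}


def summarize_detection(expr: Dict[str, List[float]], n_accessions: int) -> Dict[str, List[str]]:
    """Split genes into those detected in all accessions and those detected
    in at least one, but not all, accessions."""
    counts = detection_counts(expr)
    in_all = sorted(g for g, c in counts.items() if c == n_accessions)
    in_some = sorted(g for g, c in counts.items() if 0 < c < n_accessions)
    return {"all": in_all, "some_not_all": in_some}


# Toy data (hypothetical): three genes scored in three accessions.
toy = {
    "geneA": [5.1, 3.2, 7.7],  # detected everywhere
    "geneB": [0.0, 2.4, 0.0],  # detected in one accession only
    "geneC": [0.0, 0.0, 0.0],  # never detected
}
summary = summarize_detection(toy, n_accessions=3)
print(summary)
```

The per-gene counts from `detection_counts` correspond to the x-axis quantity in panels B and D, under the assumption that detection is scored as a simple threshold at zero.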
Figure 3. Correlation between the genetic and transcriptome covariances among 4317 genes with transcripts detected in between 14 and 126 of the accessions in the SCHMITZ-data, and that are expressed above a level where RNA-seq has been able to detect transcripts for a gene in all accessions. Each dot in the figure represents a pairwise relationship between two accessions, with the transcript covariance on the y-axis and the genetic covariance on the x-axis.
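A comparison of this kind, one point per pair of accessions, can be sketched as follows. The caption does not state which covariance estimator was used, so the column-centered cross-product below is an assumption, and all function names and toy matrices are hypothetical.

```python
from itertools import combinations
from statistics import mean


def center_columns(matrix):
    """Subtract the column mean (per marker or per gene) from every entry."""
    means = [mean(col) for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def pairwise_covariances(matrix):
    """Covariance between every pair of rows (accessions), averaged over
    the column-centered features."""
    centered = center_columns(matrix)
    n_features = len(centered[0])
    return {
        (i, j): sum(a * b for a, b in zip(centered[i], centered[j])) / n_features
        for i, j in combinations(range(len(centered)), 2)
    }


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Toy data (hypothetical): rows are accessions, columns are SNPs or genes.
genotypes = [[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 0]]
# Presence/absence of transcripts, here simply a rescaled copy of the genotypes.
transcripts = [[2 * g for g in row] for row in genotypes]

gen_cov = pairwise_covariances(genotypes)
tr_cov = pairwise_covariances(transcripts)
pairs = sorted(gen_cov)
r = pearson([gen_cov[p] for p in pairs], [tr_cov[p] for p in pairs])
print(r)
```

Because the toy transcript matrix is a rescaled copy of the genotype matrix, the two sets of pairwise covariances are perfectly correlated; real data would give a weaker relationship.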
Genes with cis-eQTL detected in the population of 140 natural A. thaliana accessions (SCHMITZ-data) that contribute to the accession-specific presence or absence of transcripts and that have an earlier reported biological function
| Locus | Gene | SNP | MAF | Log odds ratio ± SE | P-value | Replicated | Reference |
|---|---|---|---|---|---|---|---|
| | | chr2_9028685 | 0.14 | 4.84 ± 0.76 | 1.75 × 10⁻¹⁰ | Yes-GBF | |
| | | chr5_23935224 | 0.43 | 2.82 ± 0.54 | 1.83 × 10⁻⁷ | Yes | |
| | | chr1_29007464 | 0.46 | 7.07 ± 1.02 | 5.18 × 10⁻¹² | No.e | |
| | | chr2_16017043 | 0.29 | 0.37 ± 0.07 | 3.64 × 10⁻⁷ | No.e | |
| | | chr1_4601762 | 0.51 | 0.36 ± 0.06 | 3.87 × 10⁻¹⁰ | No | |
| | | chr1_24217798 | 0.06 | 0.14 ± 0.02 | 3.22 × 10⁻¹⁰ | No | |
| | | chr2_8168512 | 0.07 | 0.18 ± 0.03 | 6.13 × 10⁻⁹ | No | |
| | | chr3_10230473 | 0.39 | 0.4 ± 0.07 | 6.45 × 10⁻⁸ | No | |
| | | chr3_22031771 | 0.06 | 0.15 ± 0.02 | 1.09 × 10⁻⁹ | No | |
| | | chr4_583422 | 0.12 | 0.2 ± 0.03 | 7.48 × 10⁻¹² | No | |
| | | chr4_597373 | 0.06 | 0.15 ± 0.02 | 1.41 × 10⁻⁹ | No | |
| | | chr4_14639984 | 0.21 | 0.36 ± 0.07 | 2.32 × 10⁻⁷ | No | |
SNP: top SNP in the association analysis.
MAF: minor allele frequency of the top associated SNP in the SCHMITZ-data.
Log odds ratio ± SE: log odds ratio of the top associated SNP ± SE.
P-value: nominal P-value for the top associated SNP.
Replicated: whether the original association in the SCHMITZ-data could be replicated in the DUBIN-data. Yes-GBF, replicated at the genome-wide Bonferroni threshold; Yes, replicated at a Bonferroni threshold correcting for the number of markers in a ± 1 Mb window around the peak SNP in the DUBIN-data; No.e, not expressed in the DUBIN-data, so replication could not be tested; No, nonsignificant in the replication analysis.
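The four replication categories in the table notes can be expressed as a small decision rule. This is only a sketch: the significance level of 0.05, the marker counts, and the function name are assumptions made for illustration and are not stated in the source.

```python
from typing import Optional


def replication_status(
    p_value: Optional[float],
    n_genome_markers: int,
    n_window_markers: int,
    alpha: float = 0.05,  # assumed significance level
) -> str:
    """Classify a replication test using two Bonferroni thresholds.

    p_value is None when the gene is not expressed in the replication
    dataset and the association therefore cannot be tested.
    """
    if p_value is None:
        return "No.e"
    if p_value < alpha / n_genome_markers:  # genome-wide Bonferroni
        return "Yes-GBF"
    if p_value < alpha / n_window_markers:  # markers within ± 1 Mb of the peak SNP
        return "Yes"
    return "No"


# Hypothetical marker counts, chosen only to exercise each branch.
N_GENOME, N_WINDOW = 200_000, 500
print(replication_status(1e-9, N_GENOME, N_WINDOW))
print(replication_status(1e-5, N_GENOME, N_WINDOW))
print(replication_status(0.01, N_GENOME, N_WINDOW))
print(replication_status(None, N_GENOME, N_WINDOW))
```

The window-based threshold is less stringent than the genome-wide one because it corrects for far fewer tests, which is why an association can be "Yes" without being "Yes-GBF".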
Figure 4. The eQTL analysis detects a highly significant, replicable cis-eQTL for the expression of the gene HAC1 (AT2G21045). The peak SNP is located in an exon of HAC1. (A, C) Distributions of transcript levels (FPKM and RPKM values from RNA-sequencing) for the 140 accessions in the SCHMITZ-data (A) and the 107 accessions in the DUBIN-data (C) (Schmitz ; Dubin ). (B, D) Illustrations of the association profiles (Kierczak ) for expression of HAC1 (AT2G21045) in the SCHMITZ-data (B) and the DUBIN-data (D) (Schmitz ; Dubin ).