Chengqi Wang1, Swamy R Adapa1, Justin Gibbons1,2, Stephen Sutton1, Rays H Y Jiang3.
Abstract
BACKGROUND: Understanding how var gene expression is regulated is crucial for explaining antigenic variation in Plasmodium falciparum. Recent work observed that while all var genes produce transcripts, only a few exhibit high expression levels. However, the global regulation of var expression and the relationship between epigenetic and genetic control remain to be established.
RESULTS: We systematically reanalyzed existing genomic data, including chromatin configuration and gene expression data, and for the first time used robust statistical methods to show that the intron and 2 kb upstream regions of each endogenous var gene always maintain high chromatin accessibility, with high potential to bind transcription factors (TFs). The transcript levels of different var gene family members are associated with this chromatin accessibility. Any given var gene thus shows punctuated chromatin states throughout the asexual life cycle. This is demonstrated by three independent transcript datasets. Chromatin accessibility in the var intron and 2 kb upstream regions is also positively correlated with their GC content, suggesting that the level of var gene silencing might be encoded in their intron sequences. Interestingly, both var intron and 2 kb upstream regions exhibit higher chromatin accessibility when the genes have relatively lower transcription levels, suggesting a punctuated repressive function for these regulatory elements.
Keywords: 2 kb upstream region; Chromatin accessibility; FAIRE-Seq; Intron; MNase-Seq; Var gene
Year: 2016 PMID: 27538502 PMCID: PMC4990864 DOI: 10.1186/s12864-016-3005-7
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Var introns show significantly higher chromatin accessibility. a Average genome-wide sequence read coverage around the intron. Both data sets show higher chromatin accessibility around var introns. Coverage is expressed as the fraction of the highest coverage value among all positions within the plot window. The yellow bar represents average intron length. b Boxplot shows FAIRE-Seq signal around introns grouped according to gene annotation (‘**’ represents p-value < 0.01, FDR < 0.01 for all comparisons between different gene groups and var genes). c Average genome-wide heterochromatin sequence read coverage around the intron. Both data sets show lower heterochromatin signal around var introns. Coverage is defined as in Fig. 1a
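The coverage scaling described in the Fig. 1a caption (each position divided by the highest coverage value in the plot window) can be illustrated with a minimal sketch. The function name `normalize_coverage` and the example values are hypothetical and are not taken from the authors' pipeline.

```python
def normalize_coverage(coverage):
    """Scale per-position read coverage so the window maximum equals 1.0.

    `coverage` is a list of non-negative read counts, one per position
    in the plot window (hypothetical input format).
    """
    peak = max(coverage)
    if peak == 0:
        # No reads anywhere in the window: return zeros instead of dividing by zero.
        return [0.0 for _ in coverage]
    return [value / peak for value in coverage]


# Made-up coverage values for a five-position window.
window = [12, 30, 60, 45, 15]
scaled = normalize_coverage(window)
print(scaled)
```

Scaling by the window maximum puts curves from data sets with different sequencing depths on the same 0 to 1 axis, which is presumably why both data sets can be overlaid in one panel.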
Fig. 2 Var 2 kb upstream regions show higher chromatin accessibility. a Boxplot shows that var genes have longer 5′ upstream regions (control genes are all P. falciparum genes except var genes). ‘Ends var’ indicates the 25 var genes located at chromosome ends, while ‘central var’ indicates the 37 var genes located in central chromosome regions (‘**’ represents p-value < 0.01, while ‘***’ represents p-value < 2.2e-16; p-values were calculated with the Wilcoxon rank-sum test). b Average genome-wide sequence read coverage on the 5′ upstream region. Both data sets show higher chromatin accessibility around the 2 kb upstream region. c Boxplot shows FAIRE-Seq signal around the 2 kb upstream regions grouped according to gene annotation (‘*’ represents p-value < 0.05, FDR < 0.05 for all comparisons between different gene groups and var genes). d Average genome-wide heterochromatin sequence read coverage on the 5′ upstream region. Both data sets show lower heterochromatin signal around the var 2 kb upstream regions. Coverage is expressed as in Fig. 1a
Fig. 3 The chromatin accessibility of introns and 2 kb upstream regions is associated with var gene expression. a, b Violin plots show the FAIRE-Seq signals of different var gene regions, classified based on gene expression data (Otto TD et al. RNA-Seq data [27]). c The bee swarm plot shows the Pearson correlation between FAIRE-Seq signal on different var 5′ upstream regions and var gene expression from three independent expression data sets. The median value and standard deviation are represented by the black bar (‘*’ represents Wilcoxon rank-sum test p-value < 0.05 compared with the second most significant correlation value). d Scatterplot shows FAIRE-Seq signal around the var intron and 2 kb upstream region. The X-axis and Y-axis represent the log(FAIRE-Seq RPKM) value on the var 2 kb upstream and intron regions, respectively. e The bee swarm plot shows the FAIRE-Seq signal correlation between different var regions and the exon during ring and trophozoite stages (‘*’ represents Wilcoxon rank-sum test p-value < 0.05 compared with the second most significant correlation value)
Fig. 4 Sequence GC content on two var regulatory regions is associated with their chromatin accessibility. a Scatterplot shows the correlation between sequence GC content and chromatin accessibility of 1 kb genome bins during ring and trophozoite stages (see Methods). Each point represents a 1 kb bin in the P. falciparum genome. The X-axis and Y-axis represent the correlation between the GC content of each 1 kb bin (each bin is further divided into 10 sub-bins of 100 bp) and its FAIRE-Seq signal during ring and trophozoite stages. The green histogram represents the distribution of correlation values. b Average sequence GC content around var intron and 2 kb upstream regions. c The sequence GC content distribution of different genome regions (‘C_2kb’ and ‘C_intron’ indicate the 2 kb upstream and intron regions of P. falciparum genes with one intron; ‘*’ represents Wilcoxon rank-sum test p-value < 0.05). d The bee swarm plot shows the Pearson correlation between sequence GC content of different genome regions and their FAIRE-Seq signal during different blood stages (‘*’ represents Wilcoxon rank-sum test p-value < 0.05 compared with other elements)
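The per-bin correlation in the Fig. 4a caption (GC content of ten 100 bp sub-bins within a 1 kb bin, correlated with the FAIRE-Seq signal of those sub-bins) might be computed roughly as follows. This is a sketch under stated assumptions, not the authors' code: the function names, the input format (one signal value per sub-bin), and the toy data are all hypothetical, and Pearson correlation is assumed because it is the statistic named in panel d.

```python
from math import sqrt


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def bin_gc_signal_correlation(bin_seq, sub_bin_signal, sub_bin_len=100):
    """Correlate sub-bin GC content with sub-bin signal inside one genome bin.

    `sub_bin_signal` holds one (hypothetical) FAIRE-Seq value per sub-bin.
    """
    sub_bins = [bin_seq[i:i + sub_bin_len]
                for i in range(0, len(bin_seq), sub_bin_len)]
    if len(sub_bins) != len(sub_bin_signal):
        raise ValueError("need exactly one signal value per sub-bin")
    gc_values = [gc_fraction(s) for s in sub_bins]
    return pearson(gc_values, sub_bin_signal)


# Toy 1 kb bin: sub-bin i contains 10*i G bases, the rest are A.
toy_bin = "".join("G" * (10 * i) + "A" * (100 - 10 * i) for i in range(10))
toy_signal = [2 * i + 1 for i in range(10)]  # made-up signal rising with GC
r = bin_gc_signal_correlation(toy_bin, toy_signal)
print(r)
```

Repeating this for every 1 kb bin at each stage would give one correlation value per bin and stage, which is the kind of quantity plotted on the axes of panel a.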
Fig. 5 Summary of var gene silencing models. a “Promoter-intron pairing” model. The var intron and promoter are required for var gene silencing. b “Chromatin-spreading” model. Lower-expressed var genes exhibit higher heterochromatin marker levels along the gene body. c “Punctuated chromatin” model. Both “chromatin-spreading” and “promoter-intron pairing” contribute to var gene regulation. Lower-expressed var genes tend to exhibit higher chromatin accessibility on both intron and 2 kb upstream regions, whereas higher-expressed var genes show lower chromatin accessibility in these regions