Literature DB >> 31201999

Global Quantitative Mapping of Enhancers in Rice by STARR-seq.

Jialei Sun¹, Na He¹, Longjian Niu², Yingzhang Huang¹, Wei Shen³, Yuedong Zhang¹, Li Li³, Chunhui Hou⁴.

Abstract

Enhancers activate transcription in a distance-, orientation-, and position-independent manner, which makes them difficult to be identified. Self-transcribing active regulatory region sequencing (STARR-seq) measures the enhancer activity of millions of DNA fragments in parallel. Here we used STARR-seq to generate a quantitative global map of rice enhancers. Most enhancers were mapped within genes, especially at the 5' untranslated regions (5'UTR) and in coding sequences. Enhancers were also frequently mapped proximal to silent and lowly-expressed genes in transposable element (TE)-rich regions. Analysis of the epigenetic features of enhancers at their endogenous loci revealed that most enhancers do not co-localize with DNase I hypersensitive sites (DHSs) and lack the enhancer mark of histone modification H3K4me1. Clustering analysis of enhancers according to their epigenetic marks revealed that about 40% of identified enhancers carried one or more epigenetic marks. Repressive H3K27me3 was frequently enriched with positive marks, H3K4me3 and/or H3K27ac, which together label enhancers. Intergenic enhancers were also predicted based on the location of DHS regions relative to genes, which overlap poorly with STARR-seq enhancers. In summary, we quantitatively identified enhancers by functional analysis in the genome of rice, an important model plant. This work provides a valuable resource for further mechanistic studies in different biological contexts.

Entities: Chemical Disease Gene Mutation Species

Keywords: Enhancer; Epigenetic modification; Functional analysis; Gene expression; Plant

Mesh：

Substances：

Year: 2019 PMID： 31201999 PMCID： PMC6624190 DOI： 10.1016/j.gpb.2018.11.003

Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN： 1672-0229 Impact factor: 7.691

Introduction

Gene expression is tightly regulated, which is critical for plant development and responses to alterations in the environment and hormone levels [1]. Promoters proximal to transcription start sites (TSS) are frequently considered sufficient for the initiation and elongation of transcription, but the level of promoter-driven expression is generally low [1]. High level of gene expression requires the participation of enhancers to increase the efficiency of transcription initiation and elongation to produce more mRNAs, although the exact mechanisms remain poorly understood. Different from promoters, whose function correlates with genomic location, enhancers regulate the expression of target genes in a distance-, orientation-, and position-independent manner [2]. Thus, defined location-to-function relationship cannot be easily established between an enhancer and its target gene, which makes the identification of enhancers challenging. In mammalian genomes, one gene can be regulated simultaneously by multiple enhancers, or by different enhancers in different tissues and at different developmental stages. Meanwhile, one enhancer can regulate multiple genes [2], [3], [4], [5]. Advancements in molecular biology and computational techniques have enabled the characterization of enhancers genome-wide based on epigenetic marks [6], [7], [8], [9], [10], [11], [12], [13], [14], [15] or by direct measurement of transcription enhancing activity of candidate sequences [16], [17], [18], [19], [20], [21], [22]. Intergenic enhancers have been predicted in Arabidopsis according to the location of DNase hypersensitive sites (DHSs). Basically a DHS is considered as an enhancer if located more than 1.5 kb away from the TSS, and not inside any gene body [23]. However, the arbitrary exclusion of DHSs close to the TSS may lead to the exclusion of substantial number of potential enhancers. To date, only a handful of enhancers have been identified in plants [24], [25], [26], [27], [28], and no genome-wide annotation of enhancers has yet been performed based on functional analysis. Self-transcribing active regulatory region sequencing (STARR-seq) has been successfully used to measure enhancer activity of millions of fragments in Drosophila melanogaster and mammalian genomes [16], [18], [19], [20], [21], [22]. Here, we used STARR-seq and successfully mapped the locations and quantified the strengths of enhancers in the model plant rice. We analyzed the epigenetic marks of the identified enhancers and revealed that histone modifications and chromatin states for rice enhancers are quite different from those for STARR-seq enhancers identified in animal models. Our work provides a functional enhancer resource and shows that plant and animal enhancers may be different at least in epigenetic features.

Results

Global quantitative enhancer discovery using STARR-seq

To comprehensively identify sequences with enhancer activity, we constructed a reporter library from randomly fragmented genomic DNA of the rice cultivar Nipponbare (Oryza sativa L. ssp japonica). The plasmid DNA of the reporter library was transfected in replicates into protoplasts isolated from the stem of 2-week-old rice seedlings. Plasmid DNA and mRNA were isolated from transfected cells 16 h after transfection. Sequencing libraries of plasmids and mRNA were generated and subjected to paired-end sequencing on Illumina HiSeq X Ten platform (Figure S1). For the two transfection replicates, 30.6 million and 15.5 million (Table S1) independent fragments were recovered from input plasmid libraries, with a median length of ∼670 bp (Figure S2A and B). As a result, ∼90% of the rice genome was covered with at least one unique fragment for each single nucleotide (Figure S2C and D). For the cDNA libraries generated from isolated mRNA samples, 13.7 million and 6.1 million independent fragments were produced (Table S1, Figure S2E and F). The quality of the libraries were checked and the enrichment of cDNA over plasmid input was determined for 600 bp bins across the genome and potential enhancers were identified (Figures S2–S7). Two identified enhancer sites located on chromosome 9 are shown in Figure 1A as examples.

Figure 1

Genome-wide quantitative enhancer discovery

A. STARR-seq cDNA (red) and input plasmid (gray) fragment densities at representative genomic loci. Light and dark blue boxes denote the identified enhancers in two replicates, respectively. B. Gene expression indicated by STARR-seq enrichment and real-time PCR quantification is linearly correlated (r = 0.79). STARR-seq enhancers were arbitrarily grouped into weak, medium, and strong categories based on the enrichment of cDNA over input plasmid with FC ranging below 2.0, 2.0 to 4.0, and above 4.0, respectively. Error bars indicate total four real-time qPCR quantifications, two qPCRs for each independent biological replicate; inset, the same data depicted as a box plot. Significant difference between groups was determined using Wilcoxon rank-sum test (*, P < 0.05; ***, P < 0.001). r, Pearson correlation coefficient; FC, fold change.

Genome-wide quantitative enhancer discovery A. STARR-seq cDNA (red) and input plasmid (gray) fragment densities at representative genomic loci. Light and dark blue boxes denote the identified enhancers in two replicates, respectively. B. Gene expression indicated by STARR-seq enrichment and real-time PCR quantification is linearly correlated (r = 0.79). STARR-seq enhancers were arbitrarily grouped into weak, medium, and strong categories based on the enrichment of cDNA over input plasmid with FC ranging below 2.0, 2.0 to 4.0, and above 4.0, respectively. Error bars indicate total four real-time qPCR quantifications, two qPCRs for each independent biological replicate; inset, the same data depicted as a box plot. Significant difference between groups was determined using Wilcoxon rank-sum test (*, P < 0.05; ***, P < 0.001). r, Pearson correlation coefficient; FC, fold change. To validate the identified peaks, we chose 29 sites (Table S2) with weak (<2 fold enrichment), medium (2–4 fold enrichment) or strong (>4 fold enrichment) enhancer strength. These sites were cloned into luciferase reporter vectors and the reporter gene expression was quantified using real-time RT-PCR and normalized against the expression of the co-transfected Renilla luciferase reporter gene. Our data showed that STARR-seq enrichment intensity is highly correlated with the reporter gene expression levels across a wide activity range for enhancers (Pearson correlation coefficient, r = 0.79; Figure 1B). The activities of the weak, medium, and strong enhancers also showed significant differences between different groups (P < 0.05, P < 0.001, and P < 0.001 for weak vs. medium, weak vs. strong, and medium vs. strong enhancers, respectively, Wilcoxon test) (Figure 1B). The Pearson correlation coefficient for the two replicates was 0.604 (Figure S5), demonstrating that STARR-seq is reproducible in plant system. 15,208 and 12,210 regions (Figure 2A) were significantly enriched from the replicates 1 and 2, respectively (P < 0.001). Among them, 9642 enriched peaks were identified in both biological replicates (Table S3) and used for further analysis (Figure 2A).

Figure 2

Distribution of STARR-seq enhancers in rice genome

A. Enhancers identified in two independent STARR-seq experiments. Totally 9642 enhancers were commonly discovered in both replicates. B. and C. Distribution and relative enrichment of identified enhancers in the rice genome. D. Distribution of enhancers relative to the gene body. TSSs and TTSs of all genes are aligned at the beginning and the end of gene body, respectively. Extended regions of 2.4 kb (the median size of genes in rice) upstream of TSS and downstream of TTS are divided into 10 bins of equal size (240 bp), respectively. The horizontal gray dotted line shows the mean density of enhancers from control elements, which was calculated from 10,000 randomly selected regions of 700 bp in length and repeated for at least 10 times. TE, transposable element; TSS, transcription start site; TTS, transcription termination site.

Distribution of STARR-seq enhancers in rice genome A. Enhancers identified in two independent STARR-seq experiments. Totally 9642 enhancers were commonly discovered in both replicates. B. and C. Distribution and relative enrichment of identified enhancers in the rice genome. D. Distribution of enhancers relative to the gene body. TSSs and TTSs of all genes are aligned at the beginning and the end of gene body, respectively. Extended regions of 2.4 kb (the median size of genes in rice) upstream of TSS and downstream of TTS are divided into 10 bins of equal size (240 bp), respectively. The horizontal gray dotted line shows the mean density of enhancers from control elements, which was calculated from 10,000 randomly selected regions of 700 bp in length and repeated for at least 10 times. TE, transposable element; TSS, transcription start site; TTS, transcription termination site.

Enhancers are enriched within gene body

STARR-seq enhancers identified in the Drosophila genome are mostly located within genes and at promoter regions, especially in introns (55.6%), whereas only 22.6% of enhancers are in intergenic regions. Meanwhile Drosophila enhancers are significantly underrepresented in repetitive sequences [16]. To reveal if the distribution pattern of enhancers in rice is different from that in Drosophila, we calculated the percentage of rice STARR-seq enhancers in different functional genomic regions. Surprisingly, more than 50% (5020) of enhancers were mapped in repetitive sequences, most of which are transposable elements (TEs) (Figure 2B). Moreover, nearly all of these enhancers (4831/5020) overlap with one type of repetitive elements such as short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs), long terminal repeats (LTRs), DNA transposons, satellite DNA, or simple repeats (Table S4). The enrichment of STARR-seq enhancers in TE-related sequences in rice may be consistent with the hypothesis that TEs may regulate gene expression or even give rise to new genes during evolution [29], [30]. In addition to the repeats, identified enhancers are overrepresented in the 5′ untranslated regions (5′UTR) and coding sequences, but underrepresented in introns (Figure 2C). These observations demonstrate a strikingly difference in the distribution pattern of enhancers between the Drosophila and rice genomes [16]. To reveal if enhancers show different distribution patterns in different chromatin contexts, we further divided the rice genome into repetitive sequences enriched with TEs (TE regions) and sequences without TEs (non-TE regions) and analyzed the enhancer distribution relative to proximal genes. Overall, the enhancer distribution patterns are similar in both TE and non-TE regions (Figure 2D) despite the different genetic composition in these two types of sequences. Enhancers are located mostly within genes at the 5′ end, and their abundance gradually declines to background level toward the 3′ end of genes (Figure 2D).

The majority of genes lack proximal enhancers in the rice genome

The majority of STARR-seq enhancers were mapped within or close to genes (gene body ±5 kb), which suggests that proximal regulation by enhancers may be a prevalent choice in the rice genome. Compared to the total number of annotated and predicted genes (∼56,000) in rice [31], the number of STARR-seq enhancers is relatively low (9642), i.e., less than 0.2 enhancers per gene on average, which suggests that most genes may not be directly regulated by enhancers in close proximity (gene body ±5 kb). Further analysis shows that 28.7% (15,997) of genes (Figure 3A and B) have at least one proximal enhancer, suggesting that many enhancers have to be in proximity to more than one gene.

Figure 3

The proximal sequences of most genes are lack of identified enhancers

Number (A) and percentage (B) of total genes with or without STARR-seq enhancer in proximity in the whole genome. Enhancers located within 5 kb upstream of TSS, inside gene body, and within 5 kb downstream of TTS are considered to be proximal to genes. Number (C) and percentage (D) of genes expressed at different levels with or without enhancers in proximity in non-TE regions. Number (E) and percentage (F) of genes expressed at different levels with or without enhancers in proximity in TE regions. Genes are grouped into four categories based on their expression levels. Silent, RPKM = 0; low, 0 < RPKM ≤ 1; medium, 1 < RPKM ≤ 10; high, RPKM > 10. The horizontal blue lines show the percentage of genes with enhancers in proximity in the whole genome.

The proximal sequences of most genes are lack of identified enhancers Number (A) and percentage (B) of total genes with or without STARR-seq enhancer in proximity in the whole genome. Enhancers located within 5 kb upstream of TSS, inside gene body, and within 5 kb downstream of TTS are considered to be proximal to genes. Number (C) and percentage (D) of genes expressed at different levels with or without enhancers in proximity in non-TE regions. Number (E) and percentage (F) of genes expressed at different levels with or without enhancers in proximity in TE regions. Genes are grouped into four categories based on their expression levels. Silent, RPKM = 0; low, 0 < RPKM ≤ 1; medium, 1 < RPKM ≤ 10; high, RPKM > 10. The horizontal blue lines show the percentage of genes with enhancers in proximity in the whole genome. To investigate if enhancers are preferentially enriched for active genes, we separated genes into four groups according to their expression levels (silent, low, medium, and high) and calculated the percentage of genes with a proximal enhancer for each group. The percentage of genes with proximal enhancers changes little (from 22.7% to 26.1%) for genes with the different expression levels in non-TE regions (Figure 3C and D). In contrast to non-TE regions, about 40% of silent and lowly-expressed genes were found with proximal enhancers in TE regions (Figure 3E and F). These results suggest that STARR-seq enhancers are not necessarily enriched at actively transcribed genes in vivo. Our observations agree with previous reports that episomal analysis may not reflect the endogenous chromatin state of the assayed sequences [16].

Genes in repetitive sequences are enriched with enhancers

Not only silent and lowly-expressed genes in TE regions are enriched with proximal enhancers, in fact, 52.1% of identified STARR-seq enhancers are located in TE regions (Figure 2B), which account for about 42.8% of the rice genome (Figure 2B) and harbor only 28.3% (15,839) of total genes (∼56,000) (Figure 3E). Most genes inside TE regions (15,318, 96.7% of total genes in TE regions) were lowly expressed (960 and 1389 genes with or without enhancer in proximity, respectively) or silent (5181 and 7788 genes with or without enhancer in proximity, respectively) (Figure 3E). And 39.6% (6277, sum of four gene groups of different expression level; Figure 3E) of total genes located in TE regions (15,839) contained at least one proximal enhancer. For genes of higher transcription levels or genes in non-TE regions, the percentage of genes with a proximal enhancer was lower than the average for total genes (28.7%) (Figure 3B, D, and F). Gene ontology (GO) analysis showed that genes in TE regions with proximal enhancers are mostly enriched in categories of DNA replication, integration, nucleotide binding, etc. (Figure S8).

STARR-seq enhancers overlap poorly with DHSs

DNase I hypersensitivity has been associated with DNA sequences with different types of activity, one of which is enhancing gene transcription [32]. To examine the endogenous accessibility of STARR-seq enhancers, we analyzed the DNase I hypersensitivity across the rice genome using previously published data [33]. Surprisingly, only 8.7% (839) of STARR-seq enhancers were found overlapping with DHSs (Figure 4A, Figure S9). Actually, the low overlap between DHSs and STARR-seq enhancers was also reported in humans that only about 13.6% of human STARR-seq enhancers co-localize with DHSs (Figure S9) [20]. Quite different in Drosophila, 48.5% of STARR-seq enhancers co-localize with DHSs [16] (Figure S9). These results together suggest that the hypersensitivity to DNase I digestion may not be an inseparable characteristic for functionally identified enhancers, which seems true at least in the rice and human genomes.

Figure 4

Epigenetic marks associated with enhancers in rice genome

A. Numbers of identified enhancers that overlap with DHSs. B. Expression of genes in two biological replicates with proximal enhancers overlapping with DHS (open) or not (close) (****, P < 10−10, Wilcoxon rank-sum test). In this analysis, STARR-seq enhancers uniquely identified in each replicate were included. C. Epigenetic signals for enhancers overlapping with DHSs (red) or not (blue). D. Epigenetic signal for enhancers in non-TE (red) or TE (blue) regions, respectively. The horizontal gray dotted line shows the relative enrichment of the examined epigenetic signal of random genomic sites as a control. DHS, DNase I hypersensitive site.

Epigenetic marks associated with enhancers in rice genome A. Numbers of identified enhancers that overlap with DHSs. B. Expression of genes in two biological replicates with proximal enhancers overlapping with DHS (open) or not (close) (****, P < 10−10, Wilcoxon rank-sum test). In this analysis, STARR-seq enhancers uniquely identified in each replicate were included. C. Epigenetic signals for enhancers overlapping with DHSs (red) or not (blue). D. Epigenetic signal for enhancers in non-TE (red) or TE (blue) regions, respectively. The horizontal gray dotted line shows the relative enrichment of the examined epigenetic signal of random genomic sites as a control. DHS, DNase I hypersensitive site. Chromatin of high accessibility is frequently associated with active transcription [34]. For genes with proximal enhancers overlapping with DHSs (open chromatin), their expression levels are indeed generally higher than those of genes with non-DHS enhancers (close chromatin) in proximity (Figure 4B).

H3K4me1 is underrepresented at enhancers identified in the rice genome

H3K4me1 has been frequently used as a mark for enhancer prediction, which is enriched at enhancers identified by STARR-seq in both Drosophila and human genomes [16], [20]. However, in rice, H3K4me1 is obviously underrepresented at the endogenous sites of enhancers, independent of their overlapping state with DHS (Figure 4C) or location in TE or non-TE regions (Figure 4D). This surprising observation raises a question over the role of H3K4me1 as a reliable enhancer mark in the rice genome.

H3K27ac is enriched at rice STARR-seq enhancers

H3K27ac is another histone modification used for the prediction of active enhancers, which is also enriched at enhancers identified by STARR-seq in both Drosophila and human genomes [16], [20]. Similarly in rice, H3K27ac is also enriched although only at a selective portion of STARR-seq enhancers, specifically those associated with DHS (red curves, Figure 4C) or located in non-TE regions (red curves, Figure 4D). It is of note that even enhancers not associated with DHSs show slight enrichment of H3K27ac (blue curves, Figure 4C). Quite different from H3K4me1, H3K27ac seems to be a more conserved enhancer mark across plant and animal kingdoms.

Active transcription mark H3K4me3 is enriched at STARR-seq enhancers

H3K4me3 is generally associated with actively transcribed genes in animal models [35]. In the rice genome, H3K4me3 is enriched preferentially at STARR-seq enhancers that overlap with DHSs or are located in non-TE regions (Figure 4C and D). These observations could be explained by the fact that a significant portion of enhancers are located inside genes at the 5′UTR and in coding sequences (Figure 2D), where the H3K4me3 is detected if these genes are actively transcribed. However, these observations also suggest that enhancers can possibly be genes themselves in rice genome.

Repressive histone mark H3K27me3 colocalizes with STARR-seq enhancers

H3K27me3 is mostly associated with repressed chromatin state [35]. Surprisingly, we found that identified enhancers were enriched with H3K27me3 (Figure 4C and D; also see sC3, sC4, and sC7 in Figure 5A and B), which is not enriched for enhancers identified in either Drosophila or human genomes [16], [20]. Interestingly, many of these enhancers (sC3 and sC4) are also enriched with H3K4me3 and H3K27ac (Figure 5A and B). Whether these enhancers are poised or actively regulate gene expression is difficult to determine at this time. One possibility could be that the same genomic loci are modified differentially in different subpopulations of cells in culture. Different from H3K27me3, H3K9me3 is nearly completely absent from identified enhancers (Figure 5A).

Figure 5

Epigenetic clustering reveals combinatorial association of active and repressive histone modifications with a subset of enhancers

A. STARR-seq enhancers are clustered over the signals of DHSs and five histone modifications as indicated at the top of the panel. The number and percentage of enhancers for each cluster are shown on the side. B. The level of enrichment of epigenetic marks for each cluster of enhancers is shown in different color. Signal density of clusters is ranked as absent (dark gray), low (light blue), medium (orange), and high (red). The signal density is determined by calculating the percentage of enhancers carrying an examined epigenetic mark over the total number of enhancers in a specific cluster (see Figure S10A for details). A cluster is designated to be absent of an epigenetic mark if ≤10% of the elements in that cluster carried the epigenetic mark examined, or designated as low, medium, or high if the percentage of elements in a specific cluster carrying an epigenetic mark examined was between 10%–30%, 30%–60%, or >60%. Number of enhancers in each cluster is shown at the bottom. C. Percentage of identified enhancers in each cluster with (red) or without (gray) a specific epigenetic mark. D. Percentage of enhancers in each cluster present in non-TE (red) or TE (gray) regions. E. Expression of genes with proximal enhancers of different cluster in TE (blue) or in non-TE (red) regions. Genes with enhancers in non-TE regions of nearly all clusters (except sC7) show significantly (*, P < 0.01, Wilcoxon rank-sum test) higher expression level than in those in TE regions.

Epigenetic clustering reveals combinatorial association of active and repressive histone modifications with a subset of enhancers A. STARR-seq enhancers are clustered over the signals of DHSs and five histone modifications as indicated at the top of the panel. The number and percentage of enhancers for each cluster are shown on the side. B. The level of enrichment of epigenetic marks for each cluster of enhancers is shown in different color. Signal density of clusters is ranked as absent (dark gray), low (light blue), medium (orange), and high (red). The signal density is determined by calculating the percentage of enhancers carrying an examined epigenetic mark over the total number of enhancers in a specific cluster (see Figure S10A for details). A cluster is designated to be absent of an epigenetic mark if ≤10% of the elements in that cluster carried the epigenetic mark examined, or designated as low, medium, or high if the percentage of elements in a specific cluster carrying an epigenetic mark examined was between 10%–30%, 30%–60%, or >60%. Number of enhancers in each cluster is shown at the bottom. C. Percentage of identified enhancers in each cluster with (red) or without (gray) a specific epigenetic mark. D. Percentage of enhancers in each cluster present in non-TE (red) or TE (gray) regions. E. Expression of genes with proximal enhancers of different cluster in TE (blue) or in non-TE (red) regions. Genes with enhancers in non-TE regions of nearly all clusters (except sC7) show significantly (*, P < 0.01, Wilcoxon rank-sum test) higher expression level than in those in TE regions.

Epigenetic clustering of identified enhancers

To find out if there is any unique combination of histone modifications, we clustered all identified enhancers into eight groups (sC1–8, s standing for STARR-seq enhancers) based on the signal strength of multiple epigenetic marks including DHS, H3K4me1, H3K4me3, H3K27ac, H3K27me3, and H3K9me3 (Figure 5A, Figure S10A). Clusters 1–7 (sC1–7) together contain only 37.9% (3650) of total enhancers (Figure 5A). And cluster sC8 alone contains 62.1% (5992) of the total enhancers, which carries negligible level of any analyzed epigenetic mark (Figure 5A and B). In fact, only 334 sites (sC2) show strong signal for the presence of H3K4me1 (Figure 5A and B), even fewer than the enhancers overlapping with DHS (839) (Figure 4A). H3K4me3 is enriched for 4 clusters of enhancers (sC1 and sC3–5), for which different levels of H3K27ac are enriched as well (Figure 5A and B). Overall, H3K4me3 and H3K27ac are two mostly enriched histone marks, DHS and H3K27me3 are moderately enriched, whereas H3K4me1 and H3K9me3 are least enriched for enhancers in sC1–7 (Figure 5A–C). STARR-seq enhancers in rice are slightly enriched in TE regions where gene activity is low. To reveal if enhancers of different clusters follow the same distribution pattern of total STARR-seq enhancers, we reexamined the distribution of each cluster of enhancers in the TE and non-TE regions. Quite different from total STARR-seq enhancers, averagely 83% of sC1–7 enhancers are mapped in non-TE regions (Figure 5D). Accordingly, the majority of sC8 enhancers (73%) were associated with TE regions (Figure 5D). Enhancer activity of each cluster was also examined. Except sC7, genes associated with enhancers of all other clusters (sC1, sC2–6, and sC8) are expressed at significantly higher levels in non-TE regions than in TE regions (P < 0.01, Wilcoxon’s test; Figure 5E). Enhancers of sC7 are mostly enriched with repressive H3K27me3, different from other H3K27me3 enriched clusters (sC3 and sC4) that are also enriched with active H3K4me3 and H3K27ac (Figure 5A and B). Overall, the median expression levels of genes associated with proximal STARR-seq enhancers (sC1–2 and sC3–6) of active histone marks are significantly higher than the median level of total genes (Figure 5E). These results show that STARR-seq enhancer activity is well correlated with histone modifications and genomic sequence composition.

DHS-predicted enhancers differ from STARR-seq enhancers

Enhancers have been predicted based on chromatin accessibility in Arabidopsis [23]. To test if STARR-seq enhancers overlap with DHS-predicted enhancers, we followed the previous report [23] and defined a DHS site as enhancer if it is located >1.5 kb upstream of TSS but not in a gene body. According to this criterion, 13,770 out of total 37,168 DHSs were predicted to be enhancers (Figure 6A, Table S5). Only 20% of 13,770 sites are in TE regions, whereas the remaining 80% are located in non-TE regions (Figure 6A), indicating an under-representation and over-representation in TE and intergenic regions, respectively (Figure 6B). This distribution patterns is different from that of the STARR-seq enhancers (Figure 2C). In fact, DHS-predicted enhancers barely overlap with STARR-seq enhancers. There are only 220 overlapping sites, accounting for <2.3% and 1.6% of total STARR-seq and DHS-predicted enhancers, respectively (Figure 6C). These results show that DHS-predicted enhancers are a group of DNA elements different from STARR-seq enhancers.

Figure 6

Predicted enhancers based on DHS location

A. Number of DHS-predicted enhancers (13,770) and non-enhancer DHSs (23,398) and percentage of these sites in different functional genomic regions are shown below. B. Relative enrichment/depletion of DHS-predicted enhancers and non-enhancer DHSs in different genomic regions. C. Overlap analysis among STARR-seq enhancers, DHS-predicted enhancers, and non-enhancer DHSs. D. DHS-predicted enhancers are clustered over the signal of five histone modifications. E. The level of enrichment of epigenetic mark for each cluster of enhancers is shown in different color (see Figure S10B for details). F. Expression of genes with proximal DHS-predicted enhancers in TE (blue) or non-TE (red) regions. G. Non-enhancer DHSs are clustered over the signal of five histone modifications as for STARR-seq enhancers. H. The level of enrichment of epigenetic mark for each cluster of enhancers is shown in different color (see Figure S10C for details). For panels E and H, signal density of clusters is ranked as absent (dark gray), low (light blue), medium (orange), and high (red). The signal density is determined by the percentage of enhancers in a specific cluster carrying the examined epigenetic mark. Number of enhancers in each cluster is shown at the bottom. A cluster is designated to be absent of an epigenetic mark if ≤10% of the elements in that cluster carried the epigenetic mark examined, or designated as low, medium, or high if the percentage of elements in a specific cluster carrying an epigenetic mark examined was between 10%–30%, 30%–60%, or >60%. I. Expression of genes with proximal non-enhancer DHSs in TE (blue) or non-TE (red) regions.

Predicted enhancers based on DHS location A. Number of DHS-predicted enhancers (13,770) and non-enhancer DHSs (23,398) and percentage of these sites in different functional genomic regions are shown below. B. Relative enrichment/depletion of DHS-predicted enhancers and non-enhancer DHSs in different genomic regions. C. Overlap analysis among STARR-seq enhancers, DHS-predicted enhancers, and non-enhancer DHSs. D. DHS-predicted enhancers are clustered over the signal of five histone modifications. E. The level of enrichment of epigenetic mark for each cluster of enhancers is shown in different color (see Figure S10B for details). F. Expression of genes with proximal DHS-predicted enhancers in TE (blue) or non-TE (red) regions. G. Non-enhancer DHSs are clustered over the signal of five histone modifications as for STARR-seq enhancers. H. The level of enrichment of epigenetic mark for each cluster of enhancers is shown in different color (see Figure S10C for details). For panels E and H, signal density of clusters is ranked as absent (dark gray), low (light blue), medium (orange), and high (red). The signal density is determined by the percentage of enhancers in a specific cluster carrying the examined epigenetic mark. Number of enhancers in each cluster is shown at the bottom. A cluster is designated to be absent of an epigenetic mark if ≤10% of the elements in that cluster carried the epigenetic mark examined, or designated as low, medium, or high if the percentage of elements in a specific cluster carrying an epigenetic mark examined was between 10%–30%, 30%–60%, or >60%. I. Expression of genes with proximal non-enhancer DHSs in TE (blue) or non-TE (red) regions. To reveal the epigenetic modification patterns, we carried out clustering analysis according to the histone modifications at DHS-predicted enhancers (Figure 6D, Figure S10B). Although DHSs indicate open chromatin possibly enriched with active histone marks, clustering analysis showed that only 16.9% of total DHS-predicted enhancers are enriched with any examined histone modifications including repressive H3K27me3 (Figure 6D and E). Interestingly, H3K4me1 is also absent from nearly all DHS-predicted enhancers (Figure 6D and E). Similar to STARR-seq enhancers, H3K4me3 is highly enriched for most clusters (dC1–5, d standing for DHS-predicted enhancers) (Figure 6D and E). Interestingly, H3K27me3 and H3K27ac are also enriched for two (dC3 and dC7) and three (dC1, dC4, and dC6) different clusters, respectively (Figure 6D and E), but different from STARR-seq enhancer clusters (sC3 and sC4), in which these two modifications are enriched for about 20% sites (data not shown). To test if DHS-predicted enhancers increase proximal gene expression, similar analysis was conducted as for STARR-seq enhancers. As many as four clusters (dC1 and dC3–dC5) in TE or non-TE regions differ little in their effect on the expression level of proximal genes (P > 0.05, Wilcoxon test; Figure 6F). Moreover, 5 out of 8 clusters of enhancers (dC1, dC3–5, and dC7) show no positive effect on proximal gene expression compared to the median expression level of all genes (Figure 6F). These results together suggest that enhancer prediction based on the location of DHSa is less successful in screening out regulatory elements that are positively correlated with higher gene expression or positive histone marks at endogenous loci.

DHSs not predicted as enhancers may function as promoters

Since DHS-predicted enhancers are not well correlated with histone modification and proximal gene activity, we wonder if the remaining DHSs not predicted as enhancers (non-enhancer DHSs, 23,398 sites) (Figure 6A, Table S6) behave similarly. First, we examined the distribution of non-enhancer DHSs, which was found to differ sharply from that of the DHS-predicted enhancers. Non-enhancer DHSs are obviously enriched in the 5′ upstream regions of genes, TSS flanking regions, and 5′UTR. They are also slightly overrepresented in first introns and 3′UTRs (Figure 6B). We further clustered non-enhancer DHSs as for DHS-predicted enhancers. A little more than half of non-enhancer DHSs (50.5%) are grouped into several clusters, carrying different types of histone modifications (nC1–7, n standing for non-enhancer DHSs) (Figure 6G and H, Figure S10C). For nC1–7, the histone modification patterns are similar to those of dC1–7, except that there are many more enhancers found in nC1–7 than in dC1–7 (Figure 6E and H). The expression levels of genes with non-enhancer DHSs in proximity were also examined. Interestingly, expression levels of genes with most clusters of non-enhancer DHSs are actually higher than the median expression levels of all genes (Figure 6I). Considering the locations of non-enhancer DHSs, it is reasonable to expect that genes with these sites carrying active histone modifications are active and expressed at a decent level. The fact that most of 9140 non-enhancer DHSs (nC1–5) are enriched with H3K4me3, a mark of active transcription, suggests that these sites may be promoters rather than enhancers. We further compared the distribution of non-enhancer DHSs, STARR-seq enhancers, and DHS-predicted enhancers. These three groups of elements show strikingly different distribution patterns relative to the TSS (Figure 7A). DHSs are mostly positioned close to TSSs, with 91% and 81% of all DHSs mapped within the regions ranging from 5 kb upstream of TSS to 5 kb downstream of TSS in the rice and Drosophila genomes, respectively (Figure S11). DHS-predicted intergenic enhancers (13,770, 37% out of total 37,168 DHSs genome-wide) are located at least 1.5 kb upstream of the TSS and outside of a gene body (Figure 7A, middle) as defined. Accordingly, non-enhancer DHSs (23,398, 63% out of total 37,168 DHSs genome-wide) (Figure 7A, right) are located within gene body and the regions 1.5 kb 5′upstream of TSSs. Specifically, these DHS sites are overrepresented in sequences <200 bp upstream of TSS, TSS ±50 bp, and the 5′UTR regions (Figure 6B), dramatically different from the distributions of STARR-seq enhancers, which are mostly enriched within the gene body favoring the 5′ end (Figure 7A, left). Only 619 out of 23,395 of non-enhancer DHSs overlap with STARR-seq enhancers (Figure 6C). Our analysis shows that at least a portion of the non-enhancer DHSs are functional enhancers, suggesting that enhancer prediction arbitrarily based on DHS location may also lead to loss of real functional elements.

Figure 7

Distribution of three types of DNA elements and proposed enhancer function model in rice genome

A. Relative enrichment of STARR-seq enhancers, DHS-predicted enhancers, and non-enhancer DHS sites. Red and blue lines show sites in non-TE and TE regions, respectively; the dashed line indicates the location of TSS. B. The proposed model for enhancer distribution and functional mechanisms in rice genome. Gene body is shown as light blue rectangle box. Red boxes indicate STARR-seq enhancers. DHS-predicted enhancers and non-enhancer DHSs are shown as arrows in dark red and dark blue, respectively. Density of STARR-seq enhancers gradually decreases within the gene body from the 5′TSS to the 3′TTS. Genes can be self-regulated by enhancers located within themselves. DHS-predicted enhancers are mostly located not far away from gene bodies. Non-enhancer DHSs are mostly enriched at promoter regions. Enhancers in one gene may activate other genes separated far away linearly but close in three-dimensional space. The start and orientation of transcription are indicated using an orange arrow.

Distribution of three types of DNA elements and proposed enhancer function model in rice genome A. Relative enrichment of STARR-seq enhancers, DHS-predicted enhancers, and non-enhancer DHS sites. Red and blue lines show sites in non-TE and TE regions, respectively; the dashed line indicates the location of TSS. B. The proposed model for enhancer distribution and functional mechanisms in rice genome. Gene body is shown as light blue rectangle box. Red boxes indicate STARR-seq enhancers. DHS-predicted enhancers and non-enhancer DHSs are shown as arrows in dark red and dark blue, respectively. Density of STARR-seq enhancers gradually decreases within the gene body from the 5′TSS to the 3′TTS. Genes can be self-regulated by enhancers located within themselves. DHS-predicted enhancers are mostly located not far away from gene bodies. Non-enhancer DHSs are mostly enriched at promoter regions. Enhancers in one gene may activate other genes separated far away linearly but close in three-dimensional space. The start and orientation of transcription are indicated using an orange arrow.

Discussion

Histone modifications have been used for enhancer identification [6]. H3K4me1 is the primary modification used to predict if a genomic locus is a potential enhancer at all. Other histone modifications have also been employed to identify enhancers of different states. For example, H3K27ac is associated with active and super enhancers, and the coexistence of H3K27me3 and H3K4me1 is believed to be indicative of poised enhancers [12], [11]. However, such methods may fail to predict enhancers in genomic loci devoid of histone modifications. Moreover, enhancers predicted based on epigenetic modifications still need to be validated using genetic methods which is time- and labor-consuming, if large number of predicted sites need to be tested [36]. STARR-seq measures enhancer activity of candidate sequences independent of their endogenous chromatin context and epigenetic state, which has been successfully applied to enhancer analysis in both Drosophila and human genomes [16], [20]. To improve our understanding of the nature and functional mechanisms of plant enhancers, we employ STARR-seq and report here the first functional enhancer mapping genome-wide in the important model plant rice. Previous work predicting enhancers by chromatin sensitivity to DNase I digestion arbitrarily exclude DHSs located <1.5 kb upstream of the TSS and within a gene body [23]. In Drosophila, the majority of STARR-seq enhancers are actually located within a gene body or in proximity to genes [16]. Similarly, our STARR-seq analysis of the rice genome also shows that the majority of enhancers are localized within the gene body. The consistent observation of enhancer enrichment in the gene body in two evolutionarily far-separated genomes may suggest that most genes could be regulated by DNA elements built into their sequences (Figure 7B). Furthermore, it will be interesting to see whether enhancers in one gene can also activate other genes separated by long distances linearly. Capture Hi-C in the plant genome, using identified enhancers as anchors, may be used to reveal whether genes separated far away can be co-regulated by enhancers located within gene bodies. Our analysis also reveals several unexpected features of enhancers in the rice genome. Firstly, the majority of STARR-seq enhancers do not overlap with DHSs. DNase I hypersensitivity can be associated with any open and relaxed chromatin region, including insulators and other protein binding sites [32]. Predicting enhancers relying solely on the location of DHSs relative to genes may fail to rule out the possibilities that many DHSs predicted as enhancers may actually be of other non-enhancer functions, and many DHSs not predicted as enhancers may in fact function as real enhancers. Secondly, although H3K4me1 has been used to predict enhancers in mammals [37], we find that H3K4me1 is absent from most STARR-seq enhancers and DHSs independent of their locations in the rice genome. These observations suggest that H3K4me1 may not necessarily be a conservative enhancer mark in the rice genome. However, at the same time, H3K4me3 is found enriched at many identified enhancers. Whether H3K4me3 is a real unique enhancer mark in plant requires more experimental evidence. Third, many STARR-seq enhancers are enriched with repressive H3K27me3, majority of which are co-enriched with active chromatin marks of H3K4me3. Enhancers with both repressive H3K27me3 and H3K4me3 could be bivalent elements [38]. Surprisingly, H3K27ac and H3K27me3 co-exist at about 20% of STARR-seq enhancers in clusters sC3–5 (Figure 5A and B). These two modifications are mutually exclusive at the same location on histone H3 tail [39]. Currently, we cannot rule out the possibilities that histones at these enhancers may be modified differentially in different subgroups of cells, or even differentially on different allele in the same cell. In either case, further careful analysis is warranted to reveal the underlying causes of this intriguing observation. In summary, we present a comprehensive enhancer activity map generated by quantitative measurement using STARR-seq for an important model plant. Successful characterization of enhancers in different cell types will help to improve our understanding of the tissue-specific selection of enhancers during development and shed new lights on the elusive functional mechanisms of enhancers at large.

Materials and methods

STARR-seq reporter plasmid construction and library preparation

To perform STARR-seq in rice cells, we constructed a screening vector (Figure S12) using the backbone of plasmid pBI221 by introducing a CMV 35S mini promoter, an intron and a GFP sequence, which are arranged sequentially and their sequences are shown in Table S7. Linear vector plasmid pBI221 was obtained by PCR amplification. We constructed reporter library as previously described [16] with some modifications. Briefly, we extracted genomic DNA from the 2-week-old rice seedlings. DNA was fragmented by sonication (30% power, 5 s on, 5 s off, repeat 30 times in a volume of 600 µl) (Scientz II-D, Ningbo Scientz Biotechnology, Ningbo, China). DNA fragments (500–800 bp in length) were repaired and ligated to VAHTS Adapters for Illumina with VAHTS Mate Pair Library Prep Kit for Illumina® (Catalog No. ND104, Vazyme Biotech, Nanjing, China) following the manufacturer’s protocol. We then cloned the adaptor-ligated genomic DNA into linearized vector using the ClonExprress II One Step Cloning Kit (Catalog No. C112, Vazyme Biotech). Ligated product was used to transform Trans1-T1 Phage Resistant Chemically Competent Cell (CD501, TransGen Biotech, Beijing, China) according to the manufacturer’s protocol. Transformed bacterial cells were cultured and reporter plasmids were purified using E.Z.N.A.® Endo-Free Plasmid Maxi Kit (Catalog No. D6926, Omega Bio-tek, Norcross, GA) and quantified on NanoDrop ONE (Thermo Fisher Scientific, Waltham, MA).

Protoplast transfection

Protoplasts from rice stem were isolated and transfected as described [40] with minimal modification. For transfection, 30–40 μg of reporter plasmid DNA was mixed with 100 μl of protoplasts (∼1 × 106 cells) in a tube containing 110 μl of freshly prepared solution of polyethylene glycol (40%, W/V) (Catalog No. 807490, Sigma–Aldrich Biotech, St. Louis, MO). The protoplasts transfected were cultured at 28 °C in the dark.

Construction of reporter cDNA and input plasmid libraries for Illumina sequencing

mRNA and plasmid DNA in transfected protoplasts were recovered after 16 h of transfection using TransZol Up Plus RNA Kit (Catalog No. ER501, TransGen Biotech, Beijing, China) and poly(A)+ RNA fraction was isolated using VAHTS mRNA Capture Beads (Catalog No. N401, Vazyme Biotech). 5 U of DNase I (Catalog No. M0303S, New England Biolabs, Ipswich, MA) was used to digest DNA at 37 °C for 20 min. Synthesis of first strand cDNA was carried out using TransScript One-Step gDNA Removal and cDNA synthesis SuperMix (Catalog No. AU311, TransGen Biotech). The total reporter cDNA was amplified for Illumina sequencing with a 2-step nested PCR strategy using the TransStart® FastPfu Fly DNA Polymerase (Catalog No. AP231, TransGen Biotech). First-round PCR products were purified using GeneJET PCR Purification Kit (Catalog No. K0702, Thermo Fisher Scientific) and was used as template for the second-round PCR amplification with VAHTS™ DNA Adapters for Illumina® (Catalog No. N302, Vazyme Biotech). Second-round PCR products were purified using GeneJET PCR Purification Kit and eluted in 20–30 μl of the elution buffer (EB). After the capture of poly(A)+ RNA, the left aqueous solution was treated with 10 μl of RNase A (Catalog No. GE 101, TransGen® Biotech) to remove any RNA in solution before plasmid DNA was purified using GeneJET PCR Purification Kit and eluted in 50 μl of EB. Purified plasmid DNA was amplified with the TransStart® FastPfu Fly DNA Polymerase and VAHTSTM DNA Adapters for Illumina® to enrich the inserted sequences cloned in reporter plasmids. DNA of reporter inserts was purified using GeneJET PCR Purification Kit and eluted in 20–30 μl of EB. Both cDNA and reporter inserts DNA libraries were sequenced on Illumina HiSeq X Ten platform.

Mapping and STARR-seq enhancer identification

We used Bowtie2 [41] to map the sequencing data to the Nipponbare reference genome (IRGSP1.0). Mapped reads were filtered with SAMtools [42] and only uniquely mapped reads were kept. The reproducibility of two independent experiments was evaluated and the Pearson correlation coefficient was calculated by multiBamSummary and plotCorrelation in deepTools [43]. Genome coverage of reporter insert DNA was calculated by BEDTools [44]. STARR-seq enhancer identification was carried out as described [16] using R package BasicSTARRseq and Bonferroni correction was performed to adjust P values. Genomic region was identified as enhancer if the enrichment of cDNA over input plasmid insert is ≥1.3 fold and the adjusted P value is <0.001. Only enhancers found in both replicates were kept for further analysis. Overlapping enhancers from two replicates were merged using BEDTools merge [44]. The distance between enhancer and the proximal TSS was computed using BEDTools closest command.

Comparison between STARR-seq enhancers and DHS-predicted enhancers

DHS data previously generated [33] were used to predict enhancers according to their location relative to genes in the rice genome following the definition previously described for Arabidopsis [23]. Genomic regions sensitive to DNase I digestion were identified as DHSs using MACS1.4 [45]. BEDTool intersect was used to filter the DHSs in intergenic regions. If the middle point of a STARR-seq enhancer fell within the sequence of a DHS-predicted enhancer, then these two elements are considered overlapping. AgriGO V2.0 [46] was used for GO analysis for enhancer proximal genes.

Enhancer cluster analysis on epigenetic modification

We downloaded dataset of H3K4me3 and H3K27me3 (accession No. GSE19602) [47], H3K27ac and H3K9me3 (accession No. GSE79033 [48], as well as DHSs (accession No. GSE26734) [35] from the Gene Expression Omnibus (GEO). Data of H3K4me1 (accession No. PRJCA000387) were downloaded from GSA [49]. Genomic regions with enriched histone modification were called using MACS1.4 [45]. We used 10,000 randomly selected regions of 700 bp in length as control, and repeated for at least 10 times to calculate the mean value of analyzed features as explained in Figure legend. R package EnrichedHeatmap [50] was used to plot the enrichment of histone modifications (H3K4me1, H3K4me3, H3K27ac, H3K27me3, and H3K9me3) and DHSs with the center of analyzed elements positioned at middle point and extended upstream and downstream up to 5 kb. K-means in Cluster3.0 [51] was used to cluster STARR-seq enhancers, DHS-predicted enhancers, and non-enhancer DHSs. We submitted the sequences of STARR-seq enhancers, DHS-predicted enhancers, and non-enhancer DHSs to the MEME-ChIP web server19 for de novo motif finding in the JASPAR CORE (2018) 20 plant database (http://jaspar.genereg.net/collection/core/). The motifs identified are enclosed in Table S8. We used R for all statistical analysis.

Data availability

The sequencing datasets in this study can be accessed at the Gene Expression Omnibus (GEO) as GEO: GSE121231.

Authors’ contributions

CH and LN designed and supervised the experiments. JS constructed the reporter library and validated the identified sites. YZ participated in cell preparation, transfection, and sequencing library preparation. NH carried out bioinformatics analysis. WS processed the raw data and participated in bioinformatics analysis. YZ helped with reporter library construction. LL advised on data analysis. CH wrote the manuscript with input from all authors. All authors read and approved the final manuscript.

Competing interests

The authors have declared no competing interests.

10 in total

1. Shooting for the STARRs: A Modified STARR-seq Assay for Rapid Identification and Evaluation of Plant Regulatory Sequences in Tobacco Leaves.

Authors: Matthias Benoit
Journal: Plant Cell Date: 2020-05-20 Impact factor: 11.277

2. Identification of Plant Enhancers and Their Constituent Elements by STARR-seq in Tobacco Leaves.

Authors: Tobias Jores; Jackson Tonnies; Michael W Dorrity; Josh T Cuperus; Stanley Fields; Christine Queitsch
Journal: Plant Cell Date: 2020-05-14 Impact factor: 11.277

3. Genome-wide identification of functional enhancers and their potential roles in pig breeding.

Authors: Yinqiao Wu; Yuedong Zhang; Hang Liu; Yun Gao; Yuyan Liu; Ling Chen; Lu Liu; David M Irwin; Chunhui Hou; Zhongyin Zhou; Yaping Zhang
Journal: J Anim Sci Biotechnol Date: 2022-07-04

4. Transcriptional and post-transcriptional regulation of young genes in plants.

Authors: Vivek Kumar Raxwal; Somya Singh; Manu Agarwal; Karel Riha
Journal: BMC Biol Date: 2022-06-09 Impact factor: 7.364

5. Accessible chromatin regions and their functional interrelations with gene transcription and epigenetic modifications in sorghum genome.

Authors: Chao Zhou; Zhu Yuan; Xueping Ma; Huilan Yang; Ping Wang; Lanlan Zheng; Yonghong Zhang; Xiaoyun Liu
Journal: Plant Commun Date: 2020-12-31

6. Genome-wide strand asymmetry in massively parallel reporter activity favors genic strands.

Authors: Brian S Roberts; E Christopher Partridge; Bryan A Moyers; Vikram Agarwal; Kimberly M Newberry; Beth K Martin; Jay Shendure; Richard M Myers; Gregory M Cooper
Journal: Genome Res Date: 2021-04-20 Impact factor: 9.043

7. Integration of Count Difference and Curve Similarity in Negative Regulatory Element Detection.

Authors: Na He; Wenjing Wang; Chao Fang; Yongjian Tan; Li Li; Chunhui Hou
Journal: Front Genet Date: 2022-02-18 Impact factor: 4.599

Review 8. Cis-regulatory sequences in plants: Their importance, discovery, and future challenges.

Authors: Robert J Schmitz; Erich Grotewold; Maike Stam
Journal: Plant Cell Date: 2022-02-03 Impact factor: 11.277

Review 9. Anno genominis XX: 20 years of Arabidopsis genomics.

Authors: Nicholas J Provart; Siobhan M Brady; Geraint Parry; Robert J Schmitz; Christine Queitsch; Dario Bonetta; Jamie Waese; Korbinian Schneeberger; Ann E Loraine
Journal: Plant Cell Date: 2021-05-31 Impact factor: 12.085

10. Genome-wide analysis and functional annotation of chromatin-enriched noncoding RNAs in rice during somatic cell regeneration.

Authors: Yu-Chan Zhang; Yan-Fei Zhou; Yu Cheng; Jia-Hui Huang; Jian-Ping Lian; Lu Yang; Rui-Rui He; Meng-Qi Lei; Yu-Wei Liu; Chao Yuan; Wen-Long Zhao; Shi Xiao; Yue-Qin Chen
Journal: Genome Biol Date: 2022-01-19 Impact factor: 13.583

10 in total