Yishay Pinto, Haim Y Cohen, Erez Y Levanon.
Abstract
BACKGROUND: ADAR proteins are among the most extensively studied RNA binding proteins. They bind to their target and deaminate specific adenosines to inosines. ADAR activity is essential, and the editing of a subset of their targets is critical for viability. Recently, a huge number of novel ADAR targets were detected by analyzing next generation sequencing data. Most of these novel editing sites are located in lineage-specific genomic repeats, probably a result of overactivity of editing enzymes, thus masking the functional sites. In this study we aim to identify the set of mammalian conserved ADAR targets.
Year: 2014 PMID: 24393560 PMCID: PMC4053846 DOI: 10.1186/gb-2014-15-1-r5
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
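Mammalian evolutionarily conserved sites

| # | Chromosome | Coordinate (GRCh37/hg19) | Strand | Gene | Genomic compartment | RefSeq ID : cDNA change : amino acid change |
|---|---|---|---|---|---|---|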
| 1 | chr1 | 160302244 | - | COPA | CDS | NM_001098398:c.A490G:p.I164V |
| 2 | chr11 | 105804694 | + | GRIA4 | CDS | NM_000829:c.A2293G:p.R765G |
| 3 | chr11 | 105815132 | + | GRIA4 | intron | |
| 4 | chr11 | 105816106 | + | GRIA4 | intron | |
| 5 | chr11 | 105816129 | + | GRIA4 | intron | |
| 6 | chr11 | 105816145 | + | GRIA4 | intron | |
| 7 | chr11 | 105816160 | + | GRIA4 | intron | |
| 8 | chr12 | 5021742 | + | KCNA1 | CDS | NM_000217:c.A1198G:p.I400V |
| 9 | chr13 | 46090371 | + | COG3 | CDS | NM_031431:c.A1903G:p.I635V |
| 10 | chr14 | 26917530 | - | NOVA1 | CDS | NM_006489:c.A1087G:p.S363G |
| 11 | chr14 | 101506074 | + | mir376C | microRNA | |
| 12 | chr17 | 43045220 | - | C1QL1 | CDS | NM_006688:c.A197G:p.Q66R |
| 13 | chr19 | 47152854 | - | DACT3 | CDS | NM_145056:c.A775G:p.R259G |
| 14 | chr2 | 20450819 | - | PUM2 | 3′UTR | |
| 15 | chr2 | 21233202 | - | APOB | CDS | NM_000384:c.C6538G:p.Q2180stop |
| 16 | chr2 | 210835613 | + | UNC80 | CDS | NM_032504:c.A7990G:p.S2664G |
| 17 | chr20 | 36147533 | - | BLCAP | CDS | NM_001167821:c.A44G:p.K15R |
| 18 | chr20 | 36147563 | - | BLCAP | CDS | NM_001167821:c.A14G:p.Q5R |
| 19 | chr20 | 36147572 | - | BLCAP | CDS | NM_001167821:c.A5G:p.Y2C |
| 20 | chr20 | 36148080 | - | BLCAP | intron | |
| 21 | chr20 | 52104918 | + | TSHZ2 | 3′UTR | |
| 22 | chr21 | 30953750 | - | GRIK1 | CDS | NM_175611:c.A1862G:p.Q621R |
| 23 | chr21 | 34922801 | + | SON | CDS | NM_032195:c.A1264G:p.T422A |
| 24 | chr21 | 34923319 | + | SON | CDS | NM_032195:c.A1782G:p.L594L |
| 25 | chr21 | 46595620 | + | ADARB1 | intron | |
| 26 | chr3 | 53820892 | + | CACNA1D | CDS | NM_001128839:c.A4791G:p.I1597M |
| 27 | chr3 | 58141801 | + | FLNB | CDS | NM_001164319:c.A6815G:p.Q2272R |
| 28 | chr3 | 62423807 | - | CADPS | CDS | NM_183393:c.A3512G:p.E1171G |
| 29 | chr4 | 57976234 | - | IGFBP7 | CDS | NM_001253835:c.A284G:p.K95R |
| 30 | chr4 | 57976286 | - | IGFBP7 | CDS | NM_001253835:c.A232G:p.R78G |
| 31 | chr4 | 158257875 | + | GRIA2 | CDS | NM_000826:c.A1820G:p.Q607R |
| 32 | chr4 | 158257879 | + | GRIA2 | CDS | |
| 33 | chr4 | 158258136 | + | GRIA2 | intron | |
| 34 | chr4 | 158258137 | + | GRIA2 | intron | |
| 35 | chr4 | 158281294 | + | GRIA2 | CDS | NM_000826:c.A2290G:p.R764G |
| 36 | chr5 | 156736808 | + | CYFIP2 | CDS | NM_001037332:c.A958G:p.K320E |
| 37 | chr6 | 34100903 | - | GRM4 | CDS | NM_000841:c.A371G:p.Q124R |
| 38 | chr6 | 44120349 | + | TMEM63B | CDS | NM_018426:c.A1856G:p.Q619R |
| 39 | chr6 | 102337689 | + | GRIK2 | CDS | NM_001166247:c.A1699G:p.I567V |
| 40 | chr6 | 102337702 | + | GRIK2 | CDS | NM_001166247:c.A1712G:p.Y571C |
| 41 | chr6 | 102372589 | + | GRIK2 | CDS | NM_001166247:c.A1862G:p.Q621R |
| 42 | chr6 | 102372630 | + | GRIK2 | intron | |
| 43 | chr6 | 102374616 | + | GRIK2 | intron | |
| 44 | chr6 | 102374643 | + | GRIK2 | intron | |
| 45 | chr6 | 150093334 | + | PCMT1 | intron | |
| 46 | chr8 | 103841636 | - | AZIN1 | CDS | NM_148174:c.A1099G:p.S367G |
| 47 | chr8 | 103841637 | - | AZIN1 | CDS | NM_148174:c.A1098G:p.E366E |
| 48 | chr9 | 97847739 | + | mir23B | microRNA | |
| 49 | chrX | 114082682 | + | HTR2C | CDS | NM_000868:c.A466G:p.I156V |
| 50 | chrX | 114082684 | + | HTR2C | CDS | NM_000868:c.A468G:p.I156M |
| 51 | chrX | 114082689 | + | HTR2C | CDS | NM_000868:c.A473G:p.N158S |
| 52 | chrX | 114082694 | + | HTR2C | CDS | NM_000868:c.A478G:p.I160V |
| 53 | chrX | 122598962 | + | GRIA3 | CDS | NM_000828:c.A2323G:p.R775G |
| 54 | chrX | 122598998 | + | GRIA3 | intron | |
| 55 | chrX | 135111055 | + | SLC9A6 | intron | |
| 56 | chrX | 135111070 | + | SLC9A6 | intron | |
| 57 | chrX | 151358319 | - | GABRA3 | CDS | NM_000808:c.A1026G:p.I342M |
| 58 | chrX | 153579737 | - | FLNA | intron | |
| 59 | chrX | 153579950 | - | FLNA | CDS | NM_001456:c.A6998G:p.Q2333R |
List of conserved editing sites. Coordinates are based on genome version GRCh37/hg19. For each site, the table gives the chromosome, genomic coordinate, strand, gene name, and genomic compartment. Where available, it also gives the RefSeq ID, the editing change with its coordinate relative to that RefSeq ID, and the resulting amino acid change for the same RefSeq ID.
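The annotation strings in the last column follow a regular `RefSeqID:c.<ref><pos><alt>:p.<aa><pos><aa>` layout, so they can be split into fields mechanically. The following is a minimal sketch, assuming only the layout visible in the table; the names `Annotation` and `parse_annotation` are hypothetical and not part of the original study.

```python
import re
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    """Fields of one annotation string (hypothetical container, for illustration)."""
    refseq_id: str
    ref_base: str
    cdna_pos: int
    alt_base: str
    ref_aa: str
    protein_pos: int
    alt_aa: str  # one-letter amino acid code, or the literal "stop"

    @property
    def is_synonymous(self) -> bool:
        # Synonymous when the amino acid is unchanged, e.g. p.L594L.
        return self.ref_aa == self.alt_aa


# Pattern inferred from the table rows, e.g. "NM_001098398:c.A490G:p.I164V".
_PATTERN = re.compile(
    r"^(?P<refseq>[A-Z]{2}_\d+)"
    r":c\.(?P<ref>[ACGT])(?P<cpos>\d+)(?P<alt>[ACGT])"
    r":p\.(?P<raa>[A-Z])(?P<ppos>\d+)(?P<aaa>[A-Z]|stop)$"
)


def parse_annotation(text: str) -> Optional[Annotation]:
    """Return the parsed fields, or None for an empty or unrecognized cell."""
    m = _PATTERN.match(text.strip())
    if m is None:
        return None
    return Annotation(
        refseq_id=m["refseq"],
        ref_base=m["ref"],
        cdna_pos=int(m["cpos"]),
        alt_base=m["alt"],
        ref_aa=m["raa"],
        protein_pos=int(m["ppos"]),
        alt_aa=m["aaa"],
    )
```

Rows without an annotation (introns, UTRs, microRNAs) yield `None`, which keeps them easy to separate from coding sites when tallying synonymous and non-synonymous changes.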
Figure 1. Mammalian set of editing sites. (A) BLAST hits for the alignment of the human and mouse editing sets; the Y axis represents the alignment length and the X axis represents the percent identity. The conserved set is colored red, non-conserved hits are colored blue, and the linear separator is colored black. (B) Venn diagram of human editing sites shows that only a tiny fraction of the editing sites are conserved. The conserved sites are a small minority of the non-Alu sites as well. All sites (1,432,744) are colored blue, non-Alu sites (52,312) are colored yellow, and the 59 conserved sites are colored red. (C) Total number of known editing sites (red) and of conserved sites (blue) from the identification of the first editing sites until today. Identification of sites using next generation sequencing technologies began in 2009; this period is colored gray. While the total number of editing sites increased by six orders of magnitude during this period, the number of conserved sites barely increased. (D) Hit enrichment for editing sites compared to SNPs, using exactly the same pipeline, shows a high signal-to-noise ratio. The number of hits was normalized to the size of the all-sites dataset (left) and to the size of the non-Alu dataset (right).
Figure 2. The size of the ESS is almost independent of data accumulation. (A) An accumulation curve of editing sites per strain (data derived from Danecek et al., whole brain samples). Strain datasets are sorted in ascending order of the number of editing sites (that is, the first strain contains the fewest editing sites, the second is the strain that adds the fewest additional editing sites, and so on). This result shows that adding data does not add more conserved sites. (B-D) Visualization of sites per strain: ESS (B), random sites selected from all sites in the same proportion as the ESS (C), and all other sites (D). Editing signal is colored yellow; sites with no data (that is, fewer than three reads) are colored gray, and sites with no evidence for editing are colored blue. The heat maps demonstrate a strong editing signal for conserved sites across all mouse strains, in contrast to the non-conserved sites.
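The strain ordering described for panel A can be read as a greedy procedure: start with the strain that has the fewest sites, then repeatedly add the strain that contributes the fewest new sites. The sketch below illustrates that reading with toy data; the function name `accumulation_curve`, the strain labels, and the site identifiers are hypothetical, and the tie-breaking by strain name is an assumption added only to make the output deterministic.

```python
from typing import Dict, List, Set, Tuple


def accumulation_curve(sites_by_strain: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Greedy ordering: each step adds the strain contributing the fewest new sites.

    Returns (strain, cumulative number of distinct sites) in the order chosen.
    """
    remaining = dict(sites_by_strain)  # shallow copy; the input is not modified
    seen: Set[str] = set()
    curve: List[Tuple[str, int]] = []
    while remaining:
        # Fewest not-yet-seen sites first; ties broken by name (assumption).
        name = min(remaining, key=lambda s: (len(remaining[s] - seen), s))
        seen |= remaining.pop(name)
        curve.append((name, len(seen)))
    return curve


# Toy example with made-up strains and site IDs.
toy = {
    "strainA": {"s1", "s2", "s3"},
    "strainB": {"s1", "s2"},
    "strainC": {"s2", "s3", "s4", "s5"},
}
print(accumulation_curve(toy))
# [('strainB', 2), ('strainA', 3), ('strainC', 5)]
```

Running the same function on only the conserved sites and on all sites would give the two curves being compared: a flat curve means additional strains contribute no new sites.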
Figure 3. Most of the ESS sites are located in a coding region or adjacent to such a site. (A) Genomic location of evolutionarily conserved sites. (B) Frequency of non-synonymous editing alterations in exonic sites for both groups demonstrates enrichment of sites that cause an amino acid change in the ESS compared to the control (all other sites; P < 2 × 10⁻¹¹, calculated by Fisher’s exact test). (C-E) Secondary structure shows spatial proximity of coding and intronic sites of the GRIK2 (C), FLNA (D) and BLCAP (E) genes; editing sites are highlighted in orange and marked by an arrow.
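The enrichment test in panel B compares two proportions in a 2 × 2 table (non-synonymous versus synonymous, ESS versus control). As a rough illustration of how Fisher's exact test works on such a table, the sketch below computes a one-sided P value directly from the hypergeometric distribution. The caption does not state whether the reported test was one- or two-sided, so the one-sided form here is an assumption, and the counts in the example are made up rather than taken from the study.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of a top-left count
    greater than or equal to the observed `a`.
    """
    row1 = a + b          # e.g. all exonic sites in the first group
    col1 = a + c          # e.g. all non-synonymous sites
    n = a + b + c + d
    max_a = min(row1, col1)
    total = comb(n, col1)
    # Sum hypergeometric probabilities over tables at least as extreme.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, max_a + 1))
    return tail / total


# Made-up counts, for illustration only:
#                 non-synonymous  synonymous
# group 1               3             1
# group 2               1             3
print(fisher_exact_greater(3, 1, 1, 3))  # 17/70, about 0.243
```

With real counts, a small value indicates that the first group contains more non-synonymous sites than expected if the two groups shared the same proportion.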
Figure 4. Neighbor preferences for ESS and all sites. Nucleotide frequency for ESS (A) and all non-Alu sites (B). Both signatures are in agreement with the ADAR motif.
Figure 5. ESS exhibit significantly higher and more consistent editing levels and higher expression levels compared to all other sites. (A) Distribution of editing levels for ESS (black) and all other sites (white) (*P < 10⁻⁶, Fisher’s exact test). (B) Mean editing levels for ESS versus all other sites (*P < 7 × 10⁻²², two-tailed Student’s t-test). (C) Mean standard deviation for ESS and control (*P < 4.6 × 10⁻⁸, two-tailed Student’s t-test). (D, E) ESS exhibit higher expression levels, as demonstrated by box plot (D) and by mean expression levels (E) (*P < 10⁻²⁸, two-tailed Student’s t-test).
Figure 6. A-to-I editing as a mechanism for the reversion of G-to-A evolution. All mouse editing sites were converted to human genome coordinates; the G-to-A ratio was calculated and fixed as 1 (left). All human editing sites were converted to mouse genome coordinates; the G-to-A ratio was calculated and normalized (right), exhibiting 1.66-fold enrichment compared to the mouse-to-human conversion (*P = 10⁻⁷, Fisher’s exact test).
Figure 7. Editing and exonization in the SLC9A6 gene. (A) Schematic illustration of exons 12 to 14 of the SLC9A6 gene. Exons are depicted as blue boxes; the LINE inverted repeats are depicted as red boxes. Sense and antisense LINEs are expected to create a dsRNA secondary structure, thereby allowing RNA editing. The two editing sites are indicated in orange, revealing an R/G amino acid change. (B) Validation of editing by Sanger sequencing of genomic DNA (upper panel) and cDNA (lower panel) from the same individual; editing sites are marked by arrows.