Raili Ermel, Katyayani Sukhavasi, Oscar Franzén, Rajeev Jain, Anamika Jain, Christer Betsholtz, Chiara Giannarelli, Jason C Kovacic, Arno Ruusalepp, Josefin Skogsberg, Ke Hao, Eric E Schadt, Johan L M Björkegren.
Abstract
RNA editing modifies transcripts and may alter their regulation or function. In humans, the most common modification is adenosine to inosine (A-to-I). We examined the global characteristics of RNA editing in 4,301 human tissue samples. More than 1.6 million A-to-I edits were identified in 62% of all protein-coding transcripts. mRNA recoding was extremely rare; only 11 novel recoding sites were uncovered. Thirty single nucleotide polymorphisms from genome-wide association studies were associated with RNA editing; one that influences type 2 diabetes (rs2028299) was associated with editing in ARPIN. Twenty-five genes, including LRP11 and PLIN5, had editing sites that were associated with plasma lipid levels. Our findings provide new insights into the genetic regulation of RNA editing and establish a rich catalogue for further exploration of this process.
Keywords: Bioinformatics; Biostatistics; Gene expression; Quantitative trait loci; RNA editing; RNA-seq
Year: 2018 PMID: 29527417 PMCID: PMC5844249 DOI: 10.7717/peerj.4466
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. General characteristics of RNA editing events.
(A) Simplified flowchart of the data analysis pipeline. See Fig. S1 for details. MHC, major histocompatibility complex. (B) Percentages of RNA editing types that were identified. The x-axis shows the editing type, and the y-axis shows the percentage of total editing calls. (The plot counts redundant sites, i.e., if the same RNA editing site is found in two samples, then the site is counted twice.) Editing events were identified in strand-specific libraries. Changes indicated in red (x-axis) are canonical events. The majority of events are consistent with A-to-I editing. (C) As (B) except that editing events were identified in non-strand-specific libraries. (D) Relationship between sequencing depth after collapsing of PCR duplicates (x-axis; million reads) and number of A-to-G(I) editing calls (y-axis). Each dot represents a liver sample from one subject. Only the non-strand-specific samples are shown. (E) Percentage of A-to-G(I) events in various genomic repeats. The repeat family/class is shown on the x-axis and identified events/genomic coverage in percent is on the y-axis. Alu is the only repeat class that shows enrichment compared with the total genomic coverage. (F) Pie chart of A-to-G(I) events in Alu subfamilies (S, Y, and J). (G) The box plot shows sample mean editing ratios of A-to-G(I) events. One outlier was removed from the Y subfamily. Editing events in Alu Y have significantly higher editing ratios compared with those in Alu J and S (Mann–Whitney U test; P < 2.2e−16).
Summary of data and RNA editing events.
| Library type | Seq. read length (bp) | No. samples | Seq. depth (×10⁹) | Total no. RNA editing events | No. A-to-G(I) events | No. C-to-T(U) events |
|---|---|---|---|---|---|---|
| Strand-specific | 50 | 2,267 | 20.6 | 714,372 | 450,002 | 40,116 |
| Non-strand-specific | 100 | 2,034 | 22.3 | 2,217,412 | 1,484,357 | 375,935 |
Notes.
Seq. depth: total read count after collapsing of PCR duplicates (billion single-end sequencing reads).
No. A-to-G(I) events: includes T-to-C calls.
No. C-to-T(U) events: includes G-to-A calls.
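The notes above reflect that a read from a non-strand-specific library cannot be assigned to a transcript strand, so a mismatch and its reverse complement are indistinguishable (T-to-C is pooled with A-to-G(I), G-to-A with C-to-T(U)). A minimal Python sketch of that pooling follows, together with the A-to-G(I) share of all events computed from the table's counts. The function name `canonical_mismatch` and the choice of the lexicographically smaller label as the pooled label are illustrative assumptions, not the authors' pipeline.

```python
# Complement of each base, used to flip a mismatch to the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def canonical_mismatch(ref, alt, stranded):
    """Return the mismatch label used for counting (hypothetical helper).

    Stranded libraries keep the observed label. Unstranded libraries pool a
    mismatch with its complement; the smaller tuple is used as the label, so
    T>C becomes A>G and G>A becomes C>T.
    """
    if stranded:
        return (ref, alt)
    flipped = (COMPLEMENT[ref], COMPLEMENT[alt])
    return min((ref, alt), flipped)

# Counts copied from the summary table: (total events, A-to-G(I) events).
table = {
    "Strand-specific": (714_372, 450_002),
    "Non-strand-specific": (2_217_412, 1_484_357),
}
a_to_g_share = {lib: a_to_g / total for lib, (total, a_to_g) in table.items()}

for lib, share in a_to_g_share.items():
    print(f"{lib}: {share:.1%} of events are A-to-G(I)")
```

With the table's counts, A-to-G(I) calls make up roughly 63% of events in the strand-specific libraries and roughly 67% in the non-strand-specific ones.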
Figure 2. Expression of ADAR1 and ADAR2 across studied tissues.
Box plots show gene expression of (A) ADAR1 and (B) ADAR2 in RPKM (y-axis). AOR, aorta; MAM, internal mammary artery; MAC, macrophage; FOC, foam cell; LIV, liver; SKM, skeletal muscle; SUF, subcutaneous fat; VAF, visceral fat; BLO, whole blood.
Novel A-to-G(I) mRNA-recoding sites.
| Site | Gene | Amino acid change | Tissue(s) | Median editing ratio | N |
|---|---|---|---|---|---|
| 4:57,110,146 | | E69G | AOR, MAM | 0.11 | 197 |
| 7:38,262,191 | | N58S | BLO | 0.40 | 37 |
| 8:144,247,668 | | S1037G, S1028G | LIV, VAF | 0.47 | 94 |
| 9:33,271,197 | | K121E | BLO, SUF | 0.07 | 853 |
| 10:45,789,442 | | K1158R, | BLO, LIV, SUF, | 0.50 | 216 |
| 12:57,625,434 | | R500G | MAM | 0.29 | 74 |
| 16:30,188,879 | | E434G | VAF | 0.17 | 22 |
| 17:1,534,142 | | D242G | LIV | 0.08 | 55 |
| 19:7,520,407 | | S389G | MAM | 0.09 | 95 |
| 19:54,221,229 | | Q270R | FOC, MAC | 0.21 | 251 |
| 19:54,221,256 | | E261G | FOC, MAC | 0.50 | 223 |
Notes.
Genomic coordinate of the site (GRCh38). Given as chromosome:position.
Tissue abbreviations: aortic wall (AOR), internal mammary artery (MAM), whole blood (BLO), liver (LIV), visceral fat (VAF), subcutaneous fat (SUF), skeletal muscle (SKM), foam cells (FOC), and macrophages (MAC).
Median editing ratio: across all samples.
N: number of samples in which editing was found. This number may include samples whose editing signal was lower than required to meet the criteria for detection.
A-to-G(I) RNA editing sites associated with clinical parameters.
| Site | Tissue | Trait | P-value | FDR | Gene | Region |
|---|---|---|---|---|---|---|
| 1:204,556,653 | LIV | HDL | 2.7e−06 | 0.04 | | 3′ UTR |
| 2:37,100,523 | BLO | LDL | 5.1e−06 | 0.03 | | 3′ UTR |
| 6:149,822,432 | SUF | LDL | 7.0e−06 | 0.01 | | Intron |
| 10:50,804,961 | LIV | LDL/HDL | 5.7e−07 | 0.01 | | 3′ UTR |
| 12:7,097,577 | LIV | LDL/HDL | 5.9e−07 | 0.01 | | Intron |
| 12:68,843,961 | LIV | 18:2, linoleic acid | 8.2e−08 | 0.003 | | 3′ UTR |
| 19:4,526,500 | LIV | HDL | 3.1e−07 | 0.005 | | Intron |
| 19:6,683,589 | LIV | HDL | 2.7e−06 | 0.04 | | Intron, RE |
| 19:11,449,682 | LIV | VLDL | 1.4e−06 | 0.02 | | Intron |
Notes.
Genomic coordinate of the site (GRCh38). Given as chromosome:position.
Tissue abbreviations: aortic wall (AOR), internal mammary artery (MAM), whole blood (BLO), liver (LIV), visceral fat (VAF), subcutaneous fat (SUF), skeletal muscle (SKM), foam cells (FOC), and macrophages (MAC).
False discovery rate (permutation P-value). Based on 10,000 permutations.
Regulatory element (RE).
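The permutation-based significance estimate mentioned in the notes can be sketched as follows. This minimal Python example assumes the test statistic is the absolute Pearson correlation between editing ratio and trait value, and it uses the common (hits + 1) / (permutations + 1) estimator. The source states only that 10,000 permutations were used, so the statistic, the function names and the toy data are all assumptions.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(ratios, trait, n_perm=10_000, seed=0):
    """Fraction of trait shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = abs(pearson(ratios, trait))
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break any link between ratio and trait
        if abs(pearson(ratios, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one estimator, never exactly zero

# Toy data (made up): editing ratios and a lipid trait in eight samples.
toy_ratios = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]
toy_trait = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]
p_value = permutation_p_value(toy_ratios, toy_trait)
```

Because the smallest value this estimator can return is 1 / (n_perm + 1), the number of permutations sets a floor on the P-values that can be reported.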
Figure 3. RNA editing quantitative trait loci.
(A) Schematic illustration of an RNA editing QTL (edQTL). The genomic marker (e.g., a SNP) is associated with RNA editing levels at a specific site in a transcript. (B) Summary of number of identified (index) edQTLs per tissue. (C–E) Box plots of edQTLs where regulatory SNPs coincide with lead SNPs from GWAS. P-values refer to non-rank-normal transformed data. The genomic coordinate of the RNA editing site and the overlapping gene are shown on top of each plot. The GWAS SNP is shown at the bottom together with its reported trait. The width of the box is proportional to the number of observations it contains. Rings are outlier samples. The molecular distance is indicated on the top of each subplot: (cis) intra-chromosome association; and (trans) inter-chromosome association. GWAS P-values: rs2573346 (sarcoidosis), P = 9e−16; rs2028299 (type 2 diabetes), P = 1e−11; rs78579285 (joint mobility), P = 6e−12; rs7546668 (glomerular filtration rate), P = 1e−09; rs4073054 (HDL), P = 5e−11; rs1127311 (C-reactive protein or triglycerides), P = 6e−09; rs3947 (blood protein levels), P = 2e−27; rs4957048 (multiple sclerosis/ulcerative colitis), P = 1e−09; rs6756513 (platelet count), P = 7e−10; rs10847434 (coronary artery disease), P = 6e−15; and rs1883350 (fatty liver disease), P = 4e−10.
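Panel (A) describes an edQTL as an association between a genomic marker and the editing level at a specific site. One simple way to illustrate the idea is a least-squares regression of editing ratio on genotype dosage (0, 1 or 2 copies of the alternate allele). The additive model, the function name `edqtl_slope` and the toy values below are assumptions; the caption does not specify the association test that was used.

```python
def edqtl_slope(dosages, editing_ratios):
    """Least-squares slope and intercept of editing ratio on allele dosage."""
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(editing_ratios) / n
    cov = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(dosages, editing_ratios))
    var_g = sum((g - mean_g) ** 2 for g in dosages)
    if var_g == 0:
        raise ValueError("marker is monomorphic in these samples")
    slope = cov / var_g                  # change in editing ratio per allele copy
    intercept = mean_e - slope * mean_g  # expected ratio at dosage 0
    return slope, intercept

# Toy data (made up): six samples, two per genotype class.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_editing = [0.10, 0.12, 0.20, 0.22, 0.30, 0.32]
slope, intercept = edqtl_slope(toy_dosages, toy_editing)
```

A non-zero slope in such a model corresponds to the stepwise shift between genotype groups that the box plots in panels (C–E) display.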