Hongyang Li1, Ridvan Eksi1, Daiyao Yi1, Bradley Godfrey2, Lisa R Mathew3, Christopher L O'Connor2, Markus Bitzer2, Matthias Kretzler1,2, Rajasree Menon1,2, Yuanfang Guan1,2.
Abstract
Studying isoform expression at the microscopic level has always been a challenging task. A classical example is the kidney, where the glomerular and tubulo-interstitial compartments carry out drastically different physiological functions and thus presumably also differ in isoform expression. We aimed to develop an experimental and computational pipeline for identifying isoforms at the level of microscopic structures. We microdissected glomerular and tubulo-interstitial compartments from healthy human kidney tissues from two cohorts. The two compartments were sequenced separately on the PacBio RS II platform. The resulting transcripts were then validated against transcripts obtained from the same samples with the traditional Illumina RNA-Seq protocol, against independent Illumina RNA-Seq short reads from European Renal cDNA Bank (ERCB) samples, and against the annotated GENCODE transcript list, which allowed us to identify novel transcripts. We identified 14,739 and 14,259 annotated transcripts, and 17,268 and 13,118 potentially novel transcripts, in the glomerular and tubulo-interstitial compartments, respectively. Of note, relying solely on either short or long reads would have resulted in many erroneous identifications. We identified distinct pathways involved in the glomerular and tubulo-interstitial compartments at the isoform level, creating an important experimental and computational resource for the kidney research community.
Year: 2022 PMID: 35468141 PMCID: PMC9037928 DOI: 10.1371/journal.pcbi.1010040
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 2. Variation of transcription start and end positions of consensus full-length transcripts.
(A) Thirteen multi-exon consensus full-length transcripts (CFLs) from the AIF1 gene locus. (B) The first exon, with vertical green lines marking the locations of annotated transcription start sites (TSS). The thirteen CFLs have a total of 13 different TSS, of which only 6 are within 10 bp of an annotated TSS. (C) The last exon, with a vertical red line marking the only annotated transcription end site (TES). The thirteen CFLs have a total of 7 different TES, of which only 4 are within 10 bp of an annotated TES.
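The 10 bp tolerance used in this caption can be illustrated with a minimal sketch. The function name, the inclusive `<=` comparison, and all coordinates below are assumptions made for illustration; this is not the authors' pipeline and the positions are not taken from the AIF1 locus.

```python
from bisect import bisect_left


def count_within_tolerance(observed, annotated, tol=10):
    """Count observed positions lying within `tol` bp of any annotated position."""
    annotated = sorted(annotated)
    hits = 0
    for pos in observed:
        i = bisect_left(annotated, pos)
        # The nearest annotated neighbours sit at indices i-1 and i, if present.
        neighbours = annotated[max(i - 1, 0):i + 1]
        if any(abs(pos - a) <= tol for a in neighbours):
            hits += 1
    return hits


# Hypothetical coordinates, for illustration only.
observed_tss = [1000, 1012, 1050, 1200]
annotated_tss = [1005, 1195]
n_supported = count_within_tolerance(observed_tss, annotated_tss)
```

Sorting the annotated sites once and using a binary search keeps the check cheap when many observed ends are compared against a large annotation.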
Validation of PacBio transcript features from kidney tissue.
| | TSS | Splice Junctions | TES | Validated Multi-exon CFLs |
|---|---|---|---|---|
| Glomerular | ||||
| PacBio with Illumina ERCB | ||||
| Annotation only | 3732 | 1194 | 4850 | 47785 |
| Short reads only | 5583 | 39 | ||
| Short reads or annotation | 111506 | 5071 | ||
| Total | 38831 | 171742 | 26260 | 71492 |
| Tubulointerstitial | ||||
| PacBio with Illumina TN | ||||
| Annotation only | 3534 | 7329 | 5278 | 53540 |
| Short reads only | 3105 | 8 | ||
| Short reads or annotation | 119932 | 5318 | ||
| PacBio with Illumina ERCB | ||||
| Annotation only | 516 | 4883 | ||
| Short reads only | 8824 | 104 | ||
| Short reads or annotation | 125651 | 5425 | ||
| Total | 42896 | 185185 | 24625 | 75573 |
TSS, transcription start site; TES, transcription end site; CFL, consensus full-length transcript; ERCB, European Renal cDNA Bank; TN, tumor nephrectomy.
Classification of validated-collapsed isoforms to GENCODE annotation using the Cuffcompare tool.
| | Type of Match | Validated Tubulo-interstitial Isoforms | Validated Glomerular Isoforms |
|---|---|---|---|
| 1 | Complete match of intron chain with an annotated isoform | 10407 | 10882 |
| 2 | Contained within a reference isoform | 3852 | 3857 |
| | Total annotated transcripts | 14259 | 14739 |
| 3 | Potentially novel isoform | 8910 | 7767 |
| 4 | Intergenic transcript | 4208 | 9501 |
| | Total novel transcripts | 13118 | 17268 |
| 5 | A transfrag falling entirely within a reference intron | 11407 | 16627 |
| 6 | Single exon transfrag overlapping a reference exon and at least 10 bp of a reference intron | 3729 | 4956 |
| 7 | Exonic overlap with reference on the opposite strand | 2310 | 3420 |
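The subtotals in this table follow directly from the per-class counts, which the short sketch below recomputes. The counts are copied from the table. The single-character keys are an assumed mapping of each row onto Cuffcompare class codes (`=`, `c`, `j`, `u`, `i`, `e`, `x`), since the source lists only the descriptions; the grouping names are likewise hypothetical.

```python
# Per-class counts copied from the table above.
# Keys are an ASSUMED mapping of the table rows onto Cuffcompare class codes.
counts = {
    "=": {"tubulointerstitial": 10407, "glomerular": 10882},  # complete intron-chain match
    "c": {"tubulointerstitial": 3852, "glomerular": 3857},    # contained in a reference isoform
    "j": {"tubulointerstitial": 8910, "glomerular": 7767},    # potentially novel isoform
    "u": {"tubulointerstitial": 4208, "glomerular": 9501},    # intergenic transcript
    "i": {"tubulointerstitial": 11407, "glomerular": 16627},  # entirely within a reference intron
    "e": {"tubulointerstitial": 3729, "glomerular": 4956},    # single-exon, exon/intron overlap
    "x": {"tubulointerstitial": 2310, "glomerular": 3420},    # opposite-strand exonic overlap
}

ANNOTATED = ("=", "c")  # rows 1-2: counted as annotated transcripts
NOVEL = ("j", "u")      # rows 3-4: counted as potentially novel transcripts


def total(codes, compartment):
    """Sum the counts of the given classes for one compartment."""
    return sum(counts[code][compartment] for code in codes)


for compartment in ("glomerular", "tubulointerstitial"):
    print(compartment, total(ANNOTATED, compartment), total(NOVEL, compartment))
```

The recomputed sums agree with the totals quoted in the abstract; rows 5 to 7 are not included in either total.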