Zhongzhen Liu1, Taifu Wang1, Xi Yang1, Qing Zhou1, Sujun Zhu2, Juan Zeng2, Haixiao Chen1, Jinghua Sun1,3, Liqiang Li1, Jinjin Xu1, Chunyu Geng1, Xun Xu1, Jian Wang1, Huanming Yang1, Shida Zhu1, Fang Chen1, Wen-Jing Wang1.
Abstract
BACKGROUND: Cell-free messenger RNA (cf-mRNA) and long non-coding RNA (cf-lncRNA) are becoming increasingly important in liquid biopsy by providing biomarkers for disease prediction, diagnosis and prognosis, but the simultaneous characterization of coding and non-coding RNAs in human biofluids remains challenging.
Keywords: RNA sequencing; biofluids; cell-free RNA; in vitro diagnosis
Year: 2022 PMID: 35858042 PMCID: PMC9299576 DOI: 10.1002/ctm2.987
Source DB: PubMed Journal: Clin Transl Med ISSN: 2001-1326
FIGURE 1 Schema of polyadenylation ligation‐mediated sequencing (PALM‐Seq) library preparation and data analysis workflow. (A) Experimental design. The library preparation process includes polynucleotide kinase (PNK) treatment and targeted depletion of abundant RNAs. (B) Data analysis workflow. Small RNAs are quantified by counts and reads per million (RPM), whereas mRNAs and long non‐coding RNAs (lncRNAs) are quantified by counts and transcripts per million (TPM).
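The two normalizations named in panel (B) can be sketched as follows. This is a minimal illustration using the standard textbook definitions of RPM and TPM; the caption does not give the authors' exact formulas or tooling, and the function names and the `lengths_bp` argument are hypothetical.

```python
def rpm(counts):
    """Reads per million: each count scaled by the total number of counted reads."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: c * 1e6 / total for name, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: normalize by length first, then scale to 1e6.

    lengths_bp is an assumed mapping of gene name -> effective length in bp.
    """
    # Reads per kilobase of transcript.
    rate = {name: counts[name] / (lengths_bp[name] / 1000) for name in counts}
    total = sum(rate.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: r * 1e6 / total for name, r in rate.items()}


if __name__ == "__main__":
    # Toy numbers, not data from the study.
    print(rpm({"miR-a": 1, "miR-b": 3}))
    print(tpm({"g1": 100, "g2": 100}, {"g1": 1000, "g2": 2000}))
```

RPM suits small RNAs because their lengths are nearly uniform, whereas TPM corrects for the wide length range of mRNAs and lncRNAs before scaling.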
FIGURE 2 Biotype distribution and complexity of RNA in different treatments of polyadenylation ligation‐mediated sequencing (PALM‐Seq). (A) The percentage of mapped reads of different RNA biotypes obtained by different treatments. (B) The number of mRNAs (left panel) and long non‐coding RNAs (lncRNAs) (right panel) detected by different treatments. (C) Venn diagrams showing the number of mRNAs (left) and lncRNAs (right) detected by different treatments. A gene is counted as detected in a treatment group if it is found in at least two samples of that group. (D) Profiles of mRNA (upper) and lncRNA (lower) with and without T4 polynucleotide kinase (PNK) treatment and targeted RNA depletion. The Pearson correlation is calculated from log2(TPM + 1) values using simple linear regression. (E) Boxplots showing the percentage of reads located on the sense and antisense strands. Reads were assigned to exons, introns or promoters of mRNAs or lncRNAs. The Wilcoxon rank‐sum test is used to assess significance, with p < .001. (F) The distribution of fragment length of mRNAs (upper panel) and lncRNAs (lower panel) by different treatments. The X‐axis shows the read length (bp), whereas the Y‐axis shows the proportion of fragments of each length. (G) The ratio of pyrimidine versus purine at the 3′ ends and the rate of adenine at the 5′ end of mRNAs (upper) and lncRNAs (lower) in PALM‐Seq libraries. The p value was calculated by paired two‐tailed Student's t test, and significance levels are denoted as follows: not significant (N.S.), *p < .05, **p < .01, ***p < .001.
FIGURE 3 Characterization of plasma cell‐free RNA (cfRNA) from pregnant women. (A) Heat map of the abundance of placenta‐specific genes that overlap with differentially abundant cell‐free messenger RNA (cf‐mRNA) and long non‐coding RNA (cf‐lncRNA) between non‐pregnant and pregnant females at different trimesters. The data were converted to log2(TPM + 1) and then scaled and clustered by rows. (B) Heat map of the abundance of differentially abundant microRNA (miRNA). The data were converted to log2(RPM + 1) and then scaled by rows. (C) Abundance of selected mRNA in polyadenylation ligation‐mediated sequencing (PALM‐Seq) (transcripts per million [TPM]) and reverse transcription‐quantitative PCR (RT‐qPCR) (ΔCT, normalized to B2M). (D) Abundance of selected miRNA in PALM‐Seq (RPM) and RT‐qPCR (ΔCT, normalized to RN7SL). (E) Boxplots (left) summarize the read length distributions for representative placenta‐specific mRNA and lncRNA across four pregnant females at three time points. Boxes represent the interquartile range (IQR), and whiskers extend 1.5 × IQR beyond the first and third quartiles. The right panel shows the log2(TPM + 1) of the corresponding genes. (F) The ratio of pyrimidine to purine at the 3′ ends of mRNAs (left) and lncRNAs (right) in non‐pregnant and pregnant women at three time points. N: non‐pregnant; T1: trimester 1; T2: trimester 2; T3: trimester 3. (G) Principal component analysis (PCA) plot of the frequency of 4‐mer end motifs of mRNA (left panel) and lncRNA (right panel). Each point represents an individual non‐pregnant woman or a pregnant woman at one of the three time points.
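The heat map preprocessing described in panels (A) and (B), a log2(x + 1) transform followed by row scaling, might look like the sketch below. It assumes that "scaled by rows" means a per-gene z-score using the sample standard deviation; the caption does not state the scaling method, and the function name is hypothetical.

```python
import math
import statistics


def log2_row_zscore(matrix):
    """Apply log2(x + 1) to every value, then z-score each row (one row per gene).

    Assumption: row scaling is a z-score with the sample standard deviation.
    Rows with no variance (or a single value) are returned as zeros.
    """
    scaled = []
    for row in matrix:
        logged = [math.log2(v + 1) for v in row]
        mean = statistics.fmean(logged)
        sd = statistics.stdev(logged) if len(logged) > 1 else 0.0
        scaled.append([0.0 if sd == 0 else (x - mean) / sd for x in logged])
    return scaled


if __name__ == "__main__":
    # Toy TPM values for two genes across three samples, not study data.
    for row in log2_row_zscore([[0, 1, 3], [5, 5, 5]]):
        print(row)
```

Row scaling removes differences in absolute abundance between genes, so the heat map colors reflect each gene's relative change across samples.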
FIGURE 4 Characterization of cell‐free RNAs (cfRNAs) of different biofluids. (A) Number of mRNAs and microRNAs (miRNAs) detected in different biofluids. (B) Principal component analysis (PCA) plot using transcripts per million (TPM) of mRNA (upper) and long non‐coding RNA (lncRNA) (lower). Each point represents a single biofluid sample. (C) Heat map of differentially abundant genes in five biofluids (left panel). The data were converted to log2(TPM + 1) and then scaled and clustered by rows. (D) Abundance of representative genes in different biofluids. X‐axis represents different biofluids. Y‐axis represents values of log2(TPM + 1). (E) Plot of the tissue of origin of the enriched genes in different biofluids. Y‐axis represents specific tissues; X‐axis represents the ‘Rich Factor’, which is the percentage of all the user‐provided genes that are found in the given ontology term. (F) Plot of the enriched gene ontology (GO) biological processes and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of biofluid‐specific mRNAs. Y‐axis represents the pathways; X‐axis represents the ‘Rich Factor’. (G) The immune scores calculated by CIBERSORT (upper panel) and xCell (lower panel) in all five body fluids. Two‐tailed Student's t test was used to calculate the p value, and significance levels are denoted as follows: not significant (N.S.), *p < .05, **p < .01, ***p < .001.
FIGURE 5 Characterization of the cell‐free RNA (cfRNA) fragments in different biofluids. (A) The distribution of fragment length of mRNAs (left panel) and long non‐coding RNAs (lncRNAs) (right panel) in five biofluids using the paired‐end sequencing data. The X‐axis shows the read length (bp), whereas the Y‐axis shows the proportion of fragments of each length. (B) The percentage of mRNA (left panel) and lncRNA (right panel) fragments with different lengths. The fragments are divided into <50 bp, 50–200 bp and >200 bp. (C) The number of mRNAs (left panel) and lncRNAs (right panel) with long fragments (>200 bp), divided by abundance value. (D) Frequency plot of each of the four nucleotides at each position of the 4‐mer end motifs of mRNAs (upper panel) and lncRNAs (lower panel) in five body fluids. (E) Principal component analysis (PCA) plots based on the frequency of all 256 possible 4‐mer end motifs of mRNAs (upper panel) and lncRNAs (lower panel). Each point represents a single biofluid sample. (F) The ratio of pyrimidine to purine at the 3′ ends (left panel) and the rate of adenine at the 5′ end (right panel) of mRNAs and lncRNAs in different biofluids. Universal Human Reference RNA (UHRR) fragmented by heat or RNase A treatment is used as a control for intracellular RNAs. (G) Scatter plot showing the correlation between the rate of short fragments (<50 bp) and the ratio of pyrimidine to purine at the 3′ ends (left panel) and the rate of adenine at the 5′ end (right panel).
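The fragment-level summaries in this caption (length bins, 4‐mer end motif frequencies, 3′ pyrimidine/purine ratio and 5′ adenine rate) can be illustrated with the sketch below. It assumes fragments are given as DNA-alphabet strings written 5′ to 3′ and that the end motif is the first four bases at the 5′ end; the caption does not specify which end is used, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

PYRIMIDINES = "CT"  # sequencing reads are cDNA, so U appears as T
PURINES = "AG"


def length_bins(fragments):
    """Fraction of fragments in the <50, 50-200 and >200 bp bins."""
    bins = Counter()
    for f in fragments:
        n = len(f)
        bins["<50" if n < 50 else "50-200" if n <= 200 else ">200"] += 1
    total = len(fragments)
    return {k: (bins[k] / total if total else 0.0) for k in ("<50", "50-200", ">200")}


def end_motif_frequencies(fragments, k=4):
    """Frequency of every possible k-mer (256 for k=4) at the assumed 5' end."""
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    valid = set(motifs)
    counts = Counter(f[:k] for f in fragments if f[:k] in valid)
    total = sum(counts.values())
    return {m: (counts[m] / total if total else 0.0) for m in motifs}


def three_prime_py_pu_ratio(fragments):
    """Pyrimidine count divided by purine count at the last (3') base."""
    last = Counter(f[-1] for f in fragments if f)
    py = sum(last[b] for b in PYRIMIDINES)
    pu = sum(last[b] for b in PURINES)
    return py / pu if pu else float("nan")


def five_prime_adenine_rate(fragments):
    """Fraction of non-empty fragments whose first (5') base is adenine."""
    firsts = [f[0] for f in fragments if f]
    return firsts.count("A") / len(firsts) if firsts else float("nan")


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGTTT", "GGGGAA", "TTTTTC"]  # toy reads, not study data
    print(length_bins(toy))
    print(end_motif_frequencies(toy)["ACGT"])
    print(three_prime_py_pu_ratio(toy), five_prime_adenine_rate(toy))
```

The 256-element motif frequency vector per sample is the kind of input a PCA such as the one in panel (E) would take, one row per sample.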