Jan Pawel Jastrzebski, Aleksandra Lipka, Marta Majewska, Karol G Makowczenko, Lukasz Paukszto, Joanna Bukowska, Slawomir Dorocki, Krzysztof Kozlowski, Mariola Slowinska.
Abstract
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 bp that are not translated into proteins. LncRNAs are considered important regulators of many biological processes, acting mainly through the regulation of gene expression and interactions with proteins. However, the detailed mechanisms of these interactions and the functions of lncRNAs remain unclear and therefore constitute a serious research challenge. In this study, potential mechanisms by which lncRNAs regulate processes related to sperm motility in turkey were investigated and described for the first time. A customized bioinformatics analysis was used to detect and identify lncRNAs, and their correlations with differentially expressed genes and proteins were also investigated. The results revealed the expression of 863 new/unknown lncRNAs in the ductus deferens, testes and epididymis of turkeys. Moreover, potential relationships of the lncRNAs with the coding mRNAs and their products were identified in turkey reproductive tissues. The results obtained from the OMICS study may be useful in describing and characterizing the way that lncRNAs regulate genes and proteins as well as signaling pathways related to sperm motility.
Keywords: DEL; NGS; RNA-seq; gene expression; lncRNA; sperm motility; transcriptomics; turkey
Year: 2022 PMID: 35887003 PMCID: PMC9324027 DOI: 10.3390/ijms23147642
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Overall statistics of sequenced, mapped and processed data. The read values are presented in millions. Min/max are the minimum and maximum values. Mean values are calculated for all six samples for each tissue.
| T (Min/Max) | T (Mean) | E (Min/Max) | E (Mean) | DD (Min/Max) | DD (Mean) |
|---|---|---|---|---|---|
| 127.9/165.8 | 146.9 | 126.4/176.9 | 152.0 | 102.6/167.7 | 145.9 |
| 110.7/140.6 | 123.8 | 96/151.9 | 127.5 | 86.2/137.6 | 121.4 |
| 81.1/86.5 | 84.3 | 76/87.2 | 83.5 | 82/84.7 | 83.2 |
| 78/100 | 87.6 | 64.8/117.3 | 93.0 | 62.3/99.9 | 84.8 |
| 76.2/97.6 | 85.9 | 62.6/115.8 | 91.6 | 61.3/98.4 | 83.5 |
| 97.6/98.6 | 98.00 | 96.6/98.9 | 98.3 | 98.2/98.6 | 98.4 |
| 59,151/73,569 | 67,594.3 | 22,109/61,868 | 47,862.8 | 33,355/47,861 | 41,412.5 |
| 36,879/43,510 | 40,405.3 | 17,596/34,711 | 28,438.5 | 21,196/27,526 | 24,970.2 |
Figure 1. The flowchart of the lncRNA identification process. The process consists of several steps (described in the main text) grouped into three stages: “input data”, preparing and reading the data from the previous steps (colored orange) and reference data (colored blue); “filtering steps”, filtering the transcripts based on annotation (i.e., extracting known lncRNAs and removing protein-coding transcripts) and on structural features such as sequence length and the number of exons; and “coding probability”, screening the sequence dataset based on the coding probability.
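The three stages named in the Figure 1 caption can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Transcript` record, the `split_candidates` function and the exon-count and coding-probability cut-offs are hypothetical; only the 200 bp length limit comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    tid: str            # transcript identifier
    length: int         # spliced sequence length in bp
    n_exons: int        # number of exons
    biotype: str        # annotation class: "lncRNA", "protein_coding" or "unknown"
    coding_prob: float  # score from a coding-potential predictor, 0..1


# Assumed thresholds: only MIN_LENGTH is stated in the text.
MIN_LENGTH = 200
MIN_EXONS = 2           # hypothetical
MAX_CODING_PROB = 0.5   # hypothetical


def split_candidates(transcripts):
    """Return (known_lncRNAs, novel_candidates)."""
    known, novel = [], []
    for t in transcripts:
        # Filtering by annotation: keep known lncRNAs, drop protein-coding.
        if t.biotype == "lncRNA":
            known.append(t)
            continue
        if t.biotype == "protein_coding":
            continue
        # Filtering by structural features.
        if t.length <= MIN_LENGTH or t.n_exons < MIN_EXONS:
            continue
        # Screening by coding probability.
        if t.coding_prob >= MAX_CODING_PROB:
            continue
        novel.append(t)
    return known, novel
```

Annotated lncRNAs bypass the structural and coding-probability checks here because the caption describes them as being extracted during the annotation step; whether the original workflow re-screens them is not stated.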
Figure 2. Venn diagrams showing the number of transcripts at two stages of the lncRNA identification process: (A) filtering by basic parameters: expression level (expr), number of exons, sequence length and lack of annotation to protein-coding genes (nonPC); (B) numbers of sequences predicted as non-coding by seven tools.
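Panel B of Figure 2 compares the calls of several coding-potential tools. One common way to combine such calls is a vote across tools, sketched below. The caption does not say which combination rule was applied, so the unanimous default, the function name `consensus_noncoding` and the tool labels in the usage example are assumptions.

```python
from collections import Counter


def consensus_noncoding(predictions, min_tools=None):
    """Combine per-tool non-coding calls.

    predictions: dict mapping a tool name to the set of transcript IDs
                 that the tool predicted as non-coding.
    min_tools:   minimum number of agreeing tools; defaults to all tools
                 (the intersection region of a Venn diagram).
    """
    if min_tools is None:
        min_tools = len(predictions)
    votes = Counter(tid for ids in predictions.values() for tid in set(ids))
    return {tid for tid, n in votes.items() if n >= min_tools}
```

For example, with three hypothetical tools, `consensus_noncoding({"tool_a": {"t1", "t2"}, "tool_b": {"t1"}, "tool_c": {"t1", "t2"}})` keeps only `t1`, while `min_tools=2` would also keep `t2`.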
Figure 3. MA and volcano plots of expressed transcripts (presented at the gene level) in three comparisons: (A,D) epididymis vs. testis, (B,E) ductus deferens vs. testis and (C,F) epididymis vs. ductus deferens. Each dot represents an expressed transcript: dark grey indicates DEGs, red indicates upregulated lncRNAs and green indicates downregulated lncRNAs. The triangles and rectangles indicate off-scale values.
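For readers unfamiliar with the two plot types in Figure 3, the coordinates can be computed as below. This is a generic sketch of the standard definitions; the pseudocount, the fold-change and adjusted p-value cut-offs and all function names are assumptions, since the caption does not give the thresholds used in the study.

```python
import math


def ma_point(mean_a, mean_b, pseudocount=1.0):
    """Return (M, A) for one transcript.

    M is the log2 fold change of condition a over condition b;
    A is the average of the two log2 expression values.
    """
    log_a = math.log2(mean_a + pseudocount)
    log_b = math.log2(mean_b + pseudocount)
    return log_a - log_b, (log_a + log_b) / 2


def volcano_point(log2_fc, p_adj):
    """Return (x, y) for a volcano plot; p_adj must be greater than 0."""
    return log2_fc, -math.log10(p_adj)


def classify(log2_fc, p_adj, fc_cut=1.0, p_cut=0.05):
    """Label a transcript as 'up', 'down' or 'not significant' (assumed cut-offs)."""
    if p_adj >= p_cut or abs(log2_fc) < fc_cut:
        return "not significant"
    return "up" if log2_fc > 0 else "down"
```

Points whose coordinates fall outside the plotted axis range are the ones a figure would draw as off-scale markers at the plot border.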
The number of up- and down-regulated differentially expressed lncRNAs in three comparisons.
| | E vs. T | DD vs. T | E vs. DD |
|---|---|---|---|
| All | 139 | 182 | 25 |
| Up-regulated | 93 | 132 | 25 |
| Down-regulated | 46 | 50 | 0 |
Differentially expressed genes (DEGs) associated with molecular processes involved in sperm motility, grouped into three sets (T/(E + DD), E/(T + DD) and DD/(T + E)) based on the expression profiles presented in Figure 5.
| Group | Process | Genes |
|---|---|---|
| T/(E + DD) | actin binding | |
| E/(T + DD) | calcium ion binding | |
| DD/(T + E) | axonemal dynein complex assembly | |
Figure 4. The expression profiles of DELs in three comparisons based on trans-correlated DEGs: (A) DEGs with a statistically significant expression difference between ductus deferens (DD) and the other tissues, testis (T) and epididymis (E); (B) epididymis vs. testis and ductus deferens; (C) testis vs. epididymis and ductus deferens. The red triangle indicates the tissue against which the expression profiles of the other two tissues (green circles) were compared. The expression profiles of DEGs are presented as grey lines; blue lines are the profiles of positively correlated DELs and red lines are those of negatively correlated lncRNAs. Expression profiles (the Z-score values of FPKM, Fragments Per Kilobase of transcript per Million mapped fragments) of all identified DELs are presented as the heatmap in Figure 5.
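The FPKM and Z-score quantities mentioned in the Figure 4 caption follow standard definitions, sketched here for a single transcript. The function names are illustrative, and the use of the sample standard deviation is an assumption; the caption does not specify which variant was used.

```python
import statistics


def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def z_scores(values):
    """Standardize one transcript's FPKM values across samples."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    if sd == 0:
        # A flat profile carries no relative information.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]
```

Standardizing each transcript separately puts lowly and highly expressed lncRNAs on the same scale, which is why profiles and heatmaps are usually drawn from Z-scores rather than raw FPKM.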
Cumulative correlations between DELs and DEGs. Significant correlations between lncRNA and mRNA or proteins in each comparison (number of interactions associated with the sperm motility/all identified interactions).
| Relations/Interactions | Cis | Trans | lncRNA-mRNA | lncRNA-Protein |
|---|---|---|---|---|
| E vs. T | 0/23 | 519/15,190 | 3302/10,171 | 5845/9712 |
| DD vs. T | 0/30 | 749/22,983 | 4875/13,938 | 7335/14,172 |
| E vs. DD | 0/1 | 99/759 | 221/317 | 405/512 |
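Each cell in the table above has the form "motility-associated/all". The share of interactions linked to sperm motility can be derived from a cell as in this small sketch; `motility_share` is an illustrative helper, not part of the original analysis.

```python
def motility_share(cell):
    """Convert a 'hits/total' table cell such as '519/15,190' to a fraction."""
    hits, total = (int(part.replace(",", "")) for part in cell.split("/"))
    return hits / total if total else 0.0


# Values copied from the E vs. T row of the table.
e_vs_t = {
    "Cis": "0/23",
    "Trans": "519/15,190",
    "lncRNA-mRNA": "3302/10,171",
    "lncRNA-Protein": "5845/9712",
}

for relation, cell in e_vs_t.items():
    print(f"{relation}: {motility_share(cell):.1%}")
```

Read this way, none of the cis relations in any comparison involve motility-associated genes, whereas a substantial fraction of the predicted lncRNA-mRNA and lncRNA-protein interactions do.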
Figure 5. The expression heatmap (the Z-score of the FPKM value) of the identified lncRNAs in each sample of each tissue, grouped into clusters (the dendrogram inside). The length of the outer bars corresponds to the chromosomal distribution of the DELs.
Figure 6. Trans-correlations (A) and direct lncRNA-mRNA interactions (B) between DEGs and DELs, identified for the processes associated with the E/(T + DD) group of molecular functions (see Table 4 and Figure 4). The first column in panel A represents the biological processes. The second column shows the genes associated with the processes in the first column (the lines of the Sankey plot, colored according to the processes, represent the gene–process relationship). The third column presents lncRNAs, and the lines connecting them with genes symbolize the expression correlation (blue: positive correlation, red: negative correlation). The green color in panel B indicates lncRNAs, and red indicates the protein-coding genes. The lines symbolize the predicted direct lncRNA-mRNA interactions.
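The blue and red links in panel A of Figure 6 encode the sign of an expression correlation between a lncRNA and a gene. A minimal sketch of that idea follows, assuming a Pearson coefficient and a hypothetical strength cut-off; the caption does not state which correlation measure or threshold the study used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def edge_colour(r, min_abs_r=0.9):
    """Map a correlation to a link colour; min_abs_r is an assumed cut-off."""
    if abs(r) < min_abs_r:
        return None  # too weak to draw
    return "blue" if r > 0 else "red"
```

With such a rule, a lncRNA whose profile rises and falls together with a gene across samples yields a blue link, and one that moves in the opposite direction yields a red link.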
Figure 7. The STRING network of proteins involved in sperm motility. The DD/(T + E), E/(T + DD) and T/(E + DD) groups correspond to the three groups of analyzed processes (see Table 4 and Figure 5). The color of the lines symbolizes the type of interaction and the source of information: light blue and pink, known interactions; green, red and blue, predicted interactions; yellow, text mining; black, co-expression; and purple, protein homology.
Figure 8. The real-time PCR results. Expression differences that were statistically significant at the level of <0.05 are marked with a star. Data are shown as mean values ± SD.
Figure 9. A visualization of the 3D model of the human HNF4A protein (PDB ID 4IQR). The protein part is presented as ribbons, and the conserved DNA-binding domains are colored red.