Heng Chen, Guoqiang Xiao, Xueliang Chai, Xingguan Lin, Jun Fang, Shuangshuang Teng.
Abstract
BACKGROUND: Blood clams (Tegillarca granosa) are among the most commercially important shellfish in China and South Asia and are widely distributed in tropical to temperate estuaries of the Indo-Pacific. However, recent data indicate a decline in the germplasm of this species. Furthermore, the molecular mechanisms underpinning reproductive regulation remain unclear, and information regarding genetic diversity is limited. Understanding the reproductive biology of shellfish is important for interpreting their embryonic development, reproduction and population structure. Transcriptome sequencing (RNA-seq) rapidly obtains genetic sequence information from almost all transcripts of a particular tissue and is currently the most prevalent and effective method for constructing gene expression profiles.
Year: 2017 PMID: 28934256 PMCID: PMC5608214 DOI: 10.1371/journal.pone.0184584
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Correlation analysis between two selected individual blood clams.
| | T01 | T02 | T03 | T04 | T05 | T06 | T07 |
|---|---|---|---|---|---|---|---|
| T02 | 0.8843 | | | | | | |
| T03 | 0.7863 | 0.8641 | | | | | |
| T04 | 0.8557 | 0.9085 | 0.7593 | | | | |
| T05 | 0.6415 | 0.7548 | 0.7108 | 0.6454 | | | |
| T06 | 0.5396 | 0.4673 | 0.4125 | 0.4078 | 0.7390 | | |
| T07 | 0.5991 | 0.5164 | 0.4338 | 0.4442 | 0.7004 | 0.8368 | |
| T08 | 0.4502 | 0.5410 | 0.5555 | 0.4406 | 0.8801 | 0.7267 | 0.6601 |
Values shown are r² values. T01-T04 are females and T05-T08 are males. A threshold of r² > 0.82 was used to exclude divergent individuals. T01, T02 and T04 were similar to one another, so T03, which differed significantly, was eliminated. T05 was similar to T08 and T06 was similar to T07, but T05 was not similar to T06. We therefore constructed two comparison groups (T05 T08 versus T01 T02 T04, and T06 T07 versus T01 T02 T04) to search for more differentially expressed genes (DEGs).
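The grouping described in this note can be reproduced from the table with a short script. The following is a minimal sketch, not the authors' procedure: it assumes that a group is acceptable when every pair within it exceeds the r² threshold, and the names `LOWER`, `r2` and `mutually_similar` are hypothetical helpers introduced here for illustration.

```python
from itertools import combinations

SAMPLES = ["T01", "T02", "T03", "T04", "T05", "T06", "T07", "T08"]

# Lower triangle of the r² table, transcribed row by row.
# LOWER[row] lists r² against T01, T02, ... in column order.
LOWER = {
    "T02": [0.8843],
    "T03": [0.7863, 0.8641],
    "T04": [0.8557, 0.9085, 0.7593],
    "T05": [0.6415, 0.7548, 0.7108, 0.6454],
    "T06": [0.5396, 0.4673, 0.4125, 0.4078, 0.7390],
    "T07": [0.5991, 0.5164, 0.4338, 0.4442, 0.7004, 0.8368],
    "T08": [0.4502, 0.5410, 0.5555, 0.4406, 0.8801, 0.7267, 0.6601],
}

THRESHOLD = 0.82  # cutoff stated in the table note


def r2(a, b):
    """Look up r² for two different samples, in either order."""
    i, j = sorted((SAMPLES.index(a), SAMPLES.index(b)))
    return LOWER[SAMPLES[j]][i]


def mutually_similar(group):
    """Assumed criterion: every pair in the group exceeds the threshold."""
    return all(r2(a, b) > THRESHOLD for a, b in combinations(group, 2))


print(mutually_similar(["T01", "T02", "T04"]))         # female group that was kept
print(mutually_similar(["T01", "T02", "T03", "T04"]))  # same group with T03 added
print(mutually_similar(["T05", "T08"]), mutually_similar(["T06", "T07"]))
print(mutually_similar(["T05", "T06"]))
```

Under this pairwise criterion, T03 passes against T02 (0.8641) but not against T01 or T04, which is consistent with its exclusion from the female group.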
Quality control analysis for RNA-seq data.
| Sample | ID | GC (%) | Q20 (%) | Q30 (%) |
|---|---|---|---|---|
| ♀1 | T01 | 37.48 | 94.06 | 89.46 |
| ♀2 | T02 | 37.05 | 94.20 | 89.78 |
| ♀3 | T03 | 36.86 | 94.22 | 89.77 |
| ♀4 | T04 | 37.65 | 94.22 | 89.77 |
| ♂1 | T05 | 36.82 | 94.50 | 90.22 |
| ♂2 | T06 | 37.04 | 94.51 | 90.25 |
| ♂3 | T07 | 37.29 | 94.28 | 89.84 |
| ♂4 | T08 | 36.25 | 94.45 | 90.16 |
The Q (quality) score represents the accuracy of base calling: Q = -10 × log10(P), where P is the probability of a base-calling error (Q30 > 85% for all samples). GC (%) is the proportion of G and C bases in the entire dataset for each individual blood clam (T01-T08).
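The relationship between Q-score and error probability can be made concrete with a small sketch. It assumes the standard Phred definition with a base-10 logarithm, and it reads Q20 (%) and Q30 (%) as the share of bases at or above each score; `percent_at_or_above` and the example scores are hypothetical, not data from the study.

```python
import math


def error_probability(q):
    """Error probability implied by a Phred quality score Q."""
    return 10 ** (-q / 10)


def phred_score(p):
    """Phred quality score for a base-calling error probability P."""
    return -10 * math.log10(p)


def percent_at_or_above(quals, threshold):
    """Percentage of bases whose quality is at least `threshold`."""
    return 100 * sum(q >= threshold for q in quals) / len(quals)


print(error_probability(20))  # error rate implied by Q20
print(error_probability(30))  # error rate implied by Q30

# Hypothetical per-base scores, for illustration only.
example_quals = [38, 35, 31, 30, 27, 22, 18, 36, 33, 12]
print(percent_at_or_above(example_quals, 30))
```

Q20 therefore corresponds to roughly one error in 100 bases and Q30 to roughly one in 1,000.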
Fig 1. Length distribution of unigenes showing assembly quality of the blood clam transcriptome.
Fig 2. Venn diagrams for DEG relationships between two groups of blood clam showing the common differentially expressed genes (DEGs) and specific DEGs in each gene set.
Fig 3. Volcano plot for group analysis of blood clams (T05 T08 versus T01 T02 T04).
Differential expression was considered significant at a false discovery rate (FDR) < 0.01 and |log2 fold change (FC)| ≥ 1. Points to the left of 0 on the x-axis represent male-biased differentially expressed genes (DEGs), while points to the right of 0 represent female-biased DEGs. The green region shows significant differences (FDR < 0.01), while the red region shows non-significant differences (FDR ≥ 0.01).
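The selection rule in this caption can be written as a small filter. This is a minimal sketch under the stated thresholds; the function name `classify_gene` and the example values are hypothetical, and the sign convention (negative log2FC for male-biased, positive for female-biased) follows the axis description in the caption.

```python
FDR_CUTOFF = 0.01     # FDR must be below this value
LOG2FC_CUTOFF = 1.0   # |log2FC| must be at least this value


def classify_gene(log2fc, fdr):
    """Label a gene using the volcano-plot criteria from the caption."""
    if fdr >= FDR_CUTOFF or abs(log2fc) < LOG2FC_CUTOFF:
        return "not significant"
    # Left of 0 is male-biased, right of 0 is female-biased.
    return "female-biased" if log2fc > 0 else "male-biased"


# Hypothetical (log2FC, FDR) pairs, for illustration only.
for log2fc, fdr in [(2.5, 0.001), (-1.0, 0.005), (3.0, 0.01), (0.5, 0.0001)]:
    print(log2fc, fdr, classify_gene(log2fc, fdr))
```

Both conditions must hold, so a gene with a large fold change but FDR exactly 0.01 is not counted as a DEG.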
Fig 4. Hierarchical cluster analysis of blood clams (T01 T02 T03 T04 versus T05 T06 T07 T08).
Each column represents a sample, each row represents a gene, and each different color represents log2 (Fragments per kilobase of transcript per million mapped reads (FPKM)) to indicate different expression levels. The clustering branch indicates similarity between genes or samples.
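For readers unfamiliar with the unit, the following sketch shows how an FPKM value and its log2 transform are usually computed. It uses the standard FPKM definition; the pseudocount added before taking the logarithm is an assumption made here to avoid log2(0) and is not stated in the source, and the example numbers are hypothetical.

```python
import math


def fpkm(fragments, transcript_length_bp, total_mapped):
    """Fragments per kilobase of transcript per million mapped reads."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped)


def log2_fpkm(value, pseudocount=1.0):
    """log2 of an FPKM value; the pseudocount is an assumption of this sketch."""
    return math.log2(value + pseudocount)


# Hypothetical transcript: 500 fragments, 2,000 bp long, 10 million mapped reads.
value = fpkm(500, 2000, 10_000_000)
print(value)
print(log2_fpkm(value))
```

Normalizing by both transcript length and sequencing depth is what makes expression levels comparable across genes and across samples in the heat map.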
Fig 5. Cluster of Orthologous Groups (COG) functional classification of differentially expressed genes (DEGs) in blood clam (T05 T08 versus T01 T02 T04).
The x-axis shows 25 categories while the y-axis shows the number of DEGs corresponding to each category.
Fig 6. Gene Ontology (GO) functional classification of differentially expressed genes (DEGs) in blood clam (T05 T08 versus T01 T02 T04).
The x-axis shows three terms and 52 sub-terms, while the y-axis shows the proportion of DEGs and unigenes corresponding to each subcategory. The red column represents annotation of all genes, while the blue column represents annotation of DEGs. Compared against the background of all genes, a term containing a large number of DEGs may be related to sexual differences.
Fig 7. Partial Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway of differentially expressed genes (DEGs) (T05 T08 versus T01 T02 T04).
The x-axis shows 50 out of 104 pathways which contain more than one DEG while the y-axis shows the proportion of DEGs corresponding to each pathway.
qRT-PCR validation of transcriptome sequencing.
| Functional classification | ID | Function | RNA-seq (fold change) | qRT-PCR (fold change) |
|---|---|---|---|---|
| Transcription | M1 | Sox-14 | 166.664 | 170.386 |
| | M2 | Armadillo repeat-containing protein 4 | 11.295 | 53.269 |
| | M3 | Armadillo repeat-containing protein 3 | 6.678 | 14.705 |
| | M4 | Sox-8 | 533.812 | 2987.417 |
| | M5 | forkhead box J1 protein, partial | 79.224 | 3.825 |
| | F1 | Foxl2 | 129.687 | 146.537 |
| | F2 | Forkhead box protein N2 | 115.624 | 133.897 |
| | F3 | Forkhead box E protein, partial | 47.888 | 117.538 |
| | F4 | Spermatogenesis- and oogenesis | 23.519 | — |
| | A1 (M) | Sox2 | 5.920 | 1.412 |
| | A2 (F) | β-catenin | 2.203 | 4.045 |
| | A3 (F) | Dax1 | 1.344 | 1.766 |
| | A4 (M) | Sox 9 | 1.593 | 2.174 |
| | A5 (M) | DMRTA2 | 2.468 | 29.468 |
| Signal transduction mechanisms | M6 | Testis-specific serine/threonine-protein kinase 1 | 2264.369 | 1482.427 |
| | M7 | Troponin C, skeletal muscle | 411.277 | 5803.698 |
| | M8 | Testis-specific serine/threonine-protein kinase 1 | 1939.053 | 1056.228 |
| | M9 | Sperm motility kinase X | 334.901 | 3955.033 |
| | M10 | Testis-specific serine/threonine-protein kinase 5 | 280.606 | — |
| | M11 | Testis-specific serine/threonine-protein kinase 4 | 1832.153 | 4594.291 |
| Carbohydrate, Lipid, Amino acid transport and metabolism | M12 | Glycogen phosphorylase, muscle form | 1523.181 | 40322.961 |
| | M13 | Tax1-binding protein 1-like protein B | 163.139 | 73.838 |
| | F5 | Vitellogenin-6 | 2090.712 | — |
| | F6 | Chymotrypsin-like elastase family member 2A | 17.829 | 2.962 |
| | F7 | Chymotrypsin-like serine proteinase | 17.423 | 7.342 |
| | F8 | Chymotrypsin-like elastase family member 2A | 14.892 | 4.283 |
| | F9 | Chymotrypsin-like serine proteinase | 16.022 | 5.073 |
| Egg coated protein | F10 | vitelline envelope zona pellucida domain 4 | 129.305 | — |
| | F11 | vitelline envelope zona pellucida domain 10 | 12.964 | 7.139 |
| | F12 | vitelline envelope zona pellucida domain 10 | 597.534 | 177.465 |
| | F13 | vitelline envelope zona pellucida domain 10 | 12.382 | 3.310 |
| Immune-related protein | M14 | Sperm-associated antigen 6 | 20.916 | 41.411 |
| | F14 | placenta-specific gene 8 protein-like | 20.263 | 31.953 |
| | F15 | Placental protein 11 | 7.582 | 10.440 |
| Cell cycle control | M15 | F-box only protein 39 | 350.894 | 6.619 |
| | F16 | G2/mitotic-specific cyclin-B | 193.215 | 159.333 |
| Chromatin structure and dynamics | M16 | Sperm-specific protein PHI-2B/PHI-3 | 716.255 | 2678.482 |
‘–’ marks male- or female-specific genes; M denotes males and F denotes females. Fold change indicates the difference in expression between the two sexes.
Candidate genes for sex determination and differentiation in model organisms.
| Gene source | |||
|---|---|---|---|
| Mouse | |||
| Mouse | |||
| Mouse | Y | ||
| Mouse | |||
| Mouse | Y | ||
| Mouse | |||
| Mouse | Y | ||
| Mouse | |||
| Mouse | |||
| Mouse | |||
| Mouse | |||
| Mouse | Y | ||
| Mouse | Y | ||
| Mouse | |||
| Mouse | Y | ||
| Mouse | Y | ||
| Mouse | |||
| Mouse | Y | ||
| Mouse | Y | ||
| Mouse | |||
| Mouse | |||
| Mouse | |||
| Mouse | Y | ||
| Mouse | Y | ||
| Mouse | Y | ||
| Mouse | Y | ||
| Mouse | Y | ||
| Fish | |||
| Fish | |||
| Fish | Y | ||
| Fish | |||
| Fish | |||
| Worm | |||
| Worm | |||
| Worm | |||
| Worm | Y | ||
| Worm | Y | ||
| Fly | |||
| Fly | |||
| Fly | |||
| Fly | Y | ||
| Fly | |||
| Fly | Y |
Y represents the presence of these genes or their homologues in C. hongkongensis while the fourth column represents the homologues of these genes in T. granosa. M indicates significant expression in males while F indicates significant expression in females.
* indicates a significant difference (P<0.05), while ** indicates a highly significant difference (P<0.01).
Types of simple sequence repeats (SSR) identified in the gonadal transcriptome of the blood clam.
| Repeat motif | Number | Percentage (%) |
|---|---|---|
| Di-nucleotide | ||
| AC/CA/GT/TG | 347/348/424/378 | |
| AG/GA/CT/TC | 200/270/119/162 | |
| AT/TA/GC/CG | 872/800/0/1 | |
| Total | 3921 | 62.41% |
| Tri-nucleotide | ||
| AAC/AAG/AAT(N≥5) | 43/16/139 | |
| ACA/ACC/ACG/ACT | 40/22/3/2 | |
| AGA/AGC/AGG/AGT | 13/1/3/5 | |
| ATA/ATC/ATG/ATT | 95/34/48/88 | |
| CAA/CAC/CAG/CAT | 46/10/8/33 | |
| CCA/CCT/CGC | 16/4/1 | |
| CTA/CTC/CTG/CTT | 4/6/6/6 | |
| GAA/GAC/GAG/GAT | 20/3/6/48 | |
| GCA/GCT/GGA/GGT | 10/6/3/11 | |
| GTA/GTC/GTG/GTT | 3/2/12/18 | |
| TAA/TAC/TAG/TAT | 63/9/3/81 | |
| TCA/TCC/TCG/TCT | 52/4/3/14 | |
| TGA/TGC/TGG/TGT | 66/7/11/36 | |
| TTA/TTC/TTG | 108/15/36 | |
| Total | 1342 | 21.36% |
| Tetra-nucleotide | ||
| AAAC/AAAT | 3/9 | |
| AACA/AATA/AATC/AATT | 3/10/5/1 | |
| ACAG/ACAT/ACGC/ACTG | 1/2/1/1 | |
| AGAA/AGAT/AGTG | 1/1/1 | |
| ATAA/ATAC/ATAG/ATGT | 5/3/2/3 | |
| ATTA/ATTG/ATTT | 1/2/8 | |
| CAAA/CAAC/CTAT/CTGT | 1/2/1/1 | |
| GAAA/GAAT/GACA/GATA | 1/2/1/1 | |
| GTAT/GTCA/GTCC/GTCT | 1/1/1/2 | |
| GTGC/GTTG/GTTT | 1/1/1 | |
| TAAA/TAAT/TACA/TACT | 4/1/1/1 | |
| TATC/TATG/TATT | 3/2/6 | |
| TCAA/TCAT/TCTT | 1/2/1 | |
| TGAC/TGTA/TGTC | 1/1/2 | |
| TTAA/TTAT/TTGA/TTGT | 1/7/1/1 | |
| TTTA/TTTC/TTTG | 7/2/3 | |
| Total | 129 | 2.05% |
| Penta-nucleotide | ||
| AAAAT/AATCC/TGAGT | 1/1/1 | |
| CAAAG/CAGGC/CCAGC | 1/1/1 | |
| Total | 6 | 0.10% |
| Hexa-nucleotide | ||
| TTTTTC/TTATAA | 1/1 | |
| Total | 2 | 0.03% |
| others | 883 | 14.05% |
‘Number’ indicates the number of different types of SSR detected in unigenes, while ‘Percentage’ indicates the relative proportion of SSRs with different repeat motifs among the total number of SSRs.
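The percentages in the table follow directly from the per-class totals, as the sketch below shows. The counts are transcribed from the table; the variable and function names are hypothetical, and rounding to two decimals is an assumption made here to match the table's precision.

```python
# Per-class SSR totals transcribed from the table.
SSR_TOTALS = {
    "Di-nucleotide": 3921,
    "Tri-nucleotide": 1342,
    "Tetra-nucleotide": 129,
    "Penta-nucleotide": 6,
    "Hexa-nucleotide": 2,
    "others": 883,
}

# Di-nucleotide motif counts, in the order listed in the table rows.
DI_MOTIF_COUNTS = [347, 348, 424, 378, 200, 270, 119, 162, 872, 800, 0, 1]


def percentages(totals):
    """Share of each SSR class among all SSRs, rounded to two decimals."""
    grand_total = sum(totals.values())
    return {name: round(100 * n / grand_total, 2) for name, n in totals.items()}


print(sum(SSR_TOTALS.values()))  # total number of SSRs across all classes
print(sum(DI_MOTIF_COUNTS))      # should match the di-nucleotide total
for name, pct in percentages(SSR_TOTALS).items():
    print(f"{name}: {pct:.2f}%")
```

The same check can be repeated for the other classes by summing their motif rows against the corresponding "Total" line.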
Fig 8. Single nucleotide polymorphism (SNP) types showing polymorphism of the sexual transcriptome sequence of the blood clam.