Longfei Shu1,2, Jie Qiu3, Katja Räsänen1,2.
Abstract
Maternal effects can substantially affect ecological and evolutionary processes in natural populations. However, as they often are environmentally induced, establishing their genetic basis is challenging. One important, but largely neglected, source of maternal effects is egg coats (i.e., the maternally derived extracellular matrix that surrounds the embryo). In the moor frog, the gelatinous egg coats (i.e., egg jelly) are produced in the mother's oviduct and consist primarily of highly glycosylated mucin type O-glycans. These O-glycans affect jelly water balance and, subsequently, contribute to adaptive divergence in embryonic acid tolerance. To identify candidate genes for maternal effects, we conducted RNAseq transcriptomics on oviduct samples from seven R. arvalis females, representing the full range of within and among population variation in embryonic acid stress tolerance across our study populations. De novo sequencing of these oviduct transcriptomes detected 124,071 unigenes and functional annotation analyses identified a total of 57,839 unigenes, of which several identified genes likely code for variation in egg jelly coats. These belonged to two main groups: mucin type core protein genes and five different types of glycosylation genes. We further predict 26,711 gene-linked microsatellites (simple sequence repeats) and 231,274 single nucleotide polymorphisms. Our study provides the first set of genomic resources for R. arvalis, an emerging model system for the study of ecology and evolution in natural populations, and gives insight into the genetic architecture of egg coat mediated maternal effects.
Keywords: Acidification; Amphibian; Egg coat; Glycosylation; Maternal effect genes; Moor frog; Oviduct; RNA seq; Rana arvalis; Transcriptome
Year: 2018 PMID: 30128207 PMCID: PMC6098945 DOI: 10.7717/peerj.5452
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Table 1. Results of RNA sequencing of six R. arvalis oviducts.
| Sample ID | Total raw reads | Total clean reads | Total clean nucleotides | Q20 percentage (%) | GC percentage (%) |
|---|---|---|---|---|---|
| S2 | 99,884,602 | 87,485,924 | 7,873,733,160 | 98.04 | 45.11 |
| B2 | 96,108,078 | 85,373,136 | 7,683,582,240 | 98.05 | 44.83 |
| T2 | 108,026,198 | 84,548,126 | 7,609,331,340 | 98.27 | 45.68 |
Notes:
Total clean reads and total clean nucleotides are given after adaptor trimming and quality filtering. Q20 percentage is the proportion of nucleotides with a quality value larger than 20; GC percentage is the proportion of guanine and cytosine nucleotides among total nucleotides. The sample ID indicates the six different females, originating from three populations (S, neutral origin; T, acid origin; and B, intermediate pH origin). The individuals were chosen so as to maximize variation in embryonic acid tolerance (which in turn is largely determined by the molecular composition of the egg jelly coats). In each population, individual 1 (italics) represents a female whose offspring were the most acid sensitive in the embryonic stage, while individual 2 represents a female whose offspring were the most acid tolerant (based on screening of embryonic acid tolerance in a laboratory experiment, Shu et al., 2016).
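The two quality metrics defined in the notes can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and the quality string is assumed to use Phred+33 encoding, which the source does not state. The final loop only checks arithmetic on the table itself: in all three rows, total clean nucleotides equals total clean reads times 90.

```python
def gc_percentage(seq: str) -> float:
    """Percentage of guanine and cytosine bases among all bases in seq."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def q20_percentage(quality: str, offset: int = 33) -> float:
    """Percentage of bases with a Phred quality value larger than 20.

    `quality` is a FASTQ quality string; offset=33 is an assumption
    (Phred+33 encoding), not something stated in the source.
    """
    if not quality:
        return 0.0
    above = sum(1 for ch in quality if ord(ch) - offset > 20)
    return 100.0 * above / len(quality)


# Consistency check on the table: (clean reads, clean nucleotides) per sample.
table = {
    "S2": (87_485_924, 7_873_733_160),
    "B2": (85_373_136, 7_683_582_240),
    "T2": (84_548_126, 7_609_331_340),
}
for sample, (reads, nucleotides) in table.items():
    assert reads * 90 == nucleotides, sample

print(q20_percentage("II5!"))  # 'I' is Q40, '5' is Q20, '!' is Q0
```

Note that a base with a quality value of exactly 20 does not count here, following the "larger than 20" wording of the table notes.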
Table 2. Contigs and unigenes in the transcriptome assembly of six R. arvalis oviducts.
| Sequence type | Sample ID | Total number | Total length (nt) | Mean length (nt) | N50 | Total consensus sequences | Distinct clusters | Distinct singletons |
|---|---|---|---|---|---|---|---|---|
| Contig | S2 | 217,356 | 59,657,739 | 274 | 366 | – | – | – |
| Contig | B2 | 183,977 | 51,106,261 | 278 | 381 | – | – | – |
| Contig | T2 | 153,751 | 46,501,333 | 302 | 436 | – | – | – |
| Unigene | S2 | 112,136 | 55,372,134 | 494 | 729 | 112,136 | 16,008 | 96,128 |
| Unigene | B2 | 91,647 | 47,344,659 | 517 | 787 | 91,647 | 14,419 | 77,228 |
| Unigene | T2 | 87,401 | 45,945,512 | 526 | 839 | 87,401 | 12,775 | 74,626 |
| Unigene | All | 124,071 | 90,322,330 | 728 | 1,212 | 124,071 | 28,452 | 95,619 |
Notes:
The sample ID indicates the six different females, originating from three populations (S, neutral origin; T, acid origin; and B, intermediate pH origin), with the most acid-sensitive female within each population indicated in italics (see Table 1 for details). N50 is the length of the shortest sequence in the set of longest sequences that together make up 50% of the total assembly length. Total consensus sequences is the number of all assembled unigenes. Distinct clusters is the number of clustered unigenes; a cluster contains several highly similar (more than 70%) unigenes, which may come from the same gene or from homologous genes. Distinct singletons is the number of unigenes that each come from a single gene.
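The N50 statistic is easier to follow as code. The sketch below uses the common definition (sort sequences from longest to shortest and report the length at which the running total first reaches half of the assembly length); the function name and the toy lengths are illustrative assumptions, and the assembler used in the study may handle edge cases differently. The last lines check two figures from the "All" unigene row of the table.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in nt)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum covers half
        # of the total assembly length is the N50.
        if running >= half:
            return length


# Toy example (made-up lengths): total = 20, half = 10; 6 + 5 = 11 >= 10.
print(n50([2, 3, 4, 5, 6]))  # 5

# Checks against the "All" unigene row of the table.
assert round(90_322_330 / 124_071) == 728   # mean length (nt)
assert 28_452 + 95_619 == 124_071           # clusters + singletons = total
```

Because long sequences are counted first, N50 is usually larger than the mean length, as it is in every row of the table (for example 1,212 versus 728 for all unigenes).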
Figure 1. The length distribution of the unigenes identified based on seven R. arvalis oviduct transcriptomes.
The X-axis shows the length distribution (nt) of sequenced unigenes and Y-axis indicates number of unigenes for a given length.
Figure 2. Annotation of R. arvalis unigenes against the Nr database.
(A) E-value distribution of the top BLAST hits for each unique sequence. (B) Similarity distribution of the top BLAST hits for each unique sequence. (C) Species distribution of the top BLAST hits for all homologous sequences.
Figure 3. COG functional classification of unigenes identified from the R. arvalis oviduct transcriptomes.
The X-axis shows the different functional classes, and Y-axis the number of genes annotated into a given class. Most genes are in the classes of “General function,” followed by “Translation, ribosomal structure, and biogenesis,” “Transcription,” and “Replication, recombination, repair.”
Figure 4. Number of unigenes annotated based on different public databases (see ‘Methods’ for details on databases).
Figure 5. GO categories of unigenes identified from the transcriptome of seven R. arvalis oviduct samples.
The unigenes were annotated in three categories as represented on the X-axis: biological processes (23), cellular components (18), and molecular functions (19). The X-axis indicates the GO term, while the Y-axis (log scale) indicates the number and percentage of unigenes for each GO term.
Table 3. Genes coding for core proteins as identified from the oviduct of six R. arvalis females. Genes highlighted in bold are the most highly expressed core glycoprotein genes. See text for detailed discussion.
| Component | Gene | Function |
|---|---|---|
| Mucin | Mucin-1 | ECM protein |
| | | ECM protein |
| | Mucin-4 | ECM protein |
| | | ECM protein |
| | | ECM protein |
| | Mucin-6 | ECM protein |
| | Mucin-7 | ECM protein |
| | Mucin-15 | ECM protein |
| Collagen | Collagen alpha-1(I) | ECM protein |
| | Collagen alpha-1(III) | ECM protein |
| | Collagen alpha-1(V) | ECM protein |
| | Collagen alpha-1(XI) | ECM protein |
| | Collagen alpha-1(XII) | ECM protein |
| | Collagen alpha-1(XVIII) | ECM protein |
| | Collagen alpha-1(XXVII) | ECM protein |
| | Collagen alpha-2(I) chain | ECM protein |
| | Collagen alpha-2(IV) | ECM protein |
| | Collagen alpha-2(V) | ECM protein |
| | Collagen alpha-2(VI) | ECM protein |
| | Collagen alpha-5(IV) | ECM protein |
| | Collagen alpha-6(IV) chain | ECM protein |
| Others | Decorin | ECM protein |
| | Dermatopontin | ECM protein |
| | EMILIN-1 | ECM protein |
| | EMILIN-2 | ECM protein |
| | Fibrillin-1 | ECM protein |
| | Fibrinogen-like protein 1 | ECM protein |
| | Fibronectin | ECM protein |
| | Fibulin | ECM protein |
| | Laminin | ECM protein |
Table 4. Protein glycosylation genes as identified from the oviduct of R. arvalis females.
| Glycan pathway | Glycan type | Gene |
|---|---|---|
| Mucin type O-glycan | beta-1,3- | |
| alpha- | ||
| beta-1,6- | ||
| beta-1,4-galactosyltransferase 5 | ||
| glycoprotein- | ||
| polypeptide | ||
| sialyltransferase 4A | ||
| sialyltransferase 7A | ||
| Other type of O-glycan | O-linked GlcNAc type | Protein O-GlcNAc transferase |
| O-linked Man type | beta-1,2- | |
| beta-1,4-galactosyltransferase 1 | ||
| carbohydrate 3-sulfotransferase 10 | ||
| dolichyl-phosphate-mannose-protein mannosyltransferase | ||
| glucuronosyltransferase | ||
| sialyltransferase 6 | ||
| 4-galactosyl- | ||
| O-linked Fuc type | beta-1,4-galactosyltransferase 1 | |
| peptide- | ||
| sialyltransferase 6 | ||
| O-linked Glc type | protein glucosyltransferase | |
| UDP-xylose:glucoside alpha-1,3-xylosyltransferase | ||
| O-linked Gal type | collagen beta-1, | |
| lysyl hydroxylase/galactosyltransferase/glucosyltransferase | ||
| Heparan sulfate | alpha-1,4- | |
| alpha-1,4- | ||
| glucuronyl/ | ||
| Heparan sulfate glucosamine 3- | ||
| Chondroitin sulfate | chondroitin sulfate | |
| chondroitin sulfate synthase | ||
| galactosylxylosylprotein 3-beta-galactosyltransferase | ||
| galactosylgalactosylxylosylprotein 3-beta-glucuronosyltransferase 1 | ||
| protein xylosyltransferase | ||
| xylosylprotein 4-beta-galactosyltransferase | ||
| Keratan sulfate | beta-1,4-galactosyltransferase 1 | |
| beta-1,4-galactosyltransferase 4 | ||
| beta-1,3- | ||
| carbohydrate 6-sulfotransferase 2 | ||
| sialyltransferase 4A |
Figure 6. The biosynthesis pathway (KEGG) of the mucin type O-glycans.
Red squares indicate the genes expressed in the oviduct of R. arvalis. Abbreviations: GALNT, polypeptide N-acetylgalactosaminyltransferase; SIAT4, sialyltransferase 4A; SIAT7A, sialyltransferase 7A; C1GALT1, glycoprotein-N-acetylgalactosamine 3-beta-galactosyltransferase; GCNT1, beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase; B3GNT6, acetylgalactosaminyl-O-glycosyl-glycoprotein beta-1,3-N-acetylglucosaminyltransferase; GCNT3, N-acetylglucosaminyltransferase 3, mucin type; B4GALT5, beta-1,4-galactosyltransferase 5.
Figure 7. Heat map of gene expression in oviducts of six R. arvalis females.
The genes presented are selected from those coding for core proteins (in gray) and protein glycosylation (in black). The colors represent high (red), low (green), or average (black) gene expression based on Z-score normalized FPKM values for each gene. The identity of each individual female from the three study populations (T, S, and B) is indicated below the heat map. Within each population, the number indicates the individual female: the female with the most acid-sensitive embryos is indicated by 1 and the female with the most acid-tolerant embryos is indicated by 2 (acid tolerance was estimated in Shu, Suter & Räsänen, 2015b). The B3 female was left out of this analysis because she had not fully ovulated at the time of sampling and hence was not directly comparable in gene expression patterns.
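Per-gene Z-score normalization, as used for the heat map colors, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function name, gene labels, and FPKM values are made up, the population standard deviation is used (the source does not say which form was applied), and no log transformation is performed.

```python
from statistics import mean, pstdev


def zscore_per_gene(fpkm):
    """Z-score normalize FPKM values within each gene.

    `fpkm` maps a gene name to a list of FPKM values, one per sample.
    Each value becomes (value - gene mean) / gene standard deviation,
    so positive scores mean above-average expression for that gene.
    """
    scores = {}
    for gene, values in fpkm.items():
        mu = mean(values)
        sigma = pstdev(values)  # assumption: population standard deviation
        if sigma == 0:
            # A gene with identical expression in all samples has no
            # variation to scale; report it as average everywhere.
            scores[gene] = [0.0 for _ in values]
        else:
            scores[gene] = [(v - mu) / sigma for v in values]
    return scores


# Hypothetical FPKM values for two placeholder genes across three samples.
example = {"gene_a": [1.0, 2.0, 3.0], "gene_b": [5.0, 5.0, 5.0]}
for gene, z in zscore_per_gene(example).items():
    print(gene, [round(x, 3) for x in z])
```

Because each gene is scaled by its own mean and spread, the colors compare samples within a gene, not expression levels between genes.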