A conserved set of maternal genes? Insights from a molluscan transcriptome.

M Maureen Liu, John W Davey, Daniel J Jackson, Mark L Blaxter, Angus Davison.

Abstract

The early animal embryo is entirely reliant on maternal gene products for a 'jump-start' that transforms a transcriptionally inactive embryo into a fully functioning zygote. Despite extensive work on model species, it has not been possible to perform a comprehensive comparison of maternally-provisioned transcripts across the Bilateria because of the absence of a suitable dataset from the Lophotrochozoa. As part of an ongoing effort to identify the maternal gene that determines left-right asymmetry in snails, we have generated transcriptome data from 1 to 2-cell and ~32-cell pond snail (Lymnaea stagnalis) embryos. Here, we compare these data to maternal transcript datasets from other bilaterian metazoan groups, including representatives of the Ecdysozoa and Deuterostomia. We found that between 5 and 10% of all L. stagnalis maternal transcripts (~300-400 genes) are also present in the equivalent arthropod (Drosophila melanogaster), nematode (Caenorhabditis elegans), urochordate (Ciona intestinalis) and chordate (Homo sapiens, Mus musculus, Danio rerio) datasets. While the majority of these conserved maternal transcripts ("COMATs") have housekeeping gene functions, they are a non-random subset of all housekeeping genes, with an overrepresentation of functions associated with nucleotide binding, protein degradation and activities associated with the cell cycle. We conclude that a conserved set of maternal transcripts and their associated functions may be a necessary starting point of early development in the Bilateria. For the wider community interested in discovering conservation of gene expression in early bilaterian development, the list of putative COMATs may be a useful resource.

Year:  2014        PMID: 25690965      PMCID: PMC4594767          DOI: 10.1387/ijdb.140121ad

Source DB:  PubMed          Journal:  Int J Dev Biol        ISSN: 0214-6282            Impact factor:   2.203


Introduction

Cell division requires that genome replication and assortment are achieved while cellular function is maintained. In somatic cells, there is continuity of cytoplasm from mother to daughter, so that new nuclei take up the reins of cellular control as transcription of their genomes is resumed after division. In contrast, in the formation of a new organism the early zygote has to perform a similar feat of taking control of a new cell, but the task is made more complex because the gametic pronuclei must be reprogrammed and coordinated before transcription initiation. In animal embryos the zygotic cytoplasm, provisioned by the mother, has been found to contain all the machinery necessary to drive the first stages of embryonic development. This maternal provisioning has been demonstrated through the blocking of transcription from the zygotic genome (Baroux ). In transcriptionally-blocked embryos, maternal products are often sufficient to drive the first rounds of cell division, and even the first phases of differentiation (Baroux ). The switch between maternal and zygotic control is called the maternal-zygotic transition (MZT), or the midblastula transition (MBT), and spans the period from fertilisation to the point where maternally provisioned factors are no longer sufficient to deliver normal development (Baroux , Stitzel and Seydoux, 2007, Tadros and Lipshitz, 2009). The MZT is associated with the activation of the zygotic genome. In animal species where fine-scale analyses have been performed, zygotic gene activation has been modelled as two phases (Baroux , Tadros and Lipshitz, 2009). An early phase, involving a few loci, is associated with degradation of maternal proteins and mRNAs, while the second phase is much more extensive and includes genes involved in a wide range of biological processes (Schier, 2007, Tadros and Lipshitz, 2009). 
Initial, albeit limited, zygotic genome activation has been identified as early as the fertilised zygote (in the paternal pronuclei of mouse, sea urchin and the nematode Ascaris suum), and as late as the 256-cell embryo stage (in Xenopus) (Baroux , Tadros and Lipshitz, 2009, Wang ). Experimental evidence indicates that the MZT is tightly regulated, and includes the birth of zygotic RNAs and the death of maternal RNAs (Schier, 2007, Stitzel and Seydoux, 2007, Tadros and Lipshitz, 2009), taking place at multiple levels and in a controlled and managed manner. Thus, while many embryos are able to transcribe experimentally introduced DNA, the early embryonic genome is maintained in a state that is incompatible with transcription. Changes in chromatin structure, combined with a dilution of factors such as transcriptional repressors by cell division, allow for the initiation of zygotic transcription. Nonetheless, despite the complexity, it has been suggested that the MZT can be simplified into two interrelated processes: the first whereby a subset of maternal mRNAs and proteins is eliminated, and the second whereby zygotic transcription is initiated (Schier, 2007, Tadros and Lipshitz, 2009). In zebrafish, maternally-provisioned products from just three genes, Nanog, Pou5f1 and SoxB1 (known for their roles in embryonic stem cell fate regulation), are sufficient to initiate the zygotic developmental program and to induce clearance of the maternal program by activating the expression of a microRNA (Lee , Leichsenring ). In Xenopus, increasing nuclear to cytoplasmic ratio is believed to be the controlling element in the switch, with just four factors regulating multiple events during the transition (Collart ). However, the generality of these findings remains unknown. 
Furthermore, while the regulation of RNA transcription (gene expression) has received considerable attention (primarily due to the advances in nucleic acid sequencing technologies), protein expression and turnover rates remain relatively under-studied (Stitzel and Seydoux, 2007). Our knowledge of maternal-to-zygotic transcription phenomena is also largely restricted to the dominant model animal species, with relatively few experimental studies existing for other metazoans. Although there has been a recent upsurge in interest in the maternal control of embryonic development, especially the MZT (Benoit , De Renzis , Lee , Leichsenring , Tadros and Lipshitz, 2009), the study of maternal factors has played an important part in the history of embryology and development, particularly in the model animal taxa Drosophila melanogaster (phylum Arthropoda from superphylum Ecdysozoa), Caenorhabditis elegans (Nematoda, Ecdysozoa), Strongylocentrotus purpuratus (Echinodermata, Deuterostomia), Mus musculus, Homo sapiens and Danio rerio (Chordata, Deuterostomia) (Gilbert, 2006). Missing from this roster of models are representatives of “the” superphylum Lophotrochozoa, a morphologically diverse group that includes the Mollusca and Annelida. Two annelid models, Platynereis dumerilii and Capitella teleta, are becoming well established (Dill and Seaver, 2008, Giani , Hui ), but model molluscs have been developed for their potential to answer particular questions (e.g. asymmetric distribution of patterning molecules during development; Lambert and Nagy, 2002), or their association with a particular disease (e.g. schistosome transmitting Biomphalaria; Knight ). 
As part of an ongoing effort to identify the maternal gene that determines left-right asymmetry in molluscs (Harada , Kuroda , Liu ), we are developing Lymnaea stagnalis pond snails as a model because they are one of the few groups that exhibit genetically-tractable, natural variation in their left-right asymmetry, or chirality, and so are ideal systems in which to understand why chirality is normally invariant, yet also pathological when it does vary (Schilthuizen and Davison, 2005). In generating a maternal transcriptomic resource for this species (the chirality-determining gene is maternally expressed; Boycott and Diver, 1923, Sturtevant, 1923), we were surprised to discover that while there are general studies on the composition and regulation of maternal expression (Shen-Orr ), there has been no comprehensive description of shared bilaterian maternal genes. One reason may be that no maternal gene resource exists for the Lophotrochozoa, Spiralia or Mollusca. Instead, previous work has described early developmental transcription in the molluscs Ilyanassa sp. (Lambert ) and Crepidula fornicata (Henry ), but using combined developmental stage libraries. Here we compare a new 1 to 2-cell L. stagnalis transcriptome (presumed maternal) to maternal transcriptomes from selected ecdysozoan and deuterostome species to identify conserved maternally provisioned genes across the Bilateria.

Results

L. stagnalis embryonic transcriptome sequencing and assembly

Roche 454 sequencing of the two L. stagnalis libraries (1 to 2-cell and ~32-cell) generated 192,758 and 218,893 reads respectively, of which 163,004 and 192,552 were 150 bases or longer. The 1 to 2-cell assembly generated more contigs than the ~32-cell assembly, despite having fewer sequences (Table 2). A GC content of 36% for both libraries was approximately the same as previously reported for L. stagnalis (Adema , Liu ). Merging the two assemblies produced by Newbler and MIRA resulted in fewer, longer contigs. The 1 to 2-cell library generated 11,212 contigs, and the ~32 cell library 9,497 contigs.
TABLE 1

PRIMER SEQUENCES USED TO ISOLATE GENE FRAGMENTS FOR RIBOPROBE SYNTHESES

| Gene | Forward primer followed by reverse primer (5′ to 3′ each, run together) |
|---|---|
| beta-tubulin | TGTGGAATGGATCCCCAACAATGTCATCACTCAGGAGCTTTGATACGGCTTG |
| c2724 ATP-dependent RNA helicase | GCAGCGGTTTCTTCCGCAATGTTTTTCTCTCCTCTTTACTGCTG |
| c453 heat shock 70 kda protein | CCACTGCTGCAGCCATTGCCTACTGAATGAGCACACCGGGCTGA |
| c7974 ADP-ribosylation factor 4 | CAAGGTGCAACTGCCACGCAAGAAATCCCACCACCACCCCCAAC |
| c9053 proteasome alpha 6 subunit | CGCGCTCGCTATGAGGCAGCTATCATGGTATCAGCAACACCCACA |
| c579 ergic and golgi 2 | CGTCTGCTACAGGTGGCGGTTTGTCCGTGGTTGATTGGCCGGTTA |
| c9016 eukaryotic translation initiation factor 3 subunit i | TGGTGCTGTTTGGTGCATTGATTGAGCGGGCATCAAATTTGCCAAC |
| c8075 eukaryotic translation elongation factor | TACTGCGCCAAGCCATTGGTGACTGAAGCAGGGCATCACCAGCA |
| c8318 78 kda glucose-regulated protein | CGCAAAACCAGCGACATATAAGCATGGCTGCAGCAGTTGGCTCATT |
TABLE 2

ASSEMBLY OF THE LYMNAEA STAGNALIS EMBRYO TRANSCRIPTOMES

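| Metric | 1 cell: Newbler 2.6 | 1 cell: MIRA | 1 cell: Merged | 1 cell: Merged + CD-Hit | 32 cell: Newbler 2.6 | 32 cell: MIRA | 32 cell: Merged | 32 cell: Merged + CD-Hit |
|---|---|---|---|---|---|---|---|---|
| Number of contigs | 13,201 | 15,419 | 11,222 | 11,212 | 11,056 | 14,422 | 9,512 | 9,497 |
| Max contig length | 4,258 | 2,937 | 6,051 | 6,051 | 4,214 | 3,564 | 4,212 | 4,212 |
| Number contigs >100bp | 12,908 | 15,184 | 11,146 | 11,136 | 10,921 | 14,325 | 9,490 | 9,475 |
| >100bp N50 | 700 | 630 | 782 | 781 | 847 | 689 | 940 | 938 |
| >100bp GC content | 36.3 | 35.8 | 36.3 | 36.3 | 36.2 | 35.3 | 36.2 | 36.2 |
| Number contigs >1000bp | 1,685 | 1,375 | 1,869 | 1,861 | 2,081 | 1,843 | 2,245 | 2,234 |
| >1000bp N50 | 1,390 | 1,317 | 1,407 | 1,406 | 1,520 | 1,424 | 1,533 | 1,533 |
| >1000bp GC content | 36.4 | 36.8 | 36.4 | 36.4 | 36.3 | 36.5 | 36.3 | 36.3 |
| Contigs versus SwissProt hits | 27.60% | 25.80% | 30.90% | 30.90% | 33.20% | 29.20% | 36.20% | 36.20% |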
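
The N50 values reported for the assemblies can be illustrated with a minimal sketch. It assumes the usual convention (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the summed length); the source does not state which convention the assemblers reported. The function name `n50`, the `min_length` parameter and the contig lengths are hypothetical and are not data from this study.

```python
def n50(lengths, min_length=0):
    """Return the N50 of the contigs strictly longer than min_length.

    Assumed convention: sort contigs from longest to shortest and return
    the length of the contig at which the cumulative length first reaches
    half of the total length of the retained contigs.
    """
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs longer than min_length")
    half_total = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        if running >= half_total:
            return n


# Hypothetical contig lengths, for illustration only.
toy_lengths = [1500, 1200, 900, 800, 700, 600, 300, 90]
print(n50(toy_lengths, min_length=100))   # N50 of contigs >100 bp
print(n50(toy_lengths, min_length=1000))  # N50 of contigs >1000 bp
```

Applying the same function with two thresholds mirrors the ">100bp" and ">1000bp" rows of the table.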

Comparison between maternal transcriptomes

We compared the two developmental transcriptomes of L. stagnalis to each other and to six published maternal transcriptomes of roughly comparable depth derived from four deuterostomes and two ecdysozoans (Table 3; Aanes , Azumi , Baugh , De Renzis , Evsikov , Grondahl ). For M. musculus and C. elegans, maternal-only transcripts (present in the oocyte or egg but not in developing embryos) and maternal-zygotic transcripts (found in both oocyte or egg, and after zygotic transcription has started) have been defined. For the mouse, 2,834 genes were maternal-only and 1,796 maternal-zygotic, while for C. elegans 2,794 were maternal-only and 2,285 maternal-zygotic (Baugh , Evsikov ).
TABLE 3

MATERNAL TRANSCRIPTOME DATASETS USED IN THIS STUDY

| Taxonomic group / Species | Common name | Number of maternal genes | Method | Source |
|---|---|---|---|---|
| Deuterostomia | | | | |
| Homo sapiens | human | 7,470 | Array analysis of metaphase II oocytes | Grøndahl et al. 2010 |
| Mus musculus | mouse | 4,643* | Sanger sequencing of oocyte cDNA library | Evsikov et al. 2006 |
| Danio rerio | zebrafish | 4,375* | ABI Solid cDNA sequences of oocyte and early embryo | Aanes et al. 2011 |
| Ciona intestinalis | Ciona / sea squirt | 4,041 | Array analysis of early embryo | Azumi et al. 2007 |
| Ecdysozoa | | | | |
| Drosophila melanogaster | Drosophila / fly | 6,582# | Array analysis of early embryo | De Renzis et al. 2007 |
| Caenorhabditis elegans | C. elegans / worm | 5,081* | Array analysis of early embryo | Baugh et al. 2003 |
| Lophotrochozoa | | | | |
| Lymnaea stagnalis | snail | 11,212 | 454 sequencing of cDNA library from 1 cell embryo | This study |

* more sequences listed in paper, but not all retrievable or present in database (mouse ~5,400; worm 6,042; zebrafish 4,465)

# fewer sequences listed in paper compared with database (6,485)

By reciprocal tBLASTx analyses, we identified putatively orthologous genes present in each of the seven species. About one quarter of each of the other maternal transcriptomes, between 900 and 1,900 genes, overlapped with the maternal transcriptome of the pond snail, L. stagnalis (Table 4). Surprisingly, 481 of the L. stagnalis genes had putative orthologues in all seven taxa (Supplementary Table 1). These 481 orthologues in fact probably represent 439 or fewer distinct genes, as BLASTx analyses revealed that some matched the same sequence in the NCBI nr protein database. This result implies that 5-10% of the maternal transcriptome is conserved and shared across all of the representative taxa (H. sapiens 6.1%, M. musculus 9.9%, D. rerio 10.6%, C. intestinalis 11.4%, D. melanogaster 7.0%, C. elegans 9.0%). We refer to this conserved set as the “conserved maternal transcriptome” (COMAT).
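
The logic of the reciprocal comparison and the all-taxa intersection can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it assumes a strict reciprocal best-hit criterion and that the best hit per query has already been extracted from the tBLASTx output. All function names and sequence identifiers are hypothetical.

```python
def reciprocal_best_hits(forward, reverse):
    """Keep query -> subject pairs whose subject's best hit is the query.

    forward: dict mapping each snail contig to its best hit in another species.
    reverse: dict mapping each sequence of that species to its best snail hit.
    """
    return {q: s for q, s in forward.items() if reverse.get(s) == q}


def shared_across_all(pairs_by_species):
    """Return the snail contigs that have a reciprocal hit in every species."""
    contig_sets = [set(pairs) for pairs in pairs_by_species.values()]
    return set.intersection(*contig_sets) if contig_sets else set()


# Hypothetical best-hit tables for two species (toy identifiers).
snail_to_human = {"snail_1": "human_A", "snail_2": "human_B", "snail_3": "human_C"}
human_to_snail = {"human_A": "snail_1", "human_B": "snail_2", "human_C": "snail_9"}
snail_to_fly = {"snail_1": "fly_A", "snail_3": "fly_C"}
fly_to_snail = {"fly_A": "snail_1", "fly_C": "snail_3"}

pairs = {
    "human": reciprocal_best_hits(snail_to_human, human_to_snail),
    "fly": reciprocal_best_hits(snail_to_fly, fly_to_snail),
}
print(sorted(shared_across_all(pairs)))
```

In this toy case only `snail_1` has a reciprocal hit in both species; extending the dictionary to six species gives the kind of intersection described above.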
TABLE 4

COMPARISON BETWEEN MATERNAL TRANSCRIPTOMES

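| Species | Maternal transcriptome | Number with orthologues in Lymnaea stagnalis transcriptome | % | Unique hits | % | Reciprocal hits | % | Unique reciprocal hits | % |
|---|---|---|---|---|---|---|---|---|---|
| Homo sapiens | 7,470 | 2,394 | 32% | 1,852 | 25% | 2,698 | 36% | 1,768 | 24% |
| Mus musculus | 4,643 | 1,954 | 42% | 1,442 | 31% | 2,013 | 43% | 1,361 | 29% |
| Danio rerio | 4,375 | 1,913 | 44% | 1,452 | 33% | 1,985 | 45% | 1,328 | 30% |
| Ciona intestinalis | 4,041 | 1,360 | 34% | 954 | 24% | 1,110 | 27% | 936 | 23% |
| Drosophila melanogaster | 6,582 | 2,501 | 38% | 1,980 | 30% | 2,903 | 44% | 1,900 | 29% |
| Caenorhabditis elegans | 5,081 | 1,662 | 33% | 1,220 | 24% | 1,628 | 32% | 1,181 | 23% |
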
We compared the L. stagnalis 1 to 2-cell transcriptome to maternal-only transcripts and maternal-zygotic transcripts from M. musculus and C. elegans (Baugh , Evsikov ) using tBLASTx. The M. musculus maternal-only data set matched 1,069 L. stagnalis transcripts, whereas the M. musculus maternal-zygotic data set matched 884 L. stagnalis transcripts. Of the 481 COMATs from L. stagnalis, 219 were found in the M. musculus maternal-only data set and 261 in the M. musculus maternal-zygotic data set. This indicates that maternal-zygotic transcripts are over-represented, relative to maternal-only transcripts, among those conserved between chordate and mollusc (Fisher’s exact test, 2,834:1,796 maternal-only:maternal-zygotic M. musculus versus 1,069:884 maternal-only:maternal-zygotic L. stagnalis, P < 0.0001), especially when considering COMATs (Fisher’s exact test, 2,834:1,796 versus 219:261, P < 0.0001). A similar result was found in comparisons between L. stagnalis and C. elegans (Fisher’s exact test, 2,794:2,285 versus 733:929 or 222:259; P < 0.0001 and P < 0.0002, respectively). Similar comparisons were also made for maternal transcripts identified as being actively degraded or not degraded in the early embryo (Baugh , Evsikov ), but no differences were found.
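
For readers who wish to check comparisons of this kind, a two-sided Fisher’s exact test on a 2 × 2 table can be sketched with the Python standard library alone. This is an illustrative implementation, not the software used in the study, and it assumes the common two-sided definition (summing the probabilities of all tables no more likely than the observed one). The function names are hypothetical; the counts are those quoted in the text for M. musculus.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    log_denominator = _log_comb(row1 + row2, col1)

    def log_prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denominator

    observed = log_prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(
        exp(log_prob(x))
        for x in range(lowest, highest + 1)
        if log_prob(x) <= observed + 1e-7
    )
    return min(1.0, total)


# Maternal-only : maternal-zygotic counts quoted in the text.
p_all = fisher_exact_two_sided(2834, 1796, 1069, 884)
p_comat = fisher_exact_two_sided(2834, 1796, 219, 261)
print(f"all matches: P = {p_all:.3g}; COMATs: P = {p_comat:.3g}")
```

Working in log space avoids overflow in the binomial coefficients, which become very large for tables with several thousand observations.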

Gene ontology analyses

About one-third (31% of the 1 to 2-cell and 36% of the ~32-cell) L. stagnalis transcripts (~3,400 genes) had significant BLASTx matches in the SwissProt database (Table 2). Blast2GO was used to functionally annotate both L. stagnalis transcriptomes. Of the 11,212 1 to 2-cell contigs, 4,311 (38%) had a significant BLASTx match, and 3,481 (31%) were assigned GO identifiers. Similarly, of 9,497 ~32-cell contigs, 4,255 (45%) had a significant BLASTx match, and 3,425 (36%) were assigned GO identifiers. For the COMAT subset, all but one of the 481 sequences had a significant BLASTx match, and 435 (90%) were assigned GO identifiers (Supplementary Table 1). The distribution of GO annotations into functional categories revealed no obvious qualitative differences between the 1 to 2-cell and ~32-cell L. stagnalis transcriptomes (Supplementary Figure 1). A Fisher’s exact test, with correction for multiple testing by false discovery rate, confirmed that no functional categories were significantly under- or overrepresented between the two libraries. In comparison, the COMAT subset was enriched for many functional categories compared with the complete L. stagnalis 1 to 2-cell transcriptome (Fig. 1; Table 5; Supplementary Table 2). In particular, GO terms associated with nucleotide metabolism and binding in general were overrepresented in the COMAT subset (Fig. 1; Table 5; Supplementary Table 2). The maternal expression of a selected set of the COMAT genes was validated in one-cell zygotes using in situ methods (Fig. 2).
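
The false discovery rate correction applied to the per-term tests can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure. The text does not name the specific FDR method, so the choice of Benjamini-Hochberg here is an assumption; the function name and the P-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, so that each adjusted value
    # never exceeds the adjusted value of any larger P-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values, one per GO term tested.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"P = {p:<6} adjusted = {q:.3f}")
```

A term would then be called enriched when its adjusted value falls below the chosen threshold, for example 0.05.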
Fig. 1

Enrichment of Gene Ontology terms in the conserved maternal transcript (COMAT) subset

Highest level GO terms that show the greatest enrichment in COMAT compared with the L. stagnalis 1 to 2-cell transcriptome. Only those comparisons with P < 1E-5 are shown. Black shading: percentage of each type in COMAT. Grey shading: percentage of each type in the 1 to 2-cell transcriptome.

TABLE 5

HIGHEST LEVEL GENE ONTOLOGY TERMS ENRICHED IN THE CONSERVED MATERNAL DATASET

| GO-ID | Term* | Category | FDR | P-Value after FDR | Number in test group | Number in 1 cell reference | Number in reference total | Number not annotated in test | Number not annotated reference |
|---|---|---|---|---|---|---|---|---|---|
| GO:0005524 | ATP binding | F | 2.83E-33 | 5.84E-36 | 119 | 136 | 255 | 271 | 1953 |
| GO:0005525 | GTP binding | F | 2.62E-15 | 1.08E-17 | 42 | 28 | 70 | 348 | 2061 |
| GO:0051082 | unfolded protein binding | F | 5.10E-11 | 2.75E-13 | 24 | 9 | 33 | 366 | 2080 |
| GO:0008026 | ATP-dependent helicase activity | F | 6.39E-09 | 7.10E-11 | 24 | 15 | 39 | 366 | 2074 |
| GO:0003924 | GTPase activity | F | 1.41E-08 | 1.61E-10 | 25 | 18 | 43 | 365 | 2071 |
| GO:0004674 | protein serine/threonine kinase activity | F | 4.92E-08 | 6.17E-10 | 25 | 20 | 45 | 365 | 2069 |
| GO:0003755 | peptidyl-prolyl cis-trans isomerase activity | F | 5.29E-07 | 7.72E-09 | 14 | 4 | 18 | 376 | 2085 |
| GO:0004767 | sphingomyelin phosphodiesterase activity | F | 1.05E-04 | 2.28E-06 | 7 | 0 | 7 | 383 | 2089 |
| GO:0004298 | threonine-type endopeptidase activity | F | 1.21E-04 | 2.74E-06 | 8 | 1 | 9 | 382 | 2088 |
| GO:0004842 | ubiquitin-protein ligase activity | F | 1.09E-03 | 2.96E-05 | 15 | 17 | 32 | 375 | 2072 |
| GO:0005200 | structural constituent of cytoskeleton | F | 2.95E-03 | 8.91E-05 | 6 | 1 | 7 | 384 | 2088 |
| GO:0008568 | microtubule-severing ATPase activity | F | 3.06E-03 | 9.43E-05 | 5 | 0 | 5 | 385 | 2089 |
| GO:0042288 | MHC class I protein binding | F | 1.50E-02 | 6.05E-04 | 4 | 0 | 4 | 386 | 2089 |
| GO:0005528 | FK506 binding | F | 1.50E-02 | 6.05E-04 | 4 | 0 | 4 | 386 | 2089 |
| GO:0019899 | enzyme binding | F | 1.92E-02 | 8.08E-04 | 24 | 56 | 80 | 366 | 2033 |
| GO:0003676 | nucleic acid binding | F | 2.13E-02 | 9.21E-04 | 80 | 293 | 373 | 310 | 1796 |
| GO:0007264 | small GTPase mediated signal transduction | P | 1.78E-10 | 1.24E-12 | 25 | 12 | 37 | 365 | 2077 |
| GO:0051258 | protein polymerization | P | 2.72E-07 | 3.75E-09 | 19 | 11 | 30 | 371 | 2078 |
| GO:0006184 | GTP catabolic process | P | 8.66E-07 | 1.32E-08 | 24 | 23 | 47 | 366 | 2066 |
| GO:0000413 | protein peptidyl-prolyl isomerization | P | 2.30E-06 | 3.94E-08 | 13 | 4 | 17 | 377 | 2085 |
| GO:0006468 | protein phosphorylation | P | 2.76E-06 | 4.87E-08 | 34 | 52 | 86 | 356 | 2037 |
| GO:0006200 | ATP catabolic process | P | 5.78E-04 | 1.50E-05 | 16 | 18 | 34 | 374 | 2071 |
| GO:0031145 | anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process | P | 1.83E-03 | 5.19E-05 | 9 | 5 | 14 | 381 | 2084 |
| GO:0000209 | protein polyubiquitination | P | 2.90E-03 | 8.70E-05 | 12 | 12 | 24 | 378 | 2077 |
| GO:0031110 | regulation of microtubule polymerization or depolymerization | P | 3.06E-03 | 9.43E-05 | 5 | 0 | 5 | 385 | 2089 |
| GO:0000165 | MAPK cascade | P | 3.12E-03 | 9.69E-05 | 8 | 4 | 12 | 382 | 2085 |
| GO:0030174 | regulation of DNA-dependent DNA replication initiation | P | 3.12E-03 | 9.69E-05 | 8 | 4 | 12 | 382 | 2085 |
| GO:0045087 | innate immune response | P | 3.49E-03 | 1.11E-04 | 10 | 8 | 18 | 380 | 2081 |
| GO:0051437 | positive regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | P | 5.31E-03 | 1.77E-04 | 7 | 3 | 10 | 383 | 2086 |
| GO:0007018 | microtubule-based movement | P | 6.73E-03 | 2.30E-04 | 12 | 14 | 26 | 378 | 2075 |
| GO:0031346 | positive regulation of cell projection organization | P | 8.65E-03 | 3.09E-04 | 6 | 2 | 8 | 384 | 2087 |
| GO:0051495 | positive regulation of cytoskeleton organization | P | 8.65E-03 | 3.09E-04 | 6 | 2 | 8 | 384 | 2087 |
| GO:0000216 | M/G1 transition of mitotic cell cycle | P | 1.13E-02 | 4.21E-04 | 7 | 4 | 11 | 383 | 2085 |
| GO:0051084 | ‘de novo’ post-translational protein folding | P | 1.29E-02 | 4.92E-04 | 5 | 1 | 6 | 385 | 2088 |
| GO:0000084 | S phase of mitotic cell cycle | P | 1.45E-02 | 5.71E-04 | 10 | 11 | 21 | 380 | 2078 |
| GO:0008356 | asymmetric cell division | P | 1.50E-02 | 6.05E-04 | 4 | 0 | 4 | 386 | 2089 |
| GO:0010458 | exit from mitosis | P | 1.50E-02 | 6.05E-04 | 4 | 0 | 4 | 386 | 2089 |
| GO:0071363 | cellular response to growth factor stimulus | P | 1.69E-02 | 6.97E-04 | 9 | 9 | 18 | 381 | 2080 |
| GO:0051704 | multi-organism process | P | 2.41E-02 | 1.05E-03 | 25 | 61 | 86 | 365 | 2028 |
| GO:0051225 | spindle assembly | P | 3.17E-02 | 1.50E-03 | 5 | 2 | 7 | 385 | 2087 |
| GO:0050684 | regulation of mRNA processing | P | 3.17E-02 | 1.50E-03 | 5 | 2 | 7 | 385 | 2087 |
| GO:0006977 | DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest | P | 3.17E-02 | 1.50E-03 | 5 | 2 | 7 | 385 | 2087 |
| GO:0007167 | enzyme linked receptor protein signaling pathway | P | 3.31E-02 | 1.58E-03 | 12 | 19 | 31 | 378 | 2070 |
| GO:0051436 | negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | P | 3.61E-02 | 1.75E-03 | 6 | 4 | 10 | 384 | 2085 |
| GO:0030522 | intracellular receptor signaling pathway | P | 4.64E-02 | 2.29E-03 | 8 | 9 | 17 | 382 | 2080 |
| GO:0045664 | regulation of neuron differentiation | P | 4.64E-02 | 2.29E-03 | 8 | 9 | 17 | 382 | 2080 |
| GO:0005874 | microtubule | C | 3.31E-06 | 5.93E-08 | 21 | 19 | 40 | 369 | 2070 |
| GO:0019773 | proteasome core complex, alpha-subunit complex | C | 1.05E-04 | 2.28E-06 | 7 | 0 | 7 | 383 | 2089 |
| GO:0045298 | tubulin complex | C | 3.06E-03 | 9.43E-05 | 5 | 0 | 5 | 385 | 2089 |
| GO:0005681 | spliceosomal complex | C | 5.33E-03 | 1.78E-04 | 18 | 30 | 48 | 372 | 2059 |
| GO:0048471 | perinuclear region of cytoplasm | C | 1.69E-02 | 7.00E-04 | 11 | 14 | 25 | 379 | 2075 |
| GO:0005829 | cytosol | C | 2.00E-02 | 8.53E-04 | 42 | 126 | 168 | 348 | 1963 |
| GO:0005663 | DNA replication factor C complex | C | 3.17E-02 | 1.50E-03 | 5 | 2 | 7 | 385 | 2087 |

* ordered by category and significance

Fig. 2

Visualisation of maternal gene product spatial distribution in uncleaved zygotes of Lymnaea stagnalis by whole mount in situ hybridisation

Eight maternal gene products were visualised in uncleaved zygotes relative to a negative control (β-tubulin). (A) β-tubulin is not detectable in uncleaved zygotes. A polar body is indicated by the horizontal arrow. (B) β-tubulin is clearly expressed in ciliated cells of older veliger larvae. (C) contig_2724: ATP-dependent RNA helicase dhx8. (D) contig_453: heat shock 70 kda protein cognate 4. (E) contig_7974: ADP-ribosylation factor 4. (F) contig_9053: proteasome alpha 6 subunit. (G) contig_579: ergic and golgi 2. (H) contig_9016: eukaryotic translation initiation factor 3 subunit i. (I) contig_8075: eukaryotic translation elongation factor. (J) contig_8318: 78 kda glucose-regulated protein.

Comparison with human housekeeping genes

The COMAT subset was compared to 3,802 well-characterised human housekeeping genes (Eisenberg and Levanon, 2013). All but 38 of the 481 COMAT transcripts had a significant match to this set (92%), indicating that the majority are housekeeping in function, at least in humans. In comparison, of the 4,311 L. stagnalis 1 to 2-cell transcripts that had a significant BLASTx match in the NCBI nr protein database, only 2,165 (50%) also had matches to the human housekeeping gene dataset. The conserved maternal gene dataset is therefore highly enriched for putative housekeeping genes (Fisher’s exact test, 2,156:4,311 versus 443:481, P < 0.0001). We wished to understand whether a particular subset of housekeeping genes is over-represented in the COMAT subset, or whether the genes are a random subset of all housekeeping genes. We therefore compared the GO annotations of the 3,802 human housekeeping genes against the subset of 300 human housekeeping genes (Table 6) that were found in the COMAT (a proportion of the COMATs hit the same human gene, hence fewer genes than expected). Similar GO annotations were enriched in this selected pairwise comparison compared with the COMAT as a whole (Supplementary Tables 3 and 4). At the highest level, the same first seven Molecular Functions were found in both H. sapiens housekeeping versus H. sapiens COMAT, and L. stagnalis 1 to 2-cell transcriptome versus L. stagnalis COMAT comparisons, with P < 5E−8 (Supplementary Table 4; ATP binding, GTPase activity, unfolded protein binding, protein serine/threonine kinase activity, GTP binding, threonine-type endopeptidase activity, and ATP-dependent RNA helicase activity). 
Similarly, the first seven terms relating to Biological Process were also found (P < 5E−8; anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process; protein polyubiquitination; negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle; DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest; positive regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle; antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent; and GTP catabolic process). Thus, the overall conclusion is that the COMAT generally consists of housekeeping genes, but is especially enriched for a specific subset, including those involved in nucleotide binding functions, protein degradation and activities associated with the cell cycle.
TABLE 6

THE 300 HUMAN GENES IN THE CONSERVED MATERNAL DATASET

| Gene | Accession | Description | Gene | Accession | Description |
|---|---|---|---|---|---|
| MTRR | NM_002454 | 5-methyltetrahydrofolate-homocysteine methyltransferase reductase | NOP5/NOP58 | NM_015934 | Nucleolar protein NOP5/NOP58 |
| ACAD9 | NM_014049 | Acyl-Coenzyme A dehydrogenase family, member 9 | NAP1L4 | NM_005969 | Nucleosome assembly protein 1-like 4 |
| ACADVL | NM_000018 | Acyl-Coenzyme A dehydrogenase, very long chain | OTUB1 | NM_017670 | OTU domain, ubiquitin aldehyde binding 1 |
| ARF1 | NM_001658 | ADP-ribosylation factor 1 | OSBPL2 | NM_014835 | Oxysterol binding protein-like 2 |
| ARF5 | NM_001662 | ADP-ribosylation factor 5 | PAK2 | NM_002577 | P21 (CDKN1A)-activated kinase 2 |
| ARF6 | NM_001663 | ADP-ribosylation factor 6 | PCAF | NM_003884 | P300/CBP-associated factor |
| ARFGAP3 | NM_014570 | ADP-ribosylation factor GTPase activating protein 3 | PCTK1 | NM_006201 | PCTAIRE protein kinase 1 |
| ARL1 | NM_001177 | ADP-ribosylation factor-like 1 | PPWD1 | NM_015342 | Peptidylprolyl isomerase domain and WD repeat containing 1 |
| AHSA1 | NM_012111 | AHA1, activator of heat shock 90kDa protein ATPase homolog 1 (yeast) | PPIE | NM_006112 | Peptidylprolyl isomerase E (cyclophilin E) |
| ALDH9A1 | NM_000696 | Aldehyde dehydrogenase 9 family, member A1 | PPIF | NM_005729 | Peptidylprolyl isomerase F (cyclophilin F) |
| AAMP | NM_001087 | Angio-associated, migratory cell protein | PPIH | NM_006347 | Peptidylprolyl isomerase H (cyclophilin H) |
| ANKRD17 | NM_032217 | Ankyrin repeat domain 17 | PRDX1 | NM_002574 | Peroxiredoxin 1 |
| ANKRD28 | NM_001195098 | Ankyrin repeat domain 28 | PRDX2 | NM_005809 | Peroxiredoxin 2 |
| ARD1A | NM_003491 | ARD1 homolog A, N-acetyltransferase (S. cerevisiae) | PECI | NM_006117 | Peroxisomal D3,D2-enoyl-CoA isomerase |
| ACTR1A | NM_005736 | ARP1 actin-related protein 1 homolog A, centractin alpha (yeast) | PI4KB | NM_002651 | Phosphatidylinositol 4-kinase, catalytic, beta |
| ACTR1B | NM_005735 | ARP1 actin-related protein 1 homolog B, centractin beta (yeast) | PLAA | NM_001031689 | Phospholipase A2-activating protein |
| ARNT | NM_001668 | Aryl hydrocarbon receptor nuclear translocator | PRPSAP1 | NM_002766 | Phosphoribosyl pyrophosphate synthetase-associated protein 1 |
| ATP5A1 | NM_004046 | ATP synthase, H+ transporting, mitochondrial F1 complex, alpha subunit 1 | PAFAH1B1 | NM_000430 | Platelet-activating factor acetylhydrolase, isoform Ib, alpha subunit 45kDa |
| ATP5B | NM_001686 | ATP synthase, H+ transporting, mitochondrial F1 complex, beta polypeptide | PLRG1 | NM_002669 | Pleiotropic regulator 1 (PRL1 homolog, Arabidopsis) |
| ATAD1 | NM_032810 | ATPase family, AAA domain containing 1 | PHB | NM_002634 | Prohibitin |
| ABCB10 | NM_012089 | ATP-binding cassette, sub-family B (MDR/TAP), member 10 | PHB2 | NM_001144831 | Prohibitin 2 |
| ABCB7 | NM_004299 | ATP-binding cassette, sub-family B (MDR/TAP), member 7 | PSMC2 | NM_002803 | Proteasome (prosome, macropain) 26S subunit, ATPase, 2 |
| BXDC5 | NM_025065 | Brix domain containing 5 | PSMC3 | NM_002804 | Proteasome (prosome, macropain) 26S subunit, ATPase, 3 |
| BRD7 | NM_013263 | Bromodomain containing 7 | PSMC4 | NM_006503 | Proteasome (prosome, macropain) 26S subunit, ATPase, 4 |
| BPTF | NM_004459 | Bromodomain PHD finger transcription factor | PSMC5 | NM_002805 | Proteasome (prosome, macropain) 26S subunit, ATPase, 5 |
| BUB3 | NM_004725 | BUB3 budding uninhibited by benzimidazoles 3 homolog (yeast) | PSMC6 | NM_002806 | Proteasome (prosome, macropain) 26S subunit, ATPase, 6 |
| CAB39 | NM_016289 | Calcium binding protein 39 | PSMD10 | NM_002814 | Proteasome (prosome, macropain) 26S subunit, non-ATPase, 10 |
| CALU | NM_001219 | Calumenin | PSMD11 | NM_002815 | Proteasome (prosome, macropain) 26S subunit, non-ATPase, 11 |
| CBR4 | NM_032783 | Carbonyl reductase 4 | PSMA1 | NM_002786 | Proteasome (prosome, macropain) subunit, alpha type, 1 |
| CSNK1A1 | NM_001892 | Casein kinase 1, alpha 1 | PSMA2 | NM_002787 | Proteasome (prosome, macropain) subunit, alpha type, 2 |
| CSNK1D | NM_001893 | Casein kinase 1, delta | PSMA3 | NM_002788 | Proteasome (prosome, macropain) subunit, alpha type, 3 |
| CSNK2A3 | NM_001256686 | casein kinase 2, alpha 3 polypeptide | PSMA4 | NM_002789 | Proteasome (prosome, macropain) subunit, alpha type, 4 |
| CTCF | NM_006565 | CCCTC-binding factor (zinc finger protein) | PSMA5 | NM_002790 | Proteasome (prosome, macropain) subunit, alpha type, 5 |
| CNBP | NM_003418 | CCHC-type zinc finger, nucleic acid binding protein | PSMA6 | NM_002791 | Proteasome (prosome, macropain) subunit, alpha type, 6 |
| CD63 | NM_001780 | CD63 molecule | PSMA7 | NM_002792 | Proteasome (prosome, macropain) subunit, alpha type, 7 |
| CRKRS | NM_015083 | CDC2-related kinase, arginine/serine-rich | PSMB2 | NM_002794 | Proteasome (prosome, macropain) subunit, beta type, 2 |
| CDC37 | NM_007065 | CDC37 homolog (S. cerevisiae) | PSMB6 | NM_002798 | Proteasome (prosome, macropain) subunit, beta type, 6 |
| CDC42 | NM_001791 | CDC42 (GTP binding protein, 25kDa) | PSMB7 | NM_002799 | Proteasome (prosome, macropain) subunit, beta type, 7 |
| CDC5L | NM_001253 | CDC5 CDC5-like (S. pombe) | PIAS1 | NM_016166 | Protein inhibitor of activated STAT, 1 |
| CLK3 | NM_003992 | CDC-like kinase 3 | PRKAA1 | NM_006251 | Protein kinase, AMP-activated, alpha 1 catalytic subunit |
| CCT3 | NM_005998 | Chaperonin containing TCP1, subunit 3 (gamma) | PPP1CC | NM_002710 | Protein phosphatase 1, catalytic subunit, gamma isoform |
| CCT4 | NM_006430 | Chaperonin containing TCP1, subunit 4 (delta) | PPP2CB | NM_001009552 | Protein phosphatase 2 (formerly 2A), catalytic subunit, beta isoform |
| CCT5 | NM_012073 | Chaperonin containing TCP1, subunit 5 (epsilon) | PPP2R5D | NM_006245 | Protein phosphatase 2, regulatory subunit B’, delta isoform |
| CCT6A | NM_001762 | Chaperonin containing TCP1, subunit 6A (zeta 1) | PPP4C | NM_002720 | Protein phosphatase 4 (formerly X), catalytic subunit |
| CCT7 | NM_006429 | Chaperonin containing TCP1, subunit 7 (eta) | PPP6C | NM_002721 | Protein phosphatase 6, catalytic subunit |
| CCT8 | NM_006585 | Chaperonin containing TCP1, subunit 8 (theta) | PSKH1 | NM_006742 | Protein serine kinase H1 |
| CHD4 | NM_001273 | Chromodomain helicase DNA binding protein 4 | PTPN1 | NM_002827 | Protein tyrosine phosphatase, non-receptor type 1 |
| C14orf130 | NM_175748 | Chromosome 14 open reading frame 130 | PRPF31 | NM_015629 | PRP31 pre-mRNA processing factor 31 homolog (S. cerevisiae) |
| CSTF1 | NM_001324 | Cleavage stimulation factor, 3′ pre-RNA, subunit 1, 50kDa | PRPF4 | NM_004697 | PRP4 pre-mRNA processing factor 4 homolog (yeast) |
| CSTF2T | NM_015235 | Cleavage stimulation factor, 3′ pre-RNA, subunit 2, 64kDa, tau variant | PWP2 | NM_005049 | PWP2 periodic tryptophan protein homolog (yeast) |
| COPA | NM_004371 | Coatomer protein complex, subunit alpha | RAB10 | NM_016131 | RAB10, member RAS oncogene family |
| COPS2 | NM_004236 | COP9 constitutive photomorphogenic homolog subunit 2 (Arabidopsis) | RAB11B | NM_004218 | RAB11B, member RAS oncogene family |
| CTDSP2 | NM_005730 | CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) small phosphatase 2 | RAB14 | NM_016322 | RAB14, member RAS oncogene family |
| CLEC3B | NM_015004 | C-type lectin domain family 3, member B | RAB18 | NM_021252 | RAB18, member RAS oncogene family |
| CUL1 | NM_003592 | Cullin 1 | RAB1A | NM_004161 | RAB1A, member RAS oncogene family |
| CUL4B | NM_003588 | Cullin 4B | RAB2A | NM_002865 | RAB2A, member RAS oncogene family |
| CDK9 | NM_001261 | Cyclin-dependent kinase 9 | RAB5C | NM_004583 | RAB5C, member RAS oncogene family |
| CYB5B | NM_030579 | Cytochrome b5 type B (outer mitochondrial membrane) | RAB7A | NM_004637 | RAB7A, member RAS oncogene family |
| CYP2U1 | NM_183075 | Cytochrome P450, family 2, subfamily U, polypeptide 1 | RDX | NM_002906 | Radixin |
| DAZAP1 | NM_018959 | DAZ associated protein 1 | RANBP1 | NM_002882 | RAN binding protein 1 |
| DDX19B | NM_007242 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 19B | RAN | NM_006325 | RAN, member RAS oncogene family |
| DDX1 | NM_004939 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 1 | RAP1A | NM_002884 | RAP1A, member of RAS oncogene family |
| DDX17 | NM_006386 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 17 | RHOA | NM_001664 | Ras homolog gene family, member A |
| DDX18 | NM_006773 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 18 | REST | NM_005612 | RE1-silencing transcription factor |
| DDX21 | NM_004728 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 21 | RFC2 | NM_002914 | Replication factor C (activator 1) 2, 40kDa |
| DDX23 | NM_004818 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 23 | RFC5 | NM_007370 | Replication factor C (activator 1) 5, 36.5kDa |
| DDX24 | NM_020414 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 24 | RBBP4 | NM_005610 | Retinoblastoma binding protein 4 |
| DDX27 | NM_017895 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 27 | RXRA | NM_002957 | Retinoid X receptor, alpha |
| DDX3X | NM_001356 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 3, X-linked | RDH14 | NM_020905 | Retinol dehydrogenase 14 (all-trans/9-cis/11-cis) |
| DDX41 | NM_016222 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 41 | REXO1 | NM_020695 | REX1, RNA exonuclease 1 homolog (S. cerevisiae) |
| DDX47 | NM_016355 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 47 | RPL14 | NM_003973 | Ribosomal protein L14 |
| DDX54 | NM_024072 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 54 | RPL35 | NM_007209 | Ribosomal protein L35 |
| DDX56 | NM_019082 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 56 | RPS6KB1 | NM_003161 | Ribosomal protein S6 kinase, 70kDa, polypeptide 1 |
| DHX15 | NM_001358 | DEAH (Asp-Glu-Ala-His) box polypeptide 15 | RPS6KB2 | NM_003952 | Ribosomal protein S6 kinase, 70kDa, polypeptide 2 |
| DHX38 | NM_014003 | DEAH (Asp-Glu-Ala-His) box polypeptide 38 | RPS6KA3 | NM_004586 | Ribosomal protein S6 kinase, 90kDa, polypeptide 3 |
| DHX8 | NM_004941 | DEAH (Asp-Glu-Ala-His) box polypeptide 8 | RRP1 | NM_003683 | Ribosomal RNA processing 1 homolog (S. cerevisiae) |
| DHRS7B | NM_015510 | Dehydrogenase/reductase (SDR family) member 7B | AHCY | NM_000687 | S-adenosylhomocysteine hydrolase |
| DLG1 | NM_004087 | Discs, large homolog 1 (Drosophila) | SCRIB | NM_015356 | Scribbled homolog (Drosophila) |
| DNAJA2 | NM_005880 | DNAJ (Hsp40) homolog, subfamily A, member 2 | STRAP | NM_007178 | Serine/threonine kinase receptor associated protein |
| DNAJA3 | NM_005147 | DNAJ (Hsp40) homolog, subfamily A, member 3 | SETD8 | NM_020382 | SET domain containing (lysine methyltransferase) 8 |
DNAJB12NM_017626DNAJ (Hsp40) homolog, subfamily B, member 12SMAD5NM_005903SMAD family member 5
DNAJC10NM_018981DNAJ (Hsp40) homolog, subfamily C, member 10SMU1NM_018225Smu-1 suppressor of mec-8 and unc-52 homolog (C. elegans)
DNAJC17NM_018163DNAJ (Hsp40) homolog, subfamily C, member 17SHOC2NM_007373Soc-2 suppressor of clear homolog (C. elegans)
DNAJC5NM_025219DNAJ (Hsp40) homolog, subfamily C, member 5SLC25A11NM_003562Solute carrier family 25 (mitochondrial carrier; oxoglutaratecarrier), member 11
DUSP16NM_030640Dual specificity phosphatase 16SLC25A39NM_016016Solute carrier family 25, member 39
ELAVL1NM_001419ELAV (embryonic lethal, abnormal vision, Drosophila)-like 1(Hu antigen R)SLC39A7NM_006979Solute carrier family 39 (zinc transporter), member 7
ETFANM_000126Electron-transfer-flavoprotein, alpha polypeptide (glutaricaciduria II)SPG7NM_003119Spastic paraplegia 7 (pure and complicated autosomal recessive)
ECHS1NM_004092Enoyl Coenzyme A hydratase, short chain, 1, mitochondrialSPATA5L1NM_024063Spermatogenesis associated 5-like 1
ERGIC2NM_016570ERGIC and golgi 2SFRS2NM_003016Splicing factor, arginine/serine-rich 2
EEF2NM_001961Eukaryotic translation elongation factor 2SAE1NM_005500SUMO1 activating enzyme subunit 1
EIF2AK3NM_004836Eukaryotic translation initiation factor 2-alpha kinase 3UBA2NM_005499SUMO1 activating enzyme subunit 2
EIF3DNM_003753Eukaryotic translation initiation factor 3, subunit DTAF5LNM_014409TAF5-like RNA polymerase II, p300/CBP-associated factor(PCAF)-associated factor, 65kDa
EIF3INM_003757Eukaryotic translation initiation factor 3, subunit ITNKS2NM_025235Tankyrase, TRF1-interacting ankyrin-related ADP-ribosepolymerase 2
EIF4A1NM_001416Eukaryotic translation initiation factor 4A, isoform 1TCP1NM_030752T-complex 1
EIF4A3NM_014740Eukaryotic translation initiation factor 4A, isoform 3TXN2NM_012473Thioredoxin 2
EIF4E2NM_004846Eukaryotic translation initiation factor 4E family member 2TXNDC9NM_005783Thioredoxin domain containing 9
FBXW11NM_012300F-box and WD repeat domain containing 11TIAL1NM_003252TIA1 cytotoxic granule-associated RNA binding protein-like 1
FZR1NM_016263Fizzy/CDC20 related 1 (Drosophila)TRAP1NM_001272049TNF receptor-associated protein 1
FKBP3NM_002013FK506 binding protein 3, 25kDaTOMM70ANM_014820Translocase of outer mitochondrial membrane 70 homolog A(S. cerevisiae)
FTSJ1NM_012280FtsJ homolog 1 (E. coli)TPI1NM_000365Triosephosphate isomerase 1
FUSIP1NM_006625FUS interacting protein (serine/arginine-rich) 1TUFMNM_003321Tu translation elongation factor, mitochondrial
GTF2BNM_001514General transcription factor IIBTUBA1BNM_006082Tubulin, alpha 1b
GNPDA1NM_005471Glucosamine-6-phosphate deaminase 1TUBA1CNM_032704Tubulin, alpha 1c
GRWD1NM_031485Glutamate-rich WD repeat containing 1TUBBNM_178014Tubulin, beta
GRPEL1NM_025196GrpE-like 1, mitochondrial (E. coli)YWHABNM_003404Tyrosine 3-monooxygenase/tryptophan 5-monooxygenaseactivation protein, beta polypeptide
GTPBP4NM_012341GTP binding protein 4YWHAENM_006761Tyrosine 3-monooxygenase/tryptophan 5-monooxygenaseactivation protein, epsilon polypeptide
GTPBP10NM_033107GTP-binding protein 10 (putative)UBA52NM_003333Ubiquitin A-52 residue ribosomal protein fusion product 1
GNL2NM_013285Guanine nucleotide binding protein-like 2 (nucleolar)UBBNM_018955Ubiquitin B
GNL3NM_014366Guanine nucleotide binding protein-like 3 (nucleolar)UBCNM_021009Ubiquitin C
H2AFVNM_012412H2A histone family, member VUBE3CNM_014671Ubiquitin protein ligase E3C
HBS1LNM_006620HBS1-like (S. cerevisiae)UBA3NM_003968Ubiquitin-activating enzyme E1C (UBA3 homolog, yeast)
HSPE1NM_001202485Heat shock 10kDa protein 1 (chaperonin 10)UBE2V1NM_021988Ubiquitin-conjugating enzyme E2 variant 1
HSPA5NM_005347Heat shock 70kDa protein 5 (glucose-regulated protein, 78kDa)UBE2ANM_003336Ubiquitin-conjugating enzyme E2A (RAD6 homolog)
HSPA8NM_006597Heat shock 70kDa protein 8UBE2BNM_003337Ubiquitin-conjugating enzyme E2B (RAD6 homolog)
HSPA9NM_004134Heat shock 70kDa protein 9 (mortalin)UBE2D2NM_003339Ubiquitin-conjugating enzyme E2D 2 (UBC4/5 homolog, yeast)
HGSNM_004712Hepatocyte growth factor-regulated tyrosine kinase substrateUBE2D3NM_003340Ubiquitin-conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)
HNRPDNM_002138Heterogeneous nuclear ribonucleoprotein D (AU-rich elementRNA binding protein 1)UBE2G2NM_003343Ubiquitin-conjugating enzyme E2G 2 (UBC7 homolog, yeast)
HAT1NM_003642Histone acetyltransferase 1UBE2INM_003345Ubiquitin-conjugating enzyme E2I (UBC9 homolog, yeast)
BAT1NM_004640HLA-B associated transcript 1UBE2NNM_003348Ubiquitin-conjugating enzyme E2N (UBC13 homolog, yeast)
IMP4NM_033416IMP4, U3 small nucleolar ribonucleoprotein, homolog (yeast)UBE2Q1NM_017582Ubiquitin-conjugating enzyme E2Q (putative) 1
JAK1NM_002227Janus kinase 1 (a protein tyrosine kinase)UBE2R2NM_017811Ubiquitin-conjugating enzyme E2R 2
KPNA1NM_002264Karyopherin alpha 1 (importin alpha 5)VRK2NM_006296Vaccinia related kinase 2
KLHL8NM_020803Kelch-like 8 (Drosophila)VPS4ANM_013245Vacuolar protein sorting 4 homolog A (S. cerevisiae)
L3MBTL2NM_031488L(3)mbt-like 2 (Drosophila)AKT1NM_005163V-akt murine thymoma viral oncogene homolog 1
LRRC47NM_020710Leucine rich repeat containing 47VCPNM_007126Valosin-containing protein
MAPRE2NM_014268Microtubule-associated protein, RP/EB family, member 2VBP1NM_003372Von Hippel-Lindau binding protein 1
MCM7NM_005916Minichromosome maintenance complex component 7RALANM_005402V-ral simian leukemia viral oncogene homolog A (ras related)
MRPL4NM_015956Mitochondrial ribosomal protein L4WDR12NM_018256WD repeat domain 12
MAPK1NM_002745Mitogen-activated protein kinase 1WDR3NM_006784WD repeat domain 3
MAPK9NM_002752Mitogen-activated protein kinase 9WDR57NM_004814WD repeat domain 57 (U5 snRNP specific)
MAP2K1NM_002755Mitogen-activated protein kinase kinase 1WDR5BNM_019069WD repeat domain 5B
MAP2K2NM_030662Mitogen-activated protein kinase kinase 2WDR61NM_025234WD repeat domain 61
MAP2K5NM_002757Mitogen-activated protein kinase kinase 5YPEL2NM_001005404Yippee-like 2 (Drosophila)
MAP4K4NM_004834Mitogen-activated protein kinase kinase kinase kinase 4YME1L1NM_014263YME1-like 1 (S. cerevisiae)
MAPKAPK2NM_004759Mitogen-activated protein kinase-activated protein kinase 2YY1NM_003403YY1 transcription factor
MLH1NM_000249MutL homolog 1, colon cancer, nonpolyposis type 2 (E. coli)ZBTB6NM_006626Zinc finger and BTB domain containing 6
MLLT1NM_005934Myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog,Drosophila); translocated to, 1ZNF138NM_001271649zinc finger protein 138
MYNNNM_018657MyoneurinZNF195NM_007152Zinc finger protein 195
MYO1ENM_004998Myosin IEZNF197NM_006991Zinc finger protein 197
MTMR1NM_003828Myotubularin related protein 1ZNF289NM_032389Zinc finger protein 289, ID1 regulated
NDUFS8NM_002496NADH dehydrogenase (ubiquinone) Fe-S protein 8, 23kDa(NADH-coenzyme Q reductase)ZNF347NM_032584Zinc finger protein 347
NEDD8NM_006156Neural precursor cell expressed, developmentally down-regulated 8ZNF37ANM_003421Zinc finger protein 37A
NF2NM_000268Neurofibromin 2 (bilateral acoustic neuroma)ZNF397NM_001135178Zinc finger protein 397
NHP2L1NM_005008NHP2 non-histone chromosome protein 2-like 1 (S. cerevisiae)ZNF41NM_007130Zinc finger protein 41
NEK4NM_003157NIMA (never in mitosis gene a)-related kinase 4ZNF506NM_001099269Zinc finger protein 506
NSUN2NM_017755NOL1/NOP2/Sun domain family, member 2ZNF91NM_003430Zinc finger protein 91
NOL1NM_006170Nucleolar protein 1, 120kDaZFAND1NM_024699Zinc finger, AN1-type domain 1
NOL5ANM_006392Nucleolar protein 5A (56kDa with KKE/D repeat)ZFAND5NM_006007Zinc finger, AN1-type domain 5
NOLA2NM_017838Nucleolar protein family A, member 2 (H/ACA small nucleolarRNPs)ZDHHC5NM_015457Zinc finger, DHHC-type containing 5
NOLA3NM_018648Nucleolar protein family A, member 3 (H/ACA small nucleolarRNPs)ZRF1NM_014377Zuotin related factor 1
A final concern was that the COMATs are simply conserved genes that tend to be highly expressed, and so are more likely to be detected in non-exhaustive sequencing experiments. We therefore used the expression data of Eisenberg & Levanon (2013) to compare the read depth of these two types of gene (COMATs and non-COMATs) in human tissues. Overall, COMATs tend to be more highly expressed, but they span a large range of expression levels (Figure 3). While the mean gene expression in the conserved dataset is higher (COMAT mean log geometric gene expression = 1.08, S.E. 0.03; non-COMAT mean = 0.90, S.E. 0.008; P < 0.001), the individual variation is considerable in both datasets (S.D. 0.51 and 0.47 respectively). A lack of depth in sequencing experiments therefore cannot wholly explain the existence of COMATs.
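As a rough consistency check on the summary statistics quoted above, the following Python sketch recovers the approximate group sizes from S.E. = S.D. / sqrt(n) and computes a large-sample z statistic for the difference in means. This is an illustrative calculation only: the text does not state which test produced the reported P value, so the normal approximation and the function names here are assumptions.

```python
import math

def implied_sample_size(sd, se):
    """Sample size implied by the relation SE = SD / sqrt(n)."""
    return (sd / se) ** 2

def z_statistic(mean_a, se_a, mean_b, se_b):
    """Large-sample z statistic for a difference between two independent means."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

def two_sided_p(z):
    """Two-sided P value under a standard normal approximation (assumed here)."""
    return math.erfc(abs(z) / math.sqrt(2))

# Summary statistics as quoted in the text (log geometric gene expression).
n_comat = implied_sample_size(sd=0.51, se=0.03)
n_non_comat = implied_sample_size(sd=0.47, se=0.008)
z = z_statistic(1.08, 0.03, 0.90, 0.008)
p = two_sided_p(z)

print(f"implied n: COMAT ~{n_comat:.0f}, non-COMAT ~{n_non_comat:.0f}")
print(f"z = {z:.2f}, two-sided p = {p:.1e}")
```

Under these assumptions the implied group sizes are a few hundred COMATs against a few thousand non-COMATs, and the difference in means is consistent with the reported P < 0.001, while the standard deviations show that the two distributions overlap heavily.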
Fig. 3

Frequency histogram of relative gene expression for human housekeeping genes

Conserved maternal transcripts (COMATs, red line) tend to have a higher gene expression (measured as reads per kilobase per million mapped reads, RPKM) than non-COMATs (blue). However, COMATs still span several orders of magnitude of gene expression. Gene expression data from Eisenberg & Levanon (2013).
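For readers unfamiliar with the unit, the Python sketch below shows how an RPKM value is computed from a raw read count and how a per-gene summary across tissues might be taken. It is not the pipeline of Eisenberg & Levanon (2013): the function names and the example read counts are invented, and reading "log geometric gene expression" as log10 of the geometric mean across tissues is an assumption.

```python
import math

def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

def log10_geometric_mean(values):
    """log10 of the geometric mean, i.e. the mean of the log10 values (assumed summary)."""
    return sum(math.log10(v) for v in values) / len(values)

# Invented example: one 2 kb gene in three tissues, 10 million mapped reads each.
tissue_rpkm = [rpkm(n, 2000, 10_000_000) for n in (50, 500, 5000)]
summary = log10_geometric_mean(tissue_rpkm)

print(tissue_rpkm, round(summary, 3))
```

On a log10 scale, one unit corresponds to a tenfold change in expression, which is why a standard deviation of about 0.5 already implies a wide spread of expression levels within each group.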

Discussion

Much excitement has been caused by the discovery that the evolution of gene expression patterns seems to underpin the morphological hourglass pattern of both plants and animals (Kalinka et al., 2010, Meyerowitz, 2002, Quint et al., 2012). Thus, the long-standing observation that vertebrate morphology is at its most conserved during the embryonic pharyngula or phylotypic period is generally mirrored by conserved expression patterns of conserved genes at these stages (Kalinka and Tomancak, 2012, Kalinka et al., 2010). In contrast, active transcription in the early zygote is much more limited. Early animal embryos instead largely rely upon RNAs and proteins provided by the maternal gonad during oocyte maturation. This transcriptionally-quiescent period might, a priori, be considered evolutionarily constrained, as the maternally provided transcriptome is widely considered to fulfill one major role: the initiation and management of several rounds of rapid cell division. Every one of these early cell divisions is a critical event that must be faithfully completed to ensure the development of a healthy embryo (Evsikov et al.). Few studies have investigated the level of conservation of maternally provided genes (Shen-Orr et al., 2010), despite their well-recognised importance in early development (Wieschaus, 1996). Indeed, there are few comprehensive datasets of maternally provisioned transcripts even in well-characterised taxa, and none in the Lophotrochozoa. Improvements in sequencing technologies mean that quantitative transcriptome studies are now possible on organisms that lack genomic resources. Our work therefore provides a list of conserved maternal transcripts, or COMATs (Table 6; Supplementary Table 1), that may be useful to the wider community interested in the study of early bilaterian development. We identified a core set of COMATs from seven representatives of the three bilaterian superphyla, spanning >600 million years of evolution (Peterson et al.).
These species display highly divergent modes of development (from direct to indirect, and mosaic to regulative). Since the L. stagnalis maternal transcriptome we report here is unlikely to be complete, one possibility is that our estimate of 5-10% of all maternally provisioned transcripts being conserved across the Bilateria may rise upon deeper sampling of the snail transcriptome. Conversely, the number may fall as maternal transcriptomes from more taxa are included in the analysis. Unsurprisingly, we found that many of these genes had nucleotide (especially ATP and GTP) binding functions, were associated with protein degradation or had activities associated with the cell cycle (Table 6). The majority of functions ascribed are probably accurately defined as housekeeping (Eisenberg and Levanon, 2013). One possibility is that some of the most conserved maternal RNAs are those that cannot be provided (solely) as proteins. Cell cycle genes may be illustrative, because some cell cycle proteins are degraded every cycle and so maternal protein alone cannot be sufficient. Finally, the fact that the ~32-cell transcriptome was neither enriched nor underrepresented for any gene ontology term relative to the 1 to 2-cell transcriptome, along with a relative over-representation of maternal-zygotic transcripts that are conserved between M. musculus / C. elegans and L. stagnalis, suggests that the same transcripts are at least still present during early zygotic transcription (Supplementary Figure 1). Given the wide variety of developmental modes and rates displayed by metazoan embryos, as well as the hourglass theory of evolution (Kalinka and Tomancak, 2012), one view is that we might expect to find relatively few deeply conserved maternal transcripts.
Alternatively, as it has been documented that a relatively large fraction (between 45% and 75%) of all genes within a species’ genome can be found as maternal transcripts (see references within Tadros and Lipshitz, 2009), another view is that maternal transcripts that are conserved between different organisms may be a stochastic subset of a large maternal transcriptome. Instead, our analyses suggest that there is a core and specific set of maternal transcripts that may be essential for early cell divisions, irrespective of the precise mode of development. Both our data and the other datasets used in this study have obvious limitations, primarily limited sequencing coverage, so it remains uncertain whether further investigation will reveal a greater or lesser proportion of conserved maternal transcripts. A related consideration is that, because the study is at best partially quantitative, we have detected those genes that are both conserved and transcribed at a relatively high level across all taxa. Further studies are warranted to reveal the true nature of this conservation. Nonetheless, as we found that the conserved maternal part of a well annotated group of H. sapiens housekeeping genes is enriched for precisely the same functions (Table 6, Supplementary Table 3), we can robustly conclude that gene expression is highly conserved in the early development of bilaterian embryos. There may also be a distinct set of genes, with mostly housekeeping and nucleotide metabolic functions, that is a necessary starting point of the maternal-to-zygotic transition. Our analyses thus suggest that the ancestral function of maternal provisioning in animal eggs is to supply the zygote with the materials with which to perform the basic cellular functions of rapid cell division in the early stages of development. The extent of the provisioning is evolutionarily labile, with species that have evolved rapid development relying more on maternal products.
Addition of patterning molecules is phylogenetically contingent: as different groups and species have evolved different mechanisms of patterning the embryo and been under selection for fast patterning (as in lineage-driven, or mosaic development) or delayed patterning (as in species with regulative development), so the role of maternal factors in driving patterning has changed.

Materials and Methods

cDNA library construction

Early development in the pond snail L. stagnalis has been described in exquisite morphological and cytological detail (Raven, 1966). Although the L. stagnalis MZT has not been mapped in the same detail as in model species, transcription from zygotic nuclei was first detected in 8-cell embryos, and major transcriptional activity at the 24-cell stage (Morrill, 1982). While division cycles are not as rapid as in C. elegans or D. melanogaster, the L. stagnalis embryo does not divide for ~3 hours at the 24-cell stage, suggesting that this may represent a shift from maternal to zygotic control. We thus separately sampled 1 to 2-cell and ~32-cell stage L. stagnalis embryos from a laboratory stock maintained in Nottingham, representing the maternal component and the early stages of zygotic transcription, respectively. Zygotes were manually dissected out of their egg capsules and stored in RNAlater (Ambion). As one embryo was expected to yield ~0.5 ng RNA, more than one thousand individual embryos of each type were pooled. Total RNA was then extracted using the Qiagen RNeasy Plus Micro Kit. cDNA was synthesised and two non-normalised cDNA libraries were constructed using the MINT system (Evrogen). The libraries were then processed for sequencing on the Roche 454 FLX platform in the Edinburgh Genomics facility, University of Edinburgh. The raw data have been submitted to the European Nucleotide Archive under bioproject PRJEB7773.

Transcriptome assembly

The raw Roche 454 data were screened for MINT and sequencing adapters and trimmed of low quality base calls. The reads from each library were assembled separately using gsAssembler (version 2.6; also known as Newbler; 454 Life Sciences) and MIRA (Chevreux et al.), and the two assemblies were then merged using CAP3 (Huang and Madan, 1999), following the proposed best practice for transcriptome assembly from 454 data (Kumar and Blaxter, 2010). gsAssembler assemblies were run with the -cdna and -urt options. MIRA assemblies used job options ‘denovo, est, accurate, 454’ with clipping by quality off (-CL:qc=no). CD-HIT was then used to remove redundant sequences from the merged CAP3 assemblies (Li and Godzik, 2006), running cd-hit-est with sequence identity threshold 0.98 (-c 0.98) and clustering to most similar cluster (-g 1). The assembly has been made available on afterParty (http://afterparty.bio.ed.ac.uk).
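The redundancy-removal step can be expressed as a single cd-hit-est call. The Python sketch below only builds and prints that command line from the two options stated above (-c 0.98 and -g 1); the input and output file names are hypothetical placeholders, and any other settings the authors may have used are not known from the text.

```python
import shlex

def cd_hit_est_command(merged_fasta, output_fasta, identity=0.98, most_similar=True):
    """Build a cd-hit-est command line for removing redundant transcripts.

    -i / -o : input and output FASTA files (placeholder names in this sketch)
    -c      : sequence identity threshold
    -g 1    : assign each sequence to its most similar cluster
    """
    return [
        "cd-hit-est",
        "-i", merged_fasta,
        "-o", output_fasta,
        "-c", str(identity),
        "-g", "1" if most_similar else "0",
    ]

# Hypothetical file names; the real paths are not given in the text.
cmd = cd_hit_est_command("cap3_merged.fasta", "cap3_merged.nr98.fasta")
print(shlex.join(cmd))
```

The list returned by the function can be passed to subprocess.run(cmd, check=True) on a machine where CD-HIT is installed.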

Maternal transcriptomes from other species

We identified a number of published, high-throughput, maternal transcriptome studies from Ciona intestinalis (Urochordata, Deuterostomia), Danio rerio, Mus musculus, Homo sapiens (Chordata, Deuterostomia), C. elegans (Nematoda, Ecdysozoa) and D. melanogaster (Arthropoda, Ecdysozoa). A “maternal transcript” is an mRNA that is present in the embryo before the initiation of major zygotic transcription. This does not mean that these mRNAs are not also transcribed later from the zygotic genome in the developing embryo. We carried out a reciprocal tBLASTx comparison of the L. stagnalis 1 to 2-cell transcriptome against each of the other datasets, using a threshold expect value of 1e−10. By identifying L. stagnalis transcripts that had homologues in all of the species, we identified a putative set of conserved bilaterian maternal transcripts.
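The logic of this step can be sketched in Python as a filter on expect value followed by a set intersection across species. This is a minimal sketch under stated assumptions, not the authors' script: "reciprocal" is taken to mean that a snail transcript hits a sequence in the other dataset and that sequence hits the same snail transcript back, both at or below the threshold, and all identifiers and E-values in the example are invented.

```python
E_VALUE_THRESHOLD = 1e-10  # threshold expect value stated in the text

def significant_pairs(blast_rows, threshold=E_VALUE_THRESHOLD):
    """Return the (query, subject) pairs whose E-value passes the threshold.

    blast_rows is an iterable of (query_id, subject_id, evalue) tuples,
    e.g. parsed from tabular BLAST output.
    """
    return {(q, s) for q, s, e in blast_rows if e <= threshold}

def reciprocal_queries(forward_rows, reverse_rows, threshold=E_VALUE_THRESHOLD):
    """Snail transcripts with a significant hit in both search directions."""
    forward = significant_pairs(forward_rows, threshold)
    reverse = significant_pairs(reverse_rows, threshold)
    return {q for q, s in forward if (s, q) in reverse}

def conserved_transcripts(per_species):
    """Intersect the reciprocal-hit sets across all comparison species."""
    sets = [reciprocal_queries(fwd, rev) for fwd, rev in per_species.values()]
    return set.intersection(*sets) if sets else set()

# Invented toy data: (snail vs species, species vs snail) for two species.
toy = {
    "fly": (
        [("Ls_001", "Dm_A", 1e-50), ("Ls_002", "Dm_B", 1e-12), ("Ls_003", "Dm_C", 1e-5)],
        [("Dm_A", "Ls_001", 1e-48), ("Dm_B", "Ls_002", 1e-11), ("Dm_C", "Ls_003", 1e-20)],
    ),
    "mouse": (
        [("Ls_001", "Mm_X", 1e-30), ("Ls_002", "Mm_Y", 1e-20)],
        [("Mm_X", "Ls_001", 1e-25), ("Mm_Y", "Ls_002", 1e-3)],
    ),
}
conserved = conserved_transcripts(toy)
print(sorted(conserved))
```

In the toy data only one transcript passes in both directions for both species, which illustrates why the conserved set can only stay the same or shrink as further taxa are added to the comparison.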

Functional annotation of transcriptome

The 1 to 2-cell and ~32-cell transcriptome assemblies were annotated with gene ontology (GO) terms using Blast2GO v2.7.0 against the NCBI non-redundant (nr) protein database, with an E-value cutoff of 1e−05. GO term distribution was quantified using the Combined Graph function of Blast2GO, with enrichment assessed using the Fisher’s Exact Test function (Conesa et al.).
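To make the enrichment test concrete, the Python sketch below computes a one-sided Fisher's exact P value for over-representation of a single GO term from a 2x2 table, using the hypergeometric distribution. It illustrates the general test only: the options Blast2GO applies (sidedness, multiple-testing correction) are not restated in the text, and the counts in the example are invented.

```python
from math import comb

def fisher_exact_enrichment(test_with, test_without, ref_with, ref_without):
    """One-sided Fisher's exact P value for over-representation of a GO term.

    test_with / test_without : sequences in the test set with / without the term
    ref_with / ref_without   : sequences in the reference set with / without the term
    Returns P(X >= test_with) under the hypergeometric null.
    """
    annotated = test_with + ref_with          # all sequences carrying the term
    test_size = test_with + test_without      # size of the test set
    total = test_size + ref_with + ref_without
    max_overlap = min(annotated, test_size)
    tail = sum(
        comb(annotated, k) * comb(total - annotated, test_size - k)
        for k in range(test_with, max_overlap + 1)
    )
    return tail / comb(total, test_size)

# Invented counts: 3 of 4 test sequences carry the term vs 1 of 4 reference sequences.
p_value = fisher_exact_enrichment(3, 1, 1, 3)
print(round(p_value, 4))
```

In practice one such test is run per GO term, so the resulting P values need correction for multiple testing before terms are reported as enriched.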

In situ validation of representative transcripts

We validated the maternal expression of a selection of sequences in L. stagnalis 1-cell embryos by using whole mount in situ hybridisation (WMISH). Primers were designed to amplify fragments of selected genes, which were then cloned into pGEM-T and verified by standard Sanger sequencing. Complementary riboprobes were prepared from these templates as described in Jackson et al. The WMISH protocol we employed here for L. stagnalis is similar to previously described protocols for molluscan embryos and larvae (Jackson et al., Jackson et al.) with some important modifications (described elsewhere; in review). The colour reactions for all hybridisations (including the negative β-tubulin control) were allowed to proceed for the same length of time, and all samples were cleared in 60% glycerol and imaged under a Zeiss Axio Imager Z1 microscope. The primers used are shown in Table 1.
References (44 in total; first 10 shown)

1.  The evolution of early animal embryos: conservation or divergence?

Authors:  Alex T Kalinka; Pavel Tomancak
Journal:  Trends Ecol Evol       Date:  2012-04-18       Impact factor: 17.712

2.  Gene expression divergence recapitulates the developmental hourglass model.

Authors:  Alex T Kalinka; Karolina M Varga; Dave T Gerrard; Stephan Preibisch; David L Corcoran; Julia Jarrells; Uwe Ohler; Casey M Bergman; Pavel Tomancak
Journal:  Nature       Date:  2010-12-09       Impact factor: 49.962

Review 3.  The convoluted evolution of snail chirality.

Authors:  M Schilthuizen; A Davison
Journal:  Naturwissenschaften       Date:  2005-10-11

4.  Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.

Authors:  Weizhong Li; Adam Godzik
Journal:  Bioinformatics       Date:  2006-05-26       Impact factor: 6.937

Review 5.  Embryonic transcription and the control of developmental pathways.

Authors:  E Wieschaus
Journal:  Genetics       Date:  1996-01       Impact factor: 4.562

6.  A transcriptomic hourglass in plant embryogenesis.

Authors:  Marcel Quint; Hajk-Georg Drost; Alexander Gabel; Kristian Karsten Ullrich; Markus Bönn; Ivo Grosse
Journal:  Nature       Date:  2012-09-05       Impact factor: 49.962

7.  A bacterial artificial chromosome library for Biomphalaria glabrata, intermediate snail host of Schistosoma mansoni.

Authors:  Coen M Adema; Mei-Zhong Luo; Ben Hanelt; Lynn A Hertel; Jennifer J Marshall; Si-Ming Zhang; Randall J DeJong; Hye-Ran Kim; David Kudrna; Rod A Wing; Cari Soderlund; Matty Knight; Fred A Lewis; Roberta Lima Caldeira; Liana K Jannotti-Passos; Omar dos Santos Carvalho; Eric S Loker
Journal:  Mem Inst Oswaldo Cruz       Date:  2006-09       Impact factor: 2.743

8.  Composition and regulation of maternal and zygotic transcriptomes reflects species-specific reproductive mode.

Authors:  Shai S Shen-Orr; Yitzhak Pilpel; Craig P Hunter
Journal:  Genome Biol       Date:  2010-06-01       Impact factor: 13.583

9.  Human housekeeping genes, revisited.

Authors:  Eli Eisenberg; Erez Y Levanon
Journal:  Trends Genet       Date:  2013-06-27       Impact factor: 11.639

10.  Nanog, Pou5f1 and SoxB1 activate zygotic gene expression during the maternal-to-zygotic transition.

Authors:  Miler T Lee; Ashley R Bonneau; Carter M Takacs; Ariel A Bazzini; Kate R DiVito; Elizabeth S Fleming; Antonio J Giraldez
Journal:  Nature       Date:  2013-09-22       Impact factor: 49.962

