The chorioallantoic placenta connects mother and fetus in eutherian pregnancies. In order to understand the evolution of the placenta and provide further understanding of placenta biology, we sequenced the transcriptome of a term placenta of an African elephant (Loxodonta africana) and compared these data with RNA sequence and microarray data from other eutherian placentas including human, mouse, and cow. We characterized the composition of 55,910 expressed sequence tag (i.e., cDNA) contigs using our custom annotation pipeline. A Markov algorithm was used to cluster orthologs of human, mouse, cow, and elephant placenta transcripts. We found 2,963 genes are commonly expressed in the placentas of these eutherian mammals. Gene ontology categories previously suggested to be important for placenta function (e.g., estrogen receptor signaling pathway, cell motion and migration, and adherens junctions) were significantly enriched in these eutherian placenta-expressed genes. Genes duplicated in different lineages and also specifically expressed in the placenta contribute to the great diversity observed in mammalian placenta anatomy. We identified 1,365 human lineage-specific, 1,235 mouse lineage-specific, 436 cow lineage-specific, and 904 elephant-specific placenta-expressed (PE) genes. The most enriched clusters of human-specific PE genes are signal/glycoprotein and immunoglobulin, and humans possess a deeply invasive human hemochorial placenta that comes into direct contact with maternal immune cells. Inference of phylogenetically conserved and derived transcripts demonstrates the power of comparative transcriptomics to trace placenta evolution and variation across mammals and identified candidate genes that may be important in the normal function of the human placenta, and their dysfunction may be related to human pregnancy complications.
The mammalian placenta is the functional connection between mother and fetus, acting as a conduit for physiological exchange during pregnancy (Mossman 1987). The presence of a chorioallantoic placenta distinguishes eutherian mammals from other organisms (Archibald and Rose 2005). Within eutherian mammals (i.e., extant placental mammals and closely related extinct forms), placenta morphology and physiology vary among species. There are observable differences in the type of placental interface (e.g., epitheliochorial, endotheliochorial, and hemochorial), placental shape (e.g., diffuse, cotyledonary, zonary, discoidal, and bidiscoidal), and degree of maternofetal interdigitation (e.g., folded, lamellar, villous, trabecular, and labyrinthine) (Benirschke and Kaufmann 2000). Mammalian species with the same placenta characteristics do not form monophyletic groups (Wildman et al. 2006). Phylogenetic analyses suggest that the last common ancestor of eutherian mammals had a deeply invasive discoid hemochorial placenta with a labyrinthine interdigitation (Wildman et al. 2006; Elliot and Crespi 2009) instead of a less invasive placenta type as originally postulated (Haeckel 1883). Despite these differences in placenta morphology and physiology, all eutherian placentas share conserved characteristics (Mossman 1987; Cross et al. 2003). Comparative studies of placenta transcriptomes have the potential to identify those genes, processes, and pathways that underlie basic placenta morphology and function and are conserved across species and placenta types as well as those specific to individual taxa or placenta types.
Humans have a hemochorial discoidal placenta with villous maternofetal interdigitation (Benirschke and Kaufmann 2000). The hemochorial interface and discoidal shape found in humans are not derived conditions but rather represent the ancestral primate and ancestral eutherian state (Wildman et al. 2006). 
Villous maternal–fetal interdigitation evolved independently at least three times during the descent of eutherian mammals, including in the primates (Wildman et al. 2006). Lineage-specific changes observed across placenta transcriptomes likely underlie lineage-specific differences in placenta anatomy, physiology, and gestation length (Rawn and Cross 2008). Importantly, considering transcripts found only in the human placenta may improve our understanding of human-specific pregnancy-related complications responsible for high perinatal morbidity and mortality (Rawn and Cross 2008). Disruption and dysregulation of placenta biology have been shown to result in obstetrical syndromes (Romero 2009; Brosens et al. 2010) (e.g., preeclampsia, Madazli et al. 2000; intrauterine growth restriction, Olofsson et al. 1993; and preterm birth, Kim et al. 2003) that pose health risks to both mothers and fetuses (Moffett and Loke 2006).
In the present study, we sequenced the transcriptome of the placenta of an African elephant (Loxodonta africana) and compared these data with expressed sequence and microarray data from other eutherian species including human, mouse, and cow. The elephant placenta was sequenced to broaden the currently narrow phylogenetic sampling of available transcriptome data for this tissue. Moreover, the inclusion of these taxa allows us to sample three major superordinal eutherian mammalian clades (Euarchontoglires [human and mouse], Laurasiatheria [cow], and Atlantogenata [elephant]) as well as different placenta types. The rationale for this taxon sampling approach was twofold. First, in order to unravel the series of evolutionary events that resulted in the emergence of the human placenta, it is necessary to know which genes were expressed in the placenta of the last common ancestor of placental mammals. 
Elephants belong to the eutherian superordinal clade, Atlantogenata, and this clade is the most distantly related eutherian clade to the one in which humans, mice, and cows are classified. The anatomy of mammalian placentas has been previously well characterized (Mossman 1987; Benirschke and Kaufmann 2000; Mess and Carter 2007). Humans and mice have a hemochorial discoid placenta with villous and labyrinthine interdigitation, respectively. Cows have an epitheliochorial cotyledonary placenta with villous interdigitation. Elephants have an endotheliochorial zonary placenta with labyrinthine interdigitation. Comparison of transcriptome data in these species allowed us to identify a core set of genes that are expressed in all eutherian species examined. Gene ontology categories related to estrogen receptor signaling, cell motion and migration, and adherens junctions are overrepresented among these genes. Expression of these genes is conserved across Eutheria and likely represents a core set of transcripts important for placenta function, especially at term. We also identified lineage-specific placenta transcripts for the species included in our study, including genes with human lineage–specific expression in the placenta. Genes related to signal/glycoproteins and immunoglobulins are overrepresented among these human-specific transcripts, and the function of these genes may relate to the deeply invasive human hemochorial placenta. Taken together, we suggest that the findings presented point a way toward understanding the emergence of the eutherian placenta; moreover, we discuss how the present findings have applications that may improve our understanding of human-specific pregnancy-related complications responsible for high perinatal morbidity and mortality.
Materials and Methods
RNA Isolation, cDNA Library Preparation, and Pyrosequencing
Loxodonta africana placenta tissue was preserved in RNAlater (Ambion, Foster City, CA). Placenta villous trees were separated from the extraembryonic membranes, and total RNA was isolated from the two placenta compartments using TRIzol Reagent (Invitrogen, Carlsbad, CA). The Qiagen RNeasy Kit was used in conjunction with the Qiagen RNase-Free DNase Set for clean up (Qiagen, Valencia, CA) according to the manufacturer's recommendations. The concentration and purity of the total RNA isolated from the placenta villous trees were tested using the NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific, Wilmington, DE). Total RNA (15 μg) was submitted to Agencourt Bioscience Corporation (Beverly, MA) per the company's instructions.
A duplex-specific cDNA nondirectional normalization procedure (Evrogen Lab, Ltd. Part # CS010) was used in order to equalize transcript abundance. A cDNA library was then constructed using the normalized cDNA following the steps outlined in 454's GS FLX Titanium General Library Preparation Method Manual. Library construction included DNA fragmentation, size selection, quality assessment, fragment end polishing, adapter ligation, small fragment removal, library immobilization, fill in reaction, single-stranded DNA isolation, quality assessment, and quantification. Emulsion polymerase chain reaction (PCR) was performed following the procedures outlined in 454's GS FLX Titanium emPCR Method Manual to clonally amplify DNA fragments. Sequences were generated by pyrosequencing (i.e., sequencing by synthesis) on a Roche/454 FLX sequencer, and the short-read file has been deposited in the Short Read Archive (SRR027944.3). The 454 reads were assembled using Newbler Assembler Software (454 Life Sciences).
Sequencing, Assembly, and Annotation of Elephant Placenta Transcripts
We sequenced a normalized cDNA library derived from the elephant villous placenta tissue, using the Roche 454 GS FLX System. We obtained 524,115 placenta cDNA sequence reads (SRA accession number SRR027944.3) with an average length of 205 bp (107,392,105 bp in total). Repeats were filtered from the data prior to assembly (Smit et al. 1996–2010). As a result, 72% of the reads (377,362) were included in the assembly. High-quality reads were assembled into 55,409 contigs derived from the assembled reads with a mean length of 284 bp (fig. 1). The average fold sequence coverage per contig was 7× (range = 2–869). In addition, 501 high-quality singleton sequences with an average length of 132 bp were also used in subsequent analyses.
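The read and contig figures quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch only recomputes summary values from the numbers given in the text; the variable names are illustrative and are not part of the original pipeline.

```python
# Figures quoted in the text above.
total_reads = 524_115        # placenta cDNA sequence reads
total_bases = 107_392_105    # total sequenced bases
assembled_reads = 377_362    # reads retained after repeat filtering
contigs = 55_409             # assembled contigs
singletons = 501             # high-quality singleton reads

mean_read_length = total_bases / total_reads        # average read length in bp
assembled_fraction = assembled_reads / total_reads  # share of reads used in the assembly
total_entries = contigs + singletons                # sequences carried into annotation

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"reads assembled:  {assembled_fraction:.1%}")
print(f"entries analysed: {total_entries:,}")
```

The recomputed values agree, after rounding, with the reported average read length of 205 bp, the 72% of reads included in the assembly, and the 55,910 sequences used in downstream analyses.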
Fig. 1. Contig length distribution and composition. Summary of cDNA contig length distribution and composition in this study. (A) Distribution of assembled contig length; (B) pie chart depicts the results of BLAT searches using the 55,910 transcribed elephant contigs as the subject for queries to mammalian genomic, cDNA, ncRNA, and microbial databases. Known transcripts (red) are those elephant contigs that have sequence similarity to sequences in human and/or mouse and/or cow Ensembl cDNA gene databases. Purple indicates elephant transcripts that share a similarity to noncoding RNA sequences (see Materials and Methods for database details). Blue indicates those elephant transcripts that have significant similarity to only the Ensembl elephant genomic database. Green indicates those transcripts that did not return hits for any of our BLAT queries. A small percentage of transcripts shared sequence similarity only with bacterial sequence (light blue).
The unique elephant contigs and singleton reads (altogether 55,910 entries, 15,877,533 bases) were submitted to homology searches (Kent 2002) of publicly available databases of DNA, messenger RNA (mRNA), and noncoding RNA (ncRNA) in human, mouse, cow, elephant, opossum, platypus, and chicken. We also conducted BLAT searches on microbial databases as a filter to remove possible contamination as well as endogenous microbial sequences found in the elephant placenta.
We first characterized the composition of these 55,910 contigs (fig. 1) using our custom annotation pipeline (supplementary fig. S1, Supplementary Material online). Contigs were classified as mammalian (95.03%), unidentifiable (4.96%), or microbial (0.01%). Of the identifiable mammalian contigs, 15,017 mapped to known protein-coding genes and 7,219 mapped to ncRNA transcripts. Eight hundred of these ncRNA contigs were orthologous to sequences in the ncRNA and genomic databases but not to the predicted transcript databases. 
Strikingly, we found 30,894 contigs that showed significant similarity to mammalian genomic sequences but not to any previously described mammalian protein-coding or ncRNA transcripts. Of these novel transcripts, 19,347 could be considered unique to the afrotherian placenta because they represent sequences present only in the elephant draft genome assembly but not in other genomes searched. We ranked these novel elephant transcripts by predicted open reading frame (ORF) length and transcript abundance (i.e., read number) and chose transcripts that had a long ORF region and at least 50 sequence reads for validation.
To validate a subset of novel transcripts in the elephant placenta, we sequenced real-time (RT) PCR products from ten elephant transcripts based on these criteria. Total RNA from the same isolation that was submitted to Agencourt was reverse transcribed using the SMART RACE cDNA Amplification Kit (Clontech Laboratories, Inc., Mountain View, CA) per the manufacturer's recommendations. Amplification and validation of the sequencing fragment assembly produced by the Genome Sequencer FLX System were performed using the above kit with primers designed using Primer3 software (Koressaar and Remm 2007) and synthesized by Integrated DNA Technologies (San Diego, CA). The criteria used to select sequence fragments for validation were as follows: sequence fragments were more than 500 bp in length, included a detectable ORF (Olson 2002), and were composed of at least 50 reads. Thermal cycling conditions for PCR were based on the manufacturer's recommendation, and PCR was performed on an Eppendorf Mastercycler ep gradient S (Eppendorf, Westbury, NY). PCR fragments were eluted from a 1% agarose gel using the QIAquick Gel Extraction Kit (Qiagen) and then cloned into pGEM-T Easy vector (Promega, San Luis Obispo, CA). The plasmid DNA was isolated from Escherichia coli DH5α competent cells (Invitrogen) using the QIAprep Spin Miniprep Kit (Qiagen) without modification. 
The DNA was then sent to the Research Technology Support Facility at Michigan State University (East Lansing, MI) for sequencing using T7 and M13F primers provided by the facility. Sequences of the validated contigs have been deposited in GenBank (accession numbers: GU166281–GU166288). Alignment and primer information for these sequences are listed in supplementary data set S1 (Supplementary Material online).
We verified seven of these novel transcripts (GenBank accession numbers: GU166282–GU166288) but failed to obtain sequence for three of the contigs. Each of the ten transcripts could be mapped to the elephant genome assembly (supplementary data set S1, Supplementary Material online). Eight of these ten transcribed contigs fall in genomic coordinates outside of known protein-coding genes, whereas the remaining two contigs were transcribed in introns of known protein-coding genes (fig. 2). It is important to note, however, that given the comparative data currently available, it is not possible to determine at which point during atlantogenatan (i.e., the clade containing afrotherian elephants as well as xenarthrans) evolution the transcription of these sequences emerged. Finally, 3,571 elephant contigs had a significant similarity to at least one of the following genomic/transcript databases: human, mouse, cow, opossum, platypus, and chicken, but did not map to the low coverage (2×) African elephant draft genome assembly (http://www.broadinstitute.org). These sequences are presumably located in gaps in the elephant assembly.
Fig. 2. Elephant novel transcripts and new exons. Novel elephant transcripts were discovered using our analysis pipeline (supplementary fig. S1, Supplementary Material online). The University of California–Santa Cruz (UCSC) BLAT was used to align contigs to elephant genome data. Direction of transcription was determined by UCSC BLAT results. Novel elephant transcripts were classified into two groups: (A) Novel elephant transcripts that are not adjacent to any known transcribed regions. (B) Novel elephant transcripts that are in Ensembl transcript intron regions.
Database Sources
We compared our newly sequenced elephant transcript data with publicly available genome and transcriptome data. Human, mouse, cow, opossum, platypus, and chicken genomic and transcript data were downloaded from Ensembl (V.52) (Hubbard et al. 2009). Figure 3 shows the evolutionary relationships among these species. These genomes are considered high quality (coverage ≥6×). In addition, in order to maximize the use of all currently available elephant genomic data, elephant transcripts were downloaded from Ensembl, and elephant genomic data were downloaded in February 2009 from http://www.broadinstitute.org/science/projects/mammals-models/elephant/elephant.
Fig. 3. Phylogeny. Tree topology and divergence dates used for this study were derived from Murphy et al. (2007) and Wildman et al. (2007). The human, mouse, cow, opossum, platypus, and chicken sequence data are considered high-quality genomic and/or transcriptome data. Atla, Atlantogenata; Euarcho, Euarchontoglires; Laur, Laurasiatheria; Afro, Afrotheria.
To supplement the transcript data provided by Ensembl, we also included the human and mouse contig data from GenBank and full mRNA data sets from the mammalian gene collection (MGC, http://mgc.nci.nih.gov/; Temple et al. 2009). The MGC comprised 43,797 full-length nonredundant human, mouse, and cow cDNAs. Human and mouse placenta transcriptome data were obtained from Su et al. (2004). We downloaded the probe sequences for the human and mouse microarray chips and reannotated them using BlastN against Ensembl human and mouse transcripts. Genes with signal intensity greater than 100 in the placenta were considered expressed in the placenta. One-to-one ortholog relationships between human and mouse transcripts were annotated using Ensembl BioMart (Hubbard et al. 2009). Cow placenta–expressed sequence tag (EST) data (165,262 sequences) were downloaded from GenBank, using “cow” and “placenta” as query terms. We aligned these cow EST sequences to Ensembl-predicted cow transcripts using Blast. These cow EST sequences mapped to 8,936 cow Ensembl genes.
ncRNA sequences were downloaded from http://biobases.ibch.poznan.pl/ncRNA/. This ncRNA database includes ncRNA deposited in GenBank, FANTOM3 mouse and human ncRNAs, and H-invitational Integrated Database of Annotated Human Genes version 3.4 (Szymanski et al. 2007). In order to supplement these ncRNA databases, we also used the Ensembl ncRNA (V.52) data sets for human, mouse, cow, and elephant. All microbial genome data (841 different bacterial genomes) deposited in GenBank (until February 2009) were downloaded onto our local server for similarity searching.
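The microarray expression call described above (signal intensity greater than 100 in placenta) amounts to a simple threshold filter. A minimal Python sketch follows, assuming the intensities have already been summarized per gene; the function name, the input layout, and the gene identifiers are hypothetical and are not taken from the study.

```python
def placenta_expressed(signal_by_gene, threshold=100.0):
    """Return the gene IDs whose placenta signal is strictly above the threshold."""
    return {gene for gene, signal in signal_by_gene.items() if signal > threshold}


# Hypothetical per-gene placenta signal intensities (illustrative values only).
example_signals = {
    "GENE_A": 1520.4,
    "GENE_B": 100.0,   # exactly at the cutoff, so not called expressed
    "GENE_C": 37.9,
    "GENE_D": 100.1,
}

expressed = placenta_expressed(example_signals)
print(sorted(expressed))
```

The comparison is strict because the text specifies "greater than 100"; how multiple probes for one gene were combined is not stated in the text and is left to the caller here.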
Comparative Genomics Analyses
Assembled 454 contigs and singleton reads ≥100 bp were aligned to the human, mouse, cow, and elephant genomic and transcript data using BLAT (Kent 2002) (parameters: -q=dnax -t=dnax -tileSize=11). In our analyses, we classified the 454 contigs into one of four categories based on the BLAT results: known transcripts, novel transcripts, ncRNA, and unidentified transcripts (supplementary fig. S1, Supplementary Material online). Known transcripts were defined as contigs that matched currently available mammalian transcript data (see Database Sources). The ncRNA transcripts were defined as contigs that had significant hits to ncRNA sequences using the BLAT tool. Groups of orthologous transcripts were clustered using the Markov Cluster algorithm implemented by OrthoMCL (Li et al. 2003).
All sequences, BLAT results, and annotation tables were stored in a MySQL database (http://www.mysql.com). Custom Perl scripts were used to conduct genome-wide sequence searches and to retrieve BLAT and OrthoMCL results. Statistical analyses of the transcript annotation were conducted using R (www.r-project.org/) and BioConductor (Gentleman et al. 2004).
Using CAFÉ v2.2 (De Bie et al. 2006), we analyzed the elephant, cow, human, and mouse placenta–expressed transcripts for evidence of significant gene family expansion or contraction, in the elephant lineage in particular. A total of 18,394 gene family IDs that overlap with the placenta-expressed (PE) transcripts were extracted from Ensembl. We then used a custom script to generate a data file with the number of genes present in each lineage for each of the gene families. Gene families with <3 genes (13,076 families) or >100 genes (Zinc finger: ENSFM00250000000002) were removed from this data set in order to provide more accurate estimations of birth and death rates. 
Rates of gene family evolution were estimated across the following phylogeny: (Elephant:100,[{Human:91,Mouse:91}:1,Cow:92]:8) where branch lengths represent species divergence date estimates in millions of years (Goodman et al. 2009).
Conserved and lineage-specific PE gene lists were inferred using the orthologous group clustering data (supplementary data set S2, Supplementary Material online). In order to obtain more accurate gene annotations, we applied the following filters: 1) we removed putative housekeeping genes (Eisenberg and Levanon 2003), together with their homologs in each of the four species examined, from all the gene lists; 2) we removed predicted genes that do not have known gene symbols; and 3) we removed predicted genes whose Ensembl gene names differ from their HGNC/MGI gene symbols. Gene annotations were extracted from the Ensembl BioMart database. A functional annotation clustering tool (Dennis et al. 2003; Huang da et al. 2009) was used to determine overrepresented functional categories for each gene list of interest. Default settings and parameters were used. These data are available in supplementary data set S3 (Supplementary Material online).
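The gene family size filter and the phylogeny used for the birth and death rate estimates can be illustrated with a short Python sketch. It is not the authors' custom script: the function names and the toy family table are hypothetical, and family size is assumed here to mean the total gene count summed across the four species, which the text does not state explicitly. The Newick string is the conventional parenthesized form of the tree given above.

```python
# Conventional Newick form of the phylogeny quoted above (branch lengths in millions of years).
NEWICK = "(Elephant:100,((Human:91,Mouse:91):1,Cow:92):8);"

# Root-to-tip branch lengths read off that tree by hand.
ROOT_TO_TIP = {
    "Elephant": [100],
    "Human": [91, 1, 8],
    "Mouse": [91, 1, 8],
    "Cow": [92, 8],
}


def is_ultrametric(paths):
    """True when every species is the same total distance from the root."""
    return len({sum(lengths) for lengths in paths.values()}) == 1


def filter_families(counts_by_family, min_genes=3, max_genes=100):
    """Keep families whose summed gene count lies within [min_genes, max_genes].

    Assumption: size is the total across species, not a per-species count.
    """
    kept = {}
    for family_id, counts in counts_by_family.items():
        total = sum(counts.values())
        if min_genes <= total <= max_genes:
            kept[family_id] = counts
    return kept


# Toy family table with hypothetical IDs and counts.
toy_families = {
    "FAM_1": {"elephant": 1, "cow": 0, "human": 1, "mouse": 0},      # 2 genes: dropped
    "FAM_2": {"elephant": 4, "cow": 1, "human": 2, "mouse": 3},      # 10 genes: kept
    "FAM_3": {"elephant": 60, "cow": 20, "human": 30, "mouse": 25},  # 135 genes: dropped
}
kept = filter_families(toy_families)

# Families remaining after the two exclusions reported in the text.
remaining = 18_394 - 13_076 - 1

print(is_ultrametric(ROOT_TO_TIP), sorted(kept), remaining)
```

Every tip sits 100 million years from the root, so the tree is ultrametric, which birth and death models of this kind generally expect. Subtracting the 13,076 small families and the single zinc finger family from the 18,394 starting families leaves 5,317, the number of gene families examined in the Results.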
Intersection of Gene Lists with KOMP Data
Gene lists were intersected with Knockout Mouse Project (KOMP, Austin et al. 2004) data, specifically those KOMP genes present on the mouse microarray (Su et al. 2004) and important for placenta phenotypes (Mouse Genome Informatics; Mammalian Phenotype ID: MP:0001711 Abnormal Placenta Morphology, Blake et al. 2010). Ninety KOMP genes were expressed in all three eutherian species (human, mouse, and elephant), and 62 KOMP genes were expressed in Euarchontoglires (human and mouse). In order to determine which functional categories are overrepresented in each of these two lists (Eutheria KOMP placenta genes and Euarchontoglires KOMP placenta genes), we used functional annotation clustering as described above (see Comparative Genomics Analyses). These data are available in supplementary data set S4 (Supplementary Material online).
Intersection of Gene Lists with Placenta-Predominant Expression Data
We used the tissue specificity index τ (Yanai et al. 2005) to measure the tissue-specific expression patterns for each gene. The following formula was used for τ value calculation:
$$\tau = \frac{\sum_{n=1}^{i}\left(1 - \dfrac{x_n}{x_{\max}}\right)}{i - 1}$$
In this formula, $i$ is the number of tissues studied, $x_n$ is the expression value of the gene in tissue $n$, and $x_{\max}$ (max gene expression) is the highest expression value for a gene across all studied tissues and cell lines. We used 79 human tissues/cell lines and 50 mouse tissues/cell lines as previously published (Su et al. 2004). If a gene had greatest expression in placenta and a τ value >0.8, we considered that gene to be predominantly expressed in placenta. Placenta-predominant gene data are available in supplementary data sets S5 and S6 (Supplementary Material online).
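A minimal Python sketch of the τ calculation and the placenta-predominance call follows. It assumes the usual Yanai et al. (2005) form of the index, in which each tissue contributes one minus its expression divided by the gene's maximum expression, and the sum is divided by the number of tissues minus one. The function names and the toy expression values are illustrative and are not taken from the study.

```python
def tau(expression):
    """Tissue specificity index: 0 for uniform expression, 1 for single-tissue expression."""
    values = list(expression)
    n_tissues = len(values)
    peak = max(values) if values else 0
    if n_tissues < 2 or peak <= 0:
        raise ValueError("need at least two tissues and a positive maximum expression")
    return sum(1 - value / peak for value in values) / (n_tissues - 1)


def placenta_predominant(expr_by_tissue, cutoff=0.8, tissue="placenta"):
    """Apply the two criteria from the text: highest expression in placenta and tau > cutoff."""
    top_tissue = max(expr_by_tissue, key=expr_by_tissue.get)
    return top_tissue == tissue and tau(expr_by_tissue.values()) > cutoff


# Toy expression profiles (hypothetical values).
specific_gene = {"placenta": 1000.0, "liver": 50.0, "brain": 50.0, "heart": 100.0}
broad_gene = {"placenta": 500.0, "liver": 480.0, "brain": 450.0, "heart": 490.0}

print(round(tau(specific_gene.values()), 3), placenta_predominant(specific_gene))
print(round(tau(broad_gene.values()), 3), placenta_predominant(broad_gene))
```

In the toy data, the first gene peaks in placenta with low expression elsewhere and passes the 0.8 cutoff, whereas the second also peaks in placenta but is expressed at similar levels everywhere and fails it.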
Results
Phylogenetically Conserved PE Genes
In order to determine phylogenetically conserved placenta transcripts, we constructed human–mouse–elephant placenta Ensembl protein-coding transcript ortholog groups using a Markov-clustering algorithm (Li et al. 2003) that is based on the best reciprocal Blast results (see Materials and Methods). We found 2,963 genes to be putatively expressed in the placenta of these eutherian mammals (fig. 4; supplementary data set S2, Supplementary Material online). Of these genes, 1,073 were also present in the more limited currently available cow placenta data. In order to determine which functional categories are overrepresented by genes expressed in the placenta of eutherian mammals, we conducted gene ontology enrichment analyses (Dennis et al. 2003; Huang da et al. 2009) using our data set of 2,963 conserved PE genes. There were 157 functional annotation clusters of genes coexpressed in the eutherian placenta with enrichment scores (ES) greater than 1. A complete list of enrichment categories can be found in supplementary data set S3 (Supplementary Material online); the top 5 functional annotation clusters are summarized in table 1. The most enriched annotation cluster (ES = 26.49) includes genes related to membrane-enclosed lumen (GO:0031974), specifically the nucleus. Functional annotation clusters also enriched with conserved PE genes include components of the cytoskeleton and basic biological processes and functions including RNA processing (e.g., splicing), nucleotide binding, catabolic processes (e.g., proteolysis), and transport and localization. Additional annotation clusters that have been shown to play a role in placenta biology (Antonson et al. 2003; Gasperowicz and Otto 2008; Zhao et al. 2010) and are present in our data set of conserved PE genes include regulation of cell migration (ES = 3.54; GO:0030334; 51 genes), estrogen receptor signaling (ES = 1.54; GO:0030520; NCOA6, ARID1A, RBM9, and RBM14), Notch signaling (ES = 1.18; GO:0007219; MIB1, NOTCH2, APP, ADAM10, PSEN1, APH1A, NOTCH2NL, ADAM17, WDR12, PSENEN, SPEN, and SEL1L), and JAK-STAT cascade involved in growth hormone (GH) receptor signaling (ES = 1.27; GO:0060397; STAT5B, JAK2, and STAT3). Analyses repeated using the 1,073 genes expressed in the human, mouse, cow, and elephant placentas yielded similar results (supplementary data sets S2 and S3, Supplementary Material online), suggesting that the most enriched functional categories discussed above represent the set of genes commonly expressed in eutherian placentas.
Fig. 4. Eutheria placenta shared and lineage-specific expressed genes. Genomic data from seven species (human, mouse, cow, elephant, opossum, platypus, and chicken) were used in our comparative genomics study. Human, mouse, and cow PE genes were obtained from publicly available data (see Materials and Methods). Elephant PE genes were generated in this study. Numbers on each branch represent the branch (i.e., lineage specific) PE genes as determined by the Markov-clustering algorithms and our custom quality control procedure (see Materials and Methods).
Table 1
Enrichment Analyses of Eutherian Placenta–Expressed Transcripts and Human-Specific PE Transcripts
| Lineage | Functional Annotation Cluster Description (a) | Number of Genes | ES |
| --- | --- | --- | --- |
| Eutherian | Membrane-enclosed lumen | 479 | 26.49 |
| Eutherian | Nonmembrane-bound organelle | 588 | 20.79 |
| Eutherian | RNA processing (e.g., splicing) | 187 | 12.8 |
| Eutherian | Nucleotide binding | 526 | 11.91 |
| Eutherian | Catabolic processes (e.g., proteolysis) | 264 | 10.36 |
| Human | Glycoprotein | 432 | 6.77 |
| Human | Immunoglobulin (V-set) | 38 | 5.86 |
| Human | Immunoglobulin-like fold | 85 | 3.36 |
| Human | Killer cell immunoglobulin-like receptor | 8 | 3.32 |
| Human | Adhesion | 67 | 2.69 |

Note: Complete data sets can be found in supplementary data set S3 (Supplementary Material online).
(a) Top 5 clusters are presented in the table.
Lineage-Specific PE Genes
By comparing those transcripts expressed in the placenta of a phylogenetically diverse range of species, we were able to determine not only which genes expressed in the placenta are conserved (shared across the eutherian species investigated) but also which genes are derived and therefore specific to independent lineages. We defined those PE transcripts present in only one of the species we investigated as “lineage-specific,” regardless of whether the genes encoding these transcripts were specific to that species' genome. The newly sequenced elephant transcripts and currently available human, mouse, and cow PE transcript data were compared (see Materials and Methods) in order to estimate the number of lineage-specific PE genes. We found 1,365 human-specific PE genes, 1,235 mouse-specific PE genes, 436 cow-specific PE genes, and 904 elephant-specific PE genes (supplementary data set S2, Supplementary Material online). These findings reveal a great deal of lineage-specific heterogeneity in expression of orthologous protein-coding genes. Inclusion of more closely related species will help refine these lists further, enabling us to determine, for example, whether transcripts considered here as “human lineage–specific” are also expressed in the placentas of other Euarchonta (e.g., primate) species.
In order to determine which functional categories are overrepresented by genes differentially expressed by lineage, we conducted gene ontology enrichment analyses (Dennis et al. 2003; Huang da et al. 2009) using each of our data sets of lineage-specific PE genes (supplementary data set S3, Supplementary Material online). There are 38 annotation clusters with ES greater than 1 among the human-specific PE genes; the top 5 functional annotation clusters are summarized in table 1. Interestingly, the majority of human-specific PE genes are immunity related. 
The most enriched (ES = 6.77) annotation cluster is predominantly composed of genes encoding glycoprotein and/or signal-related proteins and includes chorionic gonadotropin beta (CG-β) polypeptides, growth hormone 2 (GH2), and the interleukin-6 receptor (IL6R).
We repeated the functional annotation analysis on the mouse-specific, elephant-specific, and cow-specific PE genes (supplementary data set S3, Supplementary Material online). The cow placenta is enriched with genes involved in hydrolysis (ES = 3.51) and other metabolic processes. Mouse-specific PE genes are most enriched (ES = 23.05) with membrane-bound or -associated receptors, including olfactory receptors and G-protein–coupled signaling. Interestingly, elephant-specific PE genes are also most enriched (ES = 82.2) in the same functional annotation cluster as the mouse. Both mouse and elephant placentas express a number of species-specific olfactory receptors. Olfactory receptor gene families have experienced extensive lineage-specific expansion and contraction throughout mammalian evolution (Niimura and Nei 2007). That olfactory receptors are expressed in elephant placenta is not entirely surprising because these receptors are known to be expressed in rodent placenta as well (Itakura et al. 2006; Mao et al. 2010). However, olfactory receptors are not significantly overrepresented among genes expressed in human placenta. Further study and increased taxon sampling are required to clarify whether the last common ancestor of placental mammals expressed these receptors and expression was subsequently lost in humans, or whether placenta expression of these receptors emerged independently in the lineages leading to rodents and elephants.
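The definitions of conserved and lineage-specific PE genes used in this section can be illustrated with a minimal sketch. It assumes that each ortholog cluster has already been reduced to the set of species in which a member transcript is placenta expressed; the function name, cluster IDs, and toy input are hypothetical, and the custom quality control procedure described in Materials and Methods is not reproduced here.

```python
from collections import defaultdict


def classify_clusters(expressed, core_species):
    """Split ortholog clusters into conserved and lineage-specific sets.

    expressed:    dict mapping a cluster ID to the set of species in which
                  the cluster is placenta expressed (hypothetical input).
    core_species: species that must all express a cluster for it to count
                  as conserved.
    """
    core = set(core_species)
    conserved = set()
    lineage_specific = defaultdict(set)
    for cluster_id, species in expressed.items():
        species = set(species)
        if core <= species:
            # Expressed in every core species: conserved.
            conserved.add(cluster_id)
        elif len(species) == 1:
            # Expressed in exactly one species: lineage-specific.
            (only,) = species
            lineage_specific[only].add(cluster_id)
    return conserved, dict(lineage_specific)


# Toy input with made-up cluster IDs (not data from this study).
toy = {
    "cluster_A": {"human", "mouse", "cow", "elephant"},
    "cluster_B": {"human"},
    "cluster_C": {"mouse", "elephant"},
    "cluster_D": {"elephant"},
}
conserved, lineage_specific = classify_clusters(
    toy, core_species=("human", "mouse", "elephant")
)
```

Clusters expressed in two or three species but not in all core species (such as `cluster_C` in the toy input) fall into neither category under this simplification.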
Gene Family Evolution on the Elephant Lineage
Of the 5,317 gene families examined, 34 showed significant (P ≤ 0.05) expansion or contraction along the elephant lineage (table 2). Thirteen of these gene families correspond to different subfamilies of human olfactory receptors, which supports the expansion and contraction of olfactory receptor gene families across mammals and their idiopathic expression in nonolfactory tissue (De la Cruz et al. 2009). These receptors are among the most dynamic families of genes: gene gain and loss in this family is rapid (Niimura and Nei 2007) and is characterized by adaptive evolution in protein-coding sequences (Goodman et al. 2009). Although these analyses provide preliminary data about gene family evolution in PE elephant transcripts, more extensive sampling (e.g., the inclusion of a more complete data set of cow PE transcripts; the addition of more primates) will greatly increase the sensitivity and specificity of these analyses.
Table 2
Gene Family Evolution in the Elephant Placenta Transcriptome
Glutathione S-transferase EC_2.5.1.18 GST class alpha | ENSFM00270000056439 | 3 | 0 | 0.04
Note.—MRCA, inferred most recent common ancestor of elephant, cow, human, and mouse.
Intersection of PE Genes with KOMP Data
The chorioallantoic placenta is a derived feature shared by all eutherian mammals, and as such, we hypothesized that genes that are essential to basic placenta morphology should be widely expressed in all eutherian placentas. In order to test this, we examined whether genes that, when knocked out in the mouse, cause abnormal mouse placenta morphology and/or physiology are expressed in the placentas of other eutherian mammals. Genes from the KOMP (Austin et al. 2004) were examined in the MGI database (Bult et al. 2008), and we focused this analysis on a specific mammalian phenotype ID: MP:0001711 (i.e., abnormal placenta morphology). We then intersected our conserved and lineage-specific expressed gene lists with the subset of the KOMP genes that were represented by probe sets on the mouse microarray (n = 245). Although these KOMP genes were initially characterized in mice, we found that they were more widely expressed in the Eutheria (P < 2.2 × 10^−16; Fisher's exact test) and Euarchontoglires (P = 0.00825) placentas included in this study. Of the 245 genes that cause abnormal placental morphology when knocked out in mouse, 90 are expressed in human, mouse, and elephant placentas and 62 genes are expressed only in human and mouse placentas.
We conducted enrichment analyses for the 90 KOMP genes expressed in eutherian placentas and the 62 genes expressed only in the Euarchontoglires placentas using the eutherian PE genes and Euarchontoglires PE genes as the background, respectively (supplementary data set S4, Supplementary Material online). Eutherian placentas were significantly enriched in genes involved in vasculature development (ES = 3.05), regulation of cell development (ES = 2.67), and embryonic morphogenesis (ES = 2.66). The most enriched categories of the Euarchontoglires placenta genes were placenta development (ES = 6.05), vasculature development (ES = 3.52), and identical protein binding (ES = 3.37).
Using the limited cow placenta EST data, we found that 42 of the 245 genes were expressed in the human, mouse, cow, and elephant placentas and 26 genes were expressed in the human, mouse, and cow placentas. Taken together, these findings suggest that KOMP genes that are expressed in the placentas of sampled eutherian species may play important roles in proper placenta development and function in eutherian species besides mouse.
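The overlap test described above can be sketched as a one-sided Fisher's exact test on a 2 × 2 table (KOMP gene or not, versus conserved PE gene or not). This is an illustration under stated assumptions and not the analysis code used in this study: the function name is hypothetical, the background of 20,000 array genes is invented for the example, and the test reported here may have been two-sided, so the resulting P value is not expected to match the reported one.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e., the probability
    of seeing at least `a` row-1 items in column 1 by chance.
    """
    row1 = a + b          # e.g., all KOMP genes on the array
    col1 = a + c          # e.g., all conserved PE genes
    n = a + b + c + d     # all genes considered
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Counts from the text: 90 of 245 KOMP genes fall in the 2,963 conserved genes.
# The array background size is an ASSUMPTION made only for illustration.
background = 20_000
komp_total, conserved_total, overlap = 245, 2_963, 90

a = overlap                                    # KOMP and conserved
b = komp_total - overlap                       # KOMP, not conserved
c = conserved_total - overlap                  # conserved, not KOMP
d = background - komp_total - c                # neither
p_value = fisher_exact_greater(a, b, c, d)
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow for very small tail probabilities; the single division at the end converts the exact ratio to a float.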
Tissue-specific gene expression has been considered a fundamental aspect of multicellular biology (Chikina et al. 2009). Thus, we also sought to identify those genes predominantly (or only) expressed in either the human or mouse placentas (in relation to other tissues) and identified many more mouse-specific and human-specific placenta-predominant genes than have previously been shown (Rawn and Cross 2008). Genes were considered placenta predominant if their gene tissue–specific index was higher than 0.8, and the placenta had the highest expression value (Materials and Methods). We identified 40 human-specific placenta-predominant genes expressed in the human placenta and 57 mouse-specific placenta-predominant genes expressed in the mouse placenta (supplementary data set S5, Supplementary Material online). Our placenta-predominant lineage-specific data found similar enrichment of genes related to placental hormones. Mouse placenta–predominant genes were enriched in prolactin/hormone (ES = 9.7), placentally expressed cathepsin (ES = 7.17), and pregnancy (ES = 6.54) (supplementary data set S6, Supplementary Material online). Human placenta–predominant genes were enriched in placenta/prolactin (ES = 6.65), glycoprotein (ES = 4.91), and gonadotropin (ES = 3.83) (supplementary data set S6, Supplementary Material online).
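The placenta-predominant criterion above (index higher than 0.8, with placenta as the highest-expressing tissue) can be sketched as follows. The exact index is defined in Materials and Methods and is not restated in this section, so the sketch assumes a commonly used tissue-specificity index, often called τ (tau), computed as the sum of 1 − x_i/x_max over tissues divided by N − 1. Function names and expression values are hypothetical.

```python
def tissue_specificity_index(expression):
    """Tau-style index for one gene (an ASSUMED stand-in for the paper's index).

    expression: dict mapping tissue name -> non-negative expression value.
    Returns 0 for uniform expression and 1 for expression in a single tissue.
    """
    values = list(expression.values())
    x_max = max(values)
    if len(values) < 2 or x_max <= 0:
        raise ValueError("need at least two tissues and a positive maximum")
    return sum(1 - v / x_max for v in values) / (len(values) - 1)


def is_placenta_predominant(expression, tissue="placenta", threshold=0.8):
    """True if `tissue` has the highest expression and the index exceeds threshold."""
    top_tissue = max(expression, key=expression.get)
    return top_tissue == tissue and tissue_specificity_index(expression) > threshold


# Made-up expression profiles (arbitrary units), not data from this study.
hormone_like = {"placenta": 100.0, "liver": 5.0, "brain": 2.0, "kidney": 3.0}
housekeeping = {"placenta": 50.0, "liver": 45.0, "brain": 48.0, "kidney": 40.0}

hormone_like_call = is_placenta_predominant(hormone_like)
housekeeping_call = is_placenta_predominant(housekeeping)
```

Requiring both conditions matters: a gene can have a high index because it is restricted to some other tissue, so the index alone would not identify placenta-predominant genes.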
Discussion
The chorioallantoic placenta is an important reproductive trait uniting all placental mammals. It has been proposed that the emergence of the placenta allowed for longer gestation periods resulting in the live birth of larger, better-adapted offspring (Romer 1967). Comparative studies of placenta biology have shown that eutherian placentas share a set of conserved characteristics (e.g., development of a blastocyst, trophoblastic cells, branching morphogenesis of villi necessary for nutrient transfer, promotion of uterine angiogenesis) needed to maintain pregnancy and fetal development as well as derived lineage- or placenta-specific characteristics (Hoffman and Wooding 1993; Cross et al. 2003; Wildman et al. 2006; Enders 2009). Deciphering the molecular underpinnings of such conserved and derived traits not only provides a more complete understanding of placenta biology, function, and evolution but is also of critical importance to studies of pregnancy-related complications and diseases in humans and other mammals (Rawn and Cross 2008). In this study, we used comparative transcriptomics to systematically identify conserved and lineage-specific PE transcripts. Using this approach, we were able to identify and describe 1) a core set of 2,963 genes commonly expressed in human, mouse, and elephant placentas; 2) 1,365 potentially human-specific PE transcripts; and 3) 40 human-specific transcripts that are predominantly (or only) expressed in placenta tissue. In doing so, we also identified a set of mouse-specific, elephant-specific, and cow-specific PE genes. The core set of conserved eutherian PE genes represents those genes (and their related processes and pathways) that can be considered fundamental to the success of pregnancy, regardless of species or placenta type.
Genes expressed specifically in the human placenta (including both those also expressed in other tissues and those predominantly expressed in the placenta) are appropriate and important candidates for the study of human-specific pregnancy complications that contribute to perinatal morbidity and mortality (see below).
Although this study presents the first description and analysis of an afrotherian placenta transcriptome, the addition of more mammalian placenta transcriptomes would greatly increase the power of these analyses and help fine-tune our conserved and derived gene lists. Transcripts found to be lineage-specific in this study may be expressed in the placentas of other eutherian mammals. More extensive sampling is needed to test whether, for example, human lineage–specific transcripts are unique to the human placenta or are also found in the placentas of other primate species. Furthermore, our data set incorporates several sources of gene expression data, including contigs, microarrays, and ESTs. RNA-seq can identify transcribed elements more accurately than microarrays. As a result, more transcriptomes derived from direct sequencing of RNA are needed.
Conserved Expression of Genes across Eutherian Placentas
The 2,963 genes found to be commonly expressed in the placentas of human, mouse, and elephant should be considered important for the structure and function of the eutherian, and potentially mammalian, placenta. Significant differences between the choriovitelline marsupial placenta and the chorioallantoic eutherian placenta include blastocyst formation and extravillous trophoblast invasion of the placenta into the eutherian uterus (Lillegraven 1975; Mess and Carter 2007). In addition, phylogenetic reconstructions suggest that the placenta of the last common ancestor of extant eutherians was invasive (Vogel 2005; Wildman et al. 2006; Elliot and Crespi 2009), with a hemochorial interface, a discoid shape, and a labyrinthine interdigitation (Wildman et al. 2006). Eutherian PE genes are most overrepresented by genes related to membrane-enclosed lumen (table 1). Some genes (e.g., NCoA6 and Pou2F1) in this category have been shown to be necessary for placenta development. Null mice of nuclear receptor coactivator/coregulator NCoA6 exhibit a dramatically reduced spongiotrophoblast layer, as well as collapsed blood vessels in the region bordering the spongiotrophoblast and labyrinthine layers (Antonson et al. 2003). Pou2F1 is a member of the POU protein family. Experimental evidence in mice indicates that Pou2F1 is required for trophoblast stem cell derivation (Sebastiano et al. 2010). Additionally, adherens junctions are an overrepresented annotation among the 2,963 genes with conserved eutherian expression. The role played by genes involved in cell–cell interactions (e.g., adherens junctions) at the site where the developing placenta contacts the uterus has long been appreciated (Schlafke and Enders 1975; Leach 2002). Moreover, aberrant adhesion of the placenta and uterus is a major cause of maternal mortality and morbidity in human pregnancies.
For example, the potentially life-threatening placenta accreta occurs when a portion or all of the placental trophoblast invades into the uterine myometrium and subsequently fails to separate from the uterus after delivery (Manyonda and Varma 1991; Benirschke and Kaufmann 2000). Although the finding of conserved patterns of gene expression across Eutheria is important, we note that orthologs of many of the transcripts expressed among the three eutherian mammal placentas may also be expressed in marsupial placentas; therefore, future studies should expand the current work to include additional outgroups and developmental time points.
In addition, 90 of the 2,963 genes expressed in placentas of human, mouse, and elephant also cause abnormal mouse placenta morphology and/or physiology when knocked out in the mouse. These genes may be essential for proper placenta function in both endotheliochorial and hemochorial placenta types. Those 93 genes not found in our data set of 2,963 conserved genes may represent 1) transcripts expressed specifically in the mouse placenta, 2) transcripts expressed in both mouse and elephant placentas, 3) transcripts not expressed in the term placenta, or 4) transcripts important for placenta function but not expressed in placenta tissue.
Although the current work provides an overview of placenta gene expression across eutherian mammals, much work remains to be done. The current study is novel in that it provides data from the elephant, a representative of a placental mammal lineage that has not been subject to as much biomedical research as mice or humans. The elephant has an endotheliochorial placenta, and future work on the transcriptomes of other species is likely to refine and elucidate our understanding of placenta evolution.
For example, sequences from other species that possess endotheliochorial placentas (e.g., carnivores) will indicate whether convergent patterns of gene expression have accompanied the convergent evolution of endotheliochorial placentas.
Human Lineage–Specific PE Genes
The diversity of placenta anatomy is great within eutherian mammals, with variations in shape, interdigitation, and degree of invasiveness all showing extensive evolutionary heterogeneity (Mossman 1987; Wildman et al. 2007; Elliot and Crespi 2009). Identifying which genes are coexpressed across all placental mammals and which show more lineage-specific or placenta type–specific expression will help to elucidate the molecular underpinnings of this variation. Two mechanisms that may partially explain this variation are gene duplication and lineage-specific gene expression (Soares et al. 2007; Rawn and Cross 2008).
We found evidence that 34 gene families that are expressed in the elephant placenta have undergone significant expansion or contraction during the evolution of this lineage. Further research is required to determine whether these gene families are expressed in elephant tissues other than the placenta and whether their expression pattern at term is typical of their expression at other gestational ages or whether they represent a dynamic pattern of gestational expression as has been seen with many genes expressed in mouse placenta (Knox and Baker 2008). Of the 14,790 genes expressed in the elephant placenta, 36 (ATP5F1, ATP5H, C12orf5, C15orf21, C1orf178, C1orf96, C3orf14, C7orf53, CCT2, CD3G, CHIC1, CTLA4, CYB5A, FAM105A, GNL3, GRSF1, HLA-A, IL2RG, LDHC, MRPL35, MRPL50, NDUFA12, NPC2, PIAS2, PPT1, PTMA, RNF34, RTN4IP1, SCP2, TAF9B, TARS, TSPAN13, TSPAN6, TTC1, UQCRH, and ZPBP2) show evidence of positive selection on the elephant lineage since the common ancestor of the elephant and tenrec (Echinops telfairi) (Goodman et al. 2009).
Three of these genes (C1orf178, CHIC1, and ZPBP2) are solely expressed in the elephant placenta when compared with the other species.
Trophoblast cells of the hemochorial placenta (human and mouse) penetrate deeply into uterine tissue compared with trophoblasts of the epitheliochorial (cow) and endotheliochorial (elephant) placenta. Although hemochorial placentation is a feature typical of both human and mouse, the human placenta exhibits a deeper invasion of intravascular trophoblast than does the mouse placenta (Moffett and Loke 2006). Because the human trophoblast comes into direct contact with genetically distinct maternal immune cells, it is possible that mammals with hemochorial placentas (e.g., humans) have undergone adaptations of the immune system in pregnancy not seen in mammals with epitheliochorial or endotheliochorial placentas. Perhaps reflecting deep trophoblast invasion into the uterus, our list of human lineage–specific PE genes is enriched with immunity-related genes (table 1; supplementary data sets S2 and S3, Supplementary Material online). For example, anthropoid primates (e.g., humans) have multiple duplications of the galectin gene family, and it has been proposed that these galectins (e.g., galectin-13) reduce the danger of maternal immune attacks on the fetal semi-allograft, presumably conferring additional immune tolerance mechanisms, and sustain hemochorial placentation during the relatively long gestation of anthropoid primates (Than et al. 2009). Human-specific PE genes were also overrepresented by signal/glycoproteins (e.g., CG-β) (supplementary data sets S2 and S3, Supplementary Material online). The CG gene family has become especially expanded in anthropoid primates, with many family members expressed in the placenta (Rawn and Cross 2008). CG-β is only expressed in the placenta, and one of its primary roles is to maintain progesterone production to sustain pregnancy (Cameo et al. 2004).
Evolution of Placental Hormones
Placental hormones are produced by fetally derived tissues and bind maternal receptors (Gootwine 2004). Placental hormones have been implicated in maternal–fetal conflict as a means by which the fetus extracts nutrients from the mother (Haig 2008). GH and prolactin (PRL) originate from the same ancestral gene (Niall et al. 1971) and are thought to be expressed predominantly in pituitary tissue based on studies of human and rodent tissue (Su et al. 2004). In some mammals (e.g., primates, murine rodents, artiodactyls), however, these genes have undergone accelerated change via gene duplication events resulting in clusters of related GH-like and PRL-like genes. Our analyses support previous findings that there are four “GH-like” human lineage–specific placenta-predominant genes: chorionic somatomammotropin hormone-like 1 (CSHL1), chorionic somatomammotropin hormone 1 (placental lactogen, CSH1), growth hormone 2 (GH2), and chorionic somatomammotropin hormone 2 (CSH2). GH genes have undergone at least two independent duplications in anthropoid primates (Papper et al. 2009). Placenta-specific forms of GH promote blood flow to the placenta (Schiessl et al. 2007), are involved in placental development (Alsat et al. 1998), and mediate trophoblast invasion (Lacroix et al. 2005). In addition, our study supports previous findings that there are murine rodent and ruminant “PRL-like” genes with placenta-predominant expression in mouse and cow placenta, respectively. Nine placental lactogens were found to be mouse-specific placenta-predominant expressed genes. Cow and other ruminant species have also undergone expansion of the PRL locus (e.g., PRP1, PRP3, PRP6, PRP10), resulting in the emergence of placental lactogens (e.g., CSH2) (Alvarez-Oxiley et al. 2008).
Interestingly, we found that PRL itself is expressed in the elephant placenta. In all other mammal species examined, PRL is specifically expressed in the pituitary but not expressed in the placenta.
As expression of GH and prolactin is specific to the pituitary in the boreoeutherian species examined, this observation may reflect an independent elephant-specific trait or, alternatively, may indicate that the ancestor of eutherian mammals had only one PRL gene that was expressed in both the pituitary and placenta. The African elephant reproduces at a slower rate than any other extant mammal. The gestation of these animals lasts approximately 22 months, the interbirth interval is approximately 8 years (Allen 2006), and the estrous cycle lasts between 13 and 14 weeks (Kapustin et al. 1996). The invasive elephant placenta is endotheliochorial, and trophoblast cells are mononuclear (i.e., no syncytium is formed) (Wooding et al. 2005). Although endotheliochorial placentas also occur in other orders of mammals (e.g., Carnivora, Scandentia, Chiroptera), it has been shown that this is the result of convergent evolution (Wildman et al. 2006; Elliot and Crespi 2009). More extensive sampling of eutherian and marsupial mammals is needed to test these hypotheses.
Conclusion
Comparative genomics has been advanced as a method that can comprehensively characterize the functional elements in the genome (Birney et al. 2007; Margulies et al. 2007), and it has been argued that an evolutionary framework can be used to organize medical knowledge (Nesse et al. 2010). We have sequenced the transcriptome of the term elephant placenta, and this has enabled us to distinguish 1) a set of genes expressed in the placenta of the last common ancestor of extant eutherian mammals from 2) genes that are expressed in a lineage-specific manner. Further work relying on increased taxonomic sampling and additional developmental time points will be required to determine when these two categories of genes evolved their present pattern of expression, but we are encouraged that comparative transcriptomics can provide a roadmap for describing the evolution of the placenta.
Supplementary Material
Supplementary figure S1 and data sets S1–S6 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).