
Meta-Analysis of Maternal and Fetal Transcriptomic Data Elucidates the Role of Adaptive and Innate Immunity in Preterm Birth.

Bianca Vora, Aolin Wang, Idit Kosti, Hongtai Huang, Ishan Paranjpe, Tracey J Woodruff, Tippi MacKenzie, Marina Sirota.

Abstract

Preterm birth (PTB) is the leading cause of newborn deaths around the world. Spontaneous preterm birth (sPTB) accounts for two-thirds of all PTBs; however, there remains an unmet need for detecting and preventing sPTB. Although the dysregulation of the immune system has been implicated in various studies, small sample sizes and irreproducibility of results have limited identification of its role. Here, we present a cross-study meta-analysis to evaluate genome-wide differential gene expression signals in sPTB. A comprehensive search of the NIH genomic database for studies related to sPTB with maternal whole blood samples resulted in data from three separate studies consisting of 339 samples. After aggregating and normalizing these transcriptomic datasets and performing a meta-analysis, we identified 210 genes that were differentially expressed in sPTB relative to term birth. These genes were enriched in immune-related pathways, showing upregulation of innate immunity and downregulation of adaptive immunity in women who delivered preterm. An additional analysis found several of these genes to be differentially expressed at mid-gestation, suggesting their potential to be clinically relevant biomarkers. Furthermore, a complementary analysis identified 473 genes differentially expressed in preterm cord blood samples. However, these genes demonstrated downregulation of the innate immune system, a stark contrast to findings using maternal blood samples. These immune-related findings were further confirmed by cell deconvolution as well as upstream transcription and cytokine regulation analyses. Overall, this study identified a strong immune signature related to sPTB as well as several potential biomarkers that could be translated to clinical use.

Keywords:  immunology; meta-analysis; pregnancy; preterm birth; transcriptomics

Year:  2018        PMID: 29867970      PMCID: PMC5954243          DOI: 10.3389/fimmu.2018.00993

Source DB:  PubMed          Journal:  Front Immunol        ISSN: 1664-3224            Impact factor:   7.561


Introduction

Preterm birth (PTB), which is defined as giving birth before completion of 37 weeks of gestation, is the leading cause of newborn deaths worldwide. In 2010, 14.9 million babies were born preterm, accounting for 11.1% of all births across 184 countries, with the highest PTB rates occurring in Africa and North America (1). This high incidence of PTB is concerning since 29% of all neonatal deaths worldwide, approximately 1 million deaths in total, are attributed to complications of PTB (2). Furthermore, children born prematurely are at increased risk for a range of short- and long-term complications including motor, cognitive, and behavioral impairments (3, 4). Approximately 30% of PTBs are medically indicated due to maternal or fetal conditions; the other two-thirds are categorized as spontaneous preterm births (sPTB) that include spontaneous preterm labor and preterm premature rupture of the membranes (5). PTB is a syndrome with multiple etiologies. Numerous signs point to genetic factors as playing a role in birth timing, including the observations that PTBs are likely to recur in mothers, women who are born preterm are more likely to deliver prematurely, and sisters of women who have delivered prematurely are at an increased risk of delivering preterm. Furthermore, twin studies suggest that genetics account for approximately one-third of the variation in PTB (6, 7). Other factors shown to influence risk for PTB include those associated with adverse lifestyle and behavior, such as stress, smoking, drug use, and nutrition (8). Although a variety of social (9, 10), environmental, and maternal factors have been implicated in PTB, the causes of sPTB have remained largely unclear and therefore, in most instances, not amenable to effective interventions. Thus far, there exists no universal detection method to predict sPTB or intervention approach to prolong labor and extend the pregnancy to term.
The complexity and multiple etiologies of sPTB, along with the inconsistency in clinical phenotyping and non-uniform classification system, have limited the identification of genetic factors and clinically relevant biomarkers (11). Over the years, many different mechanisms have been identified to be associated with sPTB, including breakdown of maternal–fetal tolerance, decidual senescence, uterine overdistension, and procoagulant activity (12, 13). One particularly interesting mechanism that has been implicated is the dysregulation of the interplay between the maternal innate and adaptive immune systems. The innate immune system, also known as the non-specific immune system, comprises cells and mechanisms including but not limited to macrophages, toll-like receptors, neutrophils, and cytokines, which aid in host defense from infection (14, 15). This sub-system is responsible for the generalized, non-specific immune response, inflammation, and activation of the adaptive immune system through antigen presentation (14, 15). In contrast, the adaptive immune system comprises lymphocytes, specifically T cells and B cells, which are specialized white blood cells that provide long-term immunity (14, 15). In pregnancy, regulatory T-cells proliferate after implantation and function to prevent rejection of the fetus by creating an anti-inflammatory environment (16, 17). However, for labor to initiate and progress, the maternal immune system switches to a pro-inflammatory state by activating the pro-inflammatory nuclear factor-κB signaling pathway, which leads to an increase in the production of cytokines, chemokines, and interleukins and allows for infiltration of the fetal/maternal interface by activating leukocytes (16–19).
The location and function of each immune cell are critical to sustain pregnancy to term; it has been proposed that a premature shift from the anti-inflammatory to the pro-inflammatory state, and therefore a disruption in the balance of innate and adaptive immunity, could result in preterm labor and delivery (19). There is a need to understand the mechanisms underlying preterm labor, which could in turn lead to identification, intervention, and prevention. Identifying immune-related genetic signatures as well as clinically relevant diagnostic biomarkers specific to sPTB would enhance our ability to discern women who are at an elevated risk for delivering prematurely. However, findings have been limited due to small sample sizes and issues with irreproducibility (20). Meta-analysis, which combines information from multiple existing studies, is a powerful tool that improves reliability, generalizability, and the ability to detect differential gene expression through greater statistical power (20). With the development of databases such as the National Institutes of Health Gene Expression Omnibus (NIH GEO) and ArrayExpress, gene expression meta-analysis has been applied to investigate different disease subtypes and discover novel biomarkers (21–24). In the area of obstetrics, a recent study performed a meta-analysis which integrated diverse types of genomic data, overlaying evolutionary data and placental expression data in an effort to elucidate genes that may be involved in parturition and disrupt pregnancy (25). As discussed in a recent systematic review (26), although there have been 134 genome-wide transcriptomic studies related to pregnancy and PTB, most of these studies have focused on PTB related to preeclampsia (one of the medical indications of PTB). sPTB was investigated in only 7% of all studies and 18% of preterm studies, even though sPTB is responsible for approximately two-thirds of all PTBs.
Furthermore, 61% of the studies focused on placental tissue, which has limited utility in the diagnostic setting. Upon comparison of results from the different studies, there was very limited overlap among differentially expressed genes; only 2 of the 6,444 differentially expressed genes identified were present in 10 or more gene expression studies (26). Therefore, there exists a need to aggregate data and perform meta-analyses to elucidate gene signatures that are robust and can be reproduced in studies of maternal blood, which would allow for the discovery of biomarkers that can be implemented as part of standard prenatal care. The NIH GEO database has three sPTB-related, publicly available datasets, all of which have been analyzed separately before. The first study, which included women who were diagnosed with threatened preterm labor (median gestational age: 32 weeks), found 469 differentially expressed genes and significantly increased leukocyte and neutrophil counts in women who had sPTB within 48 h after initiation of labor (27). The second study, also by Heng et al., collected samples at two different time points and found no differentially expressed genes in the second trimester and 26 differentially expressed genes in the third trimester when comparing sPTB and term birth (28). The last study analyzed eight tissue types, comparing women who delivered preterm and term with or without labor; the investigators found that pregnancy was maintained by downregulation of chemokines at the maternal–fetal interface, but the work has not been published. Using these three datasets, we performed a cross-study meta-analysis which identified a set of significant differentially expressed genes in maternal blood, many of which were immune related and a few of which could translate to clinically relevant biomarkers. An additional analysis of measurements collected during mid-gestation in one study revealed a smaller set of significant genes that were differentially expressed over time.
Finally, a complementary analysis of fetal cord blood (CB) revealed that there were a number of differentially expressed genes on the fetal side, many of which overlapped with the significant genes in maternal blood and showed opposing changes in regulation.

Results

Datasets

We identified three datasets from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database (23, 24), each of which comprised whole blood gene expression profiles from women who delivered preterm and women who delivered at term. These three studies (GSE46510, GSE59491, and GSE73685) included 339 maternal whole blood samples, 134 from women who delivered preterm and 205 from women who delivered at term. The gestational age of the preterm deliveries ranged from 24.4 to 36.9 weeks with a median of 34 weeks. One study (GSE59491) collected blood samples at two different time points, second trimester (17–23 weeks) and third trimester (27–33 weeks). In addition to whole blood samples, another study used in the meta-analysis (GSE73685) collected RNA samples from seven other types of tissue: amnion, CB, chorion, decidua, fundus, lower segment, and placenta (Table 1).
Table 1

Datasets used in discovery analyses.

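| Dataset | Year | Author | Platform | Sample types | Preterm births | Term births | Gestational age at sampling, weeks* |
|---|---|---|---|---|---|---|---|
| GSE46510 | 2014 | Heng | GPL16311 | Maternal whole blood | 75 | 79 | 32 (24–36) |
| GSE59491 | 2016 | Heng | GPL18964 | Maternal whole blood | 51 (T2); 47 (T3) | 114 (T2); 114 (T3) | 19 (17–20); 29 (27–33) |
| GSE73685 | 2016 | Baldwin | GPL6244 | Amnion (A) | 12 | 12 | NR |
| | | | | Cord blood | 11 | 12 | |
| | | | | Chorion (C) | 12 | 12 | |
| | | | | Decidua (D) | 11 | 12 | |
| | | | | Fundus (F) | 10 | 10 | |
| | | | | Lower segment | 12 | 12 | |
| | | | | Placenta (P) | 12 | 9 | |
| | | | | Maternal whole blood (WB) | 12 | 12 | |

*Median (range) reported.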

NR, not reported; T2, second trimester; T3, third trimester.
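
The sample totals quoted in the Datasets paragraph can be checked directly against the counts in Table 1. The following Python sketch tallies the maternal whole blood samples that enter the meta-analysis; the dictionary layout and variable names are illustrative only, and it assumes that only the third trimester (T3) samples from GSE59491 are counted, as described for the meta-analysis.

```python
# Maternal whole blood counts from Table 1: (preterm, term) per dataset.
# Assumption: only T3 samples from GSE59491 are included.
maternal_blood = {
    "GSE46510": (75, 79),
    "GSE59491_T3": (47, 114),
    "GSE73685": (12, 12),
}

preterm_total = sum(preterm for preterm, _ in maternal_blood.values())
term_total = sum(term for _, term in maternal_blood.values())
total = preterm_total + term_total

# GSE73685 lists gestational age at sampling as "NR" (not reported),
# so those samples cannot contribute to a gestational-age comparison.
with_gestational_age = total - sum(maternal_blood["GSE73685"])

print(preterm_total, term_total, total, with_gestational_age)
# prints: 134 205 339 315
```

The totals agree with the 134 preterm, 205 term, and 339 overall samples stated in the text. The 315 samples with a reported gestational age would be consistent with the n given in the Figure 2E caption, although that correspondence is an inference rather than something the text states.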


Overview

Our primary goal was to perform a meta-analysis to identify potential maternal plasma biomarkers by evaluating differentially expressed genes associated with sPTB and investigating whether certain cell types are enriched in sPTB, using time-matched maternal data from the three independent studies. Taking advantage of the repeated samples collected in mid-gestation from study GSE59491 and samples collected from seven additional tissues in study GSE73685, we performed secondary analyses to identify potential common gene expression signatures across different gestational stages and different tissues and to investigate the potential maternal–fetal interplay at the transcriptomic level. We investigated and compared the transcriptomic signature that was identified as part of the maternal meta-analysis to what was observed earlier in the pregnancy. The second additional analysis investigated differential gene expression in various tissue types to identify tissue specific transcriptomic signatures (Figure 1). Each of the signatures was further interrogated through pathway and transcriptional regulation analysis.
Figure 1

Analysis of relationship of gene expression differences in term vs. preterm birth. We identified three independent studies from the Gene Expression Omnibus database (in yellow) to perform a meta-analysis using third trimester maternal blood samples (in green), an additional differential expression analysis with second trimester samples from GSE59491 (in orange), and a tissue-specific analysis with samples from GSE73685 (in blue).


Cross-Study Gene Expression Meta-Analysis in Maternal Blood

Samples from the three studies were pooled together based on gestational age at time of sample collection. The women were split into two groups based only on whether they delivered before or after 37 weeks of gestation, with no regard to time of delivery relative to the initiation of labor. When pooling samples from the different studies, study-specific differences in gene expression were seen (Figure 2A) and corrected for using ComBat (29) to eliminate such biases (Figure 2B). When we imposed a false discovery rate (FDR) of 0.1, the normalized, merged dataset of 17,337 genes was reduced to 4,648 significant genes. Setting a significance threshold at a fold change (FC) of 1.3-fold increase or decrease in gene expression (30) for PTB samples, relative to term birth, condensed our gene list from 4,648 genes to 210 differentially expressed genes (FC range: 0.46–1.94) (Figure 2C), with 65 genes upregulated and 145 genes downregulated (Table S1 in Supplementary Material). We saw clustering of preterm samples and term samples based on the 210 significant genes; however, we did not see any clustering by study (Figure 2D). Only third trimester samples from GSE59491 were used in the meta-analysis to better time-match all samples (Figure 2E).
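
The two-step gene filter described above (an FDR threshold followed by a fold-change cutoff) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the FDR is controlled with the Benjamini–Hochberg procedure, that a "1.3-fold decrease" corresponds to FC ≤ 1/1.3 (about 0.77), and that the cutoffs are inclusive. The function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differentially_expressed(fold_change, adj_p, fdr=0.1, fc_cutoff=1.3):
    """Apply the FDR threshold first, then a symmetric fold-change cutoff."""
    if adj_p >= fdr:
        return False
    # Assumption: a 1.3-fold decrease is expressed as FC <= 1 / 1.3.
    return fold_change >= fc_cutoff or fold_change <= 1.0 / fc_cutoff


# Toy p-values, for illustration only (not taken from the study).
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Under these assumptions, a gene at either end of the reported FC range (0.46 or 1.94) passes the fold-change step once its adjusted p-value is below 0.1, whereas a gene with FC 1.1 does not, regardless of its p-value.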
Figure 2

Results from the cross-study meta-analysis and distribution of gestational age at sampling. (A,B) Principal component analysis plots with all genes before (A) and after (B) ComBat. (C,D) Principal component analysis plot (C) and heatmap (D) of all samples based on 210 significant differentially expressed genes. (E) Gestational age at sampling was not significantly different between preterm and term maternal whole blood samples (n = 315, p-value = 0.125).

Splitting our list of 210 genes into two sub-groups based on whether they were upregulated or downregulated in PTB, we found that the downregulated genes demonstrated strong network connectivity using the STRING database (31–33) (Figure 3A) and were functionally enriched in 36 different pathways using the ToppFun database (34) (Table 2), more than half of which were immune related. Specifically, the downregulated genes were highly involved in the adaptive immune response, showing significant clustering and connectivity in Gene Ontology Consortium (GO) biological processes (Table S2 in Supplementary Material) including antigen receptor-mediated signaling pathway, leukocyte activation, lymphocyte activation, and T-cell activation (Figures 4A–D). Furthermore, six of the 145 downregulated genes (CD8B, CLC, DPP4, NELL2, SERPINI1, and NUCB2) were found to be secreted as proteins in humans according to the UniProt database (35) (Table 3). The 65 upregulated genes showed less network connectivity relative to the downregulated genes (Figure 3B). Although the majority (5 of 6) of the functionally enriched pathways were immune related (Table 2), the upregulated genes were specifically involved in the innate immune response, a stark contrast to the downregulated genes. In addition, 9 of the 65 upregulated genes were found to be secreted as proteins in humans [IL-1 receptor type I (IL-1R1), IL-1R2, IL-1RAP, HPSE, NLRP3, tissue factor pathway inhibitor (TFPI), LRG1, CST7, LAMB2] according to the UniProt database (35) (Table 3).
Figure 3

STRING connectivity networks based on 210 differentially expressed genes. (A,B) Connectivity networks for significantly downregulated (A) and upregulated (B) genes from meta-analysis.

Table 2

Functionally enriched pathways from cross-study meta-analysis.

| ID | Name | Source | p-Value | FDR B&H | Genes from input | Genes in annotation |
|---|---|---|---|---|---|---|
| Upregulated | | | | | | |
| M12095 | Signal transduction through IL1R* | MSigDB C2 BIOCARTA (v5.1) | 6.17E−06 | 3.04E−03 | 4 | 33 |
| 1269320 | Interleukin-1 signaling* | BioSystems: REACTOME | 2.38E−05 | 5.86E−03 | 4 | 46 |
| 1457780 | Neutrophil degranulation* | BioSystems: REACTOME | 6.52E−05 | 1.07E−02 | 9 | 492 |
| 1269203 | Innate Immune System* | BioSystems: REACTOME | 1.81E−04 | 2.23E−02 | 14 | 1312 |
| 137944 | IL1-mediated signaling events* | BioSystems: Pathway Interaction Database | 2.84E−04 | 2.54E−02 | 3 | 35 |
| 82974 | Starch and sucrose metabolism | BioSystems: KEGG | 3.10E−04 | 2.54E−02 | 3 | 36 |
| Downregulated | | | | | | |
| M1467 | The Co-Stimulatory Signal During T-cell Activation* | MSigDB C2 BIOCARTA (v5.1) | 3.46E−07 | 3.21E−04 | 5 | 21 |
| 83080 | T cell receptor signaling pathway* | BioSystems: KEGG | 8.04E−07 | 3.66E−04 | 8 | 103 |
| 138055 | TCR signaling in naive CD8+ T cells* | BioSystems: Pathway Interaction Database | 1.24E−06 | 3.66E−04 | 6 | 48 |
| 1269171 | Adaptive Immune System* | BioSystems: REACTOME | 1.58E−06 | 3.66E−04 | 20 | 826 |
| 137998 | TCR signaling in naive CD4+ T cells* | BioSystems: Pathway Interaction Database | 4.71E−06 | 8.73E−04 | 6 | 60 |
| 1269175 | Generation of second messenger molecules | BioSystems: REACTOME | 5.89E−06 | 9.10E−04 | 5 | 36 |
| 1269174 | Translocation of ZAP-70 to Immunological synapse* | BioSystems: REACTOME | 1.77E−05 | 2.10E−03 | 4 | 22 |
| M9526 | T Cell Signal Transduction* | MSigDB C2 BIOCARTA (v5.1) | 1.81E−05 | 2.10E−03 | 5 | 45 |
| 1269173 | Phosphorylation of CD3 and TCR zeta chains* | BioSystems: REACTOME | 3.00E−05 | 2.98E−03 | 4 | 25 |
| 1269172 | TCR signaling* | BioSystems: REACTOME | 3.32E−05 | 2.98E−03 | 7 | 124 |
| 1269182 | PD-1 signaling* | BioSystems: REACTOME | 3.53E−05 | 2.98E−03 | 4 | 26 |
| M16519 | HIV Induced T Cell Apoptosis* | MSigDB C2 BIOCARTA (v5.1) | 5.98E−05 | 4.62E−03 | 3 | 11 |
| 83078 | Hematopoietic cell lineage* | BioSystems: KEGG | 7.48E−05 | 5.34E−03 | 6 | 97 |
| M10765 | Lck and Fyn tyrosine kinases in initiation of TCR Activation* | MSigDB C2 BIOCARTA (v5.1) | 1.03E−04 | 6.46E−03 | 3 | 13 |
| 1269176 | Downstream TCR signaling* | BioSystems: REACTOME | 1.05E−04 | 6.46E−03 | 6 | 103 |
| M13247 | T Cytotoxic Cell Surface Molecules* | MSigDB C2 BIOCARTA (v5.1) | 1.30E−04 | 7.07E−03 | 3 | 14 |
| M6427 | T Helper Cell Surface Molecules* | MSigDB C2 BIOCARTA (v5.1) | 1.30E−04 | 7.07E−03 | 3 | 14 |
| 83125 | Primary immunodeficiency* | BioSystems: KEGG | 1.47E−04 | 7.55E−03 | 4 | 37 |
| 1269177 | Costimulation by the CD28 family* | BioSystems: REACTOME | 1.90E−04 | 9.25E−03 | 5 | 73 |
| 169352 | Regulation of Wnt-mediated beta catenin signaling and target gene transcription | BioSystems: Pathway Interaction Database | 2.75E−04 | 1.27E−02 | 5 | 79 |
| 1269183 | Signaling by the B Cell Receptor (BCR)* | BioSystems: REACTOME | 3.25E−04 | 1.42E−02 | 8 | 236 |
| M16966 | Stathmin and breast cancer resistance to antimicrotubule agents | MSigDB C2 BIOCARTA (v5.1) | 3.36E−04 | 1.42E−02 | 3 | 19 |
| M18215 | Role of Tob in T-cell activation* | MSigDB C2 BIOCARTA (v5.1) | 4.57E−04 | 1.83E−02 | 3 | 21 |
| 1269201 | Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell* | BioSystems: REACTOME | 4.74E−04 | 1.83E−02 | 6 | 136 |
| 1270272 | Activation of NOXA and translocation to mitochondria | BioSystems: REACTOME | 5.21E−04 | 1.88E−02 | 2 | 5 |
| 1269102 | Nef-mediates down modulation of cell surface receptors by recruiting them to clathrin adapters | BioSystems: REACTOME | 5.26E−04 | 1.88E−02 | 3 | 22 |
| M6327 | Activation of Csk by cAMP-dependent Protein Kinase Inhibits Signaling through the T Cell Receptor* | MSigDB C2 BIOCARTA (v5.1) | 6.84E−04 | 2.35E−02 | 3 | 24 |
| 1427859 | Cargo recognition for clathrin-mediated endocytosis | BioSystems: REACTOME | 7.78E−04 | 2.58E−02 | 5 | 99 |
| 137922 | IL12-mediated signaling events* | BioSystems: Pathway Interaction Database | 1.01E−03 | 3.24E−02 | 4 | 61 |
| 1269603 | Binding of TCF/LEF:CTNNB1 to target gene promoters | BioSystems: REACTOME | 1.08E−03 | 3.24E−02 | 2 | 7 |
| 137936 | IL12 signaling mediated by STAT4* | BioSystems: Pathway Interaction Database | 1.08E−03 | 3.24E−02 | 3 | 28 |
| 1269100 | The role of Nef in HIV-1 replication and disease pathogenesis* | BioSystems: REACTOME | 1.20E−03 | 3.49E−02 | 3 | 29 |
| 83004 | Propanoate metabolism | BioSystems: KEGG | 1.61E−03 | 4.52E−02 | 3 | 32 |
| 1269298 | Fc epsilon receptor (FCERI) signaling* | BioSystems: REACTOME | 1.84E−03 | 4.94E−02 | 9 | 381 |
| 117293 | Arrhythmogenic right ventricular cardiomyopathy (ARVC) | BioSystems: KEGG | 1.88E−03 | 4.94E−02 | 4 | 72 |
| 1269528 | SMAD2/SMAD3:SMAD4 heterotrimer regulates transcription | BioSystems: REACTOME | 1.92E−03 | 4.94E−02 | 3 | 34 |

Pathways annotated with a * are immune related.

FDR B&H, false discovery rate using Benjamini–Hochberg method; Genes from input, number of significant genes included in given pathways; Genes in annotation, number of genes involved in functional pathway; MSigDB C2 BIOCARTA, Molecular Signatures Database curated gene set derived from BIOCARTA database; KEGG, Kyoto Encyclopedia of Genes and Genomes.
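
To make the "Genes from input" and "Genes in annotation" columns concrete, the sketch below computes a one-sided hypergeometric enrichment p-value, a common way to score such overlaps. It is an illustration under stated assumptions rather than a reproduction of ToppFun: the background of 17,337 genes is borrowed from the merged dataset size reported in the text, whereas ToppFun uses its own background and annotation handling, so the resulting value is not expected to match the p-values in Table 2. The function name is hypothetical.

```python
from math import comb


def enrichment_p_value(hits, input_size, annotation_size, background_size):
    """P(X >= hits) when input_size genes are drawn without replacement
    from a background containing annotation_size pathway genes."""
    max_hits = min(input_size, annotation_size)
    total = comb(background_size, input_size)
    tail = sum(
        comb(annotation_size, k)
        * comb(background_size - annotation_size, input_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Example using the first upregulated row of Table 2:
# 4 of the 65 upregulated genes fall in a 33-gene pathway.
# Assumption: background of 17,337 genes (the merged dataset size).
p_il1r = enrichment_p_value(hits=4, input_size=65,
                            annotation_size=33, background_size=17337)
```

Exact integer arithmetic via `math.comb` avoids the rounding problems that factorial-based floating-point formulas run into at this scale; the per-pathway p-values would then be adjusted across all tested pathways, which is what the FDR B&H column reports.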

Figure 4

(A–D) Network visualization of functionally enriched GO biological processes in significantly downregulated genes from the meta-analysis.

Table 3

Secreted proteins from meta-analysis and T2 ad hoc analysis.

| Genes | FC_GSE46510 | FC_GSE59491 | FC_GSE73685 | Directionality | p-Value | Adj p val |
|---|---|---|---|---|---|---|
| HPSE | 1.12502566 | 1.029755132 | 1.302481986 | Upregulated | 0.01472767 | 0.072407187 |
| NLRP3 | 1.120388112 | 1.045936747 | 1.463420748 | Upregulated | 0.008637567 | 0.055169509 |
| LRG1 | 1.304600844 | 1.012782905 | 1.266657092 | Upregulated | 0.001694397 | 0.02445227 |
| CLC | 0.77816604 | 0.925657768 | 0.460056969 | Downregulated | 0.013608709 | 0.06964973 |
| DPP4 | 0.906503094 | 0.939578805 | 0.7254311 | Downregulated | 0.007598762 | 0.051200829 |
| IL1R1* | 1.088237667 | 1.085324456 | 1.324171306 | Upregulated | 0.006616252 | 0.047794152 |
| IL1RAP | 1.163741033 | 1.016218763 | 1.310389926 | Upregulated | 0.017000877 | 0.07821447 |
| LAMB2 | 1.05178463 | 1.022848768 | 1.305601216 | Upregulated | 0.004096624 | 0.037683905 |
| NELL2 | 0.785324082 | 0.91373226 | 0.642718858 | Downregulated | 0.000198362 | 0.010174565 |
| NUCB2 | 0.749613912 | 0.965573738 | 0.800992598 | Downregulated | 0.000657135 | 0.01646351 |
| SERPINI1 | 0.920203336 | 0.872420873 | 0.755891687 | Downregulated | 0.000180817 | 0.009596761 |
| TFPI* | 1.044097212 | 1.117168095 | 0.750437685 | Upregulated | 0.009325712 | 0.057394342 |
| IL1R2 | 1.231921476 | 1.051452709 | 1.625438109 | Upregulated | 0.004772589 | 0.040619724 |
| CST7 | 1.188969828 | 1.033458464 | 1.334064851 | Upregulated | 0.003500353 | 0.034868265 |
| CD8B | 0.856761369 | 0.909815767 | 0.742642806 | Downregulated | 0.006032779 | 0.045913205 |

Genes with * annotation are also found to be significant in the T2 analysis.

FC_GSE46510, fold-change calculated using GSE46510 samples; FC_GSE59491, fold-change calculated using GSE59491 samples; FC_GSE73685, fold-change calculated using GSE73685 samples; Adj p val, adjusted p-value.

These immune pathways and secreted proteins associated with sPTB could have been missed in the single-study analyses because, with limited sample sizes, they did not reach statistical significance: only 26 significant genes were identified (FDR < 0.10) in the GSE59491 study and no significant genes were identified (FDR < 0.10) in the GSE73685 study (analysis of maternal blood samples only). This highlights the importance and power of aggregating the data and performing a meta-analysis.

Cell-Type Deconvolution Analysis in Maternal Blood

Because whole blood samples are heterogeneous, it is important to identify and quantify the various cell types that make up peripheral blood. If not taken into account, the variability in cell composition across samples can confound the results and limit interpretability (36). To examine the reproducibility of our results and to test whether the enriched pathways aligned with cell-type abundance when comparing preterm and term birth, we performed a cell-type deconvolution analysis. Specifically, since immune cells constitute a large portion of the cell types in blood, and a large part of the differentially expressed pathways were immune related, we utilized xCell, a computational method that infers 64 immune and stromal cell types from gene signatures (37). Our xCell analysis of all 339 samples revealed 27 cell types whose enrichment differed significantly between preterm and term birth. Macrophages (M2 type) and microvascular endothelial cells demonstrated the largest and most significant (FDR = 0.003) difference in enrichment between preterm and term maternal blood samples (Figure 5A) when comparing average xCell scores. We also saw some clustering of preterm and term birth samples by immune cell type, with term birth samples showing some clustering and upregulation of adaptive immune cells, such as Th2 cells, CD8+ T-cells, CD4+ T-cells, and B-cells, and PTB samples showing some clustering and upregulation of innate immune cells, such as NKT cells, M2 macrophages, basophils, and neutrophils (Figure 5B). Adjusting for significant cell types as covariates in our differential expression analysis for T3 samples resulted in 334 genes that were differentially expressed in PTB compared with term birth. Upon pathway analysis, we found that the innate immune pathway was upregulated in the preterm samples, which is consistent with our initial results.
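
The idea of adjusting a differential expression estimate for cell composition can be illustrated with a small regression sketch. This is a simplified stand-in under loud assumptions: a single cell-type score is used as the only covariate, the model is ordinary least squares, and all numbers are toy values invented for illustration; the model actually used for the adjusted analysis is not described in this section. The function names are hypothetical.

```python
from statistics import mean


def residuals(y, x):
    """Residuals of y after simple linear regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_group_effect(expression, group, cell_score):
    """Group coefficient from expression ~ group + cell_score, obtained by
    regressing out cell_score from both sides (Frisch-Waugh-Lovell)."""
    e_res = residuals(expression, cell_score)
    g_res = residuals(group, cell_score)
    return sum(e * g for e, g in zip(e_res, g_res)) / sum(g * g for g in g_res)


# Toy data (illustrative only): 0 = term, 1 = preterm.
group = [0, 0, 0, 1, 1, 1]
cell_score = [1, 2, 3, 2, 3, 4]          # hypothetical cell-type score
expression = [2 * c + 0.5 * g for c, g in zip(cell_score, group)]

# Naive difference in group means, ignoring cell composition.
unadjusted = mean(expression[3:]) - mean(expression[:3])
adjusted = adjusted_group_effect(expression, group, cell_score)
```

In this toy setup most of the raw difference between groups comes from the preterm samples having higher cell-type scores, so the adjusted group effect is much smaller than the unadjusted difference in means. A gene that remains differentially expressed after such an adjustment is less likely to reflect a shift in cell composition alone.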
Figure 5

Cell deconvolution of 339 meta-analysis samples. Boxplot (A) and heatmap (B) of average xCell scores for enriched cell types.


Additional Analysis of Maternal Signatures in the Second Trimester

To investigate and compare expression profiles at two different time points in pregnancy, we utilized the samples collected at second trimester in the GSE59491 study and performed an additional analysis investigating whether any of our significant genes from the third trimester analysis were differentially expressed at an earlier time point to facilitate potential biomarker identification. Implementing an FDR < 0.1 on the filtered list of 210 genes, there were 18 genes (8 upregulated and 10 downregulated) that were significantly differentially expressed (Figure 6A; Table S3 in Supplementary Material).
Figure 6

Results from additional second trimester analysis. (A) Heatmap of significant genes from second trimester analysis; genes which are secreted as proteins are boxed. (B,C) Boxplots of genes that encode secreted proteins at second (T2) and third (T3) trimester; raw gene expression values from GSE59491 are plotted.

These 18 genes, which were differentially expressed in PTB relative to term birth at the second trimester (17–20 weeks) and the third trimester (24–36 weeks), showed similar FC direction and values when comparing second trimester samples from GSE59491 and the samples from the cross-study meta-analysis (Table S4 in Supplementary Material). Furthermore, when plotting the raw expression data for the second and third trimester samples from GSE59491 for these 18 genes, the same trends were upheld, demonstrating similar FC direction and values between the two groups (Figure S1 in Supplementary Material). Two of these 18 genes (IL-1R1 and TFPI) showed potential as diagnostic biomarkers; they were found to be secreted and detectable in human plasma according to the UniProt database (35) (Table 3) and upheld the same fold-change directionality in both second and third trimester samples (Figures 6B,C).

Upstream Transcription and Cytokine Regulation Analysis in Maternal Signatures

To better understand the differential expression patterns, we explored the upstream regulation of differentially expressed upregulated genes for the second and third trimester separately. We first created a transcription factor regulation network for the second and third trimester (Figures 7A,B). In Figure 7A, we found four regulators for only two of the second trimester differentially expressed genes. Of the four regulators, one transcription factor, BCL6 (38), has previously been shown to regulate components of the immune system and another, MXD1, is involved in cell proliferation (39). In Figure 7B, we found nine regulators for 46 of the third trimester differentially expressed genes. Of those nine, a few are known to be involved in development of the immune system, such as SPI1 (40), BCL6, and UXT (41), while others are involved in embryonic cell development, such as CBX5 (42), RUNX2 (43), and TCF3 (44). Two transcription factors, BCL6 and MXD1, overlap between the two groups.
Figure 7

Regulatory networks for second and third trimester differentially expressed genes. Transcription regulation networks for differentially expressed genes in the second trimester (A) and third trimester (B), where the transcription factors are represented with a purple round node and the differentially expressed targets are represented with a gray square node. Cytokine networks for second trimester (C) and third trimester (D), where the transcription factors are represented with an orange hexagon node and the differentially expressed targets are represented with a gray square node.

We then explored cytokine regulation in differentially expressed upregulated genes in second and third trimesters. In both trimesters, as shown clearly in Figures 7C,D, IL-7 is the only cytokine we found to be involved in the regulation. Given the known role for IL-7 signaling in lymphocyte differentiation, this finding is also consistent with the immune signature we observed. Based on those four regulatory networks and two modes of regulation, we see enrichment of transcription factors involved with the immune system and with cell proliferation.

Differential Gene Expression Analysis in Samples From Other Tissues

Since GSE73685 contained a set of diverse tissues, we also evaluated transcriptional signal in various maternal and fetal tissues separately. With an FDR < 0.05, only one of the tissue types, CB, showed significant differentially expressed genes. Imposing a fold-change cutoff of 1.3 on the 507 genes that were identified from the differential expression analysis resulted in 473 significant genes (Table S5 in Supplementary Material), 165 upregulated and 308 downregulated in PTB relative to term birth, which clustered to create a distinct separation between PTB and term birth (Figure 8A). Based on the ToppFun database, the 308 downregulated genes were highly enriched in multiple functional pathways, many of which were immune related (Table 4) (34). Specifically, PTB samples showed downregulation of many innate immune-related pathways relative to term birth samples. Conversely, the 165 upregulated genes showed low functional pathway enrichment (Table 4) (34).
Figure 8

Significant genes from cord blood (CB) tissue analysis and maternal–cord gene signature comparison. (A) Heatmap of significant differentially expressed genes from CB analysis. (B) Boxplot of overlapping significant genes from meta-analysis and CB analysis; raw gene expression values from GSE73685 plotted.

Table 4

Functionally enriched pathways from cord blood tissue analysis.

ID | Name | Source | p-Value | FDR B&H | Genes from input | Genes in annotation
Upregulated
169351 | Validated targets of C-MYC transcriptional activation | BioSystems: Pathway Interaction Database | 7.733E−08 | 0.00008437 | 9 | 81
Downregulated
1269203 | Innate Immune System* | BioSystems: REACTOME | 7.768E−26 | 9.578E−23 | 81 | 1312
1457780 | Neutrophil degranulation* | BioSystems: REACTOME | 8.786E−25 | 5.417E−22 | 50 | 492
213780 | Tuberculosis* | BioSystems: KEGG | 5.678E−07 | 0.0002334 | 15 | 179
83051 | Cytokine-cytokine receptor interaction* | BioSystems: KEGG | 0.000001234 | 0.0003804 | 18 | 270
469200 | Legionellosis* | BioSystems: KEGG | 0.000004663 | 0.0009906 | 8 | 55
144181 | Leishmaniasis* | BioSystems: KEGG | 0.00000482 | 0.0009906 | 9 | 73
1427857 | Regulation of TLR by endogenous ligand* | BioSystems: REACTOME | 0.000005868 | 0.001034 | 5 | 16
M9546 | Chaperones modulate interferon Signaling Pathway* | MSigDB C2 BIOCARTA (v5.1) | 0.00001497 | 0.002122 | 5 | 19
1269204 | Toll-Like Receptors Cascades* | BioSystems: REACTOME | 0.00001549 | 0.002122 | 12 | 153
1269310 | Cytokine Signaling in Immune system* | BioSystems: REACTOME | 0.00002554 | 0.002599 | 30 | 763
1269158 | IRAK4 deficiency (TLR2/4)* | BioSystems: REACTOME | 0.0000274 | 0.002599 | 4 | 11
PW:0000234 | Innate immune response* | Pathway Ontology | 0.0000274 | 0.002599 | 4 | 11
1269160 | MyD88 deficiency (TLR2/4)* | BioSystems: REACTOME | 0.0000274 | 0.002599 | 4 | 11
634527 | NF-kappa B signaling pathway* | BioSystems: KEGG | 0.00004183 | 0.003636 | 9 | 95
122191 | NOD-like receptor signaling pathway* | BioSystems: KEGG | 0.00004423 | 0.003636 | 12 | 170
1269156 | Diseases of Immune System* | BioSystems: REACTOME | 0.00005093 | 0.003694 | 5 | 24
1269157 | Diseases associated with the TLR signaling cascade* | BioSystems: REACTOME | 0.00005093 | 0.003694 | 5 | 24
1269318 | Signaling by Interleukins* | BioSystems: REACTOME | 0.00005712 | 0.003913 | 23 | 531
M13968 | HIV-I Nef: negative effector of Fas and TNF* | MSigDB C2 BIOCARTA (v5.1) | 0.00006456 | 0.00419 | 7 | 58
138052 | Ephrin B reverse signaling | BioSystems: Pathway Interaction Database | 0.00009269 | 0.005531 | 5 | 27
193147 | Osteoclast differentiation | BioSystems: KEGG | 0.0000942 | 0.005531 | 10 | 130
1383066 | TP53 Regulates Transcription of Cell Death Genes | BioSystems: REACTOME | 0.0001239 | 0.006944 | 6 | 45
P00031 | Inflammation mediated by chemokine and cytokine signaling pathway* | PantherDB | 0.0001357 | 0.007273 | 12 | 191
1269545 | Class A/1 (Rhodopsin-like receptors) | BioSystems: REACTOME | 0.0001744 | 0.00896 | 16 | 322
1269236 | Activated TLR4 signaling* | BioSystems: REACTOME | 0.0001852 | 0.009001 | 9 | 115
114228 | Fc gamma R-mediated phagocytosis* | BioSystems: KEGG | 0.0001898 | 0.009001 | 8 | 91
217173 | Influenza A* | BioSystems: KEGG | 0.0002322 | 0.01017 | 11 | 173
1269239 | Toll-Like Receptor TLR1:TLR2 Cascade* | BioSystems: REACTOME | 0.0002557 | 0.01017 | 8 | 95
1269238 | Toll-Like Receptor 2 (TLR2) Cascade* | BioSystems: REACTOME | 0.0002557 | 0.01017 | 8 | 95
1269237 | MyD88:Mal cascade initiated on plasma membrane* | BioSystems: REACTOME | 0.0002557 | 0.01017 | 8 | 95
1269240 | Toll-Like Receptor TLR6:TLR2 Cascade* | BioSystems: REACTOME | 0.0002557 | 0.01017 | 8 | 95
137995 | HIV-1 Nef: Negative effector of Fas and TNF-alpha* | BioSystems: Pathway Interaction Database | 0.0003326 | 0.01282 | 5 | 35
99051 | Chemokine signaling pathway* | BioSystems: KEGG | 0.0003596 | 0.01294 | 11 | 182
137910 | CXCR4-mediated signaling events* | BioSystems: Pathway Interaction Database | 0.0003598 | 0.01294 | 7 | 76
1269234 | Toll-Like Receptor 4 (TLR4) Cascade* | BioSystems: REACTOME | 0.0003674 | 0.01294 | 9 | 126
147809 | Chagas disease (American trypanosomiasis)* | BioSystems: KEGG | 0.0004156 | 0.01403 | 8 | 102
172846 | Staphylococcus aureus infection* | BioSystems: KEGG | 0.0004209 | 0.01403 | 6 | 56
1457777 | Antimicrobial peptides* | BioSystems: REACTOME | 0.0005053 | 0.0164 | 8 | 105
1269280 | FCGR activation* | BioSystems: REACTOME | 0.0005221 | 0.01651 | 4 | 22
213306 | Measles* | BioSystems: KEGG | 0.0005771 | 0.01779 | 9 | 134
M15285 | NF-kB Signaling Pathway* | MSigDB C2 BIOCARTA (v5.1) | 0.0006234 | 0.01875 | 4 | 23
138022 | Class I PI3K signaling events | BioSystems: Pathway Interaction Database | 0.0007051 | 0.02047 | 5 | 41
83060 | Apoptosis | BioSystems: KEGG | 0.0007139 | 0.02047 | 9 | 138
375172 | Salmonella infection* | BioSystems: KEGG | 0.0007631 | 0.02138 | 7 | 86
169642 | Toxoplasmosis* | BioSystems: KEGG | 0.0008232 | 0.02256 | 8 | 113
1269161 | MyD88 deficiency (TLR5)* | BioSystems: REACTOME | 0.0009052 | 0.02375 | 2 | 3
1269566 | Hydroxycarboxylic acid-binding receptors | BioSystems: REACTOME | 0.0009052 | 0.02375 | 2 | 3
1269303 | C-type lectin receptors (CLRs)* | BioSystems: REACTOME | 0.00112 | 0.02877 | 9 | 147
137964 | Regulation of p38-alpha and p38-beta* | BioSystems: Pathway Interaction Database | 0.00117 | 0.02937 | 4 | 27
1269576 | G alpha (i) signaling events | BioSystems: REACTOME | 0.001191 | 0.02937 | 12 | 243
1270241 | Signal regulatory protein (SIRP) family interactions* | BioSystems: REACTOME | 0.00133 | 0.03216 | 3 | 13
1269308 | Dectin-2 family* | BioSystems: REACTOME | 0.00154 | 0.03607 | 4 | 29
153910 | Phagosome* | BioSystems: KEGG | 0.001551 | 0.03607 | 9 | 154
1470924 | Interleukin-10 signaling* | BioSystems: REACTOME | 0.001602 | 0.03617 | 5 | 49
1269546 | Peptide ligand-binding receptors | BioSystems: REACTOME | 0.001752 | 0.03617 | 10 | 188
1269332 | TNFs bind their physiological receptors* | BioSystems: REACTOME | 0.001753 | 0.03617 | 4 | 30
137974 | Caspase cascade in apoptosis* | BioSystems: Pathway Interaction Database | 0.001755 | 0.03617 | 5 | 50
138017 | Signaling events mediated by PTP1B | BioSystems: Pathway Interaction Database | 0.001755 | 0.03617 | 5 | 50
PW:0000681 | FasL mediated signaling pathway* | Pathway Ontology | 0.001789 | 0.03617 | 2 | 4
1269159 | IRAK4 deficiency (TLR5)* | BioSystems: REACTOME | 0.001789 | 0.03617 | 2 | 4
PW:0000464 | leukotriene metabolic* | Pathway Ontology | 0.001789 | 0.03617 | 2 | 4
83099 | Amyotrophic lateral sclerosis (ALS) | BioSystems: KEGG | 0.001919 | 0.03816 | 5 | 51
P00020 | FAS signaling pathway | PantherDB | 0.001986 | 0.03874 | 4 | 31
M17681 | IL3 signaling pathway* | MSigDB C2 BIOCARTA (v5.1) | 0.002062 | 0.03874 | 3 | 15
M11736 | Cytokines can induce activation of matrix metalloproteinases, which degrade extracellular matrix* | MSigDB C2 BIOCARTA (v5.1) | 0.002062 | 0.03874 | 3 | 15
P00006 | Apoptosis signaling pathway | PantherDB | 0.002073 | 0.03874 | 7 | 102
1269195 | Antigen processing-Cross presentation* | BioSystems: REACTOME | 0.002192 | 0.04034 | 7 | 103
137939 | Direct P53 effectors | BioSystems: Pathway Interaction Database | 0.002234 | 0.04051 | 8 | 132
1269357 | GPVI-mediated activation cascade* | BioSystems: REACTOME | 0.002476 | 0.04291 | 5 | 54
M14775 | G alpha s Pathway | MSigDB C2 BIOCARTA (v5.1) | 0.002506 | 0.04291 | 3 | 16
1270299 | RIPK1-mediated regulated necrosis | BioSystems: REACTOME | 0.002506 | 0.04291 | 3 | 16
1270298 | Regulated Necrosis | BioSystems: REACTOME | 0.002506 | 0.04291 | 3 | 16
1269544 | GPCR ligand binding | BioSystems: REACTOME | 0.0027 | 0.04561 | 17 | 455
812256 | TNF signaling pathway* | BioSystems: KEGG | 0.002867 | 0.04778 | 7 | 108
M4891 | Regulation of transcriptional activity by PML* | MSigDB C2 BIOCARTA (v5.1) | 0.003003 | 0.04873 | 3 | 17
1270264 | Ligand-dependent caspase activation* | BioSystems: REACTOME | 0.003003 | 0.04873 | 3 | 17
194384 | African trypanosomiasis* | BioSystems: KEGG | 0.003129 | 0.04946 | 4 | 35
137944 | IL1-mediated signaling events* | BioSystems: Pathway Interaction Database | 0.003129 | 0.04946 | 4 | 35

Pathways annotated with a * are immune related.

FDR B&H, FDR using Benjamini–Hochberg method; Genes from input, number of significant genes included in given pathways; Genes in annotation, number of genes involved in functional pathway; MSigDB C2 BIOCARTA, Molecular Signatures Database curated gene set derived from BIOCARTA database; KEGG, Kyoto Encyclopedia of Genes and Genomes; PANTHER, Protein Analysis Through Evolutionary Relationships Classification system.

Comparing these 473 significant genes from the CB analysis to the 210 significant genes output from the maternal blood meta-analysis, we found that there were 13 genes including toll-like receptor 5 (TLR5) and other immune transcripts which overlapped and were significant in both analyses. Plotting the raw data for these 13 genes from GSE73685 revealed opposite directionality comparing preterm and term birth for CB and maternal blood, respectively (Figure 8B). While some genes were upregulated in preterm maternal whole blood samples (in both the meta-analysis and GSE73685 only samples), those same genes were downregulated in preterm CB samples; the same was true for many genes which were downregulated in preterm maternal whole blood samples but upregulated in preterm CB samples. All the results and the data are available as an RShiny Application for the benefit of the research community: http://comphealth.ucsf.edu/preterm_transcriptomics/.
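The maternal–cord comparison described above can be illustrated with a minimal sketch: intersect the two significant gene lists and keep the genes whose log2 fold-changes have opposite signs. The function name and all numeric values below are hypothetical placeholders, not values from the study; only the direction for TLR5 (up in maternal blood, down in CB) follows the text.

```python
def opposite_direction_overlap(maternal_log2fc, cord_log2fc):
    """Return genes present in both signatures whose log2 fold-changes
    (preterm vs. term) point in opposite directions."""
    shared = sorted(set(maternal_log2fc) & set(cord_log2fc))
    return [g for g in shared if maternal_log2fc[g] * cord_log2fc[g] < 0]


# Hypothetical toy signatures (gene -> log2 fold-change, preterm vs. term).
maternal = {"TLR5": 0.52, "GENE_A": -0.45, "GENE_B": 0.60, "GENE_C": 0.41}
cord = {"TLR5": -0.70, "GENE_A": 0.48, "GENE_B": 0.55, "GENE_D": -0.90}

flipped = opposite_direction_overlap(maternal, cord)
print(flipped)
```

Here GENE_B is shared but changes in the same direction in both tissues, so it is excluded; GENE_C and GENE_D are each significant in only one signature.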

Discussion

Given the role of the immune system in pregnancy, there exists a need to elucidate immune signatures specific to PTB at both the maternal and fetal level. This study was thus designed to address this need by aggregating data from multiple independent experiments in an attempt to discover significant, differential genetic signatures in women who deliver preterm. Our cross-study meta-analyses revealed 210 differentially expressed genes, 15 of which were found to be secreted in the plasma. Interestingly, 18 of these 210 genes also demonstrated differential expression in the second trimester, suggesting a possibility for early identification of patients who might deliver preterm. IL-1R1 and TFPI, both of which encode immune-related proteins, were found to be differentially expressed and secreted longitudinally. CB analysis also revealed significant differential gene expression, with clustering in immune-related pathways. In contrast to preterm maternal whole blood, which showed upregulation of innate immunity and downregulation of adaptive immunity, CB showed downregulation in innate immunity. This juxtaposition, as well as the heavy involvement of immune-related pathways and biomarkers, brings to light novel findings which coincide with previous literature.

Leveraging Transcriptomics to Identify New Biomarkers for sPTB

There is a crucial need to find biomarkers for PTB. There are classic negative predictors such as the absence of fetal fibronectin in the cervicovaginal fluid, but they are less useful as a routine screening tool to identify women with high risk of PTB (45–47). Identifying biomarkers predictive of PTB in maternal blood seems like an easier target as blood is easily accessible and can be collected in most women as part of the standard prenatal care (27). In our study, we found nine upregulated genes that encode secreted proteins in humans (48). These markers may be further investigated regarding their value as biomarkers for identifying women at high risk for PTB, especially IL-1R1 and TFPI, which were significantly overexpressed among PTB cases as early as the second trimester. IL-1 receptor type I belongs to the IL-1 family of receptors, which contains 10 distinct but related gene products, all of which are heavily involved in the innate immune response. This receptor has a variety of ligands which are involved in the initiation (IL-1α and IL-1β) and inhibition (IL-1Ra) of the immune and inflammatory responses (49). IL-1α belongs to a group of dual-function cytokines, constitutively present inside cells under normal homeostatic conditions and playing a role as a transcription regulator to trigger inflammation and immunity extracellularly (50). This ligand has been shown to induce an inflammatory response in the absence of infection and is responsible for the stimulation and release of IL-1β from monocytes (51). Conversely, IL-1β is not expressed in homeostatic conditions and is active only upon cleavage of its precursor by caspase-1 (50). Although IL-1α is the initiator of sterile inflammation, IL-1β has been shown to play a role as an amplifier of inflammation (50, 51). The binding of either of these molecules to IL-1R1 leads to the activation of many transcription factors including nuclear factor-kappa B (NF-kB) and ultimately leads to an inflammatory response (49).
IL-1 receptor type I has been studied as one of the potential biomarkers to predict heart failure in hypertensive patients (52) and was proposed as a candidate molecular target for rheumatoid arthritis treatment (53). In the pregnancy space, IL-1R1 has been investigated in endometrial tissues and chorioamnionitis (54, 55) and has been found to be increased in PTBs stimulated by RU486 in rats (56). One study found aberrant placental expression of interleukin 1 receptor-like 1 (IL-1RL1) in PTB cases (compared with spontaneous term births); its mRNA transcripts were more readily detected in maternal plasma samples from these cases than in samples from gestational age-matched controls who delivered at term, suggesting IL-1RL1 as a candidate PTB-associated marker (57). Other cytokines have also been identified as PTB biomarkers, including IL6, IL-1β, and IL2 (26). In case of infection, blocking a single factor on the pathway may not be sufficient to prevent preterm delivery (58). Our finding suggests that IL-1R1 could be one of the detectable markers of the dysregulated inflammatory network associated with PTB that bears further investigation. The overall signature we observed is consistent with previously published literature supporting a role for the inflammasome and activation of the innate immune system in the onset of spontaneous preterm labor. For example, activation of the NLRP3 inflammasome, which ultimately results in increased levels of mature IL-1β, has also been implicated in patients (59). There are increased levels of IL-1β in the amniotic fluid of patients with preterm labor (60) as well as in the chorioamniotic membranes (59). A GWAS also reported that polymorphisms in the IL1R antagonist locus were associated with PTB (61, 62). In mouse models, introduction of IL1 can induce PTB by activating the innate immune system, and blockade of IL1R can abrogate this phenotype (63).
Given the extensive downstream effects of this signaling pathway in influencing neonatal morbidity in preterm infants (64), our findings have clinical relevance for discovering targetable molecular pathways.
Tissue factor is a key element for normal gestation (65). The maternal plasma concentration of total TFPI, the main physiological inhibitor of the tissue factor-dependent pathway of blood coagulation, has been shown to increase during the first half of pregnancy, remain relatively constant in the remaining half, and decrease during labor (66–68). Different profiles of maternal plasma tissue factor and TFPI concentrations have been observed among several obstetrical syndromes including preeclampsia (69), preterm prelabor rupture of membranes (70), and small for gestational age (69).

Maternal and Fetal Signals Elucidate the Role of Adaptive and Innate Immunity in PTB

Immunity and inflammation have been shown to play an important role in parturition timing (71–75). Specifically, infection and breakdown of maternal–fetal tolerance (rejection) are the two most important factors in this respect. These have different associations with gestational age, with infection (76) mainly affecting early PTB and rejection (77) mainly affecting late PTB cases. Healthy pregnancy involves multiple tolerance mechanisms that prevent the maternal and fetal immune systems from recognizing and rejecting each other (78, 79), whereas preterm labor may result from a breakdown in maternal–fetal tolerance (12). Kourtis et al. conclude that aspects of innate immunity are maintained or enhanced during pregnancy, particularly during the second and third trimesters, and that there are decreases in adaptive immunity in later stages of pregnancy (80). Before labor, the maternal immune system modulates inflammatory signaling pathways to avoid rejection of the fetus. Conversely, in pregnancies with PTB, the fetal immune system might undergo activation, resulting in recognition and rejection of maternal antigens. The implications of pregnancy as a modulated immunological condition are vast, including prevention of fetal rejection, susceptibility to some infections, and maybe even PTB (80). The upregulated and downregulated gene signatures identified in the maternal meta-analysis demonstrate a clear enrichment in immune-related pathways. When looking at the regulation of the differentially expressed genes, we found that transcription factors regulating the differentially expressed genes were also immune-related transcription factors. Looking at cytokine data, we found IL-1-related pathways are upregulated during the third trimester for women who deliver preterm. This supports the upregulation of inflammatory pathways involving cytokines and their receptors among PTB cases reported by Heng et al.’s (28) study, whose data were included in the current meta-analysis.
Genes encoding IL-1α and IL-1β, two founding members of the IL-1 family that have played a central role in several autoinflammatory diseases (81–83), and other cytokines such as IL-6 are also upregulated in our study, despite not being statistically significant after multiple testing correction. Past research suggests that pro-inflammatory cytokines IL-1β and TNF-α play a primary role in inducing infection-associated PTB (58, 84). These findings are consistent with the literature and are more reflective of early rather than late PTB. The upregulated inflammatory-related pathways in this study may be in part attributed to clinical or sub-clinical infection. However, diagnosis of infection is often not available in population studies, which precludes further exploration of the contribution of infection in the observed signal. Based on the maternal data, we found that genes and cell types associated with innate immunity were upregulated in PTB while those relevant to adaptive immunity were downregulated in PTB. Genes identified in the fetal CB analysis showed enrichment in pathways that were immune related but the signature was flipped; innate immunity was downregulated in babies born preterm. One hypothesis is that the immune systems of women who deliver preterm are less responsive to specific foreign antigens such as infections which themselves could lead to PTB, while mothers whose adaptive immunity was stronger were able to maintain the pregnancy due to better immune coping mechanisms. Previously, polymorphisms of genes pertaining to the innate immune system were found to have only moderate effects on subsequent PTB, although they played a functionally relevant role in host immune response (85). On the other hand, babies that were born preterm showed a downregulation of innate immunity, which suggests opposing signals in the maternal and fetal immune tolerance but also could be a result of the incomplete development of immune defense. 
Since innate immunity serves as the first defense of the human immune system, weaker innate immunity signals could be indicative of vulnerability and susceptibility to life-threatening infections (86). There is some evidence that this homeostasis of the fetal–maternal immune tolerance can be perturbed during infection, resulting in immune activation, and the observed opposing signals can be indicative of the breakdown of the tolerance mechanism leading up to PTB. Specifically, there are several genes that are reversed in the maternal and fetal signatures. TLR5 was one of the genes we found to have opposing differential gene expression when comparing mother and fetus. While TLR5 showed lower expression in PTB CB samples, TLR5 was upregulated in PTB maternal whole blood samples. TLR5, as well as other toll-like receptors, plays an important role in pathogen recognition and subsequent activation of the inflammatory innate immune response. TLR5 (along with TLR2 and TLR3) has previously been implicated in regulation of pro-inflammatory and pro-labor responses in primary human myometrium cells (87). One of the downstream targets of this gene in the MyD88-dependent pathway is NF-kB, a critical transcription factor in the activation of genes related to immune and inflammatory responses (88–90). TLR5 has also been shown to increase production of various pro-inflammatory interleukins including IL-6 and IL-8 (87, 91). The importance of TLR5 in pregnancy and its association with PTB have been shown repeatedly. The TLR5 (g.1174C>T) variant, which encodes a non-functional protein, is significantly associated with development of severe bronchopulmonary dysplasia in very low-birth weight infants born prematurely. This evidence shows that the non-functionality of TLR5 in preterm infants results in an insufficient immune response to flagellated bacteria (92).
Furthermore, TLR5 mRNA expression has repeatedly been found to be increased in the placenta following spontaneous term labor (91, 93).
This study has several limitations that may be encountered in other similar studies. First, we were limited by the number of studies with publicly available data that could be aggregated together for our meta-analysis. In addition, a common shortcoming of using publicly available data is that samples lacked demographic information as well as detailed clinical annotations. Furthermore, samples included in our study are heterogeneous as they came from studies with different designs (cohort or case–control), phenotypes (late and early PTB), and populations (dataset GSE46510 consisting of samples from women with threatened preterm labor). Yet, the current comparison between sPTB cases and term birth controls is likely to underestimate the underlying difference in gene expression profile between the two groups due to the inclusion of symptomatic women non-differentially as both cases and controls (increasing the baseline risk of sPTB among the controls) as well as confounding factors such as infection and other obstetric complications. In addition, although we propose several potential novel biomarkers, our data are limited in discerning whether the differential expression signatures observed reflect the membrane-bound proteins or their secreted isoforms. However, despite these drawbacks, this paper presents novelty in being the largest published meta-analysis of PTB transcriptomics using publicly available data to date. Since PTB samples are difficult to obtain, the ability to aggregate data by using standardized methods to correct for heterogeneity is exciting since it increases our statistical power and, as a result, allows for the discovery of novel pathways and biomarkers.
For example, the pathways associated with sPTB and potential biomarkers for indication of early switch to a pro-inflammatory state of the maternal immune system could have been missed in the single-study analysis due to not reaching statistical significance: only 26 significant genes were identified (FDR < 0.10) in the GSE59491 study and no significant genes were identified (FDR < 0.10) in the GSE73685 study (analysis of maternal blood sample only). Although additional validation is needed, we hope that this paper informs the design and interpretation of clinical biomarker studies. Furthermore, we hope that this meta-analysis incentivizes others to add their data to public repositories with the goal of creating a more comprehensive database for PTB. This paper presents several future directions including validation of the observed cell type signals through methods such as flow cytometry and CyTOF (94) as well as further exploring the presented transcriptomic signatures for diagnostics and therapeutics. We may be able to validate TFPI and IL-1R1 in additional datasets collected prospectively in combination with clinical data and direct analysis of cell types to correlate with findings in plasma. In addition, staining and imaging of these two proteins in preterm and term whole blood samples can elucidate their sub-cellular location and potential as clinical biomarkers. This could lead to additional large animal studies to identify pathways whose inhibition could be beneficial and efficacious, similar to IL1 signaling blocked by Anakinra in rheumatoid arthritis. Furthermore, although we evaluated the effect of cell type proportions as a covariate for our differential expression analysis, future studies involving single cell or sorted cell analysis will be much more informative.

Conclusion

Overall, our comprehensive analysis using publicly available data was able to elucidate genetic signatures associated with sPTB as well as identify potential biomarkers that could be translated to clinical practice. The novel finding of the reversal of regulation in innate immunity in maternal blood samples relative to fetal blood samples in PTB brings to light potential mechanisms that may be at play, which may allow for the prediction of sPTB as well as the development of therapeutics to extend pregnancy to term. In addition, the identification of two potential biomarkers, TFPI and IL-1R1, which are differentially expressed starting at mid-gestation, opens the possibility of clinically diagnostic biomarkers which may identify women at risk for PTB.

Materials and Methods

Study Design

The purpose of this study was to perform a cross-study meta-analysis using multiple independent datasets to identify differentially expressed genes comparing mothers who deliver preterm to mothers who deliver at term using maternal whole blood samples. Additional analyses across different time points and various tissue types were also performed to investigate differential expression between these two groups (Figure 1). We searched the NCBI GEO database for public human microarray genome-wide expression studies using search terms including PTB and premature (23, 24). Abstracts were screened and only studies that met the following criteria were included: (i) had both spontaneous preterm cases and term delivery controls in the same study, (ii) included samples collected before or at delivery, and (iii) had a sample size of 20 or more. We used samples from maternal blood for our main analysis because this tissue had the most samples. Classification of samples as PTB or term birth was extracted from sample matrices downloaded from the GEO database.

Cross-Study Meta-Analysis

We implemented the meta-analysis pipeline by Hughey and Butte (21) for data processing and normalization. All microarray data were renormalized from raw data and merged based on genes meeting two criteria: those with non-missing values and those which were mutually inclusive across all three studies. The merged dataset was subsequently corrected for study-specific effects using ComBat, which implements an empirical Bayes method to correct for study-specific biases and batch effects by performing cross-study normalization (29). An F-test was performed to test the equality of variance across the three studies. Differential gene expression analysis was performed on this normalized merged dataset of genes to obtain the significance level (p-value) of each gene using the R package limma, which fits linear models to expression data for each gene (95). We corrected for multiple hypothesis testing using the Benjamini and Hochberg (i.e., FDR) method (96) with a pre-specified cutoff of 0.1 to identify more significant genes. The effect size of each gene is expressed as fold-change (FC), which was calculated for each study separately using the raw expression data before ComBat. Samples were divided with respect to preterm or term delivery and mean gene expression was calculated for each gene meeting the FDR < 0.1 cutoff. The logged (base 2) average expression values were used to calculate fold-change [FC = 2^(average expression for preterm samples − average expression for term samples)]. We further filtered the significant genes that met the FDR < 0.1 criterion using a significance threshold of FC > 1.3 for upregulated genes or FC < 1/1.3 for downregulated genes (22). We obtained a list of genes that showed the largest fold-change and were most differentially expressed when comparing preterm and term births in the respective study.
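The two filtering steps described above (Benjamini–Hochberg correction followed by a fold-change threshold) can be sketched in a few lines. This is an illustrative Python sketch only; the study itself used R and limma, and the function names and toy numbers here are assumptions introduced for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is the minimum of p * m / rank over all equal-or-higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def fold_change(mean_log2_preterm, mean_log2_term):
    """FC = 2^(mean log2 expression, preterm - mean log2 expression, term)."""
    return 2.0 ** (mean_log2_preterm - mean_log2_term)


def passes_fc_cutoff(fc, cutoff=1.3):
    """True for FC > cutoff (upregulated) or FC < 1/cutoff (downregulated)."""
    return fc > cutoff or fc < 1.0 / cutoff


# Hypothetical toy inputs, not data from the study.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
fc_up = fold_change(8.5, 8.0)    # log2 difference of +0.5
fc_flat = fold_change(8.1, 8.0)  # log2 difference of +0.1
```

A gene would be retained only if its adjusted p-value is below 0.1 and `passes_fc_cutoff` returns true for its fold-change in the study under consideration.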
A final gene set was compiled by combining the significant genes from all three studies; if a gene met the FC cutoff of 1.3 in at least one of the studies, it was included in the final gene data set. To investigate the relevance of these results, we performed a gene list functional enrichment analysis using ToppFun (34) to identify pathways in which our genes were involved and which met a cutoff of FDR < 0.05, evaluated connectivity using the STRING database (31–33), explored biomarker identification using the UniProtKB database (35), and executed a cell type enrichment analysis using xCell (37). We extracted the significant cell types from the xCell output by performing a Student’s two-sided t-test and subsequent Benjamini and Hochberg (96) multiple testing correction (FDR < 0.05). A more stringent cutoff was used for cell deconvolution to extract the cell types that were robustly, significantly different between the two groups. We utilized samples collected at the second trimester from GSE59491 (28) and samples from multiple tissues from GSE73685 and performed (1) single-study analysis and (2) tissue-level analyses to further investigate the common differentially expressed gene signatures across time points and different tissue types.
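The rule for compiling the final gene set (keep an FDR-significant gene if it meets the FC cutoff in at least one study) can be sketched as follows. The function name, gene labels, and fold-change values are hypothetical and serve only to illustrate the "at least one study" logic.

```python
def final_gene_set(fc_by_study, significant_genes, cutoff=1.3):
    """Keep FDR-significant genes whose fold-change meets the cutoff
    (FC > cutoff or FC < 1/cutoff) in at least one study."""
    kept = set()
    for gene in significant_genes:
        for study_fc in fc_by_study.values():
            fc = study_fc.get(gene)
            if fc is not None and (fc > cutoff or fc < 1.0 / cutoff):
                kept.add(gene)
                break  # one qualifying study is enough
    return kept


# Hypothetical per-study fold-changes (gene -> FC, preterm vs. term).
fc_by_study = {
    "study_1": {"GENE_A": 1.10, "GENE_B": 1.05, "GENE_C": 0.70},
    "study_2": {"GENE_A": 1.45, "GENE_B": 1.20, "GENE_C": 0.95},
    "study_3": {"GENE_A": 1.25, "GENE_B": 0.90},
}
selected = final_gene_set(fc_by_study, ["GENE_A", "GENE_B", "GENE_C"])
```

In this toy case GENE_A qualifies through study_2 and GENE_C through study_1, whereas GENE_B never crosses either threshold and is dropped.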

Single-Study Second Trimester Analysis

To perform a single-study gene expression analysis on the second trimester samples from GSE59491, we merged the gene expression values across all studies, extracted the second trimester samples from GSE59491, and implemented a linear model on that subset of samples. After calculating the p-value for each gene, we filtered our list of genes using the output from the prior cross-study meta-analysis; this resulted in a list of overlapping genes which were previously found to be significant in the third trimester. To determine whether these genes were significantly differentially expressed in the second trimester as well, we corrected the raw p-values for multiple hypothesis testing for this subset of genes using the Benjamini and Hochberg method and imposed an FDR < 0.1 (96). As an exploratory analysis, we input the genes passing the FDR < 0.1 cutoff into the UniProtKB database to determine which genes are secreted as proteins in humans (35).
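The key design choice in this step is that multiple-testing correction is applied only to the genes carried over from the meta-analysis, so the number of tests equals the size of that subset rather than the whole genome. A minimal sketch under that reading, with hypothetical names and p-values:

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def second_trimester_hits(p_by_gene, meta_genes, fdr_cutoff=0.1):
    """Restrict raw p-values to the meta-analysis genes, then correct
    across that subset only and keep genes below the FDR cutoff."""
    subset = sorted(g for g in meta_genes if g in p_by_gene)
    adjusted = bh_adjust([p_by_gene[g] for g in subset])
    return {g: q for g, q in zip(subset, adjusted) if q < fdr_cutoff}


# Hypothetical raw second-trimester p-values for five genes; only three
# of them were significant in the (third trimester) meta-analysis.
raw_p = {"A": 0.001, "B": 0.02, "C": 0.2, "D": 0.0005, "E": 0.03}
hits = second_trimester_hits(raw_p, meta_genes={"A", "B", "C"})
```

Genes D and E are ignored even though their raw p-values are small, because they were not part of the meta-analysis signature.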

Regulatory Network Analysis

Transcription factor regulation networks and cytokine networks were analyzed through the use of Upstream Regulator analytic in IPA (QIAGEN Inc., https://www.qiagenbioinformatics.com/products/ingenuitypathway-analysis). Only significant connections are included in our networks and all connections are based on prior knowledge in IPA Knowledge Base. The transcription factors were filtered and only those that were expressed in a specific trimester were kept, to allow as much accuracy in results as possible.

Tissue-Level Analyses

To evaluate differential gene expression at a higher resolution, we performed gene expression analyses at an individual tissue level. After combining all tissue data into a merged dataset containing all eight tissue types based on mutual genes, we extracted each tissue and created eight tissue-specific datasets for linear model fitting using limma (95). After correcting the raw p-values from limma using the Benjamini–Hochberg method for multiple hypothesis testing (96), seven of the eight tissues showed no significantly differentially expressed genes at an FDR < 0.05; however, one tissue type, cord blood (CB), yielded genes that met the FDR criterion. A more stringent FDR cutoff was used to delineate the genes with the strongest differential expression between the two groups, since an FDR < 0.1 resulted in 1,035 differentially expressed genes. These genes were further filtered by imposing a fold-change cutoff of 1.3 (22), which resulted in a list of significantly differentially expressed genes; the relevance of these genes was explored by performing pathway analysis using ToppFun, in which an FDR < 0.05 was used as a cutoff (34). All the results and the data are available as an RShiny Application for the benefit of the research community: http://comphealth.ucsf.edu/preterm_transcriptomics/.
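As a rough illustration of merging on mutual genes and then splitting by tissue, the sketch below keeps only genes measured in every sample and returns one dataset per tissue. The nested-dictionary layout, the function names, and the tissue and gene labels are assumptions made for this example; the study fit its models with limma in R, and that model fitting is not reproduced here.

```python
from typing import Dict, List

# Assumed layout: {tissue: {sample_id: {gene: expression value}}}
Expression = Dict[str, Dict[str, Dict[str, float]]]


def mutual_genes(data: Expression) -> List[str]:
    """Genes measured in every sample of every tissue."""
    gene_sets = [
        set(profile) for samples in data.values() for profile in samples.values()
    ]
    return sorted(set.intersection(*gene_sets)) if gene_sets else []


def tissue_datasets(data: Expression) -> Expression:
    """One dataset per tissue, restricted to the mutual genes."""
    genes = mutual_genes(data)
    return {
        tissue: {
            sample_id: {g: profile[g] for g in genes}
            for sample_id, profile in samples.items()
        }
        for tissue, samples in data.items()
    }


# Toy expression values for illustration only.
toy = {
    "tissue_A": {
        "s1": {"GENE1": 7.2, "GENE2": 5.1, "GENE3": 9.0},
        "s2": {"GENE1": 6.8, "GENE2": 5.5, "GENE3": 8.7},
    },
    "tissue_B": {
        "s3": {"GENE1": 4.1, "GENE3": 3.3, "GENE4": 2.0},
    },
}
per_tissue = tissue_datasets(toy)
print(mutual_genes(toy))
```

Each per-tissue dataset then shares the same gene axis, so results from separate model fits can be compared gene by gene.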

Data Availability Statement

The datasets analyzed for this study can be found in the National Institute of Health Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/). All datasets as well as the results generated in this study are available on the RShiny application (http://comphealth.ucsf.edu/preterm_transcriptomics/) and the ImmPort database (accession: SDY1327).

Author Contributions

BV, MS, and AW conceived of the study. BV, AW, and IK carried out data analysis. IP carried out data visualization and app development. All authors contributed to writing and editing the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
