Francesca Gaccioli1,2, Susanne Lager1,2,3, Marcus C de Goffau4, Ulla Sovio1,2, Justyna Dopierala1,2,5, Sungsam Gong1, Emma Cook1, Andrew Sharkey2,6, Ashley Moffett2,6, Wai Kwong Lee7, Christian Delles7, Cristina Venturini8, Judith Breuer8,9, Julian Parkhill4, Sharon J Peacock10,11, D Stephen Charnock-Jones1,2, Gordon C S Smith12,13. 1. Department of Obstetrics and Gynaecology, University of Cambridge, Cambridge, UK. 2. Centre for Trophoblast Research, University of Cambridge, Cambridge, UK. 3. Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden. 4. Department of Veterinary Medicine, University of Cambridge, Cambridge, UK. 5. Functional Genomics, GlaxoSmithKline Limited, Stevenage, UK. 6. Department of Pathology, University of Cambridge, Cambridge, UK. 7. Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, UK. 8. Division of Infection and Immunity, University College London, London, UK. 9. Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK. 10. Department of Medicine, University of Cambridge, Cambridge, UK. 11. London School of Hygiene and Tropical Medicine, London, UK. 12. Department of Obstetrics and Gynaecology, University of Cambridge, Cambridge, UK. gcss2@cam.ac.uk. 13. Centre for Trophoblast Research, University of Cambridge, Cambridge, UK. gcss2@cam.ac.uk.
Abstract
Pre-eclampsia (typically characterized by new-onset hypertension and proteinuria in the second half of pregnancy) represents a major determinant of the global burden of disease1,2. Its pathophysiology involves placental dysfunction, but the mechanism is unclear. Viral infection can cause organ dysfunction, but its role in placentally related disorders of human pregnancy is unknown3. We addressed this using RNA sequencing metagenomics4-6 of placental samples from normal and complicated pregnancies. Here, we show that human herpesvirus 6 (HHV-6, A or B) RNA was detected in 6.1% of cases of pre-eclampsia and 2.2% of other pregnancies. Fetal genotyping demonstrated that 70% of samples with HHV-6 RNA in the placenta exhibited inherited, chromosomally integrated HHV-6 (iciHHV-6). We genotyped 467 pre-eclampsia cases and 3,854 controls and found an excess of iciHHV-6 in the cases (odds ratio of 2.8, 95% confidence intervals of 1.4-5.6, P = 0.008). We validated this finding by comparing iciHHV-6 in a further 740 cases with controls from large-scale population studies (odds ratio of 2.5, 95% confidence intervals of 1.4-4.4, P = 0.0013). We conclude that iciHHV-6 results in the transcription of viral RNA in the human placenta and predisposes the mother to pre-eclampsia.
Pre-eclampsia (typically characterized by new-onset hypertension and proteinuria in the second half of pregnancy) represents a major determinant of the global burden of disease1,2. Its pathophysiology involves placental dysfunction, but the mechanism is unclear. Viral infection can cause organ dysfunction, but its role in placentally related disorders of human pregnancy is unknown3. We addressed this using RNA sequencing metagenomics4-6 of placental samples from normal and complicated pregnancies. Here, we show that human herpesvirus 6 (HHV-6, A or B) RNA was detected in 6.1% of cases of pre-eclampsia and 2.2% of other pregnancies. Fetal genotyping demonstrated that 70% of samples with HHV-6 RNA in the placenta exhibited inherited, chromosomally integrated HHV-6 (iciHHV-6). We genotyped 467 pre-eclampsia cases and 3,854 controls and found an excess of iciHHV-6 in the cases (odds ratio of 2.8, 95% confidence intervals of 1.4-5.6, P = 0.008). We validated this finding by comparing iciHHV-6 in a further 740 cases with controls from large-scale population studies (odds ratio of 2.5, 95% confidence intervals of 1.4-4.4, P = 0.0013). We conclude that iciHHV-6 results in the transcription of viral RNA in the human placenta and predisposes the mother to pre-eclampsia.
Preeclampsia and fetal growth restriction (FGR) are major causes of maternal and perinatal morbidity and mortality. We hypothesized that viral infection could be an underlying cause of the placental dysfunction which characterises these pregnancy complications. In order to identify viral sequences from both RNA viruses and actively replicating DNA viruses, we analyzed non-human reads in 279 RNA-seq datasets of placental samples from 99 cases of preeclampsia, 48 cases of FGR and 132 controls (Extended Data Figure 1). Using this discovery-based approach, the only clear viral signal was HHV-6, which was detected in 10 samples (Table 1 and Extended Data Figure 2). Targeted qPCR-based analysis of 12 viruses including HHV-6, confirmed these findings (Supplementary Information). 6.1% (6/99) of the preeclampsia cases and 2.2% (4/180) of the patients without preeclampsia were HHV-6 positive (Table 2).
Extended Data Fig. 1
Characteristics of the study groups in the Pregnancy Outcome Prediction (POP) study.
Data are expressed as median (IQR) or n (%) as appropriate. The overall rate of preeclampsia in these participants was 6.4%. For fields where there is no category labelled “missing”, data were 100% complete. Maternal age was defined as age at recruitment. All other maternal characteristics were defined by self-report at the 20 weeks questionnaire, from examination of the clinical case record, or linkage to the hospital’s electronic databases. Socio-economic status was quantified using the Index of Multiple Deprivation (IMD) 2007, which is based on census data from the area of the mother’s postcode. Stillbirths (n=8) and spontaneous preterm deliveries (n=100) were included in the analysis, while miscarriages (n=7) and terminations of pregnancy (n=11) were excluded. Abbreviations: non-cases denote patients without preeclampsia; FTE denotes full time education; BMI denotes body mass index; DM denotes diabetes mellitus.
Table 1
Detection of HHV-6 RNA and/or DNA in infant and parental samples.
Patient
Status
Placental HHV-6 RNA(RNA-seq)
Placental HHV-6 DNA(qPCR)
Parental HHV-6 DNA(qPCR)
Virus
Reads*
Virus
Maternal
Paternal
Infant_1
CON
HHV-6B
331
iciHHV-6B
iciHHV-6B
negative
Infant_2
CON
HHV-6B
259
iciHHV-6B
negative
iciHHV-6B
Infant_3
FGR
HHV-6A
77
iciHHV-6A
iciHHV-6A
negative
Infant_4
CON
HHV-6B
43
iciHHV-6B
negative
iciHHV-6B
Infant_5
PE
HHV-6A
22
negative
negative
negative
Infant_6
PE
HHV-6B
9
iciHHV-6B
negative
iciHHV-6B
Infant_7
PE
HHV-6B
7
iciHHV-6B
iciHHV-6B
negative
Infant_8
PE
HHV-6A
3
iciHHV-6A
negative
iciHHV-6A
Infant_9
PE
HHV-6A
1
negative
negative
negative
Infant_10
PE
HHV-6B
1
negative
negative
negative
iciHHV-6 corresponds to a high HHV-6 DNA signal in the sample measured by qPCR, i.e. within 4 cycles of the RNase P signal. Negative indicates viral DNA not detected. Placental HHV-6 denotes HHV-6 studied in placental samples; parental HHV-6 denotes HHV-6 studied in parental samples; CON denotes a healthy pregnancy without FGR or preeclampsia (see Methods); FGR denotes fetal growth restriction; PE denotes a patient with preeclampsia; iciHHV-6 denotes inherited chromosomally integrated human herpesvirus 6; HHV-6A denotes human herpesvirus 6, variant A; HHV-6B denotes human herpesvirus 6, variant B. *Reads indicates the number of sequencing reads identified by the Kraken software as aligning to the HHV-6 genome.
Extended Data Fig. 2
Placental RNA-seq reads mapped to HHV-6A and HHV-6B genomes.
Placental HHV-6 positive samples identified by RNA-seq. Reads were identified by Kraken as aligning to the HHV-6 genomes and mapped with BWA to the HHV-6A or HHV-6B reference genomes (Supplementary Methods); note that the total number of reads/sample recognized by the two software is not always identical. CON denotes a healthy pregnancy without FGR or preeclampsia (see Methods); FGR denotes fetal growth restriction; PE denotes a patient with preeclampsia; HHV-6A denotes human herpesvirus 6, variant A; HHV-6B denotes human herpesvirus 6, variant B; DRL: direct repeat left; DRR: direct repeat right. Repetitive regions are in Italic. HHV-6A and HHV-6B genomes have been described by Dominguez G et al[53].
Table 2
Datasets described in the current work.
Dataset description
Dataset name
Study
Method
Total
PE status
HHV-6 positive (%)
RNA-seq
RNA-seq
POP study
Placental RNA-seq
279
Case
99
6
(6.1)
Non-case
180
4
(2.2)
POP study minus RNA-seq
POP study
Cord DNA qPCR
3,568
Case
150
2
(1.3)
Non-case
3,418
25
(0.7)
POP study all
POP study
Cord DNA qPCR
3,847
Case
249
5*
(2.0)
Non-case
3,598
29
(0.8)
Case control
Case control
Cord DNA qPCR
474
Case
218
5
(2.3)
Non-case
256
1
(0.4)
POP study minus RNA-seq + Case control
Other
POP study + Case control
Cord DNA qPCR
4,042
Case
368
7
(1.9)
Non-case
3,674
26
(0.7)
POP study+ Case control
All
POP study + Case control
Cord DNA qPCR
4,321
Case
467
10
(2.1)
Non-case
3,854
30
(0.8)
GOPEC
GOPEC
GOPEC
Cord DNA qPCR
740
Case
740
12
(1.6)
Non-case
n/a
Healthy/control patients from large population studies
Total
Various**
Various**
Case
n/a
Non-case
61,549
403
(0.7)
Datasets “POP study minus RNA-seq”, “POP study all” and “Case control” were not analyzed as separate groups. The column “Dataset name” refers to the datasets described in Figure 2. POP study denotes the Pregnancy Outcome Prediction study; GOPEC denotes the Genetics of Preeclampsia Consortium; PE denotes preeclampsia; Case denotes a pregnancy affected by preeclampsia; Non-case denotes a pregnancy without preeclampsia; n/a denotes not applicable. * The sum of the two rows above for cases is 6+2=8 whereas this cell states 5. The difference between 8 and 5 is explained by the 3 cases of preeclampsia where the placenta was positive for HHV-6 by RNA-seq but there was no ciHHV-6. ** The studies are listed in the Methods.
Between 0.2% and 1% of humans carry a copy of HHV-6 integrated into the telomeric region of a chromosome in every cell of their body, including the germ cells. Inherited, chromosomally integrated HHV-6A and HHV-6B (iciHHV-6A and iciHHV-6B, respectively) can be transmitted from either the mother or father to the fetus in Mendelian fashion[7,8]. Hence, we investigated whether HHV-6 DNA was detected in the placental and parental DNA samples of the 10 HHV-6 positive patients identified by RNA-seq. Of these 10 placental samples, 7 were strongly positive for HHV-6 DNA, consistent with viral chromosomal integration (Table 1 and Extended Data Figure 3). In all 7 cases with a high placental DNA signal, one of the parental DNA samples was also strongly positive for the virus, consistent with inherited chromosomal integration.
Extended Data Fig. 3
HHV-6 detection in fetal and parental samples.
HHV-6A and HHV-6B representative signals in cord (A) and parental (B) DNA samples detected using a multiplex qPCR approach. These analyses were performed in 5,061 and 86 samples, respectively, and each sample was analyzed in triplicate. qPCR amplification curves for the HHV-6A and HHV-6B 9 U67/68 genes are represented in green and red, respectively; RNase P curves are in blue and confirmed presence of DNA in the wells. ciHHV-6 corresponds to a high HHV-6 DNA signal in the sample measured by qPCR, i.e. within 4 cycles of the RNase P signal. HHV-6 non-integrated corresponds to a HHV-6 DNA signal in the sample detected at more than 4 Ct higher compared to the RNase P signal. Negative samples lack HHV-6A or HHV-6B DNA signal. C) RT-qPCR amplification plot of placental RNA samples showing detection of the HHV-6 U100 gene. Eight representative samples are shown, two with viral transcript amplification (total n=48 samples, each analyzed in triplicate). Five negative controls (samples without reverse transcriptase enzyme in the RT reaction) lacked U100 amplification (not shown). U100 curves are in black and RNase P curves are in blue. Rn (normalized reporter value) represents the fluorescence of the reporter dye normalized to the signal of the passive reference dye for a given reaction. The ΔRn is the Rn value of an experimental reaction minus the Rn value of the baseline signal generated by the instrument. This parameter indicates the magnitude of the fluorescent signal generated in the qPCR assay. ciHHV-6 denotes chromosomally integrated human herpesvirus 6; HHV-6A denotes human herpesvirus 6, variant A; HHV-6B denotes human herpesvirus 6, variant B; RNase P denotes the human positive control gene RPPH1.
We next performed deep sequencing of the HHV-6B genome in 9 parent and offspring pairs with high levels of HHV-6B DNA (these analyses included some additional HHV-6 positive samples identified by genotyping the entire POP study cohort – see below). We created “barcode” graphics of the viral genome by indicating the presence or absence of 187 informative SNPs as a black or white vertical line, respectively (Figure 1A and Extended Data Figure 4). These analyses demonstrated 100% concordance between the offspring and parent HHV-6B genomes. There were 999 informative SNPs in the HHV-6A genome. These SNPs were 100% concordant between 2 parent and offspring pairs with ciHHV-6A (not shown). As a proof of principle of chromosomal integration, we analyzed one parent and offspring pair in detail and identified the ciHHV-6B integration site at the telomeric side of 4p16.3 in both the parental and offspring genomes (Supplementary Information). This integration site has an identical nucleotide sequence to a previously described insertion site for ciHHV-6A[9]. Relative qPCR quantification and sequencing reads spanning the HHV-6 genome and the flanking human telomere demonstrated viral chromosomal integration in the placental samples analyzed.
Figure 1
SNP analysis of HHV-6B genome sequenced in fetal and parental samples.
A) Comparison of HHV-6B genomes sequenced in different samples revealed 187 informative SNPs, represented as black or white lines in the “barcode” graph if concordant or discordant to the HHV-6B reference genome (GCF_000846365.1), respectively. B) Comparison of placental and parental SNPs based on RNA-seq and DNA-seq reads aligning to the HHV-6B gene U100, which codes for glycoprotein Q (gQ).[44] The genes of the HHV-6B reference genome are in light green in the upper part of the panel, and the U100 gene is indicated in dark green; repeat regions, including the two large direct repeat regions on both termini, are indicated in amber. The RNA-seq coverage of the U100 gene (highlighted in light-red and enlarged) is shown for one placental sample (Infant 2 in Table 1 and Extended Data Figure 2) and represented by the surface area of the black peaks. The DNA-seq reads obtained from the corresponding paternal sample are shown in the bottom part of the panel. Red vertical lines indicate positions where both the infant RNA and the paternal DNA concordantly differ from the HHV-6B reference genome.
Extended Data Fig. 4
Identification of informative HHV-6B SNPs.
DNA-seq reads of 2 randomly selected samples were compared to the HHV-6B reference genome. *Informative SNP sites, i.e. SNPs present in just one of the two analyzed samples (gapped vertical red lines). **SNPs present in both analyzed samples, i.e. sites concordantly different from the reference genome, were considered not informative (continuous vertical red lines). Throughout the 162kb HHV-6B genome 187 SNPs were classified as informative.
We also compared the SNP concordance between the chromosomally integrated DNA in the parent and the RNA detected in the placenta, using a case where there was high RNA-seq coverage of the U100 viral gene (Figure 1B). There was concordance between the SNPs present in the placental RNA and the parent’s DNA, demonstrating that the placental viral RNA was encoded by the inherited viral genome. This concordance could not be explained by maternal contamination as the case in question involved fetal inheritance of ciHHV-6B from the father. There were 16 offspring with iciHHV-6 on the basis of analysis of umbilical cord DNA (described below) and an associated RNA sample, and viral RNA was detected in the placenta of 11 (69%) of these (Extended Data Figure 3). Hence, the RNA-seq and qPCR analyses led us to conclude that HHV-6 was the only clear viral signal observed in a large number of case and control placentas and that there was no evidence suggesting an association between preeclampsia or fetal growth restriction and any virus other than HHV-6. Furthermore, SNP analysis indicated that the viral genomes in the parent and the child were identical and that the viral RNA expressed in the placenta also contained the same SNPs observed in the parent. This suggests that, when present, RNA from the virus was usually, but not invariably, associated with iciHHV-6.The initial analysis of cases and controls using RNA-seq suggested an association between HHV-6 RNA positivity in the placenta and preeclampsia, but replication was required to strengthen the evidence. As we had offspring DNA samples available from 92% of the POP study cohort and from an external case control study, we then sought to determine the association between fetal inheritance of ciHHV-6 (which also refers to placental inheritance of the virus as the placental genome comes from the zygote) and the risk of preeclampsia in a separate study group, which excluded the patients analyzed by RNA-seq. The two sources included 368 cases of preeclampsia and 3,674 pregnancies without preeclampsia. The proportions where the umbilical cord DNA was positive for iciHHV-6 were 1.9% (7/368) and 0.7% (26/3,674), respectively (P=0.022) (Table 2). When we analyzed the whole of the POP study cohort and the external case control study combined, iciHHV-6 positive samples were 2.1% (10/467) and 0.8% (30/3,854) in cases and non-cases, respectively (P=0.008). Among the women from the POP and the case control studies (n=4,321), we had a 74% power to detect a difference of 1.36% in the proportions of iciHHV-6 positive samples between cases and controls at alpha=0.05. Fetal inheritance of ciHHV-6 was associated with a three-fold increased risk of preeclampsia (odds ratio (OR) 2.8, 95% CI: 1.4 to 5.6, P=0.008, Figure 2A-B).
Figure 2
Fetal inheritance of ciHHV-6 and the risk of preeclampsia.
A) Proportion of iciHHV-6 positive samples in preeclampsia cases (black bars) and non-cases (white bars) in the following datasets: placental RNA samples within the POP study analyzed by RNA-seq (“RNA-seq”, n=279); all samples analyzed by cord DNA genotyping, excluding those in the first group, i.e. with RNA-seq data available (“Other”, n=4,042); the combined study population, i.e. all the genotyped cord DNA samples from both studies (“All”, n=4,321). n represents the number of patients analyzed in each group. For the 3 datasets “RNA-seq”, “Other” and “All” the P values for the association between fetal iciHHV-6 and the risk of preeclampsia were 0.14, 0.022 and 0.008, respectively, using the recommended Fisher-Boschloo unconditional exact test[51] (2-sided); and 0.17, 0.026 and 0.008, respectively, using the more widely used Fisher’s exact test (2-sided). B) Odds ratios and 95% confidence intervals (CI) for preeclampsia by the presence of iciHHV-6 calculated in each dataset and in the pooled study population. C) Proportion of iciHHV-6 positive samples in the following datasets: large-scale population studies (n>1,000) of healthy/control patients currently available in the literature (white symbols; see Methods); pooled values of the healthy/control populations (dataset “Total”, grey symbol); GOPEC samples (dataset “GOPEC”, black symbol). Symbols represent the proportion of iciHHV-6 positive samples with 95% CI; n represents the number of patients involved in each study; the location where the studies were conducted is in parenthesis.
We performed a further replication of the analysis by genotyping cord DNA samples from 740 cases of preeclampsia recruited by the Genetics of Preeclampsia Consortium (GOPEC)[10]. As the GOPEC cohort did not include preeclamspia non-cases, the proportion with iciHHV-6 in this cohort was compared with a meta-analysis of other large-scale, population-based studies of ciHHV-6 in control patients. There was no overlap between the cases or controls used in the prior experiments and the GOPEC cases or the meta-analysis. The proportion of iciHHV-6 positive samples in the GOPEC study was 1.6% (12/740) and the summary proportion of ciHHV-6 derived from the meta-analysis of large-scale population studies was 0.7% (403/61,549) (Table 2 and Figure 2C). This analysis again demonstrated a two to three fold risk of preeclampsia associated with iciHHV-6 (OR=2.5, 95% CI: 1.4 to 4.4, P=0.0013). Although the meta-analysis indicated heterogeneity in the background rate of ciHHV-6 between the different studies, the proportion of iciHHV-6 in the GOPEC cohort fell outside the 95% CI of all the individual studies employed. Comparable results were obtained after exclusion of the two Japanese populations (Tanaka-Taya 2014 and Miura 2018; see Supplementary Information). Moreover, the iciHHV-6 positive cases within the GOPEC study were geographically dispersed, i.e. from 5 of the 9 recruitment centres. Collectively, the analysis of the GOPEC study and the meta-analysis of population studies provided strong confirmatory evidence. Further replication studies will require large sample sizes. We estimate that a case control study drawn from a single population would require approximately 1,900 cases and 1,900 controls to identify an OR=2.5 (80% power and alpha=0.05).Given the association between iciHHV-6 and the risk of preeclampsia in three groups, the POP study, the Cambridge case control study and the GOPEC study, we believe that the current analysis provides strong evidence for an association between placental HHV-6 and the risk of preeclampsia. Equally importantly, our study did not identify any other viral associations. Hence we conclude that a small proportion of cases of preeclampsia is likely to be due to inherited or de novo HHV-6, however, viral infection of the placenta is not a major determinant of the pathophysiology of the condition. Interestingly, the proportional increase in the risk of preeclampsia associated with iciHHV-6 was similar to the association with angina pectoris reported by a study of almost 20,000 Canadian adults[11].We have identified a number of areas for further study. The majority of cases of preeclampsia are associated with delivery at term and only a small proportion of pregnancies with preeclampsia result in preterm delivery[12]. There is some evidence that the pathophysiology of early onset and late onset disease may differ, hence future studies could address whether the association between preterm preeclampsia and placental HHV-6 may be stronger. However, this would require very large sample sizes to ensure adequate statistical power. Identification of placental HHV-6 RNA by RNA-seq relied on some samples expressing the viral RNA at very low levels (e.g. 3 samples had fewer than 5 reads). Importantly, RNA expression at term might not reflect levels earlier in gestation, which is relevant as a large body of evidence places the onset of the placental dysfunction associated with preeclampsia and FGR early in pregnancy. Moreover, expression levels in a particular area of the placenta, which reflects placental sampling at the time of collection, might not be representative of the whole organ. Therefore, we have viewed the presence of HHV-6 RNA in a binary fashion. The RNA-seq analysis also identified three placentas containing HHV-6 RNA in the absence of chromosomally integrated virus, and this may represent direct de-novo infection of the placenta by the virus, which could be the result of reactivation of HHV-6 in the mother. It was notable that all three were from cases of preeclampsia. However, the numbers were too small to draw reliable conclusions and two out of the three cases had only a single RNA read. Hence further studies are warranted to address the association between preeclampsia and HHV-6 infection of the placenta in the absence of chromosomally inherited virus. Analysis of cord blood IgM against HHV-6 might be a useful way to address this. Further research will also be required to determine the mechanism of association with iciHHV-6. The analysis of angina pectoris patients demonstrated that subjects with ciHHV-6 had shorter telomeres[11]. Although placental senescence might have a role in the pathophysiology of some pregnancy complications[13], there is currently no direct evidence of an association[14]. Viral infection can lead to a number of derangements of cellular function associated with preeclampsia, such as endoplasmic reticulum stress[15,16], and the potential for viral infection to adversely affect placental function has previously been reviewed[17]. Further studies may also address the timing of expression of the virus in the placenta and the risk of disease, for example by studying viral nucleic acids in the maternal circulation in cases where iciHHV-6 was inherited from the father. We only studied placental expression of the virus following birth, but the clinical consequences of iciHHV-6 may depend on viral replication earlier in pregnancy. Chromosomally integrated HHV-6 appears to be capable of viral reactivation[18,19] and we also detected multiple viral transcripts in the placentas with HHV-6 (Extended Data Figure 2). The presence of HHV-6 during extravillous trophoblast invasion of the maternal vessels could derange the cross talk between trophoblast and maternal immune cells, leading to abnormal placentation. A recent study has demonstrated a humoral response to the proteins encoded by the genes transcribed by iciHHV-6[19].A direct effect of viral integration on the expression of adjacent genes involved in the pathogenesis of preeclampsia is an unlikely explanation for the association as the virus inserts in the telomeres, which lack genes, and the exact site of integration is variable. It is possible there is some perturbation of telomere function and gene repression by heterochromatin, and again this warrants further study. Previous studies have demonstrated an association between HHV-6A in the endometrium in women with primary infertility[20]. Moreover, infection of endometrial cells by HHV-6A was associated with increased expression of cytokines by uterine natural killer cells[21]. However, analysis of the parental samples from the POP study and GOPEC cohorts demonstrated that the virus was inherited from the mother in 41% of cases with iciHHV-6 and from the father in 47%, hence, there was no relationship between the parent of origin of iciHHV-6 and the risk of preeclampsia. This observation makes it unlikely that the association between fetal inheritance of ciHHV-6 and preeclampsia is mediated by the presence of ciHHV-6 in the mother.The present study may help explain, at least in part, some epidemiological characteristics of preeclampsia. Previous studies have indicated the possible existence of the “dangerous father”[22], i.e. certain groups of men are more likely to father a pregnancy complicated with preeclampsia. As ciHHV-6 is inherited in a Mendelian pattern, 50% of the offspring of men with ciHHV-6 will inherit ciHHV-6. Given the three-fold risk of preeclampsia with inherited ciHHV-6, it would be predicted that there would be an overall excess of preeclampsia in pregnancies fathered by men who have ciHHV-6. We and others have also previously shown that mothers experiencing preeclampsia have an increased risk of ischemic heart disease in later life[23,24]. Women with ciHHV-6 would be expected to be at increased risk of preeclampsia based on the present analysis and of angina pectoris based on the Canadian study, hence, common associations with ciHHV-6 could explain some of the increased risk of later heart disease.In conclusion, applying placental RNA-seq and metagenomics to cases of placentally-related complications of pregnancy, we identified a single clear viral signal, HHV-6 (A or B). Viral RNA was usually related to fetal inheritance of ciHHV-6 and this was associated with a two to three-fold risk of preeclampsia.
Methods
Study design
We employed samples from three studies. The first was the Pregnancy Outcome Prediction (POP) study, a prospective cohort study of unselected nulliparous women with a singleton pregnancy attending the Rosie Hospital (Cambridge, UK) between January 2008 and July 2012, as previously described[25-27]. Briefly, participants had phlebotomy and fetal biometry at 12, 20, 28 and 36 weeks of gestational age (wkGA). At the 20wkGA visit, maternal blood and paternal saliva were obtained for genotyping the parents. At the time of delivery, the placenta was systematically biopsied and a sample of umbilical cord was obtained for genotyping the offspring. Pregnancy and birth outcome data were ascertained by review of each woman’s paper case record by research midwives and by record linkage to clinical electronic databases of ultrasonography (Astraia), delivery (Protos), biochemical tests (Meditech) and neonatal intensive care (Badgernet). Preeclampsia was diagnosed and classified based on the objective criteria of the 2013 American College of Obstetricians and Gynecologists guideline, as previously described[12,28].The second study was a case control study which was also previously described[29]. In brief, women were recruited from three UK maternity hospitals: St. James University Hospital, Leeds; John Radcliffe Hospital, Oxford; and Rosie Hospital, Cambridge. Preeclampsia was defined as hypertension (>140/90mmHg) and proteinuria (>300mg/24h) presenting for the first time in the second half of pregnancy. There was no overlap in the participants recruited to the two studies and both had ethical approval from the Local Research Ethics Committees and written informed consent was obtained from all participants. Genotyping of the infant was performed using analysis of umbilical cord DNA.The Genetics of Preeclampsia Consortium (GOPEC) recruited women affected by preeclampsia and their families from 10 UK recruitment centers in the UK (Birmingham, Cambridge, Glasgow, Leeds, Leicester, London, Newcastle, Nottingham, Oxford, Stoke) between 2000 and 2003[10]. Samples from Cambridge (n=98) were excluded from the current analysis to avoid possible overlap with the POP and case control cohorts. Participants gave informed consent for the study, which was approved by the Trent Multicentre Research Ethics Committee. Women were eligible for the study if aged 18 or over, of white Western European ancestry (by grandparental ethnicity), with a singleton pregnancy and new onset hypertension and proteinuria in or after 20th week of pregnancy. Hypertension was defined as systolic blood pressure ≥140mm Hg and diastolic blood pressure ≥90mm Hg measured on two occasions within a 24 hour period. Proteinuria was defined as >500mg/24 hours or 2+ (1 g/liter) on dipstick testing of urine. Women with hypertension or proteinuria before 20wkGA, with essential hypertension, renal or cardiac disease, or diabetes were excluded. Fetal DNA was isolated from umbilical cord tissue and parental DNA from venous blood. As the GOPEC cohort did not include preeclamspia non-cases, large-scale population studies (n>1,000) of healthy/control patients were used as a comparison group for this cohort. The references for the studies analyzed are the following: Tanaka-Taya 2004[30]; Pellet 2012[31]; Gravel 2015[11]; Hill 2017[32]; Moustafa 2017[33]; Miura 2018[34]; Peddu & Mouammine 2019[19,35]; heterogeneity test, I2=87%, P=5x10-08 (2-sided). The samples used in the current work are listed in Table 2.
RNA-sequencing (RNA-seq) analysis: POP study
Of the 4,212 women followed from their dating ultrasound scan through to delivery in the POP study, the placenta was obtained from 3,870 (92%) and 1,480 (35%) of these were collected within 30 minutes of birth and yielded samples suitable for analysis of RNA. We performed RNA-seq on 279 placentas, including 99 preeclampsia cases (8 with a fetal growth restricted [FGR] infant), 48 FGR without preeclampsia, and 132 pregnancies without either condition. FGR was defined as a customized birth weight <5th percentile[36]. Healthy controls were defined as pregnancies resulting in a live-born infant with a birth weight in the normal range (20-80th percentile) with no evidence of slowing in fetal growth on prenatal scans, and no evidence of pre-existing or acquired hypertensive disorders, gestational diabetes or diabetes mellitus type I or type II or other obstetric complications. Total placental RNA was isolated from approximately 5mg of tissue stored in RNAlater. After lysis of the tissues in Lysing Matrix D tubes, extraction was performed using the mirVana miRNA Isolation Kit (Ambion) followed by DNase treatment (DNA-free DNA Removal Kit, Ambion). Libraries were prepared from 300-500ng of total placental RNA (RIN values ≥ 7.0) with the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), a method that did not employ selection of RNAs with a polyadenylated tail. Briefly, the protocol includes ribosomal RNA (rRNA) depletion, cDNA synthesis, adapters ligation to discriminate samples and libraries amplification by PCR. Libraries were then pooled and sequenced (single-end, 125bp) using a Single End V4 cluster kit and Illumina HiSeq2500 and HiSeq4000 instruments. At the end of the sequencing process reads were obtained, which are inferred nucleotide sequences corresponding to all or part of a single RNA transcript. If an RNA is highly expressed there will be more correspondent reads and vice versa. The obtained reads were trimmed (using cutadapt and Trim Galore!) and mapped to the primary chromosomal assemblies of the GRCh38.p3 version of the human reference genome using TopHat2, a splice-aware mapper built on top of Bowtie2 short-read aligner[37-39]. The initial ‘unmapped’ reads were filtered out to remove poor quality reads, based on the following conditions: 1) base-quality score (i.e. Phred score) < 30, 2) read-length < 50bp, 3) undetermined base (i.e. Ns) > 5bp, and 4) poly A/T > 5bp, and 5) low-complexity reads defined by the dust score > 7. In order to remove as many reads of human origin as possible, additional human reads were subtracted if they aligned to sequences present in the following databases: 1) GRCh38.p5, 2) human RefSeq, and 3) all human contigs and clone sequences from NCBI NT. The remaining reads of each sample were mapped to a custom Kraken reference database, including the default bacterial and viral genomes and few additional eukaryotic genomes to remove residual unmapped human reads. Kraken (v0.10.6)[40] was run using the metagm_run_kraken option and identified 10 placental RNA samples containing HHV-6A or HHV-6B reads (Table 1). Mapping of the viral reads against the GCF_000845685.1 (HHV-6A) or the GCF_000846365.1 (HHV-6B) reference genomes was performed using BWA (v0.7.17-r1188)[41] and visualized using Artemis (v.16.0.0)[42].
DNA isolation and qPCR analyses
Of the 4,212 women followed from their dating ultrasound scan through to delivery in the POP study, samples were collected for DNA isolation from 4,060 (96%) mothers, 3,965 (94%) fathers and 3,869 (92%) offspring. Of the offspring samples, 3,847 were available for the current analysis, after exclusion of the following: 4 with unknown preeclampsia status, 7 miscarriages, and 11 terminations of pregnancy. Of the parental samples, 64 were analyzed in the current study. In the case control study we analyzed only umbilical cord DNA samples: 218 cases of preeclampsia and 256 non-cases. The analysis of the samples from the GOPEC cohort included 740 umbilical cord DNA samples after exclusion of 98 samples from Cambridge and 8 samples with no/low DNA. We also analyzed 22 parental DNA samples corresponding to the 12 ici-HHV6 positive cord DNAs (2 paternal samples were not available).
Placental DNA
Placental DNA was isolated with the Fast DNA Spin kit (MP Biomedical) from 25mg of villous tissue. The DNA isolation procedure has previously been described in detail[43]. Briefly, the tissue was homogenized by bead-beating twice for 40 seconds at a speed of 6.5m/sec on a FastPrep-24 (MP Biomedical). DNA concentrations were determined by Nanodrop Lite (Thermo Fisher Scientific). qPCR was performed on a QuantStudio 6 Flex system (ThermoFisher Scientific). The qPCR reactions were prepared using the TaqMan Multiplex Master Mix (ThermoFisher Scientific). The presence of the following viruses was investigated in placental samples using the primers and probes listed in Supplementary Table 1: Adenovirus, Cytomegalovirus (CMV), Epstein-Barr virus (EBV), Human Papillomavirus types 6, 11, 16 and 18 (HPV-6, HPV-11, HPV-16, HPV-18), Herpes Simplex Viruses types 1 and 2 (HSV-1 and HSV-2), Parvovirus, and Varicella Zoster Virus (VZV); detection of HHV-6 was based on a custom-made TaqMan assay targeting the U67/68 gene and designed to discriminate HHV-6A and HHV-6B (HHV-6 U67/68 described in Supplementary Table 1). The pre-designed TaqMan RNase P Assay, ABY dye/QSY probe (ThermoFisher Scientific) was used to detect the human positive control gene RNase P (RPPH1).
Cord and parental DNA isolation
Umbilical cord tissues (approximately 100mg) were incubated with Cell Lysis Solution (Qiagen) at 65°C for 1 hour, followed by proteinase K digestion at 55°C overnight using Puregene Proteinase K (Qiagen). After incubation with RNase A Solution (Qiagen) at 37°C for 30 minutes, samples were precipitated using the Protein Precipitation Solution (Qiagen). In each sample, genomic DNA present in the supernatant was then precipitated with 100% isopropanol, washed with 70% ethanol and dissolved in IDTE pH 8.0 (10mM Tris, 0.1mM EDTA; Integrated DNA Technologies). Cord DNA samples from the case control and the GOPEC studies were prepared as previously described[10,29]. Maternal DNA was isolated from whole blood samples. Blood cells were pelleted and lysed with SE buffer (75mM NaCl, 25mM EDTA), Pronase (0.2mg/ml), and 0.9% SDS. Paternal DNA was isolated from saliva with the Oragene DNA collection kit (DNA Genotek Inc.). Genomic maternal and paternal DNA samples were purified by ethanol precipitation and re-suspended in TE buffer (10mM Tris, 1mM EDTA, pH 8.0). Parental DNAs from the GOPEC study were prepared as previously described[10]. Cord and parental DNA genotyping was performed using a multiplex qPCR approach including the custom-made TaqMan assay targeting the U67/68 gene described above. Relative quantitation of HHV-6 signals was achieved by comparison with the human RNase P gene (RPPH1), measured with the TaqMan RNase P Assay (ThermoFisher Scientific). The relative abundance of the target is expressed by the Ct (cycle threshold) value, which is inversely associated with the signal. High HHV-6 signals (i.e. low Ct, close to the signal for RNase P) suggested viral chromosomal integration (Extended Figure 1).
Placental RNA isolation and RT-qPCR analysis
Total RNA was isolated from approximately 10mg of placental villous tissues stored in RNAlater (Ambion), including 16 ciHHV-6 positive and 32 ciHHV-6 negative samples on the basis of the cord gDNA analysis. After lysis in Lysing Matrix D tubes (MP Biomedicals), RNA extraction was performed using the Rneasy Plus Mini Kit with genomic DNA removal (Qiagen) according to the manufacturer’s instructions. RNA quantity and quality were assessed with the Agilent RNA 6000 Nano Kit (Agilent Technologies) on an Agilent 2100 Bioanalyzer System. Reverse transcriptase qPCR (RT-qPCR) was employed to assess the presence of viral RNA in placental samples. The RT reaction was performed using 500ng of total placental RNA from each sample and the SuperScript IV VILO Master Mix (ThermoFisher Scientific). Six reactions lacking the reverse transcriptase enzyme were included to rule out genomic DNA amplification. The multiplex qPCR reactions included a custom-designed TaqMan assay detecting both HHV-6A and HHV-6B transcripts encoded by the U100 gene[44] (HHV-6 U100, Supplementary Table 1), and the TaqMan RNase P Assay (both by ThermoFisher Scientific).
DNA-sequencing (DNA-seq) analysis
Deep sequencing of the HHV-6 genome was performed using a target enrichment method. Specifically, overlapping 120-mer RNA baits spanning the length of 7 complete HHV-6A and HHV-6B reference genomes (GenBank accession numbers: AB021506, AF157706, KC465951, KJ123690, KP257584, KT355575, NC001664) were prepared as previously described[45]. Libraries (n=36) were prepared according to the SureSelectXT Illumina paired-end sequencing library protocol, pooled and run on an Illumina MiSeq sequencing platform. Paired-end sequencing results were analyzed as follow.
Analysis of viral genome sequencing by SureSelect target enrichment
Forward and reverse reads from each sample were filtered with KneadData (v0.6.1) (http://huttenhower.sph.harvard.edu/kneaddata), a tool which performs quality control on metagenomic sequencing data and removes human and low quality reads. The following trimmomatic options were used: SLIDINGWINDOW:4:20, LEADING:30, TRAILING 30, MINLEN:75. Filtered Fastq files were analyzed using Kraken to identify HHV-6A and HHV-6B reads, which were then mapped with BWA (v0.7.17-r1188)[41] to their respective reference genomes GCF_000845685.1 (HHV-6A) or GCF_000846365.1 (HHV-6B). These reads were assembled using Spades (v3.11.0)[46] and visualized using Artemis (v.16.0.0)[42] and samples were considered to be successfully sequenced if >99% coverage of either the HHV-6A or HHV-6B genome (~160kb) was achieved.
Metagenomic sequencing by Illumina HiSeq X Ten platform
Whole genome sequencing data obtained with the Illumina HiSeq X Ten platform from 2 HHV-6B positive placental DNA samples allowed for the comparison of the results from the two DNA-seq workflows. Libraries were sequenced (150bp, paired end) on a HiSeq X Ten (Illumina). Sample processing and library preparation for the metagenomics analysis was performed as previously described[43]. The sequencing coverage was designed to generate >30-fold coverage of the human genomic DNA in each sample. Reads generated in this experiment were analyzed with KneadData to remove most of the human reads, and forward and reverse reads from each sample were filtered with the following trimmomatic options: HEADCROP9, SLIDINGWINDOW:4:20, MINLEN: 100. Approximately 10,000 HHV-6 read pairs were identified using Kraken and assembled using Spades. Coverage of the HHV-6B genome was checked using BWA and Artemis which showed almost complete genome coverage (>99%), similar to the samples successfully sequenced by full viral genome sequencing using SureSelect target enrichment. Furthermore, complete agreement was observed between the Illumina HiSeq X Ten data and the SureSelect data obtained from the two samples analyzed with both methods.
Analysis of single nucleotide polymorphisms (SNPs)
Analysis of SNPs was performed using the HHV-6 reads obtained with the DNA-seq and RNA-seq workflows to investigate direct parental transmission. Nine parent and child pairs with HHV-6B in the placenta and in one of the parents were studied. In order to facilitate both processing and mapping visualization, 50,000 forward and reverse paired reads were subsampled from each sample successfully sequenced by SureSelect target enrichment. All reads were used in the case of one sample with just 37,597 read pairs. All HHV-6B paired reads generated by HiSeq X Ten sequencing or by RNA-seq were used. Nucleotides that differ among the reconstructed HHV-6 genomes and in comparison to the reference genome were identified by comparing pairs of samples to the reference sequence. Many of the SNPs were uninformative, i.e. all 18 samples shared the identical polymorphism when compared to the reference genome. However, 187 SNPs, varying among the 18 samples analyzed, were identified. This iterative process was performed in a blinded fashion in respect to parent and child pair information. Repeat regions R0, R1, R2 and R3 were excluded from this analysis as mapping and/or sequencing errors occur at high frequency at these locations. A similar analysis was performed for the 2 HHV-6A positive parent and child pairs, identifying 999 informative SNPs. Therefore, single nucleotide polymorphism (SNP) concordance between parental and offspring sequences was assessed on the basis of 999 and 187 informative SNPs in the HHV-6A and HHV-6B genomes, respectively.
Analysis of viral integration sites
The HHV-6 genome is bounded by two identical long repeat regions which are in turn flanked by variable length telomere-like repeat regions (T1 and T2). The T2 region, which consists of a perfect array of telomere-like repeats (TTAGGG)n[47], is essential for viral integration[48]. Due to the relatively short length of the reads generated in this study and the presence these repeats in the regions of interest, it is not possible to use an assembly-based approach, i.e. assembling a contig from the viral sequence and extending into the flanking human sequence. Therefore, an approach was developed to target human reads close to the site of viral integration and possibly a small part of the flanking repetitive region of the HHV-6B genome. The raw reads were mapped to the viral reference genome and read pairs were analyzed when only one read mapped to the viral genome. The non-viral reads of these discordant read pairs were subsequently searched using the NCBI online BLAST interface (blastn, database: Others, http://blast.ncbi.nlm.nih.gov)[49]. Non-viral reads that mapped to the same region within the human genome were grouped, together with their corresponding viral-mapped reads and contigs assembled.
Statistical analysis
Associations were quantified using odds ratios and 95% confidence intervals (CI). The latter were calculated using the recommended quasi-exact Baptista-Pike mid-p method, which performs well across a wide range of sample sizes and proportions[50]. P values (2-sided) from 2x2 tables were calculated using the Fisher-Boschloo unconditional exact test[51], although all analyses were repeated using the Fisher’s exact test which is more widely recognized. Metaprop command[52] in Stata was used to perform random effects meta-analysis of proportions and to estimate heterogeneity. Analyses were performed using Stata 15.1 and R 3.2.5.
Characteristics of the study groups in the Pregnancy Outcome Prediction (POP) study.
Data are expressed as median (IQR) or n (%) as appropriate. The overall rate of preeclampsia in these participants was 6.4%. For fields where there is no category labelled “missing”, data were 100% complete. Maternal age was defined as age at recruitment. All other maternal characteristics were defined by self-report at the 20 weeks questionnaire, from examination of the clinical case record, or linkage to the hospital’s electronic databases. Socio-economic status was quantified using the Index of Multiple Deprivation (IMD) 2007, which is based on census data from the area of the mother’s postcode. Stillbirths (n=8) and spontaneous preterm deliveries (n=100) were included in the analysis, while miscarriages (n=7) and terminations of pregnancy (n=11) were excluded. Abbreviations: non-cases denote patients without preeclampsia; FTE denotes full time education; BMI denotes body mass index; DM denotes diabetes mellitus.
Placental RNA-seq reads mapped to HHV-6A and HHV-6B genomes.
Placental HHV-6 positive samples identified by RNA-seq. Reads were identified by Kraken as aligning to the HHV-6 genomes and mapped with BWA to the HHV-6A or HHV-6B reference genomes (Supplementary Methods); note that the total number of reads/sample recognized by the two software is not always identical. CON denotes a healthy pregnancy without FGR or preeclampsia (see Methods); FGR denotes fetal growth restriction; PE denotes a patient with preeclampsia; HHV-6A denotes human herpesvirus 6, variant A; HHV-6B denotes human herpesvirus 6, variant B; DRL: direct repeat left; DRR: direct repeat right. Repetitive regions are in Italic. HHV-6A and HHV-6B genomes have been described by Dominguez G et al[53].
HHV-6 detection in fetal and parental samples.
HHV-6A and HHV-6B representative signals in cord (A) and parental (B) DNA samples detected using a multiplex qPCR approach. These analyses were performed in 5,061 and 86 samples, respectively, and each sample was analyzed in triplicate. qPCR amplification curves for the HHV-6A and HHV-6B 9 U67/68 genes are represented in green and red, respectively; RNase P curves are in blue and confirmed presence of DNA in the wells. ciHHV-6 corresponds to a high HHV-6 DNA signal in the sample measured by qPCR, i.e. within 4 cycles of the RNase P signal. HHV-6 non-integrated corresponds to a HHV-6 DNA signal in the sample detected at more than 4 Ct higher compared to the RNase P signal. Negative samples lack HHV-6A or HHV-6B DNA signal. C) RT-qPCR amplification plot of placental RNA samples showing detection of the HHV-6 U100 gene. Eight representative samples are shown, two with viral transcript amplification (total n=48 samples, each analyzed in triplicate). Five negative controls (samples without reverse transcriptase enzyme in the RT reaction) lacked U100 amplification (not shown). U100 curves are in black and RNase P curves are in blue. Rn (normalized reporter value) represents the fluorescence of the reporter dye normalized to the signal of the passive reference dye for a given reaction. The ΔRn is the Rn value of an experimental reaction minus the Rn value of the baseline signal generated by the instrument. This parameter indicates the magnitude of the fluorescent signal generated in the qPCR assay. ciHHV-6 denotes chromosomally integrated human herpesvirus 6; HHV-6A denotes human herpesvirus 6, variant A; HHV-6B denotes human herpesvirus 6, variant B; RNase P denotes the human positive control gene RPPH1.
Identification of informative HHV-6B SNPs.
DNA-seq reads of 2 randomly selected samples were compared to the HHV-6B reference genome. *Informative SNP sites, i.e. SNPs present in just one of the two analyzed samples (gapped vertical red lines). **SNPs present in both analyzed samples, i.e. sites concordantly different from the reference genome, were considered not informative (continuous vertical red lines). Throughout the 162kb HHV-6B genome 187 SNPs were classified as informative.
Authors: Annie Gravel; Isabelle Dubuc; Guillaume Morissette; Ruth H Sedlak; Keith R Jerome; Louis Flamand Journal: Proc Natl Acad Sci U S A Date: 2015-06-15 Impact factor: 11.205
Authors: E M McClure; A Garces; S Saleem; J L Moore; C L Bose; F Esamai; S S Goudar; E Chomba; M Mwenechanya; O Pasha; A Tshefu; A Patel; S M Dhaded; C Tenge; I Marete; M Bauserman; S Sunder; B S Kodkany; W A Carlo; R J Derman; P L Hibberd; E A Liechty; K M Hambidge; N F Krebs; M Koso-Thomas; M Miodovnik; D D Wallace; R L Goldenberg Journal: BJOG Date: 2017-01-31 Impact factor: 6.531
Authors: Daniel P Depledge; Anne L Palser; Simon J Watson; Imogen Yi-Chun Lai; Eleanor R Gray; Paul Grant; Ravinder K Kanda; Emily Leproust; Paul Kellam; Judith Breuer Journal: PLoS One Date: 2011-11-18 Impact factor: 3.240
Authors: Yan Huang; Alberto Hidalgo-Bravo; Enjie Zhang; Victoria E Cotton; Aaron Mendez-Bermudez; Gunjan Wig; Zahara Medina-Calzada; Rita Neumann; Alec J Jeffreys; Bruce Winney; James F Wilson; Duncan A Clark; Martin J Dyer; Nicola J Royle Journal: Nucleic Acids Res Date: 2013-09-19 Impact factor: 16.971
Authors: Michael L Wood; Colin D Veal; Rita Neumann; Nicolás M Suárez; Jenna Nichols; Andrei J Parker; Diana Martin; Simon Pr Romaine; Veryan Codd; Nilesh J Samani; Adriaan A Voors; Maciej Tomaszewski; Louis Flamand; Andrew J Davison; Nicola J Royle Journal: Elife Date: 2021-09-21 Impact factor: 8.140
Authors: Satu Wedenoja; Masahito Yoshihara; Hindrek Teder; Hannu Sariola; Mika Gissler; Shintaro Katayama; Juho Wedenoja; Inka M Häkkinen; Sini Ezer; Nina Linder; Johan Lundin; Tiina Skoog; Ellika Sahlin; Erik Iwarsson; Karin Pettersson; Eero Kajantie; Mikael Mokkonen; Seppo Heinonen; Hannele Laivuori; Kaarel Krjutškov; Juha Kere Journal: EBioMedicine Date: 2020-07-14 Impact factor: 8.143