Literature DB >> 33060642

Repeated sampling facilitates within- and between-subject modeling of the human sperm transcriptome to identify dynamic and stress-responsive sncRNAs.

Christopher P Morgan1, Amol C Shetty2, Jennifer C Chan3, Dara S Berger4, Seth A Ament2, C Neill Epperson5, Tracy L Bale6,7.   

Abstract

Epidemiological studies from the last century have drawn strong associations between paternal life experiences and offspring health and disease outcomes. Recent studies have demonstrated sperm small non-coding RNA (sncRNA) populations vary in response to diverse paternal insults. However, for studies in retrospective or prospective human cohorts to identify changes in paternal germ cell epigenetics in association with offspring disease risk, a framework must first be built with insight into the expected biological variation inherent in human populations. In other words, how will we know what to look for if we don't first know what is stable and what is dynamic, and what is consistent within and between men over time? From sperm samples from a 'normative' cohort of healthy human subjects collected repeatedly from each subject over 6 months, 17 healthy male participants met inclusion criteria and completed donations and psychological evaluations of perceived stress monthly. sncRNAs (including miRNA, piRNA, and tRNA) isolated from mature sperm from these samples were subjected to Illumina small RNA sequencing, aligned to subtype-specific reference transcriptomes, and quantified. The repeated measures design allowed us to define both within- and between-subject variation in the expression of 254 miRNA, 194 tRNA, and 937 piRNA in sperm over time. We developed screening criteria to identify a subset of potential environmentally responsive 'dynamic' sperm sncRNA. Implementing complex modeling of the relationships between individual dynamic sncRNA and perceived stress states in these data, we identified 5 miRNA (including let-7f-5p and miR-181a-5p) and 4 tRNA that are responsive to the dynamics of prior stress experience and fit our established mouse model. In the current study, we aligned repeated sampling of human sperm sncRNA expression data with concurrent measures of perceived stress as a novel framework that can now be applied across a range of studies focused on diverse environmental factors able to influence germ cell programming and potentially impact offspring development.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 33060642      PMCID: PMC7562703          DOI: 10.1038/s41598-020-73867-7

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Epidemiological studies over the last several decades have provided strong evidence that parental life experiences shape offspring development, stimulating new consideration of the factors that underlie intergenerational programming of disease risk and resilience[1-16]. While these associations are well established, the molecular mechanisms involved remain elusive, especially for paternal transmission. Recent studies focusing on the influence of preconception paternal insults implicate epigenetic processes important for germ cell development and maturation[17-46]. Advances in high-throughput next-generation sequencing have revealed highly complex populations of RNA in sperm, and in animal studies, small noncoding RNA (sncRNA) populations have emerged as causal agents in the germline transmission of paternal experience[44,47-51]. Animal studies have provided the strongest evidence that paternal experiences, including various types of stress, drugs, and dietary manipulations, are associated with lasting changes in sperm RNA expression, where changes in specific miRNA and tRNA fragments have often been reported[17-25,27,28,31,32,36-38,40,41,52,53]. Further, injection of these RNAs into fertilized zygotes or incubation of sperm with RNA-containing extracellular vesicles, replicated the offspring phenotype, supporting a causal importance of sperm RNA changes[17,19-25]. A growing number of human subject studies also demonstrated sperm sncRNA populations vary in association with diverse environmental exposures or experiences (e.g., smoking, diet, obesity, and stress/trauma)[28,39, 42,53–58]. However, the majority of these studies were constrained to a single timepoint or to a single within-subject comparison. Without time as a factor, it is challenging to determine environmentally-driven impact on sperm RNAs vs. population variance. Before we can begin searching for candidate sperm RNAs associated with or predictive of offspring disease risk (e.g., paternal stress or trauma and child autism or schizophrenia risk), a critical first step must be to develop a framework from a ‘normative’ dataset that includes within- and between-subject comparisons. In the current studies, our goal was to assess the normative composition and dynamic changes in sperm sncRNA (including miRNA, piRNA, and tRNA) from a cohort of healthy human subjects from repeated monthly collections over 6 months. This repeated measures design allowed us to define both between-subject and within-subject variation in sperm sncRNA content with time as a factor. In addition, as our lab and others have previously demonstrated in animal models that sperm sncRNA are responsive to prior chronic stress experience, we modeled monthly transcriptomic data aligned with prior subject perceived stress state to identify specific sperm RNAs that fit strict criteria for consistent detection within- and between-subjects[17,18,21,28,31]. Unlike sperm DNA methylation, which is relatively stable over time within and between subjects, specific populations of sperm sncRNA appear to be more dynamic[59,60]. Our goal is that by providing the field with this comprehensive ‘normative’ dataset modeled with perceived stress, we can begin building a powerful framework to be utilized across cohorts and areas of study, and therefore novel and disease-predictive sncRNAs in human sperm will eventually be identified.

Results

Demographic and clinical characteristics of a normative cohort of males

To establish the baseline characteristics of the sperm sncRNA complement, we recruited men from a relatively homogenous and ‘normative’ population of University of Pennsylvania students. Subjects between the ages of 18 and 25 were screened and excluded for major medical illness, mental health diagnoses, psychotropic medication use, and substance abuse. Following screening and baseline assessments, enrolled subjects returned monthly for 6 visits to donate semen samples for sperm sncRNA analysis. In addition, with each sample donation subjects completed psychological inventories, including the Perceived Stress Scale (PSS) (Fig. 1A,B). The PSS is commonly used to assess perception of stress over the previous month. It is the most commonly used psychological instrument to measure the degree to which situations in a subject’s life are appraised as stressful and taps into how unpredictable, uncontrollable and overloaded respondents experience their lives[61]. Baseline demographics and results from an Adverse Childhood Experiences (ACE) questionnaire and State-Trait Anxiety Inventory (STAI) demonstrate the final study cohort (N = 17) was relatively homogenous (Table 1). Subjects were between 19–25 years old (mean = 22.8, SD = 1.8), single, and without children. In addition, most subjects had ACE scores of 0, while only one subject had an ACE = 1 and two had an ACE = 2. Subjects scored between 22–39 on the STAI-Trait questionnaire (mean = 30.7, SD = 6.0). Mature sperm was enriched from cryopreserved samples collected from these individuals, then sncRNA was isolated and subject to small RNA sequencing. Sample characteristics are listed in Supplemental Table 1.
Figure 1

A normative cohort for within- and between-subject expression characteristics of sperm sncRNA populations associated with variable perceived stress experience across the 6-month study period. (A) To assess the normative composition and dynamic changes in human sperm small non-coding RNA (sncRNA) populations (including miRNA, tRNA, and piRNA), potential subjects were screened (indicated by brown shading of the first visit) to enroll healthy young men (ages 18–25) who completed psychological inventories, including the Perceived Stress Scale (PSS), and donated sperm monthly over 6 months (N = 17 subjects). Mature sperm was enriched from cryopreserved samples collected from these individuals, then sncRNA was isolated and subject to small RNA sequencing, followed by analysis and statistical modeling of sncRNA expression data and PSS scores. (B) On the same day of each monthly sperm collection, subjects also completed the Perceived Stress Scale (PSS) as part of their psychological inventory. The total score of this clinically validated measure of perceived stress score for each month (y-axis) is plotted over time (x-axis) for each subject (indicated by plot numbering). (C) Hypothetical expression plot of a sncRNA for 5 subjects over 6 months to illustrate the within- and between-subject structure and analysis of the data collected in our study. This structure was exploited to develop screening criteria for the identification of potential environmentally responsive ‘dynamic’ sperm sncRNA. These criteria were based on three a priori assumptions. First, we focused on sncRNA with a high degree of within-subject variation across time, assuming variation reflects responses to external stimuli. Second, we assumed that for a ‘dynamic’ sncRNA to be functionally relevant, it should also be expressed at relatively high levels (within the top quartile of expression) during at least one time point over the 6 months for a given subject. Third, we assumed sncRNA meeting these first two criteria (variable and relevant expression level) in multiple subjects within the final cohort were more likely to reflect a conserved response to variation in extrinsic factors in the environment, and therefore meet our final criteria as ‘dynamic’ to be tested in our statistical modeling. In the hypothetical plot provided, potential sncRNA expression patterns shown include: low expression and low variation in Subject 1, relevant high expression but low variation in Subject 2, low expression and high variation in Subject 3, relevant high expression and high variation in Subjects 4 and 5, exhibiting between-subject overlap. (A,C) by Tim Phelps © 2020 JHU AAM. (B) was generated using R (version 3.5.2) and the package ‘ggplot2’ (version 3.2.0) (https://cran.r-project.org/).

Table 1

Baseline demographics and assessments including ACE questionnaire and STAI inventory for study subjects.

SubjectAgeMarital statusChildrenRace/ethnicityEducationAdverse childhood events (ACE) (total)State/Trait Anxiety Inventory (STAI) trait score
119SingleNo childrenAsianSome college022
219SingleNo childrenAsianSome college028
422SingleNo ChildrenCaucasian (non-Hispanic)College degree123
522SingleNo childrenCaucasian (non-Hispanic)Some college023
623SingleNo childrenCaucasian (non-Hispanic)College degree026
725SingleNo childrenHispanicCollege degree035
925SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional027
1024SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional033
1122SingleNo childrenCaucasian (non-Hispanic)College degree038
1224SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional031
1325SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional027
1422SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional024
1524SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional037
1723SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional234
1823SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional238
2023SingleNo childrenCaucasian (non-Hispanic)Some graduate/professional039
2222SingleNo childrenAsianSome graduate/professional037
A normative cohort for within- and between-subject expression characteristics of sperm sncRNA populations associated with variable perceived stress experience across the 6-month study period. (A) To assess the normative composition and dynamic changes in human sperm small non-coding RNA (sncRNA) populations (including miRNA, tRNA, and piRNA), potential subjects were screened (indicated by brown shading of the first visit) to enroll healthy young men (ages 18–25) who completed psychological inventories, including the Perceived Stress Scale (PSS), and donated sperm monthly over 6 months (N = 17 subjects). Mature sperm was enriched from cryopreserved samples collected from these individuals, then sncRNA was isolated and subject to small RNA sequencing, followed by analysis and statistical modeling of sncRNA expression data and PSS scores. (B) On the same day of each monthly sperm collection, subjects also completed the Perceived Stress Scale (PSS) as part of their psychological inventory. The total score of this clinically validated measure of perceived stress score for each month (y-axis) is plotted over time (x-axis) for each subject (indicated by plot numbering). (C) Hypothetical expression plot of a sncRNA for 5 subjects over 6 months to illustrate the within- and between-subject structure and analysis of the data collected in our study. This structure was exploited to develop screening criteria for the identification of potential environmentally responsive ‘dynamic’ sperm sncRNA. These criteria were based on three a priori assumptions. First, we focused on sncRNA with a high degree of within-subject variation across time, assuming variation reflects responses to external stimuli. Second, we assumed that for a ‘dynamic’ sncRNA to be functionally relevant, it should also be expressed at relatively high levels (within the top quartile of expression) during at least one time point over the 6 months for a given subject. Third, we assumed sncRNA meeting these first two criteria (variable and relevant expression level) in multiple subjects within the final cohort were more likely to reflect a conserved response to variation in extrinsic factors in the environment, and therefore meet our final criteria as ‘dynamic’ to be tested in our statistical modeling. In the hypothetical plot provided, potential sncRNA expression patterns shown include: low expression and low variation in Subject 1, relevant high expression but low variation in Subject 2, low expression and high variation in Subject 3, relevant high expression and high variation in Subjects 4 and 5, exhibiting between-subject overlap. (A,C) by Tim Phelps © 2020 JHU AAM. (B) was generated using R (version 3.5.2) and the package ‘ggplot2’ (version 3.2.0) (https://cran.r-project.org/). Baseline demographics and assessments including ACE questionnaire and STAI inventory for study subjects.

Quality control and alignment to the Ensembl ncRNA transcriptome

Mature sperm cells are generally thought to be transcriptionally inert. As a result, the RNA content of mature sperm differs from most cells in several ways. The mature sperm RNA complement is dominated by sncRNA and lacks intact ribosomal RNA. This is likely due to the active degradation of specific pools of RNA or the protection of specific species from a broader degradation. These characteristics present a challenge for efforts to characterize sperm RNA populations, largely driven by a poor understanding of the functional role of sncRNA and a parallel dearth of annotations for sncRNA in standard reference genomes. In an effort to circumvent this lack in annotation, we have instead aligned our sequencing data to specific sncRNA transcriptomes curated to contain known or predicted sncRNA species. Initially, a dataset consisting of reads aligned to the Ensembl ncRNA reference transcriptome was used to establish various quality control criteria to apply across sperm sncRNA transcriptomes of interest (miRNA, piRNA, and tRNA). These included filtering criteria to exclude sncRNA features not consistently present across samples and multivariate analyses to identify samples that would be excluded as outliers for technical reasons. Sequencing libraries were generated from 100 samples collected from 17 subjects. In addition, libraries were generated from a common pool of sperm RNA for each round of library preparation and sequencing to serve as technical replicates to assess potential batch effects. These libraries were sequenced to an average depth of 8.2 × 106 reads. Across samples, 29% of these reads aligned to 37,110 features in the Ensembl ncRNA reference transcriptome (Supplemental Table 1). To balance the detection of low-abundance transcripts against the characterization of transcripts consistently present across sperm samples, we retained transcripts with abundances ≥ 1 CPM in at least 75% of samples. This filtering criterion resulted in a near normal distribution of the log2 expression (log2 CPM) of 11,074 retained features (Supplemental Fig 1). Hierarchical clustering of samples based on ncRNA transcript expression, post-filtering and TMM normalization, identified samples 4-01 and 6-01 as outliers (Supplemental Fig 2A). This was likely driven by the large number of ncRNA features absent from these sample; in sample 4-01 and 6-01, 3695 and 969 features, respectively, had CPM = 0. The sample with the next largest number of absent features was 17-01 with 150; therefore 4-01 and 6-01 were excluded from further analyses. Though subjects were assigned to library prep/sequencing runs in a non-systematic manner, samples from the same subject were generally prepared and sequenced in the same run; therefore, it was important to test for batch effects. The clustering analysis grouped the 3 technical replicate samples into a single exclusive cluster, demonstrating any batch effects were minimal. The two outlier samples and the overlapping technical replicates were also identifiable in a Principal Component Analysis (PCA) when samples are plotted along the two components (PC1 and PC2) that account for the largest proportions of total variance in the dataset (37.3% and 12.2% respectively) (Supplemental Fig 2B). ‘Dynamic’ sperm sncRNA populations display characteristic within-subject patterns of expression across time. Graphs represent the expression patterns of sperm sncRNA, the ranked expression (highest to lowest) of each (A) miRNA, (B) tRNA, and (C) piRNA is plotted by time within each subject. The final pool of ‘dynamic’ sncRNA (N = 33 miRNA, N = 17 tRNA, and N = 97 piRNA) are plotted in dark blue, while all other consistently expressed sncRNA are plotted in light blue. This figure generated using R (version 3.5.2) and the package ‘ggplot2’ (version 3.2.0) (https://cran.r-project.org/).

Characterizing the class-specific sncRNA transcriptomes of sperm

There are 3 classes of sncRNA in sperm that are of particular relevance to the goals of the current study, miRNA, tRNA, and piRNA. None of these classes are present in their entirety within the Ensembl ncRNA transcriptome, therefore we aligned our sequencing data separately to more comprehensive class-specific reference transcriptomes. For miRNA, reads were initially aligned to 2657 features in the reference transcriptome obtained from miRbase. After filtering for features consistently present across samples (CPM ≥ 1 in 75% of samples), we identified 254 total miRNA in mature sperm. For tRNA, reads initially aligned to 425 features in the genomic tRNA database reference transcriptome from GtRNAdb. After filtering, we identified 194 total tRNA consistently present in mature sperm. For piRNA, reads initially aligned to 32,827 features in the reference transcriptome obtained from piRbase. After filtering, 837 total piRNA were consistently present in mature sperm. Hierarchical clustering and PCA demonstrate broad clustering of samples from the same subject, suggesting between-subject variation in sperm sncRNA expression is greater than within-subject variation (Supplemental Figs 3, 4, and 5). For miRNA, 70% of samples (N = 69) were grouped in clusters exclusively with samples from the same subject (Supplemental Fig 3A). These clusters ranged in size from 2 to 6 samples. The same was true for 78% of samples (N = 76) based on tRNA expression (Supplemental Fig 4A) and 66% of samples (N = 65) based on piRNA expression (Supplemental Fig 5A). This makes sense biologically, as between-subject variation is likely driven by the combination of differences in genetic background and past life experience, which should be greater than any changes in experience over the course of the 6 monthly collections. Individual ‘dynamic’ sperm miRNA and tRNA, but not piRNA, are significantly correlated with perceived stress experience, potentially acting to influence the earliest stages of development to shape long-term health outcomes. (A) Summary of analysis results, including the class-specific numbers of consistently expressed (total) and ‘dynamic’ sperm sncRNA, and the number of ‘dynamic’ sncRNA with statistically significant associations with perceived stress experience. (B) Theoretical model linking a chain of events beginning with an environmental experience or exposure, including stress, impacting sperm sncRNA levels. Acting individually or interacting together as part of a broader sncRNA code, these ‘dynamic’ sncRNA are able to transmit the encoded information regarding the paternal environment to offspring, potentially impacting developmental processes (e.g., rates of fertility, embryo division and implantation). Even if the initial impact of these ‘dynamic’ sperm sncRNA was small (indicated by ⍺1 or ⍺2), consequent shifts in the timing of developmental windows of susceptibility to additional events could produce significant differences over time, either in positive (resilience, indicated in blue) or negative (risk, indicated in red) directions when compared to a typical developmental trajectory (as indicated in black). Illustrations by Tim Phelps © 2020 JHU AAM. To identify the top expressed features, each class of sperm sncRNA was ranked by expression from highest to lowest. The top 50 of each class are displayed in Tables 2, 3, and 4. To examine the stability of these rankings, we looked at how many months individual sncRNA were ranked in the top quartile of expression for each subject. Plotting a histogram of the number of months in which a given sncRNA was expressed in the top quartile of its class demonstrates that sncRNA expressed at a high level in at least 1 month were most likely to be highly expressed in all 6 collections from a subject (Supplemental Figs 6A, 6C, and 6E). The only subjects for which this was not the case were subject 11 for miRNA and piRNA and subject 17 for tRNA. The same is not true for sncRNA expressed at levels in the bottom quartile (Supplemental Figs 6B, 6D, and 6F).
Table 2

Top expressed miRNA.

OrdermiRNAAvg rank expressionAvg expression (CPM)Avg rank CV expression (within-subject)Avg CV expression (within-subject)
1hsa-miR-10b-5p26161320.45
2hsa-miR-10a-5p44561590.41
3hsa-miR-22-3p53441630.39
4hsa-miR-1213663801020.52
5hsa-miR-30a-5p131441600.41
6hsa-miR-375-3p13193990.52
7hsa-miR-182-5p141331590.41
8hsa-miR-26a-5p151281440.43
9hsa-miR-10400-5p151202050.3
10hsa-miR-16-5p161221670.39
11hsa-miR-7977172591200.51
12hsa-miR-396019931950.31
13hsa-miR-34c-5p201251300.47
14hsa-miR-148a-3p211281020.51
15hsa-miR-4800-3p22921930.33
16hsa-miR-1260a231731210.51
17hsa-miR-92a-3p23921160.47
18hsa-miR-542-3p24871270.45
19hsa-miR-25-3p25711760.38
20hsa-miR-27b-3p2678980.58
21hsa-miR-191-5p27671520.42
22hsa-miR-463427582200.24
23hsa-miR-21-5p2783910.55
24hsa-miR-449a31751000.54
25hsa-miR-770432542000.32
26hsa-miR-6765-3p3388970.58
27hsa-let-7a-5p36451340.45
28hsa-miR-141-3p36102450.69
29hsa-miR-1260b38681210.49
30hsa-miR-619-5p38361780.34
31hsa-miR-3614-3p39371390.42
32hsa-miR-30d-5p43351640.4
33hsa-miR-4787-5p43311900.32
34hsa-let-7f-5p45431010.51
35hsa-miR-28-3p48291560.41
36hsa-miR-239249591140.59
37hsa-miR-3184-3p4939840.53
38hsa-miR-423-5p4939840.53
39hsa-miR-4763-5p49261510.4
40hsa-miR-192-5p52291570.42
41hsa-miR-151a-5p54241640.39
42hsa-miR-477156221730.36
43hsa-miR-6511a-5p56241380.43
44hsa-miR-451656211960.26
45hsa-miR-6836-3p58192140.27
46hsa-miR-6511b-5p58231360.44
47hsa-miR-149-3p59191820.3
48hsa-miR-186-5p61231340.45
49hsa-miR-10394-3p62182060.29
50hsa-miR-6790-3p62211380.44
Table 3

Top expressed tRNA.

OrdertRNAAvg rank expressionAvg expression (CPM)Avg rank CV expression (within-subject)Avg CV expression (within-subject)
1hsa-tRNA-Glu-CTC-1-114994920.43
2hsa-tRNA-Glu-CTC-2-124936900.43
3hsa-tRNA-Glu-TTC-2-1519401240.34
4hsa-tRNA-Lys-CTT-1-173231490.64
5hsa-tRNA-iMet-CAT-1-1712761130.38
6hsa-tRNA-Gly-CCC-1-181082640.51
7hsa-tRNA-Lys-CTT-4-183148470.65
8hsa-tRNA-iMet-CAT-2-1911691050.40
9hsa-tRNA-Gly-GCC-1-110923690.52
10hsa-nmt-tRNA-Leu-TAA-4-111720670.47
11hsa-nmt-tRNA-Leu-TAA-1-1126161120.36
12hsa-tRNA-SeC-TCA-1-1135011550.22
13hsa-tRNA-Gly-GCC-2-115659590.57
14hsa-tRNA-Glu-TTC-1-1173031500.25
15hsa-tRNA-Gly-CCC-2-1173041530.23
16hsa-tRNA-Gly-GCC-3-119567480.63
17hsa-tRNA-Gly-CCC-6-119330810.42
18hsa-tRNA-Leu-CAG-2-1202201350.29
19hsa-tRNA-Leu-CAG-1-1212131330.30
20hsa-nmt-tRNA-Gln-TTG-6-121401790.47
21hsa-tRNA-Asp-GTC-2-1212081310.30
22hsa-tRNA-His-GTG-1-1231771060.39
23hsa-tRNA-Thr-CGT-chr1-11825197890.44
24hsa-tRNA-Lys-CTT-2-125322600.54
25hsa-tRNA-Arg-ACG-1-1301081370.25
26hsa-tRNA-Arg-ACG-2-1301101500.22
27hsa-tRNA-SeC-TCA-2-1311321300.29
28hsa-tRNA-Lys-TTT-3-134951310.30
29hsa-tRNA-Arg-TCG-1-135991600.22
30hsa-tRNA-Val-AAC-2-135861310.29
31hsa-tRNA-Pro-TGG-3-137120950.45
32hsa-tRNA-Leu-TAG-2-137831170.33
33hsa-tRNA-Glu-TTC-4-138761570.22
34hsa-tRNA-Lys-TTT-5-138751250.31
35hsa-tRNA-Gln-TTG-1-138651590.22
36hsa-tRNA-Lys-CTT-6-140841190.33
37hsa-tRNA-Gln-CTG-1-141701100.35
38hsa-tRNA-Gln-TTG-3-142571630.21
39hsa-tRNA-Asp-GTC-3-143541250.33
40hsa-tRNA-Arg-CCG-2-144661270.34
41hsa-tRNA-Leu-TAG-3-146511400.28
42hsa-nmt-tRNA-Gln-TTG-9-147571120.33
43hsa-tRNA-Gln-CTG-2-14763990.39
44hsa-tRNA-Leu-TAA-1-147461590.23
45hsa-tRNA-Arg-CCT-4-149471460.27
46hsa-tRNA-Pro-AGG-1-15095820.50
47hsa-tRNA-Ser-GCT-3-151541470.25
48hsa-tRNA-Val-TAC-1-151401510.25
49hsa-tRNA-Gly-TCC-1-15839960.39
50hsa-tRNA-Arg-CCG-1-160381120.36
Table 4

Top expressed piRNA.

OrderpiRNAAvg rank expressionAvg expression (CPM)Avg rank CV expression (within-subject)Avg CV expression (within-subject)
1hsa-piR-24925124,9455460.31
2hsa-piR-317339668060.12
3hsa-piR-2710575497500.19
4hsa-piR-2386397615490.35
5hsa-piR-2235594887410.20
6hsa-piR-165133836960.23
7hsa-piR-19619133946270.28
8hsa-piR-24713153267340.19
9hsa-piR-20684153087280.19
10hsa-piR-7034155073850.43
11hsa-piR-9359165063840.43
12hsa-piR-6266182876450.25
13hsa-piR-12106202586990.22
14hsa-piR-10365202277410.17
15hsa-piR-26211207672950.51
16hsa-piR-2465222216530.26
17hsa-piR-9471252335550.33
18hsa-piR-28905262055460.36
19hsa-piR-3351271886700.25
20hsa-piR-12078272253570.46
21hsa-piR-31148283691400.76
22hsa-piR-25836291536560.25
23hsa-piR-14727292245810.30
24hsa-piR-3953301963990.42
25hsa-piR-28241341506150.28
26hsa-piR-25596351734140.43
27hsa-piR-8336381493930.45
28hsa-piR-22784401751850.62
29hsa-piR-23494403613910.45
30hsa-piR-2251421007340.20
31hsa-piR-15477422863750.49
32hsa-piR-22404431046470.26
33hsa-piR-6428432363660.64
34hsa-piR-22183441076330.27
35hsa-piR-2042245926940.22
36hsa-piR-10341491464510.42
37hsa-piR-1099650954840.37
38hsa-piR-805250767190.21
39hsa-piR-26351876130.29
40hsa-piR-1520754727730.17
41hsa-piR-22921571114720.38
42hsa-piR-873658736900.24
43hsa-piR-1163763845230.39
44hsa-piR-3099970882670.54
45hsa-piR-295576546240.27
46hsa-piR-198877765070.41
47hsa-piR-557077852420.56
48hsa-piR-2609077874000.46
49hsa-piR-932778507270.19
50hsa-piR-207379645550.33
Top expressed miRNA. Top expressed tRNA. Top expressed piRNA.

‘Dynamic’ sperm sncRNA

To identify potential environmentally responsive ‘dynamic’ sncRNA, we developed screening criteria based on three a priori assumptions: (1) we assumed ‘dynamic’ sncRNA were likely to exhibit a higher degree of variation in expression over time in response to changes in the environment; (2) given the disparity in the amount of RNA present in sperm relative to ovum, we assumed sncRNA with the potential to impact offspring development would need to be highly expressed; and (3) we assumed sncRNA meeting criteria based on these first two assumptions in multiple subjects were more likely to reflect a conserved functional response to extrinsic factors[62-64]. The analysis we performed to characterize these properties and identify ‘dynamic’ sncRNA is illustrated in Fig. 1C and described in greater detail in the “Methods”. An initial pool of candidates consisted of sncRNA that exhibited within-subject variation (CV expression) ranked in the top quartile of each class of features, while also being expressed in the top quartile in at least one timepoint over the 6 months for a given subject. One hundred seventeen miRNA, 75 tRNA, and 369 piRNA met these criteria in at least one subject. The expression of these initial candidates is plotted for each subject in Supplemental Fig 7. We then asked how many of these initial candidates overlapped between subjects (Supplemental Fig 8). Thirty-three miRNA, 17 tRNA, and 97 piRNA met the within-subject criteria in at least 25% (N = 5) of subjects (count candidate ≥ 5), constituting our final pool of ‘dynamic’ sperm sncRNA. These sncRNA are displayed in Tables 5, 6, and 7 and their expression over time is plotted for each subject in Fig. 2A–C, respectively.
Table 5

'Dynamic' sperm miRNA.

OrdermiRNAAvg rank expressionAvg expression (CPM)Avg rank CV expression (within-subject)Avg CV expression (within-subject)Count candidate
1hsa-miR-1213663801020.525
2hsa-miR-375-3p13193990.525
3hsa-miR-148a-3p211281020.517
4hsa-miR-27b-3p2678980.587
5hsa-miR-21-5p2783910.557
6hsa-miR-449a31751000.546
7hsa-miR-6765-3p3388970.586
8hsa-miR-141-3p36102450.6912
9hsa-let-7f-5p45431010.515
10hsa-miR-3184-3p4939840.537
11hsa-miR-423-5p4939840.537
12hsa-miR-39076532820.578
13hsa-miR-120865241100.545
14hsa-miR-44546927910.545
15hsa-miR-891a-5p7631730.708
16hsa-miR-363-3p8029570.638
17hsa-miR-31828117940.548
18hsa-miR-60908422640.729
19hsa-miR-6750-3p8819540.677
20hsa-miR-203a-3p10330310.9811
21hsa-miR-181a-5p10633750.637
22hsa-miR-203b-5p10730290.9910
23hsa-miR-130a-3p107111190.525
24hsa-let-7g-5p108111020.525
25hsa-miR-320a-3p11117570.758
26hsa-miR-1247-3p11611600.668
27hsa-miR-320b12914430.809
28hsa-miR-6880-5p1338730.617
29hsa-miR-23a-3p14211570.816
30hsa-miR-23b-3p14411580.796
31hsa-miR-29a-3p1518650.625
32hsa-miR-4757-5p1519420.846
33hsa-miR-6731-5p1607530.877
Table 6

'Dynamic' sperm tRNA.

OrdertRNAAvg rank expressionAvg expression (CPM)Avg rank CV expression (within-subject)Avg CV expression (within-subject)Count candidate
1hsa-tRNA-Lys-CTT-1-173231490.6410
2hsa-tRNA-Gly-CCC-1-181082640.516
3hsa-tRNA-Lys-CTT-4-183148470.6510
4hsa-tRNA-Gly-GCC-1-110923690.526
5hsa-nmt-tRNA-Leu-TAA-4-111720670.4710
6hsa-tRNA-Gly-GCC-2-115659590.579
7hsa-tRNA-Gly-GCC-3-119567480.6310
8hsa-tRNA-Gly-CCC-6-119330810.427
9hsa-nmt-tRNA-Gln-TTG-6-121401790.476
10hsa-tRNA-Lys-CTT-2-125322600.546
11hsa-tRNA-Pro-TGG-3-137120950.455
12hsa-tRNA-Pro-AGG-1-15095820.506
13hsa-tRNA-Asn-GTT-15-16928950.415
14hsa-tRNA-Asn-GTT-19-17028940.425
15hsa-nmt-tRNA-Ser-TGA-3-17131480.616
16hsa-tRNA-Und-NNN-4-18020510.615
17hsa-tRNA-Glu-TTC-10-18917370.685
Table 7

'Dynamic' sperm piRNA.

OrderpiRNAAvg rank expressionAvg expression (CPM)Avg rank CV expression (within-subject)Avg CV expression (within-subject)Count candidate
1hsa-piR-26211207672950.516
2hsa-piR-31148283691400.7614
3hsa-piR-22784401751850.6210
4hsa-piR-15477422863750.495
5hsa-piR-6428432363660.647
6hsa-piR-3099970882670.548
7hsa-piR-557077852420.569
8hsa-piR-5304791213180.657
9hsa-piR-4648101693380.525
10hsa-piR-170251011493130.708
11hsa-piR-88911141173720.677
12hsa-piR-18089114832440.699
13hsa-piR-772119603840.566
14hsa-piR-28626120611690.6914
15hsa-piR-14072126741410.7713
16hsa-piR-26349146353700.496
17hsa-piR-30538154362540.548
18hsa-piR-875175303290.536
19hsa-piR-19697187253020.506
20hsa-piR-23716189371250.7916
21hsa-piR-6191189372430.5911
22hsa-piR-2804419957710.9314
23hsa-piR-19009203371270.7814
24hsa-piR-23367210214310.475
25hsa-piR-10037213271670.7214
26hsa-piR-4898217254570.485
27hsa-piR-31194218254550.495
28hsa-piR-18612220194380.425
29hsa-piR-27695227224040.445
30hsa-piR-1129235452810.638
31hsa-piR-20280252212670.556
32hsa-piR-1386260172680.576
33hsa-piR-445268162780.566
34hsa-piR-20373269143500.486
35hsa-piR-1611927059900.8212
36hsa-piR-1899827059900.8212
37hsa-piR-13390271143640.497
38hsa-piR-608272221000.8214
39hsa-piR-27282274251120.8211
40hsa-piR-17850278241160.8213
41hsa-piR-6527279201420.7512
42hsa-piR-1823288272980.658
43hsa-piR-1529025930.8613
44hsa-piR-4673292173670.545
45hsa-piR-13152306162180.616
46hsa-piR-19503308163530.555
47hsa-piR-14706312113780.466
48hsa-piR-13606325133400.495
49hsa-piR-6999326181100.8711
50hsa-piR-21337328362460.718
51hsa-piR-20083336161030.8411
52hsa-piR-3188339143910.575
53hsa-piR-5647342231280.838
54hsa-piR-24896343131890.6711
55hsa-piR-22687345103110.577
56hsa-piR-891335017840.8811
57hsa-piR-15758351141400.779
58hsa-piR-10155354133290.536
59hsa-piR-29889359131410.7610
60hsa-piR-132536215930.8412
61hsa-piR-20495370131110.8312
62hsa-piR-2938037393690.505
63hsa-piR-19468374111810.697
64hsa-piR-13532374131490.748
65hsa-piR-30832378172350.658
66hsa-piR-1711379111160.7910
67hsa-piR-667438092870.597
68hsa-piR-15190381111140.7910
69hsa-piR-30472387131250.787
70hsa-piR-7841394141100.846
71hsa-piR-4093396121320.818
72hsa-piR-21538400101880.676
73hsa-piR-458040493520.535
74hsa-piR-2371441192030.675
75hsa-piR-21903417101090.819
76hsa-piR-28592418152630.678
77hsa-piR-1219542113790.8910
78hsa-piR-2044942511990.848
79hsa-piR-12469431101310.827
80hsa-piR-2835043591120.808
81hsa-piR-7054437101450.776
82hsa-piR-912543913690.9610
83hsa-piR-1844344291000.8610
84hsa-piR-1672545691180.817
85hsa-piR-339845810910.878
86hsa-piR-2804746192070.656
87hsa-piR-1277146481090.827
88hsa-piR-2671446891460.767
89hsa-piR-2408446891000.836
90hsa-piR-251946881350.817
91hsa-piR-1077947372250.625
92hsa-piR-1975547572310.615
93hsa-piR-2829749671430.795
94hsa-piR-2070949671590.725
95hsa-piR-2711351281370.776
96hsa-piR-190355198760.946
97hsa-piR-849055072420.696
Figure 2

‘Dynamic’ sperm sncRNA populations display characteristic within-subject patterns of expression across time. Graphs represent the expression patterns of sperm sncRNA, the ranked expression (highest to lowest) of each (A) miRNA, (B) tRNA, and (C) piRNA is plotted by time within each subject. The final pool of ‘dynamic’ sncRNA (N = 33 miRNA, N = 17 tRNA, and N = 97 piRNA) are plotted in dark blue, while all other consistently expressed sncRNA are plotted in light blue. This figure generated using R (version 3.5.2) and the package ‘ggplot2’ (version 3.2.0) (https://cran.r-project.org/).

'Dynamic' sperm miRNA. 'Dynamic' sperm tRNA. 'Dynamic' sperm piRNA.

Perceived stress and ‘dynamic’ sperm sncRNA

To assess the potential relationship between perceived stress and individual ‘dynamic’ sncRNA, we conducted a series of linear models to test for relationships between ‘dynamic’ sncRNA expression and PSS scores. Based on data from our lab in a mouse model of paternal stress, we hypothesized that there would be a delay in the impact of stress experience on the expression of ‘dynamic’ sncRNA[17]. Therefore, we evaluated the following seven relationships for an individual sncRNA’s expression level in a sperm sample and: (1) PSS score at the time it was collected (t), (2) PSS score at the time of the previous collection (t − 1), (3) PSS score at t − 2, (4) PSS score at t − 3, (5) change in PSS score between t and t − 1, (6) change in PSS score between t and t − 2, and (7) change in PSS score between t and t − 3. The interpretation of the first four models was that the level of a sncRNA changes in a direct relationship with PSS score, possibly with a delay (i.e., a rheostat model), whereas the last three model an increase or decrease in a sncRNA relative to the magnitude of a change in PSS score over the specified period (i.e., a change detector). Ultimately, we identified five ‘dynamic’ miRNA with associations to PSS scores that passed our significance cutoffs (p < 0.01 and FDR < 0.2): let-7f-5p, miR-181a-5p, miR-4454, miR-6765-3p, and miR-12136 (Table 8). We identified four ‘dynamic’ tRNA with significant associations to PSS scores: tRNA-Gly-GCC-3-1, tRNA-Lys-CTT-1-1, tRNA-Lys-CTT-2-1, and tRNA-Lys-CTT-4-1 (Table 8). There were no significant associations between ‘dynamic’ piRNA and PSS scores (Fig. 3A).
Table 8

Characteristics of linear fixed effects models assessing the relationship between perceived stress experience and the expression of ‘dynamic’ sncRNA: Table includes model regression coefficients (log2FC), adjusted p-values, and FDR. Significant relationships are highlighted in bold.

Dynamic sncRNAsncRNA ∼ PSS time of collection (t)sncRNA ∼ PSS at t-1sncRNA ∼ PSS at t-2sncRNA ∼ PSS at t-3sncRNA ∼ delta PSS t-1sncRNA ∼ delta PSS t-2sncRNA ∼ delta PSS t-3
log2FCp (adjusted)FDRlog2FCp (adjusted)FDRlog2FCp (adjusted)FDRlog2FCp (adjusted)FDRlog2FCp (adjusted)FDRlog2FCp (adjusted)FDRlog2FCp (adjusted)FDR
hsa-let-7f-5p− 0.0600.0030.045− 0.0530.0130.220− 0.0410.1040.343− 0.0380.2110.700− 0.0090.7540.998− 0.0080.1040.839− 0.0340.3940.744
hsa-miR-181a-5p− 0.1160.0000.001− 0.1030.0010.019− 0.0900.0080.128− 0.1130.0050.078− 0.0080.8520.998− 0.0110.0080.839− 0.0100.8520.937
hsa-miR-4454− 0.0410.1240.511− 0.0540.0600.392− 0.0860.0080.128− 0.0910.0170.115− 0.0020.9580.9980.0600.0080.6830.0710.1570.633
hsa-miR-6765-3p− 0.0210.4450.698− 0.0290.3220.665− 0.0730.0230.194− 0.0980.0070.078− 0.0150.7000.9980.0370.0230.6830.0850.0790.633
hsa-miR-12136− 0.0330.0720.338− 0.0400.0450.392− 0.0520.0280.194− 0.0800.0040.0780.0090.7270.9980.0140.0280.7730.0530.1450.633
hsa-tRNA-Gly-GCC-3-10.0220.3700.8970.0440.0560.3420.0670.0110.0750.0770.0120.087− 0.0240.4210.854− 0.0830.0130.054− 0.1040.0050.023
hsa-tRNA-Lys-CTT-1-10.0070.8380.9500.0340.2020.3820.0640.0310.0870.0770.0310.087− 0.0360.3290.854− 0.1040.0060.037− 0.1370.0020.009
hsa-tRNA-Lys-CTT-2-10.0270.3090.8970.0410.0770.3420.0590.0220.0750.0830.0070.087− 0.0390.1930.854− 0.0900.0050.037− 0.1300.0010.009
hsa-tRNA-Lys-CTT-4-10.0110.7460.9060.0390.1620.3820.0710.0190.0750.0830.0190.087− 0.0340.3650.854− 0.1070.0060.037− 0.1390.0010.009

Table includes model regression coefficients (log2FC), adjusted p-values, and FDR. Significant relationships (p < 0.01 and FDR < 0.2) are highlighted in bold.

Figure 3

Individual ‘dynamic’ sperm miRNA and tRNA, but not piRNA, are significantly correlated with perceived stress experience, potentially acting to influence the earliest stages of development to shape long-term health outcomes. (A) Summary of analysis results, including the class-specific numbers of consistently expressed (total) and ‘dynamic’ sperm sncRNA, and the number of ‘dynamic’ sncRNA with statistically significant associations with perceived stress experience. (B) Theoretical model linking a chain of events beginning with an environmental experience or exposure, including stress, impacting sperm sncRNA levels. Acting individually or interacting together as part of a broader sncRNA code, these ‘dynamic’ sncRNA are able to transmit the encoded information regarding the paternal environment to offspring, potentially impacting developmental processes (e.g., rates of fertility, embryo division and implantation). Even if the initial impact of these ‘dynamic’ sperm sncRNA was small (indicated by ⍺1 or ⍺2), consequent shifts in the timing of developmental windows of susceptibility to additional events could produce significant differences over time, either in positive (resilience, indicated in blue) or negative (risk, indicated in red) directions when compared to a typical developmental trajectory (as indicated in black).

Illustrations by Tim Phelps © 2020 JHU AAM.

Characteristics of linear fixed effects models assessing the relationship between perceived stress experience and the expression of ‘dynamic’ sncRNA: Table includes model regression coefficients (log2FC), adjusted p-values, and FDR. Significant relationships are highlighted in bold. Table includes model regression coefficients (log2FC), adjusted p-values, and FDR. Significant relationships (p < 0.01 and FDR < 0.2) are highlighted in bold.

Discussion

There is a growing appreciation for the importance of the paternal preconception environment in the developmental programming of offspring[1,15,44,48-51]. Though the mechanisms underlying this developmental plasticity are likely adaptive, in the context of human health, these effects may be expressed as changes in disease risk or resilience. Recent work from our lab and others demonstrating direct causal associations between sperm RNA and complex offspring phenotypes have shifted the focus of the investigations into the mechanisms underlying intergenerational transmissions to sperm RNA[17,19-25]. In the early 2000s, Stephen Krawetz and colleagues demonstrated that human sperm contained specific populations of RNA, that the RNA species present were conserved across healthy subjects, and that these RNA were delivered to the ovum, where they played a functional role in early zygote development[65,66]. Though the initial focus of this work was on protein-coding RNA, these studies also identified sncRNA in human sperm, including miRNA, tRNA, and piRNA, and as our understanding of the functional relevance of sncRNA in cell physiology has advanced, so too has our appreciation for the importance of sncRNA in the function of the germ cell and in regulating the earliest processes of newly fertilized zygotes[26,47,62,66-70]. A portion of the sncRNA present in sperm may only be remnants of spermatogenic processes, such as the RNA fragmentation products of ribosomal RNA subunits extensively degraded to suppress spurious protein translation in these transcriptionally quiescent cells[71]. However, it is clear that much of the sncRNA content of sperm is not the product of stochastic processes, but is actively shaped, in part, through interactions with extracellular vesicles (EVs) secreted by somatic cells along the reproductive tract, including the epididymis[63,72-74]. Mechanistic studies in animal models show an association of changes in germ cell sncRNA with intergenerational transmission[17-25,27,28,31,32,36-38,40,41,52,53]. Several labs independently demonstrated that injecting total RNA isolated from sperm exposed to environmental manipulations, specific classes of sncRNA (often differentiated by size), or even specific environmentally-responsive sncRNA into newly fertilized zygotes was sufficient to transmit/phenocopy complex phenotypes in affected offspring[17,19-25]. Epidemiological studies suggest that similar processes may link paternal adverse experiences and offspring disease risk, but causal or prospective data are lacking[28,39,42,53-56]. Progress in the field has been held back by a lack of critical details regarding many of the necessary factors to design prospective clinical studies and test such hypotheses. There is a primary need to first understand fundamental dynamics of sperm transcriptomics. For example, to differentiate between variation driven by genetics vs. environment (intrinsic vs extrinsic factors), it is necessary to examine sperm content over multiple time points (within- vs between-subject comparisons). Therefore, we have established an extensive dataset describing the dynamics of sperm sncRNA expression over time and across a normative cohort of human subjects. Of course, defining any cohort as ‘normative’ can be problematic. Our selection criteria were developed to recruit a healthy group of males with minimal heterogenous and confounding characteristics. However, age can influence sperm epigenetics, including sncRNA content, and our cohort may not comprise the typical age distribution of males with reproductive intent[17,75-78]. In addition, the lived experience of people from different racial or ethnic groups can vary dramatically and are likely to influence the dynamics of specific sperm sncRNA expression. Future studies will need to determine how generalizable our current model is from this initial cohort across the diversity present in the broader population. As has been previously reported, miRNA comprised a smaller fraction of the total sperm sncRNA pool in our subjects[54,56]. However, each miRNA can regulate the expression of hundreds of genes, and multiple miRNAs can collaborate in targeting extensive cellular processes and molecular pathways[79]. For example, we previously demonstrated that injecting a specific pool of 9 stress-sensitive miRNA into newly fertilized mouse zygotes extensively altered the expression of specific target stored maternal mRNA transcripts within 24 h[19]. In our current study, an average of 0.73% of total reads aligned to mature miRNA (compared to 29% aligning to the Ensembl ncRNA reference transcriptome), and after filtering, we identified a pool of 254 consistently expressed miRNA. We noted that several of the top-expressed miRNA in this pool have known functions in spermatogenesis or the epididymal maturation of sperm, including miR-10a-5p (consistently one of the highest expressed sperm miRNA), miR-30a-5p, and miR-26a-5p (84–86)[80-82]. Others may play important roles in the earliest stages of zygotic development[81]. For example, miR34c-5p is among the highest expressing sperm miRNA in humans and mice, and is required for the first cleavage division in mouse zygotes[68,83,84]. These studies suggest functional roles for sperm miRNA in biological processes important to reproduction and that may impact post-fertilization embryo development. Our experimental design did not include manipulations of human subjects or interventions; instead, it was intended to build an initial framework from a ‘normative’ human subject cohort, including examination within subjects over time and comparisons between subjects. By collecting samples across an extended period, we anticipated exploiting variation in the experiences of participants to screen for environmentally responsive ‘dynamic’ sperm sncRNA. Using screening criteria based on three a priori assumptions, as detailed in the “Methods”, we identified 33 final ‘dynamic’ miRNA—highly expressed and with a sufficient dynamic pattern over time, both within- and between-subjects. In validation of our assumptions and selection criteria, we found that several of the ‘dynamic’ miRNA identified were also previously reported in sperm from rodents and humans, and associated with male environmental perturbations, including chronic stress[18,21,28]. For example, miR-449a levels were reduced in sperm from adult men who had experienced a high number of adverse childhood events and in the sperm of male mice following chronic stress[28]. Further, in two independent models, miR-375-3p was elevated in the sperm from adult male mice following prior chronic stress experience that occurred in the postnatal or pubertal/adult period[18,21]. tRNA are an abundant class of sncRNA also found in mature sperm[73,84,85]. In our study, just over 2% of total reads aligned to tRNA, and we identified 194 consistently expressed tRNA. In addition to the role intact tRNA play in protein translation, tRNA-derived RNA fragments (tRFs) regulate gene expression, and in some instances act in concert with cellular machinery already in place for miRNA and piRNA actions[86-88]. In 2016, two studies implicated sperm tRFs in the epigenetic germline inheritance of complex metabolic phenotypes following paternal high fat or low protein dietary exposures in mice[23,25]. At ~ 75 bp, intact tRNA are significantly longer than either miRNA or piRNA, and also longer than the 36 bp read lengths of our sequencing dataset, and therefore, we were limited in our ability to differentiate between sequence reads derived from intact tRNA and those from tRFs[88]. In examining the expression pattern of sperm tRNAs across subjects, it was clear that this class of sperm sncRNA is far less variable overall than either miRNA or piRNA. However, we did identify a subset of 17 sperm tRNA that met our criteria for ‘dynamic’ sncRNA. Focusing on the expression of these ‘dynamic’ tRNA over time, there were clearly subjects who had more variable expression than others, which might correlate with between-subject variation in environmental factors that were not examined in our study. It was also clear that the most ‘dynamic’ tRNA for a given subject tended to be present at lower overall expression levels. Three of the final ‘dynamic’ tRNA we identified, tRNA-Gly-GCC-1-1, tRNA-Gly-GCC-2-1, and tRNA-Gly-GCC-3-1, transport the same amino acid (glycine) and share the same anticodon sequence (i.e., isodecoders). Reduced sperm expression of tRFs derived from two other tRNA-Gly-GCC isodecoders were previously associated with poor embryo quality when those human sperm samples were used for in vitro fertilization, supporting an important biological function for these fragments in sperm[85]. Consistent with previous reports, piRNA were the most prevalent class of sperm sncRNA in our dataset in terms of expression abundance[54,80]. An average of 4.9% of total sequence reads aligned to piRNA across subjects. In addition to comprising the largest proportion of the total sperm sncRNA pool, far more piRNA (837) were consistently expressed across sperm samples. This was not unexpected, as tens of thousands of human piRNA have been annotated[89]. piRNA are predominately expressed in the germline where they fulfill their canonical role in maintaining genomic stability by repressing repetitive elements; though there is a growing appreciation for the role piRNA play in the post-transcriptional regulation of protein-coding transcripts[80,90,91]. Interestingly, in D. melanogaster and C. elegans, piRNA play a key role in transgenerational epigenetic inheritance of specific traits[92,93]. In rodents, changes in sperm piRNA expression were reported in association with male dietary manipulations and early life chronic stress experiences[20,21,41]. Of consistently expressed sperm piRNA, 97 met selection criteria and were categorized as ‘dynamic’. Compared to miRNA, little is known about the functional role of individual piRNA, especially in mature sperm. Interestingly, in our study, we noted that when viewed across time, piRNA expression patterns in many subjects appear to display a bi-monthly cycle. Similar but less apparent patterns were also noted in the miRNA plots. Supporting a potential biological relevance, 15 of the 97 ‘dynamic’ piRNA had a cyclical expression pattern in at least 25% of our subjects. Such an expression pattern may reflect cyclical changes in extrinsic environmental factors or an intrinsic rhythm in male fertility not previously described. Recent studies from our lab and others have demonstrated that the sperm sncRNA content is responsive to prior stress experience[17,18,21,28,31]. Further, our recent molecular studies in mice identified the specific timing at which previously stressed males were able to transmit a specific phenotype to their offspring[17]. These studies demonstrated a key finding in the field, that intergenerational transmission may occur, and in some contexts may only occur after an extended period following the cessation of the insult. This was a critical piece of the puzzle for formulating hypotheses in modeling human subject data—when to expect a detectable change to show up in the sperm after a given experience? Therefore, we evaluated a series of statistical models to align our sncRNA expression data with that of a clinical measure of perceived stress. These analyses identified significant associations between the expression of several ‘dynamic’ sncRNA and either the subject’s absolute PSS score, concurrently or at previous timepoints (rheostatic model), or the change in PSS score between timepoints (delta-detector). Five ‘dynamic’ miRNA and four tRNA were determined to be significantly associated with perceived stress experience. Of these, we noted that several had been identified in previous studies as important, including the miRNA, let-7f-5p and miR-181a-5p, and the tRNA, tRNA-Gly-GCC-3-1, tRNA-Lys-CTT-1-1, tRNA-Lys-CTT-2-1, and tRNA-Lys-CTT-4-1[25,56,94]. The miRNA let-7f-5p and tRFs derived from tRNA-Gly-GCC and tRNA-Lys-CTT isodecoders were previously identified as differentially expressed in the sperm of male mice following exposure to a low-protein diet[25]. In that study, injection of a tRF derived from the 5′ end of tRNA-Gly-GCC depressed the levels of genes highly expressed in preimplantation embryos targeted by the endogenous retroelement MERVL. tRNA-Lys-CTT levels were increased in human sperm following a high sugar diet exposure[56]. Of great relevance to our results, a recent study identified significant correlations between levels of prior childhood trauma and adult plasma miR-181a-5p levels and several members of the let-7 family, similar to our results in sperm[94]. This suggests that these miRNA may be part of an evolutionarily conserved stress-responsive mechanism, conserved across tissues and species. How this would impact post-conception reproduction or embryonic development is unknown, but one of the top predicted gene targets of these miRNA, protogenin (PRTG), could play a role[95-97]. PRTG is involved in regulating early embryonic developmental transitions and trophoblast differentiation, and has been associated with ADHD and measures of cognitive development in multiple studies[95,96,98,99]. Therefore, changes in levels of sperm let-7f-5p and miR-181a-5p delivered at fertilization could regulate PRTG expression, among other genes, impacting rates of embryo division and implantation. Even if these effects were small, consequent shifts in the timing of developmental windows of susceptibility to environmental signals could produce significant differences in development and health outcomes mapped onto genetic risk (as shown in the schematic in Fig. 3B).

Conclusion

In these studies, we have utilized between- and within-subject sperm sncRNA comparisons and provided an initial framework onto which additional human subject data can be built. These data confirm high expressing common miRNA, tRNA, and piRNA that were dynamic in their pattern of expression over time, likely responsive to a factor in the internal or external environment. Further, using our perceived stress state analyses, we were able to identify miRNA and tRNA that fit strict modeling criteria for changing their expression levels following a previous perceived high stress state. Much work remains to be done in this field, but these data provide a powerful starting point. From here, it is conceivable that one day we may be able to map onto such a normative data set the sncRNA expression of fathers of children with neurodevelopmental disorders or men with traumatic experiences, such as returning from military service, and be able to identify biomarkers predictive of offspring developmental risk and resilience factors.

Methods and materials

Subject recruitment

A cohort of 18 healthy males was recruited from the University of Pennsylvania student body to establish normative sperm molecular signatures as a benchmark for comparison to later clinical populations. The study was approved by the Perelman School of Medicine at the University of Pennsylvania Institutional Review Board, all participants provided written informed consent, and all research was performed in accordance with relevant guidelines. Subjects between the ages of 18 and 25 were screened for history of major medical illnesses, mental health diagnoses, and substance abuse. Grounds for exclusion included: (1) history (participant self-report) of major medical illnesses or other current medical conditions that the physician investigator deemed as contraindicated for study participation; (2) regular or recreational use of any psychotropic medication (e.g., antidepressants, antipsychotics, psychostimulants or anxiolytics), as per self-report, (3) recent (within previous year) diagnosis (per MINI International Neuropsychiatric Interview) or treatment for any psychiatric disorder or substance use disorder (previous 2 years), (4) lifetime history of schizophrenia or other psychotic disorder, substance addiction disorder (excepting nicotine), (5) current use of any tobacco products, determined by urine cotinine level; and (6) positive drug screen for any substance, determined by urine drug screen at screening[100].

Study procedures

The study involved a total of 7 visits. The first visit was a screening visit to determine participant eligibility. The following 6 visits were sperm collection visits. During the screening visit, subjects underwent an in-office assessment including a urine toxicology screen, urine cotinine screen, and clinical assessments, including the Adverse Childhood Experiences (ACE) questionnaire, and the MINI International Neuropsychiatric Interview[100,101]. Subsequent visits (2–7) took place once a month for 6 months. At these visits, subjects submitted a semen sample, collected at home within the previous hour, to experienced andrologists at Penn Fertility Care clinic for processing and sample cryopreservation. Participants were asked to abstain from ejaculation for 48 h prior to semen collection. Within the same day, participants also completed a series of questionnaires to assess stress and anxiety experienced over the previous month, including the Perceived Stress Scale (PSS) and the Spielberger State Trait Anxiety Inventory (STAI)[61,102]. The PSS is the most commonly used instrument to assess perception of stress over the previous month[61]. It measures the degree to which situations in a subject’s life are appraised as stressful. The instrument taps into how unpredictable, uncontrollable and overloaded respondents experience their lives. The 10-item scale includes a number of direct queries about current levels of experienced stress during the last month. One participant did not return for their final donation, therefore only timepoints 1–5 were available for subject 11. In addition, one subject was excluded due to consistently low sperm quality across donated samples, leaving a final study cohort of 17 subjects for sperm sncRNA analysis.

Sperm sncRNA sequencing

Procedures for the isolation of small RNA from mature sperm were adapted from a published protocol[103]. Briefly, cryopreserved sperm samples were thawed, suspended in PureSperm Buffer (Nidacon), then mature sperm were enriched by centrifugation (300g, 15 min) through a 50% PureSperm density gradient (Nidacon). Sperm were then lysed in TRIzol-LS (Thermo Fisher) reagent, supplemented with 0.2 M β-mercaptoethanol and 100 mg of nuclease-free stainless-steel beads, by homogenization on a Disruptor Genie (Scientific Industries) at 3000 rpm for 5 min. RNA, enriched for small RNA, was isolated using Qiagen’s miRNeasy Mini kit according to manufacturer’s instructions. RNA concentration and quality were assessed using Agilent’s small RNA chips run on a Bioanalyzer 2100 (Agilent Technologies). The small RNA content of sperm samples was analyzed by small RNA sequencing. Libraries were constructed using the TruSeq small RNA Library Prep Kit (Illumina). RNA input for library preparation was standardized to 10 ng of small RNA, in accordance with the manufacturer’s protocol. Post-PCR cleanup and size selection for products > 100 bp was performed using AMPure XP bead purification. Library size distribution and quantification was performed on a TapeStation 4200 (Agilent) using their High Sensitivity D1000 screentape. Individually barcoded libraries were pooled to achieve ~ 10 million reads per sample and sequenced on an Illumina NextSeq 550 (36-bp single-end).

Bioinformatic analysis pipeline

The small non-coding RNA sequencing (sncRNASeq) analytical pipeline was designed using the snakemake framework and is available via GitHub at https://github.com/acshetty/sncRNA-seq-analysis[104]. Reference transcriptome sequences in FASTA format were downloaded from public repositories. These included the GRCh38 ncRNA reference from Ensembl, the miRNA reference from miRbase 21, the tRNA reference generated from GRCh37 by GtRNAdb 2.0, and the piRNA reference from piRBase v1.0[105-107]. Using ‘index_ref’ component, each reference FASTA file was indexed using the ‘bowtie-build’ from the Bowtie short read aligner software[108]. The sequencing reads have lengths longer than the average size of most sncRNAs which may result in the inclusion of adapter sequence at the 3′-end of the read sequence. Hence, the ‘trim_fq’ component was invoked in order to remove trailing adapter sequence using the Trimmomatic tool[109]. After trimming, reads shorter than 15 nucleotides were discarded before downstream analyses. The trimmed reads for each sample were then aligned, using the ‘align_reads’ component, to each of the different sncRNA class-specific reference transcriptomes using the Bowtie short read aligner[108]. Reads were aligned allowing for 2 mismatches and a seed length of 15 nucleotides. The alignment statistics were summarized for each sample across each of the different reference sequences using the ‘merge_alignment_statistics’ component. The raw expression values were computed using the ‘compute_expr’ component based on the number of reads aligned to each of the sncRNAs specified in their respective reference files. For each type of sncRNA, the raw expression values were merged across samples using the ‘merge_expr’ component to generate a count matrix for downstream analysis.

Characterizing the dynamics of sperm sncRNA expression

Expression of sncRNA were adjusted for differences in library sequencing depth to generate counts per million reads (CPM) following TMM normalization using the Bioconductor package ‘edgeR’ (version 3.24.3)[110]. Features were retained for analysis if they were expressed at ≥ 1 CPM in at least 75% of samples. sncRNA aligning to each reference transcriptome were analyzed separately. Analyses were performed at the level of the individual sample, within each subject, and between subjects. At the sample level, the expression (CPM) of each feature (i.e. miRNA, tRNA, or piRNA) was ranked from highest to lowest and categorized if expressed in the top or bottom quartiles. These expression values were used to perform analyses of expression and variation at the within-subject level. Within-subject measures of expression include average expression, average ranked expression, and the number of collections a feature was expressed in the top or bottom quartile. Within-subject measures of variation included the coefficient of variation (CV) of a feature’s expression across collections, which was used to rank features from most to least variable, then categorize features if in the top or bottom quartiles of within-subject variation. Between-subject measures of expression (average expression and average rank expression) and variation [average CV expression (within-subject) and average rank CV expression (within-subject)] were calculated for each sncRNA from mean within-subject expression levels summarized across subjects. To identify potential environmentally responsive ‘dynamic’ sncRNA, we exploited the within- and between-subjects structure of the sperm sncRNA expression data to developed screening criteria based on three a priori assumptions: (1) we assumed ‘dynamic’ sncRNA were likely to exhibit a higher degree of variation in expression over time in response to changes in the environment; (2) given the disparity in the amount of RNA present in sperm relative to ovum, we assumed sncRNA with the potential to impact offspring development would need to be highly expressed; and (3) we assumed sncRNA meeting criteria based on these first two assumptions in multiple subjects were more likely to reflect a conserved functional response to extrinsic factors[62-64]. The analysis we performed to characterize these properties and identify ‘dynamic’ sncRNA is illustrated in Fig. 1C. The first two criteria were based on within-subject expression characteristics. Features (sncRNA) were categorized as candidates in a given subject if: (1) a feature exhibited within-subject variation (CV expression) ranked in the top quartile of each class of sncRNA, and (2) the feature’s expression was ranked in the top quartile in ≥ 1 collection over the 6 months. (3) The final criteria required that a ‘dynamic’ sncRNA met the first two within-subject criteria in at least 25% (N = 5) of subjects (count candidate ≥ 5).

Identifying ‘dynamic’ sperm sncRNA responsive to perceived stress experience

The relationship between normalized expression counts for the subsets of ‘dynamic’ sperm sncRNA and PSS scores were analyzed by implementing linear fixed effects models using the Bioconductor package ‘limma’ (version 3.38.3)[111]. We implemented 7 different models to evaluate the relationship between the expression of individual ‘dynamic’ sncRNA and PSS scores, testing the following associations: (1) miRNA ~ PSS score at the time of collection (t); (2) miRNA ~ PSS score at collection t − 1; (3) miRNA ~ PSS score at collection t − 2; (4) miRNA ~ PSS score at collection t − 3; (5) miRNA ~ (change in PSS score between t and t − 1); (6) miRNA ~ (change in PSS score between t and t − 2); (7) miRNA ~ (change in PSS score between t and t − 3). The first four of these models tested for the relationship between expression of a miRNA and PSS scores using a ‘rheostat’ approach allowing for a delayed response. The final three models tested for the relationship between miRNA expression and the change in PSS scores using a ‘delta detector’ approach. The final results were tabulated and filtered for a p-value < 0.01 and FDR < 0.2 to detect significant associations.

Software used for data analyses

Sperm sncRNA sequencing data was trimmed, aligned, and counted using the small non-coding RNA sequencing (sncRNASeq) analytical pipeline, which is available in its entirety via GitHub at https://github.com/acshetty/sncRNA-seq-analysis. All other data processing, visualization, and statistical modeling were performed in the R software environment (version 3.5.3)[112]. Expression normalization was performed using the Bioconductor package ‘edgeR’ (version 3.24.3)[110]. Multivariate analyses were performed with the base R package ‘stats’ (version 3.5.3)[112]. Most data visualizations were generated using ‘ggplot2’ (version 3.2.0)[113]. Linear modeling of the relationships between ‘dynamic’ sncRNA expression and PSS scores was performed using the Bioconductor package ‘limma’ (version 3.38.3)[111]. Supplementary Information.
  110 in total

1.  Smoking induces differential miRNA expression in human spermatozoa: a potential transgenerational epigenetic concern?

Authors:  Emma L Marczylo; Akwasi A Amoako; Justin C Konje; Timothy W Gant; Timothy H Marczylo
Journal:  Epigenetics       Date:  2012-05-01       Impact factor: 4.528

2.  The microRNA signature of mouse spermatozoa is substantially modified during epididymal maturation.

Authors:  Brett Nixon; Simone J Stanger; Bettina P Mihalas; Jackson N Reilly; Amanda L Anderson; Sonika Tyagi; Janet E Holt; Eileen A McLaughlin
Journal:  Biol Reprod       Date:  2015-09-02       Impact factor: 4.285

Review 3.  Epigenetic germline inheritance in mammals: looking to the past to understand the future.

Authors:  K Gapp; J Bohacek
Journal:  Genes Brain Behav       Date:  2017-09-06       Impact factor: 3.449

4.  Intergenerational impact of paternal lifetime exposures to both folic acid deficiency and supplementation on reproductive outcomes and imprinted gene methylation.

Authors:  Lundi Ly; Donovan Chan; Mahmoud Aarabi; Mylène Landry; Nathalie A Behan; Amanda J MacFarlane; Jacquetta Trasler
Journal:  Mol Hum Reprod       Date:  2017-07-01       Impact factor: 4.025

Review 5.  A history of why fathers' RNA matters.

Authors:  Marta Gòdia; Grace Swanson; Stephen A Krawetz
Journal:  Biol Reprod       Date:  2018-07-01       Impact factor: 4.285

6.  Small RNAs derived from the 5' end of tRNA can inhibit protein translation in human cells.

Authors:  Andrew Sobala; Gyorgy Hutvagner
Journal:  RNA Biol       Date:  2013-04-01       Impact factor: 4.652

7.  Male obesity effects on sperm and next-generation cord blood DNA methylation.

Authors:  Ramya Potabattula; Marcus Dittrich; Martin Schorsch; Thomas Hahn; Thomas Haaf; Nady El Hajj
Journal:  PLoS One       Date:  2019-06-27       Impact factor: 3.240

8.  A novel class of tRNA-derived small RNAs extremely enriched in mature mouse sperm.

Authors:  Hongying Peng; Junchao Shi; Ying Zhang; He Zhang; Shangying Liao; Wei Li; Li Lei; Chunsheng Han; Lina Ning; Yujing Cao; Qi Zhou; Qi Chen; Enkui Duan
Journal:  Cell Res       Date:  2012-10-09       Impact factor: 25.617

Review 9.  A Decade of Exploring the Mammalian Sperm Epigenome: Paternal Epigenetic and Transgenerational Inheritance.

Authors:  Alexandre Champroux; Julie Cocquet; Joëlle Henry-Berger; Joël R Drevet; Ayhan Kocer
Journal:  Front Cell Dev Biol       Date:  2018-05-15

10.  Reproductive tract extracellular vesicles are sufficient to transmit intergenerational stress and program neurodevelopment.

Authors:  Jennifer C Chan; Christopher P Morgan; N Adrian Leu; Amol Shetty; Yasmine M Cisse; Bridget M Nugent; Kathleen E Morrison; Eldin Jašarević; Weiliang Huang; Nickole Kanyuch; Ali B Rodgers; Natarajan V Bhanu; Dara S Berger; Benjamin A Garcia; Seth Ament; Maureen Kane; C Neill Epperson; Tracy L Bale
Journal:  Nat Commun       Date:  2020-03-20       Impact factor: 14.919

View more
  3 in total

1.  Sperm transcriptional state associated with paternal transmission of stress phenotypes.

Authors:  Ashley M Cunningham; Deena M Walker; Aarthi Ramakrishnan; Marie A Doyle; Rosemary C Bagot; Hannah M Cates; Catherine J Peña; Orna Issler; Casey Lardner; Caleb Browne; Scott J Russo; Li Shen; Eric J Nestler
Journal:  J Neurosci       Date:  2021-06-07       Impact factor: 6.167

Review 2.  The Transmission of Intergenerational Epigenetic Information by Sperm microRNAs.

Authors:  Grace S Lee; Colin C Conine
Journal:  Epigenomes       Date:  2022-04-07

Review 3.  Germ Cell Drivers: Transmission of Preconception Stress Across Generations.

Authors:  Korrina A Duffy; Tracy L Bale; C Neill Epperson
Journal:  Front Hum Neurosci       Date:  2021-07-12       Impact factor: 3.169

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.