Rik G H Lindeboom1, Michiel Vermeulen1, Ben Lehner2,3,4, Fran Supek5,6. 1. Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands. 2. Systems Biology Program, Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain. ben.lehner@crg.eu. 3. Universitat Pompeu Fabra, Barcelona, Spain. ben.lehner@crg.eu. 4. Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain. ben.lehner@crg.eu. 5. Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain. fran.supek@irbbarcelona.org. 6. Institut de Recerca Biomedica Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain. fran.supek@irbbarcelona.org.
Abstract
Premature termination codons (PTCs) can result in the production of truncated proteins or the degradation of messenger RNAs by nonsense-mediated mRNA decay (NMD). Which of these outcomes occurs can alter the effect of a mutation, with the engagement of NMD being dependent on a series of rules. Here, by applying these rules genome-wide to obtain a resource called NMDetective, we explore the impact of NMD on genetic disease and approaches to therapy. First, human genetic diseases differ in whether NMD typically aggravates or alleviates the effects of PTCs. Second, failure to trigger NMD is a cause of ineffective gene inactivation by CRISPR-Cas9 gene editing. Finally, NMD is a determinant of the efficacy of cancer immunotherapy, with only frameshifted transcripts that escape NMD predicting a response. These results demonstrate the importance of incorporating the rules of NMD into clinical decision-making. Moreover, they suggest that inhibiting NMD may be effective in enhancing cancer immunotherapy.
Nonsense mediated decay (NMD) is a quality control pathway that degrades mRNAs
containing premature termination codons (PTCs) because of nonsense or frameshifting
mutations[1-3]. However, not all PTCs trigger NMD. For
example, PTCs less than 50 nucleotides (nt) upstream of the last exon-exon junction
typically do not trigger NMD (‘the 50nt rule’)[4]. PTCs in the last exon of a gene were also found not to
trigger NMD (‘last exon rule’)[5,6]. These two
‘canonical rules’ of NMD have been widely validated[7], including by analyzing the impact of
thousands of inherited human PTCs[8,9] and thousands of PTCs introduced by
somatic mutations in human tumors[9,10] on mRNA levels.

Through a large-scale analysis of human cancer exomes and transcriptomes we
recently suggested additional ‘non-canonical rules’ of NMD, e.g. that very
long exons (>~400 nt) inhibit NMD (‘long exon rule’), a
finding supported in subsequent experiments[11]. Moreover, PTCs less than 150 nt from the start codon typically
fail to trigger NMD (‘start proximal rule’), likely because of translation
reinitiation[9]. These rules, and
the effect of the distance between a PTC and the wild-type stop codon, and the presence
of certain RNA-binding protein motifs, were validated in inherited PTCs and together
they explain much of the systematic variability in NMD across genes and mutations in
humans[9].

Studies of individual genes have highlighted that NMD can have a strong bearing
on human disease phenotypes. In beta-thalassemia, 5’ end PTCs in the beta-globin
gene, which are seen by NMD, result in a recessive form of the disease, whereas
3’ end PTCs, which escape NMD, result in the dominant form, implying that NMD has
protective effects by preventing production of a toxic truncated protein. Conversely, in
Duchenne muscular dystrophy, PTCs at the 3’ end of the dystrophin gene result in
mild phenotypes, while PTCs further upstream are seen by NMD and result in a severe
phenotype due to loss of expression of a truncated protein that retains partial
activity. NMD can therefore either ameliorate or aggravate diseases[12-14]. PTCs cause a large proportion of human genetic
diseases[15]. However, the
overall extent to which NMD suppresses or enhances the effects of human genetic disease
is unclear[12,16]. This is an important question because drugs that inhibit NMD
may represent a general strategy to treat diseases when the production of a truncated
protein is beneficial[17-20].

To evaluate the impact of NMD across diseases we have developed a resource,
NMDetective, that predicts the effects of PTCs genome-wide in human
and mouse. Applying NMDetective to known disease mutations reveals that
human genetic diseases are more often aggravated than alleviated by NMD. We then use
NMDetective to show that the efficacy of NMD is an overlooked
consideration when performing gene editing using CRISPR-Cas9 and a cause of ineffective
gene inactivation. Finally, we apply NMDetective to tumor data and show
that whether frameshifting mutations do or do not trigger NMD predicts the efficacy
of cancer immunotherapy.
Results
A resource for genome-wide prediction of NMD efficacy
To evaluate the impact of introducing PTCs across the human genome, we
used rules of NMD learnt from a large-scale analysis of ~10,000 matched
tumor exomes and transcriptomes[9]. The NMD efficacy acting on a particular PTC was estimated
as the –log2 fold-difference in the mRNA level in a tumor
sample bearing the PTC, versus the median mRNA level in a set
of tumors matched by tissue and global gene expression patterns, but with no
detectable PTCs in the transcript, as in Lindeboom, et al. [9] (Fig. 1a). Full NMD efficacy would correspond to a score of 1 (50%
decrease in mRNA level due to a heterozygous PTC) and completely inefficient NMD
a score of 0 (no change in mRNA level). The genomic features associated with
altered NMD efficacy across 2,840 PTCs were incorporated into a predictive model
using Random Forest Regression that explained 71% of the systematic variance in
NMD efficacy, as estimated on an independent set of 3,151 PTC-inducing
frameshifting indel mutations (Fig. 1b;
Methods).
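As a minimal sketch of the score defined above, the following Python snippet computes the -log2 fold-difference between the mRNA level in a PTC-bearing sample and the median level in matched PTC-free samples. The function name and the example values are illustrative assumptions and are not taken from the NMDetective code; the matching of control tumors by tissue and global expression pattern is not shown.

```python
import math
from statistics import median


def nmd_efficacy(ptc_expression, control_expressions):
    """Return -log2(PTC sample mRNA level / median control mRNA level).

    ptc_expression: mRNA level of the transcript in the PTC-bearing sample.
    control_expressions: mRNA levels in matched samples without a PTC
    (hypothetical inputs; how the controls are matched is not modeled here).
    """
    reference = median(control_expressions)
    if ptc_expression <= 0 or reference <= 0:
        raise ValueError("expression levels must be positive")
    return -math.log2(ptc_expression / reference)


# A heterozygous PTC that halves the mRNA level gives a score of 1,
# and an unchanged mRNA level gives a score of 0.
halved = nmd_efficacy(50.0, [90.0, 100.0, 110.0])
unchanged = nmd_efficacy(100.0, [90.0, 100.0, 110.0])
```

Under this definition, scores above 1 or below 0 are possible for individual samples because expression measurements are noisy.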
Figure 1
NMDetective catalogues the effects of all possible PTCs in
the human genome.
a, an overview of the data used to create the
NMDetective-A and -B resources.
b, accuracy of predictions by NMDetective
evaluated on an independent set of frameshifting indel mutations.
c, the NMDetective-B decision tree model. The
number of PTCs in the training set assigned to each group is shown as
n. d, coverage of the gene coding regions with
NMD rules.
We used this model to perform an in silico screen of the
NMD efficacy of all possible single-nucleotide variants and frameshifting
mutations resulting in PTCs in the human and mouse genomes. For the human
genome, this produced 1.2 × 10^8 predictions for 101,781
protein-coding transcripts (UCSC Genes, hg38 assembly). Predictions are
available as a resource named NMDetective-A (Lindeboom, R. G.
H. NMDetective. (2019). doi:10.6084/m9.figshare.7803398).

An analysis of the distribution of NMD efficacy scores (Extended Data Fig. 1a; Methods) suggests that 51% of all possible PTCs in the human
genome are predicted to efficiently trigger NMD (NMDetective-A
score >0.52) resulting in destruction of their mRNA transcripts. In
addition, 22% of PTCs are predicted not to trigger NMD
(NMDetective-A score <0.25) meaning that their
transcripts will not be degraded (the remaining 27% of PTCs have an intermediate
NMD efficacy). Thus, whether NMD is triggered or not varies very extensively
across possible PTCs in the human genome.
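The three score classes described above can be expressed as a simple threshold function. This is a sketch only: the cutoffs 0.52 and 0.25 come from the text, while the function name and class labels are assumptions made for illustration.

```python
def classify_nmd_score(score, evade_below=0.25, trigger_above=0.52):
    """Assign an NMDetective-A score to one of three classes.

    Thresholds follow the text: >0.52 is treated as efficient NMD,
    <0.25 as NMD evasion, and everything in between as intermediate.
    """
    if score > trigger_above:
        return "efficient"
    if score < evade_below:
        return "evading"
    return "intermediate"


# Hypothetical scores, one per class.
labels = [classify_nmd_score(s) for s in (0.70, 0.40, 0.10)]
```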
Extended Data Fig.1
The distribution of genome-wide NMD efficacy scores and of NMD rules in
all genes with more than 20 disease-associated PTC variants.
a, the distribution of NMDetective-A scores over all
genes in hg38 reveals three global clusters of inefficient,
intermediate-efficiency and efficient NMD. b, genes in which
there is an excess of PTCs in NMD-evading regions (left barplot) and genes
where there is a dearth of PTCs in NMD-evading regions (right barplot). The
proportion of PTCs in different NMD-evading regions is shown as colored
segments in the bar chart. The relative portion of the protein-coding mRNA
sequence that is covered by the NMD rules is shown as a black vertical
stripe. c, a schematic of a gene that illustrates how PTCs that
trigger or evade NMD can lead to different outcomes in protein
expression.
In addition to NMDetective-A, which uses Random Forest
Regression, we also implemented a simplified predictive model,
NMDetective-B, that is fully transparent about the series
of tests it performs in a decision tree to reach the final prediction about
whether a PTC triggers NMD or not (Fig.
1c). The predictive performance of NMDetective-B is only
slightly lower than that of the Random Forest model (68% versus 71% variance
explained on an independent set of PTCs introduced by frameshifting indels;
Fig. 1b).
NMDetective-B therefore provides a reasonable trade-off for
situations where interpretability is critical, such as in clinical
applications.

NMDetective-B consists of four nested tests that
incorporate the two canonical rules of NMD as well as two additional rules
learnt by analyzing the impact of thousands of PTCs in cancer genomes[9]. The four nested tests in the
tree are: (i) if the PTC is in the last exon; (ii) if it is in the last 50 nt of
the penultimate exon; (iii) if it is less than 150 nt away from the start codon;
and (iv) if it is in a long exon (>407 nt). The four rules in the
decision tree were obtained via automated inference from data, without manual
adjustment, yet they nonetheless reflect the known mechanistic bases for NMD
activity.

All four rules are important for evaluating the impact of PTCs in the
human genome. Considering all possible PTCs, 55% are seen by NMD, while others
evade NMD: the ‘last exon rule’ covers 18% of PTCs, the
‘50nt rule’ 3%, the ‘start-proximal rule’ 12%, and
the ‘long exon rule’ 12% of PTCs (Fig. 1d).
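The four nested tests of NMDetective-B can be sketched as a short function. This is a simplified illustration and not the released implementation: it assumes the tests are applied in the order listed in the text, that positions are given in spliced coding-sequence coordinates, and that all exon-exon junctions lie within the coding sequence (junctions in untranslated regions are ignored). The function name, argument names and returned labels are hypothetical, and the predicted NMD efficacy attached to each leaf of the tree is not reproduced here.

```python
from bisect import bisect_right
from itertools import accumulate


def nmdetective_b_rule(ptc_pos, exon_lengths):
    """Return the first NMD rule that applies to a PTC.

    ptc_pos: 0-based offset of the PTC from the first nucleotide of the
        start codon, in spliced coding-sequence coordinates (assumption).
    exon_lengths: coding length of each exon, in transcript order.
    """
    exon_ends = list(accumulate(exon_lengths))  # cumulative exon end offsets
    if not 0 <= ptc_pos < exon_ends[-1]:
        raise ValueError("PTC position lies outside the coding sequence")

    exon_idx = bisect_right(exon_ends, ptc_pos)  # exon containing the PTC
    last_idx = len(exon_lengths) - 1

    # (i) last exon rule
    if exon_idx == last_idx:
        return "last_exon"
    # (ii) 50nt rule: within the last 50 nt of the penultimate exon
    if exon_idx == last_idx - 1 and exon_ends[exon_idx] - ptc_pos <= 50:
        return "50nt"
    # (iii) start-proximal rule: less than 150 nt from the start codon
    if ptc_pos < 150:
        return "start_proximal"
    # (iv) long exon rule: exon longer than 407 nt
    if exon_lengths[exon_idx] > 407:
        return "long_exon"
    return "nmd_triggering"


# Hypothetical four-exon transcript used only to exercise each branch.
example_exons = [200, 500, 300, 100]
example_calls = {pos: nmdetective_b_rule(pos, example_exons)
                 for pos in (1050, 960, 100, 400, 750)}
```

Returning a rule label rather than a numeric score keeps the sketch faithful to what the text states, since the leaf values of the tree are shown only in Fig. 1c.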
NMD efficacy impacts the selection on germline variants
Genetic variants that cause severe phenotypes tend to be rare in human
populations because they are removed by purifying selection[21,22]. We can therefore evaluate whether NMD contributes to
the deleterious effects of PTCs by comparing the allele frequencies of PTCs
predicted to trigger and evade NMD. Indeed, 52% of rare PTCs (minor allele
frequency, MAF = 10^-6 to 10^-4 in ExAC) are predicted
to trigger NMD compared to only 25% of common PTCs (MAF>1%) and 30% of
intermediate frequency PTCs (MAF between 10^-4 and 10^-2;
all differences significant at p<2.2e−12 by Fisher’s exact
test; Fig. 2a). Additional tests for
selection that account for trinucleotide sequence composition (Supplementary Note, Extended Data Fig. 2a-b) are consistent with
NMD frequently modulating deleterious phenotypes resulting from PTCs.
Figure 2
Disease phenotypes arising from germline PTCs are modulated by NMD.
a, signatures of negative selection on NMD-detected variants in
population genomic data. b, genes where NMD is predicted to
aggravate the phenotype, c. genes where NMD is predicted to
alleviate the phenotype. b-c. genes significant at an FDR<5% are shown
(see Extended Data Fig. 3c-d for a more
permissive list at FDR<25%). Log2 odds ratios are for ClinVar
frequencies of frameshifting indel and nonsense variants in NMD-evading versus
NMD-detected regions of a gene, normalized to the length of the regions. FDRs
are by a Fisher’s exact test, two-tailed, Benjamini-Hochberg adjusted.
d, NMD rules significantly improve predictions of PTC
pathogenicity, e. variable importance in the PTC pathogenicity
predictor, * significant at p<0.001 by Chi-Square-test.
Extended Data Fig.2
The sequence context of nonsense variants is not different between
different types of NMD regions.
a, the trinucleotide spectrum of nonsense variants in
ExAC is consistent across gene regions that trigger or evade NMD,
b. spectrum of variants shows high Pearson correlations
between the different types of NMD regions. c, the baseline
NMD-evasion rule coverage for population genomic data, obtained from
nonsense variants simulated from the trinucleotide context of whole-genome
population variants at different VAF ranges, exhibits a consistent
distribution at different VAF ranges. Observed nonsense variants are
increasingly enriched towards NMD-evading regions with an increasing VAF,
compared to the simulated baseline at same VAFs. Odds ratios significant at
P<0.01 (Fisher’s exact test) are shown, comparing the
distribution of simulated versus observed nonsense
mutations.
We examined this effect for each of the NMD-evasion rules separately,
finding consistent differences between common variants and rare ones for the
last-exon and start-proximal rules (Fig.
2a), which are predicted to induce the strongest NMD evasion (Fig. 1c). The effect size for the
start-proximal rule (ORs of 1.45 and 1.73 when comparing the relative amount of
very rare to intermediate/common PTC variants in different types of NMD regions,
respectively) is similar to the one for the well-established last-exon rule (ORs
of 1.5 and 2.37), further validating this rule using population genomic data. A
cautionary note concerning these analyses is that the observed population
variation patterns, which suggest selection on NMD activity, might in part also
reflect other types of selection, such as that against longer 3’ gene end
truncations.
NMD variably aggravates and alleviates genetic diseases
For germline variants, therefore, the overall impact of NMD is to
aggravate the fitness cost of PTCs. This global trend may not, however, apply
equally to all genes, as evident in the contrasting examples of the Duchenne
muscular dystrophy (DMD gene) and the beta-thalassemia
(HBB gene)[23,24]. To
investigate the impact of NMD across different diseases, we evaluated whether
clinically-reported PTCs in human disease genes are predicted to trigger NMD. We
considered 7,514 nonsense and 12,756 indel variants with clinical significance
(having ClinVar assertions) in 752 genes causing genetic disorders for which
more than five PTC variants were available.

The direction of the effect of NMD is variable across genes and diseases
(Extended Data Fig. 1b): in total, 49
disease genes were more than 2-fold enriched for pathogenic PTCs predicted to
evade NMD. An excess of pathogenic PTCs in NMD-evading regions suggests that NMD
reduces the pathogenicity of PTCs that it can detect and thus that NMD
alleviates the phenotype (illustrated in Extended
Data Fig. 1c). However, when considering the converse case, 155
disease genes were more than 2-fold enriched for pathogenic PTCs that trigger
NMD. This predominance of genes where pathogenic PTCs tend to trigger NMD (155
versus 49, P=5.1e−14 by sign test) suggests that, globally, NMD
more commonly aggravates than ameliorates the effects of genetic
diseases.

When examining individual genes, we found 17 disease-associated genes
significantly enriched for PTCs that trigger NMD and 13 significantly enriched
for PTCs that do not trigger NMD (Fig 2b-c;
False Discovery Rate (FDR) <5%; this increases to 35 and 40 genes,
respectively, at FDR<25%, Extended Data
Fig. 3c-d). Similar results were observed when normalizing to local
density of missense variants (Extended Data Fig.
3a-b, Supplementary
Note). This indicates that NMD affects the severity of phenotypes in
these heritable diseases.
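The sign test on the 155 versus 49 genes reported above can be sketched as an exact two-sided binomial test under the null hypothesis that either direction of enrichment is equally likely. This illustrates the general technique only; whether the original analysis used exactly this formulation is an assumption, and the function name is hypothetical.

```python
from math import comb


def sign_test_two_sided(n_a, n_b):
    """Exact two-sided sign test for counts n_a versus n_b.

    Under the null, each gene falls in either group with probability 0.5,
    so the larger count follows a Binomial(n_a + n_b, 0.5) upper tail.
    """
    n = n_a + n_b
    k = max(n_a, n_b)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1))
    # Double the one-sided tail and cap at 1.
    return min(1.0, 2 * upper_tail / 2 ** n)


# Counts of genes enriched for NMD-triggering versus NMD-evading PTCs.
p_value = sign_test_two_sided(155, 49)
```

Integer arithmetic is used for the tail sum so that very small tail probabilities are not lost to floating-point underflow before the final division.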
Extended Data Fig.3
Disease genes with a significant enrichment of PTC variants that do or do
not trigger NMD, with and without normalization to local density of missense
mutations.
a-b, significant enrichment of genes at FDR<0.05
after normalization to the number of ClinVar missense variants observed in
the same NMD regions. c-d, genes significant at an
FDR<25% are shown (see Fig. 2b-c
for a list at FDR<5%). Log2 odds ratios are for ClinVar
frequencies of frameshifting indel and nonsense variants in NMD-evading versus
NMD-detected regions of a gene,
normalized to the length of the NMD-evading versus NMD-detected regions.
FDRs are by Fisher’s exact test, two-tailed, Benjamini-Hochberg
adjusted. a-d, log2 odds ratios are shown
separately for the four rules, for each rule which is significant in a
particular gene.
Examples of disease genes where the enrichments suggest that NMD
aggravates the disease (Extended Data Fig.
3c) include multiple tumor suppressor genes (76-fold enrichment,
FDR=2.6e−10), consistent with mutations in these genes acting via a
loss-of-function mechanism that would be intensified by NMD. Besides tumor
suppressors, other examples include the genes JAG1,
DYRK1A and ZIC2, which have all been
reported to cause disease through haploinsufficiency[25-27]. For these cases, drugs that inhibit NMD or that cause
stop-codon read-through may alleviate disease severity[28,29]. For
example, an inhibitor of NMD can restore levels of P53 protein and the
expression of its downstream targets, causing cell death in cells that bear
NMD-triggering PTCs in TP53[30].

There are, however, multiple diseases where we predict that NMD
alleviates the phenotype (discussed in Supplementary Note; Extended
Data Fig. 3d). In these cases inhibiting NMD would not have a
favorable impact.

These analyses show that NMD commonly modulates the severity of
phenotypes resulting from PTCs. We therefore hypothesized that considering NMD
rules can improve pathogenicity predictions of PTCs (Supplementary Note).
Indeed, including the NMD rules in a joint model with known predictors of
pathogenicity significantly improves the prediction after all other features are
controlled for (Fig. 2d-e). This suggests
that NMD rules should be included in statistical models that predict variant
pathogenicity.
NMD efficacy determines the outcome of gene editing
CRISPR-Cas9 and related gene editing technologies have become widely
adopted for gene inactivation and offer great promise for the treatment of
genetic diseases[31] and for
high-throughput genetic screens[32-36]. The
mutations introduced by Cas9 are most often frameshifting indels[37]. These frameshifts can result
in PTCs that will either result in mRNA degradation by NMD or the production of
truncated proteins, depending upon whether NMD is triggered or not (Extended Data Fig. 1c). Both scenarios can be
used as loss-of-function models[2], but degradation of mRNA is desirable because of partial
rescue and gain-of-function effects that may result from truncated proteins.

To systematically evaluate the importance of NMD for gene inactivation
by gene editing, we made use of a dataset where tiled sgRNAs were used to target
nine genes encoding human and mouse cell surface markers[36]. In all eight multi-exon genes
we found higher protein levels, suggestive of reduced NMD efficiency, if the
sgRNA sites were located in the 3’ ends of the coding sequence in regions
that evade NMD by the last-exon or the 50nt rule (Fig. 3a; difference 2.4-fold and 2.5-fold, respectively, normalized
to NMD-detected regions of a gene). Effect sizes vary
across these eight genes (Extended Data Fig.
4c; interquartile range, IQR, 1.5-fold to 5.0-fold for the last-exon
rule). The start-proximal and the long-exon NMD rules are also associated with
higher protein levels in this dataset (Supplementary Note). Furthermore, analysis of saturation
editing of the BRCA1 gene[35]
validates that the first 150 nt of the coding region are not well suited for
gene inactivation by CRISPR-Cas9 (Fig. 3c;
Extended Data Fig. 5b-e, Supplementary Note).
Figure 3
NMD rules determine the outcome of CRISPR-Cas9 gene editing.
a, a decrease in protein expression due to tiling sgRNAs placed
along the length of human and mouse genes (y axis quantifies the sgRNA fold
difference between a low-expressing versus high-expressing set of cells)
reveals, overall, similar associations with the non-canonical start-proximal NMD
rule to the canonical NMD last-exon rule. The CD13 gene demonstrates the effect
of the non-canonical long-exon rule. Shaded regions are 95% confidence interval
of the loess fit to protein expression. Pearson correlation
coefficients and two-sided tests for association were computed by comparing the
loess fit to the NMDetective-A NMD efficacy scores.
b, evading NMD attenuates the loss of fitness (y axis) due to
knockout of essential genes. Data for non-essential genes are in Extended Data Fig. 5a. P values are by
Mann-Whitney U test. The knock-out efficiency compares the reduction of sgRNAs
in NMD evading regions to the reduction in regions that trigger NMD.
c, a ‘saturation genome editing’ CRISPR
experiment shows strongly reduced mRNA levels for nonsense mutations in BRCA1,
except for those introduced into regions covered by the start-proximal (top) and
last-exon NMD evasion rules (bottom panel).
Extended Data Fig.4
Effect of NMD rules observed in CRISPR assays.
a, sgRNAs targeted to gene regions that evade NMD show
a smaller enrichment when selecting for cells that do not express the
targeted protein. Fold differences in sgRNA abundance are pooled per rule
and shown for all proteins in a and additionally broken down by
protein in c. P values are by Mann-Whitney U test, two-sided.
b, Models that discriminate essential from non-essential
genes based on the fold-depletion of sgRNAs are more accurate for sgRNAs
that target gene regions that trigger NMD than for sgRNAs targeted to
different NMD-evading regions.
Extended Data Fig.5
Relevance of NMD rules for CRISPR sgRNA design.
a, fitness loss upon targeting a non-essential gene
(left) versus an essential gene (right) using a sgRNA directed at gene
sections which are covered by various NMD-evasion rules.
b-e, distribution of loci targeted by sgRNAs
that are NMD-detected or NMD-evading (according to the individual NMD rules)
for genome-wide CRISPR libraries (b, c) or by
sgRNA design tools (d, e).
To further substantiate these findings, we analyzed the data from a
genome-wide CRISPR screen for gene essentiality[32]. Here we contrasted the sgRNAs targeting
different regions of each gene for their ability to produce a fitness defect in
essential genes[32]. sgRNAs
targeting regions covered by each of the four NMD evasion rules were highly
significantly associated with increased cell fitness (Fig. 3b).

For the last exon rule there was a 38% reduced sgRNA depletion in
essential genes, compared to NMD-trigger regions (p<2e−16 by
two-tailed Mann-Whitney U test). For the 50nt rule there was a 12% reduction
(p=8e−5) and for the long exon rule there was a 31% reduction
(p<2e−16). Finally, the same held true for the start-proximal
rule, which had a larger effect in the cases where there is an in-frame
downstream AUG (20% reduction, p=2e−12) than when the downstream AUG is
out-of-frame (9% reduction, p=2e−4), further supporting re-initiation of
translation as a common mechanism by which 5’ PTCs evade NMD[9].

Importantly, the effect size for the non-canonical long exon rule is
similar to the established last exon rule (31% vs 38% reduced sgRNA depletion),
and the effect size for the non-canonical start-proximal rule is larger than for
the canonical 50nt rule (20% vs 12%), underscoring the importance of the two
non-canonical rules for governing NMD efficacy. In a set of known non-essential
genes, there was no loss of fitness, irrespective of the NMD rules (Extended Data Fig. 5a). The above
observations were validated in two additional genome-wide CRISPR screen studies for
gene essentiality[33,34] (Extended Data Fig. 6). Accounting for NMD rules improves
predictions of sgRNA efficacy: NMD-triggering sgRNAs can distinguish essential
from non-essential genes at an AUC=0.90, while this is reduced for NMD-evading
sgRNAs (AUC=0.74-0.88 for different NMD-evasion rules; Extended Data Fig. 4b).
Extended Data Fig.6
CRISPR screening data support canonical and non-canonical determinants of
NMD efficacy.
a, the non-canonical long-exon NMD evasion rule has
similar effects as the canonical last-exon NMD evasion rule in terms of
attenuated loss of fitness when targeting an essential gene (Methods). b-e,
minor non-canonical NMD determinants, which are not included in the
NMDetective-B model, but are included in the
comprehensive NMDetective-A model. This includes: distance
to downstream splice site in long exons (b), for the
start-proximal rule, existence of a downstream in-frame AUG codon,
presumably facilitating translation re-initiation (c), distance
to the wild-type stop codon (d), and the
effect of mRNA turnover on the observed NMD efficacy (e).
Lastly, these CRISPR screen data also support additional non-canonical
determinants of NMD efficacy [9].
In particular, we observed reduced NMD efficacy when the distance to the
downstream exon junction or to the end of the coding region is long, or when
mRNAs that have shorter half-lives are targeted (Extended Data Fig. 6), consistent with competition between NMD and
other mRNA turnover processes.
Only frameshifts not detected by NMD predict the response to
immunotherapy
In recent years, immune checkpoint inhibitors have demonstrated
remarkable efficacy in a subset of cancer patients[38]. A robust predictor of the response to
immunotherapy is the overall tumor mutation burden, presumably because it
reflects the propensity to generate neo-antigenic peptides that can be detected
by the immune system[39,40]. The mutations most strongly
associated with immunotherapy response are small indels[41], consistent with indels
resulting in the production of frameshifted peptides with aberrant amino acid
sequences. However, frameshifts also usually introduce PTCs and so can trigger
NMD. We hypothesized therefore that NMD may modulate the efficacy of cancer
immunotherapy and specifically that only the burden of frameshifts that do not
trigger NMD will predict clinical response.

Indeed, in a pan-cancer cohort we find that the burden of frameshifts
that do not trigger NMD correlates with tumor immune reactivity (Extended Data Fig. 7-8; Supplementary Note). The enhanced immunoreactivity of tumors
carrying frameshifting mutations that escape NMD suggests that these tumors may
also respond better to immunotherapy. To test this, we collated five datasets of
tumor exomes paired with patient response data: melanoma (treated with PD-1 and
CTLA-4 inhibitors)[42,43], renal cancer
(anti-PD-1)[44], lung
cancer (anti-PD-1)[45], and an
additional set with diverse cancer types and treatments[46]. We stratified the patients
into responders and non-responders (Methods) and compared their frameshifting indel burden separately for
regions predicted by NMDetective to trigger or evade NMD.
Extended Data Fig.7
Tumor infiltration by immune cells is associated with a high burden of
NMD-evading frameshifting indels.
a-b, Individual immune markers for the TCGA samples
were estimated using gene expression data[50]. FDR is by two-sided Mann-Whitney U test,
Benjamini-Hochberg adjusted. In panel b, only tests significant
at FDR<25% are shown.
Extended Data Fig.8
Evidence that NMD activity is a determinant of immune reactivity of
tumors.
a, in the TCGA kidney cancer cohorts (KIRC, KICH and
KIRP), a cancer type where indel burden is known to be particularly strongly
associated with immunogenicity[41], higher relative burden of NMD-evading frameshifts was
associated with longer survival (p=0.011 for pooled data from both panels,
by log-rank test) without application of immunotherapy. Patients were
separated based on the number of frameshifting indels that do not trigger
NMD being higher than the number that trigger NMD (cyan) and those patients
where the converse is true (red). b, in the TCGA UCEC cohort of
uterine corpus endometrial carcinoma, where the key NMD gene UPF1 is
commonly mutated, the predicted higher impact of UPF1 mutations is
associated with multiple gene-expression based markers of lymphocyte, but
not macrophage, infiltration. Patients with more than one UPF1 mutation were
assigned to the group of the mutation with the higher VEP score. P values by
Mann-Whitney U test.
In four out of five studies, the responders had a significantly higher
number of frameshifts predicted not to trigger NMD than the non-responders (1.5
- 4.3 fold higher in responders compared to non-responders, p=0.007-0.017,
one-tailed Mann-Whitney U test; Fig 4a). In
the fifth study there was a trend in the same direction (p=0.075), giving a
pooled p-value of 1.5 × 10^-5 in a meta-analysis across the studies
(Fisher’s method for combining p-values) and a mean 2.4-fold difference
in burden of NMD-evading frameshifts.
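Fisher’s method for combining p-values, used for the meta-analysis above, can be sketched in a few lines of Python. The statistic X = -2 Σ ln(p_i) follows a chi-squared distribution with 2k degrees of freedom under the null, and for even degrees of freedom its upper tail has a closed form, so no external library is needed. The function name is an assumption, and the example p-values below are placeholders chosen for illustration; they are not the per-study values from this analysis.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) is chi-squared with 2k degrees of freedom, and
    P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i  # (X/2)^i / i!
        total += term
    return math.exp(-half_stat) * total


# Placeholder p-values for five hypothetical studies.
pooled = fisher_combined_p([0.01, 0.02, 0.03, 0.04, 0.05])
```

The method assumes the combined tests are independent, which is reasonable here only if the study cohorts do not overlap.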
Figure 4
Efficacy of immunotherapy is predicted by the burden of NMD-evading
frameshifting indels but not other indels.
a, across five studies, responders to immune checkpoint blockade are
enriched for a high burden of NMD-evading (top panels) but not for NMD-detected
(bottom panels) frameshifting indels. P values are by Mann-Whitney U test
(one-tailed, testing positive association of responders with higher burden).
b, enrichment for NMD-evading frameshifting indels in
responders versus non-responders is observed for all four NMD
rules. Error bars are 95% confidence intervals. c, coverage of
NMD-evading frameshifting indels by the individual NMD rules, observed in exomes
of immunotherapy responders.
In contrast, when examining frameshifts predicted by
NMDetective to trigger NMD, there was no association with
the response to immunotherapy in any study (p=0.129-0.666) or overall
(meta-analysis p=0.641; Fig 4a). The
average number of somatic frameshifts per patient that were predicted to trigger
NMD or not was similar (Fig 4a). The above
results broadly hold regardless of whether the frameshifts are classified by the
location of the frameshifting indel (Fig.
4a) or by the location of the proximal downstream PTC caused by the
frameshift (Extended Data Fig. 9a).
Extended Data Fig.9
NMD rules improve predictions of response to immunotherapy across
multiple cancer types.
a, assigning NMD rules to frameshift mutations based on
the location of the first downstream PTC in the new reading frame also shows
that the burden of frameshifts that cannot trigger NMD is higher in patients
that respond to immunotherapy. P values are by a one-tailed Mann-Whitney U
test. b, standardized regression coefficients and the 95%
confidence interval in a logistic regression model that predicts responders
versus non-responders. c, pseudo-R^2 for sequential
addition of features to a joint model. The null model includes only the
study (dataset) as a covariate. d, precision-recall curves for
logistic regression models with three different sets of features: a tumor
mutation burden (TMB) baseline, another baseline where TMB and all
frameshifting indels are considered together, and the full model that
considers TMB and NMD-evading and NMD-detected frameshifting indels
separately. P values are by Chi-squared test. AUPRC, area under the
precision-recall curve.
The NMD-evading frameshifts classified by all four rules were enriched
in responders (Fig. 4b) and were most
commonly covered by the last exon rule (49%), followed by the long exon rule
(33%), and the non-canonical start-proximal rule (12%; Fig. 4c), demonstrating the importance of the complete set
of NMD rules in NMDetective.
In a joint predictive model that classifies responders
versus non-responders, the NMD features have a substantial
contribution to predictive ability (Supplementary Note; Extended
Data Fig. 9b,c). The fraction of patients correctly classified increases
by 5.86% in a model that considers tumor mutation burden
(TMB) and NMD-evading frameshifts, compared to a TMB-only baseline (p=0.003;
Extended Data Fig. 9d). The area under
the precision-recall curve increases from 0.55 (TMB-only) to 0.63 (TMB +
NMD-evading frameshifts).
Taken together, these analyses provide strong evidence that, by
preventing the expression of neoantigens, NMD reduces the efficacy of cancer
immunotherapy.
Discussion
We have shown here that the activity of the NMD pathway is of broad
importance for selection on germline variants, disease phenotypes, gene editing
efficacy, and the immune reactivity of tumors. NMDetective is a
comprehensive resource containing predictions for whether PTCs occurring at any
location in the human and mouse genome will trigger NMD.
Overall, NMD tends to aggravate phenotypes for many PTCs, suggesting that
pharmacological NMD inhibition may be a broadly applicable strategy for curbing the
progression of many genetic diseases[17-20].
Inhibitors of NMD have been developed and are well tolerated[29,30,47,48]. However, knowing the causal mutation for each
patient and whether it triggers NMD will be crucial for targeting therapies to
patients who will benefit.
We also further validate the non-canonical NMD rules previously discovered
in our large-scale analysis of cancer genomes[9]. Both population allele frequencies and gene editing by
CRISPR-Cas9 validate that start-proximal PTCs and PTCs in long exons do not
efficiently trigger NMD. These non-canonical NMD rules cover a substantial
proportion of the human coding genome and are similarly important for predicting the
efficacy of NMD as the canonical rules (Fig.
1). For gene editing in particular, the tendency of sgRNA design algorithms
to target the first 150 nts of coding sequence needs to be revised, as frameshifts
in these regions often fail to trigger NMD and so can result in incomplete gene
inactivation.
Finally, we suggest that NMD suppresses both the immune reactivity of tumors
and their response to immunotherapy. Remarkably, only those frameshifting indels
that evade NMD were associated with a response to immune checkpoint inhibitors.
Together with the enhanced immune reactivity of tumors with mutations in the NMD
pathway, this suggests that inhibiting NMD may be an effective strategy to
potentiate the efficacy of checkpoint inhibitors[18,49]. By preventing
the destruction of PTC-containing transcripts, reduced NMD activity may enhance the
expression of neo-antigens and so the efficacy of immunotherapy.
Methods
Predicting NMD efficacy
The fold-change in mRNA abundance of PTC-bearing transcripts was used to
quantify NMD efficacies as described in Lindeboom, et al. [9]. In short, after stringent
filtering to exclude genome segments with copy-number alterations, we identified
2,840 high-confidence nonsense mutations in the dominant transcript[50] of 1,900 protein-coding genes
in 9,769 tumor samples with matched exome and transcriptome data in the TCGA
Data Portal (superseded by NCI Genomic Data Commons[51]). Next, we compared the mRNA abundance (as
TPM, after filtering by a principal components analysis as in Lindeboom, et al.
[9]) of each PTC-bearing
transcript to the median expression of the same transcript in similar tumor
samples (defined by tumor type and further subdivided by non-negative matrix
factorization clustering on global gene expression patterns), but where PTCs
were absent. Finally, the −log2 fold-difference in TPM was
used to quantify NMD efficacy acting upon each PTC. We used these NMD
efficacy scores derived from cancer transcriptomes as a training set to derive
the NMDetective predictive models. Such models were then used
to make genome-wide predictions for all possible PTC variants, based on genomic
features we found to be associated with NMD efficacy acting on observed PTC
mutations, and additionally requiring that these features validated in an
independent set of somatic frameshifting indels and also in a set of germline
PTC variants[9].
The NMDetective-A resource was generated using Random
Forest regression (randomForest package version 4.6-14 in R),
while the NMDetective-B resource was based on a decision tree
model using a conditional inference tree algorithm (party
package version 1.3-1 in R). Both models were trained on NMD efficacy scores,
while using the following predictive features: on_last_exon and
in_last_50nt_of_penultimate_exon as boolean features, with
rna_halflife quantified in minutes and with the following
features quantified in number of nucleotides:
distance_to_coding_start, exon_length,
distance_to_downstream_EJC and
distance_to_wildtype_stopcodon. The 3’UTR was not
included when calculating the exon length of the last exon, and the
distance_to_coding_start was capped at 1000 nt.
RNA half-life values were taken from Friedel, et al. [52] and missing values were imputed with the
median RNA half-life (which was 300 minutes). The distance to downstream EJC was
calculated from the presumed 5’ border of the EJC, located 50 nt from the
exon junction. An additional feature that was included in the training data set
was the variant allele fraction (VAF) of each somatic PTC variant; in order to
obtain the predictions in the NMDetective-A/B resources, the median VAF values
in the training data were supplied to the models.
The NMDetective-A efficacy scores were clustered by a univariate
Gaussian mixture model on all hg38 scores. Expectation maximization by the
flexmix package version 2.3-15 in R was performed for
1,000,000 iterations at a tolerance of 1e-15, testing 1 to 15 clusters to
determine the optimal number of clusters (determined with the Bayesian information
criterion). This resulted in 5 clusters, and the clusters with the two highest,
the two intermediate, and the lowest mean NMD efficacy scores were classified as
efficient, intermediate and inefficient NMD efficacy clusters, respectively.
Random Forest regression was performed with 100,000 trees and the number
of variables sampled at each split set to one. Only splits significant at a
permutation-based P value < 0.05 were selected for the final decision
tree model. The performance of both models reported in Fig. 1b was assessed using an independent test set with NMD
efficacies of 3,151 PTC-introducing frameshift mutations[9], thus obtaining the
R² values, which were adjusted to account for the maximum attainable
R² given the reproducibility of the measurements (correction for
attenuation; see Lindeboom, et al. [9]).
Unless stated differently, downstream analyses were performed by using
the NMDetective-B NMD efficacy predictions on the UCSC
knownGene databases downloaded in January 2018[53], after excluding genomic regions with multiple
NMD efficacy scores because of isoforms or overlapping genes.
The data deposited at figshare (Lindeboom, R. G. H. NMDetective. (2019).
doi:10.6084/m9.figshare.7803398) contain
NMDetective-A and NMDetective-B efficacy
scores for all possible single-nucleotide variants and out-of-frame indel
mutations that introduce PTCs in (UCSC knownGene) protein coding transcripts of
recent versions of the human and mouse genome (hg19, hg38, mm9 and mm10).
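The scoring and feature preparation described in this section can be illustrated with a minimal Python sketch. It computes an NMD efficacy score as the −log2 ratio between the TPM of a PTC-bearing transcript and the median TPM of the same transcript in matched PTC-free samples, and applies the 1000 nt cap and the median half-life imputation mentioned above. The original analysis was performed in R; the function names, the error raised for non-positive TPM values, and the argument layout are assumptions made for illustration and are not taken from the authors' code.

```python
import math
from statistics import median

# Values stated in the Methods text.
START_DISTANCE_CAP_NT = 1000    # cap on distance_to_coding_start
MEDIAN_HALFLIFE_MIN = 300.0     # median RNA half-life used for imputation


def nmd_efficacy(ptc_tpm, matched_ptc_free_tpms):
    """Return -log2(TPM of PTC-bearing transcript / median TPM in matched samples).

    Higher values mean stronger depletion of the PTC-bearing transcript.
    Raising on non-positive TPM is an assumption; the text does not say
    how such values were handled.
    """
    reference = median(matched_ptc_free_tpms)
    if ptc_tpm <= 0 or reference <= 0:
        raise ValueError("TPM values must be positive to form a log ratio")
    return -math.log2(ptc_tpm / reference)


def prepare_features(distance_to_coding_start, rna_halflife=None):
    """Cap the start distance and impute a missing half-life with the median."""
    return {
        "distance_to_coding_start": min(distance_to_coding_start, START_DISTANCE_CAP_NT),
        "rna_halflife": MEDIAN_HALFLIFE_MIN if rna_halflife is None else rna_halflife,
    }
```

For example, a transcript at 5 TPM compared with matched samples at 10, 20 and 30 TPM has a reference median of 20 TPM, a ratio of 0.25, and therefore an efficacy score of 2.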
Disease genetics
Release 1.0 of the Exome Aggregation Consortium[54] (ExAC) was used to analyze the
signatures of selection acting on nonsense mutations. Variants with clinical
significance were taken from the ClinVar database version 20170905[55] on genome build GRCh37. Common
nonsense mutations (allele frequency > 0.001) in ExAC were selected as
benign mutations and compared to known pathogenic nonsense mutations in ClinVar
to train and test a logistic model that separates benign and pathogenic variants
based on ALoFT[56] features and
on our NMD prediction. Only the subset of ALoFT features shown in Fig. 2e was used to predict pathogenic
mutations, removing features that were redundant or that could be confounded
with our NMD rules (for example, distance to the start and stop codons). Features
describing protein domains and post-transcriptional modifications were each encoded
as a binary variable indicating whether the variant intersected
with a domain or modification. The ALoFT sequence conservation feature was
quantified as dN/dS ratios compared to mouse.
To test for significant differences in density of clinically relevant
PTCs in NMD-evading regions of individual genes, we used all nonsense and
frameshift mutations in ClinVar and focused only on genes with at least five
PTC-introducing variants. The largest UCSC knownGene transcript per gene was
selected for analysis. For every gene, we compared the PTC density in individual
NMD regions, and in all NMD evading regions combined, to the PTC density in the
NMD-detected part of the gene. To control for potential effects of mutation
hotspots and the number of ClinVar entries per gene, we also compared ClinVar PTCs
to rare missense mutation density in ExAC (allele frequency < 0.001) in
Extended Data Fig. 3a-b.
To control for a possible bias in sequence properties of germline
variants in different types of NMD regions, we used the trinucleotide context of
all germline variants reported in the whole-genome gnomAD release
2.1.1[57] to determine
sequence and VAF dependent substitution probabilities. We used these
probabilities to simulate mutations at different VAF ranges and filtered for
PTC-introducing variants to test for differences in their distribution over
different types of NMD regions.
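The per-gene density comparison described above can be sketched in Python as follows. The first function computes a length-normalized log2 ratio of PTC density in NMD-evading versus NMD-detected regions, and the second applies the Benjamini-Hochberg adjustment that the Extended Data legends mention for per-gene tests. This is an illustrative sketch only: the function names and inputs are hypothetical, the per-gene P values are assumed to come from a separate test (such as Fisher's exact test) that is not implemented here, and the counts are assumed to be non-zero.

```python
import math


def log2_density_ratio(n_evading, len_evading, n_detected, len_detected):
    """log2 of PTC density (variants per nt) in NMD-evading vs NMD-detected regions.

    Assumes non-zero counts and lengths; zero counts would need a
    pseudocount, which the text does not specify.
    """
    density_evading = n_evading / len_evading
    density_detected = n_detected / len_detected
    return math.log2(density_evading / density_detected)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A gene with 8 PTC variants in 200 nt of NMD-evading sequence and 4 in 400 nt of NMD-detected sequence has a fourfold higher density in the evading region, giving a log2 ratio of 2.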
CRISPR-Cas9 gene inactivation efficiency
We used genome-wide CRISPR screen data from Wang, et al. [32] to test if NMD evasion
affects the knock-out efficiency of essential genes presented in Fig. 3b (18,166 genes targeted, ~10
sgRNAs per gene). Similar data from Meyers, et al. [33] and Wang, et al. [34] were used as validation datasets and
are presented in Extended Data Fig. 6a-e. In
all three studies, cancer cell lines were grown for several doublings after
being transduced with a genome-wide sgRNA library, and the depletion of sgRNAs
in the population was used to identify essential genes. To calculate the fold
change in sgRNA abundances shown in Fig.
3b, we normalized the read counts by total sequencing depth and added
0.001 to every read count value. Essential gene definitions used in Fig 3b were taken from the corresponding
study, for every cell line separately at FDR<5%. Genes with a short
half-life (<5 hours) were not shown in Fig.
3b, but were included in Extended Data
Fig. 5a. In Extended Data Fig.
6, we averaged the fold change for each sgRNA within each study, and
show only sgRNAs that target a gene that was a significant essential gene in at
least one cell line in Wang, et al. [32] at FDR<5%.
To quantify how well the depletion of sgRNAs in different types of NMD
regions can predict essential genes, we used the log transformed fold change of
individual sgRNAs from Wang, et al. [32] to predict curated essential and non-essential genes
from Hart and Moffat [58].
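The sgRNA fold-change calculation described earlier in this section (depth normalization followed by a pseudocount of 0.001) can be sketched in Python as follows. The scaling to counts per million, the order in which the pseudocount is applied relative to scaling, and the function names are assumptions for illustration; the text states only that read counts were normalized by total sequencing depth and that 0.001 was added to every value.

```python
import math

PSEUDOCOUNT = 0.001  # value stated in the Methods text


def normalized_counts(counts, scale=1e6):
    """Normalize sgRNA read counts by total depth, then add the pseudocount.

    `scale` (counts per million) is an assumption, not stated in the text.
    """
    total = sum(counts.values())
    return {sgrna: n / total * scale + PSEUDOCOUNT for sgrna, n in counts.items()}


def log2_fold_changes(initial_counts, final_counts):
    """log2(final / initial) normalized abundance for each sgRNA."""
    before = normalized_counts(initial_counts)
    after = normalized_counts(final_counts)
    return {sgrna: math.log2(after[sgrna] / before[sgrna]) for sgrna in before}
```

The pseudocount keeps the ratio finite for sgRNAs that drop out completely, which is the case of interest for sgRNAs targeting essential genes.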
Saturation genome editing data by CRISPR of the BRCA1 gene was taken from
Findlay, et al. [35]. Here,
BRCA1 mRNA and genome sequencing in saturation genome-edited HAP1 cells was
performed to quantify the mRNA expression of PTC-introducing SNVs, normalized to
their abundance in the genomic DNA (yielding the RNA score). We used these RNA
scores to investigate whether NMD evasion affected the expression of PTCs
introduced by CRISPR.
Data from a CRISPR screen tiling genes encoding three human and six
mouse cell surface proteins, in which protein knock-out efficiencies were
determined by flow cytometry was obtained from Doench, et al. [36]. We used blastn[59] version 2.2.28+ to align sgRNA
target sequences to the hg19 or mm9 genome to obtain the genomic target site,
and only selected sgRNAs with a perfect match to the targeted genes. Read counts
for sgRNAs were normalized for total sequencing depth, and for every cell
surface protein we compared the mean normalized read counts from the negative
population to the unsorted population. We only considered genes that were
targeted by more than 50 sgRNAs and where the mean fold change in sgRNA
abundance differed from the unsorted population by >25%. For every
targeted gene, we show the isoform that was targeted by most sgRNA matches.
To identify appropriate CRISPR design tools to investigate the
distribution of affected NMD rules by sgRNAs designed for CRISPR knock out
experiments, we focused on highly used tools that allow for batch submission of
target genes. We therefore used the CRISPR design tools E-CRISP[60] and ‘CRISPRko’
offered by the Genetic Perturbation Platform of the Broad Institute, to design
sgRNAs for knock-out experiments of the top 100 most cited genes (from http://doi.org/10.5281/zenodo.1066066).
Tumor immune reactivity
Somatic mutation data detected by the MuTect2 algorithm from tumor
samples in the TCGA program were downloaded on 3 October 2017 from the GDC Data
Portal[51]. Predicted
immunogenomic features were taken from Thorsson, et al. [61]. RNA expression of immune
checkpoint related genes shown in Extended Data
Fig. 7 was downloaded using the R package RTCGA version 1.12.0.
We gathered whole exome sequencing (WES) data of pretreatment tumor
samples from patients receiving immune checkpoint blockade drugs from five
different studies[42-46] (273 patients in total). Most
patients were stratified by response evaluation criteria in solid
tumors[62] version 1.1
(RECIST 1.1), with the exception of patients in Hugo, et al. [43], who were classified by
irRECIST[63]. We
classified patients in response or no-response groups based on information
supplied by the authors of the respective studies. Responders were defined as
follows: in Miao, et al. [44], patients with ‘clinical
benefit’ or ‘intermediate benefit’; in Van
Allen, et al. [42], Miao, et al.
[46] and Hugo, et al.
[43], patients with
‘response’; and in Forde, et al. [45], patients with ‘major pathological
response’. The cohorts received the following immune
checkpoint blockade drugs: the Van Allen, et al. [42] cohort comprised metastatic melanoma samples treated with
ipilimumab (anti-CTLA-4); the Hugo, et al. [43] cohort comprised melanoma samples treated with pembrolizumab and
nivolumab (anti-PD-1); the Miao, et al. [44] cohort comprised metastatic clear cell renal cell carcinoma
samples treated with nivolumab; the Forde, et al. [45] cohort comprised advanced non-small-cell lung carcinoma
samples treated with nivolumab; and the Miao, et al. [46] cohort included tumor samples originating from
bladder, lung, skin and head/neck, which were treated with anti-PD-1,
anti-PD-L1 and anti-CTLA-4 drugs. Because Miao, et al. [46] presented a collection of previously published
samples together with some new WES samples, care was taken in selecting the WES
samples unique to this study. To count the number of frameshifts that cannot
trigger NMD, we used the indel variants supplied by the authors and aligned them
to NMD efficacy scores on the canonical transcript database of UCSC, and counted
the frameshifts in any of the NMD rules as frameshifts that do not trigger NMD.
The tumor mutational burden (TMB) was defined as the total number of SNVs in a
tumor sample.
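The counting step above, in which a frameshift covered by any NMD-evasion rule is treated as not triggering NMD, can be sketched in Python as follows. This is a simplified rule-based illustration and not the NMDetective models themselves. The 150 nt start-proximal window follows the Discussion, and the boolean fields mirror the feature names listed in the Methods. The long-exon cutoff `LONG_EXON_NT`, the order in which rules are checked, and all class and function names are assumptions that are not stated in this section.

```python
from dataclasses import dataclass
from typing import List, Optional

START_PROXIMAL_NT = 150  # window discussed in the text
LONG_EXON_NT = 407       # ASSUMPTION: illustrative cutoff, not given in this section


@dataclass
class Ptc:
    distance_to_coding_start: int
    exon_length: int
    on_last_exon: bool
    in_last_50nt_of_penultimate_exon: bool


def evasion_rule(ptc: Ptc) -> Optional[str]:
    """Return the first matching NMD-evasion rule, or None if NMD is expected.

    The rule order is an assumption made for this sketch.
    """
    if ptc.on_last_exon:
        return "last_exon"
    if ptc.in_last_50nt_of_penultimate_exon:
        return "last_50nt_of_penultimate_exon"
    if ptc.distance_to_coding_start < START_PROXIMAL_NT:
        return "start_proximal"
    if ptc.exon_length > LONG_EXON_NT:
        return "long_exon"
    return None


def frameshift_burden(ptcs: List[Ptc]) -> dict:
    """Count frameshifts covered by any evasion rule versus the remainder."""
    evading = sum(1 for p in ptcs if evasion_rule(p) is not None)
    return {"nmd_evading": evading, "nmd_triggering": len(ptcs) - evading}
```

Per-patient counts produced this way correspond to the NMD-evading and NMD-triggering frameshift burdens that are compared between responders and non-responders.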
Statistics
All statistical analyses were performed in R version 3.4.4. The
statistical tests applied are described in the corresponding section. Unless
stated differently, a two-tailed Mann-Whitney U test was used. A Life Sciences
Reporting Summary accompanies this article.
The distribution of genome-wide NMD efficacy scores and of NMD rules in
all genes with more than 20 disease-associated PTC variants.
a, the distribution of NMDetective-A scores over all
genes in hg38 reveals three global clusters of inefficient,
intermediate-efficiency and efficient NMD. b, genes in which
there is an excess of PTCs in NMD-evading regions (left barplot) and genes
where there is a dearth of PTCs in NMD-evading regions (right barplot). The
proportion of PTCs in different NMD-evading regions is shown as colored
segments in the bar chart. The relative portion of the protein-coding mRNA
sequence that is covered by the NMD rules is shown as a black vertical
stripe. c, a schematic of a gene that illustrates how PTCs that
trigger or evade NMD can lead to different outcomes in protein
expression.
The sequence context of nonsense variants is not different between
different types of NMD regions.
a, the trinucleotide spectrum of nonsense variants in
ExAC is consistent across gene regions that trigger or evade NMD.
b, the spectrum of variants shows high Pearson correlations
between the different types of NMD regions. c, the baseline
NMD-evasion rule coverage for population genomic data, obtained from
nonsense variants simulated from the trinucleotide context of whole-genome
population variants at different VAF ranges, exhibits a consistent
distribution at different VAF ranges. Observed nonsense variants are
increasingly enriched towards NMD-evading regions with an increasing VAF,
compared to the simulated baseline at same VAFs. Odds ratios significant at
P<0.01 (Fisher’s exact test) are shown, comparing the
distribution of simulated versus observed nonsense
mutations.
Disease genes with a significant enrichment of PTC variants that do or do
not trigger NMD, with and without normalization to local density of missense
mutations.
a-b, significant enrichment of genes at FDR<0.05
after normalization to the number of ClinVar missense variants observed in
the same NMD regions. c-d, genes significant at an
FDR<25% are shown (see Fig. 2d-e
for a list at FDR<5%). Log2 odds ratios compare the ClinVar
frequencies of frameshifting indel and nonsense variants in NMD-evading versus
NMD-detected regions of a gene,
normalized to the length of the NMD-evading versus NMD-detected regions.
FDRs are by Fisher’s exact test, two-tailed, Benjamini-Hochberg
adjusted. a-d, log2 odds ratios are shown
separately for the four rules, for each rule which is significant in a
particular gene.
Effect of NMD rules observed in CRISPR assays.
a, sgRNAs targeted to gene regions that evade NMD show
a smaller enrichment when selecting for cells that do not express the
targeted protein. Fold differences in sgRNA abundance are pooled per rule
and shown for all proteins in a and additionally broken down by
protein in c. P values are by Mann-Whitney U test, two-sided.
b, Models that discriminate essential from non-essential
genes based on the fold-depletion of sgRNAs are more accurate for sgRNAs
that target gene regions that trigger NMD than for sgRNAs targeted to
different NMD-evading regions.
Relevance of NMD rules for CRISPR sgRNA design.
a, fitness loss upon targeting a non-essential gene
(left) versus an essential gene (right) using a sgRNA directed at gene
sections which are covered by various NMD-evasion rules.
b-e, distribution of loci targeted by sgRNAs
that are NMD-detected or NMD-evading (according to the individual NMD rules)
for genome-wide CRISPR libraries (b, c) or by
sgRNA design tools (d, e).
CRISPR screening data support canonical and non-canonical determinants of
NMD efficacy.
a, the non-canonical long-exon NMD evasion rule has
similar effects as the canonical last-exon NMD evasion rule in terms of
attenuated loss of fitness when targeting an essential gene (Methods). b-e,
minor non-canonical NMD determinants, which are not included in the
NMDetective-B model, but are included in the
comprehensive NMDetective-A model. These include: distance
to downstream splice site in long exons (b), for the
start-proximal rule, existence of a downstream in-frame AUG codon,
presumably facilitating translation re-initiation (c), distance
to the wild-type stop codon (d), and the
effect of mRNA turnover on the observed NMD efficacy (e).
Tumor infiltration by immune cells is associated with a high burden of
NMD-evading frameshifting indels.
a-b, Individual immune markers for the TCGA samples
were estimated using gene expression data[50]. FDR is by two-sided Mann-Whitney U test,
Benjamini-Hochberg adjusted. In panel b, only tests significant
at FDR<25% are shown.
Evidence that NMD activity is a determinant of immune reactivity of
tumors.
a, in the TCGA kidney cancer cohorts (KIRC, KICH and
KIRP), a cancer type where indel burden is known to be particularly strongly
associated with immunogenicity[41], higher relative burden of NMD-evading frameshifts was
associated with longer survival (p=0.011 for pooled data from both panels,
by log-rank test) without application of immunotherapy. Patients were
separated based on the number of frameshifting indels that do not trigger
NMD being higher than the number that trigger NMD (cyan) and those patients
where the converse is true (red). b, in the TCGA UCEC cohort of
uterine corpus endometrial carcinoma, where the key NMD gene UPF1 is
commonly mutated, the predicted higher impact of UPF1 mutations is
associated with multiple gene-expression based markers of lymphocyte, but
not macrophage, infiltration. Patients with more than one UPF1 mutation were
assigned to the group of the mutation with the higher VEP score. P values by
Mann-Whitney U test.