Alternative poly-adenylation modulates α1-antitrypsin expression in chronic obstructive pulmonary disease.

Lela Lackey1, Aaztli Coria2,3, Auyon J Ghosh4,5, Phil Grayeski6, Abigail Hatfield1, Vijay Shankar1, John Platig4, Zhonghui Xu4, Silvia B V Ramos3, Edwin K Silverman4,5, Victor E Ortega7, Michael H Cho4,5, Craig P Hersh4,5, Brian D Hobbs4,5, Peter Castaldi4,8, Alain Laederach2.   

Abstract

α1-anti-trypsin (A1AT), encoded by SERPINA1, is a neutrophil elastase inhibitor that controls the inflammatory response in the lung. Severe A1AT deficiency increases risk for Chronic Obstructive Pulmonary Disease (COPD); however, the role of A1AT in COPD in non-deficient individuals is not well known. We identify a 2.1-fold increase (p = 2.5 × 10⁻⁶) in the use of a distal poly-adenylation site in primary lung tissue RNA-seq in 82 COPD cases when compared to 64 controls and replicate this in an independent study of 376 COPD cases and 267 controls. This alternative polyadenylation event involves two sites, a proximal and a distal site, 61 and 1683 nucleotides downstream of the A1AT stop codon. To characterize this event, we measured the distal ratio in human primary tissue short read RNA-seq data and corroborated our results with long read RNA-seq data. Integrating these results with 3' end RNA-seq and nanoluciferase reporter assay experiments, we show that use of the distal site yields mRNA transcripts with over 50-fold decreased translation efficiency and A1AT expression. Using enhanced CrossLinking and ImmunoPrecipitation (eCLIP) data, we identified seven RNA binding proteins with one or more binding sites in the SERPINA1 3' UTR. We combined these data with measurements of the distal ratio in shRNA knockdown experiments, nuclear and cytoplasmic fractionation, and chemical RNA structure probing. We identify Quaking Homolog (QKI) as a modulator of SERPINA1 mRNA translation and confirm the role of QKI in SERPINA1 translation with luciferase reporter assays. Analysis of single-cell RNA-seq showed differences in the distribution of the SERPINA1 distal ratio among hepatocytes, macrophages, αβ T cells and plasma cells in the liver. Alveolar type 1 and type 2 cells, dendritic cells and macrophages also vary in their distal ratio in the lung. Our work reveals a complex post-transcriptional mechanism that regulates alternative polyadenylation and A1AT expression in COPD.

Year:  2021        PMID: 34784346      PMCID: PMC8631626          DOI: 10.1371/journal.pgen.1009912

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Introduction

The α1-anti-trypsin (A1AT) protein functions primarily as a neutrophil elastase inhibitor and controls the inflammatory response in the lung [1,2]. It is encoded by the SERPINA1 messenger RNA (mRNA), which is translated into a single protein isoform [3]. Despite coding for a single protein isoform, the SERPINA1 mRNA is spliced into 11 transcript variants, with all variants differing only in their 5’ UnTranslated Region (UTR). This puts the gene in the top 5% of the most transcriptionally complex genes in the human genome [4]. Interestingly, these 5’ UTR transcript variants selectively include and exclude upstream Open Reading Frames (uORFs), which regulate the translation efficiency of the mRNA, ultimately affecting A1AT expression [4,5]. Recent deep resequencing of the SERPINA1 locus in the SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) identified additional variation mapping to the 5’ UTR of the gene associated with lowered A1AT serum levels and functional small airway disease [6]. These results support the view that post-transcriptional regulation of expression is an important and understudied component of A1AT function and lung physiology. A1AT deficiency is a hereditary disorder that can lead to panlobular emphysema in the lung and cirrhosis in the liver [7-9]. A1AT is secreted and predominantly produced by the liver [7,8]. Accordingly, SERPINA1 mRNA is most highly expressed in the liver; however, there is also significant expression in blood, lung, small intestine, spleen and kidney in humans (Fig 1A) [10,11]. Although the role of the A1AT protein in disease etiology is well established, the SERPINA1 mRNA, with its exceptional number of alternatively spliced transcript isoforms, is still poorly understood [3]. To date, the role of the 3′ UTR of the SERPINA1 mRNA in A1AT expression has yet to be investigated. 
3′ UTRs of mRNAs are known to control both stability and translation efficiency of the message through multiple mechanisms including RNA interference, RNA Binding Protein (RBP) binding, and alternative polyadenylation [12-16].
Fig 1

Tissue expression and alternative polyadenylation of the SERPINA1 mRNA.

A) Tissue expression in Transcripts per Million (TPM) on a log scale for SERPINA1 mRNA for 17382 samples sequenced by the GTEx RNA-seq consortia [10,11,62]. The six tissues with median SERPINA1 expression above average (~1000 TPM, indicated with dashed line) were further analyzed. B) Variation (as measured by the tissue specific standard deviation) in tissues. We observe particularly high levels of variation among individuals within blood samples. C) Read coverage of GTEx RNA-seq data for the six tissues highly expressing SERPINA1 mRNA are shown in pink. Read coverage for HepG2 and HepG2 3′ RNA-seq experiments shown in turquoise. All these RNA-seq data sets confirm the exon structure of SERPINA1 as shown on top in black and grey. We observe tissue specific differential exon usage in the 5’ UTR (exons 1–3) consistent with previous work characterizing differential isoform usage in this region of SERPINA1 [4]. D) Zoom of the 3′ UTR region for primary liver data mean depth (top) and HepG2 3′end seq experiment identifying the proximal and distal APA sites. E) PacBio long read sequencing of primary human liver tissue data confirming isoforms using proximal (blue) and distal (yellow) APA sites.

One hallmark of both A1AT protein and SERPINA1 mRNA expression is significant variability between individuals [6]. However, this variability is not clearly associated with lung function, particularly when accounting for the inflammatory status as measured by C-reactive protein [17]. Although some of this variability can be attributed to genetic factors, such as the Z allele which causes accumulation of A1AT in the liver, the majority cannot be accounted for by cis genetic variation alone [7,8]. We investigate here alternative 3′ polyadenylation in the SERPINA1 mRNA and its role in Chronic Obstructive Pulmonary Disease (COPD), a leading cause of death in the world [18-20]. 
We characterize a distal polyadenylation site in the SERPINA1 mRNA that is differentially used in the lungs of COPD individuals. By combining quantitative 3′ end sequencing, large scale transcriptomic profiling of individuals, quantitative nanoluciferase reporter assays, RNA chemical structure probing, and single-cell RNA sequencing, we characterize the role of SERPINA1 Alternative Poly-Adenylation (APA) in controlling A1AT translation. In addition, we identify key RNA binding proteins affecting the mechanism of A1AT post-transcriptional regulation. This novel mechanism is significant because it reveals a complex post-transcriptional control mechanism affecting A1AT expression that is an important component of A1AT deficiency disease etiology and COPD.

Results

Tissue specific expression of SERPINA1 mRNA

We begin our investigation of post-transcriptional regulation by visualizing median expression of the SERPINA1 mRNA across individuals sequenced in the Genotype-Tissue Expression (GTEx) Atlas (Fig 1A). We identified six tissues (Liver, Blood, Lung, Small Intestine, Spleen and Kidney) where the median expression was above 1000 TPM and therefore sufficient to obtain individual estimates of exon specific expression. In Fig 1B, we illustrate the standard deviation of expression across individuals in the GTEx data, mapped onto the human body [6,21]. Analysis of mean short-read coverage across individuals for these six tissues (Fig 1C) reveals well defined exons in the coding region of the gene (exons 4–7); the hg38 reference annotation of exons is shown in gray (untranslated) and black (protein coding) in Fig 1C. We observe differential coverage in exons 1–3 consistent with the tissue specific expression of SERPINA1 mRNA transcript isoforms previously described [3,4,22,23]. The premise for our study is illustrated in Fig 1D where we have expanded exon 7, which includes the 3′ UTR of the mRNA. We observe a significant drop-off in coverage approximately 60 nucleotides downstream of the stop codon, which suggests the presence of distal and proximal alternative polyadenylation (APA) sites. The steep drop is also observed in coverage in RNA-seq data from HepG2 cell lines (Fig 1C, turquoise tracks). We therefore performed 3′-seq on these cells to quantitatively map the alternative polyadenylation sites with single-nucleotide resolution (Fig 1D). We observe two peaks in this signal (labeled proximal and distal) with sharp 3′ cliffs, confirming that SERPINA1 mRNA is alternatively polyadenylated. These peaks are consistent with the drop-off in coverage in RNA-seq. 
Upon analysis of the sequence directly upstream of the distal drop off site, we observe a canonical AAUAAA motif consistent with a poly-A signal, while a weaker signal (AUUAAA) is present at the proximal site (S1A Fig). Furthermore our 3′-seq data allow us to map the cleavage site and identify the G/U-rich region downstream (shown in S1A Fig). These sequence motifs are hallmarks of APA in 3′ UTRs [11,24]. Finally, IGV analysis of PacBio long read RNA sequencing from primary human liver tissue confirms the existence of both long and short isoforms in the 3′ UTR as well as expression of multiple 5’ UTR splice isoforms (Fig 1E) [25]. Our results so far establish that the 3′ UTR of SERPINA1 is alternatively polyadenylated, and that in all tissues (and HepG2 cell lines) the shorter isoform is preferentially expressed. Importantly, however, in all tissues and cell lines we observe coverage over the entire long isoform, suggesting that both isoforms are constitutively present, albeit at different levels. From the RNA-seq coverage data we compute the distal ratio for the six tissues by dividing the mean distal coverage depth by the proximal depth normalized by length. When we compute the distal ratio in human primary tissues, we observe the highest distal site usage in the liver (Fig 2A). Furthermore, we also observe significant variability in the distal ratio among individuals, especially in the lung and liver.
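
The distal ratio calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `distal_ratio`, the per-base depth list and the index boundaries are all hypothetical, and the sketch assumes that "normalized by length" means taking the mean per-base depth within each segment.

```python
from statistics import mean

def distal_ratio(depth, stop_idx, proximal_idx, distal_idx):
    """Ratio of mean per-base depth in the distal-only 3' UTR segment
    to mean per-base depth in the segment shared by both isoforms.

    depth        -- per-base read depth along the transcript
    stop_idx     -- index of the first base after the stop codon
    proximal_idx -- index of the proximal cleavage site (end of shared segment)
    distal_idx   -- index of the distal cleavage site (end of distal segment)
    All argument names are hypothetical and chosen for this sketch only.
    """
    proximal = depth[stop_idx:proximal_idx]
    distal = depth[proximal_idx:distal_idx]
    if not proximal or not distal:
        raise ValueError("both segments must contain at least one base")
    proximal_mean = mean(proximal)
    if proximal_mean == 0:
        raise ValueError("no coverage in the proximal segment")
    # Taking means makes the two segments comparable despite different lengths.
    return mean(distal) / proximal_mean

# Toy coverage (made up): 5 shared bases at depth 100, then 10 distal-only bases at depth 20.
toy_depth = [100] * 5 + [20] * 10
ratio = distal_ratio(toy_depth, stop_idx=0, proximal_idx=5, distal_idx=15)
print(ratio)
```

Because each segment is summarized by its mean depth, a long distal segment does not inflate the ratio simply by containing more bases.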
Fig 2

Distal ratio across tissues and in COPD.

The distal ratio is measured as the relative depth of SERPINA1 3′ UTR distal vs. proximal reads in RNA-seq data. A) Distal ratio measured in the six tissues expressing SERPINA1 mRNA above 1000 TPM in GTEx RNA-seq consortium data [10,11]. We observe the highest distal ratio in liver, but important variation in the lung. B) The distal ratio measured in a publicly available [26] short-read lung tissue RNA-seq data set from n = 82 (red) COPD subjects and n = 64 (blue) controls. C) Distal ratio measured from lung tissue RNA-seq data in the Lung Tissue Research Consortium for n = 376 COPD (red) subjects and n = 267 (blue) normal participants. In both studies we observe a significant increase of the distal ratio in COPD subjects. D) The characteristic drop at the proximal APA site indicative of alternative polyadenylation for mean COPD (red) and Normal (blue) lung tissue for n = 376 COPD subjects and n = 267 normal participants. E) Distal ratio analysis of LTRC subjects broken down by SERPINA1 M (normal), S (mild disease) and Z (severe disease) alleles showing that the S and Z alleles exacerbate use of the distal alternative poly-adenylation site in the lungs of individuals with COPD.


APA in the lung and COPD

To investigate the role of APA in the SERPINA1 3′ UTR and its relationship with COPD we computed the distal ratio in lung tissue from two independent population studies. In the first we analyzed a publicly available [26] short-read lung tissue RNA-seq data set from n = 82 COPD subjects and n = 64 controls and observed a 2.1-fold (p = 2.5 × 10⁻⁶) increase of the median distal ratio in individuals with COPD (Fig 2B). We therefore sought to replicate this finding in the larger Lung Tissue Research Consortium (LTRC) study where we obtained RNA-seq from 376 COPD cases and 267 controls (Fig 2C). There we observed a similar 1.5-fold (p < 7.6 × 10⁻¹⁰) increase in the distal ratio, suggesting higher use of the distal polyadenylation site in individuals with COPD. This is also visualized as the mean normalized read depth for the LTRC data in Fig 2D. We observe in the LTRC data the characteristic drop-off at the proximal APA site consistent with our previous analysis of primary human tissues (Fig 1D). The most common genetic variation leading to A1AT deficiency is the SERPINA1 Z allele [7-9]. Individuals homozygous for the S allele (SS) and heterozygous for the Z allele (MZ) may also have reduced A1AT [23,27]. Although there are many other SERPINA1 variant alleles, most are less common or do not significantly associate with A1AT levels [6]. In addition to this genetic effect, the likelihood of an individual to have lung-related A1AT deficiency-associated diseases such as COPD is strongly influenced by environmental exposures, mainly cigarette smoke [28-30]. To determine if protease inhibitor (PI) type M, S or Z alleles influenced the SERPINA1 distal ratio, we separately analyzed the distal ratio in individuals with MM, MS, MZ, and ZZ genotypes based on whole genome sequencing in our LTRC cohort data. This dataset includes genotyped MM individuals with no disease (n = 210) and with COPD (n = 284). 
Rare genotypes have proportionally lower representation in our data (MS: n = 14 and n = 20; MZ: n = 8 and n = 16; ZZ: n = 1 and n = 7, without and with COPD respectively). As can be seen in Fig 2E, we observe a higher distal ratio amongst all genotypes with COPD (red box plots). The distal ratio is significantly higher for individuals with COPD and an MS (p = 0.005) or MZ (p = 0.013) genotype compared to individuals with COPD and an MM genotype (Fig 2E). In the LTRC data set there were only n = 7 ZZ homozygous individuals, limiting the statistical power of this analysis (p = 0.15), but it is clear from Fig 2E that most ZZ individuals with COPD also have a higher distal ratio. This indicates that COPD likely exacerbates use of the longer 3′ UTR. We further analyzed the distal ratio as a function of the Global Initiative for Chronic Obstructive Lung Disease (GOLD) spirometric stage [31], which shows a statistically significant increase in the distal ratio correlated with GOLD stage (S2 Fig). Together these data indicate that the distal ratio in lungs increases in individuals with COPD and increases with higher spirometric disease severity, raising the question of what the post-transcriptional functional effects of this lengthening are on A1AT expression.
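
A case versus control comparison of the kind reported above (a fold change in median distal ratio with an associated p-value) might be sketched as below. The statistical test used in the study is not stated in this section, so the permutation test here is a generic stand-in and an assumption of this sketch; the function names and all numbers are hypothetical and are not study data.

```python
import random
from statistics import median

def median_fold_change(cases, controls):
    """Median of the case group divided by the median of the control group."""
    return median(cases) / median(controls)

def permutation_p_value(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in medians.

    A generic resampling test used for illustration only; it is not
    necessarily the test applied in the study.
    """
    rng = random.Random(seed)
    observed = abs(median(cases) - median(controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(median(pooled[:n_cases]) - median(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)

# Synthetic distal ratios (made-up values).
copd = [0.30, 0.42, 0.38, 0.51, 0.45, 0.36, 0.40, 0.48]
control = [0.18, 0.22, 0.20, 0.25, 0.19, 0.21, 0.24, 0.17]
fold = median_fold_change(copd, control)
p = permutation_p_value(copd, control)
print(f"fold change = {fold:.2f}, permutation p = {p:.4f}")
```

Medians are used because the text reports an increase in the median distal ratio; a rank-based test would be another reasonable choice for data with this much inter-individual variability.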

Role of long and short 3′ UTRs in SERPINA1 translation and stability

To understand the functional consequences of APA site usage in the SERPINA1 mRNA 3′ UTR we performed a series of assays using nanoluciferase reporter constructs in both HepG2 (liver hepatocyte) and A549 (adenocarcinomic human alveolar basal epithelial) cell lines. In these experiments we aim to measure the amount of protein produced for the short (proximal APA site used) and long (distal APA site used) isoforms (Fig 3A). In our construct design, the upstream APA site (proximal) is mutated (indicated by an X in Fig 3A) so that only the correct 3’UTR isoform can be expressed. Both constructs have an additional strong SV40 polyadenylation signal to ensure that the transcripts end as expected. Using 3’ end specific sequencing we confirmed that we can accurately detect 3’ ends (S3A and S3B Fig) and that short and long 3’UTR constructs have the expected sequence (S3C and S3D Fig). As can be seen in Fig 3B the proximal (or short) isoform yields 50-fold more luminescence relative to the distal (long) isoform. This is the case both in HepG2 and A549 cell lines.
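
The reporter readout can be illustrated with a short calculation. The sketch assumes, following the Fig 3 legend, that nanoluciferase signal is normalized to the co-transfected firefly control in each well; the function names and luminescence values are hypothetical and were chosen only so that the toy result is a 50-fold difference.

```python
import math
from statistics import mean

def normalized_signal(nanoluc, firefly):
    """Per-well ratio of reporter (nanoluciferase) to control (firefly) luminescence."""
    if len(nanoluc) != len(firefly):
        raise ValueError("need one firefly reading per nanoluciferase reading")
    return [n / f for n, f in zip(nanoluc, firefly)]

def fold_difference(short_ratios, long_ratios):
    """Fold difference of mean normalized signal, short isoform over long isoform."""
    return mean(short_ratios) / mean(long_ratios)

# Made-up luminescence readings for three replicate wells per construct.
short = normalized_signal([5000.0, 5200.0, 4800.0], [100.0, 100.0, 100.0])
long_ = normalized_signal([100.0, 104.0, 96.0], [100.0, 100.0, 100.0])
fold = fold_difference(short, long_)
print(f"short/long fold difference: {fold:.1f}")
print(f"log10 fold difference: {math.log10(fold):.2f}")
```

Normalizing to the firefly control within each well corrects for differences in transfection efficiency before the two constructs are compared.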
Fig 3

Luciferase reporter and mRNA stability assays to measure the effect of long vs. short 3′ UTRs.

To measure the effect of the long and short 3′ UTR sequences on translation efficiency and mRNA stability, we performed a series of luciferase reporter assays. A) Schematic of the luciferase reporter assay, combining a nanoluciferase reporter, PEST domain and the SERPINA1 exon 7 coding sequence upstream of the long and short 3′ UTRs. The proximal APA site is mutated to inhibit use of this site (indicated with an x on the long construct). These constructs are co-transfected in HepG2 and A549 cell lines with a control firefly reporter and the ratio of nanoluciferase protein to firefly protein measured. B) Log normalized luminescence, which indicates gene expression, measured for short (blue) and long (tan) SERPINA1 isoforms. The expression is significantly higher for the short isoform by close to two orders of magnitude in both lung derived A549 cells and liver derived HepG2 cells. C) Pulse chase experiments in A549 cells using ethynyl uridine (EU) and Click-iT chemistry for labeling with biotin-azide to measure relative mRNA stability by qRT-PCR [63]. We confirmed that GAPDH was consistently stable over the time period and observed similar declines in both the long and short SERPINA1 RNAs, indicating similar stability for both long and short 3′ UTRs. D) Deletion construct design to identify regions controlling gene expression in SERPINA1 3′ UTR. Six constructs were designed to selectively delete regions 1–4. E) Relative luminescence which indicates expression for the six deletion constructs compared to short (blue) and long (tan) SERPINA1 3′ UTRs. F) Predicted vs. measured expression of deletion constructs using regression model described by Eq 1. This model yields the translation coefficients of the four regions reported in Table 1.

Table 1

Translation Coefficient of SERPINA1 3′ UTR Regions.

Region | Transcript Coordinates | Genomic Coordinates | Translation Coefficient | eCLIP RBPs
Proximal | NM_000295:1519–1597 | chr14:94378451–94378373 | - | PCBP1, PCBP2, SND1, LARP4
Region 1 | NM_000295:1619–2016 | chr14:94377952–94378350 | 0.34±0.04 | AKAP1, PCBP2, ILF3
Region 2 | NM_000295:2017–2416 | chr14:94377953–94377554 | 0.08±0.05 | AKAP1, PCBP2
Region 3 | NM_000295:2417–2816 | chr14:94377555–94377156 | 0.22±0.05 | AKAP1, PCBP2, ILF3, QKI
Region 4 | NM_000295:2817–3220 | chr14:94377157–94376747 | 0.32±0.04 | QKI

To distinguish between changes in RNA stability and translational efficiency, we also measured the relative stability of the long and short isoforms in A549 cell lines. We observed a decline over time in both the long and short SERPINA1 constructs (Fig 3C), consistent with high ΔCT values at 24 hours (S4A Fig). This finding indicates that the relative stability of the long and short isoforms (distal and proximal, respectively) is similar (Fig 3C). Longer 3′ UTRs are thought to destabilize mRNAs as they include more micro-RNA and RNA binding protein sites which affect mRNA stability and translation efficiency [12,32]. However, for SERPINA1, we do not detect significant stability differences between the short and long isoforms at early or late timepoints (Figs 3C and S4A). We find that steady state levels of the long isoform are lower than steady state levels of the short isoform (S4B Fig). However, this difference does not account for the large difference in translation between the short and long isoforms (Fig 3B). Thus, these experiments support the long isoform primarily inhibiting translation of the protein product.

Mechanism of translation suppression by long SERPINA1 3′ UTR

To identify regions of the long SERPINA1 3′ UTR that control translation we designed a series of six deletion constructs based on the long 3’UTR construct that uses only a distal APA site (Figs 3D and S3). We measured the relative luminescence of these constructs in A549 cells and observed that in general shorter constructs resulted in higher protein production. However, the relationship between construct length and expression is not linear; rather, each deleted region appears to affect translation to a different degree (Fig 3E). To understand the role of each region in controlling translation we developed a model to predict the relative contribution to translation efficiency of each specific region (Eq 1). When we fit this model to our data (Fig 3F) we obtained high correlation between predicted and experimental values (R² = 0.96). Furthermore, an analysis of the relative weight of each region’s translation (Table 1) reveals that no single region uniquely controls translation efficiency. Nonetheless, regions 1, 3 and 4 together account for over 90% of the variation in translation efficiency between the long and short 3′ UTRs. Together these experiments suggest that the primary function of the long (distal) 3′ UTR in SERPINA1 is to inhibit translation, and that regions 1, 3, and 4 contribute the most to the translation repression. To dissect the mechanisms controlling translation repression and distal vs. proximal APA site usage we now investigate the effects of trans factors on the SERPINA1 3′ UTR.

Role of RNA Binding Proteins and structure in controlling translation efficiency

The fact that the long isoform of the 3′ UTR does not destabilize the message (Fig 3C) but decreases translation efficiency by several orders of magnitude is uncommon, but not unprecedented [33]. To reveal a mechanism that can account for this we begin our investigation by mining eCLIP data (enhanced CrossLinking and ImmunoPrecipitation) from recent ENCODE (Encyclopedia of DNA Elements) experiments on RNA binding proteins carried out in HepG2 cell lines [34-36]. In Fig 4A, we show high-confidence experimental eCLIP binding sites on the long SERPINA1 3′ UTR isoform. Several interesting features emerge from this analysis. First, multiple proteins bind near the proximal APA site, but further downstream fewer binding sites are present. The most distal binding site identified by eCLIP is for the RNA binding protein QKI (Quaking homolog) which spans regions 3 and 4. No other RBPs were identified by eCLIP binding downstream of this site. As a reminder, Deletion 1 (which removes Regions 2, 3 and 4) restores the translation efficiency of SERPINA1 3′ UTR to near proximal levels (Fig 3E). Also, deletions 5 and 6 (which maintain regions 3 and 4) both suppress translation to near full-length levels (Fig 3E). While other RNA binding proteins may have a role in regulating SERPINA1 mRNA or A1AT expression [37-39] (S1 Table), QKI’s distal binding position in Regions 3 and 4 in the 3’UTR is unique.
Fig 4

Role of RNA binding proteins and structure in modulating translation efficiency.

A) Mapping of eCLIP (enhanced CrossLinking and ImmunoPrecipitation) from recent ENCODE (Encyclopedia of DNA Elements) experiments on RNA binding proteins carried out in HepG2 cell lines [34–36] onto the long 3′ UTR isoform of SERPINA1 mRNA. Each rectangle indicates a protein binding site and sites are colored by RBP. B) SHAPE (Selective 2’ Hydroxyl Acylation by Primer Extension) structure probing of the long SERPINA1 3′ UTR. SHAPE data identifies flexible (unpaired) nucleotides in the RNA structure revealing regions of high accessibility for protein binding. Red indicates highly reactive nucleotides, yellow intermediate and black low. C) Using SHAPE data, we compute the entropy of the SERPINA1 mRNA structure. Low entropy regions adopt single, well-defined structures, while high entropy regions are more disordered [64,65]. D) SHAPE derived secondary structure model for short (top) and long SERPINA1 3′ UTR indicated as an arc diagram. Only highly probable base-pairs are shown (green). As can be seen, QKI (Quaking Homolog) binds to a low-entropy region in the 3′ UTR with extensive local base-pairing. E) SHAPE reactivity for the nucleotides of the eCLIP QKI binding site, including the putative binding motif of the RNA binding protein 5’-NACUAAY-N(1,20)-UAAY-3′ [46]. F) We found the QKI binding site to impact gene expression from the SERPINA1 3’UTR using our nanoluciferase assay. Mutating the QKI binding site increases expression from the long 3’UTR isoform (QKImutant) while adding the QKI binding site and flanking nucleotides decreases expression of the short 3’UTR isoform (Short+QKI) in A549 cells. G) Secondary structure model derived from experimental SHAPE data for the QKI binding region; both regions of the motif are accessible for binding. H) QKI is known to retain mRNAs in the nucleus. We measured the SERPINA1 distal ratio in Hepatocellular carcinoma (●) and HepG2 (Δ) nuclear and cytoplasmic RNA-seq fractions and find the highest ratio in the nucleus. 
This suggests QKI binds and retains the long SERPINA1 isoform in the nucleus thereby inhibiting translation.

Our previous work on the SERPINA1 5’ UTR established that RNA structure plays an important role in controlling translation efficiency of the A1AT protein [4]. To determine if RNA structure could play a role in controlling translation by mediating QKI binding, we performed SHAPE-MaP (Selective 2’ Hydroxyl Acylation by Primer Extension and Mutational Profiling) to determine the secondary structure of SERPINA1 3′ UTR [40-43]. The SHAPE data is shown in Fig 4B, where red bars indicate high, yellow intermediate, and black low nucleotide reactivity. High SHAPE reactivities indicate the nucleotides are flexible and therefore likely unpaired [4,41,42]. From the SHAPE data we compute the Shannon Entropy of the RNA, which measures the degree of structuredness (Fig 4C) [42-45], as well as the structure indicated as an arc-plot in Fig 4D. We observe low Shannon Entropy near the distal APA site and in the short isoform, suggesting a high level of RNA structuredness in the short isoform but high entropy (low structuredness) in Regions 2, 3 and 4. Interestingly, the QKI site has low Shannon Entropy indicative of higher structure near the binding site. Previously published analysis of QKI in vitro RNA binding specificity suggests that the protein binds to the consensus motif 5’-NACUAAY-N(1,20)-UAAY-3′ [46]. The motif is present in the SERPINA1 3′ UTR within the eCLIP QKI binding site, and multiple nucleotides have high SHAPE reactivity in the motif suggesting it will be accessible for binding (Fig 4E). The eCLIP data, the presence of the QKI consensus motif in the SERPINA1 3′ UTR, SHAPE structural analysis, and our luciferase data suggest QKI is likely playing a central role in translation repression of A1AT expression. 
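
The bipartite QKI consensus 5’-NACUAAY-N(1,20)-UAAY-3′ quoted above can be scanned for with a regular expression. The following is a minimal sketch, not the procedure used in the study: it expands the IUPAC codes N (any base) and Y (C or U) by hand, the function name `find_qki_motifs` is hypothetical, and the test sequence is invented rather than taken from SERPINA1.

```python
import re

# IUPAC codes used by the motif: N = any base, Y = pyrimidine (C or U).
# 5'-NACUAAY-N(1,20)-UAAY-3' with a spacer of 1 to 20 nucleotides.
QKI_BIPARTITE = re.compile(r"(?=([ACGU]ACUAA[CU][ACGU]{1,20}?UAA[CU]))")

def find_qki_motifs(rna):
    """Return (start, matched_sequence) for each motif start position.

    A lookahead is used so that overlapping candidate sites are all
    reported; the lazy spacer gives the shortest valid match from
    each start position. DNA input is converted to RNA first.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in QKI_BIPARTITE.finditer(rna)]

# Hypothetical test sequence containing one core site, a 5-nt spacer and one half site.
toy = "GGGAACUAACGGAUCUAACGGG"
print(find_qki_motifs(toy))
```

A sequence match alone says nothing about occupancy; in the text the motif is supported jointly by eCLIP signal and SHAPE accessibility.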
Using our nanoluciferase translation assay, we therefore mutated the bipartite QKI motif in the long isoform, which significantly restores translational efficiency (Fig 4F, QKImutant). Addition of the QKI bipartite motif to the short isoform (Short+QKI) significantly represses translation (Fig 4F). These data therefore further support that QKI is directly affecting translation of SERPINA1 3′ UTR. Finally, we modeled the RNA secondary structure of the SERPINA1 3′ UTR (Fig 4G) using the SHAPE data as a pseudo-free energy as previously carried out in [42,47,48]. This structure has potential ramifications as a modulator of translation repression and would also be a logical RNA target for small molecule or anti-sense therapeutic development [49]. We observed significant base-pairing in Region 1 and several hairpins spanning Regions 3 and 4 (indicated as green arcs, Fig 4D), which are consistent with low Shannon Entropy observed for the QKI binding site. Our structural model shows the proximal APA site is accessible (S1B Fig). One known function of QKI is to retain transcripts in the nucleus, which has the effect of repressing translation [50]. Thus, a majority of the SERPINA1 long 3′ UTR should be retained in the nucleus if this is the mechanism of translation suppression. When we analyze the distal ratio from cytoplasmic and nuclear poly-A selected fractions of HepG2 and HCC (Hepatocellular carcinoma) RNA-seq data [51,52] we observe a much higher proportion of distal reads in the nuclear fraction (Fig 4H). Based on the increased ratio of distal reads in the nucleus, we posit that one mechanism by which QKI could be altering translational efficiency is by retaining SERPINA1 mRNA in the nucleus. This would effectively inhibit translation since mRNAs must be exported to the cytoplasm to be translated by the ribosome. 
Additional mechanisms may be involved in translational inhibition in other regions of the 3’UTR that we did not detect since the ENCODE RNA binding protein data is not a comprehensive atlas of all known RNA binding proteins. Nonetheless, our findings demonstrate the value of integrating large scale genomic data sets in investigating post-transcriptional regulation.
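The bipartite QKI consensus motif quoted above, 5’-NACUAAY-N(1,20)-UAAY-3′, can be expressed as a regular expression for scanning a 3′ UTR sequence. The following is a minimal sketch, not the procedure used in this study: the function name `find_qki_sites` and the toy sequence are hypothetical, and we assume N means any nucleotide and Y means C or U.

```python
import re

# Bipartite QKI consensus 5'-NACUAAY-N(1,20)-UAAY-3'
# (assumed IUPAC codes: N = any base, Y = C or U).
# A lookahead is used so that overlapping candidate sites are all reported;
# the lazy spacer returns the shortest valid spacer for each start position.
QKI_BIPARTITE = re.compile(r"(?=([ACGU]ACUAA[CU][ACGU]{1,20}?UAA[CU]))")


def find_qki_sites(seq):
    """Return (0-based start, matched sequence) for each candidate QKI site.

    DNA input is accepted and converted to RNA (T -> U) before searching.
    """
    rna = seq.upper().replace("T", "U")
    return [(m.start(1), m.group(1)) for m in QKI_BIPARTITE.finditer(rna)]


# Toy sequence (hypothetical, not the SERPINA1 3' UTR)
toy = "GGGAACUAACGGAUAACGGG"
sites = find_qki_sites(toy)
```

A sequence match only nominates a candidate site; in the text above, the eCLIP peak and the SHAPE reactivity of the motif nucleotides are what support actual binding.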

Differential expression of RBPs in COPD and the distal ratio

Having analyzed eCLIP RNA binding protein data, we were interested in identifying RBPs that alter the distal ratio of the SERPINA1 3′ UTR in COPD that may not have been detected by RNA binding protein crosslinking. We therefore measured differential expression in the LTRC data for the 224 RBPs for which shRNA ENCODE knock-down data is available [34-36]. For these 224 RBPs we also measured the distal ratio upon shRNA knock-down and plot this on the y-axis of Fig 5A. The plot is centered on the control, empty vector shRNA distal ratio, indicated by a horizontal line. On the x-axis we plot the log2 differential expression fold change log2(DEFC) for these 224 RBPs when comparing the 376 COPD cases and 267 controls in the LTRC primary lung tissue data. In Fig 5A, the further the RBP is from the center of this two-dimensional volcano plot the more likely it modulates the distal ratio in COPD lungs.
Fig 5

Role of RNA binding proteins and relative single-cell populations in modulating SERPINA1 APA.

A) Two-dimensional volcano plot of endogenous shRNA distal ratio in SERPINA1 (y-axis) as a function of log2 differential expression fold change (log2(DEFC)) in LTRC primary lung tissue from 376 COPD cases and 267 controls for the 224 corresponding RNA binding proteins. The horizontal line represents the mean distal ratio in the corresponding empty vector shRNA controls, while the vertical line is centered on zero. Blue points (bottom left and top right quadrants) will decrease the distal ratio in COPD, while orange points (top left and bottom right quadrants) will increase the distal ratio. Filled circles indicate RNA binding proteins for which both the change in distal ratio from the shRNA control and log2(DEFC) are significant (padj<0.05). B) log2 computed Euclidean distance from center of two-dimensional volcano plot for significant (padj<0.05) RNA binding proteins affecting COPD. Again, those RNA binding proteins expected to lower the distal ratio are colored blue and negative, while those expected to increase the distal ratio are orange and positive.

When we measure the distal ratio change in shRNA knockdown experiments (y-axis, Fig 5A), we are reporting the change in distal ratio upon down-regulation of that RBP. These same RBPs may be up, down, or not regulated in the lung tissue of COPD vs. control individuals. As we consider the representation developed in Fig 5A, there are two types of proteins: 1) RBPs that increase the distal ratio in COPD (e.g. because their distal ratio after shRNA is above the control ratio and log2(DEFC)<0 in COPD), or 2) RBPs that decrease the distal ratio in COPD (e.g. because their measured distal ratio after shRNA is below that of the control and log2(DEFC)<0). Thus, RBPs in the top left and bottom right corners of Fig 5A contribute to increasing the distal ratio in COPD (orange), and those in the bottom left and top right corners contribute to decreasing the distal ratio (blue). 
In Fig 5A, filled circles meet the padj<0.05 threshold for both distal ratio difference from control and DEFC. Numerical data used to create Fig 5A are also provided in S2 Table. We notice that QKI is in the bottom left quadrant as it is significantly down regulated in COPD (log2(DEFC) = -0.13, padj = 5.4x10-16) and significantly decreases the distal ratio in shRNA knockdown experiments (distal ratio = 0.036, p = 1.54x10-32). Decreased QKI expression in COPD therefore decreases the long isoform of the SERPINA1 mRNA, which would lead to higher translation efficiency of the A1AT protein. One clear result in Fig 5A is that no single protein can account for the increase in usage of the distal 3′ UTR site we report in Fig 2. To quantify the effect of each protein we compute the log2 Euclidean distance from the center of the graph (Fig 5B), which quantifies the potential effect of each RNA binding protein on the distal ratio in COPD. A higher distance indicates a higher effect in COPD of the RBP on the distal ratio. We plot this distance as negative if the RBP lowers the distal ratio in COPD (blue) and positive (orange) if it increases it. If we consider all RBPs where the absolute value of the log2 Euclidean distance > 1 and padj<0.05 for shRNA distal ratio change and log2(DEFC), then we identify 6 RBPs that decrease the distal ratio and only 2 that increase it (Fig 5B blue and orange, respectively, S2 Table). CSTF2 (cleavage stimulation factor 2), an important component of the cleavage stimulation factor complex which regulates APA, is in the bottom left-hand corner and has the most negative Euclidean distance in Fig 5B, such that its differential expression in COPD will also contribute to less distal SERPINA1 mRNA expression. This analysis reveals that in COPD lung tissue, RNA binding protein expression levels are adjusted in a way that favors the short SERPINA1 isoform, thereby resulting in higher A1AT translation efficiency. 
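The quadrant logic of the two-dimensional volcano plot can be sketched in a few lines. This is an illustrative sketch only: the function name `classify_rbp` is hypothetical, the example values are invented, and the distance is the raw Euclidean distance from the plot center, without any axis scaling or log2 transform (the exact scaling used for Fig 5B is not specified here).

```python
import math


def classify_rbp(kd_distal_ratio, control_distal_ratio, log2_defc):
    """Predict how an RBP's expression change in COPD shifts the distal ratio.

    kd_distal_ratio:      distal ratio measured after shRNA knockdown of the RBP
    control_distal_ratio: distal ratio in the empty-vector shRNA control
    log2_defc:            log2 differential expression fold change (COPD vs control)

    Returns (direction, distance). Direction is "increase", "decrease" or "none";
    distance is the unscaled Euclidean distance from the plot center.
    """
    delta = kd_distal_ratio - control_distal_ratio  # y-axis offset from center
    distance = math.hypot(log2_defc, delta)
    product = delta * log2_defc
    if product < 0:
        # top left or bottom right quadrant (orange)
        direction = "increase"
    elif product > 0:
        # bottom left or top right quadrant (blue)
        direction = "decrease"
    else:
        direction = "none"
    return direction, distance


# Invented example: knockdown lowers the distal ratio and the RBP is
# down-regulated in COPD, so it falls in the bottom left quadrant.
direction, distance = classify_rbp(0.05, 0.10, -0.5)
```

The sign rule follows from the knockdown design: if reducing an RBP lowers the distal ratio, then lower expression of that RBP in COPD is expected to lower it as well.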
This is contrary to what we observed in Fig 2, where the distal ratio is increased in COPD individuals. Furthermore, SERPINA1 mRNA is not differentially expressed in COPD LTRC data (log2(FC) = 0.07, padj = 0.16), indicating that these changes only affect the ratio of long to short isoform, and therefore the translation of the A1AT protein in the lung. An important caveat in interpreting the data in Fig 5 is that it is not a comprehensive analysis of all known RNA binding proteins, since we are limited on the y-axis by existing ENCODE shRNA experiments. Furthermore, the shRNA data were collected in HepG2 cells, whereas our differential expression analysis was measured in primary lung tissue. There are therefore likely additional RBPs controlling the distal ratio in the lung that this analysis cannot reveal. It is possible that a different molecular mechanism, not captured by this analysis, explains the observed lengthening in COPD lungs. However, our analysis in Fig 5A and 5B of expression changes in the LTRC lung cohort reveals that in COPD lungs, multiple RNA binding proteins (including QKI and CSTF2) have altered expression that, in the context of corresponding shRNA knock-down experiments, will lead to a decreased distal ratio. This may act as a molecular buffer, effectively keeping the distal ratio from increasing more. We also analyzed single-cell RNA-seq data to investigate if another mechanism could explain the higher distal ratio observed in Fig 2B and 2C. Single-cell RNA-seq from five healthy human liver biopsies previously identified 20 clusters of cells, including six different hepatocyte clusters, shown in S5A Fig [53]. SERPINA1 mRNA is most predominantly expressed (green) in the six hepatocyte clusters (cluster numbers 1, 3, 5, 6, 14 and 15) (S5C Fig). In comparison, UMAP (Uniform Manifold Approximation and Projection) analysis of single-cell RNA-seq data from lung tissue of five healthy donors previously revealed 25 cell type clusters (S5B Fig) [54]. 
In these data SERPINA1 mRNA is predominantly expressed in Alveolar Type 1 and 2 cells as well as Macrophages (S5D Fig). Read coverage in single-cell RNA-seq data is significantly sparser and has higher variability than bulk RNA-seq data. We were able to visualize the distribution of distal ratios for each cell type in clusters with sufficient cells expressing SERPINA1 mRNA. For the liver data these include three hepatocyte clusters (1, 3, and 6) as well as Macrophages, αβ-Tcells and Plasma cells (S5E Fig). In the lung, these are Alveolar Type 1 and 2, Dendritic cells and Macrophages (S5F Fig). We observe that the distal ratio distributions are different in each cell type, but these differences are not statistically significant in these data, mostly due to the very large variation observed. Recent single-cell analysis of lung tissue observed important changes in AT2 cell populations in COPD individuals [55]; however, the methods used for cell dissociation cause substantial losses of AT1 cells. This makes it difficult to quantitatively relate these results to our observations in the LTRC data. Nonetheless, any changes in relative cell-type abundance in COPD lungs will affect the observed distal ratio in bulk RNA-seq data, and we hypothesize this is the mechanism that causes the increase in distal ratio observed in the LTRC COPD data (Fig 2C). Although we cannot quantitatively establish that RBPs or changes in cell type population alone are responsible for the observed difference in the distal ratio of SERPINA1 mRNA in COPD lungs (Fig 2B, 2C and 2D), our data reveal the complex network of factors that ultimately control A1AT protein production at the molecular and cellular level in the lung.

Discussion

A1AT protein expression is key to both liver and lung disease, and deficiency of the protein can lead to panlobular emphysema [7-9]. In the liver, Z-allele individuals often accumulate polymers of mutant A1AT leading to liver cirrhosis [2,56]. Although the SERPINA1 mRNA is expressed at the highest level in the liver, specifically in hepatocytes (S5C Fig), we show here it is also expressed in primary lung tissue, in Alveolar Type 1, 2 and macrophage cell types. The prevailing understanding of A1AT function is that it is secreted from the liver and carries out its functions in the lung [7,8]. Our findings agree with this, but also reveal a potentially important role for SERPINA1 mRNA in the lung. The unique transcript structure of the A1AT mRNA, with 11 transcript isoforms differing only in their 5’ UTR, is indicative of a post-transcriptional regulatory mechanism encoded in the non-protein-coding regions of this mRNA [3,4,6]. We report here the characterization of an alternative poly-adenylation event in the SERPINA1 3′ UTR that controls A1AT expression. Our measurements of both short and long-read RNA-sequencing, combined with 3′ end sequencing, confirm that the SERPINA1 3′ UTR has a long (distal) and short (proximal) isoform (Fig 1). Use of the distal site is significantly increased in lung tissue from COPD subjects (Fig 2) and this yields transcripts with a 50-fold lower translation efficiency (Fig 3). Furthermore, the heterozygous MS and MZ individuals and ZZ individuals with COPD all present a higher distal ratio compared with MM individuals with COPD. Given that a higher distal ratio results in lower A1AT translation, our work indicates that alternative polyadenylation, and in particular the use of the distal site in these individuals, likely exacerbates A1AT protein deficiency in these individuals’ lungs. 
Thus, the post-transcriptional response in the lung further prevents the production of A1AT protein, presumably where A1AT produced in the liver and transmitted through the plasma does not have access. Our analysis of SERPINA1 3’UTR translation, structure, sequence, and interactions with RNA binding proteins (Fig 4) indicate that QKI suppresses A1AT protein expression from SERPINA1 isoforms with long 3’UTRs. These data agree with shRNA knockdown experiments where we observe a lower distal ratio in the absence of QKI. Nuclear retention is a known function of QKI in the brain and may be part of the translational inhibition of SERPINA1 translation in liver and lung tissue [50,57]. Interestingly, QKI is downregulated in COPD lungs, which suggests that at a molecular level, individuals with COPD retain the long isoform less in their lungs. Our structural results also suggest that targeting the RNA structure near the QKI binding site in the SERPINA1 3′ UTR would likely increase translation of the A1AT protein, providing a potentially novel site for RNA therapeutic development that could be particularly advantageous to MS, MZ and ZZ individuals with COPD [49]. Interestingly, mice have six SERPINA1 paralogs (Serpina1a-f), which have been edited out to generate a model of A1AT deficiency and emphysema [58]. While all these paralogs share a conserved proximal polyadenylation site with the human SERPINA1 (S6A Fig), none of the murine paralogs contain a QKI binding site anywhere in the potential 3’UTR (S6B Fig). Only Serpina1c, d and f have potential distal sites (S6C Fig). Serpina1c and Serpina1d, as most of the other murine paralogs, have their highest expression in liver, while Serpina1f is expressed in the kidney (S6D and S6E Fig). However, there is no evidence for significant expression of the distal region in any of the murine Serpina1 paralogs in two different RNA-Seq datasets (median distal ratio < 0.001, S6F and S6G Fig). 
Post-transcriptional regulation of SERPINA1 has therefore diverged between humans and mice, and testing of SERPINA1 therapeutics aimed at modulating RNA regulation will require development of a different animal model system. Our data reveal a more fundamental aspect of post-transcriptional response to COPD in lung tissue. We found that CSTF2 is significantly down regulated in COPD lungs (log2(DEFC) = -0.09, padj = 9.5x10-5). This RNA binding protein is a global regulator of transcription termination and plays a central role in controlling alternative poly-adenylation [59-61]. For the SERPINA1 mRNA, the result is a preferential use of the proximal site, which in turn results in higher A1AT expression. In fact, most changes in RNA binding protein expression in COPD lungs (Fig 5B) have the effect of favoring the proximal alternative polyadenylation site. Thus, the molecular response in the lung at the level of RNA binding proteins favors the short 3′ UTR of the SERPINA1 mRNA, increasing the translation and expression of the A1AT protein. The skew toward RBPs that decrease the amount of SERPINA1 long 3′UTR isoforms in individuals with COPD (Fig 5A and 5B) contradicts our observation of the 2.1-fold higher distal ratio (greater use of the distal site) in COPD lungs (Fig 2). However, our analysis of single-cell RNA-seq data from primary lung and liver tissue reveals that SERPINA1 mRNA is expressed in multiple different cell types. Although single-cell RNA-seq data is still too sparse to allow us to accurately measure the distal ratio for each cell type cluster, we observe variation in the distal ratio between individual cells. Thus, one possible explanation for the increased distal ratio observed in COPD lung tissues is a change in cellular composition. This change may also be driven by differences in the inflammatory state of the lung. 
If this is indeed the case, then our finding that RNA binding protein expression changes in COPD lungs to increase the shorter isoform indicates that molecular expression effectively balances changes in cellular composition and/or inflammation that increase 3′ UTR length. This may also explain why we observe only a 2.1-fold increase in distal ratio in COPD, as it is effectively buffered by compensating changes in RNA binding protein expression. Finally, the ENCODE RNA binding protein eCLIP and shRNA data is very powerful but is not a comprehensive atlas of all known RNA binding proteins. As such, it is possible that a yet to be profiled RNA binding protein, which our analysis could not reveal, lengthens the 3’ UTR. Although statistically significant, the 2.1-fold increase in distal site usage in COPD lungs remains modest (Fig 2) and would cause at most a 15% decrease in the overall translation and expression of A1AT protein. Given the 50-fold decrease in translation efficiency of the long 3′ UTR isoform, it is however essential to buffer against a high distal ratio in all tissues requiring protein expression. Our data reveal that the RNA binding proteins QKI and CSTF2 play a central role in this regulation and are accordingly differentially regulated in the lungs of COPD individuals. Importantly, our findings are also consistent with one known mechanism of translational control by QKI, which is nuclear retention. Independent of the exact mechanism, our data suggest that inhibition of the QKI/SERPINA1 3’UTR interaction offers a potential therapeutic strategy for increasing A1AT expression in the lungs of individuals with COPD.
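The relationship between distal ratio and overall A1AT output discussed above can be illustrated with a simple two-isoform mixture model. This is a back-of-the-envelope sketch under stated assumptions, not the calculation used in this study: we assume total mRNA is unchanged, that the short isoform is translated at relative efficiency 1 and the long isoform at 1/50, and that the fold increase applies directly to the distal ratio. The function names and the baseline value in the example are hypothetical.

```python
def relative_a1at_output(distal_ratio, fold_penalty=50.0):
    """Relative protein output for a given fraction of long (distal) isoform.

    Assumes the short isoform has efficiency 1 and the long isoform has
    efficiency 1 / fold_penalty (50-fold lower, as measured in the reporter assay).
    """
    if not 0.0 <= distal_ratio <= 1.0:
        raise ValueError("distal_ratio must be between 0 and 1")
    return (1.0 - distal_ratio) + distal_ratio / fold_penalty


def percent_decrease(baseline_ratio, fold_increase=2.1, fold_penalty=50.0):
    """Percent loss in output when the distal ratio rises by fold_increase."""
    before = relative_a1at_output(baseline_ratio, fold_penalty)
    after = relative_a1at_output(baseline_ratio * fold_increase, fold_penalty)
    return 100.0 * (before - after) / before


# Hypothetical baseline distal ratio of 0.10 (illustrative value only)
loss = percent_decrease(0.10)
```

Under this model the loss in output depends strongly on the baseline distal ratio, which is why a modest fold change in a low-abundance isoform translates into a limited change in protein.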

Methods

Cell culture

Liver derived hepatocellular carcinoma (HepG2) cells and lung derived epithelial carcinoma (A549) cells were obtained from ATCC through UNC Tissue Culture Facility. HepG2 cells were cultured in Eagle’s Minimum Essential Media supplemented with 10% fetal bovine serum. A549 cells were cultured in RPMI 1640 media supplemented with 10% fetal bovine serum. Cells were split regularly using Tryple (Fisher). All cells were grown in 5% CO2 and 37°C.

3′ sequencing

Total RNA was extracted from HepG2 cells using TRIzol (ThermoFisher) followed by RNA column purification (Purelink, Invitrogen) and on-column DNA digestion (Purelink, Invitrogen). RNA was used to create 3′ end specific libraries with the Quant Seq kit (Lexogen). The same sample of HepG2 RNA was depleted of ribosomal RNA (Ribominus Eukaryotic v2, Fisher) and used to generate standard Illumina short-read libraries using the Nextera Flex kit (Illumina). Both libraries were sequenced (Illumina MiSeq, 300 kit) and the fastq sequence files used in downstream analysis. The standardized Bluebee Lexogen Quant Seq FWD protocol was used for trimming, read alignment, and quality control steps of the Quant Seq library. The standard HepG2 transcriptome sequencing was analyzed in the same manner as other fastq datasets (see Calculation of SERPINA1 distal ratio from RNA sequencing).

Calculation of SERPINA1 distal ratio from RNA sequencing

We followed the GTEx Consortium and TopMed guidelines for sequence alignment to the human genome (https://github.com/broadinstitute/gtex-pipeline/commits/master/TOPMed_RNAseq_pipeline.md). Briefly, we used STAR [66] to align fastq files to hg38 (containing no alternative, no decoy and no HLA chromosomes). After alignment, or with pre-aligned samples, we used samtools to determine the read depth at each nucleotide in the 3′UTR, summed up the depth within the distal and proximal specific regions and normalized by the gene region length. We calculated the ratio of distal 3′UTR reads by taking the length normalized read counts in the distal region and dividing by the sum of the length normalized distal and proximal reads. The same approach was used on individual clusters for single-cell RNA-seq analysis (S5 Fig).
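The distal ratio calculation described above can be sketched as follows. This is a minimal illustration, assuming the per-nucleotide read depths for the proximal-specific and distal-specific regions have already been extracted (for example from `samtools depth` output); the function names and example depths are hypothetical, and the region coordinates are not reproduced here.

```python
def length_normalized_depth(depths):
    """Sum per-nucleotide read depth over a region and divide by its length."""
    if not depths:
        raise ValueError("region must contain at least one position")
    return sum(depths) / len(depths)


def distal_ratio(proximal_depths, distal_depths):
    """Distal ratio = distal / (distal + proximal), using length-normalized depth.

    Returns None when neither region has coverage, so that such samples can
    be excluded rather than reported as zero.
    """
    prox = length_normalized_depth(proximal_depths)
    dist = length_normalized_depth(distal_depths)
    total = prox + dist
    if total == 0:
        return None
    return dist / total


# Invented depths: a 4-nt proximal region at 100x and an 8-nt distal region at 10x
ratio = distal_ratio([100, 100, 100, 100], [10] * 8)
```

Normalizing by region length keeps the ratio comparable even though the distal-specific region is much longer than the proximal-specific region.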

Construction of nanoluciferase SERPINA1 3′UTR reporter system

We gene synthesized the full 3′UTR sequence of SERPINA1 (Twist biosciences) with a mutated proximal polyA site (AUUAAA to AUCAAG) and 60 nucleotides of coding sequence. This coding region and full 3′UTR were cloned into pNL3.2.CMV (Promega) using sequence homology (NEB Builder HiFi). A short 3′UTR isoform was amplified from the gene synthesized construct using PCR with the same coding sequence portion. This short isoform was cloned into the same backbone with sequence homology. The nanoluciferase expressed from this construct contains a PEST domain to prevent protein accumulation. All additional deletion and mutation constructs were created from verified long 3’UTR plasmids. Primers for cloning are listed in S3 Table. The mutated polyA site was reverted back to wildtype for both the long and short 3′UTR constructs (NEB Q5 Site directed mutagenesis kit) for structural studies. All constructs were verified by Sanger sequencing and restriction digest.

Luciferase experiments

For deletion series analysis, A549 cells were split into 96 well opaque, clear bottom plates (~150,000 cells/mL) and transfected the next day with 10 ng of control firefly luciferase and 10 ng of nanoluciferase constructs using standard quantities of Lipofectamine 3000 and P3000 in serum-free Optimem (Fisher). HepG2 cells were treated similarly, but 50 ng of firefly and nanoluciferase constructs were used. Cells were transfected in triplicate. After 30 hours plates were used to measure luminescence using the Promega Nano-Glo Dual-Luciferase Reporter Assay System. Luminescence was quantified at 480 nm on the BMG Labtech Clariostar plate reader. Four biological replicates from separate days were combined for analysis. R was used to normalize samples (0 to 1) and we created a linear model with no y-intercept. Coefficients for each region are listed in Table 1. For QKI mutation analysis, A549 cells were split into 12 well plates and transfected the next day with 0.5 µg of control firefly luciferase and 0.5 µg of nanoluciferase constructs using standard quantities of Lipofectamine 3000 and P3000 in serum-free Optimem (Fisher). Cells were transfected in triplicate. After 48 hours plates were used to measure luminescence using the Promega Nano-Glo Dual-Luciferase Reporter Assay. Luminescence was quantified at 480 nm on the SpectraMax iD5 plate reader. Three replicates from separate days were combined for analysis.

RNA stability quantification

We transfected A549 cells (same conditions as Luciferase assays) with equimolar amounts of either the long or short 3′UTR nanoluciferase reporter and performed a pulse-chase experiment by incubating transfected cells with ethynyl uridine (EU) and harvesting either immediately, two hours or six hours later (Click-iT Nascent RNA Capture Kit, Invitrogen). Incorporation of EU into RNA does not alter cellular biology and allows for click-it chemistry and labeling with biotin-azide [63]. RNA was extracted using Trizol (Invitrogen) chloroform extraction and Phaselock Tubes (Invitrogen). RNA was purified using Purelink RNA Mini Kit (Invitrogen) and treated with DNAse I (Invitrogen) prior to being converted to cDNA using VILO reverse transcriptase according to manufacturer’s instructions (Invitrogen). qPCR was carried out using PowerUp Sybr Green Master Mix (Thermo Fisher Scientific) on cDNA isolated from the 0, 2 and 6 hour time points, with primers specific for nanoluciferase and for the short or long isoform of SERPINA1. Primers are listed in S3 Table. The qPCR reactions were performed on an Applied Biosystems QuantStudio 6. We confirmed that GAPDH was consistently stable over the entire time period and then determined SERPINA1 3′UTR RNA stability by normalizing expression values of triplicate samples to GAPDH (ΔCT = CT(SERPINA1) − CT(GAPDH)).

RNA structure experiments

A T7 promoter was added to the wild-type nanoluciferase SERPINA1 3′UTR during PCR amplification. In vitro RNA was produced off the purified PCR product (NEB HiScribe T7 High yield RNA synthesis kit), treated with Turbo DNase, purified and incubated at 37°C for 10 min in folding buffer (final concentration, 200 mM Bicine, 100 mM NaCl and 10 mM MgCl2). SHAPE treatment was performed with an addition of DMSO (control, 10% final concentration) or 5NIA (25 mM in DMSO final concentration) for 5 min at 37°C. The RNA was purified (AmpureXP RNA beads). We generated sequencing libraries (Swift RNA library kit), bioanalyzed and quantified the libraries and then sequenced the samples (Illumina MiSeq 300 kit). Reactivity profiles were generated using SHAPEMapper [40]. Base-pairing probabilities were generated with SuperFold and RNA structure models with RNAstructure [67,68].
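The per-nucleotide Shannon entropy reported in Fig 4C is derived from base-pairing probabilities. The sketch below shows one common formulation, in which the entropy at nucleotide i is the sum of −p·log10(p) over all of its possible pairing partners. It is an illustration, not the SuperFold implementation: the function name, the dictionary input format, the choice of log base, and the omission of an unpaired-state term are all assumptions.

```python
import math


def shannon_entropy(pair_probs, length, base=10.0):
    """Per-nucleotide Shannon entropy from base-pairing probabilities.

    pair_probs: dict mapping (i, j) with i < j (0-based) to the probability
                that nucleotides i and j are paired.
    length:     number of nucleotides in the RNA.

    Low values mean a nucleotide has one dominant pairing state (well-defined
    structure); high values mean many competing partners (disordered).
    """
    entropy = [0.0] * length
    for (i, j), p in pair_probs.items():
        if p > 0:
            term = -p * math.log(p, base)
            # each pair contributes to the entropy of both partners
            entropy[i] += term
            entropy[j] += term
    return entropy


# Invented example: nucleotide 0 pairs with 2 or 3 with equal probability
profile = shannon_entropy({(0, 3): 0.5, (0, 2): 0.5}, length=4)
```

In practice such profiles are usually smoothed over a sliding window before low-entropy regions are called.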

RBP shRNA knockdown effects and eCLIP analysis of SERPINA1 3′UTR isoforms

The RBP eCLIP binding data was downloaded from the ENCODE data portal on Nov. 18, 2019. This download was restricted to assays in K562 and HepG2 cell lines and for RBPs with a “Target Category” in ENCODE that matched “RNA binding protein” and mapped to GRCh38. This resulted in eCLIP data for 122 RBPs. All downloaded eCLIP sites were originally identified through the described ENCODE 3 pipeline, a stringent process that takes into account experimental reproducibility and fraction of reads in peaks (FRiP) over background [34]. We then mapped peaks using the narrowPeak bed files with “1, 2” listed in the biological replicates column in the downloaded ENCODE metadata file. We also downloaded sequencing files for RBP shRNA knockdown for control shRNA treatment and shRNAs against SERPINA1 RBPs in HepG2 cells from the ENCODE portal. All shRNA experiments had a knockdown efficiency of at least 50% compared to the scrambled control for protein and RNA. We aligned these fastq files and calculated the ratio of distal 3′UTR reads (see Calculation of SERPINA1 distal ratio from RNA sequencing).

RBP differential expression analysis in LTRC data

COPD was defined as forced expiratory volume in one second (FEV1) to forced vital capacity (FVC) of less than 0.7 and FEV1 less than 80% predicted (Global Initiative for Chronic Obstructive Lung Disease spirometric stages 2–4). No individual with alternative pathologic lung diagnosis other than emphysema was included as a subject or control. Control subjects were defined by FEV1/FVC ratio > 0.70. In pre-analysis filtering, we excluded genes with fewer than 1 count per million reads in at least 50% of subjects. We performed differential gene expression using the limma [69] and voom [70] packages from R/Bioconductor. We tested the associations of gene expression levels and COPD. We adjusted the models for age, race, sex, current smoking status, smoking pack-years, and library preparation batch. Surrogate variables were used to estimate other latent effects and were computed using the sva package. We controlled for multiple testing with a false discovery rate (FDR) of 1%.
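The spirometric case and control definitions above can be written as a small classification rule. This sketch covers only the two spirometry thresholds stated in the text; the function name and the "unclassified" label are hypothetical, and the pathology-based exclusions are not modeled.

```python
def classify_subject(fev1_fvc, fev1_pct_predicted):
    """Assign case/control status from spirometry.

    fev1_fvc:           FEV1/FVC ratio (e.g. 0.65)
    fev1_pct_predicted: FEV1 as percent of predicted (e.g. 72)
    """
    if fev1_fvc < 0.7 and fev1_pct_predicted < 80:
        return "COPD"      # GOLD spirometric stages 2-4
    if fev1_fvc > 0.7:
        return "control"
    # e.g. obstruction with FEV1 >= 80% predicted (GOLD stage 1): neither group
    return "unclassified"


status = classify_subject(0.55, 60)
```

Subjects who meet neither definition fall outside both groups, consistent with the exclusion of individuals without a clear phenotype.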

Datasets

We obtained pre-aligned hg38 GTEx tissue specific sequencing through dbGaP (dbGaP Accession phs000424.v8.p2) and focused on the top tissues where SERPINA1 is expressed (Liver, Lung, Blood, Spleen, Kidney and Small Intestine). We re-analyzed publicly available lung RNA-seq data from 82 COPD individuals and 64 individuals with normal spirometry (Bioproject PRJNA245811). Sequencing runs without coverage in the long 3′UTR region of SERPINA1 were excluded from analysis. We repeated our analysis with a larger set of lung RNA-Seq data from 376 individuals exhibiting COPD and 267 individuals with normal lung function (Lung Tissue Research Consortium, generated by the NHLBI TOPMed Project, https://www.nhlbiwgs.org/). This larger RNA-Seq dataset is available with authorization (dbGaP Accession phs001662.v2.p1). Individuals without clear phenotype were excluded. We used the ENCODE HepG2 fractionation experiments (Bioproject PRJNA30709:GSE30567, runs SRR307915, SRR307916, SRR307928, SRR307929) and the patient-derived hepatocellular carcinoma cell line fractionation experiments (Bioproject PRJNA543441) to investigate cellular localization of SERPINA1 isoforms. To analyze SERPINA1 long read sequencing, we used publicly available data from PacBio human liver aligned using IsoSeq. Links to this data and human blood and heart sequencing are available on https://s3.amazonaws.com/datasets.pacb.com/downloadtools.html.

Sequence of proximal and distal polyadenylation sites in SERPINA1 3′ UTR.

A) SERPINA1 mRNA 3′ UTR sequence showing consensus polyA signal (yellow), cleavage site as determined by 3′ End sequencing (red) and putative G/U-rich region downstream of cleavage site. B) Structural analysis of proximal site in SERPINA1 mRNA 3′ UTR. We observe that both the stop codon (red) and APA site (purple) are highly accessible (i.e. have high SHAPE reactivity), allowing both these sites access to the Ribosome and cleavage machinery respectively. (TIF)

Analysis of distal ratio with GOLD stages of COPD in LTRC data.

SERPINA1 3′ UTR distal ratio for disease severity as measured by GOLD stage rating of COPD (red) and non-COPD (blue) indicating an increase in distal ratio with increasing disease severity. (TIF)

3’UTR constructs are terminated at the expected length.

We measured 3’ end specific reads from A549 cells transfected with the nanoluciferase short or long 3’UTR construct and identified 3’ ends of endogenous and transfected SERPINA1 constructs. As expected, we found that 3’ reads cluster at the end of the 3’UTR of the endogenously expressed GAPDH in both A) short and B) long SERPINA1 3’UTR transfected cells. C) We identified a single primary 3’ end for the short 3’UTR SERPINA1 transcripts. D) We identified a single primary 3’ end for the long 3’UTR SERPINA1 transcripts at the end of the 3’UTR, mainly in the SV40 polyA signal, indicating that there is no cleavage and polyadenylation from the mutated proximal polyA site. (TIF)

Long and short 3’UTR SERPINA1 isoforms have similar RNA stability.

A) We performed pulse chase experiments in A549 cells using ethynyl uridine (EU) and click-it chemistry for labeling with biotin-azide to measure relative mRNA stability by qRT-PCR. GAPDH was stable over the 24-hour period. We observed a steep decline in both the long and short SERPINA1 constructs, consistent with high ΔCT values, indicating similar stability for both long and short 3′ UTRs. B) We measured the levels of short and long SERPINA1 3’UTR isoforms transfected into A549 cells at equimolar amounts for steady-state RNA levels. We used two different qRT-PCR primer pairs (blue and red). We found that the short and long transcripts were present at similar levels. (TIF)

Cluster definitions for single cell RNA-seq data.

A) tSNE (t-distributed stochastic neighbor embedding) clustering of single-cell RNA-seq from five healthy human liver biopsies identifying 20 clusters as determined by [53]. B) UMAP (Uniform Manifold Approximation and Projection) analysis of single-cell RNA-seq data from lung tissue of five healthy donors reveals 25 cell type clusters as determined by [54]. C) SERPINA1 expression (normalized transcript counts) in liver cells is indicated by the yellow-green heatmap and reveals the highest counts in hepatocytes, where six separate clusters were identified (clusters 1, 3, 5, 6, 14 and 15 indicated on cell map). D) SERPINA1 mRNA in lung cells is predominantly expressed in Alveolar Type 1 and 2 cells as well as macrophages. E) Liver distal ratio distributions of SERPINA1 mRNA in single cells (indicated as open dots) for hepatocyte clusters 1, 3 and 6, αβ T cells and plasma cells illustrate significant variation within and among cell type clusters. F) Lung distal ratio distributions in single cells (indicated as dots) for Alveolar Type 1 and 2, dendritic and macrophage cell types also indicate significant variation. (TIF)

Post-transcriptional RNA regulation of murine Serpina1 paralogs is not analogous to regulation of human SERPINA1.

A) Alignment of human SERPINA1 and the six mouse Serpina1 paralogs (a-f) around the human proximal APA site, indicating that this initial APA site is conserved in all murine Serpina1 paralogs. B) Alignment of human SERPINA1 and the six mouse Serpina1 paralogs (a-f) around the human QKI binding site, illustrating poor conservation of the long 3’UTR. There are no canonical QKI bipartite or primary single QKI binding sites in the entire 3’ region (stop codon + 2 kb) of murine Serpina1a-f. C) Human SERPINA1 and murine Serpina1a-f all contain a proximal APA site. Only Serpina1c, Serpina1d and Serpina1f have distal APA sites and these are not strictly conserved with the human SERPINA1 distal site. D) and E) Isoform-specific alignment of Serpina1a-f from two different murine tissue datasets indicates that Serpina1a-e transcripts are most highly expressed in liver tissue while Serpina1f transcripts are primarily expressed in kidney tissue. F) and G) We analyzed distal reads in Serpina1a-f in liver, and in kidney tissue where available, and found no evidence for distal reads in the 3’ regions of murine Serpina1a-f. We calculated the median distal ratio to be less than 0.001, suggesting that mice do not express a long isoform of Serpina1. (TIF)
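To make the median distal ratio statement concrete, here is a minimal Python sketch of how such a summary could be computed from per-sample read counts. The definition used (length-normalized read density in the distal-only 3’ region divided by density in the proximal region) is an assumption for illustration and may differ from the authors' exact definition; the function name, region lengths and counts are all hypothetical.

```python
from statistics import median


def distal_ratio(distal_reads, distal_len, proximal_reads, proximal_len):
    """Length-normalized distal coverage over proximal coverage.

    Assumed definition, for illustration only. Returns None when there is
    no proximal coverage, because the ratio is then undefined.
    """
    if proximal_reads == 0:
        return None
    distal_density = distal_reads / distal_len
    proximal_density = proximal_reads / proximal_len
    return distal_density / proximal_density


# Hypothetical per-sample counts: (distal reads, proximal reads).
samples = [(0, 1200), (1, 950), (0, 0), (0, 1430)]
DISTAL_LEN, PROXIMAL_LEN = 1500, 100  # hypothetical region lengths (nt)

ratios = [distal_ratio(d, DISTAL_LEN, p, PROXIMAL_LEN) for d, p in samples]
defined = [r for r in ratios if r is not None]  # drop unexpressed samples
median_ratio = median(defined)
print(median_ratio < 0.001)
```

Samples with no proximal coverage are excluded rather than counted as zero, so that unexpressed paralogs do not pull the median down artificially.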

RNA binding proteins identified through eCLIP as binding the 3’UTR of SERPINA1

(XLSX)

RBPs with significant shRNA distal ratio change and corresponding differential expression in COPD from LTRC data.

(XLSX)

Primers used for qRT-PCR and cloning.

(XLSX)

Supplemental Methods.

Additional methods used to generate and analyze data for supplemental figures and tables. (DOCX)

29 Jun 2021

Dear Dr Lackey,

Thank you very much for submitting your Research Article entitled 'Alternative poly-adenylation modulates α1-antitrypsin expression in chronic obstructive pulmonary disease' to PLOS Genetics. The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers have found that the manuscript has limited novelty and presents largely correlative results. They have raised issues regarding the interpretation of the RNA decay data as well as the function of QKI in the expression of α1-antitrypsin isoforms. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version, in which experimental evidence is provided for the specific points just mentioned. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.
We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.
Yours sincerely,

Mihaela Zavolan
Associate Editor
PLOS Genetics

Gregory Barsh
Editor-in-Chief
PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors focus on the SERPINA1 gene, which is clinically relevant in the lung. Using computational analysis of several datasets and some experiments, the authors show that SERPINA1 is alternatively polyadenylated, which has not been reported before, that the long isoform is preferentially expressed in COPD, that the longer isoform leads to a reduced protein production in a reporter assay, and that various RNA binding proteins (RBPs) bind the long 3’UTR isoform, and can be predicted to be potentially relevant to the differential protein production and/or to the expression changes in COPD. Overall, the topic is of interest to the genetics community, as there are still not too many examples of interesting instances of alternative polyadenylation, and overall, both the computational and the experimental parts are performed rigorously. However, the overall conceptual advance here is modest. I do have several concerns related to over-interpretation of the presented data, and suggestions for experiments that would make the paper more suitable for PLoS Genetics.

Major comments:

1. The analysis showing that the 3’UTR length does not affect RNA levels or stability is not convincing. The authors show a metabolic labelling chase experiment for 24 hr only, after which most labelled RNA is gone, and present it as evidence for no effect of the 3’UTR on RNA stability. Even if the half-life of the RNA is affected by 2- or 4-fold (e.g., from 2hr to 8hr), it would probably not be visible in this experiment. Several relevant time points should be tested to provide a quantitative estimate of the half-life for the short and the long UTR, also controlling for the different plasmid size used.
The authors should also measure whether there is a change in steady-state mRNA levels.

2. The suggestion that QKI helps retain the SERPINA1 mRNA in the nucleus is very speculative. QKI has many possible functions, and only one of them is retention in the nucleus. The fact that the long isoform of SERPINA1 is preferentially nuclear is interesting, but can be affected by many factors other than QKI. Can the authors knock QKI down and measure subcellular localization of SERPINA1 mRNA? Otherwise, the statement should be appropriately toned down.

3. In the results presented in Figure 5, the authors reach quite strong conclusions based on overall limited data. The ENCODE shRNA data is impressive, but it only covers a limited set of RBPs, which are knocked down to varying degrees and not completely eliminated, and the data are from liver HepG2 cells, whereas the COPD data is for lung tissue. Therefore, it is possible that there are RBPs with a stronger effect that were either not profiled by ENCODE, not sufficiently knocked down, or have a different behavior in lung cells. The conclusions of some buffering based on the number of RBPs that pass some arbitrary thresholds are over-reaching. The text should be adjusted accordingly. Since QKI appears as relevant in multiple analyses, can the authors knock it down and see how it affects their luciferase reporter results? Or the endogenous protein levels, if possible?

Minor comments:

1. Figure 3E – does deletion of Region1 affect the usage of the proximal site? In general, for each of the deletions, the authors should attempt to quantify the relative Distal/Proximal usage in the 3’UTR of their reporter.

2. Figure 4A – “Region 1” is not labelled in the figure. Also, please explain the color coding of the clusters. Are these all the proteins with clusters in the ENCODE data in this region? What threshold was used to select them?

3. Since in the single-cell data the authors do not observe any statistically significant changes, and as they indicate the overall power of the analysis is severely compromised by data sparseness, I don’t see what this part adds to the manuscript. It can be relocated to the supplement and mentioned in the discussion section.

Reviewer #2: In the study led by Lackey and colleagues, the authors continue their investigation of SERPINA1 in COPD. They demonstrate that SERPINA1 pre-mRNA is subject to APA and that this appears to change in the disease setting. They characterize this event and reveal that the long isoform is significantly less translated and retained in the nucleus. Various NGS analyses nominate QKI as a potential regulator of this process and ultimately show that SERPINA1 polyA site choice varies between cell types using single cell RNA-seq. Overall, the study is interesting and has novel aspects, but more work would be needed to recommend further consideration. There are two significant weaknesses with this study that limit its impact and confound the interpretation. I note that both concerns are related to each other. First, the authors rely too heavily on retrospective analyses and need to conduct more experiments that directly test hypotheses. There is too much reliance on correlation in patient and consortium databases. Second, the model of QKI action on SERPINA1 is not made clear throughout the text likely because of my initial concern – more experimentation is required to clarify the model. Is it that QKI does not impact SERPINA1 polyA site choice but rather binds to the long isoform while in the nucleus, thereby retaining it and preventing it from getting translated? Or is it that QKI binds the long 3’UTR while it is being transcribed and affects PAS choice? Could both mechanisms be occurring? These are just some of the unanswered questions that leave the reader confused as to what exactly is happening in this disease.

Specific concerns are below:

1. The type ‘M, S, or Z alleles’ are not explained in the text. A reader not familiar with this literature would have no basis to understand their identity.

2. There is a conceptual disconnect between analyzing HepG2 cells and COPD lung tissues. If the authors are going to use HepG2 cells as a cellular model to focus on, then shouldn’t they be more focused on analysis of cirrhosis in the liver in patients?

3. The two nano-luciferase reporters need to be confirmed to use the correct PAS. A straightforward 3’RACE should suffice. As this is a key finding in the paper, such confirmation would be essential.

4. The authors datamine eCLIP data to generate a hypothesis that QKI is a trans-acting factor potentially regulating translation of SERPINA1. They go on to retrospectively analyze fractionated RNA-seq data and to analyze SHAPE data to ultimately conclude that QKI is likely an important trans factor that binds to the long 3’UTR isoform of SERPINA1 mRNA and retains it in the nucleus. While these data appear convincing, they are also completely correlative. The authors need to conduct two sets of experiments where they: 1) mutate the QKI consensus site in their nano-luc reporter to determine whether there is an effect on translation and 2) knock down QKI to determine whether SERPINA1 long isoforms are now relocalized to the cytoplasm. Without these experiments or similar types of experiments, it is impossible to understand cause/effect of QKI to SERPINA1.

5. Based upon the data presented in Figure 5A, I am puzzled how the authors make their conclusion in the context of their previous conclusions ‘QKI decreased expression in COPD therefore decreased the long isoform of the SERPINA1 mRNA, which would lead to higher translation efficiency of the A1AT protein’.
This conclusion now suggests that QKI impacts PAS choice and that the absence of QKI would result in more short isoform being produced to increase translation efficiency of A1AT protein. Or is it that the authors believe that when QKI expression goes down, the long isoform is less retained in the nucleus and is now free to increase translation?

Reviewer #3: The manuscript entitled “Alternative poly-adenylation modulates α1-antitrypsin expression in chronic obstructive pulmonary disease” describes an increase in the use of a distal poly-adenylation site in primary lung tissue RNA-seq in COPD cases when compared to controls. The authors also report that the alternative polyadenylation event involves two sites, a proximal and distal site downstream of the A1AT stop codon. They measured the distal ratio in human primary tissue short read RNA-seq data, corroborated their results with long read RNA-seq data, and demonstrated a 50-fold decreased translation efficiency and A1AT expression. They also identified QKI as a specific modulator of SERPINA1 mRNA translation through nuclear retention in COPD primary lung tissue. This study revealed a complex post-transcriptional mechanism that regulates alternative polyadenylation and A1AT expression in COPD and acts as a buffering mechanism for changes in cellular populations in COPD. This is an interesting and well designed study. The findings are important and novel.

Comments:

1. In Figure 2, the authors made conclusions about the M, S and Z alleles. More information on the N numbers should be provided and better justified. In addition, for the ZZ sample, how are conclusions made with N=1 as a control?

2. In Figures 3B and C it is not clear how the data from the two isoforms validate the authors’ conclusions by inhibiting translation, while showing no effect on mRNA stability. The conclusions reported here should be better supported by the data.

3. In Figure 4 the role and function of the identified RNA binding proteins to modulate translation efficiency in a COPD model should be shown.

4. Moreover, in Figure 5 the data from the single cell analysis should be validated at both the mRNA and protein levels to support the authors’ conclusions.

5. Finally, the manuscript lacks experiments to assess the function in a COPD model to validate the proposed mechanisms. This is a major limitation.

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

28 Sep 2021

Submitted filename: Lackey_etal_ResponseToReviewers.pdf

25 Oct 2021

Dear Dr Lackey,

We are pleased to inform you that your manuscript entitled "Alternative poly-adenylation modulates α1-antitrypsin expression in chronic obstructive pulmonary disease" has been editorially accepted for publication in PLOS Genetics. Congratulations! Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email.
Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Mihaela Zavolan
Associate Editor
PLOS Genetics

Gregory Barsh
Editor-in-Chief
PLOS Genetics

Comments from the reviewers (if applicable):

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have addressed all my comments from the previous review in a satisfactory manner. I can now recommend publication in PLoS Genetics.

Reviewer #2: I commend the authors on conducting a few of the key experiments that I suggested. I also believe they have done a better job muting the discussion on mechanism as the work is too preliminary to suggest a specific mode of action here. Thus, I am supportive of the manuscript in its current form.

Reviewer #3: The authors have addressed my previous comments.

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: Yes: Igor Ulitsky
Reviewer #2: No
Reviewer #3: Yes: Andriana Margariti

11 Nov 2021

PGENETICS-D-21-00741R1

Alternative poly-adenylation modulates α1-antitrypsin expression in chronic obstructive pulmonary disease

Dear Dr Lackey,

We are pleased to inform you that your manuscript entitled "Alternative poly-adenylation modulates α1-antitrypsin expression in chronic obstructive pulmonary disease" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Agnes Pap
PLOS Genetics

On behalf of: The PLOS Genetics Team
Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
plosgenetics@plos.org | +44 (0) 1223-442823