
Complex Selection on Human Polyadenylation Signals Revealed by Polymorphism and Divergence Data.

Yaroslav A Kainov, Vasily N Aushev, Sergey A Naumenko, Elena M Tchevkina, Georgii A Bazykin.

Abstract

Polyadenylation is a step of mRNA processing which is crucial for its expression and stability. The major polyadenylation signal (PAS) represents a nucleotide hexamer that adheres to the AATAAA consensus sequence. Over a half of human genes have multiple cleavage and polyadenylation sites, resulting in a great diversity of transcripts differing in function, stability, and translational activity. Here, we use available whole-genome human polymorphism data together with data on interspecies divergence to study the patterns of selection acting on PAS hexamers. Common variants of PAS hexamers are depleted of single nucleotide polymorphisms (SNPs), and SNPs within PAS hexamers have a reduced derived allele frequency (DAF) and increased conservation, indicating prevalent negative selection; at the same time, the SNPs that "improve" the PAS (i.e., those leading to higher cleavage efficiency) have increased DAF, compared to those that "impair" it. SNPs are rarer at PAS of "unique" polyadenylation sites (one site per gene); among alternative polyadenylation sites, at the distal PAS and at exonic PAS. Similar trends were observed in DAFs and divergence between species of placental mammals. Thus, selection permits PAS mutations mainly at redundant and/or weakly functional PAS. Nevertheless, a fraction of the SNPs at PAS hexamers likely affect gene functions; in particular, some of the observed SNPs are associated with disease.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords:  1000 genomes; AATAAA; SNP; mRNA processing; polyadenylation

Year:  2016        PMID: 27324920      PMCID: PMC4943204          DOI: 10.1093/gbe/evw137

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Polyadenylation is an essential step of mRNA processing in eukaryotes. It affects many aspects of mRNA physiology and plays an important role in its dynamics. Over 50% of human genes contain more than one potential site of cleavage and polyadenylation (Tian et al. 2005; Shepard et al. 2011). A process called alternative polyadenylation (APA) generates mRNA isoforms with different lengths of the 3′-untranslated region (3′-UTR) or even with truncated protein-coding regions (di Giammartino et al. 2011). The pattern of mRNA polyadenylation undergoes dramatic changes during cell differentiation, proliferation, and malignant transformation (Sandberg et al. 2008; Ji et al. 2009; Singh et al. 2009).

Polyadenylation includes two major steps: recognition of polyadenylation signals (PAS) leading to mRNA cleavage, and nontemplated addition of the poly(A) tail (Colgan and Manley 1997). It is a complex process regulated by a variety of trans-acting protein factors and cis-elements of mRNA. The mRNA 3′-processing complex contains up to 85 proteins (Shi et al. 2009), including CPSF (cleavage and polyadenylation specificity factor), a multisubunit complex which plays a crucial role in mRNA cleavage and polyadenylation. CPSF binds a specific common PAS, an AATAAA hexamer (or its close variant) usually located within 50 nucleotides upstream of the cleavage site (Chan et al. 2014). This PAS is present in almost 90% of mammalian mRNAs, and is the most common and best studied signal of polyadenylation (Proudfoot 1991; Beaudoing et al. 2000; Tian et al. 2005; Cheng et al. 2006).

Polyadenylation is tuned by natural selection. Cleavage sites and patterns of their usage are conserved across mammals (Ara et al. 2006; Lee et al. 2008). The regions of 3′-UTRs carrying PAS hexamers are depleted of single nucleotide polymorphisms (SNPs) (Castle 2011), and their deletion or mutation leads to a dramatic decrease in expression of target mRNAs due to changes in polyadenylation (Yang et al. 2009; Nunes et al. 2010) and/or transcription efficiency (Mapendano et al. 2010). Additionally, mutations that disrupt a PAS located near an alternative cleavage site, or that affect its strength, influence the site usage and might be clinically relevant (Thomas and Saetrom 2012). Therefore, PAS hexamers are expected to be strongly selected. These patterns of selection are informative of the functional significance of mutations and may help to improve clinical predictions of mutation effects (Adzhubei et al. 2010; Stecher et al. 2014). However, they have never been studied systematically.

Selection may be estimated from data on divergence with related species, or from within-species polymorphism. Divergence data provide more power, as SNP densities are still lower than densities of interspecies substitutions. On the other hand, polymorphism is immune to interspecies changes of fitness landscapes, for example, situations when a mutation deleterious in one species is harmless in another (Kondrashov, Sunyaev et al. 2002; Kern and Kondrashov 2004; Mustonen and Lassig 2009; Naumenko et al. 2012). The current avalanche of data on human population-level polymorphism allows measuring patterns of selection with unprecedented resolution.

From a single genome, selection favoring or disfavoring a signal may be inferred from its genomic over- or underrepresentation, respectively. From polymorphism data, selection may be inferred from densities of SNPs or from allele frequencies at those SNPs. The dependence of the overall level of polymorphism on selection may be complex even in the simplest single-locus case (McVean and Charlesworth 2000; Kondrashov et al. 2006; Schmidt et al. 2008), and is further complicated by linkage between sites (Genomes Project et al. 2010; Wilkening et al. 2013; You et al. 2015). However, the situation is simplified when alleles may be a priori subdivided into preferred and unpreferred. Negative selection may then be inferred from underrepresentation of SNPs, or from reduced allele frequencies of such SNPs, at sites occupied by the favored allele, compared to a neutral control. Conversely, positive selection is manifested as an excess of SNPs, and increased allele frequencies of such SNPs, at sites of the disfavored allele, compared to a neutral control (although it simultaneously purges variation at linked sites).

Here, we use the data on divergence between species of placental mammals and human polymorphism data of the 1000 Genomes Project (Genomes Project et al. 2010) to comprehensively analyze the patterns of selection acting on PAS hexamers.

Materials and Methods

Source Data Sets

Lists of cleavage site positions (polyAsite.db2), positions of PAS hexamers (PAS.db2), and gene identifiers (gene.db2) were obtained from the PolyA.db2 database (Lee et al. 2007) (http://polya.umdnj.edu/polya_db/v2/download/, last accessed June 14, 2016). Genomic coordinates were converted from hg17 to hg19 using the liftOver tool from UCSC (https://genome.ucsc.edu/cgi-bin/hgLiftOver, last accessed June 14, 2016). Polymorphism data, including allele frequencies, from the Interim Phase 1 of the 1000 Genomes Project (Genomes Project et al. 2012) were downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ (last accessed June 14, 2016). This data set comprises the genotypes of 1,094 individuals, and includes a total of 37,852,169 autosomal SNPs. Only true SNPs, that is, those where the reference and alternative alleles differed in a single-nucleotide mismatch, were included in the analysis. The PhastCons score (Siepel et al. 2005) data set for placental mammals was downloaded from the UCSC server on February 2, 2016.

OMIM data (Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University [Baltimore, MD]) were downloaded on October 15, 2014 from the omim.org FTP server as a plain text file. SNP identifiers were extracted from the text and queried for the intersection with the polymorphisms we found in PAS hexamers. ClinVar data were downloaded from the ClinVar catalogue FTP server (ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/tab_delimited/, last accessed June 14, 2016) (Landrum et al. 2014) on May 26, 2014. GWAS data were downloaded from the NHGRI GWAS site (www.genome.gov/gwastudies, last accessed June 14, 2016) (Welter et al. 2014) on June 18, 2014. We analyzed the intersections between the dbSNP rs identifiers from the GWAS and ClinVar databases and the polymorphisms observed in PAS hexamers.
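The "true SNP" filter described above can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names are hypothetical, the input is assumed to be tab-delimited VCF text (CHROM, POS, ID, REF, ALT, ...), and dropping records with several ALT alleles is an assumption of this sketch rather than something the text states.

```python
BASES = {"A", "C", "G", "T"}


def is_true_snp(ref: str, alt: str) -> bool:
    """Return True when REF and ALT are single, different nucleotides."""
    return ref in BASES and alt in BASES and ref != alt


def filter_true_snps(vcf_lines):
    """Yield (chrom, pos, ref, alt) for single-nucleotide records only.

    Hypothetical helper: header lines are skipped, and indels or
    records with several ALT alleles (e.g. "A,T") are dropped.
    """
    for line in vcf_lines:
        if line.startswith("#"):  # VCF meta-information and column header
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if is_true_snp(ref, alt):
            yield chrom, pos, ref, alt
```

Passing an open VCF file handle (or any iterable of lines) to `filter_true_snps` yields one tuple per retained SNP.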

Retrieving Sequences of PAS Hexamers and Control Hexamers

Sequences of PAS hexamers were retrieved from the reference human genome (assembly hg19, GRCh37) according to the positions indicated in the PAS.db2 database. A small fraction (<1%) of PAS hexamers that did not match the reference human genome was discarded. Each nucleotide observed at a SNP position was categorized as ancestral or derived, using the ancestral human genome sequence retrieved from Ensembl FTP server, and derived allele frequency (DAF) was measured as the fraction of genotypes carrying the derived allele. As a control, we selected hexamers located in the same 3′-UTR regions but on the opposite (noncoding) DNA strand, and not observed as a PAS in the PAS.db2 database. SNP density was defined as the ratio of the number of SNPs within hexamers to the total length of hexamers. The mean phastCons score for each PAS was extracted from the phastCons data set for placental mammals. Each SNP was characterized as “improvement” if the derived hexamer ranked higher than the ancestral hexamer in the list of 13 hexamers sorted by genomic frequency; as “impairment” if it ranked lower than the ancestral one; and as “disruption” if it did not belong to this list. When a PAS hexamer could not be annotated unambiguously, it was excluded from the corresponding comparison. The final set of characterized PAS hexamers is presented as supplementary table S3, Supplementary Material online.
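The per-SNP annotations described above (polarizing allele frequencies into DAF, and labeling the effect of a SNP on the hexamer) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical; DAF is approximated here from an ALT-allele frequency; and `ILLUSTRATIVE_RANKING` is a truncated placeholder. The source places AATAAA and ATTAAA first, but the positions of the other entries are assumed, and the full list of 13 hexamers is in supplementary table S1. The "de novo formation" label is taken from Table 1, not from the three categories defined in this section.

```python
# Placeholder ranking (best first). Only the first two positions are stated
# in the text; the rest is an assumption made for this sketch.
ILLUSTRATIVE_RANKING = ["AATAAA", "ATTAAA", "AATACA", "AATGAA"]


def derived_allele_frequency(ref, alt, alt_freq, ancestral):
    """Polarize an ALT-allele frequency into a derived-allele frequency.

    Returns None when the ancestral base matches neither allele.
    """
    ancestral = ancestral.upper()  # compare case-insensitively
    if ancestral == ref:
        return alt_freq            # ALT is the derived allele
    if ancestral == alt:
        return 1.0 - alt_freq      # REF is the derived allele
    return None


def classify_snp(ancestral_hexamer, derived_hexamer, ranked_pas):
    """Label a SNP by its effect on the PAS hexamer."""
    if ancestral_hexamer == derived_hexamer:
        raise ValueError("ancestral and derived hexamers must differ")
    rank = {hexamer: i for i, hexamer in enumerate(ranked_pas)}
    if ancestral_hexamer not in rank:
        # Outside the three categories above; Table 1 calls a gain of a
        # PAS from a non-PAS hexamer "de novo formation".
        return "de novo formation" if derived_hexamer in rank else "not a PAS"
    if derived_hexamer not in rank:
        return "disruption"
    if rank[derived_hexamer] < rank[ancestral_hexamer]:
        return "improvement"       # derived hexamer ranks higher
    return "impairment"            # derived hexamer ranks lower
```

For example, `classify_snp("AATAAA", "AACAAA", ILLUSTRATIVE_RANKING)` gives "disruption", matching the label Table 1 assigns to rs986475.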

Statistical Analysis

SNP densities and DAFs were compared using the two-tailed Fisher’s exact test and the two-tailed Mann–Whitney U-test, respectively. In comparisons of functional groups, the considered group was compared to the remainder of the sample. Statistical significance was defined as P < 0.05. All statistical tests were performed in R. Plots were created with the ggplot2 R package.
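The analysis itself was run in R; the following standard-library Python sketch only illustrates the SNP-density comparison. It assumes the 2×2 table is built from polymorphic versus nonpolymorphic hexamer positions in the PAS and control samples, which is one plausible reading of the text. The function names are hypothetical, and the two-sided P value follows the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb


def snp_density(n_snps, n_hexamers, hexamer_length=6):
    """SNPs per nucleotide: number of SNPs over total hexamer length."""
    return n_snps / (n_hexamers * hexamer_length)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Toy counts, purely hypothetical: SNP positions vs. SNP-free positions.
pas_snps, pas_hexamers = 20, 1000
ctrl_snps, ctrl_hexamers = 35, 1000
p_value = fisher_exact_two_sided(
    pas_snps, 6 * pas_hexamers - pas_snps,
    ctrl_snps, 6 * ctrl_hexamers - ctrl_snps,
)
```

DAFs would be compared separately with a Mann–Whitney U-test, which this sketch does not implement.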

Results

PAS hexamers typically have one of 13 nucleotide sequences. The ranking of these PAS hexamers according to their frequencies in the genome is consistent with their ranking according to their efficiency in stimulation of cleavage and polyadenylation (supplementary table S1, Supplementary Material online). In particular, the first two of these sequences, AATAAA and ATTAAA, are by far the most frequent, together comprising 55.4% of all hexamers in the human genome; and their efficiency (Sheets et al. 1990) substantially exceeds that of all lower-ranking hexamers (supplementary table S1, Supplementary Material online).

To measure selection, we analyze two aspects of polymorphism, SNP densities and within-population frequencies of nonancestral nucleotides (DAFs), as well as interspecies conservation. These three measurements provide complementary estimates of selection. As multiallelic SNPs are rare in this data set (Genomes Project et al. 2010), DAFs are expected to depend only on the strength of selection at the corresponding sites. However, SNP densities and the rate of sequence divergence also depend on the mutation rate. To study selection, we therefore compare the properties of PAS hexamers to those of control hexamers. Control hexamers were chosen so that they have the same nucleotide sequence, and are located in similar regions of 3′-UTRs, but are positioned on the opposite strand, and therefore cannot be functional PAS hexamers (see Materials and Methods). This approach controls for most sources of local as well as global nonuniformity of the mutation rates.

Among the 55,856 investigated PAS hexamers (an average of 3.1 hexamers per gene), 2,066 (3.7%) were polymorphic according to the 1000 Genomes Project database (supplementary table S2, Supplementary Material online), and only 47 (2.3%) of them carried more than one SNP. As a whole, PAS hexamers were not depleted of SNPs, compared to the control sample (P = 0.77, two-tailed Fisher’s exact test, fig. 1), although the density of SNPs was reduced in hexamers AATAAA and ATTAAA (fig. 3), and also in those PAS hexamers that were more likely to be functional (see below). However, DAFs of the observed SNPs were reduced at PAS hexamers compared to the control (fig. 1), while interspecies conservation was higher than in the control (fig. 1), indicating negative selection against such mutations.
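The opposite-strand control described above can be sketched in Python. A hexamer lies on the opposite strand wherever its reverse complement occurs on the sense strand, so the search reduces to a window scan. The function names and the `annotated_starts` argument (start positions to exclude because they are annotated as PAS in PAS.db2) are hypothetical; this is a minimal illustration, not the authors' implementation.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_control_hexamers(utr_seq, pas_hexamers, annotated_starts=frozenset()):
    """Return (start, hexamer) pairs for PAS-like hexamers on the opposite strand.

    utr_seq is the sense-strand 3'-UTR sequence; start is the 0-based
    sense-strand coordinate of the matching 6-nucleotide window.
    """
    # Map each sense-strand pattern to the hexamer it encodes on the other strand.
    targets = {reverse_complement(h): h for h in pas_hexamers}
    utr_seq = utr_seq.upper()
    hits = []
    for start in range(len(utr_seq) - 5):
        window = utr_seq[start:start + 6]
        if window in targets and start not in annotated_starts:
            hits.append((start, targets[window]))
    return hits
```

For instance, the sense-strand sequence "GGTTTATTCC" contains "TTTATT" at position 2, which reads as AATAAA on the opposite strand.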
Fig. 1.

Patterns of selection in PAS hexamers. Whiskers represent standard errors of the mean. Asterisks correspond to P < 0.05 according to the Mann–Whitney U-test. (A) Densities of SNPs in PAS hexamers and in the control sample. (B) Mean DAF of SNPs is reduced in PAS hexamers, compared to the control sample. (C) Mean phastCons score is increased in PAS hexamers, compared to the control sample. (D) DAF depends on the effect of SNPs on the functional activity of the PAS hexamer (box width represents the number of SNPs in the category).

Fig. 3.

Polymorphism in different functional groups of PAS hexamers. (A) SNP densities; (B) DAFs; (C) PhastCons scores. In A and B, box width represents the number of SNPs in the group. Dashed lines represent the mean value in the entire sample, and dotted lines, its standard error. Whiskers represent standard error of the mean. Asterisks identify difference of the particular group from the remaining PAS hexamers, according to Fisher’s exact test or Mann–Whitney U-test; *P < 0.05, **P < 10^-3, ***P < 10^-10.

Knowledge of the relative strength of different PAS hexamers allowed us to predict the effect of mutations on their efficiency. We categorized SNPs at PAS hexamers as “disrupting” if the derived hexamer was not one of the 13 legitimate PAS hexamer sequences. The remaining SNPs were categorized as “impairing” if they reduced the rank of the hexamer, or “improving” if they increased it. Overall, we did not observe a measurable enrichment or depletion for any of these three classes of SNPs, compared to the control (supplementary table S2, Supplementary Material online); however, these SNPs differed in their DAFs. The disrupting SNPs segregated at lower DAFs than the control (fig. 1), indicating negative selection against them. For impairing or improving SNPs, the difference in DAFs from the control was not significant, although on average the impairing SNPs had somewhat lower DAFs, while improving SNPs had higher DAFs than expected. The impairing and disrupting SNPs had significantly lower DAFs than improving SNPs in the PAS hexamers, but not in the control hexamers, indicating negative selection against disrupting and impairing mutations, and/or positive selection in favor of improving mutations in PAS hexamers (fig. 1). Frequency spectra for the examined SNPs are presented in the supplementary materials (supplementary figs. S1–S3, Supplementary Material online).

Next, we stratified the PAS hexamers according to several characteristics, and analyzed the differences in SNP densities, allele frequencies, and interspecies conservation between the categories (fig. 2). A few patterns emerged with regard to the densities of SNPs in different classes (fig. 3). First, each gene can have either one polyadenylation-associated cleavage site or multiple alternative sites (fig. 2). SNP density was the lowest at the PAS hexamers corresponding to the only cleavage site in a gene (“unique”), and was higher if multiple cleavage sites were present. When two cleavage sites were present, the PAS hexamer corresponding to the cleavage site distal from the promoter (i.e., located at the 3′-end of the longest mRNA isoform) had lower SNP densities. Second, several PAS hexamers may be associated with a single cleavage site (fig. 2). At such sites, SNP densities were higher, compared to cleavage sites with unique PAS hexamers. Third, a fraction of PAS hexamers was located in an intron; such hexamers usually corresponded to alternative cleavage sites (fig. 2). They had higher SNP densities than the exonic hexamers. Fourth, a fraction of PAS hexamers fell between the start and the stop codon; such hexamers are, of course, always alternative, and are usually intronic (fig. 2). As expected, these hexamers were enriched in SNPs, compared with the PAS hexamers located within the 3′-UTRs.

Interspecies conservation data demonstrated similar trends; specifically, the mean phastCons score was significantly higher for the “strongest” AATAAA hexamer and for PAS hexamers located in exons and 3′-UTRs (fig. 3). While we saw no significant differences in DAFs between the investigated categories (fig. 3), the overall patterns were roughly coincident with those observed for SNP densities and phastCons scores.
Fig. 2.

Schematic representation of functional classification of PAS hexamers. PAS hexamers are categorized according to the number and position of corresponding cleavage sites (A); number of PAS hexamers corresponding to a single cleavage site (B); localization within exon or intron (C); or localization within CDS or 3′-UTR (D). Gray boxes, coding exons; thick lines, 3′-UTR exons; angled lines, introns; arrows, cleavage sites.

To elucidate the potential association of mutations at PAS hexamers with human diseases, we screened the OMIM (Online Mendelian Inheritance in Man), ClinVar (Clinical Variations), and NHGRI GWAS (The National Human Genome Research Institute Genome-Wide Association Studies) databases for PAS-affecting SNPs. We found five SNPs located within the PAS hexamers of five genes (table 1). Somewhat unexpectedly, all the observed mutations had rather high DAFs (> 1%) in the 1000 Genomes data set. Moreover, only two of the SNPs (rs78378222 and rs986475) affected PAS hexamers corresponding to unique polyadenylation sites (one per gene), whereas the other three SNPs affected the signals near alternative sites, two proximal and one distal.

The observed SNPs also differed in their effect on polyadenylation site activity. While rs78378222 and rs986475 impair or disrupt the PAS, leading to a decrease of transcript polyadenylation and protein production (Delahaye et al. 2011; Stacey et al. 2011), rs10954213 improves the proximal PAS, resulting in an increased rate of formation of the shorter 3′-UTR isoform and higher mRNA stability and protein expression (Graham et al. 2007); and rs884205 represents an example of de novo formation of an alternative PAS from the ancestral CTTAAA hexamer, which is not a PAS hexamer.
Table 1

Characteristics of PAS Hexamers Carrying SNPs Associated with Human Pathologies

| dbSNP ID | Gene Name | Normal PAS | Risk-Associated PAS | Frequency of Risk-Associated Allele | Type of PAS Cleavage Site | Effect of SNP on PAS | Phenotype | Database |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| rs78378222 | TP53 | AATAAA | AATACA | 0.01 | Unique | Impairment | Basal cell carcinoma | GWAS, ClinVar, OMIM |
| rs12721054 | APOC1 | AATAAA | AATGAA | 0.03 | Proximal | Impairment | High blood triglycerides | GWAS |
| rs10954213 | IRF5 | AATGAA | AATAAA | 0.53 | Proximal | Improvement | Systemic lupus erythematosus | ClinVar, OMIM |
| rs986475 | NCR3 | AATAAA | AACAAA | 0.1 | Unique | Disruption | Gastrointestinal stromal tumors | OMIM |
| rs884205 | TNFRSF11A | CTTAAA | ATTAAA | 0.19 | Distal | De novo formation | Abnormal bone mineral density | GWAS |

Discussion

In this work, we use the rich whole-genome data set on human polymorphism, the 1000 Genomes data set, and interspecies conservation to study the patterns of selection at PAS hexamers. We find that, overall, DAFs at PAS hexamers are lower, and conservation is higher, than in the control sequences, implying negative selection against mutations at PAS hexamers. We see no corresponding reduction in SNP densities, although SNP densities are reduced in the two strongest hexamers AATAAA and ATTAAA, which together comprise over a half of the sample. This suggests that the typical selection at less strict PAS hexamers has a moderate strength, so that, although it is capable of reducing the frequency of the inferior allele and preventing its fixation between species, it is seldom able to eliminate it completely.

However, the overall genome-wide patterns give only a crude understanding of selection. Categorization of alleles by their effect on the PAS hexamer efficiency reveals a more complicated picture. As expected, the DAFs of disrupting SNPs were substantially reduced, indicative of substantial negative selection against such mutations. The DAFs of the impairing SNPs were also reduced, compared with the control; in contrast, the DAFs of the improving SNPs were increased (fig. 1). Although the difference of the impairing and improving SNPs from the control was not statistically significant, they significantly differed from each other: the DAFs of impairing SNPs were significantly lower than those of improving SNPs. This implies that the impairing SNPs are negatively selected, that the improving SNPs are positively selected, or possibly both.

In the mutation-selection-drift balance, a continuous influx of deleterious mutations counteracted by selection against them leads to maintenance of an equilibrium concentration of suboptimal alleles in the genome. Under very weak selection, this may lead to alternating fixations of optimal and suboptimal alleles at a locus (Ohta’s turnover) (Kimura and Ohta 1971; Ohta 1992; Charlesworth and Eyre-Walker 2007; Denisov et al. 2014). Here, we observe a different manifestation of the same phenomenon: a downward or upward bias in the mean frequency of the derived allele caused by negative and positive selection, respectively.

The strength and/or ubiquity of selection depend on the location of the PAS hexamer. It appears to be primarily determined by whether a mutation within a PAS hexamer may be circumvented by exploiting an alternative PAS hexamer. Mutations at unique PAS hexamers associated with unique cleavage sites are under the strongest selection, while the presence of an alternative PAS hexamer and/or cleavage site relaxes selection against the mutations. Specifically, the presence of another PAS hexamer reduced the action of selection both on the distal and the proximal (P < 0.0001, Fisher’s test) PAS hexamer, compared to the unique ones. Overall, genomic redundancy tends to be associated with reduced selection against mutations within each functional element; for example, alternative splice sites and duplicate genes are under weaker selection than constitutive sites (Kurmangaliyev et al. 2013) and single-copy genes (Force et al. 1999; Kondrashov, Rogozin et al. 2002), respectively.

The distal PAS hexamers are under stronger selection than the proximal ones. The proximal hexamers tend to be further from the consensus sequence than the distal hexamers (Tian et al. 2005). This difference may facilitate proximal-to-distal cleavage site usage switching that occurs during a wide range of normal and pathological processes (Tian et al. 2005; di Giammartino et al. 2011). Thus, the weaker match of the proximal PAS hexamers to the consensus sequence could reduce the effect of the SNPs, compared with the distal PAS hexamers. Additionally, activity of proximal sites could be regulated by other polyadenylation factors (in particular, CSTF proteins) that interact with the downstream GU-rich sequence (Takagaki and Manley 1998; Nunes et al. 2010; Yao et al. 2012), making the nucleotide sequence of PAS hexamers less critical.

The exonic sites are under stronger selection than intronic sites. Also, the PAS hexamers located within the coding regions (which are typically also intronic) are under weaker selection than the 3′-UTR PAS hexamers (which tend to be exonic). This is intuitive, as the intronic sites are commonly alternative, whereas many of the exonic sites are constitutive. Additionally, intronic polyadenylation sites function in a tight cross-talk with splicing, and their usage could be regulated more or less independently of the binding of “classical” polyadenylation factors to the PAS hexamer. Thus, the expression level of splicing factors, which interact with specific signals independently of or even in competition with polyadenylation factors, and the strength of the 5′-splicing sites play an important role in regulation of the activity of intronic polyadenylation sites (Castelo-Branco et al. 2004; Kaida et al. 2010).

Strikingly, almost all the observed differences between the functional groups of PAS hexamers coincide well with the trends of phylogenetic conservation of cleavage sites (Lee et al. 2008), supporting the key role of PAS hexamers in regulation of cleavage and polyadenylation.

Some of the SNPs that affect PAS hexamers might be associated with pathology. Interestingly, a fraction of these “pathological” SNPs affected alternative sites, suggesting that APA is important for physiological gene expression and function. Other germline and somatic mutations that affect alternative PAS hexamers have been previously described as implicated in pathogenesis of human type 1 diabetes, IPEX (immune dysfunction, polyendocrinopathy, enteropathy, X-linked), panic disorder (Bennett et al. 2001; Shin et al. 2007; Gyawali et al. 2010), and tumorigenesis (Wiestner et al. 2007).

In conclusion, the patterns of polymorphism within PAS hexamers reveal weak selection acting at these sites. This selection appears to be primarily determined by the direction and extent of the effect of the corresponding mutation on polyadenylation. While the disrupting SNPs tend to be negatively selected, we find evidence of positive selection favoring the mutations that make the hexamers more similar to the consensus sequence, indicating that the nonconsensus sequences are mostly suboptimal. While SNPs are rare at those hexamers where they substantially disrupt the function, polymorphism within many of the rarely used hexamers appears to be nearly neutral. Pathogenic mutations may affect polyadenylation via a broad range of mechanisms, including disruption of existing (constitutive or alternative) sites, improvement of an existing site, or even creation of a spurious site. Furthermore, the link between the changes in polyadenylation and the changes in expression is frequently nonlinear (Spies et al. 2013; Gupta et al. 2014). Annotation of possible effects of SNPs on polyadenylation should be included in any prediction of effects of both somatic and germline mutations; however, the complications listed above make such predictions inherently difficult.
References (10 of 55 shown)

1.  Patterns of variant polyadenylation signal usage in human genes.

Authors:  E Beaudoing; S Freier; J R Wyatt; J M Claverie; D Gautheret
Journal:  Genome Res       Date:  2000-07       Impact factor: 9.043

2.  Preservation of duplicate genes by complementary, degenerative mutations.

Authors:  A Force; M Lynch; F B Pickett; A Amores; Y L Yan; J Postlethwait
Journal:  Genetics       Date:  1999-04       Impact factor: 4.562

3.  Molecular architecture of the human pre-mRNA 3' processing complex.

Authors:  Yongsheng Shi; Dafne Campigli Di Giammartino; Derek Taylor; Ali Sarkeshik; William J Rice; John R Yates; Joachim Frank; James L Manley
Journal:  Mol Cell       Date:  2009-02-13       Impact factor: 17.970

4.  Proliferating cells express mRNAs with shortened 3' untranslated regions and fewer microRNA target sites.

Authors:  Rickard Sandberg; Joel R Neilson; Arup Sarma; Phillip A Sharp; Christopher B Burge
Journal:  Science       Date:  2008-06-20       Impact factor: 47.728

5.  Functional implications of splicing polymorphisms in the human genome.

Authors:  Yerbol Z Kurmangaliyev; Roman A Sutormin; Sergey A Naumenko; Georgii A Bazykin; Mikhail S Gelfand
Journal:  Hum Mol Genet       Date:  2013-05-02       Impact factor: 6.150

6.  Association of a polyadenylation polymorphism in the serotonin transporter and panic disorder.

Authors:  Sandeep Gyawali; Ryan Subaran; Myrna M Weissman; Dylan Hershkowitz; Morgan C McKenna; Ardesheer Talati; Abby J Fyer; Priya Wickramaratne; Phillip B Adams; Susan E Hodge; Carl J Schmidt; Michael J Bannon; Charles E Glatt
Journal:  Biol Psychiatry       Date:  2009-12-06       Impact factor: 13.382

7.  PolyA_DB 2: mRNA polyadenylation sites in vertebrate genes.

Authors:  Ju Youn Lee; Ijen Yeh; Ji Yeon Park; Bin Tian
Journal:  Nucleic Acids Res       Date:  2007-01       Impact factor: 16.971

8.  A large-scale analysis of mRNA polyadenylation of human and mouse genes.

Authors:  Bin Tian; Jun Hu; Haibo Zhang; Carol S Lutz
Journal:  Nucleic Acids Res       Date:  2005-01-12       Impact factor: 16.971

9.  ClinVar: public archive of relationships among sequence variation and human phenotype.

Authors:  Melissa J Landrum; Jennifer M Lee; George R Riley; Wonhee Jang; Wendy S Rubinstein; Deanna M Church; Donna R Maglott
Journal:  Nucleic Acids Res       Date:  2013-11-14       Impact factor: 16.971

10.  CPSF30 and Wdr33 directly bind to AAUAAA in mammalian mRNA 3' processing.

Authors:  Serena L Chan; Ina Huppertz; Chengguo Yao; Lingjie Weng; James J Moresco; John R Yates; Jernej Ule; James L Manley; Yongsheng Shi
Journal:  Genes Dev       Date:  2014-10-09       Impact factor: 11.361
