-
Bioinformatics (Oxford, England) Nov 2019
MOTIVATION
Alternative polyadenylation (polyA) sites near the 3' end of a pre-mRNA create multiple mRNA transcripts with different 3' untranslated regions (3' UTRs). The sequence elements of a 3' UTR are essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding and translation efficiency. Moreover, numerous studies in the literature have reported the correlation between diseases and the shortening (or lengthening) of 3' UTRs. As alternative polyA sites are common in mammalian genes, several machine learning tools have been published for predicting polyA sites from sequence data. These tools either consider limited sequence features or use relatively old algorithms for polyA site prediction. Moreover, none of the previous tools consider RNA secondary structures as a feature to predict polyA sites.
RESULTS
In this paper, we propose a new deep learning model, called DeepPASTA, for predicting polyA sites from both sequence and RNA secondary structure data. The model is then extended to predict tissue-specific polyA sites. Moreover, the tool can predict the most dominant (i.e. frequently used) polyA site of a gene in a specific tissue and relative dominance when two polyA sites of the same gene are given. Our extensive experiments demonstrate that DeepPASTA significantly outperforms the existing tools for polyA site prediction and tissue-specific relative and absolute dominant polyA site prediction.
AVAILABILITY AND IMPLEMENTATION
https://github.com/arefeen/DeepPASTA.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: 3' Untranslated Regions; Animals; Neural Networks, Computer; Poly A; Polyadenylation; RNA, Messenger
PubMed: 31081512
DOI: 10.1093/bioinformatics/btz283 -
Methods in Molecular Biology (Clifton,... 2014
Polyadenylation is a posttranscriptional modification present throughout all the kingdoms of life with important roles in regulation of RNA stability, translation, and quality control. Functions of polyadenylation in prokaryotic and organellar RNA metabolism are still not fully characterized, and poly(A) tails appear to play contrasting roles in different systems. Here we present a general overview of the polyadenylation process and the factors involved in its regulation, with an emphasis on the diverse functions of 3' end modification in the control of gene expression in different biological systems.
Topics: Bacteria; Organelles; Polyadenylation
PubMed: 24590792
DOI: 10.1007/978-1-62703-971-0_18 -
RNA (New York, N.Y.) Nov 2022
The polyadenylation signal (PAS) is a key sequence element for 3'-end cleavage and polyadenylation of messenger RNA precursors (pre-mRNAs). This hexanucleotide motif is recognized by the mammalian polyadenylation specificity factor (mPSF), consisting of CPSF160, WDR33, CPSF30, and Fip1 subunits. Recent studies have revealed how the AAUAAA PAS, the most frequently observed PAS, is recognized by mPSF. We report here the structure of human mPSF in complex with the AUUAAA PAS, the second most frequently identified PAS. Conformational differences are observed for the A1 and U2 nucleotides in AUUAAA compared to the A1 and A2 nucleotides in AAUAAA, while the binding modes of the remaining 4 nt are essentially identical. The 5' phosphate of U2 moves by 2.6 Å and the U2 base is placed near the six-membered ring of A2 in AAUAAA, where it makes two hydrogen bonds with zinc finger 2 (ZF2) of CPSF30, which undergoes conformational changes as well. We also attempted to determine the binding modes of two rare PAS hexamers, AAGAAA and GAUAAA, but did not observe the RNA in the cryo-electron microscopy density. The residues in CPSF30 (ZF2 and ZF3) and WDR33 that recognize PAS are disordered in these two structures.
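As a rough illustration of the hexamers discussed above, the following sketch scans an RNA sequence for the four PAS variants named in the abstract. The function name, the frequency labels, and the example sequence are hypothetical and chosen only for illustration; the abstract describes structural work, not a scanning tool.

```python
import re

# Hexamers named in the abstract; the labels are illustrative shorthand.
PAS_HEXAMERS = {
    "AAUAAA": "most frequent",
    "AUUAAA": "second most frequent",
    "AAGAAA": "rare",
    "GAUAAA": "rare",
}

def find_pas_hexamers(rna):
    """Return (0-based position, hexamer, label) for each PAS hexamer found."""
    # Accept DNA-style input by converting T to U.
    seq = rna.upper().replace("T", "U")
    # A lookahead lets overlapping hexamers be reported as well.
    pattern = re.compile("(?=(" + "|".join(PAS_HEXAMERS) + "))")
    return [
        (m.start(), m.group(1), PAS_HEXAMERS[m.group(1)])
        for m in pattern.finditer(seq)
    ]

# Hypothetical example sequence containing one AAUAAA and one AUUAAA.
hits = find_pas_hexamers("GGCAAUAAAGCCAUUAAAGG")
```

A hexamer match only marks a candidate signal; the abstract itself reports that the rare hexamers were not observed bound in the cryo-EM density.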
Topics: Animals; Humans; Polyadenylation; mRNA Cleavage and Polyadenylation Factors; Cleavage And Polyadenylation Specificity Factor; Cryoelectron Microscopy; RNA, Messenger; Protein Binding; RNA Precursors; Mammals; Nucleotides; Poly A
PubMed: 36130077
DOI: 10.1261/rna.079322.122 -
eLife Aug 2020
Yeast cells undergoing the diauxic response show a striking upstream shift in poly(A) site utilization, with increased use of ORF-proximal poly(A) sites resulting in shorter 3' mRNA isoforms for most genes. This altered poly(A) pattern is extremely similar to that observed in cells containing Pol II derivatives with slow elongation rates. Conversely, cells containing derivatives with fast elongation rates show a subtle downstream shift in poly(A) sites. Polyadenylation patterns of many genes are sensitive to both fast and slow elongation rates, and a global shift of poly(A) utilization is strongly linked to increased purine content of sequences flanking poly(A) sites. Pol II processivity is impaired in diauxic cells, but strains with reduced processivity and normal Pol II elongation rates have normal polyadenylation profiles. Thus, Pol II elongation speed is important for poly(A) site selection and for regulating poly(A) patterns in response to environmental conditions.
Topics: Poly A; Polyadenylation; RNA Polymerase II; Saccharomyces cerevisiae; Saccharomyces cerevisiae Proteins; Transcription Elongation, Genetic
PubMed: 32845240
DOI: 10.7554/eLife.59810 -
Analytical and Bioanalytical Chemistry Feb 2018
The 3'-polyadenosine (poly A) tail of in vitro transcribed (IVT) mRNA was studied using liquid chromatography coupled to mass spectrometry (LC-MS). Poly A tails were cleaved from the mRNA using ribonuclease T1 followed by isolation with dT magnetic beads. Extracted tails were then analyzed by LC-MS, which provided tail length information at single-nucleotide resolution. A 2100-nt mRNA with plasmid-encoded poly A tail lengths of either 27, 64, 100, or 117 nucleotides was used for these studies, as enzymatically added poly A tails showed significant length heterogeneity. The number of As observed in the tails closely matched Sanger sequencing results of the DNA template, and even minor plasmid populations with sequence variations were detected. When the plasmid sequence contained a discrete number of poly As in the tail, analysis revealed a distribution that included tails longer than the encoded tail lengths. These observations were consistent with transcriptional slippage of T7 RNAP taking place within a poly A sequence. The type of RNAP did not alter the observed tail distribution, and comparison of T3, T7, and SP6 showed all three RNAPs produced equivalent tail length distributions. The addition of a sequence at the 3' end of the poly A tail did, however, produce narrower tail length distributions, which supports a previously described model of slippage where the 3' end can be locked in place by having a G or C after the poly nucleotide region.
Graphical abstract: Determination of mRNA poly A tail length using magnetic beads and LC-MS.
Topics: Animals; Cell Line; Chromatography, High Pressure Liquid; Mass Spectrometry; Mice; Plasmids; Poly A; Polyadenylation; RNA, Messenger; Transcription, Genetic
PubMed: 29313076
DOI: 10.1007/s00216-017-0840-6 -
Journal of Virology Feb 2017
Alternative processing of human bocavirus (HBoV) P5 promoter-transcribed RNA is critical for generating the structural and nonstructural protein-encoding mRNA transcripts. The regulatory mechanism by which HBoV RNA transcripts are polyadenylated at proximal [(pA)p] or distal [(pA)d] polyadenylation sites is still unclear. We constructed a recombinant HBoV infectious clone to study the alternative polyadenylation regulation of HBoV. Surprisingly, in addition to the reported distal polyadenylation site, (pA)d, a novel distal polyadenylation site, (pA)d2, which is located in the right-end hairpin (REH), was identified during infectious clone transfection or recombinant virus infection. (pA)d2 does not contain a typical hexanucleotide polyadenylation signal, upstream elements (USE), or downstream elements (DSE) according to sequence analysis. Further study showed that HBoV nonstructural protein NS1, REH, and cis elements of (pA)d were necessary and sufficient for efficient polyadenylation at (pA)d2. The distance and sequences between (pA)d and (pA)d2 also played a key role in the regulation of polyadenylation at (pA)d2. Finally, we demonstrated that efficient polyadenylation at (pA)d2 resulted in increased HBoV capsid mRNA transcripts and protein translation. Thus, our study revealed that all the bocaviruses have distal poly(A) signals on the right-end palindromic terminus, and alternative polyadenylation at the HBoV 3' end regulates its capsid expression.
IMPORTANCE
The distal polyadenylation site, (pA)d, of HBoV is located about 400 nucleotides (nt) from the right-end palindromic terminus, which is different from those of bovine parvovirus (BPV) and canine minute virus (MVC) in the same genus whose distal polyadenylation is located in the right-end stem-loop structure. A novel polyadenylation site, (pA)d2, was identified in the right-end hairpin of HBoV during infectious clone transfection or recombinant virus infection. Sequence analysis showed that (pA)d2 does not contain typical polyadenylation signals, and the last 42 nt form a stem-loop structure which is almost identical to that of MVC. Further study showed that NS1, REH, and cis elements of (pA)d are required for efficient polyadenylation at (pA)d2. Polyadenylation at (pA)d2 enhances capsid expression. Our study demonstrates alternative polyadenylation at the 3' end of HBoV and suggests an additional mechanism by which capsid expression is regulated.
Topics: Alternative Splicing; Base Sequence; Capsid Proteins; Cell Line; Gene Expression Regulation, Viral; Human bocavirus; Humans; Mutation; Poly A; Polyadenylation; RNA, Messenger; Regulatory Sequences, Nucleic Acid; Terminal Repeat Sequences; Transcription, Genetic
PubMed: 27881651
DOI: 10.1128/JVI.02026-16 -
RNA (New York, N.Y.) Dec 2019
Most eukaryotic messenger RNA precursors must undergo 3'-end cleavage and polyadenylation for maturation. We and others recently reported the structure of the AAUAAA polyadenylation signal (PAS) in complex with the protein factors CPSF-30, WDR33, and CPSF-160, revealing the molecular mechanism for this recognition. Here we have characterized in detail the interactions between the PAS RNA and the protein factors using fluorescence polarization experiments. Our studies show that AAUAAA is recognized with ∼3 nM affinity by the CPSF-160-WDR33-CPSF-30 ternary complex. Variations in the RNA sequence can greatly reduce the affinity. Similarly, mutations of CPSF-30 residues that have van der Waals interactions with the bases of AAUAAA also lead to substantial reductions in affinity. Finally, our studies confirm that both CPSF-30 and WDR33 are required for high-affinity binding of the PAS RNA, while these two proteins alone and their binary complexes with CPSF-160 have much lower affinity for the RNA.
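To give a feel for what a ∼3 nM affinity means, the sketch below computes the fraction of RNA bound under a simple one-site equilibrium model. The model, the function name, and the assumption that free protein is approximately equal to total protein are simplifications introduced here; they are not the fitting procedure used in the study.

```python
def fraction_bound(protein_nM, kd_nM=3.0):
    """Fraction of RNA bound in a one-site model: [P] / (Kd + [P]).

    Assumes free protein concentration is close to total protein
    (i.e. the labeled RNA is at a much lower concentration than Kd).
    The default Kd of 3 nM is the approximate value quoted in the abstract.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_nM / (kd_nM + protein_nM)

# At a protein concentration equal to Kd, half of the RNA is bound.
half = fraction_bound(3.0)
```

Under this model a sequence variant or mutation that raises the dissociation constant shifts the whole binding curve to higher protein concentrations, which is how a reduction in affinity would appear in a titration.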
Topics: Cleavage And Polyadenylation Specificity Factor; Fluorescence; Humans; Nuclear Proteins; Poly A; Polyadenylation; Protein Binding; RNA Precursors; RNA, Messenger; mRNA Cleavage and Polyadenylation Factors
PubMed: 31462423
DOI: 10.1261/rna.070870.119 -
Proceedings of the National Academy of... Jun 2004
In contrast to mRNAs, rRNAs are transcribed by RNA polymerase I or III and are not believed to be polyadenylated. Here we show that in Saccharomyces cerevisiae, at least a small fraction of rRNAs do have a poly(A) tail. The levels of polyadenylated rRNAs are dramatically increased in strains lacking the degradation function of Rrp6p, a component of the nuclear exosome. Pap1p, the poly(A) polymerase, is responsible for adenylating the rRNAs despite the fact that the rRNAs do not have a canonical polyadenylation signal. Polyadenylated rRNAs reside mainly within the nucleus and are in turn degraded. For at least one rRNA type, the polyadenylation preferentially occurs on the precursor rather than the mature product. The existence of polyadenylated rRNAs may reflect a quality-control mechanism of rRNA biogenesis.
Topics: Base Sequence; Cell Nucleus; Exoribonucleases; Exosome Multienzyme Ribonuclease Complex; Genes, Fungal; In Situ Hybridization, Fluorescence; Poly A; Polynucleotide Adenylyltransferase; RNA, Fungal; RNA, Ribosomal; Reverse Transcriptase Polymerase Chain Reaction; Saccharomyces cerevisiae; Saccharomyces cerevisiae Proteins
PubMed: 15173578
DOI: 10.1073/pnas.0402888101 -
Journal of Virology Apr 1981
Polyadenylated transcripts of influenza virus RNA are incomplete copies of the individual genome segments, lacking sequences complementary to the 5'-terminal nucleotides of the virion RNA. By using a procedure which depends on the polyadenylic acid tail of the mRNA being encoded in part by the genome, we have determined that the common tract of uridine residues, approximately 17 to 22 nucleotides from the 5' end of each segment, is the site of polyadenylation of influenza virus mRNA.
Topics: Base Sequence; Genes, Viral; Influenza A virus; Poly A; RNA, Messenger; RNA, Viral
PubMed: 7241649
DOI: 10.1128/JVI.38.1.157-163.1981 -
Biochimica Et Biophysica Acta Feb 2016
It is generally accepted that only transcripts synthesized by RNA polymerase II (e.g., mRNA) are subject to AAUAAA-dependent polyadenylation. However, we previously showed that RNA transcribed by RNA polymerase III (pol III) from mouse B2 SINE could be polyadenylated in an AAUAAA-dependent manner. Many species of mammalian SINEs end with the pol III transcriptional terminator (TTTTT) and contain hexamers AATAAA in their A-rich tail. Such SINEs were united into Class T(+), whereas SINEs lacking the terminator and AATAAA sequences were classified as T(-). Here we studied the structural features of SINE pol III transcripts that are necessary for their polyadenylation. Eight and six SINE families from classes T(+) and T(-), respectively, were analyzed. The replacement of AATAAA with AACAAA in T(+) SINEs abolished the RNA polyadenylation. Interestingly, insertion of the polyadenylation signal (AATAAA) and pol III transcription terminator in T(-) SINEs did not result in polyadenylation. The detailed analysis of three T(+) SINEs (B2, DIP, and VES) revealed areas important for the polyadenylation of their pol III transcripts: the polyadenylation signal and terminator in the A-rich tail, the β region positioned immediately downstream of the box B of the pol III promoter, and the τ region located upstream of the tail. In DIP and VES (but not in B2), the τ region is a polypyrimidine motif which is also characteristic of many other T(+) SINEs. Most likely, SINEs of different mammals acquired these structural features independently as a result of parallel evolution.
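The T(+)/T(-) distinction described above can be sketched as a simple sequence check. This is a minimal illustration, not the authors' classification procedure: the function name, the 50-nt window used to approximate the "A-rich tail", and the "unclassified" label for mixed cases are all assumptions made here.

```python
def classify_sine(dna, tail_length=50):
    """Classify a SINE DNA sequence as T(+), T(-), or unclassified.

    T(+): ends with the pol III terminator TTTTT and has AATAAA in the tail.
    T(-): has neither feature.
    tail_length is a hypothetical window standing in for the A-rich tail.
    """
    seq = dna.upper()
    tail = seq[-tail_length:]
    has_terminator = seq.endswith("TTTTT")
    has_signal = "AATAAA" in tail
    if has_terminator and has_signal:
        return "T(+)"
    if not has_terminator and not has_signal:
        return "T(-)"
    return "unclassified"

# Hypothetical toy sequences, not real SINE consensus sequences.
plus_example = classify_sine("GGCC" + "AATAAA" + "AAAA" + "TTTTT")
minus_example = classify_sine("GGCCGGCCAAAAAAAA")
```

Note that the abstract reports that adding AATAAA and the terminator to T(-) SINEs did not produce polyadenylation, so a label from a check like this describes sequence features only and does not predict whether a transcript is polyadenylated.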
Topics: Animals; Base Sequence; Mice; Poly A; Polyadenylation; Promoter Regions, Genetic; RNA; RNA Polymerase III; Short Interspersed Nucleotide Elements; Transcription, Genetic
PubMed: 26700565
DOI: 10.1016/j.bbagrm.2015.12.003