-
Methods (San Diego, Calif.) Feb 2019
Recent developments in high-throughput RNA sequencing methods coupled with innovative bioinformatic tools have uncovered thousands of circular (circ)RNAs. CircRNAs have emerged as a vast and novel class of regulatory RNAs with potential to modulate gene expression by acting as sponges for microRNAs (miRNAs) and RNA-binding proteins (RBPs). The biochemical enrichment of circRNAs by exoribonuclease treatment or by depletion of polyadenylated RNAs coupled with deep-sequencing is widely used for the systematic identification of circRNAs. Although these methods enrich circRNAs substantially, they do not efficiently eliminate non-polyadenylated and highly structured RNAs. Here, we describe a method we termed RPAD, based on initial RNase R treatment followed by Polyadenylation and poly(A) RNA Depletion. These joint interventions drastically depleted linear RNAs, leading to the isolation of highly pure circRNAs from total RNA pools. By facilitating the isolation of highly pure circRNAs, RPAD enables the elucidation of circRNA biogenesis, sequence, and function.
Topics: Cell Cycle Proteins; Computational Biology; Cytoskeletal Proteins; Exoribonucleases; HeLa Cells; High-Throughput Nucleotide Sequencing; Humans; Intracellular Signaling Peptides and Proteins; Nuclear Proteins; Poly A; Polyadenylation; Protein Serine-Threonine Kinases; RNA; RNA, Circular; RNA, Messenger; RNA-Binding Proteins; Sequence Analysis, RNA
PubMed: 30391514
DOI: 10.1016/j.ymeth.2018.10.022
-
Molecular and Cellular Biology Aug 2005
RNA polyadenylation serves a purpose in bacteria and organelles opposite from the role it plays in nuclear systems. The majority of nucleus-encoded transcripts are characterized by stable poly(A) tails at their mature 3' ends, which are essential for stabilization and translation initiation. In contrast, in bacteria, chloroplasts, and plant mitochondria, polyadenylation is a transient feature which promotes RNA degradation. Surprisingly, in spite of their prokaryotic origin, human mitochondrial transcripts possess stable 3'-end poly(A) tails, akin to nucleus-encoded mRNAs. Here we asked whether human mitochondria retain truncated and transiently polyadenylated transcripts in addition to stable 3'-end poly(A) tails, which would be consistent with the preservation of the largely ubiquitous polyadenylation-dependent RNA degradation mechanisms of bacteria and organelles. To this end, using both molecular and bioinformatic methods, we sought and revealed numerous examples of such molecules, dispersed throughout the mitochondrial genome. The broad distribution but low abundance of these polyadenylated truncated transcripts strongly suggests that polyadenylation-dependent RNA degradation occurs in human mitochondria. The coexistence of this system with stable 3'-end polyadenylation, despite their seemingly opposite effects, is so far unprecedented in bacteria and other organelles.
Topics: 3' Untranslated Regions; Cell Line, Tumor; Cells, Cultured; Computational Biology; Cyclooxygenase 1; Evolution, Molecular; Expressed Sequence Tags; Humans; Membrane Proteins; Mitochondria; Polyadenylation; Prokaryotic Cells; Prostaglandin-Endoperoxide Synthases; RNA; RNA, Antisense; RNA, Messenger; RNA, Mitochondrial; RNA, Ribosomal, 16S; RNA, Transfer, Ser
PubMed: 16024781
DOI: 10.1128/MCB.25.15.6427-6435.2005
-
ELife Jun 2020
Little is known about co-transcriptional or post-transcriptional regulatory mechanisms linking noncoding variation to variation in organismal traits. To begin addressing this gap, we used 3' Seq to study the impact of genetic variation on alternative polyadenylation (APA) in the nuclear and total mRNA fractions of 52 HapMap Yoruba human lymphoblastoid cell lines. We mapped 602 APA quantitative trait loci (apaQTLs) at 10% FDR, of which 152 were nuclear specific. Effect sizes at intronic apaQTLs are negatively correlated with eQTL effect sizes. These observations suggest genetic variants can decrease mRNA expression levels by increasing usage of intronic PAS. We also identified 24 apaQTLs associated with protein levels, but not mRNA expression. Finally, we found that 19% of apaQTLs can be associated with disease. Thus, our work demonstrates that APA links genetic variation to variation in gene expression, protein expression, and disease risk, and reveals uncharted modes of genetic regulation.
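The abstract reports apaQTLs "at 10% FDR" but does not say how the false discovery rate was controlled. Purely as an illustration, the sketch below applies the Benjamini-Hochberg procedure, one common way to threshold a list of p-values at a target FDR. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return booleans marking which p-values pass Benjamini-Hochberg at `fdr`.

    Illustrative only: the study's actual FDR procedure is not stated
    in the abstract, so BH is an assumption made for this sketch.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # BH criterion: p_(rank) <= (rank / m) * fdr; keep the largest such rank.
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    passed = [False] * m
    for idx in order[:cutoff_rank]:
        passed[idx] = True
    return passed


if __name__ == "__main__":
    # Hypothetical p-values for four candidate variant/site pairs.
    print(benjamini_hochberg([0.001, 0.02, 0.5, 0.9], fdr=0.10))
```

Every p-value ranked at or below the largest passing rank is accepted, even if an individual one misses its own threshold; that step-up behaviour is what distinguishes BH from a fixed per-test cutoff.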
Topics: Cell Line; Gene Expression Regulation; Humans; Polyadenylation
PubMed: 32584258
DOI: 10.7554/eLife.57492
-
Bioinformatics (Oxford, England) Jul 2013
MOTIVATION
Pre-mRNA cleavage and polyadenylation are essential steps for 3'-end maturation and subsequent stability and degradation of mRNAs. This process is highly controlled by cis-regulatory elements surrounding the cleavage/polyadenylation sites (polyA sites), which are frequently constrained by sequence content and position. More than 50% of human transcripts have multiple functional polyA sites, and the specific use of alternative polyA sites (APA) results in isoforms with variable 3'-untranslated regions, thus potentially affecting gene regulation. Elucidating the regulatory mechanisms underlying differential polyA preferences in multiple cell types has been hindered by the lack both of suitable data on the precise location of cleavage sites and of appropriate tests for determining APAs with significant differences across multiple libraries.
RESULTS
We applied a tailored paired-end RNA-seq protocol to specifically probe the position of polyA sites in three human adult tissue types. We specified a linear-effects regression model to identify tissue-specific biases indicating regulated APA; the significance of differences between tissue types was assessed by an appropriately designed permutation test. This combination allowed us to identify highly specific subsets of APA events in the individual tissue types. Predictive models successfully classified constitutive polyA sites from a biologically relevant background (auROC = 99.6%), as well as tissue-specific regulated sets from each other. We found that the main cis-regulatory elements described for polyadenylation are a strong, and highly informative, hallmark for constitutive sites only. Tissue-specific regulated sites were found to contain other regulatory motifs, with the canonical polyadenylation signal being nearly absent at brain-specific polyA sites. Together, our results contribute to the understanding of the diversity of post-transcriptional gene regulation.
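To make the idea of a permutation test for tissue differences concrete, here is a minimal sketch. It is a simplified stand-in, not the authors' linear-effects model or their specific test design: it compares the mean usage fraction of one polyA site between two tissues by shuffling tissue labels. The function name `permutation_pvalue` and the usage values are hypothetical.

```python
import random
from statistics import mean


def permutation_pvalue(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided label-permutation p-value for a difference in group means.

    Assumption for this sketch: each value is the usage fraction of one
    polyA site in one library, and the groups are two tissue types.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values, i.e. reassign tissue labels at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical usage fractions of a proximal polyA site in two tissues.
    brain = [0.90, 0.85, 0.88, 0.92]
    liver = [0.10, 0.15, 0.12, 0.08]
    print(permutation_pvalue(brain, liver))
```

With only a handful of libraries per tissue, the number of distinct label assignments is small, which bounds how low the p-value can go; this is one reason the design of the permutation scheme matters in practice.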
AVAILABILITY
Raw data are deposited on SRA, accession numbers: brain SRX208132, kidney SRX208087 and liver SRX208134. Processed datasets as well as model code are published on our website: http://www.genome.duke.edu/labs/ohler/research/UTR/.
Topics: 3' Untranslated Regions; Adult; Genomics; Humans; Linear Models; Organ Specificity; Poly A; Polyadenylation; Regulatory Sequences, Ribonucleic Acid; Sequence Analysis, RNA
PubMed: 23812974
DOI: 10.1093/bioinformatics/btt233
-
Molecular and Cellular Biology Sep 2022
The 3' ends of eukaryotic mRNAs are generated by cleavage of nascent transcripts followed by polyadenylation, which occurs at numerous sites within 3' untranslated regions (3' UTRs) but rarely within coding regions. An individual gene can yield many 3'-mRNA isoforms with distinct half-lives. We dissect the relative contributions of protein-coding sequences (open reading frames [ORFs]) and 3' UTRs to polyadenylation profiles in yeast. ORF-deleted derivatives often display strongly decreased mRNA levels, indicating that ORFs contribute to overall mRNA stability. Poly(A) profiles, and hence relative isoform half-lives, of most (9 of 10) ORF-deleted derivatives are very similar to their wild-type counterparts. Similarly, in-frame insertion of a large protein-coding fragment between the ORF and 3' UTR has minimal effect on the poly(A) profile in all 15 cases tested. Last, reciprocal ORF/3'-UTR chimeric genes indicate that the poly(A) profile is determined by the 3' UTR. Thus, 3' UTRs are self-contained modular entities sufficient to determine poly(A) profiles and relative 3'-isoform half-lives. In the one atypical instance, ORF deletion causes an upstream shift of poly(A) sites, likely because juxtaposition of an unusually high AT-rich stretch directs polyadenylation closely downstream. This suggests that long AT-rich stretches, which are not encountered until after coding regions, are important for restricting polyadenylation to 3' UTRs.
Topics: 3' Untranslated Regions; 5' Untranslated Regions; Poly A; Polyadenylation; Protein Isoforms; RNA Isoforms; RNA, Messenger; Saccharomyces cerevisiae
PubMed: 35972270
DOI: 10.1128/mcb.00244-22
-
Bioinformatics (Oxford, England) Jun 2020
SUMMARY
Most eukaryotic genes produce alternative polyadenylation (APA) isoforms. APA is dynamically regulated under different growth and differentiation conditions. Here, we present a bioinformatics package, named APAlyzer, for examining 3'UTR APA, intronic APA and gene expression changes using RNA-seq data and annotated polyadenylation sites in the PolyA_DB database. Using APAlyzer and data from the GTEx database, we present APA profiles across human tissues.
AVAILABILITY AND IMPLEMENTATION
APAlyzer is freely available at https://bioconductor.org/packages/release/bioc/html/APAlyzer.html as an R/Bioconductor package.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Computational Biology; Humans; Poly A; Polyadenylation; Protein Isoforms; Software
PubMed: 32321166
DOI: 10.1093/bioinformatics/btaa266
-
Nucleic Acids Research Jul 2017
High-throughput RNA sequencing methods coupled with specialized bioinformatic analyses have recently uncovered tens of thousands of unique circular (circ)RNAs, but their complete sequences, genes of origin and functions are largely unknown. Given that circRNAs lack free ends and are thus relatively stable, their association with microRNAs (miRNAs) and RNA-binding proteins (RBPs) can influence gene expression programs. While exoribonuclease treatment is widely used to degrade linear RNAs and enrich circRNAs in RNA samples, it does not efficiently eliminate all linear RNAs. Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions.
Topics: Animals; Base Sequence; Cell Line; Computational Biology; Exons; Exoribonucleases; HeLa Cells; High-Throughput Nucleotide Sequencing; Humans; Introns; Mice; Molecular Sequence Annotation; Myoblasts; Poly A; Polyadenylation; RNA; RNA Cleavage; RNA, Circular; RNA, Messenger
PubMed: 28444238
DOI: 10.1093/nar/gkx297
-
ELife Aug 2020
Yeast cells undergoing the diauxic response show a striking upstream shift in poly(A) site utilization, with increased use of ORF-proximal poly(A) sites resulting in shorter 3' mRNA isoforms for most genes. This altered poly(A) pattern is extremely similar to that observed in cells containing Pol II derivatives with slow elongation rates. Conversely, cells containing derivatives with fast elongation rates show a subtle downstream shift in poly(A) sites. Polyadenylation patterns of many genes are sensitive to both fast and slow elongation rates, and a global shift of poly(A) utilization is strongly linked to increased purine content of sequences flanking poly(A) sites. Pol II processivity is impaired in diauxic cells, but strains with reduced processivity and normal Pol II elongation rates have normal polyadenylation profiles. Thus, Pol II elongation speed is important for poly(A) site selection and for regulating poly(A) patterns in response to environmental conditions.
Topics: Poly A; Polyadenylation; RNA Polymerase II; Saccharomyces cerevisiae; Saccharomyces cerevisiae Proteins; Transcription Elongation, Genetic
PubMed: 32845240
DOI: 10.7554/eLife.59810
-
The Plant Journal : For Cell and... Aug 2017
Moso bamboo (Phyllostachys edulis) represents one of the fastest-spreading plants in the world, due in part to its well-developed rhizome system. However, the post-transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single-molecule long-read sequencing technology and polyadenylation site sequencing (PAS-seq) to re-annotate the bamboo genome, and identify genome-wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full-length non-chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis-annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron-containing FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were two major transposable elements within the intronic polyadenylation region. Furthermore, this study provided a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome-root system were identified. Taken together, these results suggest that post-transcriptional regulation may have a vital role in the underground rhizome-root system.
Topics: Alternative Splicing; Introns; Molecular Sequence Annotation; Poaceae; Poly A; Polyadenylation; Rhizome; Sequence Analysis, DNA
PubMed: 28493303
DOI: 10.1111/tpj.13597
-
Genome Research Oct 2012
The post-transcriptional fate of messenger RNAs (mRNAs) is largely dictated by their 3' untranslated regions (3' UTRs), which are defined by cleavage and polyadenylation (CPA) of pre-mRNAs. We used poly(A)-position profiling by sequencing (3P-seq) to map poly(A) sites at eight developmental stages and tissues in the zebrafish. Analysis of over 60 million 3P-seq reads substantially increased and improved existing 3' UTR annotations, resulting in confidently identified 3' UTRs for >79% of the annotated protein-coding genes in zebrafish. mRNAs from most zebrafish genes undergo alternative CPA, with those from more than a thousand genes using different dominant 3' UTRs at different stages. These included one of the poly(A) polymerase genes, for which alternative CPA reinforces its repression in the ovary. 3' UTRs tend to be shortest in the ovaries and longest in the brain. Isoforms with some of the shortest 3' UTRs are highly expressed in the ovary, yet absent in the maternally contributed RNAs of the embryo, perhaps because their 3' UTRs are too short to accommodate a uridine-rich motif required for stability of the maternal mRNA. At 2 h post-fertilization, thousands of unique poly(A) sites appear at locations lacking a typical polyadenylation signal, which suggests a wave of widespread cytoplasmic polyadenylation of mRNA degradation intermediates. Our insights into the identities, formation, and evolution of zebrafish 3' UTRs provide a resource for studying gene regulation during vertebrate development.
Topics: 3' Untranslated Regions; Animals; Evolution, Molecular; Female; Gene Expression Regulation, Developmental; Genomics; Humans; Molecular Sequence Annotation; Organogenesis; Ovary; Poly A; Polyadenylation; Transcription, Genetic; Zebrafish
PubMed: 22722342
DOI: 10.1101/gr.139733.112