-
Briefings in Bioinformatics Mar 2018
Review
Computational programs for predicting RNA sequences with desired folding properties have been extensively developed and expanded in the past several years. Given a target secondary structure, these programs aim to predict sequences whose minimum free energy fold is that structure, while satisfying various constraints. This procedure is called inverse RNA folding. Inverse RNA folding has traditionally been used to design optimized RNAs with favorable properties, an application that is expected to grow considerably in light of advances in the expanding fields of synthetic biology and RNA nanostructures. Moreover, it was recently demonstrated that inverse RNA folding can be used successfully as a preprocessing step in the computational detection of novel noncoding RNAs. This review describes the most popular freeware programs that have been developed for such purposes, starting from RNAinverse, which was devised when the inverse RNA folding problem was formulated. The most recently published programs that take an RNA secondary structure as input are antaRNA, RNAiFold and incaRNAfbinv, each having different features that could be beneficial to specific biological problems in practice. The various programs also use distinct approaches, ranging from ant colony optimization to constraint programming, in addition to adaptive walk, simulated annealing and Boltzmann sampling. This review compares the various programs and provides a simple description of the various possibilities to help practitioners select the most suitable program. It is geared toward specific tasks requiring RNA design based on an input secondary structure, with an outlook toward the future of RNA design programs.
Topics: Algorithms; Animals; Computational Biology; Humans; Models, Molecular; Nucleic Acid Conformation; RNA; RNA Folding; Software
PubMed: 28049135
DOI: 10.1093/bib/bbw120 -
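As a rough illustration of the inverse folding setup described in the abstract above, the sketch below covers only the usual starting point: parsing a pseudoknot-free dot-bracket target and drawing a random sequence whose paired positions can form canonical or wobble pairs. It is not taken from any of the reviewed programs; the function names are hypothetical, and a real tool would follow this step with a search (adaptive walk, constraint programming, and so on) that checks the predicted fold with an energy model.

```python
import random

# Canonical (GC, AU) and wobble (GU) pairs, in both orientations.
PAIRS = [("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")]


def pair_table(structure):
    """Map each paired index to its partner for a pseudoknot-free dot-bracket string."""
    stack, table = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            table[i], table[j] = j, i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return table


def random_compatible_sequence(structure, rng=None):
    """Draw a sequence whose paired positions hold complementary bases (hypothetical helper)."""
    rng = rng or random.Random()
    table = pair_table(structure)
    seq = [None] * len(structure)
    for i in range(len(structure)):
        if seq[i] is not None:
            continue  # already filled as the partner of an earlier position
        if i in table:
            seq[i], seq[table[i]] = rng.choice(PAIRS)
        else:
            seq[i] = rng.choice("ACGU")
    return "".join(seq)


def is_compatible(sequence, structure):
    """True if every base pair required by the structure can form in the sequence."""
    table = pair_table(structure)
    return len(sequence) == len(structure) and all(
        (sequence[i], sequence[j]) in PAIRS for i, j in table.items()
    )
```

Compatibility is necessary but not sufficient: a compatible sequence may still have a different minimum free energy structure, which is exactly the gap the programs in the review try to close.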
Nature Chemical Biology May 2020
RNA secondary structure is critical to RNA regulation and function. We report a new N-kethoxal reagent that allows fast and reversible labeling of single-stranded guanine bases in live cells. This N-kethoxal-based chemistry allows efficient RNA labeling under mild conditions and transcriptome-wide RNA secondary structure mapping.
Topics: Aldehydes; Animals; Butanones; Embryonic Stem Cells; Guanine; HeLa Cells; High-Throughput Nucleotide Sequencing; Humans; Mice; Nucleic Acid Conformation; Nucleic Acid Heteroduplexes; RNA; RNA Folding; Transcriptome
PubMed: 32015521
DOI: 10.1038/s41589-019-0459-3 -
Methods (San Diego, Calif.) Aug 2017
Review
Pre-mRNA molecules can form a variety of structures, and both secondary and tertiary structures have important effects on the processing, function and stability of these molecules. The prediction of RNA secondary structure is a challenging problem, and various algorithms based on minimum free energy, maximum expected accuracy and comparative evolutionary methods have been developed to predict secondary structures. However, these tools are not perfect, and this remains an active area of research. The secondary structure of pre-mRNA molecules can have an enhancing or inhibitory effect on pre-mRNA splicing. An example of enhancing structure can be found in a novel class of introns in zebrafish. About 10% of zebrafish genes contain a structured intron that forms a bridging hairpin that enforces correct splice site pairing. Examples of inhibitory structure include local structures around splice sites that decrease splicing efficiency and potentially cause mis-splicing leading to disease. Splicing mutations are a frequent cause of hereditary disease. The transcripts of disease genes are significantly more structured around the splice sites, and point mutations that increase the local structure often cause splicing disruptions. Post-splicing, RNA secondary structure can also affect the stability of the spliced intron and of regulatory RNA interference pathway intermediates, such as pre-microRNAs. Additionally, RNA secondary structure has important roles in the innate immune defense against viruses. Finally, tertiary structure can also play a large role in pre-mRNA splicing. One example is the G-quadruplex structure, which, similar to secondary structure, can either enhance or inhibit splicing through mechanisms such as creating or obscuring RNA binding protein sites.
Topics: Animals; Exons; G-Quadruplexes; Humans; Immunity, Innate; Introns; Mutation; RNA Folding; RNA Precursors; RNA Splicing; RNA, Double-Stranded; Zebrafish
PubMed: 28595983
DOI: 10.1016/j.ymeth.2017.06.001 -
Bioinformatics (Oxford, England) Jan 2016
MOTIVATION
The function of an RNA molecule is not only linked to its native structure, which is usually taken to be the ground state of its folding landscape, but also in many cases crucially depends on the details of the folding pathways, such as stable folding intermediates or the timing of the folding process itself. To model and understand these processes, it is necessary to go beyond ground state structures. The study of rugged RNA folding landscapes holds the key to answering these questions. Efficient coarse-graining methods are required to reduce the intractably vast energy landscapes into condensed representations such as barrier trees or basin hopping graphs (BHG) that convey an approximate but comprehensive picture of the folding kinetics. So far, exact and heuristic coarse-graining methods have been mostly restricted to pseudoknot-free secondary structures. Pseudoknots, which are common motifs and have been repeatedly hypothesized to play an important role in guiding folding trajectories, were usually excluded.
RESULTS
We generalize the BHG framework to include pseudoknotted RNA structures and systematically study the differences in predicted folding behavior depending on whether pseudoknotted structures are allowed to occur as folding intermediates or not. We observe that RNAs with pseudoknotted ground state structures tend to have more pseudoknotted folding intermediates than RNAs with pseudoknot-free ground state structures. The occurrence and influence of pseudoknotted intermediates on the folding pathway, however, appear to depend very strongly on the individual RNAs so that no general rule can be inferred.
AVAILABILITY AND IMPLEMENTATION
The algorithms described here are implemented in C++ as standalone programs. Their source code and supplemental material can be freely downloaded from http://www.tbi.univie.ac.at/bhg.html.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Algorithms; Kinetics; Nucleic Acid Conformation; RNA; RNA Folding
PubMed: 26428288
DOI: 10.1093/bioinformatics/btv572 -
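The coarse-graining idea in the abstract above (collapsing a landscape into local minima and the connections between their basins) can be sketched on a toy landscape. The code below is not the authors' C++ implementation: the six-state landscape, its energies, and all function names are made up for illustration, and the saddle estimate only looks at direct contacts between neighboring states rather than searching for optimal paths.

```python
def gradient_walk(state, energy, neighbors):
    """Follow steepest descent until no neighbor has strictly lower energy."""
    while True:
        lower = [n for n in neighbors[state] if energy[n] < energy[state]]
        if not lower:
            return state  # local minimum (plateaus are treated as minima here)
        # Break energy ties by state name so the walk is deterministic.
        state = min(lower, key=lambda n: (energy[n], n))


def basins(energy, neighbors):
    """Group every state under the local minimum its gradient walk reaches."""
    result = {}
    for s in energy:
        result.setdefault(gradient_walk(s, energy, neighbors), []).append(s)
    return result


def basin_adjacency(energy, neighbors):
    """Connect minima whose basins touch; keep the lowest contact energy per pair."""
    owner = {s: gradient_walk(s, energy, neighbors) for s in energy}
    edges = {}
    for s in energy:
        for n in neighbors[s]:
            a, b = owner[s], owner[n]
            if a != b:
                key = tuple(sorted((a, b)))
                saddle = max(energy[s], energy[n])
                edges[key] = min(saddle, edges.get(key, saddle))
    return edges


# Hypothetical toy landscape: six states on a line, energies in arbitrary units.
energy = {"a": 0.0, "b": 2.0, "c": 1.0, "d": 3.0, "e": -1.0, "f": 1.0}
neighbors = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "e"], "e": ["d", "f"], "f": ["e"],
}

print(basins(energy, neighbors))
print(basin_adjacency(energy, neighbors))
```

In a real RNA landscape the states would be secondary structures and the neighbors would come from a move set such as opening or closing a single base pair; allowing pseudoknotted structures, as the paper does, enlarges both.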
BMC Bioinformatics Aug 2011
BACKGROUND
RNAexinv is an interactive Java application that performs RNA sequence design, constrained to yield a specific RNA shape and physical attributes. It is an extended inverse RNA folding program whose rationale is that the generated sequences should not only fold into a desired structure but also exhibit favorable attributes such as thermodynamic stability and mutational robustness. RNAexinv considers not only the secondary structure in order to design sequences, but also the mutational robustness and the minimum free energy. The sequences that are generated may not fully conform to the given RNA secondary structure, but they will strictly conform to the RNA shape of the given secondary structure and thereby take into consideration the recommended values of thermodynamic stability and mutational robustness that are provided.
RESULTS
The output consists of designed sequences that are generated by the proposed method. Selecting a sequence displays the secondary structure drawings of the target and the predicted fold of the sequence, including some basic information about the desired and achieved thermodynamic stability and mutational robustness. RNAexinv can be used successfully without prior experience, by simply specifying an initial RNA secondary structure in dot-bracket notation and numerical values for the desired neutrality and minimum free energy. The package runs under the Linux operating system. Secondary structure predictions are performed using the Vienna RNA package.
CONCLUSIONS
RNAexinv is a user-friendly tool that can be used for RNA sequence design. It is especially useful in cases where a functional stem-loop structure of a natural sequence should be strictly kept in the designed sequences but a distant motif in the rest of the structure may contain one nucleotide more or fewer at the expense of another, as long as the global shape is preserved. This allows the insertion of physical observables as constraints. RNAexinv is available at http://www.cs.bgu.ac.il/~RNAexinv.
Topics: Algorithms; Base Sequence; Molecular Sequence Data; Nucleic Acid Conformation; RNA; RNA Folding; Software; Thermodynamics
PubMed: 21813013
DOI: 10.1186/1471-2105-12-319 -
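To make the notion of mutational robustness in the abstract above more concrete, here is a minimal sketch of one simple proxy: the fraction of single-point mutants whose predicted structure is unchanged. This is an assumption for illustration and may differ from the neutrality measure RNAexinv actually uses. The `fold` argument is a hypothetical callable standing in for a real predictor (the abstract mentions the Vienna RNA package), and `toy_fold` is a made-up rule, not an energy model.

```python
def single_point_mutants(sequence, alphabet="ACGU"):
    """Yield every sequence that differs from `sequence` at exactly one position."""
    for i, base in enumerate(sequence):
        for alt in alphabet:
            if alt != base:
                yield sequence[:i] + alt + sequence[i + 1:]


def structural_neutrality(sequence, fold):
    """Fraction of single-point mutants that keep the wild-type predicted structure.

    `fold` is any function mapping a sequence to a dot-bracket string.
    """
    reference = fold(sequence)
    mutants = list(single_point_mutants(sequence))
    if not mutants:
        raise ValueError("sequence must not be empty")
    unchanged = sum(fold(m) == reference for m in mutants)
    return unchanged / len(mutants)


def toy_fold(sequence):
    """Made-up stand-in predictor: pair the two ends only if they are G and C."""
    if sequence[0] == "G" and sequence[-1] == "C":
        return "(" + "." * (len(sequence) - 2) + ")"
    return "." * len(sequence)


print(structural_neutrality("GAAAC", toy_fold))
```

With a real predictor plugged in as `fold`, the same function gives a quick way to compare candidate designs that fold into the same target but differ in how tolerant they are to mutation.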
Nucleic Acids Research Jul 2013
Synthetic biology and nanotechnology are poised to make revolutionary contributions to the 21st century. In this article, we describe a new web server to support in silico RNA molecular design. Given an input target RNA secondary structure, together with optional constraints, such as requiring GC-content to lie within a certain range or requiring the number of strong (GC), weak (AU) and wobble (GU) base pairs to lie in a certain range, the RNAiFold web server determines one or more RNA sequences whose minimum free-energy secondary structure is the target structure. RNAiFold provides access to two servers: RNA-CPdesign, which applies constraint programming, and RNA-LNSdesign, which applies the large neighborhood search heuristic and is therefore suitable for larger input structures. Both servers can also solve the RNA inverse hybridization problem, i.e. given a representation of the desired hybridization structure, RNAiFold returns two sequences whose minimum free-energy hybridization is the input target structure. The web server is publicly accessible at http://bioinformatics.bc.edu/clotelab/RNAiFold. Source code for the underlying algorithms, implemented in COMET and supported on Linux, can be downloaded at the server website.
Topics: Algorithms; Base Composition; Base Pairing; Base Sequence; Computer Simulation; Internet; RNA; RNA Folding; Software
PubMed: 23700314
DOI: 10.1093/nar/gkt280 -
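The optional constraints listed in the RNAiFold abstract above (a GC-content range and ranges for the numbers of GC, AU and GU base pairs) are easy to state as a check on a candidate sequence. The sketch below is an illustration only, not RNAiFold's code or interface: the function names and the `pair_ranges` argument are hypothetical, and it does not verify that the target is the minimum free-energy structure, which is the hard part the server solves.

```python
def base_pairs(structure):
    """Return (i, j) index pairs, i < j, for a pseudoknot-free dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def gc_content(sequence):
    """Fraction of G and C nucleotides in the sequence."""
    return sum(base in "GC" for base in sequence) / len(sequence)


def pair_counts(sequence, structure):
    """Count strong (GC), weak (AU), wobble (GU) and non-canonical pairs."""
    kinds = {"CG": "GC", "AU": "AU", "GU": "GU"}
    counts = {"GC": 0, "AU": 0, "GU": 0, "other": 0}
    for i, j in base_pairs(structure):
        key = "".join(sorted((sequence[i], sequence[j])))
        counts[kinds.get(key, "other")] += 1
    return counts


def satisfies(sequence, structure, gc_range=(0.0, 1.0), pair_ranges=None):
    """Check the composition constraints; folding itself is NOT verified here."""
    if len(sequence) != len(structure):
        return False
    if not gc_range[0] <= gc_content(sequence) <= gc_range[1]:
        return False
    counts = pair_counts(sequence, structure)
    for kind, (low, high) in (pair_ranges or {}).items():
        if not low <= counts[kind] <= high:
            return False
    return counts["other"] == 0
```

For example, `satisfies("GGUAAAGCC", "(((...)))", gc_range=(0.5, 0.7), pair_ranges={"GU": (1, 1)})` asks for a hairpin with moderate GC-content and exactly one wobble pair.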
Journal of Chemical Theory and... Mar 2022
RNA molecules fold as they are transcribed. Cotranscriptional folding of RNA plays a critical role in RNA functions. Present computational strategies focus on simulations where large structural changes may not be completely sampled. Here, we describe an alternative approach to predicting cotranscriptional RNA folding by zooming in and out of the RNA folding energy landscape. By classifying the RNA structural ensemble into "partitions" based on long, stable helices, we zoom out of the landscape and predict the overall slow folding kinetics from the interpartition kinetic network, and for each interpartition transition, we zoom in on the landscape to simulate the kinetics. Applications of the model to the 117-nucleotide SRP RNA and the 59-nucleotide HIV-1 TAR RNA show agreement with the experimental data and provide new structural and kinetic insights into biologically significant conformational switches and pathways for these important systems. This approach, by zooming in/out of an RNA folding landscape at different resolutions, might allow us to treat large RNAs with transcriptional pause, transcription speed, and other effects.
Topics: Escherichia coli; Kinetics; Nucleic Acid Conformation; RNA; RNA Folding; Thermodynamics; Transcription, Genetic
PubMed: 35133833
DOI: 10.1021/acs.jctc.1c01233 -
Nucleic Acids Research Sep 2012
Folding mechanisms in which secondary structures are stabilized through the formation of tertiary interactions are well documented in protein folding but challenge the folding hierarchy normally assumed for RNA. However, it is increasingly clear that RNA could fold by a similar mechanism. P5abc, a small independently folding tertiary domain of the Tetrahymena thermophila group I ribozyme, is known to fold by a secondary structure rearrangement involving helix P5c. However, the extent of this rearrangement and the precise stage of folding that triggers it are unknown. We use experiments and simulations to show that the P5c helix switches to the native secondary structure late in the folding pathway and is directly coupled to the formation of tertiary interactions in the A-rich bulge. P5c mutations show that the switch in P5c is not rate-determining and suggest that non-native interactions in P5c aid folding rather than impede it. Our study illustrates that despite significant differences in the building blocks of proteins and RNA, there may be common ways in which they self-assemble.
Topics: Magnesium; Molecular Dynamics Simulation; Nucleic Acid Conformation; Nucleic Acid Denaturation; RNA Folding; RNA, Catalytic; Ribonuclease T1; Tetrahymena
PubMed: 22641849
DOI: 10.1093/nar/gks468 -
RNA (New York, N.Y.) Jun 2022
RNA folding is hierarchical; therefore, predicting RNA secondary structure from sequence is an intermediate step in predicting tertiary structure. Secondary structure prediction is based on a nearest neighbor model using free energy minimization. To improve secondary structure prediction, all types of naturally occurring secondary structure motifs need to be thermodynamically characterized. However, not all secondary structure motifs are well characterized. Pentaloops, the second most abundant hairpin size, are one such poorly characterized motif. In fact, the current thermodynamic model used to predict the stability of pentaloops was derived from a small data set of pentaloops and from data for hairpins of other sizes. Here, the most commonly occurring pentaloops were identified and optically melted. New experimental data for 22 pentaloop sequences were combined with previously published data for nine pentaloop sequences. Using linear regression, a pentaloop-specific model was derived. This new model is simpler and more accurate than the current model. The new experimental data and improved model can be incorporated into software that is used to predict RNA secondary structure from sequence.
Topics: Cluster Analysis; Nucleic Acid Conformation; RNA; RNA Folding; Thermodynamics
PubMed: 35318243
DOI: 10.1261/rna.078915.121 -
RNA Biology Sep 2020
Secondary structure prediction approaches typically rely on models of equilibrium free energies that are themselves based on in vitro physical chemistry. Recent transcriptome-wide measurements of in vivo RNA structure based on SHAPE-MaP experiments provide important information that may make it possible to extend current in vitro-based RNA folding models and thereby improve the accuracy of computational RNA folding simulations with respect to the experimentally measured in vivo RNA secondary structure. Here we present a machine learning approach that uses RNA secondary structure prediction results and nucleotide sequence to predict in vivo SHAPE scores. We show that this approach has a higher Pearson correlation coefficient with experimental SHAPE scores than thermodynamic folding. This could be an important step towards augmenting experimental results with computational predictions and could help with RNA secondary structure predictions that inherently take in vivo folding properties into account.
Topics: Codon, Initiator; Computational Biology; Deep Learning; Models, Molecular; Neural Networks, Computer; Nucleic Acid Conformation; RNA; RNA Folding
PubMed: 32476596
DOI: 10.1080/15476286.2020.1760534
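The evaluation metric named in the abstract above, the Pearson correlation coefficient between predicted and experimental SHAPE scores, can be computed as in the sketch below. The function name and the convention of marking missing reactivities with `None` are assumptions for illustration; the paper's own evaluation pipeline is not described in the abstract.

```python
import math


def pearson(predicted, measured):
    """Pearson correlation over positions where both values are present.

    Positions where either value is None (e.g. no SHAPE reactivity) are skipped.
    """
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    paired = [(p, m) for p, m in zip(predicted, measured)
              if p is not None and m is not None]
    if len(paired) < 2:
        raise ValueError("need at least two paired values")
    n = len(paired)
    mean_p = sum(p for p, _ in paired) / n
    mean_m = sum(m for _, m in paired) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in paired)
    var_p = sum((p - mean_p) ** 2 for p, _ in paired)
    var_m = sum((m - mean_m) ** 2 for _, m in paired)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)
```

Computing the coefficient once for the machine learning predictions and once for scores derived from thermodynamic folding, against the same experimental profile, gives the kind of comparison the abstract reports.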