- Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
RNA design shows growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches have emerged to address this problem based on secondary structures. However, designing RNA sequences directly from 3D structures is still challenging, due to the scarcity of data, the nonunique structure-sequence mapping, and the flexibility of RNA conformation.
RESULTS
In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that can learn the conditional distribution of RNA sequences given 3D backbone structures. Our model consists of a graph neural network-based structure module and a Transformer-based sequence module, which iteratively transforms random sequences into desired sequences. By tuning the sampling weight, our model allows for a trade-off between sequence recovery and diversity to explore more candidates. We split test sets based on RNA clustering with different cut-offs for sequence or structure similarity. Our model outperforms baselines in sequence recovery, with an average relative improvement of 11% for sequence similarity splits and 16% for structure similarity splits. Moreover, RiboDiffusion performs consistently well across various RNA length categories and RNA types. We also apply in silico folding to validate whether the generated sequences can fold into the given 3D RNA backbones. Our method could be a powerful tool for RNA design that explores the vast sequence space and finds novel solutions to 3D structural constraints.
AVAILABILITY AND IMPLEMENTATION
The source code is available at https://github.com/ml4bio/RiboDiffusion.
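For orientation, sequence recovery in inverse folding is commonly computed as the fraction of positions at which a designed sequence matches the native one. The sketch below illustrates that metric only. The function name is hypothetical and the code is not taken from the RiboDiffusion repository, whose exact evaluation procedure may differ.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of aligned positions where the designed nucleotide equals the native one.

    Assumption: both sequences have the same length, since they are designed
    for the same backbone. This helper is illustrative, not the authors' code.
    """
    if len(designed) != len(native):
        raise ValueError("designed and native sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Compare case-insensitively, position by position.
    matches = sum(d == n for d, n in zip(designed.upper(), native.upper()))
    return matches / len(native)
```

Averaging this value over several sampled sequences per backbone gives a per-structure recovery, which can then be weighed against the diversity of the samples.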
Topics: RNA; Nucleic Acid Conformation; RNA Folding; Computational Biology; Algorithms; Software; Neural Networks, Computer; Sequence Analysis, RNA
PubMed: 38940178
DOI: 10.1093/bioinformatics/btae259
- Journal of Computational Biology : a... Jun 2024
Extrinsic, experimental information can be incorporated into thermodynamics-based RNA folding algorithms in the form of pseudo-energies. Evolutionary conservation of RNA secondary structure elements is detectable in alignments of phylogenetically related sequences and provides evidence for the presence of certain base pairs that can also be converted into pseudo-energy contributions. We show that the centroid base pairs computed from a consensus folding model such as RNAalifold result in a substantial improvement of the prediction accuracy for single sequences. Evidence for specific base pairs turns out to be more informative than a position-wise profile for the conservation of the pairing status. A comparison with chemical probing data, furthermore, strongly suggests that phylogenetic base pairing data are more informative than position-specific data on (un)pairedness as obtained from chemical probing experiments. In this context we demonstrate, in addition, that the conversion of signal from probing data into pseudo-energies is possible using thermodynamic structure predictions as a reference instead of known RNA structures.
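As an illustration of how probing signal can be turned into a pseudo-energy, the sketch below uses the widely cited log-linear conversion of Deigan et al., in which a per-nucleotide reactivity r contributes m * ln(r + 1) + b to the stacking energy. This is an assumption chosen for illustration. The abstract does not state which conversion the authors use, and the function name, the default parameters (m = 2.6, b = -0.8 kcal/mol), and the handling of negative reactivities are not taken from the paper.

```python
import math


def probing_pseudo_energy(reactivity: float, m: float = 2.6, b: float = -0.8) -> float:
    """Log-linear pseudo-energy (kcal/mol) for one nucleotide: m * ln(r + 1) + b.

    Assumptions (not from the abstract): the slope m and intercept b are the
    commonly quoted defaults, and a negative reactivity is treated as
    "no data", contributing zero.
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def pseudo_energy_profile(reactivities, m: float = 2.6, b: float = -0.8):
    """Apply the conversion to a whole reactivity profile."""
    return [probing_pseudo_energy(r, m, b) for r in reactivities]
```

With this form, unreactive positions receive a small bonus for being paired (the intercept b), while highly reactive positions are penalized.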
Topics: Nucleic Acid Conformation; RNA; Thermodynamics; Algorithms; Phylogeny; Base Pairing; RNA Folding; Base Sequence; Computational Biology
PubMed: 38935442
DOI: 10.1089/cmb.2024.0519
- International Journal of Molecular... Jun 2024
Heat stroke, a hazardous hyperthermia-related illness, is characterized by CNS injury, particularly long-lasting brain damage. A root cause of hyperthermic neurological damage is heat-induced proteotoxic stress through protein aggregation, a known causative agent of neurological disorders. Stress magnitude and persistence are highly correlated with hyperthermia-associated neurological damage. We used an untargeted proteomic approach based on liquid chromatography-tandem mass spectrometry (LC-MS/MS) to identify and characterize time-series proteome-wide changes in dose-responsive proteotoxic stress models in medulloblastoma [Daoy], neuroblastoma [SH-SY5Y], and differentiated SH-SY5Y neuron-like cells [SH(D)]. An integrated analysis of condition-time datasets identified global proteome-wide differentially expressed proteins (DEPs) as part of the heat-induced proteotoxic stress response. The condition-specific analysis detected more DEPs and upregulated proteins under extreme heat stress, with relatively conservative and tight regulation in differentiated SH-SY5Y neuron-like cells. Functional network analysis using Ingenuity Pathway Analysis (IPA) identified common intercellular pathways associated with the biological processes of protein, RNA, and amino acid metabolism, cellular response to stress, and membrane trafficking. The condition-wise temporal pathway analysis in the differentiated neuron-like cells detected significant pathway, functional, and disease associations of DEPs with processes such as protein folding and protein synthesis, Nervous System Development and Function, and Neurological Disease. An elaborate dose-dependent, stress-specific, and neuroprotective cellular signaling cascade was also significantly activated. Thus, our study provides a comprehensive map of the heat-induced proteotoxic stress response, associating proteome-wide changes with altered biological processes. This helps to expand our understanding of the molecular basis of the heat-induced proteotoxic stress response, with potential translational implications.
Topics: Humans; Neurons; Proteomics; Proteome; Cell Line, Tumor; Heat-Shock Response; Tandem Mass Spectrometry; Chromatography, Liquid; Cell Differentiation; Proteotoxic Stress
PubMed: 38928492
DOI: 10.3390/ijms25126787
- Nature Communications Jun 2024
Although our understanding of the involvement of heterochromatin architectural factors in shaping nuclear organization is improving, there is still ongoing debate regarding the role of active genes in this process. In this study, we use publicly available Micro-C data from mouse embryonic stem cells to investigate the relationship between gene transcription and 3D gene folding. Our analysis uncovers a nonmonotonic, though globally positive, correlation between intragenic contact density and Pol II occupancy, independent of cohesin-based loop extrusion. Through the development of a biophysical model integrating the role of transcription dynamics within a polymer model of chromosome organization, we demonstrate that Pol II-mediated attractive interactions with limited valency between transcribed regions yield quantitative predictions consistent with chromosome-conformation-capture and live-imaging experiments. Our work provides compelling evidence that transcriptional activity shapes the 4D genome through Pol II-mediated micro-compartmentalization.
Topics: Animals; Mice; Mouse Embryonic Stem Cells; Transcription, Genetic; RNA Polymerase II; Chromosomal Proteins, Non-Histone; Cohesins; Heterochromatin; Chromosomes; Chromatin; Cell Cycle Proteins; Gene Expression Regulation
PubMed: 38918438
DOI: 10.1038/s41467-024-49727-7
- BioRxiv : the Preprint Server For... Jun 2024
Folding intermediates mediate both protein folding and the misfolding and aggregation observed in human diseases, including amyotrophic lateral sclerosis (ALS), and are prime targets for therapeutic interventions. In this study, we identified the core nucleus of structure for a folding intermediate in the second RNA recognition motif (RRM2) of the ALS-linked RNA-binding protein, TDP-43, using a combination of experimental and computational approaches. Urea equilibrium unfolding studies revealed that the RRM2 intermediate state consists of collapsed residual secondary structure localized to the N-terminal half of RRM2, while the C-terminus is largely disordered. Steered molecular dynamics simulations and mutagenesis studies yielded key stabilizing hydrophobic contacts that, when mutated to alanine, severely disrupt the overall fold of RRM2. In combination, these findings suggest a role for this RRM intermediate in normal TDP-43 function as well as serving as a template for misfolding and aggregation through the low stability and non-native secondary structure.
PubMed: 38915526
DOI: 10.1101/2024.06.12.598648
- BioRxiv : the Preprint Server For... Jun 2024
White matter hyperintensities (WMHs) are commonly detected on T2-weighted magnetic resonance imaging (MRI) scans, occurring in both typical aging and Alzheimer's disease (AD). Despite their frequent appearance and their association with cognitive decline, the molecular factors contributing to WMHs remain unclear. In this study, we investigated the transcriptomic profiles of two commonly affected brain regions with coincident AD pathology, frontal subcortical white matter (frontal-WM) and occipital subcortical white matter (occipital-WM), and compared them with age-matched healthy controls. Through RNA-sequencing in frontal- and occipital-WM bulk tissues, we identified an upregulation of genes associated with brain vasculature function in AD white matter. To further elucidate vasculature-specific transcriptomic features, we performed RNA-seq analysis on blood vessels isolated from these white matter regions, which revealed an upregulation of genes related to protein folding pathways. Finally, comparing gene expression profiles between AD individuals with high- versus low-WMH burden showed an increased expression of pathways associated with immune function. Taken together, our study characterizes the diverse molecular profiles of white matter changes in AD compared to normal aging and provides new mechanistic insights into the processes underlying AD-related WMHs.
PubMed: 38915516
DOI: 10.1101/2024.06.13.598845
- Cellular and Molecular Life Sciences :... Jun 2024
N6-methyladenosine (m6A) is one of the most prevalent and conserved RNA modifications. It controls several biological processes, including the biogenesis and function of circular RNAs (circRNAs), which are a class of covalently closed, single-stranded RNAs. Several studies have revealed that proteotoxic stress response induction could be a relevant anticancer therapy in Acute Myeloid Leukemia (AML). Furthermore, a strong molecular interaction between the m6A mRNA modification factors and the suppression of the proteotoxic stress response has emerged. Since the proteasome inhibition leading to the imbalance in protein homeostasis is strictly linked to the stress response induction, we investigated the role of Bortezomib (Btz) on m6A regulation and in particular its impact on the modulation of m6A-modified circRNA expression. Here, we show that treating AML cells with Btz downregulated the expression of the m6A regulator WTAP at the translational level, mainly because of increased oxidative stress. Indeed, Btz treatment promoted oxidative stress, with ROS generation and HMOX-1 activation, and administration of the reducing agent N-acetylcysteine restored WTAP expression. Additionally, we identified m6A-modified circRNAs modulated by Btz treatment, including circHIPK3, which is implicated in protein folding and oxidative stress regulation. These results highlight the intricate molecular networks involved in oxidative and ER stress induction in AML cells following proteotoxic stress response, laying the groundwork for future therapeutic strategies targeting these pathways.
Topics: Humans; RNA, Circular; Leukemia, Myeloid, Acute; Adenosine; Oxidative Stress; Bortezomib; Cell Line, Tumor; Reactive Oxygen Species; RNA Splicing Factors; Cell Cycle Proteins; Neoplastic Stem Cells; Heme Oxygenase-1; Protein Serine-Threonine Kinases; Intracellular Signaling Peptides and Proteins
PubMed: 38909325
DOI: 10.1007/s00018-024-05299-9
- The Journal of Biological Chemistry Jun 2024 (Review)
Transfer RNAs (tRNAs) are the most highly modified cellular RNAs, both with respect to the proportion of nucleotides that are modified within the tRNA sequence and with respect to the extraordinary diversity in tRNA modification chemistry. However, the functions of many different tRNA modifications are only beginning to emerge. tRNAs have two general clusters of modifications. The first cluster is within the anticodon stem-loop and includes several modifications essential for protein translation. The second cluster of modifications is within the tRNA elbow, and roles for these modifications are less clear. tRNA elbow modifications are typically not essential for cell growth, yet several of them have been highly conserved throughout all domains of life. In addition to forming modifications, many tRNA modifying enzymes have been demonstrated or hypothesized to play an important role in folding tRNA, acting as tRNA chaperones. In this review, we summarize the known functions of tRNA modifying enzymes throughout the lifecycle of a tRNA molecule, from transcription to degradation. We thereby describe how tRNA modification and folding by tRNA modifying enzymes enhance tRNA maturation, tRNA aminoacylation, and tRNA function during protein synthesis, ultimately impacting cellular phenotypes and disease.
PubMed: 38908752
DOI: 10.1016/j.jbc.2024.107488
- Methods in Molecular Biology (Clifton,... 2024
In vitro selection of allosteric ribozymes has many challenges, such as complex and time-consuming experimental procedures, uncertain results, and the unwanted functionality of the enriched sequences. The precise computational design of allosteric ribozymes is achievable using RNA secondary structure folding principles. The computational design of allosteric ribozymes is based on experimentally validated EAs, random search algorithms, and a partition function for RNA folding. The in silico design achieves an accuracy exceeding 90%. Various algorithms with different logic gates have been automated via computer programs that can quickly create many allosteric sequences. This can eliminate the need for in vitro selection of allosteric ribozymes, thus vastly reducing the time and cost required.
Topics: RNA, Catalytic; Algorithms; Nucleic Acid Conformation; Computational Biology; Allosteric Regulation; RNA Folding; Software; Computer Simulation
PubMed: 38907934
DOI: 10.1007/978-1-0716-3918-4_28
- Methods in Molecular Biology (Clifton,... 2024
The structure of RNA molecules is critical to their functions in a biological system. RNA structure is dynamic and changes in response to cellular needs. Within the last few decades, there has been increased interest in studying the structure of RNA molecules and how it changes to support the needs of the cell in different conditions. Selective 2'-hydroxyl acylation-based mutational profiling using high-throughput sequencing is a powerful method to predict the secondary structure of RNA molecules both in vivo and in immunopurified samples. The method works by adding bulky groups onto accessible "flexible" bases in an RNA molecule that are not involved in any base-pairing or RNA-protein interactions. When the RNA is reverse transcribed into cDNA, the bulky groups are incorporated as base mutations, which can be compared to an unmodified control to identify the locations of flexible bases. Comparing sequence data between modified and unmodified samples allows software developed to generate reactivity profiles to produce RNA secondary structure models. These models can be compared across a variety of conditions to determine how specific stimuli influence RNA secondary structures.
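The comparison between modified and unmodified samples can be sketched as a per-position background subtraction of mutation rates. This is a simplified illustration under stated assumptions: the function names and input layout are hypothetical, and published pipelines apply further steps (for example normalization and depth filtering) that are not described in the abstract and are omitted here.

```python
from typing import List, Optional, Sequence


def mutation_rate(mutations: int, depth: int) -> Optional[float]:
    """Mutation rate at one position, or None when there is no read coverage."""
    return mutations / depth if depth > 0 else None


def background_subtracted_profile(
    mod_mutations: Sequence[int],
    mod_depths: Sequence[int],
    ctl_mutations: Sequence[int],
    ctl_depths: Sequence[int],
) -> List[Optional[float]]:
    """Per-position rate in the modified sample minus the rate in the unmodified control.

    Assumption: all four inputs are aligned to the same RNA positions.
    Positions lacking coverage in either sample are reported as None.
    """
    profile: List[Optional[float]] = []
    for mm, md, cm, cd in zip(mod_mutations, mod_depths, ctl_mutations, ctl_depths):
        mod_rate = mutation_rate(mm, md)
        ctl_rate = mutation_rate(cm, cd)
        if mod_rate is None or ctl_rate is None:
            profile.append(None)
        else:
            profile.append(mod_rate - ctl_rate)
    return profile
```

Positions with a clearly positive difference are candidates for flexible, unpaired nucleotides, while values near zero suggest the position was protected from modification.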
Topics: RNA Folding; RNA; Mutation; High-Throughput Nucleotide Sequencing; Nucleic Acid Conformation; Software; Acylation
PubMed: 38907926
DOI: 10.1007/978-1-0716-3918-4_20