-
Biomeditsinskaia Khimiia Jun 2024
Renalase (RNLS) is a recently discovered protein that plays an important role in the regulation of blood pressure by acting inside and outside cells. Intracellular RNLS is a FAD-dependent oxidoreductase that oxidizes isomeric forms of β-NAD(P)H. Extracellular renalase lacking its N-terminal peptide and cofactor FAD exerts various protective effects via non-catalytic mechanisms. Some experimental evidence in the literature indicates that the RP220 peptide (a 20-mer peptide corresponding to the amino acid sequence RNLS 220-239) reproduces a number of non-catalytic effects of this protein by acting on receptor proteins of the plasma membrane. The possible interaction of this peptide with intracellular proteins has not been studied. Given the known role of RNLS as a possible antihypertensive factor, the aim of this study was to perform proteomic profiling of the kidneys of normotensive rats and spontaneously hypertensive rats (SHR) using RP220 as an affinity ligand. Semi-quantitative proteomic identification revealed changes in the relative content of about 200 individual proteins bound to the affinity sorbent in the kidneys of hypertensive rats as compared to the kidneys of normotensive animals. Compared with the normotensive control, increased binding of SHR renal proteins to RP220 was found for proteins involved in the development of cardiovascular pathology. Decreased binding of the kidney proteins from hypertensive animals to RP220 was noted for components of the ubiquitin-proteasome system, ribosomes, and cytoskeleton.
Topics: Animals; Rats; Kidney; Hypertension; Rats, Inbred SHR; Proteomics; Monoamine Oxidase; Male; Ligands; Peptides; Proteome
PubMed: 38940203
DOI: 10.18097/PBMC20247003145
-
Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
In drug discovery, it is crucial to assess the drug-target binding affinity (DTA). Although molecular docking is widely used, computational efficiency limits its application in large-scale virtual screening. Deep learning-based methods learn virtual scoring functions from labeled datasets and can quickly predict affinity. However, there are three limitations. First, existing methods only consider the atom-bond graph or one-dimensional sequence representations of compounds, ignoring the information about functional groups (pharmacophores) with specific biological activities. Second, relying on limited labeled datasets fails to learn comprehensive embedding representations of compounds and proteins, resulting in poor generalization performance in complex scenarios. Third, existing feature fusion methods cannot adequately capture contextual interaction information.
RESULTS
Therefore, we propose a novel DTA prediction method named HeteroDTA. Specifically, a multi-view compound feature extraction module is constructed to model the atom-bond graph and pharmacophore graph. The residue contact graph and protein sequence are also utilized to model protein structure and function. Moreover, to enhance the generalization capability and reduce the dependence on task-specific labeled data, pre-trained models are utilized to initialize the atomic features of the compounds and the embedding representations of the protein sequence. A context-aware nonlinear feature fusion method is also proposed to learn interaction patterns between compounds and proteins. Experimental results on public benchmark datasets show that HeteroDTA significantly outperforms existing methods. In addition, HeteroDTA shows excellent generalization performance in cold-start experiments and superiority in the representation learning ability of drug-target pairs. Finally, the effectiveness of HeteroDTA is demonstrated in a real-world drug discovery study.
AVAILABILITY AND IMPLEMENTATION
The source code and data are available at https://github.com/daydayupzzl/HeteroDTA.
Topics: Drug Discovery; Molecular Docking Simulation; Proteins; Deep Learning; Pharmacophore
PubMed: 38940179
DOI: 10.1093/bioinformatics/btae240
-
Bioinformatics (Oxford, England) Jun 2024
SUMMARY
Multiple sequence alignment is an important problem in computational biology with applications that include phylogeny and the detection of remote homology between protein sequences. UPP is a popular software package that constructs accurate multiple sequence alignments for large datasets based on ensembles of hidden Markov models (HMMs). A computational bottleneck for this method is a sequence-to-HMM assignment step, which relies on the precise computation of probability scores on the HMMs. In this work, we show that we can speed up this assignment step significantly by replacing these HMM probability scores with alternative scores that can be efficiently estimated. Our proposed approach utilizes a multi-armed bandit algorithm to adaptively and efficiently compute estimates of these scores. This allows us to achieve alignment accuracy similar to that of UPP with a significant reduction in computation time, particularly for datasets with long sequences.
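The idea of adaptively estimating scores with a multi-armed bandit can be illustrated with a generic successive-elimination routine. This is only a minimal sketch under stated assumptions, not the method implemented in adaptiveMSA: each "arm" stands for one candidate HMM, the hypothetical `pull` callback returns one cheap noisy estimate of that HMM's score for a query sequence, and the noise is assumed to be sub-Gaussian with scale `sigma`. The function name, parameters, and confidence radius are illustrative choices.

```python
import math
import random


def best_arm(pull, n_arms, delta=0.05, sigma=1.0, max_rounds=10_000):
    """Return the index of the arm with the highest estimated mean score.

    Generic successive elimination (an illustrative stand-in, not the
    published algorithm): every surviving arm is sampled once per round,
    and an arm is dropped as soon as its upper confidence bound falls
    below the lower confidence bound of the current leader.
    """
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    for t in range(1, max_rounds + 1):
        # One cheap, noisy score estimate per surviving arm.
        for a in active:
            sums[a] += pull(a)
        means = {a: sums[a] / t for a in active}
        # Confidence radius shrinks roughly like 1 / sqrt(t).
        radius = sigma * math.sqrt(2.0 * math.log(4.0 * n_arms * t * t / delta) / t)
        leader = max(means.values())
        # Keep only arms that could still plausibly be the best one.
        active = [a for a in active if means[a] + radius >= leader - radius]
        if len(active) == 1:
            break
    # All surviving arms were sampled equally often, so sums rank like means.
    return max(active, key=lambda a: sums[a])


# Toy usage: three hypothetical HMMs with well-separated true scores.
rng = random.Random(0)
true_scores = [0.1, 0.5, 0.9]
chosen = best_arm(lambda a: rng.gauss(true_scores[a], 0.1), n_arms=3, sigma=0.1)
print(chosen)
```

The saving comes from the elimination step: clearly inferior candidates stop being sampled after a few rounds, so most of the estimation effort goes to the arms that are hard to tell apart.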
AVAILABILITY AND IMPLEMENTATION
The code used to produce the results in this paper is available on GitHub at: https://github.com/ilanshom/adaptiveMSA.
Topics: Sequence Alignment; Algorithms; Software; Markov Chains; Computational Biology; Sequence Analysis, Protein; Phylogeny; Proteins
PubMed: 38940160
DOI: 10.1093/bioinformatics/btae225
-
Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
Profiling of gene expression and chromatin accessibility by single-cell multi-omics approaches can help to systematically decipher how transcription factors (TFs) regulate target gene expression via cis-region interactions. However, integrating information from different modalities to discover regulatory associations is challenging, in part because motif scanning approaches miss many likely TF binding sites.
RESULTS
We develop REUNION, a framework for predicting genome-wide TF binding and cis-region-TF-gene "triplet" regulatory associations using single-cell multi-omics data. The first component of REUNION, Unify, utilizes information theory-inspired complementary score functions that incorporate TF expression, chromatin accessibility, and target gene expression to identify regulatory associations. The second component, Rediscover, takes Unify estimates as input for pseudo semi-supervised learning to predict TF binding in accessible genomic regions that may or may not include detected TF motifs. Rediscover leverages latent chromatin accessibility and sequence feature spaces of the genomic regions, without requiring chromatin immunoprecipitation data for model training. Applied to peripheral blood mononuclear cell data, REUNION outperforms alternative methods in TF binding prediction on average performance. In particular, it recovers missing region-TF associations from regions lacking detected motifs, which circumvents the reliance on motif scanning and facilitates discovery of novel associations involving potential co-binding transcriptional regulators. Newly identified region-TF associations, even in regions lacking a detected motif, improve the prediction of target gene expression in regulatory triplets, and are thus likely to genuinely participate in the regulation.
AVAILABILITY AND IMPLEMENTATION
All source code is available at https://github.com/yangymargaret/REUNION.
Topics: Transcription Factors; Humans; Single-Cell Analysis; Binding Sites; Chromatin; Genomics; Software; Computational Biology; Protein Binding; Algorithms; Leukocytes, Mononuclear; Multiomics
PubMed: 38940155
DOI: 10.1093/bioinformatics/btae234
-
Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
Insertions and deletions (indels) influence the genetic code in fundamentally distinct ways from substitutions, significantly impacting gene product structure and function. Despite their influence, the evolutionary history of indels is often neglected in phylogenetic tree inference and ancestral sequence reconstruction, hindering efforts to comprehend biological diversity determinants and engineer variants for medical and industrial applications.
RESULTS
We frame determining the optimal history of indel events as a single Mixed-Integer Programming (MIP) problem, across all branch points in a phylogenetic tree adhering to topological constraints, and all sites implied by a given set of aligned, extant sequences. By disentangling the impact on ancestral sequences at each branch point, this approach identifies the minimal indel events that jointly explain the diversity in sequences mapped to the tips of that tree. MIP can recover alternate optimal indel histories, if available. We evaluated MIP for indel inference on a dataset comprising 15 real phylogenetic trees associated with protein families ranging from 165 to 2000 extant sequences, and on 60 synthetic trees at comparable scales of data and reflecting realistic rates of mutation. Across relevant metrics, MIP outperformed alternative parsimony-based approaches and reported the fewest indel events, on par with or below their occurrence in synthetic datasets. MIP offers a rational justification for indel patterns in extant sequences; importantly, it uniquely identifies global optima on complex protein data sets without making unrealistic assumptions of independence or evolutionary underpinnings, promising a deeper understanding of molecular evolution and aiding novel protein design.
AVAILABILITY AND IMPLEMENTATION
The implementation is available via GitHub at https://github.com/santule/indelmip.
Topics: Phylogeny; INDEL Mutation; Evolution, Molecular; Algorithms; Computational Biology
PubMed: 38940131
DOI: 10.1093/bioinformatics/btae254
-
Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
One of the core problems in the analysis of protein tandem mass spectrometry data is the peptide assignment problem: determining, for each observed spectrum, the peptide sequence that was responsible for generating the spectrum. Two primary classes of methods are used to solve this problem: database search and de novo peptide sequencing. State-of-the-art methods for de novo sequencing use machine learning methods, whereas most database search engines use hand-designed score functions to evaluate the quality of a match between an observed spectrum and a candidate peptide from the database. We hypothesized that machine learning models for de novo sequencing implicitly learn a score function that captures the relationship between peptides and spectra, and thus may be re-purposed as a score function for database search. Because this score function is trained from massive amounts of mass spectrometry data, it could potentially outperform existing, hand-designed database search tools.
RESULTS
To test this hypothesis, we re-engineered Casanovo, which has been shown to provide state-of-the-art de novo sequencing capabilities, to assign scores to given peptide-spectrum pairs. We then evaluated the statistical power of this Casanovo score function, Casanovo-DB, to detect peptides on a benchmark of three mass spectrometry runs from three different species. In addition, we show that re-scoring with the Percolator post-processor benefits Casanovo-DB more than other score functions, further increasing the number of detected peptides.
Topics: Databases, Protein; Peptides; Machine Learning; Mass Spectrometry; Algorithms; Sequence Analysis, Protein; Tandem Mass Spectrometry
PubMed: 38940129
DOI: 10.1093/bioinformatics/btae218
-
Frontiers in Bioscience (Landmark... Jun 2024
BACKGROUND
Hormone receptors exert their function through binding with their ligands, which results in cellular signaling activation mediated by genomic or non-genomic mechanisms. The intrinsic molecular communication between a tick and its host comprises an endocrine regulation involving hormones. In the present study, we performed a molecular and in silico analysis of a Membrane Associated Progesterone Receptor in Rhipicephalus microplus (RmMAPRC).
METHODS
The RmMAPRC protein sequence was analyzed with bioinformatics tools, and its structure was characterized by three-dimensional (3D) modeling and molecular docking. A semi-quantitative reverse transcription and polymerase chain reaction (sqRT-PCR) assessed the gene presence and relative expression in tick organs and embryonic cells.
RESULTS
RmMAPRC relative expression in salivary glands, ovaries, and embryonic cells showed overexpression of 3%, 13%, and 24%, respectively. Bioinformatic analysis revealed that RmMAPRC corresponded to a Progesterone Receptor Membrane Component 1 (RmPGRMC1) of ~23.7 kDa, with an N-terminal transmembrane domain and a C-terminal Cytochrome b5-like heme/steroid binding domain. The docking results suggest that RmPGRMC1 could bind to progesterone (P4), some progestins, and P4 antagonists. The phylogenetic reconstruction showed that spp. MAPRC receptors were clustered in a clade that includes , , and Rhipicephalus microplus (RmMAPRC), and that mammalian and helminth MAPRC receptors clustered in two separate clades away from ticks.
CONCLUSIONS
The presence of RmPGRMC1 highlights the importance of transregulation as a conserved adaptive mechanism that has succeeded for arthropod parasites, making it a target for tick control.
Topics: Animals; Rhipicephalus; Receptors, Progesterone; Progesterone; Cattle; Molecular Docking Simulation; Host-Parasite Interactions; Female; Amino Acid Sequence; Protein Binding; Phylogeny
PubMed: 38940045
DOI: 10.31083/j.fbl2906238
-
Frontiers in Bioscience (Landmark... Jun 2024
BACKGROUND
The incidence rate of oropharyngeal squamous cell carcinoma (OPSCC) worldwide is alarming. In the clinical community, there is a pressing necessity to comprehend the etiology of the OPSCC to facilitate the administration of effective treatments.
METHODS
This study presents an integrative genomics approach for identifying key oncogenic drivers involved in OPSCC pathogenesis. The dataset contains RNA-Sequencing (RNA-Seq) samples of 46 Human papillomavirus-positive head and neck squamous cell carcinoma cases and 25 normal Uvulopalatopharyngoplasty cases. Differential marker selection is performed between the groups with a log2FoldChange (FC) score of 2 and an adjusted p-value < 0.01, which yields 714 genes. The Particle Swarm Optimization (PSO) algorithm then selects a candidate gene subset, reducing the size to 73. State-of-the-art machine learning algorithms are trained with the differentially expressed genes and with the candidate subsets from PSO.
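The thresholding step described in these methods can be sketched as a simple filter. This is an illustrative sketch only: the abstract does not say whether the fold-change cutoff applies to the absolute value, so the use of |log2FC| >= 2 here is an assumption, and the function name, tuple layout, and gene names are hypothetical placeholders rather than anything taken from the study.

```python
def select_markers(genes, lfc_cutoff=2.0, padj_cutoff=0.01):
    """Return names of genes passing both differential-expression cutoffs.

    `genes` is an iterable of (name, log2_fold_change, adjusted_p_value).
    Assumption: the fold-change cutoff is applied to the absolute value,
    so strongly up- and down-regulated genes are both retained.
    """
    return [
        name
        for name, log2_fc, padj in genes
        if abs(log2_fc) >= lfc_cutoff and padj < padj_cutoff
    ]


# Hypothetical placeholder records, not data from the study.
example = [
    ("GENE_A", 3.1, 0.001),    # passes both cutoffs
    ("GENE_B", -2.5, 0.005),   # passes (down-regulated)
    ("GENE_C", 1.2, 0.0001),   # fails the fold-change cutoff
    ("GENE_D", 4.0, 0.2),      # fails the adjusted p-value cutoff
]
print(select_markers(example))
```

A subset-selection method such as PSO would then operate on the list returned by this filter.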
RESULTS
The analysis of predictive models using Shapley Additive exPlanations (SHAP) revealed that seven genes significantly contribute to the model's performance. Three of these genes predominantly influence the differentiation between sample groups, followed in importance by the remaining four. The Random Forest and Bayes Net algorithms also achieved perfect validation scores when using PSO features. Furthermore, gene set enrichment analysis, protein-protein interactions, and disease ontology mining revealed a significant association between these genes and the target condition. Survival analysis of the three key genes indicated by SHAP revealed strong over-expression in the samples from "The Cancer Genome Atlas".
CONCLUSIONS
Our findings elucidate critical oncogenic drivers in OPSCC, offering vital insights for developing targeted therapies and enhancing the understanding of its pathogenesis.
Topics: Humans; Oropharyngeal Neoplasms; Biomarkers, Tumor; Papillomavirus Infections; Artificial Intelligence; Gene Expression Regulation, Neoplastic; Squamous Cell Carcinoma of Head and Neck; Algorithms; Sequence Analysis, RNA; Machine Learning; Papillomaviridae; Carcinoma, Squamous Cell
PubMed: 38940026
DOI: 10.31083/j.fbl2906220
-
Journal of Extracellular Biology Aug 2023
Non-coding RNAs (ncRNAs) are important regulators of gene expression. They are expressed not only in cells, but also in cell-derived extracellular vesicles (EVs). The mechanisms controlling their loading and sorting remain poorly understood. Here, we investigated the impact of TP53 mutations on the non-coding RNA content of small melanoma EVs. After purification of small EVs from six different patient-derived melanoma cell lines, we characterized them by small RNA sequencing and lncRNA microarray analysis. We found that TP53 mutations are associated with a specific microRNA and long non-coding RNA content in small EVs. Then, we showed that long and small non-coding RNAs enriched in TP53 mutant small EVs share a common sequence motif, highly similar to the RNA-binding motif of Sam68, a protein interacting with hnRNP proteins. This protein may thus be an interesting partner of p53, involved in the expression and loading of the ncRNAs. To conclude, our data support the existence of cellular mechanisms associated with TP53 mutations which control the ncRNA content of small EVs in melanoma.
PubMed: 38939511
DOI: 10.1002/jex2.105
-
Frontiers in Oncology 2024
BACKGROUND
Pervasive transcription of the eukaryotic genome generates noncoding RNAs (ncRNAs), which regulate messenger RNA (mRNA) stability and translation. MicroRNAs (miRNAs/miRs) represent a group of well-studied ncRNAs that maintain cellular homeostasis. Thus, any aberration in miRNA expression can cause diseases, including carcinogenesis. According to microRNA microarray analyses, intronic miR-617 is significantly downregulated in oral squamous cell carcinoma (OSCC) tissues compared to normal oral tissues.
METHODS
The miR-617-mediated regulation of DDX27 is established by performing experiments on OSCC cell lines, patient samples, and a xenograft nude mouse model. Overexpression plasmid constructs, bisulphite sequencing PCR, bioinformatics analyses, RT-qPCR, Western blotting, dual-luciferase reporter assay, and cell-based assays are utilized to delineate the role of miR-617 in OSCC.
RESULTS
The present study shows that miR-617 has an anti-proliferative role in OSCC cells and is partly downregulated in OSCC cells due to the hypermethylation of its independent promoter. Further, we demonstrate that miR-617 upregulates the DDX27 gene by interacting with its promoter in a dose-dependent and sequence-specific manner, and this interaction is found to be biologically relevant in OSCC patient samples. Subsequently, we show that miR-617 regulates cell proliferation, apoptosis, and anchorage-independent growth of OSCC cells by modulating DDX27 levels. In addition, our study shows that miR-617 exerts its effects through the PI3K/AKT/MTOR pathway by regulating DDX27 levels. Furthermore, the OSCC xenograft study in nude mice shows the anti-tumorigenic potential of miR-617.
CONCLUSION
miR-617-mediated upregulation of DDX27 is a novel mechanism in OSCC and underscores the therapeutic potential of synthetic miR-617 mimics in cancer therapeutics. To the best of our knowledge, miR-617 is the 15th example of a miRNA that upregulates the expression of a protein-coding gene by interacting with its promoter.
PubMed: 38939334
DOI: 10.3389/fonc.2024.1411539