protein sequence - OpenMD.com Journal Search

Towards sequence-based principles for protein phase separation predictions.

Current Opinion in Chemical Biology Aug 2023

Review

Authors: Michele Vendruscolo, Monika Fuxreiter

The phenomenon of protein phase separation, which underlies the formation of biomolecular condensates, has been associated with numerous cellular functions. Recent studies indicate that the amino acid sequences of most proteins may harbour not only the code for folding into the native state but also for condensing into the liquid-like droplet state and the solid-like amyloid state. Here we review the current understanding of the principles for sequence-based methods for predicting the propensity of proteins for phase separation. A guiding concept is that entropic contributions are generally more important to stabilise the droplet state than they are for the native and amyloid states. Although estimating these entropic contributions has proven difficult, we describe some progress that has been recently made in this direction. To conclude, we discuss the challenges ahead to extend sequence-based prediction methods of protein phase separation to include quantitative in vivo characterisations of this process.

Topics: Amyloid; Amino Acid Sequence; Cell Physiological Phenomena

PubMed: 37207400
DOI: 10.1016/j.cbpa.2023.102317

Editorial: Neuropeptide actions in arthropod biology.

Frontiers in Endocrinology 2024

Authors: Qisheng Song, David Stanley

Topics: Animals; Arthropods; Neuropeptides; Amino Acid Sequence; Biology

PubMed: 38481439
DOI: 10.3389/fendo.2024.1387176

HNetGO: protein function prediction via heterogeneous network transformer.

Briefings in Bioinformatics Sep 2023

Authors: Xiaoshuai Zhang, Huannan Guo, Fan Zhang...

Protein function annotation is one of the most important research topics for revealing the essence of life at the molecular level in the post-genome era. Current research shows that integrating multisource data can effectively improve the performance of protein function prediction models. However, heavy reliance on complex feature engineering and model integration limits the development of existing methods. In addition, deep learning models extract sequence features only from the labeled data in a given dataset, ignoring the large amount of existing unlabeled sequence data. Here, we propose an end-to-end protein function annotation model named HNetGO, which uses a heterogeneous network to integrate protein sequence similarity and protein-protein interaction network information and combines this with a pretrained model to extract the semantic features of the protein sequence. In addition, we design an attention-based graph neural network model, which can effectively extract node-level features from heterogeneous networks and predict protein function by measuring the similarity between protein nodes and gene ontology term nodes. Comparative experiments on the human dataset show that HNetGO achieves state-of-the-art performance on the cellular component and molecular function branches.

Topics: Humans; Amino Acid Sequence; Gene Ontology; Molecular Sequence Annotation; Neural Networks, Computer; Protein Interaction Maps

PubMed: 37861172
DOI: 10.1093/bib/bbab556
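
As a rough illustration of the final step described in the HNetGO abstract above (predicting function by measuring similarity between protein nodes and gene ontology term nodes), the sketch below ranks GO terms for one protein by embedding similarity. The abstract does not state which similarity measure the authors use; cosine similarity, the function names, the toy vectors, and the labels `term_A`/`term_B` are all assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_go_terms(protein_vec, go_term_vecs):
    """Score every GO term embedding against one protein embedding, best first.

    protein_vec  -- hypothetical node embedding for a protein
    go_term_vecs -- dict mapping a GO term label to its hypothetical embedding
    """
    scores = {
        term: cosine_similarity(protein_vec, vec)
        for term, vec in go_term_vecs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy two-dimensional embeddings, purely illustrative.
    ranking = rank_go_terms(
        [1.0, 0.0],
        {"term_A": [1.0, 0.0], "term_B": [0.0, 1.0]},
    )
    print(ranking)
```

In a real model the embeddings would come from the trained network rather than being written by hand; the ranking logic is the only part this sketch is meant to convey.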

Impact of ancestral sequence reconstruction on mechanistic and structural enzymology.

Current Opinion in Structural Biology Oct 2023

Review

Authors: Callum R Nicoll, Marta Massari, Marco W Fraaije...

Ancestral sequence reconstruction (ASR) provides insight into the changes within a protein sequence across evolution. More specifically, it can illustrate how specific amino acid changes give rise to different phenotypes within a protein family. Over the last few decades it has established itself as a powerful technique for revealing molecular common denominators that govern enzyme function. Here, we describe the strength of ASR in unveiling catalytic mechanisms and emerging phenotypes for a range of different proteins, also highlighting biotechnological applications the methodology can provide.

Topics: Phylogeny; Evolution, Molecular; Proteins; Amino Acid Sequence; Phenotype

PubMed: 37544113
DOI: 10.1016/j.sbi.2023.102669

Chemically cross-linked hydrogels from repetitive protein arrays.

Journal of Structural Biology Sep 2023

Authors: Rossana Boni, Elizabeth A Blackburn, Dirk-Jan Kleinjan...

Biomaterials for tissue regeneration must mimic the biophysical properties of the native physiological environment. A protein engineering approach allows the generation of protein hydrogels with specific and customised biophysical properties designed to suit a particular physiological environment. Herein, repetitive engineered proteins were successfully designed to form covalent molecular networks with defined physical characteristics able to sustain cell phenotype. Our hydrogel design was made possible by the incorporation of the SpyTag (ST) peptide and multiple repetitive units of the SpyCatcher (SC) protein that spontaneously formed covalent crosslinks upon mixing. Changing the ratios of the protein building blocks (ST:SC) allowed the viscoelastic properties and gelation speeds of the hydrogels to be altered and controlled. The physical properties of the hydrogels could readily be altered further to suit different environments by tuning the key features in the repetitive protein sequence. The resulting hydrogels were designed to allow cell attachment and encapsulation of liver-derived cells. Biocompatibility of the hydrogels was assayed using a HepG2 cell line constitutively expressing GFP. The cells remained viable and continued to express GFP whilst attached or encapsulated within the hydrogel. Our results demonstrate how this genetically encoded approach using repetitive proteins could be applied to bridge engineering biology with nanotechnology, creating a level of biomaterial customisation previously inaccessible.

Topics: Protein Array Analysis; Hydrogels; Proteins; Biocompatible Materials; Amino Acid Sequence

PubMed: 37245604
DOI: 10.1016/j.jsb.2023.107981

Hidden Glutathione Transferases in the Human Genome.

Biomolecules Aug 2023

Authors: Aaron J Oakley

With the development of accurate protein structure prediction algorithms, artificial intelligence (AI) has emerged as a powerful tool in the field of structural biology. AI-based algorithms have been used to analyze large amounts of protein sequence data including the human proteome, complementing experimental structure data found in resources such as the Protein Data Bank. The EBI AlphaFold Protein Structure Database (for example) contains over 200 million structures. In this study, these data have been analyzed to find all human proteins containing (or predicted to contain) the cytosolic glutathione transferase (cGST) fold. A total of 39 proteins were found, including the alpha-, mu-, pi-, sigma-, zeta- and omega-class GSTs, intracellular chloride channels, metaxins, multisynthetase complex components, elongation factor 1 complex components and others. Three broad themes emerge: cGST domains as enzymes, as chloride ion channels and as protein-protein interaction mediators. As the majority of cGSTs are dimers, the AI-based structure prediction algorithm AlphaFold-multimer was used to predict structures of all pairwise combinations of these cGST domains. Potential homo- and heterodimers are described. Experimental biochemical and structure data are used to highlight the strengths and limitations of AI-predicted structures.

Topics: Humans; Glutathione Transferase; Genome, Human; Artificial Intelligence; Algorithms; Amino Acid Sequence

PubMed: 37627305
DOI: 10.3390/biom13081240
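
To give a sense of the scale of "all pairwise combinations" mentioned in the abstract above, the sketch below enumerates unordered pairs, self-pairs included, for 39 items. It assumes one cGST domain per protein, which the abstract does not state, and the labels `cGST_01` to `cGST_39` are hypothetical placeholders rather than real protein names.

```python
from itertools import combinations_with_replacement


def enumerate_pairs(domains):
    """Return all unordered pairs of domains, including self-pairs (homodimers)."""
    return list(combinations_with_replacement(domains, 2))


# 39 hypothetical domain labels, one per protein (an assumption for this sketch).
domains = [f"cGST_{i:02d}" for i in range(1, 40)]

pairs = enumerate_pairs(domains)
homodimers = [p for p in pairs if p[0] == p[1]]
heterodimers = [p for p in pairs if p[0] != p[1]]

# 39 homodimers + 39 * 38 / 2 = 741 heterodimers = 780 pairs in total.
print(len(pairs), len(homodimers), len(heterodimers))
```

Under that assumption a full pairwise screen would involve 780 structure predictions; the actual number in the study may differ if some proteins contribute more than one domain.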

Predicting thermostability difference between cellular protein orthologs.

Bioinformatics (Oxford, England) Aug 2023

Authors: Jianwen Fang

MOTIVATION

Protein thermostability is of great interest, both in theory and in practice.

RESULTS

This study compared orthologous proteins with different cellular thermostability. A large number of physicochemical properties of the proteins were calculated and used to develop a series of machine learning models for predicting cellular thermostability differences between orthologous proteins. Most of the important features in these models are also highly correlated with relative cellular thermostability. A comparison between the present study and a previous comparison of orthologous proteins from thermophilic and mesophilic organisms found that most highly correlated features are consistent across these studies, suggesting they may be important to protein thermostability.

AVAILABILITY AND IMPLEMENTATION

Data freely available for download at https://github.com/fangj3/cellular-protein-thermostability-dataset.

Topics: Amino Acid Sequence; Proteins

PubMed: 37572303
DOI: 10.1093/bioinformatics/btad504
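
The thermostability abstract above describes computing physicochemical properties of proteins and comparing orthologs. A minimal sketch of that kind of feature calculation follows. The abstract does not list the properties used, so the three features here (mean Kyte-Doolittle hydropathy, fraction of charged residues, and length) and the function names are assumptions chosen only for illustration.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
CHARGED = set("DEKR")


def sequence_features(seq):
    """Compute a few simple sequence-derived features (illustrative choice only)."""
    # Keep only standard residues; anything else (e.g. 'X') is ignored.
    residues = [aa for aa in seq.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    return {
        "mean_hydropathy": sum(KYTE_DOOLITTLE[aa] for aa in residues) / n,
        "charged_fraction": sum(aa in CHARGED for aa in residues) / n,
        "length": n,
    }


def feature_difference(seq_a, seq_b):
    """Per-feature difference (a minus b) between two orthologous sequences."""
    feats_a = sequence_features(seq_a)
    feats_b = sequence_features(seq_b)
    return {name: feats_a[name] - feats_b[name] for name in feats_a}


if __name__ == "__main__":
    # Toy sequences, not real orthologs.
    print(feature_difference("MKVLIA", "MKDEEA"))
```

Difference vectors of this kind could serve as inputs to a classifier or regressor; which model and which features the study actually used would need to be checked against the paper and its dataset.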

High-throughput process development from gene cloning to protein production.

Microbial Cell Factories Sep 2023

Review

Authors: Manman Sun, Alex Xiong Gao, Xiuxia Liu...

In the post-genomic era, the demand for faster and more efficient protein production has increased, both in public laboratories and industry. In addition, with the expansion of protein sequences in databases, the range of possible enzymes of interest for a given application is also increasing. Faced with peer competition and with budgetary and time constraints, companies and laboratories must find ways to develop a robust manufacturing process for recombinant protein production. In this review, we explore high-throughput technologies for recombinant protein expression and present a holistic high-throughput process development strategy that spans from genes to proteins. We discuss the challenges that come with this task, the limitations of previous studies, and future research directions.

Topics: Cloning, Molecular; Amino Acid Sequence; Genomics; Laboratories; Recombinant Proteins

PubMed: 37715258
DOI: 10.1186/s12934-023-02184-1

Principles of metabolome conservation in animals.

Proceedings of the National Academy of... Aug 2023

Authors: Orsolya Liska, Gábor Boross, Charles Rocabert...

Metabolite levels shape cellular physiology and disease susceptibility, yet the general principles governing metabolome evolution are largely unknown. Here, we introduce a measure of conservation of individual metabolite levels among related species. By analyzing multispecies tissue metabolome datasets in phylogenetically diverse mammals and fruit flies, we show that conservation varies extensively across metabolites. Three major functional properties (metabolite abundance, essentiality, and association with human diseases) predict conservation, highlighting a striking parallel between the evolutionary forces driving metabolome and protein sequence conservation. Metabolic network simulations recapitulated these general patterns and revealed that abundant metabolites are highly conserved due to their strong coupling to key metabolic fluxes in the network. Finally, we show that biomarkers of metabolic diseases can be distinguished from other metabolites simply based on evolutionary conservation, without requiring any prior clinical knowledge. Overall, this study uncovers simple rules that govern metabolic evolution in animals and implies that most tissue metabolome differences between species are permitted, rather than favored, by natural selection. More broadly, our work paves the way toward using evolutionary information to identify biomarkers, as well as to detect pathogenic metabolome alterations in individual patients.

Topics: Animals; Humans; Metabolome; Amino Acid Sequence; Drosophila; Knowledge; Mammals

PubMed: 37603743
DOI: 10.1073/pnas.2302147120

Rapid multiple protein sequence search by parallel and heterogeneous computation.

Bioinformatics (Oxford, England) Mar 2024

Authors: Jiefu Li, Ziyuan Wang, Xuwei Fan...

MOTIVATION

Protein sequence database search and multiple sequence alignment generation are fundamental tasks in many bioinformatics analyses. As the volume of sequence data continues to grow rapidly, there is an increasing need for efficient and scalable multiple-sequence query algorithms that can search very large databases without excessive time and computational cost.

RESULTS

We introduce Chorus, a novel protein sequence query system that leverages a parallel model and a heterogeneous computation architecture to enable users to query thousands of protein sequences concurrently against large protein databases on a desktop workstation. Chorus achieves over 100× speedup over BLASTP without sacrificing sensitivity. We demonstrate the utility of Chorus through a case study analyzing a ∼1.5-TB large-scale metagenomic dataset for novel CRISPR-Cas protein discovery within 30 min.

AVAILABILITY AND IMPLEMENTATION

Chorus is open-source and its code repository is available at https://github.com/Bio-Acc/Chorus.

Topics: Software; Algorithms; Amino Acid Sequence; Proteins; Databases, Protein

PubMed: 38547405
DOI: 10.1093/bioinformatics/btae151