-
Biochemical Society Transactions Jun 2023
Review
RAS proteins are small GTPases that transduce signals from membrane receptors to signaling pathways that regulate growth and differentiation. Four RAS proteins are encoded by three genes: HRAS, KRAS, and NRAS. Among them, KRAS is mutated in human cancer more frequently than any other oncogene. The KRAS pre-mRNA is alternatively spliced to generate two transcripts, KRAS4A and KRAS4B, which encode distinct proto-oncoproteins that differ almost exclusively in their C-terminal hypervariable regions (HVRs), which control subcellular trafficking and membrane association. The KRAS4A isoform arose 475 million years ago in jawed vertebrates and has persisted in all vertebrates ever since, strongly suggesting non-overlapping functions of the splice variants. Because KRAS4B is expressed at higher levels in most tissues, it has been considered the principal KRAS isoform. However, emerging evidence for KRAS4A expression in tumors and for splice variant-specific interactions and functions has sparked interest in this gene product. Among these findings, the KRAS4A-specific regulation of hexokinase I is a stark example. The aim of this mini-review is to provide an overview of the origin and differential functions of the two splice variants of KRAS.
Topics: Animals; Humans; Proto-Oncogene Proteins p21(ras); Neoplasms; Protein Isoforms; Signal Transduction; ras Proteins; Mutation
PubMed: 37222266
DOI: 10.1042/BST20221347
-
Annual Review of Biomedical Data Science Aug 2023
Review
Alternative splicing is pivotal to the regulation of gene expression and protein diversity in eukaryotic cells. The detection of alternative splicing events requires specific omics technologies. Although short-read RNA sequencing has successfully supported a plethora of investigations on alternative splicing, the emerging technologies of long-read RNA sequencing and top-down mass spectrometry open new opportunities to identify alternative splicing and protein isoforms with less ambiguity. Here, we summarize improvements in short-read RNA sequencing for alternative splicing analysis, including percent splicing index estimation and differential analysis. We also review the computational methods used in top-down proteomics analysis regarding proteoform identification, including the construction of databases of protein isoforms and statistical analyses of search results. While many improvements in sequencing and computational methods will result from emerging technologies, there should be future endeavors to increase the effectiveness, integration, and proteome coverage of alternative splicing events.
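To make the percent splicing index mentioned in this abstract concrete, the following minimal sketch computes a commonly used form of it (often called percent spliced-in, PSI) for one exon-skipping event from short-read counts. The function name, the parameter names, and the optional length normalisation are illustrative assumptions and are not taken from the review itself.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads,
                       inclusion_len=1.0, exclusion_len=1.0):
    """Estimate PSI for one alternative splicing event (hypothetical helper).

    inclusion_reads / exclusion_reads: read counts supporting the isoform
    that includes or skips the alternative region.
    inclusion_len / exclusion_len: assumed effective lengths used to
    normalise the counts; leave both at 1.0 to use raw counts.
    Returns a value in [0, 1], or None when no reads cover the event.
    """
    inclusion = inclusion_reads / inclusion_len
    exclusion = exclusion_reads / exclusion_len
    total = inclusion + exclusion
    if total == 0:
        return None  # PSI is undefined without supporting reads
    return inclusion / total
```

Returning None for uncovered events keeps them distinguishable from events with a genuine PSI of 0, which matters for any downstream differential analysis.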
Topics: Proteomics; Transcriptome; Protein Isoforms; Alternative Splicing; RNA Splicing
PubMed: 37561601
DOI: 10.1146/annurev-biodatasci-020722-044021
-
Biomolecules Aug 2022
Review
The heat shock protein 90 (Hsp90) is a molecular chaperone and a key regulator of proteostasis under both physiological and stress conditions. In mammals, there are two cytosolic Hsp90 isoforms: Hsp90α and Hsp90β. These two isoforms are 85% identical and encoded by two different genes. Hsp90β is constitutively expressed and essential for early mouse development, while Hsp90α is stress-inducible and not necessary for survivability. These two isoforms are known to have largely overlapping functions and to interact with a large fraction of the proteome. To what extent there are isoform-specific functions at the protein level has only relatively recently begun to emerge. There are studies indicating that one isoform is more involved in the functionality of a specific tissue or cell type. Moreover, in many diseases, functionally altered cells appear to be more dependent on one particular isoform. This leaves space for designing therapeutic strategies in an isoform-specific way, which may overcome the unfavorable outcome of pan-Hsp90 inhibition encountered in previous clinical trials. For this to succeed, isoform-specific functions must be understood in more detail. In this review, we summarize the available information on isoform-specific functions of mammalian Hsp90 and connect it to possible clinical applications.
Topics: Animals; HSP90 Heat-Shock Proteins; Mice; Molecular Chaperones; Protein Isoforms; Proteome
PubMed: 36139005
DOI: 10.3390/biom12091166
-
Current Opinion in Neurobiology Aug 2020
Review
The synaptotagmin family of molecules is known for regulating calcium-dependent membrane fusion events. Mice and humans express 17 synaptotagmin isoforms, and most studies have focused on isoforms 1, 2, and 7, which are involved in synaptic vesicle exocytosis. Recent work has highlighted how brain function relies on additional isoforms, with roles in, for example, postsynaptic receptor endocytosis, vesicle trafficking, membrane repair, synaptic plasticity, and protection against neurodegeneration, in addition to the traditional concept of synaptotagmin-mediated neurotransmitter release, in neurons as well as glia and at different timepoints. In fact, it is not uncommon for the same isoform to feature several splice isoforms, form homo- and heterodimers, and function in different subcellular locations and cell types. This review aims to highlight the diversity of synaptotagmins, offer a concise summary of key findings on all isoforms, and discuss different ways of grouping these.
Topics: Animals; Calcium; Exocytosis; Humans; Membrane Fusion; Mice; Nerve Tissue Proteins; Protein Isoforms; Synaptotagmin I; Synaptotagmins
PubMed: 32663762
DOI: 10.1016/j.conb.2020.04.006
-
Briefings in Functional Genomics Mar 2024
Review
Following the central dogma of molecular biology, gene expression heterogeneity can aid in predicting and explaining the wide variety of protein products, functions and, ultimately, heterogeneity in phenotypes. There is currently overlapping terminology used to describe the types of diversity in gene expression profiles, and overlooking these nuances can misrepresent important biological information. Here, we describe transcriptome diversity as a measure of the heterogeneity in (1) the expression of all genes within a sample or a single gene across samples in a population (gene-level diversity) or (2) the isoform-specific expression of a given gene (isoform-level diversity). We first give an overview of modulators and quantification of transcriptome diversity at the gene level. Then, we discuss the role alternative splicing plays in driving transcript isoform-level diversity and how it can be quantified. Additionally, we give an overview of computational resources for calculating gene-level and isoform-level diversity for high-throughput sequencing data. Finally, we discuss future applications of transcriptome diversity. This review provides a comprehensive overview of how gene expression diversity arises, and how measuring it provides a more complete picture of heterogeneity across proteins, cells, tissues, organisms and species.
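As an illustration of how isoform-level diversity for a single gene might be quantified, the sketch below uses Shannon entropy over isoform expression proportions. Shannon entropy is one widely used diversity measure and is chosen here as an assumption; the abstract does not state which metrics the review covers, and the function name is hypothetical.

```python
import math


def isoform_entropy(isoform_expression):
    """Shannon entropy (in bits) of isoform usage for one gene.

    isoform_expression: non-negative expression values (for example
    counts or TPM), one per isoform of the gene. This input format is
    an assumption made for illustration.
    Returns 0.0 when a single isoform carries all expression and
    log2(n) when n isoforms are expressed equally.
    """
    total = sum(isoform_expression)
    if total <= 0:
        return 0.0  # gene not expressed: no measurable diversity
    proportions = [x / total for x in isoform_expression if x > 0]
    return sum(-p * math.log2(p) for p in proportions)
```

For example, a gene with two equally expressed isoforms scores 1 bit, while a gene dominated by one isoform scores close to 0.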
Topics: Transcriptome; Gene Expression Profiling; Protein Isoforms; Alternative Splicing; Sequence Analysis, RNA; High-Throughput Nucleotide Sequencing
PubMed: 37225889
DOI: 10.1093/bfgp/elad019
-
Nature Methods Apr 2022
Topics: Amino Acid Sequence; Protein Isoforms
PubMed: 35396478
DOI: 10.1038/s41592-022-01472-9
-
Bioinformatics (Oxford, England) Apr 2020
MOTIVATION
Accurate estimation of transcript isoform abundance is critical for downstream transcriptome analyses and can lead to precise molecular mechanisms for understanding complex human diseases, like cancer. Simplex mRNA Sequencing (RNA-Seq) based isoform quantification approaches are facing the challenges of inherent sampling bias and unidentifiable read origins. A large-scale experiment shows that the consistency between RNA-Seq and other mRNA quantification platforms is relatively low at the isoform level compared to the gene level. In this project, we developed a platform-integrated model for transcript quantification (IntMTQ) to improve the performance of RNA-Seq on isoform expression estimation. IntMTQ, which benefits from the mRNA expressions reported by the other platforms, provides more precise RNA-Seq-based isoform quantification and leads to more accurate molecular signatures for disease phenotype prediction.
RESULTS
In the experiments to assess the quality of isoform expression estimated by IntMTQ, we designed three tasks for clustering and classification of 46 cancer cell lines with four different mRNA quantification platforms, including NanoString's newly developed nCounter technology. The results demonstrate that the isoform expressions learned by IntMTQ consistently provide more and better molecular features for downstream analyses compared with five baseline algorithms that consider RNA-Seq data only. An independent RT-qPCR experiment on seven genes in twelve cancer cell lines showed that IntMTQ improved overall transcript quantification. The platform-integrated algorithms could be applied to large-scale cancer studies, such as The Cancer Genome Atlas (TCGA), with both RNA-Seq and array-based platforms available.
AVAILABILITY AND IMPLEMENTATION
Source code is available at: https://github.com/CompbioLabUcf/IntMTQ.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Algorithms; Gene Expression Profiling; Humans; Protein Isoforms; RNA Isoforms; RNA, Messenger; Sequence Analysis, RNA; Software
PubMed: 31834359
DOI: 10.1093/bioinformatics/btz932
-
Genome Biology Mar 2022
BACKGROUND
The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms.
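The idea of predicting a protein isoform from a full-length transcript can be sketched as a naive longest-ORF search followed by translation with the standard genetic code. This is a simplified stand-in for illustration only: the function name is hypothetical, and it is not the ORF-calling or database-construction method used by the authors' pipeline.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def longest_orf_protein(transcript):
    """Return the longest ATG-to-stop protein across the three forward frames.

    transcript: cDNA or mRNA sequence (U is treated as T). Codons with
    ambiguous bases are translated as 'X'. ORFs without an in-frame stop
    codon are ignored. Returns '' when no complete ORF is found.
    """
    seq = transcript.upper().replace("U", "T")
    best = ""
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        translated = "".join(CODON_TABLE.get(c, "X") for c in codons)
        # Every segment except the last one ends at a stop codon.
        for segment in translated.split("*")[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best
```

Applying such a function to each sample-matched transcript and collecting the unique protein sequences would yield a toy sample-specific search database, which is the general motivation described in the abstract.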
RESULTS
We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis.
CONCLUSIONS
Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
Topics: Alternative Splicing; Humans; Protein Isoforms; Proteogenomics; Proteomics; Sequence Analysis, RNA; Transcriptome
PubMed: 35241129
DOI: 10.1186/s13059-022-02624-y
-
Bioinformatics (Oxford, England) Feb 2022
MOTIVATION
RNA expression at the isoform level is biologically more informative than at the gene level and can potentially reveal cellular subsets and corresponding biomarkers that are not visible at the gene level. However, due to the strong 3' bias of the sequencing protocol, mRNA quantification for high-throughput single-cell RNA sequencing such as 10× Genomics Chromium Single Cell 3' is currently performed at the gene level.
RESULTS
We have developed an isoform-level quantification method for high-throughput single-cell RNA sequencing by exploiting the concepts of transcription clusters and isoform paralogs. The method, called Scasa, compares well in simulations against competing approaches including Alevin, Cellranger, Kallisto, Salmon, Terminus and STARsolo at both isoform- and gene-level expression. The reanalysis of a CITE-Seq dataset with isoform-based Scasa reveals a subgroup of CD14 monocytes missed by gene-based methods.
AVAILABILITY AND IMPLEMENTATION
An implementation of Scasa, including source code, documentation, tutorials and test data supporting this study, is available at GitHub: https://github.com/eudoraleer/scasa and Zenodo: https://doi.org/10.5281/zenodo.5712503.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Gene Expression Profiling; Sequence Analysis, RNA; Protein Isoforms; Software; RNA, Messenger; RNA
PubMed: 34864849
DOI: 10.1093/bioinformatics/btab807
-
Cell Reports Methods Sep 2021
Review
Transcription start site (TSS) selection influences transcript stability and translation as well as protein sequence. Alternative TSS usage is pervasive in organismal development, is a major contributor to transcript isoform diversity in humans, and is frequently observed in human diseases including cancer. In this review, we discuss the breadth of techniques that have been used to globally profile TSSs and the resulting insights into gene regulation, as well as future prospects in this area of inquiry.
Topics: Humans; Promoter Regions, Genetic; Gene Expression Regulation; Protein Isoforms
PubMed: 34632443
DOI: 10.1016/j.crmeth.2021.100081