- Science Advances Jan 2022
Tumors display widespread transcriptome alterations, but the full repertoire of isoform-level alternative splicing in cancer is unknown. We developed a long-read (LR) RNA sequencing and analytical platform that identifies and annotates full-length isoforms and infers tumor-specific splicing events. Application of this platform to breast cancer samples identifies thousands of previously unannotated isoforms; ~30% affect protein coding exons and are predicted to alter protein localization and function. We performed extensive cross-validation with -omics datasets to support transcription and translation of novel isoforms. We identified 3059 breast tumor–specific splicing events, including 35 that are significantly associated with patient survival. Of these, 21 are absent from GENCODE and 10 are enriched in specific breast cancer subtypes. Together, our results demonstrate the complexity, cancer subtype specificity, and clinical relevance of previously unidentified isoforms and splicing events in breast cancer that are only annotatable by LR-seq and provide a rich resource of immuno-oncology therapeutic targets.
Topics: Alternative Splicing; Breast Neoplasms; Female; High-Throughput Nucleotide Sequencing; Humans; Protein Isoforms; Sequence Analysis, RNA; Transcriptome
PubMed: 35044822
DOI: 10.1126/sciadv.abg6711
- Nucleic Acids Research May 2023
In situ capturing technologies add tissue context to gene expression data, with the potential of providing a greater understanding of complex biological systems. However, splicing variants and full-length sequence heterogeneity cannot be characterized at spatial resolution with current transcriptome profiling methods. To that end, we introduce spatial isoform transcriptomics (SiT), an explorative method for characterizing spatial isoform variation and sequence heterogeneity using long-read sequencing. We show in mouse brain how SiT can be used to profile isoform expression and sequence heterogeneity in different areas of the tissue. SiT reveals regional isoform switching of Plp1 gene between different layers of the olfactory bulb, and the use of external single-cell data allows the nomination of cell types expressing each isoform. Furthermore, SiT identifies differential isoform usage for several major genes implicated in brain function (Snap25, Bin1, Gnas) that are independently validated by in situ sequencing. SiT also provides for the first time an in-depth A-to-I RNA editing map of the adult mouse brain. Data exploration can be performed through an online resource (https://www.isomics.eu), where isoform expression and RNA editing can be visualized in a spatial context.
Topics: Animals; Mice; Alternative Splicing; Sequence Analysis, RNA; Protein Isoforms; Gene Expression Profiling; Gene Expression; Transcriptome
PubMed: 36928528
DOI: 10.1093/nar/gkad169
- Nature May 2023
Mitotic defects activate the spindle-assembly checkpoint, which inhibits the anaphase-promoting complex co-activator CDC20 to induce a prolonged cell cycle arrest. Once errors are corrected, the spindle-assembly checkpoint is silenced, allowing anaphase onset to occur. However, in the presence of persistent unresolvable errors, cells can undergo 'mitotic slippage', exiting mitosis into a tetraploid G1 state and escaping the cell death that results from a prolonged arrest. The molecular logic that enables cells to balance these duelling mitotic arrest and slippage behaviours remains unclear. Here we demonstrate that human cells modulate the duration of their mitotic arrest through the presence of conserved, alternative CDC20 translational isoforms. Downstream translation initiation results in a truncated CDC20 isoform that is resistant to spindle-assembly-checkpoint-mediated inhibition and promotes mitotic exit even in the presence of mitotic perturbations. Our study supports a model in which the relative levels of CDC20 translational isoforms control the duration of mitotic arrest. During a prolonged mitotic arrest, new protein synthesis and differential CDC20 isoform turnover create a timer, with mitotic exit occurring once the truncated Met43 isoform achieves sufficient levels. Targeted molecular changes or naturally occurring cancer mutations that alter CDC20 isoform ratios or its translational control modulate mitotic arrest duration and anti-mitotic drug sensitivity, with potential implications for the diagnosis and treatment of human cancers.
Topics: Humans; Cdc20 Proteins; Protein Biosynthesis; Protein Isoforms; Spindle Apparatus; Peptide Chain Initiation, Translational; M Phase Cell Cycle Checkpoints
PubMed: 37100900
DOI: 10.1038/s41586-023-05943-7
- Biomolecules Aug 2022 (Review)
The heat shock protein 90 (Hsp90) is a molecular chaperone and a key regulator of proteostasis under both physiological and stress conditions. In mammals, there are two cytosolic Hsp90 isoforms: Hsp90α and Hsp90β. These two isoforms are 85% identical and encoded by two different genes. Hsp90β is constitutively expressed and essential for early mouse development, while Hsp90α is stress-inducible and not necessary for survivability. These two isoforms are known to have largely overlapping functions and to interact with a large fraction of the proteome. To what extent there are isoform-specific functions at the protein level has only relatively recently begun to emerge. There are studies indicating that one isoform is more involved in the functionality of a specific tissue or cell type. Moreover, in many diseases, functionally altered cells appear to be more dependent on one particular isoform. This leaves space for designing therapeutic strategies in an isoform-specific way, which may overcome the unfavorable outcome of pan-Hsp90 inhibition encountered in previous clinical trials. For this to succeed, isoform-specific functions must be understood in more detail. In this review, we summarize the available information on isoform-specific functions of mammalian Hsp90 and connect it to possible clinical applications.
Topics: Animals; HSP90 Heat-Shock Proteins; Mice; Molecular Chaperones; Protein Isoforms; Proteome
PubMed: 36139005
DOI: 10.3390/biom12091166
- Nucleic Acids Research Jan 2022
APPRIS (https://appris.bioinfo.cnio.es) is a well-established database housing annotations for protein isoforms for a range of species. APPRIS selects principal isoforms based on protein structure and function features and on cross-species conservation. Most coding genes produce a single main protein isoform and the principal isoforms chosen by the APPRIS database best represent this main cellular isoform. Human genetic data, experimental protein evidence and the distribution of clinical variants all support the relevance of APPRIS principal isoforms. APPRIS annotations and principal isoforms have now been expanded to 10 model organisms. In this paper we highlight the most recent updates to the database. APPRIS annotations have been generated for two new species, cow and chicken, the protein structural information has been augmented with reliable models from the EMBL-EBI AlphaFold database, and we have substantially expanded the confirmatory proteomics evidence available for the human genome. The most significant change in APPRIS has been the implementation of TRIFID functional isoform scores. TRIFID functional scores are assigned to all splice isoforms, and APPRIS uses the TRIFID functional scores and proteomics evidence to determine principal isoforms when core methods cannot.
Topics: Animals; Cattle; Chickens; Databases, Protein; Humans; Protein Conformation; Protein Isoforms; Proteins; Proteomics
PubMed: 34755885
DOI: 10.1093/nar/gkab1058
- Genome Biology Mar 2022
BACKGROUND
The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms.
RESULTS
We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis.
CONCLUSIONS
Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
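As a rough illustration of how a full-length transcript can be turned into a predicted protein isoform, the sketch below translates the longest complete open reading frame of a transcript sequence. It is a minimal sketch, not the published Nextflow pipeline: the function name `longest_orf_protein` is hypothetical, and it assumes the standard genetic code, forward-strand scanning only, and that ORFs lacking an in-frame stop codon are discarded.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf_protein(transcript: str) -> str:
    """Return the protein from the longest complete ORF (ATG ... stop).

    Hypothetical helper for illustration; returns "" if no complete ORF exists.
    """
    seq = transcript.upper().replace("U", "T")
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        protein = []
        for i in range(start, len(seq) - 2, 3):
            # Unknown codons (e.g. containing N) are translated as "X".
            aa = CODON_TABLE.get(seq[i:i + 3], "X")
            if aa == "*":
                break
            protein.append(aa)
        else:
            # No in-frame stop codon was found: skip this incomplete ORF.
            continue
        if len(protein) > len(best):
            best = "".join(protein)
    return best
```

Predicted proteins of this kind could then be collected into a sample-specific search database for MS-based proteomics; how the actual pipeline selects ORFs may differ from this simplification.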
Topics: Alternative Splicing; Humans; Protein Isoforms; Proteogenomics; Proteomics; Sequence Analysis, RNA; Transcriptome
PubMed: 35241129
DOI: 10.1186/s13059-022-02624-y
- Annual Review of Biomedical Data Science Aug 2023 (Review)
Alternative splicing is pivotal to the regulation of gene expression and protein diversity in eukaryotic cells. The detection of alternative splicing events requires specific omics technologies. Although short-read RNA sequencing has successfully supported a plethora of investigations on alternative splicing, the emerging technologies of long-read RNA sequencing and top-down mass spectrometry open new opportunities to identify alternative splicing and protein isoforms with less ambiguity. Here, we summarize improvements in short-read RNA sequencing for alternative splicing analysis, including percent splicing index estimation and differential analysis. We also review the computational methods used in top-down proteomics analysis regarding proteoform identification, including the construction of databases of protein isoforms and statistical analyses of search results. While many improvements in sequencing and computational methods will result from emerging technologies, there should be future endeavors to increase the effectiveness, integration, and proteome coverage of alternative splicing events.
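For readers unfamiliar with the percent splicing index mentioned above, a minimal sketch of the basic estimate follows. It assumes the simplest definition, the share of junction reads supporting exon inclusion out of all informative reads, and the function names are hypothetical. Published tools typically also normalize for effective junction length and model uncertainty, which this sketch omits.

```python
def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> float:
    """Estimate PSI (0-100) for one splicing event from read counts.

    Simplified: no length normalization and no confidence interval.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no informative reads for this event")
    return 100.0 * inclusion_reads / total


def delta_psi(psi_condition_a: float, psi_condition_b: float) -> float:
    """Difference in PSI between two conditions (a minus b)."""
    return psi_condition_a - psi_condition_b
```

For example, 30 inclusion reads and 10 exclusion reads give a PSI of 75, and comparing that with a PSI of 50 in a second condition gives a difference of 25 percentage points.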
Topics: Proteomics; Transcriptome; Protein Isoforms; Alternative Splicing; RNA Splicing
PubMed: 37561601
DOI: 10.1146/annurev-biodatasci-020722-044021
- Journal of Molecular Biology Aug 2017 (Review)
Genome-wide studies of aging have identified subsets of genes that show age-related changes in expression. Although the types of genes that are age regulated vary among different tissues and organisms, some patterns emerge from these large data sets. First, aging is associated with a broad induction of stress response pathways, although the specific genes and pathways involved differ depending on cell type and species. In contrast, a wide variety of functional classes of genes are downregulated with age, often including tissue-specific genes. Although the upregulation of age-regulated genes is likely to be governed by stress-responsive transcription factors, questions remain as to why particular genes are susceptible to age-related transcriptional decline. Here, we discuss recent findings showing that splicing is misregulated with age. While defects in splicing could lead to changes in protein isoform levels, they could also impact gene expression through nonsense-mediated decay of intron-retained transcripts. The discovery that splicing is misregulated with age suggests that other aspects of gene expression, such as transcription elongation, termination, and polyadenylation, must also be considered as potential mechanisms for age-related changes in transcript levels. Moreover, the considerable variation between genome-wide aging expression studies indicates that there is a critical need to analyze the transcriptional signatures of aging in single-cell types rather than whole tissues. Since age-associated decreases in gene expression could contribute to a progressive decline in cellular function, understanding the mechanisms that determine the aging transcriptome provides a potential target to extend healthy cellular lifespan.
Topics: Aging; Animals; Gene Expression Profiling; Gene Expression Regulation; Humans; Protein Isoforms; RNA Splicing
PubMed: 28684248
DOI: 10.1016/j.jmb.2017.06.019
- Genome Biology Jun 2002 (Review)
Multiple members of the 14-3-3 protein family have been found in all eukaryotes so far investigated, yet they are apparently absent from prokaryotes. The major native forms of 14-3-3s are homo- and hetero-dimers, the biological functions of which are to interact physically with specific client proteins and thereby effect a change in the client. As a result, 14-3-3s are involved in a vast array of processes such as the response to stress, cell-cycle control, and apoptosis, serving as adapters, activators, and repressors. There are currently 133 full-length sequences available in GenBank for this highly conserved protein family. A phylogenetic tree based on the conserved middle core region of the protein sequences shows that, in plants, the 14-3-3 family can be divided into two clearly defined groups. The core region encodes an amphipathic groove that binds the multitude of client proteins that have conserved 14-3-3-recognition sequences. The amino and carboxyl termini of 14-3-3 proteins are much more divergent than the core region and may interact with isoform-specific client proteins and/or confer specialized subcellular and tissue localization.
Topics: 14-3-3 Proteins; Amino Acid Sequence; Animals; Evolution, Molecular; Models, Molecular; Molecular Sequence Data; Phylogeny; Protein Isoforms; Sequence Alignment; Tyrosine 3-Monooxygenase
PubMed: 12184815
DOI: 10.1186/gb-2002-3-7-reviews3010
- EMBO Reports Dec 2021 (Review)
All living organisms have developed processes to sense and address environmental changes to maintain a stable internal state (homeostasis). When activated, the p53 tumour suppressor maintains cell and organ integrity and functions in response to homeostasis disruptors (stresses) such as infection, metabolic alterations and cellular damage. Thus, p53 plays a fundamental physiological role in maintaining organismal homeostasis. The TP53 gene encodes a network of proteins (p53 isoforms) with similar and distinct biochemical functions. The p53 network carries out multiple biological activities enabling cooperation between individual cells required for long-term survival of multicellular organisms (animals) in response to an ever-changing environment caused by mutation, infection, metabolic alteration or damage. In this review, we suggest that the p53 network has evolved as an adaptive response to pathogen infections and other environmental selection pressures.
Topics: Animals; Genes, p53; Homeostasis; Infections; Mutation; Protein Isoforms; Stress, Physiological; Tumor Suppressor Protein p53
PubMed: 34779563
DOI: 10.15252/embr.202153085