-
BMC Bioinformatics May 2021
Review
BACKGROUND
Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses and has been an area of active development since the beginning. The fundamental difficulty stems from the fact that RNA transcripts are long, while RNA-Seq reads are short.
RESULTS
Here we use simulated benchmarking data that reflects many properties of real data, including polymorphisms, intron signal and non-uniform coverage, allowing for systematic comparative analyses of isoform quantification accuracy and its impact on differential expression analysis. Genome, transcriptome and pseudo alignment-based methods are included; and a simple approach is included as a baseline control.
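One common way to score quantification accuracy against simulated truth is a rank correlation between true and estimated isoform abundances. The abstract does not state which metric the authors used, so the following is only a minimal sketch assuming Spearman correlation; the function names and the toy values are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(truth, estimate):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(truth), ranks(estimate))


# Hypothetical abundances for five isoforms (simulated truth vs. one method).
true_counts = [120.0, 0.0, 35.0, 35.0, 800.0]
estimated_counts = [98.0, 4.0, 50.0, 20.0, 760.0]
print(spearman(true_counts, estimated_counts))
```

A rank-based score is insensitive to the scale of the estimates, which makes it convenient when methods report different units; it does not capture absolute error, so a benchmark would usually pair it with a second measure.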
CONCLUSIONS
Salmon, kallisto, RSEM, and Cufflinks exhibit the highest accuracy on idealized data, while on more realistic data they do not perform dramatically better than the simple approach. We determine the structural parameters with the greatest impact on quantification accuracy to be length and sequence compression complexity and not so much the number of isoforms. The effect of incomplete annotation on performance is also investigated. Overall, the tested methods show sufficient divergence from the truth to suggest that full-length isoform quantification and isoform level DE should still be employed selectively.
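The "sequence compression complexity" mentioned above can be approximated by how well a transcript sequence compresses: repetitive sequence compresses strongly, while diverse sequence does not. The abstract does not say how the authors computed it, so this is a sketch under the assumption that a zlib compression ratio is an acceptable proxy; `compression_complexity` is a hypothetical helper name.

```python
import zlib


def compression_complexity(sequence: str) -> float:
    """Ratio of compressed to raw size; lower means more repetitive.

    Assumption: zlib (DEFLATE) at maximum level serves as the compressor.
    """
    raw = sequence.upper().encode("ascii")
    if not raw:
        raise ValueError("sequence must be non-empty")
    return len(zlib.compress(raw, 9)) / len(raw)


# A homopolymer run compresses far better than a mixed sequence.
print(compression_complexity("A" * 1000))
print(compression_complexity("ACGTTGCAAGCTTAGGCATCGATCCGATAGCTTACG" * 3))
```

Very short sequences are dominated by the fixed overhead of the compressed stream, so the ratio is only meaningful for sequences of at least a few hundred bases.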
Topics: Gene Expression Profiling; Protein Isoforms; RNA-Seq; Sequence Analysis, RNA; Transcriptome
PubMed: 34034652
DOI: 10.1186/s12859-021-04198-1 -
Bioinformatics (Oxford, England) Feb 2022
MOTIVATION
RNA expression at isoform level is biologically more informative than at gene level and can potentially reveal cellular subsets and corresponding biomarkers that are not visible at gene level. However, owing to the strong 3' bias of the sequencing protocol, mRNA quantification for high-throughput single-cell RNA sequencing platforms such as 10x Genomics Chromium Single Cell 3' is currently performed at the gene level.
RESULTS
We have developed an isoform-level quantification method for high-throughput single-cell RNA sequencing by exploiting the concepts of transcription clusters and isoform paralogs. The method, called Scasa, compares well in simulations against competing approaches including Alevin, Cellranger, Kallisto, Salmon, Terminus and STARsolo at both isoform- and gene-level expression. The reanalysis of a CITE-Seq dataset with isoform-based Scasa reveals a subgroup of CD14 monocytes missed by gene-based methods.
AVAILABILITY AND IMPLEMENTATION
Implementation of Scasa including source code, documentation, tutorials and test data supporting this study is available at Github: https://github.com/eudoraleer/scasa and Zenodo: https://doi.org/10.5281/zenodo.5712503.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Gene Expression Profiling; Sequence Analysis, RNA; Protein Isoforms; Software; RNA, Messenger; RNA
PubMed: 34864849
DOI: 10.1093/bioinformatics/btab807 -
Cells Dec 2023
Review
Alternative splicing changes are closely linked to aging, though it remains unclear if they are drivers or effects. As organisms age, splicing patterns change, varying gene isoform levels and functions. These changes may contribute to aging alterations rather than just reflect declining RNA quality control. Three main splicing types (intron retention, cassette exons, and cryptic exons) play key roles in age-related complexity. These events modify protein domains and increase nonsense-mediated decay, shifting protein isoform levels and functions. This may drive aging or serve as a biomarker. Fluctuations in splicing factor expression also occur with aging, and somatic mutations in splicing genes can promote aging and age-related disease. The interplay between splicing and aging has major implications for aging biology, though differentiating correlation and causation remains challenging. Declaring a splicing factor or event as a driver requires comprehensive evaluation of the associated molecular and physiological changes. A greater understanding of how RNA splicing machinery and downstream targets are impacted by aging is essential to conclusively establish the role of splicing in driving aging. This is a promising area with key implications for understanding aging, developing novel therapeutic options, and ultimately increasing the healthy human lifespan.
Topics: Humans; Alternative Splicing; RNA, Messenger; Protein Isoforms; RNA Splicing Factors; Aging; Nonsense Mediated mRNA Decay
PubMed: 38132139
DOI: 10.3390/cells12242819 -
International Journal of Molecular... Nov 2018
Review
The insulin receptor (IR) mediates both metabolic and mitogenic effects, especially when it is overexpressed or in clinical conditions with compensatory hyperinsulinemia caused by resistance of the metabolic pathway, such as obesity and diabetes. In many cancers, IR is overexpressed preferentially as the IR-A isoform, derived by alternative splicing of exon 11. IR-A overexpression and the increased IR-A:IR-B ratio promote the mitogenic response of cancer cells to insulin and IGF-2, which is produced locally by both epithelial and stromal cancer cells. In cancer, IR-A isoform predominance may result from dysregulation at both the transcriptional and post-transcriptional levels, involving splicing factors, non-coding RNAs and protein degradation. The mechanisms that regulate IR isoform expression are complex and not fully understood. IR isoform overexpression may play a role in cancer cell stemness, in tumor progression and in resistance to targeted therapies. From a clinical point of view, IR-A overexpression in cancer may be a determining factor in resistance to IGF-1R-targeted therapies. IR isoform expression in cancers may serve as a predictive biomarker, and co-targeting IGF-1R and IR-A may represent a new, more efficacious treatment strategy.
Topics: Animals; Gene Expression Regulation, Neoplastic; Humans; Models, Biological; Neoplasms; Protein Isoforms; Receptor, Insulin
PubMed: 30453495
DOI: 10.3390/ijms19113615 -
G3 (Bethesda, Md.) Nov 2022
Long-read sequencing technologies such as isoform sequencing can generate highly accurate sequences of full-length mRNA transcript isoforms. Such long-read transcriptomics may be especially useful in investigations of lymphocyte functional plasticity as it relates to human health and disease. However, no long-read isoform-aware reference transcriptomes of human circulating lymphocytes are readily available despite being valuable as benchmarks in a variety of transcriptomic studies. To begin to fill this gap, we purified 4 lymphocyte populations (CD4+ T, CD8+ T, NK, and Pan B cells) from the peripheral blood of a healthy male donor and obtained high-quality RNA (RIN > 8) for isoform sequencing and parallel RNA-Seq analyses. Many novel polyadenylated transcript isoforms, supported by both isoform sequencing and RNA-Seq data, were identified within each sample. The datasets met several metrics of high quality and have been deposited to the Gene Expression Omnibus database (GSE202327, GSE202328, GSE202329) as both raw and processed files to serve as long-read reference transcriptomes for future studies of human circulating lymphocytes.
Topics: Humans; Male; Transcriptome; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Protein Isoforms; Sequence Analysis, RNA; Lymphocyte Subsets
PubMed: 36161486
DOI: 10.1093/g3journal/jkac253 -
The International Journal of... May 2019
Review
STAT3β is an isoform of STAT3 (signal transducer and activator of transcription 3) that differs from the STAT3α isoform by the replacement of the C-terminal 55 amino acid residues with 7 specific residues. The constitutive activation of STAT3α plays a pivotal role in the activation of oncogenic pathways, such as cell proliferation, maturation and survival, while STAT3β is often referred to as a dominant-negative regulator of cancer. STAT3β reveals a "spongy cushion" effect through its cooperation with STAT3α or forms a ternary complex with other co-activators. Especially in tumour cells, relatively high levels of STAT3β lead to some favourable changes. However, there are still many mechanisms that have not been clearly explained in contrast to STAT3α, such as STAT3β nuclear retention, more stable heterodimers and the prolonged Y705 phosphorylation. In addition to its transcriptional activities, STAT3β may also function in the cytosol with respect to the mitochondria, cytoskeleton rearrangements and metastasis of cancer cells. In this review, we summarize the mechanisms that underlie the unique roles of STAT3β combined with total STAT3 to enlighten and draw the attention of researchers studying STAT3 and discuss some interesting questions that warrant answers.
Topics: Animals; Humans; Neoplasms; Protein Isoforms; STAT3 Transcription Factor
PubMed: 30822557
DOI: 10.1016/j.biocel.2019.02.006 -
Clinical Epigenetics 2016
Review
DNA methylation, through 5-methyl- and 5-hydroxymethylcytosine (5mC and 5hmC), is considered to be one of the principal interfaces between the genome and our environment, and it helps explain phenotypic variations in human populations. Initial reports of large differences in methylation level in genomic regulatory regions, coupled with clear gene expression data in both imprinted genes and malignant diseases, provided easily dissected molecular mechanisms for switching genes on or off. However, a more subtle process is becoming evident, where small (<10%) changes to intermediate methylation levels are associated with complex disease phenotypes. This has resulted in two clear methylation paradigms. The latter "subtle change" paradigm is rapidly becoming the epigenetic hallmark of complex disease phenotypes, although we are currently hampered by a lack of data addressing the true biological significance and meaning of these small differences. Our initial expectation of rapidly identifying mechanisms linking environmental exposure to a disease phenotype led to numerous observational/association studies being performed. Although this expectation remains unmet, there is now a growing body of literature on specific genes, suggesting wide-ranging transcriptional and translational consequences of such subtle methylation changes. Data from the glucocorticoid receptor (NR3C1) have shown that a complex interplay between DNA methylation, extensive 5'UTR splicing, and microvariability gives rise to the overall level and relative distribution of total and N-terminal protein isoforms generated. Additionally, the presence of multiple AUG translation initiation codons throughout the complete, processed mRNA enables translation variability, thereby enhancing the translational isoforms and the resulting protein isoform diversity, providing a clear link between small changes in DNA methylation and significant changes in protein isoforms and cellular locations. Methylation changes in the NR3C1 CpG island alter NR3C1 transcription and eventually the protein isoforms in the tissues, resulting in subtle but visible physiological variability. This review addresses the current pathophysiological and clinical associations of such characteristically small DNA methylation changes, the ever-growing roles of DNA methylation and the evidence available, particularly from the glucocorticoid receptor, of the cascade of events initiated by such subtle methylation changes, as well as addressing the underlying question as to what represents a genuine biologically significant difference in methylation.
Topics: 5' Untranslated Regions; 5-Methylcytosine; CpG Islands; DNA Methylation; Environmental Exposure; Epigenesis, Genetic; Gene Expression; Gene Expression Regulation; Humans; Phenotype; Protein Isoforms; Receptors, Glucocorticoid
PubMed: 27602172
DOI: 10.1186/s13148-016-0256-8 -
Cells Apr 2021
p53 protein isoform expression has been found to correlate with prognosis and chemotherapy response in acute myeloid leukemia (AML). We aimed to investigate how p53 protein isoforms are modulated during epigenetic differentiation therapy in AML, and whether p53 isoform expression could be a potential biomarker for predicting a response to this treatment. p53 full-length (FL), p53β and p53γ protein isoforms were analyzed by 1D and 2D gel immunoblots in AML cell lines, primary AML cells from untreated patients and AML cells from patients before and after treatment with valproic acid (VPA), all-trans retinoic acid (ATRA) and theophylline. Furthermore, global gene expression profiling analysis was performed on samples from the clinical protocol. Correlation analyses were performed between p53 protein isoform expression and in vitro VPA sensitivity and FAB (French-American-British) class in primary AML cells. The results show downregulation of p53β/γ and upregulation of p53FL in AML cell lines treated with VPA, and in some of the patients treated with differentiation therapy. p53FL positively correlated with in vitro VPA sensitivity and the FAB class of AML, while p53β/γ isoforms negatively correlated with the same. Our results indicate that p53 protein isoforms are modulated by and may predict sensitivity to differentiation therapy in AML.
Topics: Adult; Aged; Aged, 80 and over; Blast Crisis; Cell Differentiation; Cell Line, Tumor; Epigenesis, Genetic; Female; Gene Expression Regulation, Leukemic; Humans; Leukemia, Myeloid, Acute; Male; Middle Aged; Protein Isoforms; Tumor Suppressor Protein p53; Valproic Acid
PubMed: 33917201
DOI: 10.3390/cells10040833 -
Expert Review of Proteomics Aug 2016
Review
INTRODUCTION
Heart diseases are a leading cause of morbidity and mortality for both men and women worldwide, and impose significant economic burdens on the healthcare systems. Despite substantial effort over the last several decades, the molecular mechanisms underlying diseases of the heart remain poorly understood.
AREAS COVERED
Altered protein post-translational modifications (PTMs) and protein isoform switching are increasingly recognized as important disease mechanisms. Top-down high-resolution mass spectrometry (MS)-based proteomics has emerged as the most powerful method for the comprehensive analysis of PTMs and protein isoforms. Here, we will review recent technology developments in the field of top-down proteomics, as well as highlight recent studies utilizing top-down proteomics to decipher the cardiac proteome for the understanding of the molecular mechanisms underlying diseases of the heart.
EXPERT COMMENTARY
Top-down proteomics is a premier method for the global and comprehensive study of protein isoforms and their PTMs, enabling the identification of novel protein isoforms and PTMs, characterization of sequence variations, and quantification of disease-associated alterations. Despite significant challenges, continuous development of top-down proteomics technology will greatly aid the dissection of the molecular mechanisms underlying diseases of the heart for the identification of novel biomarkers and therapeutic targets.
Topics: Biomarkers; Heart Diseases; Humans; Mass Spectrometry; Protein Isoforms; Protein Processing, Post-Translational; Proteome; Proteomics
PubMed: 27448560
DOI: 10.1080/14789450.2016.1209414 -
Nucleic Acids Research Jul 2015
This paper introduces the APPRIS WebServer (http://appris.bioinfo.cnio.es) and WebServices (http://apprisws.bioinfo.cnio.es). Both the web server and the web services are based around the APPRIS Database, a database that presently houses annotations of splice isoforms for five different vertebrate genomes. The APPRIS WebServer and WebServices provide access to the computational methods implemented in the APPRIS Database, while the APPRIS WebServices also allow retrieval of the annotations. The APPRIS WebServer and WebServices annotate splice isoforms with protein structural and functional features, and with data from cross-species alignments. In addition, they can use the annotations of structure, function and conservation to select a single reference isoform for each protein-coding gene (the principal protein isoform). APPRIS principal isoforms have been shown to agree overwhelmingly with the main protein isoform detected in proteomics experiments. The APPRIS WebServer allows for the annotation of splice isoforms for individual genes, and provides a range of visual representations and tools to allow researchers to identify the likely effect of splicing events. The APPRIS WebServices permit users to generate annotations automatically in high-throughput mode and to interrogate the annotations in the APPRIS Database. The APPRIS WebServices have been implemented using REST architecture to be flexible, modular and automatic.
Topics: Alternative Splicing; Animals; Cats; Cattle; Dogs; Humans; Internet; Mice; Molecular Sequence Annotation; Protein Isoforms; Rats; Software
PubMed: 25990727
DOI: 10.1093/nar/gkv512