-
GigaScience Dec 2022
BACKGROUND
The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Protein-coding genes with different co-occurring sets of alleles can therefore encode different amino acid sequences: protein haplotypes. These protein haplotypes are present in biological samples and detectable by mass spectrometry, but they are not accounted for in proteomic searches. As a result, the impact of haplotypic variation on the results of proteomic searches and the discoverability of peptides specific to haplotypes remain unknown.
FINDINGS
Here, we study how common genetic haplotypes influence the proteomic search space and investigate the possibility of matching peptides containing multiple amino acid substitutions to a publicly available data set of mass spectra. We found that for 12.42% of the discoverable amino acid substitutions encoded by common haplotypes, 2 or more substitutions may co-occur in the same peptide after tryptic digestion of the protein haplotypes. We identified 352 spectra that matched such multivariant peptides, and out of the 4,582 amino acid substitutions identified, 6.37% were covered by multivariant peptides. However, evaluating the reliability of these matches remains challenging, suggesting that refined error rate estimation procedures are needed for such complex proteomic searches.
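The co-occurrence of substitutions in one tryptic peptide can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequence, and the substitution positions are hypothetical, and the digestion rule is the commonly used simplification (cleave after K or R unless the next residue is P, no missed cleavages).

```python
import re

def tryptic_peptides(protein):
    """Return (start, peptide) pairs from a simplified in silico tryptic digest.

    Assumed rule: cleave after K or R unless followed by P; no missed cleavages.
    """
    peptides = []
    start = 0
    # Zero-width match at every position preceded by K/R and not followed by P.
    for match in re.finditer(r"(?<=[KR])(?!P)", protein):
        end = match.start()
        if end > start:
            peptides.append((start, protein[start:end]))
            start = end
    if start < len(protein):
        peptides.append((start, protein[start:]))
    return peptides

def apply_substitutions(reference, substitutions):
    """Build a protein haplotype from {0-based position: (ref_aa, alt_aa)}."""
    residues = list(reference)
    for pos, (ref_aa, alt_aa) in substitutions.items():
        if residues[pos] != ref_aa:
            raise ValueError(f"expected {ref_aa} at position {pos}, found {residues[pos]}")
        residues[pos] = alt_aa
    return "".join(residues)

def multivariant_peptides(reference, substitutions):
    """List peptides of the haplotype that carry two or more substitutions."""
    haplotype = apply_substitutions(reference, substitutions)
    hits = []
    # Digest the haplotype, not the reference: a substitution that creates or
    # removes K/R changes the cleavage sites.
    for start, peptide in tryptic_peptides(haplotype):
        covered = sorted(p for p in substitutions if start <= p < start + len(peptide))
        if len(covered) >= 2:
            hits.append((peptide, covered))
    return hits

# Toy example (hypothetical sequence and variants).
reference = "MAGTKLSEDVAPRGHTK"
haplotype_variants = {6: ("S", "N"), 9: ("V", "I"), 14: ("H", "Q")}
print(multivariant_peptides(reference, haplotype_variants))
# → [('LNEDIAPR', [6, 9])]
```

In this toy case two of the three substitutions fall in the same peptide, so a search database built from single-variant peptides alone would not contain the sequence that is actually present in the sample.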
CONCLUSIONS
As these procedures become available and the ability to analyze protein haplotypes increases, we anticipate that proteomics will provide new information on the consequences of common variation, across tissues and time.
Topics: Proteomics; Haplotypes; Reproducibility of Results; Proteins; Peptides
PubMed: 37919975
DOI: 10.1093/gigascience/giad093 -
The ISME Journal Dec 2023
Cyanobacteria form dense multicellular communities that experience transient conditions in terms of access to light and oxygen. These systems are productive but also undergo substantial biomass turnover through cell death, supplementing heightened heterotrophic respiration. Here we use metagenomics and metaproteomics to survey the molecular response of a mat-forming cyanobacterium undergoing mass cell lysis after exposure to dark and anoxic conditions. A lack of evidence for viral, bacterial, or eukaryotic antagonism contradicts commonly held beliefs on the causative agent for cyanobacterial death during dense growth. Instead, proteogenomics data indicated that lysis likely resulted from a genetically programmed response triggered by a failure to maintain osmotic pressure in the wake of severe energy limitation. Cyanobacterial DNA was rapidly degraded, yet cyanobacterial proteins remained abundant. A subset of proteins, including enzymes involved in amino acid metabolism, peptidases, toxin-antitoxin systems, and a potentially self-targeting CRISPR-Cas system, were upregulated upon lysis, indicating possible involvement in the programmed cell death response. We propose this natural form of cell death could provide new pathways for controlling harmful algal blooms and for sustainable bioproduct production.
Topics: Proteome; Cyanobacteria; Harmful Algal Bloom; Biomass; Cell Death
PubMed: 37914776
DOI: 10.1038/s41396-023-01545-3 -
Journal For Immunotherapy of Cancer Oct 2023
Review
Identification of tumor antigens presented by the human leucocyte antigen (HLA) molecules is essential for the design of effective and safe cancer immunotherapies that rely on T cell recognition and killing of tumor cells. Mass spectrometry (MS)-based immunopeptidomics enables high-throughput, direct identification of HLA-bound peptides from a variety of cell lines, tumor tissues, and healthy tissues. It involves immunoaffinity purification of HLA complexes followed by MS profiling of the extracted peptides using data-dependent acquisition, data-independent acquisition, or targeted approaches. By incorporating DNA, RNA, and ribosome sequencing data into immunopeptidomics data analysis, the proteogenomic approach provides a powerful means for identifying tumor antigens encoded within the canonical open reading frames of annotated coding genes and non-canonical tumor antigens derived from presumably non-coding regions of our genome. We discuss emerging computational challenges in immunopeptidomics data analysis and tumor antigen identification, highlighting key considerations in the proteogenomics-based approach, including accurate DNA, RNA, and ribosomal sequencing data analysis, careful incorporation of predicted novel protein sequences into the reference protein database, special quality control in MS data analysis due to the expanded and heterogeneous search space, cancer-specificity determination, and immunogenicity prediction. Advancements in technology and computation are continually enabling us to identify tumor antigens with higher sensitivity and accuracy, paving the way toward the development of more effective cancer immunotherapies.
Topics: Humans; Histocompatibility Antigens Class I; Mass Spectrometry; Antigens, Neoplasm; Peptides; HLA Antigens; Neoplasms; Histocompatibility Antigens Class II; RNA; DNA
PubMed: 37899131
DOI: 10.1136/jitc-2023-007073 -
The Science of the Total Environment Jan 2024
Review
Mixture effects of pharmaceuticals carbamazepine, diclofenac and venlafaxine on Mytilus galloprovincialis mussel probed by metabolomics and proteogenomics combined approach.
Exposure to single molecules under laboratory conditions has led to a better understanding of the mechanisms of action (MeOAs) and effects of pharmaceutical active compounds (PhACs) on non-target organisms. However, not taking the co-occurrence of contaminants in the environment and their possible interactions into account may lead to underestimation of their impacts. In this study, we combined untargeted metabolomics and proteogenomics approaches to assess the mixture effects of diclofenac, carbamazepine and venlafaxine on marine mussels (Mytilus galloprovincialis). Our multi-omics approach and data fusion strategy highlighted how such xenobiotic cocktails induce important cellular changes that can be harmful to marine bivalves. This response is mainly characterized by energy metabolism disruption, fatty acid degradation, protein synthesis and degradation, and the induction of endoplasmic reticulum stress and oxidative stress. The known MeOAs and molecular signatures of PhACs were taken into consideration to gain insight into the mixture effects, thereby revealing a potential additive effect. Multi-omics approaches on mussels as sentinels offer a comprehensive overview of molecular and cellular responses triggered by exposure to contaminant mixtures, even at environmental concentrations.
Topics: Animals; Mytilus; Diclofenac; Venlafaxine Hydrochloride; Proteogenomics; Water Pollutants, Chemical; Carbamazepine; Benzodiazepines; Pharmaceutical Preparations
PubMed: 37879482
DOI: 10.1016/j.scitotenv.2023.168015 -
Nature Cancer Dec 2023
Despite recent advances in the treatment of acute myeloid leukemia (AML), there has been limited success in targeting surface antigens in AML, in part due to shared expression across malignant and normal cells. Here, high-density immunophenotyping of AML coupled with proteogenomics identified unique expression of a variety of antigens, including the RNA helicase U5 snRNP200, on the surface of AML cells but not on normal hematopoietic precursors and skewed Fc receptor distribution in the AML immune microenvironment. Cell membrane localization of U5 snRNP200 was linked to surface expression of the Fcγ receptor IIIA (FcγIIIA, also known as CD32A) and correlated with expression of interferon-regulated immune response genes. Anti-U5 snRNP200 antibodies engaging activating Fcγ receptors were efficacious across immunocompetent AML models and were augmented by combination with azacitidine. These data provide a roadmap of AML-associated antigens with Fc receptor distribution in AML and highlight the potential for targeting the AML cell surface using Fc-optimized therapeutics.
Topics: Humans; Antibodies, Monoclonal; Antigens, Surface; Leukemia, Myeloid, Acute; Receptors, Fc; Receptors, IgG; Ribonucleoproteins, Small Nuclear; Tumor Microenvironment
PubMed: 37872381
DOI: 10.1038/s43018-023-00656-2 -
Journal of Proteome Research Nov 2023
Accurate quantification of HLA class I gene expression is important for understanding how antitumor cytotoxic T cell activities interplay with the tumor microenvironment. Because HLA-I sequences are highly variable, standard RNAseq and mass spectrometry-based quantification workflows using common genome and protein sequence references do not provide HLA-I allele-specific quantifications. Here, we used personalized HLA-I nucleotide and protein reference sequences based on the subjects' HLA-I genotypes and surveyed tumor and adjacent normal samples from patients across nine cancer types. Mass spectrometry data from data-dependent acquisition were validated as sufficient to estimate HLA-A protein expression at the allele level. We found that HLA-I proteins were present at significantly higher levels in tumors than in adjacent normal tissues in 41 to 63% of head and neck squamous cell carcinoma, uterine corpus endometrial carcinoma, and clear cell renal cell carcinoma patients, and this was driven by increased levels of HLA-I gene transcripts. Most immune cell types were universally enriched in HLA-I high tumors, while endothelial and neuronal cells showed divergent relationships with HLA-I. Pathway analysis revealed that tumor senescence and autophagy activity influence the level of HLA-I proteins in glioblastoma. Genes correlated with HLA-I protein expression are mostly those directly involved in HLA-I function in immune response and cell death, while glycosylation genes are exclusively co-expressed with HLA-I at the protein level.
Topics: Humans; Histocompatibility Antigens Class I; Carcinoma, Squamous Cell; Proteogenomics; Carcinoma, Renal Cell; Kidney Neoplasms; Tumor Microenvironment
PubMed: 37857377
DOI: 10.1021/acs.jproteome.3c00491 -
Journal of Pharmaceutical Analysis Sep 2023
Pheretima, also called "earthworms", is a well-known animal-derived traditional Chinese medicine that is extensively used in over 50 Chinese patent medicines (CPMs) in the Chinese Pharmacopoeia (2020 edition). However, its zoological origin is unclear, both in the herbal market and in CPMs. In this study, a strategy was developed to characterize the phenotype of natural peptides of three major commercial species of Pheretima, including (PA), (PV), and (MM). The strategy integrates in-house annotated protein databases, constructed from publicly archived RNA sequencing data of closely related species, with various sequencing algorithms (restricted search, open search, and de novo). We identified 10,477 natural peptides in the PA, 7,451 in the PV, and 5,896 in the MM samples. Five specific signature peptides were screened and then validated using synthetic peptides; these demonstrated robust specificity for the authentication of PA, PV, and MM. Finally, all marker peptides were successfully applied to identify the zoological origins of Brain Heart capsules and Xiaohuoluo pills, revealing the inconsistent Pheretima species used in these CPMs. In conclusion, our integrated strategy could be used for the in-depth characterization of natural peptides of other animal-derived traditional Chinese medicines, especially non-model species with poorly annotated protein databases.
PubMed: 37842652
DOI: 10.1016/j.jpha.2023.06.006 -
Microbial Physiology 2024
Review
The denitrifying betaproteobacterium Aromatoleum aromaticum EbN1T is a facultative anaerobic degradation specialist and is among the environmental bacteria best studied at the proteogenomic level. This review summarizes the current state of knowledge about the anaerobic and aerobic degradation (to CO2) of 47 organic growth substrates (23 aromatic, 21 aliphatic, and 3 amino acids) as well as the modes of respiratory energy conservation (denitrification vs. O2-respiration). The constructed catabolic network comprises 256 genes, which occupy ∼7.5% of the coding regions of the genome. In total, 219 encoded proteins have been identified by differential proteomics, yielding a proteome coverage of ∼74% of the network. Its degradation section is composed of 31 peripheral and 4 central pathways, with several peripheral modules (e.g., for 4-ethylphenol, 2-phenylethylamine, indoleacetate, and phenylpropanoids) discovered only after the complete genome [Arch Microbiol. 2005 Jan;183(1):27-36] and a first proteomic survey [Proteomics. 2007 Jun;7(13):2222-39] of A. aromaticum EbN1T were reported. The activation of recalcitrant aromatic compounds involves a suite of biochemically intriguing reactions ranging from C-H-bond activation (e.g., ethylbenzene dehydrogenase) via carboxylation (e.g., acetophenone carboxylase) to oxidative deamination (e.g., benzylamine), reductive dearomatization (benzoyl-CoA), and epoxide-forming oxygenases (e.g., phenylacetyl-CoA). The peripheral reaction sequences are substrate-specifically induced, mediated by specific transcriptional regulators with in vivo response thresholds in the nanomolar range. While lipophilic substrates (e.g., phenolics) enter the cells via passive diffusion, polar ones require active uptake that is driven by specific transporters. Next to the protein repertoire for canonical complexes I-III, denitrification, and O2-respiration (low- and high-affinity oxidases), the genome encodes an Ndh-II, a tetrathionate reductase, two ETF:quinone oxidoreductases, and two Rnf-type complexes, broadening the electron transfer flexibility of the strain. Taken together, the detailed catabolic network presented here forms a solid basis for future systems biology-level studies with A. aromaticum EbN1T.
Topics: Bacterial Proteins; Anaerobiosis; Metabolic Networks and Pathways; Aerobiosis; Proteome; Proteomics; Denitrification; Rhodocyclaceae
PubMed: 37816339
DOI: 10.1159/000534425 -
BioRxiv : the Preprint Server For... Oct 2023
There has been a dramatic increase in the identification of non-canonical translation and a significant expansion of the protein-coding genome and proteome. Among the strategies used to identify novel small ORFs (smORFs), ribosome profiling (Ribo-Seq) is the gold standard for the annotation of novel coding sequences by reporting on smORF translation. In Ribo-Seq, ribosome-protected footprints (RPFs) that map to multiple sites in the genome are computationally removed since they cannot unambiguously be assigned to a specific genomic location, or to a specific transcript in the case of multiple isoforms. Furthermore, RPFs necessarily result in short (25-34 nucleotides) reads, increasing the chance of ambiguous and multi-mapping alignments, such that smORFs that reside in these regions cannot be identified by Ribo-Seq. Here, we show that the inclusion of proteogenomics to create a Ribosome Profiling and Proteogenomics Pipeline (RP3) bypasses this limitation to identify a group of microprotein-encoding smORFs that are missed by current Ribo-Seq pipelines. Moreover, we show that the microproteins identified by RP3 have different sequence compositions from the ones identified by Ribo-Seq-only pipelines, which can affect proteomics identification. In aggregate, the development of RP3 maximizes the detection and confidence of protein-encoding smORFs and microproteins.
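The multi-mapping filter described above can be sketched in a few lines of Python. This is an illustration only, not the RP3 implementation: the input format (pairs of read ID and locus), the function name, and the example loci are assumptions. Real pipelines typically read this information from aligner output, for example the `NH` tag or mapping quality in SAM/BAM records.

```python
from collections import defaultdict

def split_by_mappability(alignments):
    """Separate footprints into uniquely mapped and multi-mapped reads.

    `alignments` is an iterable of (read_id, locus) pairs, one pair per
    reported alignment (hypothetical input format for illustration).
    """
    loci_per_read = defaultdict(set)
    for read_id, locus in alignments:
        loci_per_read[read_id].add(locus)

    unique = {}  # read_id -> its single locus
    multi = {}   # read_id -> sorted list of candidate loci
    for read_id, loci in loci_per_read.items():
        if len(loci) == 1:
            unique[read_id] = next(iter(loci))
        else:
            multi[read_id] = sorted(loci)
    return unique, multi

# Toy example: r3 aligns equally well to two loci, so a Ribo-Seq-only
# pipeline would discard it, along with any smORF evidence it carries.
alignments = [
    ("r1", "chr1:100"),
    ("r2", "chr1:100"),
    ("r3", "chr2:500"),
    ("r3", "chr7:900"),
]
unique, multi = split_by_mappability(alignments)
print(unique)  # → {'r1': 'chr1:100', 'r2': 'chr1:100'}
print(multi)   # → {'r3': ['chr2:500', 'chr7:900']}
```

A smORF whose footprints all land in the `multi` group receives no Ribo-Seq support after filtering. That is the gap that peptide-level evidence from proteogenomics is meant to fill.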
PubMed: 37808637
DOI: 10.1101/2023.09.27.559809 -
Science Bulletin Nov 2023
Epstein-Barr virus (EBV) is the oncogenic driver of multiple cancers. However, the underlying mechanism of virus-cancer immunological interaction during disease pathogenesis remains largely elusive. Here we report the first comprehensive proteogenomic characterization of natural killer/T-cell lymphoma (NKTCL), a representative disease model to study EBV-induced lymphomagenesis, incorporating genomic, transcriptomic, and in-depth proteomic data. Our multi-omics analysis of NKTCL revealed that the EBV gene pattern correlated with immune-related oncogenic signaling. Single-cell transcriptomics further delineated the tumor microenvironment as immune-inflamed, -deficient, and -desert phenotypes, in association with different setpoints of the cancer-immunity cycle. EBV interacted with transcription factors to provoke GPCR interactome (GPCRome) reprogramming. Enhanced expression of chemokine receptor-1 (CCR1) on malignant and immunosuppressive cells modulated virus-cancer interaction in the microenvironment. Therapeutic targeting of CCR1 showed promising efficacy with EBV eradication, T-cell activation, and lymphoma cell killing in NKTCL organoids. Collectively, our study identified a previously unknown GPCR-mediated malignant progression and translated sensors of viral molecules into EBV-specific anti-cancer therapeutics.
Topics: Humans; Herpesvirus 4, Human; Epstein-Barr Virus Infections; Proteomics; Lymphoma; Natural Killer T-Cells; Tumor Microenvironment
PubMed: 37798178
DOI: 10.1016/j.scib.2023.09.029