-
Nature Methods Nov 2014 (Review)
Proteogenomics is an area of research at the interface of proteomics and genomics. In this approach, customized protein sequence databases generated using genomic and transcriptomic information are used to help identify novel peptides (not present in reference protein sequence databases) from mass spectrometry-based proteomic data; in turn, the proteomic data can be used to provide protein-level evidence of gene expression and to help refine gene models. In recent years, owing to the emergence of new sequencing technologies such as RNA-seq and dramatic improvements in the depth and throughput of mass spectrometry-based proteomics, the pace of proteogenomic research has greatly accelerated. Here I review the current state of proteogenomic methods and applications, including computational strategies for building and using customized protein sequence databases. I also draw attention to the challenge of false positive identifications in proteogenomics and provide guidelines for analyzing the data and reporting the results of proteogenomic studies.
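The database-customization idea in this abstract can be sketched in a few lines. The example below is a minimal illustration and not the review's method. It assumes a six-frame translation of a nucleotide sequence, an in-silico tryptic digest, and a plain substring check against reference proteins. All function names, the `min_len` parameter, and the toy sequences are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translate(dna):
    """Translate both strands in all three reading frames ('*' marks a stop)."""
    dna = dna.upper()
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons with ambiguous bases are translated as 'X'.
            frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


def tryptic_peptides(protein, min_len=6):
    """Split at stops, then cleave after K or R unless followed by P."""
    peptides = set()
    for orf in protein.split("*"):
        for pep in re.split(r"(?<=[KR])(?!P)", orf):
            if len(pep) >= min_len and "X" not in pep:
                peptides.add(pep)
    return peptides


def novel_peptides(dna, reference_proteins, min_len=6):
    """Peptides from the six-frame translation absent from every reference protein."""
    candidates = set()
    for frame in six_frame_translate(dna):
        candidates |= tryptic_peptides(frame, min_len)
    return {p for p in candidates
            if not any(p in ref for ref in reference_proteins)}


if __name__ == "__main__":
    toy_dna = "ATGGCTGGTACTAAACTGCTGGATGAATGGCGTTAA"  # toy sequence, not real data
    print(sorted(novel_peptides(toy_dna, ["MAGTKAAAA"], min_len=5)))
```

In practice the candidate set is searched against spectra rather than compared by string matching, and the enlarged search space is one reason the abstract stresses false-positive control.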
Topics: Databases, Nucleic Acid; Databases, Protein; Genetic Variation; Genomics; High-Throughput Nucleotide Sequencing; Mass Spectrometry; Protein Isoforms; Proteome; Proteomics; Sequence Analysis, Protein
PubMed: 25357241
DOI: 10.1038/nmeth.3144 -
Cellular and Molecular Life Sciences :... Mar 2015
Proteogenomics, or the integration of proteomics with genomics and transcriptomics, is emerging as the next step towards a unified understanding of cellular functions. Looking globally and simultaneously at gene structure, RNA expression, protein synthesis and post-translational modifications has become technically feasible and offers a new perspective on molecular processes. Recent publications have highlighted the value of proteogenomics in oncology for defining the molecular signature of human tumors, and translation to other areas of biomedicine and life sciences is anticipated. This mini-review discusses recent developments, challenges and perspectives in proteogenomics.
Topics: Genomics; Humans; Metabolomics; Neoplasms; Proteins; Proteomics; RNA, Messenger
PubMed: 25609363
DOI: 10.1007/s00018-015-1837-y -
EC Psychology and Psychiatry Jun 2023
The aim of this study is to provide a comprehensive overview of spatial multiomics analysis, including its definition, processes, applications, significance and relevant research in psychiatric disorders. To achieve this, a literature search was conducted, focusing on three major spatial omics techniques and their application to three common psychiatric disorders: Alzheimer's disease (AD), schizophrenia, and autism spectrum disorders. Spatial genomics analysis has revealed specific genes associated with neuropsychiatric disorders in certain brain regions. Spatial transcriptomics analysis has identified genes related to AD in areas such as the hippocampus, olfactory bulb, and middle temporal gyrus. It has also provided insight into the response to AD in mouse models. Spatial proteogenomics has identified autism spectrum disorder (ASD)-risk genes in specific cell types, while schizophrenia risk loci have been linked to transcriptional signatures in the human hippocampus. In summary, spatial multiomics analysis offers a powerful approach to understand AD pathology and other psychiatric diseases, integrating multiple data modalities to identify risk genes for these disorders. It is valuable for studying psychiatric disorders with high or low cellular heterogeneity and provides new insights into the brain nucleome to predict disease progression and aid diagnosis and treatment.
PubMed: 37424930
DOI: No ID Found -
Molecular & Cellular Proteomics : MCP Jun 2017 (Review)
With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches. In this article, we systematically classify published methods and tools into four major categories: (1) sequence-centric proteogenomics; (2) analysis of proteogenomic relationships; (3) integrative modeling of proteogenomic data; and (4) data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications.
Topics: Humans; Information Dissemination; Models, Biological; Proteogenomics
PubMed: 28456751
DOI: 10.1074/mcp.MR117.000024 -
Nature Communications Aug 2023
Systemic pan-tumor analyses may reveal the significance of common features implicated in cancer immunogenicity and patient survival. Here, we provide a comprehensive multi-omics data set for 32 patients across 25 tumor types for proteogenomic-based discovery of neoantigens. By using an optimized computational approach, we discover a large number of tumor-specific and tumor-associated antigens. To create a pipeline for the identification of neoantigens in our cohort, we combine DNA and RNA sequencing with MS-based immunopeptidomics of tumor specimens, followed by the assessment of their immunogenicity and an in-depth validation process. We detect a broad variety of non-canonical HLA-binding peptides in the majority of patients, some of which demonstrate immunogenicity. Our validation process allows for the selection of 32 potential neoantigen candidates. The majority of neoantigen candidates originate from variants identified in the RNA data set, illustrating the relevance of RNA as a still understudied source of cancer antigens. This study underlines the importance of RNA-centered variant detection for the identification of shared biomarkers and potentially relevant neoantigen candidates.
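One step such a pipeline typically needs is turning a called variant into candidate peptide sequences that can be searched in immunopeptidomics data. The sketch below is an illustration under stated assumptions, not the study's pipeline. It handles only a single missense substitution, uses 1-based protein coordinates, and enumerates windows of 8 to 11 residues (a length range commonly associated with HLA class I ligands). The function name, parameters, and toy protein are hypothetical.

```python
def mutant_peptides(protein, position, ref_aa, alt_aa, lengths=(8, 9, 10, 11)):
    """Apply a missense substitution and list every window covering it.

    position is 1-based. Returns the mutated protein and the set of
    candidate peptides that span the substituted residue.
    """
    idx = position - 1
    if protein[idx] != ref_aa:
        raise ValueError(
            f"expected {ref_aa} at position {position}, found {protein[idx]}"
        )
    mutated = protein[:idx] + alt_aa + protein[idx + 1:]

    peptides = set()
    for k in lengths:
        # A window [start, start + k) covers idx when idx - k + 1 <= start <= idx.
        first = max(0, idx - k + 1)
        last = min(idx, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.add(mutated[start:start + k])
    return mutated, peptides


if __name__ == "__main__":
    toy_protein = "ACDEFGHIKLMNPQRSTVWY"  # toy sequence, not real data
    mutated, candidates = mutant_peptides(toy_protein, 10, "L", "P", lengths=(9,))
    print(mutated)
    print(sorted(candidates))
```

Candidates produced this way would still need spectral evidence and immunogenicity testing, which is the validation stage the abstract describes.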
Topics: Humans; Proteogenomics; Neoplasms; Antigens, Neoplasm; Peptides
PubMed: 37532709
DOI: 10.1038/s41467-023-39570-7 -
Cell Reports. Medicine May 2024
Non-clear cell renal cell carcinomas (non-ccRCCs) encompass diverse malignant and benign tumors. Refinement of differential diagnosis biomarkers, markers for early prognosis of aggressive disease, and therapeutic targets to complement immunotherapy are current clinical needs. Multi-omics analyses of 48 non-ccRCCs compared with 103 ccRCCs reveal proteogenomic, phosphorylation, glycosylation, and metabolic aberrations in RCC subtypes. RCCs with high genome instability display overexpression of IGF2BP3 and PYCR1. Integration of single-cell and bulk transcriptome data predicts diverse cell-of-origin and clarifies RCC subtype-specific proteogenomic signatures. Expression of biomarkers MAPRE3, ADGRF5, and GPNMB differentiates renal oncocytoma from chromophobe RCC, and PIGR and SOSTDC1 distinguish papillary RCC from MTSCC. This study expands our knowledge of proteogenomic signatures, biomarkers, and potential therapeutic targets in non-ccRCC.
Topics: Humans; Proteogenomics; Kidney Neoplasms; Biomarkers, Tumor; Carcinoma, Renal Cell; Transcriptome; Male; Female; Middle Aged; Gene Expression Regulation, Neoplastic
PubMed: 38703764
DOI: 10.1016/j.xcrm.2024.101547 -
Current Issues in Molecular Biology May 2024 (Review)
Proteogenomics represents a transformative intersection in nephrology, uniting genomics, transcriptomics, and proteomics to unravel the molecular intricacies of kidney diseases. This review encapsulates the methodological essence of proteogenomics and its profound implications in chronic kidney disease (CKD) research. We explore the proteogenomic pipeline, highlighting the integrated analysis of genomic, transcriptomic, and proteomic data and its pivotal role in enhancing our understanding of kidney pathologies. Through case studies, we showcase the application of proteogenomics in clear cell renal cell carcinoma (ccRCC) and Autosomal Recessive Polycystic Kidney Disease (ARPKD), emphasizing its potential in personalized treatment strategies and biomarker discovery. The review also addresses the challenges in proteogenomic analysis, including data integration complexities and bioinformatics limitations, and proposes solutions for advancing the field. Ultimately, this review underscores the prospective future of proteogenomics in nephrology, particularly in advancing personalized medicine and providing novel therapeutic insights.
PubMed: 38785547
DOI: 10.3390/cimb46050279 -
Nature Communications Mar 2023
The subtypes of duodenal cancer (DC) are complicated and the carcinogenesis process is not well characterized. We present a comprehensive characterization of 438 samples from 156 DC patients, covering 2 major and 5 rare subtypes. Proteogenomics reveals that LYN amplification at the chromosome 8q gain functions in the transition from the intraepithelial neoplasia phase to the infiltrating tumor phase via MAPK signaling, and shows that the DST mutation enhances mTOR signaling in the duodenal adenocarcinoma stage. Proteome-based analysis elucidates stage-specific molecular characteristics and carcinogenesis tracks, and defines the cancer-driving waves of the adenocarcinoma and Brunner's gland subtypes. The drug-targetable alanyl-tRNA synthetase (AARS1) in the high tumor mutation burden/immune infiltration is significantly enhanced in DC progression, and catalyzes the lysine-alanylation of poly-ADP-ribose polymerase 1 (PARP1), which decreases the apoptosis of cancer cells, eventually promoting cell proliferation and tumorigenesis. We assess the proteogenomic landscape of early DC and provide insights into its molecular features and corresponding therapeutic targets.
Topics: Humans; Duodenal Neoplasms; Proteogenomics; Brunner Glands; Adenocarcinoma; Carcinogenesis
PubMed: 36991000
DOI: 10.1038/s41467-023-37221-5 -
Analytical Chemistry Aug 2023
Small proteins of around 50 aa in length have been largely overlooked in genetic and biochemical assays due to the inherent challenges of detecting and characterizing them. Recent discoveries of their critical roles in many biological processes have led to an increased recognition of the importance of small proteins for basic research and as potential new drug targets. One example is CcoM, a 36 aa subunit of the cbb3-type oxidase that plays an essential role in adaptation to oxygen-limited conditions in Pseudomonas stutzeri, a model for the clinically relevant, opportunistic pathogen Pseudomonas aeruginosa. However, as no comprehensive data were available in P. stutzeri, we devised an integrated, generic approach to study small proteins more systematically. Using the first complete genome as a basis, we conducted bottom-up proteomics analyses and established a digest-free, direct-sequencing proteomics approach to study cells grown under aerobic and oxygen-limiting conditions. Finally, we also applied a proteogenomics pipeline to identify missed protein-coding genes. Overall, we identified 2921 known and 29 novel proteins, many of which were differentially regulated. Among 176 small proteins, 16 were novel. Direct sequencing, featuring a specialized precursor acquisition scheme, exhibited advantages in the detection of small proteins with higher (up to 100%) sequence coverage and more spectral counts, including sequences with high proline content. Three novel small proteins, uniquely identified by direct sequencing and not conserved beyond P. stutzeri, were predicted to form an operon with a conserved protein and may represent genes. These data demonstrate the power of this combined approach to study small proteins in P. stutzeri and show its potential for other prokaryotes.
Topics: Pseudomonas stutzeri; Proteomics; Proteogenomics; Pseudomonas aeruginosa; Oxygen
PubMed: 37535005
DOI: 10.1021/acs.analchem.3c00676 -
Journal of Pharmaceutical Analysis Sep 2023
Pheretima, also called "earthworms", is a well-known animal-derived traditional Chinese medicine that is used in over 50 Chinese patent medicines (CPMs) in the Chinese Pharmacopoeia (2020 edition). However, its zoological origin is unclear, both in the herbal market and in CPMs. In this study, a strategy for integrating in-house annotated protein databases constructed from close evolutionary relationship-sourced RNA sequencing data from public archival resources and various sequencing algorithms (restricted search, open search, and de novo) was developed to characterize the phenotype of natural peptides of three major commercial species of Pheretima, including (PA), (PV), and (MM). We identified 10,477 natural peptides in PA, 7,451 in PV, and 5,896 in MM samples. Five specific signature peptides were screened and then validated using synthetic peptides; these demonstrated robust specificity for the authentication of PA, PV, and MM. Finally, all marker peptides were successfully applied to identify the zoological origins of Brain Heart capsules and Xiaohuoluo pills, revealing that inconsistent Pheretima species are used in these CPMs. In conclusion, our integrated strategy could be used for the in-depth characterization of natural peptides of other animal-derived traditional Chinese medicines, especially non-model species with poorly annotated protein databases.
PubMed: 37842652
DOI: 10.1016/j.jpha.2023.06.006