-
Microbiology Spectrum Sep 2023
Pseudogenes, once considered "junk DNA" based on the incorrect assumption that the absence of full coding potential means a complete lack of functionality, have recently become a subject of significant interest in the scientific community. Concurrently, it is widely assumed that bacterial genomes are compact and have a high density of coding genes with little room for non-coding genes, including pseudogenes. A key aspect of genome annotation is the correct identification of genes and the distinction between coding genes and pseudogenes, as it directly impacts functional and comparative genomics studies. In this study, we analyzed the genomic data of 4,699 strains of the bacterium () as they exhibit high variability in the number of annotated pseudogenes. In particular, we looked for correlations between the number of pseudogenes and other genomic and meta-features of the strains. We identified clusters of orthologous genes and pseudogenes and compared cluster size distributions and length homogeneity within clusters. We then mapped and examined orthology relationships between genes and pseudogenes. Additionally, we generated a phylogenetic tree of the strains and found that phylogenetically related strains are more homogeneous in the number of pseudogenes and share a significant number of pseudogenes. Finally, we delved into clusters of orthologous genes and pseudogenes and quantified their phylogenetic neighborhood, classifying pseudogenes into evolutionarily preserved pseudogenes, mis-annotated pseudogenes, or pseudogenes formed by failed horizontal transfer events. This in-depth study provides important insights that can be incorporated into pseudogene annotation pipelines in the future.
IMPORTANCE
Accurate annotation of genes and pseudogenes is vital for comparative genomics analysis. Recent studies have shown that bacterial pseudogenes have an important role in regulatory processes and can provide insight into the evolutionary history of homologous genes or the genome as a whole. Due to pseudogenes' nature as non-functional genes, there is no commonly accepted definition of a pseudogene, which poses difficulties in verifying the annotation through experimental methods and resolving discrepancies among different annotation techniques. Our study introduces an in-depth analysis of annotated genes and pseudogenes and insights that can be incorporated into improved pseudogene annotation pipelines in the future.
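The abstract mentions comparing "length homogeneity within clusters" but does not say how homogeneity was measured. As a minimal sketch, assuming the coefficient of variation of member lengths is an acceptable proxy, the comparison could look like the following. The cluster names and lengths are hypothetical illustration data, not values from the study.

```python
from statistics import mean, pstdev


def length_cv(lengths):
    """Coefficient of variation (population std / mean) of sequence lengths."""
    m = mean(lengths)
    return pstdev(lengths) / m if m else 0.0


# Hypothetical clusters: cluster id -> member lengths in base pairs.
clusters = {
    "gene_cluster_1": [900, 900, 900, 900],
    "pseudo_cluster_1": [900, 450, 300, 750],
}

# A lower CV means the members of a cluster are more homogeneous in length.
cv_by_cluster = {name: length_cv(lengths) for name, lengths in clusters.items()}
for name, cv in sorted(cv_by_cluster.items()):
    print(f"{name}: CV = {cv:.3f}")
```

Under this assumed measure, a cluster of full-length genes scores near zero, while a cluster containing truncated members scores higher.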
PubMed: 37750703
DOI: 10.1128/spectrum.01704-23
-
BMC Cancer Apr 2021
BACKGROUND
Nasopharyngeal carcinoma (NPC) is a malignant head and neck tumor, and more than 70% of new cases are in East and Southeast Asia. However, the association between NPC and pseudogenes, which play important roles in the genesis of multiple tumor types, remains unclear and needs to be investigated.
METHODS
Using RNA-Sequencing (RNA-seq) technology, we analyzed pseudogene expression in 13 primary NPC and 6 recurrent NPC samples as well as their paracancerous counterparts. Quantitative PCR was used to validate the differentially expressed pseudogenes.
RESULTS
We found 251 differentially expressed pseudogenes, including 73 up-regulated and 178 down-regulated ones, between primary NPC and paracancerous tissues. Enrichment analyses of gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were conducted to identify the key pseudogenes. We reported that pseudogenes from the cytochrome P450 (CYP) family, such as CYP2F2P, CYP2G1P, CYP4F24P, CYP2B7P and CYP2G2P, were significantly down-regulated in NPC compared to paracancerous tissues, while IGHV1OR15-2, IGHV3-11, FCGR1CP and IGHV3-69-1, belonging to Fc gamma receptors, were significantly up-regulated. CYP2B7P, CYP2F2P and CYP4F26P were enriched in the arachidonic acid metabolism pathway. The qRT-PCR analysis validated the lower expression of pseudogenes CYP2F2P and CYP2B7P in NPC tissues and cell lines compared to paracancerous tissues and a normal human nasopharyngeal epithelial cell line. CYP2B7P overexpression weakened the migratory and invasive capacity of an NPC cell line. Moreover, the expression pattern of those pseudogenes in recurrent NPC tissues differed from that in primary NPC.
CONCLUSION
This study suggested a role for pseudogenes in tumorigenesis and progression; they could potentially serve as therapeutic targets for NPC.
Topics: Adult; Aged; Arachidonic Acid; Cell Line, Tumor; Cell Movement; Cytochrome P-450 Enzyme System; Cytochrome P450 Family 2; Down-Regulation; Female; Gene Ontology; Humans; Male; Middle Aged; Nasopharyngeal Carcinoma; Nasopharyngeal Neoplasms; Neoplasm Invasiveness; Neoplasm Recurrence, Local; Pseudogenes; Real-Time Polymerase Chain Reaction; Receptors, IgG; Sequence Analysis, RNA; Transfection; Up-Regulation
PubMed: 33931030
DOI: 10.1186/s12885-021-08211-x
-
Scientific Reports Dec 2022
Pseudogene-derived transcripts, especially those barely transcribed in normal tissues, have been regarded as a kind of non-coding RNA and may function in tumorigenicity and tumor development in humans. However, their exact effects on hepatocellular carcinoma (HCC) remain largely unknown. On the basis of our previous research and the online database we constructed for non-coding RNAs related to HCC, a series of pseudogene transcripts have been discovered, and SNRPFP1, the homologous pseudogene of SNRPF, was found to produce an anomalously highly expressed long non-coding RNA in HCC. In this study, we validated the expression of the SNRPFP1 transcript in both HCC tissues and cell lines. An adverse correlation between SNRPFP1 expression and patients' outcomes was observed, and depletion of SNRPFP1 in HCC cells significantly suppressed cell proliferation and apoptosis resistance. Meanwhile, the motility of HCC cells was potently impaired. Interestingly, miR-126-5p, a tumor-suppressive miRNA commonly decreased in HCC, was negatively correlated with SNRPFP1 expression, and a specific region of the SNRPFP1 transcript binds directly to miR-126-5p, acting as a molecular sponge. In the rescue experiment, knocking out miR-126-5p significantly reversed the cell growth suppression and the higher ratio of cell apoptosis induced by SNRPFP1 depletion. Lastly, we concluded that SNRPFP1 is a pseudogene active in HCC, and its abnormally over-expressed transcript is a strong promoter of HCC cell progression in vitro by sponging miR-126-5p. We believe that the findings in this study provide new strategies for HCC prevention and therapeutic treatment.
Topics: Humans; Carcinoma, Hepatocellular; Liver Neoplasms; RNA, Long Noncoding; MicroRNAs; Pseudogenes; Cell Line, Tumor; Cell Proliferation; Gene Expression Regulation, Neoplastic; Cell Movement
PubMed: 36535956
DOI: 10.1038/s41598-022-24597-5
-
eLife Apr 2020
The partial success of a study to reproduce experiments that linked pseudogenes and cancer proves that understanding RNA networks is more complicated than expected.
Topics: Biology; Humans; Neoplasms; Pseudogenes; RNA; RNA, Messenger; Reproducibility of Results
PubMed: 32314733
DOI: 10.7554/eLife.56397
-
American Journal of Cancer Research 2022
Review
Cancer of the thyroid is the most common endocrine malignancy. While treatment options are limited for individuals with medullary or anaplastic thyroid cancer, understanding the underlying mechanisms is vital to developing a successful thyroid cancer treatment strategy due to the tumor's multistep carcinogenesis. Non-coding RNAs (ncRNAs) have been associated with thyroid cancer progression in several recent studies; however, the role of regulatory interactions among different types of ncRNAs in thyroid cancer remains unclear. Recently, competing endogenous RNA (ceRNA) has been discovered as a mechanism demonstrating regulatory interactions among non-coding RNAs, including pseudogenes, long non-coding RNAs (lncRNAs), circular RNAs (circRNAs), and microRNAs (miRNAs). It has been concluded from the literature that numerous ceRNA networks are deregulated during the development, invasion, and metastasis of thyroid cancer, as well as in epithelial-mesenchymal transition (EMT) and drug resistance. Further understanding of these deregulations is important to develop diagnostic procedures for early detection of thyroid cancer and promising therapeutic options for effective treatment. The purpose of this review is to highlight the emerging roles of some newly found ceRNA members in thyroid cancer and outline the current body of knowledge regarding ceRNA, lncRNA, pseudogenes, and miRNAs.
PubMed: 35411240
DOI: No ID Found
-
Nature Communications Jul 2020
Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, particularly with the recent availability of strain-sequencing and transcriptional data. Here, combining both manual curation and automatic pipelines, we present a genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains (available via the mouse.pseudogene.org resource). We also annotate 165 unitary pseudogenes in mouse and 303 in human. The overall pseudogene repertoire in mouse is similar to that in human in terms of size, biotype distribution, and family composition (e.g. with GAPDH and ribosomal proteins being the largest families). Notable differences arise in the pseudogene age distribution, with multiple retro-transpositional bursts in mouse evolutionary history and only one in human. Furthermore, in each strain about a fifth of all pseudogenes are unique, reflecting strain-specific evolution. Finally, we find that ~15% of the mouse pseudogenes are transcribed, and that highly transcribed parent genes tend to give rise to many processed pseudogenes.
Topics: Animals; Conserved Sequence; Evolution, Molecular; Gene Ontology; Genome; Humans; Mice, Inbred C57BL; Molecular Sequence Annotation; Pseudogenes; Species Specificity; Transcription, Genetic
PubMed: 32728065
DOI: 10.1038/s41467-020-17157-w
-
Science Advances Jun 2024
Mutations in cause Gaucher disease and are the most important genetic risk factor for Parkinson's disease. However, analysis of transcription at this locus is complicated by its highly homologous pseudogene, . We show that >50% of short RNA-sequencing reads mapping to also map to . Thus, we used long-read RNA sequencing in the human brain, which allowed us to accurately quantify expression from both and . We discovered significant differences in expression compared to short-read data and identify currently unannotated transcripts of both and . These included protein-coding transcripts from both genes that were translated in human brain, but without the known lysosomal function, yet accounting for almost a third of transcription. Analyzing brain-specific cell types using long-read and single-nucleus RNA sequencing revealed region-specific variations in transcript expression. Overall, these findings suggest nonlysosomal roles for and with implications for our understanding of the role of in health and disease.
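The ambiguity described above, where short reads mapping to a gene also map to its homologous pseudogene, can be quantified as a simple fraction. The sketch below is illustrative only. The locus labels `GENE` and `PSEUDOGENE`, the read identifiers, and the helper `shared_fraction` are hypothetical placeholders, and the input is a toy stand-in for parsed aligner output.

```python
def shared_fraction(alignments, locus_a, locus_b):
    """Fraction of reads aligned to locus_a that are also aligned to locus_b."""
    reads_a = {read for read, loci in alignments.items() if locus_a in loci}
    if not reads_a:
        return 0.0
    shared = {read for read in reads_a if locus_b in alignments[read]}
    return len(shared) / len(reads_a)


# Hypothetical toy data: read id -> set of loci the read aligns to.
alignments = {
    "read1": {"GENE", "PSEUDOGENE"},
    "read2": {"GENE"},
    "read3": {"GENE", "PSEUDOGENE"},
    "read4": {"PSEUDOGENE"},
    "read5": {"GENE", "PSEUDOGENE"},
}

fraction = shared_fraction(alignments, "GENE", "PSEUDOGENE")
print(f"{fraction:.0%} of reads on GENE also map to PSEUDOGENE")
```

A high fraction under this measure signals that short-read quantification at the locus is unreliable, which is the motivation the abstract gives for using long reads.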
Topics: Humans; Glucosylceramidase; Pseudogenes; Brain; Molecular Sequence Annotation; Parkinson Disease; Gaucher Disease; Sequence Analysis, RNA
PubMed: 38924406
DOI: 10.1126/sciadv.adk1296
-
Frontiers in Immunology 2022
Type 2 diabetes mellitus (T2DM) has been confirmed to be closely associated with breast cancer (BC). However, the shared mechanisms between these diseases remain unclear. By comparing different datasets, we identified shared differentially expressed (DE) RNAs in T2DM and BC, including 427 mRNAs and 6 miRNAs from the Gene Expression Omnibus (GEO) database. We used databases to predict interactions to construct two critical networks. The transcription factor (TF)-miRNA-mRNA network contained 236 TFs, while the RNA binding protein (RBP)-pseudogene-mRNA network showed that the pseudogene S-phase kinase associated protein 1 pseudogene 1 (SKP1P1) might play a key role in regulating gene expression. The shared mRNAs between T2DM and BC were enriched in cytochrome (CYP) pathways, and further analysis of and expression in cell lines, single cells and other cancers showed that they were strongly correlated with the survival and prognosis of patients with BC. This result suggested that patients with T2DM presenting the downregulation of and might have a higher risk of developing BC. Overall, our work revealed that high expression of CYPs in patients with T2DM might be a susceptibility factor for BC and identified novel gene candidates and immune features that are promising targets for immunotherapy in patients with BC.
Topics: Breast Neoplasms; Cytochromes; Diabetes Mellitus, Type 2; Female; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Humans; MicroRNAs; Pseudogenes; RNA, Messenger; RNA-Binding Proteins; S-Phase Kinase-Associated Proteins; Transcription Factors
PubMed: 36131924
DOI: 10.3389/fimmu.2022.915017
-
Movement Disorders Clinical Practice Jul 2022
PubMed: 35844281
DOI: 10.1002/mdc3.13499
-
International Journal of Environmental... Mar 2022
Identification of Candidate lncRNA and Pseudogene Biomarkers Associated with Carbon-Nanotube-Induced Malignant Transformation of Lung Cells and Prediction of Potential Preventive Drugs.
Mounting evidence has linked carbon nanotube (CNT) exposure with malignant transformation of lungs. Long non-coding RNAs (lncRNAs) and pseudogenes are important regulators to mediate the pathogenesis of diseases, representing potential biomarkers for surveillance of lung carcinogenesis in workers exposed to CNTs and possible targets to develop preventive strategies. The aim of this study was to screen crucial lncRNAs and pseudogenes and predict preventive drugs. GSE41178 (small airway epithelial cells exposed to single- or multi-walled CNTs or dispersant control) and GSE56104 (lung epithelial cells exposed to single-walled CNTs or dispersant control) datasets were downloaded from the Gene Expression Omnibus database. Weighted correlation network analysis was performed for these two datasets, and the turquoise module was preserved and associated with CNT-induced malignant phenotypes. In total, 24 lncRNAs and 112 pseudogenes in this module were identified as differentially expressed in CNT-exposed cells compared with controls. Four lncRNAs (MEG3, ARHGAP5-AS1, LINC00174 and PVT1) and five pseudogenes (MT1JP, MT1L, RPL23AP64, ZNF826P and TMEM198B) were predicted to function by competing endogenous RNA (MEG3/RPL23AP64-hsa-miR-942-5p-CPEB2/PHF21A/BAMBI; ZNF826P-hsa-miR-23a-3p-SYNGAP1, TMEM198B-hsa-miR-15b-5p-SYNGAP1/CLU; PVT1-hsa-miR-423-5p-PSME3) or co-expression (MEG3/MT1L/ZNF826P/MT1JP-ATM; ARHGAP5-AS1-TMED10, LINC00174-NEDD4L, ARHGAP5-AS1/PVT1-NIP7; MT1L/MT1JP-SYNGAP1; MT1L/MT1JP-CLU) mechanisms. The expression levels and prognosis of all genes in the above interaction pairs were validated using lung cancer patient samples. The receiver operating characteristic curve analysis showed the combination of four lncRNAs, five pseudogenes or lncRNAs + pseudogenes were all effective for predicting lung cancer (accuracy >0.8). 
The comparative toxicogenomics database suggested schizandrin A, folic acid, zinc or gamma-linolenic acid may be preventive drugs by reversing the expression levels of lncRNAs or pseudogenes. In conclusion, this study highlights lncRNAs and pseudogenes as candidate diagnostic biomarkers and drug targets for CNT-induced lung cancer.
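The abstract reports receiver operating characteristic analysis with "accuracy >0.8" but does not define the metric. As a minimal sketch, assuming the figure refers to the area under the ROC curve (AUC), the pairwise-comparison form of AUC can be computed as below. The labels and scores are hypothetical illustration data, not results from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = tumor sample, 0 = control; scores from a marker panel.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a marker panel that ranks samples no better than chance, so values above 0.8 would indicate useful discrimination under this assumed reading.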
Topics: Biomarkers; Gene Expression Regulation, Neoplastic; Humans; Lung; Lung Neoplasms; MicroRNAs; Nanotubes, Carbon; Nuclear Proteins; Pseudogenes; RNA, Long Noncoding
PubMed: 35270630
DOI: 10.3390/ijerph19052936