- Genome Biology Aug 2021
BACKGROUND
The human genome encodes over 14,000 pseudogenes that are evolutionary relics of protein-coding genes and commonly considered as nonfunctional. Emerging evidence suggests that some pseudogenes may exert important functions. However, to what extent human pseudogenes are functionally relevant remains unclear. There has been no large-scale characterization of pseudogene function because of technical challenges, including high sequence similarity between pseudogene and parent genes, and poor annotation of transcription start sites.
RESULTS
To overcome these technical obstacles, we develop an integrated computational pipeline to design the first genome-wide library of CRISPR interference (CRISPRi) single-guide RNAs (sgRNAs) that target human pseudogene promoter-proximal regions. We perform the first pseudogene-focused CRISPRi screen in luminal A breast cancer cells and reveal approximately 70 pseudogenes that affect breast cancer cell fitness. Among the top hits, we identify a cancer-testis unitary pseudogene, MGAT4EP, that is predominantly localized in the nucleus and interacts with FOXA1, a key regulator in luminal A breast cancer. By enhancing the promoter binding of FOXA1, MGAT4EP upregulates the expression of oncogenic transcription factor FOXM1. Integrative analyses of multi-omic data from the Cancer Genome Atlas (TCGA) reveal many unitary pseudogenes whose expressions are significantly dysregulated and/or associated with overall/relapse-free survival of patients in diverse cancer types.
CONCLUSIONS
Our study represents the first large-scale study characterizing pseudogene function. Our findings suggest the importance of nuclear function of unitary pseudogenes and underscore their underappreciated roles in human diseases. The functional genomic resources developed here will greatly facilitate the study of human pseudogene function.
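The sgRNA design problem described in the Results (guides near a pseudogene promoter that do not also match the highly similar parent gene) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the SpCas9 NGG PAM, the 20-nt guide length, and the exact-match uniqueness test are assumptions chosen for illustration, and a real design tool would also score mismatch-tolerant off-targets and restrict candidates to a window around the annotated transcription start site.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string (A/C/G/T/N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_protospacers(seq: str, guide_len: int = 20):
    """List (strand, start, protospacer) for each guide_len-mer followed by NGG.

    `start` is counted on the scanned strand, so for "-" hits it is an
    offset into the reverse complement of `seq`.
    """
    hits = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # Stop early enough that the 3-nt PAM still fits inside the sequence.
        for i in range(len(s) - guide_len - 2):
            pam = s[i + guide_len : i + guide_len + 3]
            if pam[1:] == "GG":
                hits.append((strand, i, s[i : i + guide_len]))
    return hits


def distinguishes_parent(protospacer: str, parent_seq: str) -> bool:
    """True if the guide has no exact match on either strand of the parent gene.

    Exact matching is a simplification (assumption); it ignores near-matches.
    """
    parent = parent_seq.upper()
    return (protospacer not in parent
            and protospacer not in reverse_complement(parent))


if __name__ == "__main__":
    promoter = "A" * 20 + "TGG"          # hypothetical promoter-proximal sequence
    parent = "ACGT" * 10                 # hypothetical parent-gene sequence
    for strand, start, guide in find_protospacers(promoter):
        print(strand, start, guide, distinguishes_parent(guide, parent))
```

Candidates that fail `distinguishes_parent` would be discarded, which reflects the abstract's point that sequence similarity between pseudogenes and parent genes is the main obstacle to a pseudogene-specific library.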
Topics: Breast Neoplasms; Cell Nucleus; Cell Proliferation; Clustered Regularly Interspaced Short Palindromic Repeats; Computational Biology; Forkhead Box Protein M1; Gene Expression Regulation, Neoplastic; Hepatocyte Nuclear Factor 3-alpha; Humans; MCF-7 Cells; Promoter Regions, Genetic; Protein Binding; Pseudogenes; RNA, Guide, CRISPR-Cas Systems; Reproducibility of Results; Up-Regulation
PubMed: 34425866
DOI: 10.1186/s13059-021-02464-2
- Theranostics 2020
Review
Pseudogenes were initially regarded as "nonfunctional" genomic elements that did not have protein-coding abilities due to several endogenous inactivating mutations. Although pseudogenes are widely expressed in prokaryotes and eukaryotes, for decades, they have been largely ignored and classified as gene "junk" or "relics". With the widespread availability of high-throughput sequencing analysis, especially omics technologies, knowledge concerning pseudogenes has substantially increased. Pseudogenes are evolutionarily conserved and derive primarily from a mutation or retrotransposon, conferring the pseudogene with a "gene repository" role to store and expand genetic information. In contrast to previous notions, pseudogenes have a variety of functions at the DNA, RNA and protein levels for broadly participating in gene regulation to influence the development and progression of certain diseases, especially cancer. Indeed, some pseudogenes have been proven to encode proteins, strongly contradicting their "trash" identification, and have been confirmed to have tissue-specific and disease subtype-specific expression, indicating their own value in disease diagnosis. Moreover, pseudogenes have been correlated with the life expectancy of patients and exhibit great potential for future use in disease treatment, suggesting that they are promising biomarkers and therapeutic targets for clinical applications. In this review, we summarize the natural properties, functions, disease involvement and clinical value of pseudogenes. Although our knowledge of pseudogenes remains nascent, this field deserves more attention and deeper exploration.
Topics: Biomarkers; Diagnostic Techniques and Procedures; Evolution, Molecular; Gene Expression Regulation; Humans; Life Expectancy; Mutation; Neoplasms; Prognosis; Pseudogenes; Therapeutics
PubMed: 32042317
DOI: 10.7150/thno.40659
- Science Immunology Nov 2022
Herpes simplex virus 1 (HSV-1) infects several billion people worldwide and can cause life-threatening herpes simplex encephalitis (HSE) in some patients. Monogenic defects in components of the type I interferon system have been identified in patients with HSE, emphasizing the role of inborn errors of immunity underlying HSE pathogenesis. Here, we identify compound heterozygous loss-of-function mutations in the gene encoding transcription factor IIIA (TFIIIA), a component of the RNA polymerase III complex, in a patient with common variable immunodeficiency and HSE. Patient fibroblasts and gene-edited cells displayed impaired HSV-1-induced innate immune responses and enhanced HSV-1 replication. Chromatin immunoprecipitation sequencing analysis identified the 5S ribosomal RNA pseudogene 141 (RNA5SP141), an endogenous ligand of the RNA sensor RIG-I, as a transcriptional target of TFIIIA. TFIIIA-mutant cells exhibited diminished expression of this pseudogene and abrogated RIG-I activation upon HSV-1 infection. Our work unveils a crucial role for TFIIIA in transcriptional regulation of a cellular RIG-I agonist and shows that genetic defects in TFIIIA lead to impaired cell-intrinsic anti-HSV-1 responses and can predispose to HSE.
Topics: Humans; Encephalitis, Herpes Simplex; Pseudogenes; RNA; Ligands; Transcription Factor TFIIIA; Herpesvirus 1, Human; Mutation
PubMed: 36399538
DOI: 10.1126/sciimmunol.abq4531
- Genome Biology and Evolution Oct 2022
Trypanosomatids belong to a remarkable group of unicellular, parasitic organisms of the order Kinetoplastida, an early diverging branch of the phylogenetic tree of eukaryotes, exhibiting intriguing biological characteristics affecting gene expression (intronless polycistronic transcription, trans-splicing, and RNA editing), metabolism, surface molecules, and organelles (compartmentalization of glycolysis, variation of the surface molecules, and unique mitochondrial DNA), cell biology and life cycle (phagocytic vacuoles evasion and intricate patterns of cell morphogenesis). With numerous genomic-scale data of several trypanosomatids becoming available since 2005 (genomes, transcriptomes, and proteomes), the scientific community can further investigate the mechanisms underlying these unusual features and address other unexplored phenomena possibly revealing biological aspects of the early evolution of eukaryotes. One fundamental aspect comprises the processes and mechanisms involved in the acquisition and loss of genes throughout the evolutionary history of these primitive microorganisms. Here, we present a comprehensive in silico analysis of pseudogenes in three major representatives of this group: Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi. Pseudogenes, DNA segments originating from altered genes that lost their original function, are genomic relics that can offer an essential record of the evolutionary history of functional genes, as well as clues about the dynamics and evolution of hosting genomes. 
Scanning these genomes with functional proteins as proxies to reveal intergenic regions with protein-coding features, relying on a customized threshold to distinguish statistically and biologically significant sequence similarities, and reassembling remnant sequences from their debris, we found thousands of pseudogenes and hundreds of open reading frames, with particular characteristics in each trypanosomatid: mutation profile, number, content, density, codon bias, average size, single- or multi-copy gene origin, number and type of mutations, putative primitive function, and transcriptional activity. These features suggest a common process of pseudogene formation, different patterns of pseudogene evolution and extant biological functions, and/or distinct genome organization undertaken by those parasites during evolution, as well as different evolutionary and/or selective pressures acting on distinct lineages.
Topics: Animals; Pseudogenes; Phylogeny; Open Reading Frames; Genome; Trypanosoma brucei brucei; Parasites
PubMed: 36208292
DOI: 10.1093/gbe/evac142
- Genes Jul 2021
Review
Although ignored in the past, with the recent deepening of research, significant progress has been made in the field of non-coding RNAs (ncRNAs). Accumulating evidence has revealed that microRNA (miRNA) response elements regulate RNA. Long ncRNAs, circular RNAs, pseudogenes, miRNAs, and messenger RNAs (mRNAs) form a competitive endogenous RNA (ceRNA) network that plays an essential role in cancer and cardiovascular, neurodegenerative, and autoimmune diseases. Gastric cancer (GC) is one of the most common cancers, with a high degree of malignancy. Considerable progress has been made in understanding the molecular mechanism and treatment of GC, but GC's mortality rate is still high. Studies have shown a complex ceRNA crosstalk mechanism in GC. lncRNAs, circRNAs, and pseudogenes can interact with miRNAs to affect mRNA transcription. The study of the involvement of ceRNA in GC could improve our understanding of GC and lead to the identification of potential effective therapeutic targets. The research strategy for ceRNA is mainly to screen the different miRNAs, lncRNAs, circRNAs, pseudogenes, and mRNAs in each sample through microarray or sequencing technology, predict the ceRNA regulatory network, and, finally, conduct functional research on ceRNA. In this review, we briefly discuss the proposal and development of the ceRNA hypothesis and the biological function and principle of ceRNAs in GC, and briefly introduce the role of ncRNAs in the GC's ceRNA network.
Topics: Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Helicobacter pylori; Humans; MicroRNAs; Microarray Analysis; Pseudogenes; RNA, Circular; RNA, Long Noncoding; RNA, Messenger; RNA, Untranslated; Response Elements; Stomach Neoplasms
PubMed: 34356052
DOI: 10.3390/genes12071036
- Nucleic Acids Research Jul 2023
Several atlasing efforts aim to profile human gene and protein expression across tissues, cell types and cell lines in normal physiology, development and disease. One utility of these resources is to examine the expression of a single gene across all cell types, tissues and cell lines in each atlas. However, there is currently no centralized place that integrates data from several atlases to provide this type of data in a uniform format for visualization, analysis and download, and via an application programming interface. To address this need, GeneRanger is a web server that provides access to processed data about gene and protein expression across normal human cell types, tissues and cell lines from several atlases. At the same time, TargetRanger is a related web server that takes as input RNA-seq data from profiled human cells and tissues, and then compares the uploaded input data to expression levels across the atlases to identify genes that are highly expressed in the input and lowly expressed across normal human cell types and tissues. Identified targets can be filtered by transmembrane or secreted proteins. The results from GeneRanger and TargetRanger are visualized as box and scatter plots, and as interactive tables. GeneRanger and TargetRanger are available from https://generanger.maayanlab.cloud and https://targetranger.maayanlab.cloud, respectively.
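The target-selection idea described above (genes highly expressed in the input sample but lowly expressed across normal tissues) can be sketched as a simple filter. This is only an illustration of the concept, not the TargetRanger implementation or its API: the function name, the data layout, the use of per-tissue medians, and both thresholds are assumptions.

```python
from statistics import median


def candidate_targets(sample_expr, atlas_expr, min_sample=10.0, max_normal=1.0):
    """Return genes that are high in the sample and low in every normal tissue.

    sample_expr: {gene: expression value in the input sample}
    atlas_expr:  {gene: {tissue: [expression values in normal samples]}}
    Thresholds are hypothetical and share the units of the expression values.
    """
    hits = []
    for gene, value in sample_expr.items():
        tissues = atlas_expr.get(gene)
        if not tissues:
            continue  # gene not covered by the atlas, so it cannot be assessed
        # Summarize each normal tissue by its median, then take the worst case.
        highest_normal = max(median(values) for values in tissues.values())
        if value >= min_sample and highest_normal <= max_normal:
            hits.append(gene)
    return sorted(hits)


if __name__ == "__main__":
    sample = {"GENE_A": 50.0, "GENE_B": 40.0, "GENE_C": 0.5}
    atlas = {
        "GENE_A": {"liver": [0.1, 0.2, 0.3], "brain": [0.0, 0.1, 0.2]},
        "GENE_B": {"liver": [30.0, 35.0, 40.0], "brain": [0.1, 0.1, 0.2]},
        "GENE_C": {"liver": [0.1, 0.1, 0.1], "brain": [0.2, 0.2, 0.2]},
    }
    print(candidate_targets(sample, atlas))
```

Taking the maximum over tissue medians is a deliberately conservative choice in this sketch: a gene is rejected as soon as a single normal tissue expresses it above the threshold.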
Topics: Humans; Cell Line; Proteomics; Pseudogenes; RNA-Seq; Software; Internet
PubMed: 37166966
DOI: 10.1093/nar/gkad399
- Genome Biology Jan 2021
Review
PIWI proteins, a subfamily of PAZ/PIWI Domain family RNA-binding proteins, are best known for their function in silencing transposons and germline development by partnering with small noncoding RNAs called PIWI-interacting RNAs (piRNAs). However, recent studies have revealed multifaceted roles of the PIWI-piRNA pathway in regulating the expression of other major classes of RNAs in germ cells. In this review, we summarize how PIWI proteins and piRNAs regulate the expression of many disparate RNAs, describing a highly complex global genomic regulatory relationship at the RNA level through which piRNAs functionally connect all major constituents of the genome in the germline.
Topics: Animals; Caenorhabditis elegans; DNA Transposable Elements; Drosophila; Drosophila Proteins; Gene Expression Regulation, Developmental; Gene Silencing; Germ Cells; Pseudogenes; RNA, Long Noncoding; RNA, Messenger; RNA, Small Interfering; RNA-Binding Proteins
PubMed: 33419460
DOI: 10.1186/s13059-020-02221-x
- Genome Medicine May 2017
Review
The Human Genome Project and advances in DNA sequencing technologies have revolutionized the identification of genetic disorders through the use of clinical exome sequencing. However, in a considerable number of patients, the genetic basis remains unclear. As clinicians begin to consider whole-genome sequencing, an understanding of the processes and tools involved and the factors to consider in the annotation of the structure and function of genomic elements that might influence variant identification is crucial. Here, we discuss and illustrate the strengths and weaknesses of approaches for the annotation and classification of important elements of protein-coding genes, other genomic elements such as pseudogenes and the non-coding genome, comparative-genomic approaches for inferring gene function, and new technologies for aiding genome annotation, as a practical guide for clinicians when considering pathogenic sequence variation. Complete and accurate annotation of structure and function of genome features has the potential to reduce both false-negative (from missing annotation) and false-positive (from incorrect annotation) errors in causal variant identification in exome and genome sequences. Re-analysis of unsolved cases will be necessary as newer technology improves genome annotation, potentially improving the rate of diagnosis.
Topics: Diagnostic Techniques and Procedures; Genetic Variation; Humans; Molecular Sequence Annotation; Pseudogenes; Sequence Analysis, DNA
PubMed: 28558813
DOI: 10.1186/s13073-017-0441-1
- Hepatology (Baltimore, Md.) Jun 2023
BACKGROUND AND AIMS
Interferon (IFN) signaling is critical to the pathogenesis of alcohol-associated hepatitis (AH), yet the mechanisms for activation of this system are elusive. We hypothesize that host-derived 5S rRNA pseudogene (RNA5SP) transcripts regulate IFN production and modify immunity in AH.
APPROACH AND RESULTS
Mining of transcriptomic datasets revealed that in patients with severe alcohol-associated hepatitis (sAH), hepatic expression of genes regulated by IFNs was perturbed and gene sets involved in IFN production were enriched. RNA5SP transcripts were also increased and correlated with expression of type I IFNs. Interestingly, inflammatory mediators upregulated in sAH, but not in other liver diseases, were positively correlated with certain RNA5SP transcripts. Real-time quantitative PCR demonstrated that RNA5SP transcripts were upregulated in peripheral blood mononuclear cells (PBMCs) from patients with sAH. In sAH livers, increased 5S rRNA and reduced nuclear MAF1 (MAF1 homolog, negative regulator of RNA polymerase III) protein suggested a higher activity of RNA polymerase III (Pol III); inhibition of Pol III reduced RNA5SP expression in monocytic THP-1 cells. Expression of several RNA5SP transcript-interacting proteins was downregulated in sAH, potentially unmasking transcripts to immunosensors. Indeed, siRNA knockdown of interacting proteins potentiated the immunostimulatory activity of RNA5SP transcripts. Molecular interaction and cell viability assays demonstrated that RNA5SP transcripts adopted Z-conformation and contributed to ZBP1-mediated caspase-independent cell death.
CONCLUSIONS
Increased expression and binding availability of RNA5SP transcripts was associated with hepatic IFN production and inflammation in sAH. These data identify RNA5SP transcripts as a potential target to mitigate inflammation and hepatocellular injury in AH.
Topics: Humans; RNA, Ribosomal, 5S; Pseudogenes; RNA Polymerase III; Biosensing Techniques; Leukocytes, Mononuclear; Immunoassay; Inflammation; Hepatitis, Alcoholic; Interferon Type I
PubMed: 36645226
DOI: 10.1097/HEP.0000000000000024
- Medicine Feb 2016
Meta-Analysis Review
Osteoarthritis (OA) is a complex disorder characterized by degenerative articular cartilage and is largely attributed to genetic risk factors. Single nucleotide polymorphisms (SNPs) are common DNA variants that have shown promise and efficiency, compared with positional cloning, in mapping candidate genes of complex diseases, including OA. In this study, we aim to provide an overview of multiple SNPs from a number of genes that have recently been linked to OA susceptibility. We also performed a comprehensive meta-analysis to evaluate the association of SNP rs7639618 of the double von Willebrand factor A domains (DVWA) gene with OA susceptibility. A systematic search of studies on the association of SNPs with susceptibility to OA was conducted in PubMed and Google Scholar. Studies subjected to meta-analysis were human case-control studies that met the Hardy-Weinberg equilibrium model and provided sufficient data to calculate an odds ratio (OR). A total of 9500 OA cases and 9365 controls in 7 case-control studies relating to SNP rs7639618 were included in this study, and the ORs with 95% confidence intervals (CIs) were calculated. Over 50 SNPs from different genes have been shown to be associated with hip OA (23), knee OA (20), or both (13). The ORs of these SNPs for OA and its subtypes are not consistent. As to SNP rs7639618 of DVWA, increased knee OA risk was observed in all genetic models analyzed. Specifically, Asian individuals with the G allele showed a significantly increased risk of knee OA (A versus G: OR = 1.28, 95% CI 1.13-1.46; AA versus GG: OR = 1.60, 95% CI 1.25-2.05; GA versus GG: OR = 1.31, 95% CI 1.18-1.44; AA versus GA+GG: OR = 1.34, 95% CI 1.12-1.61; AA+GA versus GG: OR = 1.40, 95% CI 1.19-1.64); no such association was observed in Caucasians or for hip OA. Our results suggest that multiple SNPs play different roles in the pathogenesis of OA and its subtypes; SNP rs7639618 of the DVWA gene is associated with a significantly increased risk of knee OA in Asians. Given the limited sample size, further studies are needed to evaluate this observation.
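For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a single case-control comparison, a minimal sketch follows. It uses the standard log-odds (Woolf) approximation on one 2x2 table; the function name and the counts are hypothetical and are not taken from the study, and the sketch does not perform the pooling across studies that a meta-analysis requires.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and approximate 95% CI for a 2x2 case-control table.

    a: cases with the risk allele      b: cases without it
    c: controls with the risk allele   d: controls without it
    All four counts must be positive for the log-based interval to exist.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
    print(f"OR = {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as in each comparison reported above for knee OA in Asians, is what supports calling the association statistically significant at the 5% level.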
Topics: Collagen Type VI; Genetic Heterogeneity; Genetic Predisposition to Disease; Humans; Odds Ratio; Osteoarthritis; Polymorphism, Single Nucleotide; Pseudogenes
PubMed: 26886631
DOI: 10.1097/MD.0000000000002811