-
International Journal of Molecular... Nov 2016
Review
Pseudogenes are paralogs generated from ancestral functional genes (parents) during genome evolution, which contain critical defects in their sequences, such as lacking a promoter, having a premature stop codon or frameshift mutations. Generally, pseudogenes are functionless, but recent evidence demonstrates that some of them have potential roles in regulation. The majority of pseudogenes are generated from functional progenitor genes either by gene duplication (duplicated pseudogenes) or retro-transposition (processed pseudogenes). Pseudogenes are primarily identified by comparison to their parent genes. Bioinformatics tools for pseudogene prediction have been developed, among which PseudoPipe, PSF and Shiu's pipeline are publicly available. We compared these three tools using the well-annotated genome and its known 924 pseudogenes as a test data set. PseudoPipe and Shiu's pipeline identified ~80% of pseudogenes, of which 94% were shared, while PSF failed to generate adequate results. A need for improvement of the bioinformatics tools for pseudogene prediction accuracy in plant genomes was thus identified, with the ultimate goal of improving the quality of genome annotation in plants.
Topics: Computational Biology; Gene Duplication; Genome, Plant; Pseudogenes
PubMed: 27916797
DOI: 10.3390/ijms17121991 -
Microbial Genomics Oct 2022
Whole-genome sequence analyses have significantly contributed to the understanding of virulence and evolution of the Mycobacterium tuberculosis complex (MTBC), the causative pathogens of tuberculosis. Most MTBC evolutionary studies are focused on single nucleotide polymorphisms and deletions, but few studies have evaluated gene content, and none has comprehensively evaluated pseudogenes. Accordingly, we describe an extensive study focused on quantifying and predicting possible functions of MTBC and pseudogenes. Using NCBI's PGAP-detected pseudogenes, we analysed 25 837 pseudogenes from 158 MTBC and strains and combined transcriptomics and proteomics of H37Rv to gain insights about pseudogenes' expression. Our results indicate significant variability concerning rate and conservation of predicted pseudogenes among different ecotypes and lineages of tuberculous mycobacteria and pseudogenization of important virulence factors and genes of the metabolism and antimicrobial resistance/tolerance. We show that predicted pseudogenes contribute considerably to MTBC genetic diversity at the population level. Moreover, the transcription machinery of can fully transcribe most pseudogenes, indicating intact promoters and recent pseudogene evolutionary emergence. Proteomics of and close evaluation of mutational lesions driving pseudogenization suggest that few predicted pseudogenes are likely capable of neofunctionalization, nonsense mutation reversal, or phase variation, contradicting the classical definition of pseudogenes. Such findings indicate that genome annotation should be accompanied by proteomics and protein function assays to improve its accuracy. While indels and insertion sequences are the main drivers of the observed mutational lesions in these species, population bottlenecks and genetic drift are likely the evolutionary processes acting on pseudogenes' emergence over time. Our findings unveil a new perspective on MTBC's evolution and genetic diversity.
Topics: Anti-Infective Agents; Codon, Nonsense; DNA Transposable Elements; Mycobacterium tuberculosis; Pseudogenes; Virulence Factors; Drug Resistance, Bacterial
PubMed: 36250787
DOI: 10.1099/mgen.0.000876 -
Theranostics 2020
Review
Pseudogenes were initially regarded as "nonfunctional" genomic elements that did not have protein-coding abilities due to several endogenous inactivating mutations. Although pseudogenes are widely expressed in prokaryotes and eukaryotes, for decades, they have been largely ignored and classified as gene "junk" or "relics". With the widespread availability of high-throughput sequencing analysis, especially omics technologies, knowledge concerning pseudogenes has substantially increased. Pseudogenes are evolutionarily conserved and derive primarily from a mutation or retrotransposon, conferring the pseudogene with a "gene repository" role to store and expand genetic information. In contrast to previous notions, pseudogenes have a variety of functions at the DNA, RNA and protein levels for broadly participating in gene regulation to influence the development and progression of certain diseases, especially cancer. Indeed, some pseudogenes have been proven to encode proteins, strongly contradicting their "trash" identification, and have been confirmed to have tissue-specific and disease subtype-specific expression, indicating their own value in disease diagnosis. Moreover, pseudogenes have been correlated with the life expectancy of patients and exhibit great potential for future use in disease treatment, suggesting that they are promising biomarkers and therapeutic targets for clinical applications. In this review, we summarize the natural properties, functions, disease involvement and clinical value of pseudogenes. Although our knowledge of pseudogenes remains nascent, this field deserves more attention and deeper exploration.
Topics: Biomarkers; Diagnostic Techniques and Procedures; Evolution, Molecular; Gene Expression Regulation; Humans; Life Expectancy; Mutation; Neoplasms; Prognosis; Pseudogenes; Therapeutics
PubMed: 32042317
DOI: 10.7150/thno.40659 -
Genome Biology and Evolution Oct 2022
Trypanosomatids belong to a remarkable group of unicellular, parasitic organisms of the order Kinetoplastida, an early diverging branch of the phylogenetic tree of eukaryotes, exhibiting intriguing biological characteristics affecting gene expression (intronless polycistronic transcription, trans-splicing, and RNA editing), metabolism, surface molecules, and organelles (compartmentalization of glycolysis, variation of the surface molecules, and unique mitochondrial DNA), cell biology and life cycle (phagocytic vacuoles evasion and intricate patterns of cell morphogenesis). With numerous genomic-scale data of several trypanosomatids becoming available since 2005 (genomes, transcriptomes, and proteomes), the scientific community can further investigate the mechanisms underlying these unusual features and address other unexplored phenomena possibly revealing biological aspects of the early evolution of eukaryotes. One fundamental aspect comprises the processes and mechanisms involved in the acquisition and loss of genes throughout the evolutionary history of these primitive microorganisms. Here, we present a comprehensive in silico analysis of pseudogenes in three major representatives of this group: Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi. Pseudogenes, DNA segments originating from altered genes that lost their original function, are genomic relics that can offer an essential record of the evolutionary history of functional genes, as well as clues about the dynamics and evolution of hosting genomes. 
Scanning these genomes with functional proteins as proxies to reveal intergenic regions with protein-coding features, relying on a customized threshold to distinguish statistically and biologically significant sequence similarities, and reassembling remnant sequences from their debris, we found thousands of pseudogenes and hundreds of open reading frames, with particular characteristics in each trypanosomatid: mutation profile, number, content, density, codon bias, average size, single- or multi-copy gene origin, number and type of mutations, putative primitive function, and transcriptional activity. These features suggest a common process of pseudogene formation, different patterns of pseudogene evolution and extant biological functions, and/or distinct genome organization undertaken by those parasites during evolution, as well as different evolutionary and/or selective pressures acting on distinct lineages.
Topics: Animals; Pseudogenes; Phylogeny; Open Reading Frames; Genome; Trypanosoma brucei brucei; Parasites
PubMed: 36208292
DOI: 10.1093/gbe/evac142 -
Methods in Molecular Biology (Clifton,... 2021
One of the most commonly described biological features of processed pseudogenes is the ability to influence the expression of their parental coding genes. As evidenced in several studies, the high sequence similarity between these RNA pairs sets up a certain level of competition for posttranscriptional regulators, including, among others, RNA-binding proteins (RBPs). RBPs may affect, positively or negatively, the stability of bound mRNAs, so that, if an overexpressed pseudogene competes with its homologous coding gene, the downstream protein synthesis would change, with potential pathological consequences. Given these premises, a rigorous and comprehensive understanding of interactions between pseudogene-parental gene RNA pairs and RBPs could provide further insights into the biological bases of complex diseases, such as cancer, cardiovascular disease, and type 2 diabetes, identifying novel predictive and/or prognostic biomarkers. Herein, we detail easily adaptable protocols of plasmid-based molecular cloning and RNA-electrophoretic mobility shift assay (EMSA) used in our laboratory for determining the interaction between a cytoplasmic stabilizing protein (αCP1) and the pseudogene-parental gene RNA pair HMGA1-p/HMGA1. We also offer a general overview of RNA immunoprecipitation procedures and present novel bioinformatic tools for predicting RBP binding sites on pseudogene transcripts.
Topics: 3' Untranslated Regions; Binding Sites; Binding, Competitive; Biotinylation; Diabetes Mellitus; Electrophoretic Mobility Shift Assay; HMGA1a Protein; Humans; Immunoprecipitation; Luminescent Measurements; Protein Binding; Pseudogenes; RNA; RNA Probes; RNA Stability; RNA, Messenger; RNA-Binding Proteins; Reverse Transcriptase Polymerase Chain Reaction; Sequence Deletion; Transfection
PubMed: 34165716
DOI: 10.1007/978-1-0716-1503-4_12 -
Genome Biology Aug 2021
BACKGROUND
The human genome encodes over 14,000 pseudogenes that are evolutionary relics of protein-coding genes and commonly considered as nonfunctional. Emerging evidence suggests that some pseudogenes may exert important functions. However, to what extent human pseudogenes are functionally relevant remains unclear. There has been no large-scale characterization of pseudogene function because of technical challenges, including high sequence similarity between pseudogene and parent genes, and poor annotation of transcription start sites.
RESULTS
To overcome these technical obstacles, we develop an integrated computational pipeline to design the first genome-wide library of CRISPR interference (CRISPRi) single-guide RNAs (sgRNAs) that target human pseudogene promoter-proximal regions. We perform the first pseudogene-focused CRISPRi screen in luminal A breast cancer cells and reveal approximately 70 pseudogenes that affect breast cancer cell fitness. Among the top hits, we identify a cancer-testis unitary pseudogene, MGAT4EP, that is predominantly localized in the nucleus and interacts with FOXA1, a key regulator in luminal A breast cancer. By enhancing the promoter binding of FOXA1, MGAT4EP upregulates the expression of oncogenic transcription factor FOXM1. Integrative analyses of multi-omic data from the Cancer Genome Atlas (TCGA) reveal many unitary pseudogenes whose expressions are significantly dysregulated and/or associated with overall/relapse-free survival of patients in diverse cancer types.
CONCLUSIONS
Our study represents the first large-scale study characterizing pseudogene function. Our findings suggest the importance of nuclear function of unitary pseudogenes and underscore their underappreciated roles in human diseases. The functional genomic resources developed here will greatly facilitate the study of human pseudogene function.
Topics: Breast Neoplasms; Cell Nucleus; Cell Proliferation; Clustered Regularly Interspaced Short Palindromic Repeats; Computational Biology; Forkhead Box Protein M1; Gene Expression Regulation, Neoplastic; Hepatocyte Nuclear Factor 3-alpha; Humans; MCF-7 Cells; Promoter Regions, Genetic; Protein Binding; Pseudogenes; RNA, Guide, CRISPR-Cas Systems; Reproducibility of Results; Up-Regulation
PubMed: 34425866
DOI: 10.1186/s13059-021-02464-2 -
Methods in Molecular Biology (Clifton,... 2021
Review
Aberrant expression of pseudogenes has been observed in many cancer types. Deregulated pseudogenes engage in a multitude of biological processes at the DNA, RNA, and protein levels and eventually facilitate disease progression. To investigate pseudogene functions in cancer, cell lines and cell line transplantation models have been widely used. However, cancer biology is best studied in the context of an intact organism. Here, we present various strategies to investigate pseudogenes in genetically engineered mouse models and discuss advantages and disadvantages of the different approaches.
Topics: Animals; Cell Line, Tumor; Drug Resistance, Microbial; Embryonic Stem Cells; Gene Expression Regulation; Genes, Synthetic; Heterografts; Humans; Mice; Mice, Transgenic; Molecular Targeted Therapy; Neoplasm Transplantation; Neoplasms, Experimental; Promoter Regions, Genetic; Pseudogenes; RNA Interference; Recombinant Proteins; Species Specificity; Tetracycline; Up-Regulation
PubMed: 34165722
DOI: 10.1007/978-1-0716-1503-4_18 -
Journal of Immunology (Baltimore, Md. :... Jan 2022
The biological relevance of genes initially categorized as "pseudogenes" is slowly emerging, notably in innate immunity. In the HLA region on chromosome 6, HLA-H is one such pseudogene; yet, it is transcribed, and its variation is associated with immune properties. Furthermore, two alleles, * and *, putatively encode a complete, membrane-bound HLA protein. Here we thus hypothesized that HLA-H contributes to immune homeostasis similarly to tolerogenic molecules HLA-G, -E, and -F. We tested if * encodes a membrane-bound protein that can inhibit the cytotoxicity of effector cells. We used an HLA-null human erythroblast cell line transduced with * cDNA to demonstrate that HLA-H*02:07 encodes a membrane-bound protein. Additionally, using a cytotoxicity assay, our results support that K562 * inhibits human effector IL-2-activated PBMCs and human IL-2-independent NK92-MI cell line activity. Finally, through in silico genotyping of the Denisovan genome and haplotypic association with Denisovan-derived *, we also show that * is of archaic origin. Hence, admixture with archaic humans brought a functional allele into modern European and Asian populations.
Topics: Alleles; Asian People; Cell Membrane; Cytotoxicity, Immunologic; Evolution, Molecular; Gene Frequency; Genotype; HLA-A11 Antigen; Haplotypes; Hemochromatosis Protein; Homeostasis; Humans; Immune Tolerance; K562 Cells; Killer Cells, Natural; Lymphocyte Activation; Pseudogenes; White People
PubMed: 34872977
DOI: 10.4049/jimmunol.2100358 -
Nucleic Acids Research Jul 2023
Several atlasing efforts aim to profile human gene and protein expression across tissues, cell types and cell lines in normal physiology, development and disease. One utility of these resources is to examine the expression of a single gene across all cell types, tissues and cell lines in each atlas. However, there is currently no centralized place that integrates data from several atlases to provide this type of data in a uniform format for visualization, analysis and download, and via an application programming interface. To address this need, GeneRanger is a web server that provides access to processed data about gene and protein expression across normal human cell types, tissues and cell lines from several atlases. At the same time, TargetRanger is a related web server that takes as input RNA-seq data from profiled human cells and tissues, and then compares the uploaded input data to expression levels across the atlases to identify genes that are highly expressed in the input and lowly expressed across normal human cell types and tissues. Identified targets can be filtered by transmembrane or secreted proteins. The results from GeneRanger and TargetRanger are visualized as box and scatter plots, and as interactive tables. GeneRanger and TargetRanger are available from https://generanger.maayanlab.cloud and https://targetranger.maayanlab.cloud, respectively.
Topics: Humans; Cell Line; Proteomics; Pseudogenes; RNA-Seq; Software; Internet
PubMed: 37166966
DOI: 10.1093/nar/gkad399 -
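The selection idea described in the abstract above (genes highly expressed in the input and lowly expressed across normal cell types and tissues) can be sketched as a simple threshold filter. This is an assumption-laden illustration, not TargetRanger's actual method or API: the function name `candidate_targets`, the data layout, and both thresholds are hypothetical.

```python
# Hypothetical sketch of a "high in sample, low in normal tissues" filter.
# Data layout and cutoffs are invented for illustration only.
def candidate_targets(
    sample: dict[str, float],
    normal: dict[str, list[float]],
    min_sample: float = 10.0,
    max_normal: float = 1.0,
) -> list[str]:
    """Return genes above min_sample in the input and at or below
    max_normal in every normal tissue, sorted by gene name."""
    hits = []
    for gene, level in sample.items():
        normal_levels = normal.get(gene)
        # Skip genes with no normal-tissue measurements to compare against.
        if not normal_levels:
            continue
        if level >= min_sample and max(normal_levels) <= max_normal:
            hits.append(gene)
    return sorted(hits)
```

With a sample of `{"A": 50.0, "B": 50.0}` and normal levels `{"A": [0.1, 0.5], "B": [0.2, 30.0]}`, only gene `A` is kept, because `B` is highly expressed in at least one normal tissue.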
Genome Biology Jan 2021
Review
PIWI proteins, a subfamily of PAZ/PIWI Domain family RNA-binding proteins, are best known for their function in silencing transposons and germline development by partnering with small noncoding RNAs called PIWI-interacting RNAs (piRNAs). However, recent studies have revealed multifaceted roles of the PIWI-piRNA pathway in regulating the expression of other major classes of RNAs in germ cells. In this review, we summarize how PIWI proteins and piRNAs regulate the expression of many disparate RNAs, describing a highly complex global genomic regulatory relationship at the RNA level through which piRNAs functionally connect all major constituents of the genome in the germline.
Topics: Animals; Caenorhabditis elegans; DNA Transposable Elements; Drosophila; Drosophila Proteins; Gene Expression Regulation, Developmental; Gene Silencing; Germ Cells; Pseudogenes; RNA, Long Noncoding; RNA, Messenger; RNA, Small Interfering; RNA-Binding Proteins
PubMed: 33419460
DOI: 10.1186/s13059-020-02221-x