-
Nature Reviews. Genetics Mar 2020
Review
Pseudogenes are defined as regions of the genome that contain defective copies of genes. They exist across almost all forms of life, and in mammalian genomes are annotated in similar numbers to recognized protein-coding genes. Although often presumed to lack function, growing numbers of pseudogenes are being found to play important biological roles. In consideration of their evolutionary origins and inherent limitations in genome annotation practices, we posit that pseudogenes have been classified on a scientifically unsubstantiated basis. We reflect that a broad misunderstanding of pseudogenes, perpetuated in part by the pejorative inference of the 'pseudogene' label, has led to their frequent dismissal from functional assessment and exclusion from genomic analyses. With the advent of technologies that simplify the study of pseudogenes, we propose that an objective reassessment of these genomic elements will reveal valuable insights into genome function and evolution.
Topics: Animals; Evolution, Molecular; Genomics; Humans; Pseudogenes
PubMed: 31848477
DOI: 10.1038/s41576-019-0196-1 -
Frontiers in Endocrinology 2021
Review
Copy Number Variations (CNVs) account for a large proportion of the human genome and are a primary contributor to human phenotypic variation, in addition to being the molecular basis of a wide spectrum of disease. Multiallelic CNVs represent a considerable fraction of large CNVs and are strictly related to segmental duplications according to their prevalent duplicate alleles. RCCX CNV is a complex, multiallelic and tandem CNV located in the major histocompatibility complex (MHC) class III region. RCCX structure is typically defined by the copy number of a DNA segment containing a series of genes that lie close to each other: the serine/threonine kinase 19 (STK19), the complement 4 (C4), the steroid 21-hydroxylase (CYP21), and the tenascin-X (TNX). In the Caucasian population, the most common RCCX haplotype (69%) consists of two segments containing the genes , with a telomere-to-centromere orientation. Nonallelic homologous recombination (NAHR) plays a key role in RCCX genetic diversity: unequal crossover facilitates large structural rearrangements and copy number changes, whereas gene conversion mediates relatively short sequence transfers. These events have increased RCCX genetic diversity and are responsible for specific human diseases. This review provides an overview of RCCX complexity, pointing out the molecular bases of Congenital Adrenal Hyperplasia (CAH) due to CYP21A2 deficiency, CAH-X Syndrome and disorders related to CNV of complement component C4.
Topics: Adrenal Hyperplasia, Congenital; Complement C4; DNA Copy Number Variations; Genetic Variation; Humans; Nuclear Proteins; Protein Serine-Threonine Kinases; Pseudogenes; Steroid 21-Hydroxylase; Tenascin
PubMed: 34394006
DOI: 10.3389/fendo.2021.709758 -
BMC Evolutionary Biology Oct 2020
BACKGROUND
Through its ability to open pores in cell membranes, perforin-1 plays a key role in the immune system. Consistent with this role, the gene encoding perforin shows hallmarks of complex evolutionary events, including amplification and pseudogenization, in multiple species. A large proportion of these events occurred in phyla for which scarce genomic data were available. However, recent large-scale genomics projects have added a wealth of information on those phyla. Using this input, we annotated perforin-1 homologs in more than eighty species including mammals, reptiles, birds, amphibians and fishes.
RESULTS
We have annotated more than 400 perforin genes in all groups studied. Most mammalian species only have one perforin locus, which may contain a related pseudogene. However, we found four independent small expansions in unrelated members of this class. We could reconstruct the full-length coding sequences of only a few avian perforin genes, although we found incomplete and truncated forms of these genes in other birds. In the rest of Reptilia, perforin-like genes can be found in at least three different loci containing up to twelve copies. Notably, mammals, non-avian reptiles, amphibians, and possibly teleosts share at least one perforin-1 locus as assessed by flanking genes. Finally, fish genomes contain multiple perforin loci with varying copy numbers and diverse exon/intron patterns. We have also found evidence for shorter genes with high similarity to the C2 domain of perforin in several teleosts. A preliminary analysis suggests that these genes arose at least twice during evolution from perforin-1 homologs.
CONCLUSIONS
The assisted annotation of new genomic assemblies shows complex patterns of birth-and-death events in the evolution of perforin. These events include duplication/pseudogenization in mammals, multiple amplifications and losses in reptiles and fishes and at least one case of partial duplication with a novel start codon in fishes.
Topics: Amphibians; Animals; Birds; Evolution, Molecular; Fishes; Genome; Mammals; Perforin; Phylogeny; Reptiles
PubMed: 33076840
DOI: 10.1186/s12862-020-01698-1 -
BMC Medical Genomics Apr 2020
BACKGROUND
Given the vast range of molecular mechanisms giving rise to breast cancer, it is unlikely that universal cures exist. However, by providing a more precise prognosis for breast cancer patients through integrative models, treatments can become more individualized, resulting in more successful outcomes. Specifically, we combine gene expression, pseudogene expression, miRNA expression, clinical factors, and pseudogene-gene functional networks to generate these models for breast cancer prognostics. Establishing a LASSO-generated molecular gene signature revealed that increased expression of the genes STXBP5, GALP and LOC387646 indicates a poor prognosis for a breast cancer patient. We also found that increased CTSLP8 and RPS10P20 and decreased HLA-K pseudogene expression indicate a poor prognosis. Perhaps most importantly, we identified a pseudogene-gene interaction, GPS2-GPS2P1 (improved prognosis), that is prognostic where neither the gene nor the pseudogene alone is prognostic of survival. In addition, miR-3923 was predicted to target GPS2 using miRanda, PicTar, and TargetScan, which implies modules of genes, pseudogenes and miRNAs that are potentially functionally related to patient survival.
RESULTS
In our LASSO-based model, we take into account features including pseudogenes, genes and candidate pseudogene-gene interactions. Key biomarkers were identified from these features. The key biomarkers should be considered in combination with significant clinical factors (such as stage and radiation therapy status), enabling a specific prognostic prediction and future treatment plan for an individual patient. Here we used our PseudoFuN web application to identify the candidate pseudogene-gene interactions as candidate features in our integrative models. We also used PseudoFuN to identify potential miRNAs targeting those features. From this study, we present an interpretable survival model based on LASSO and decision trees. We also provide a novel feature set that includes pseudogene-gene interaction terms, which have been ignored by previous prognostic models. We find that some interaction terms for pseudogenes and genes are significantly prognostic of survival. These interactions are cross-over interactions, where the impact of the gene expression on survival changes with pseudogene expression and vice versa. These may imply more complicated regulation mechanisms than previously understood.
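As an illustration of what a cross-over interaction means, the relation can be sketched with a proportional-hazards form. This form is an assumption made here for illustration; the abstract states only that the survival model is LASSO-based. The symbols g (gene expression), p (pseudogene expression) and the coefficients are hypothetical names.

```latex
% Assumed hazard model with a gene-pseudogene interaction term
h(t \mid g, p) = h_0(t)\,\exp\!\left(\beta_g\, g + \beta_p\, p + \beta_{gp}\, g\, p\right)

% Effect of gene expression on the log-hazard depends on pseudogene expression
\frac{\partial \log h}{\partial g} = \beta_g + \beta_{gp}\, p

% The effect changes sign (a cross-over) at
p^{*} = -\frac{\beta_g}{\beta_{gp}}, \qquad \beta_{gp} \neq 0
```

Under this sketch, if p* lies within the observed range of pseudogene expression, higher gene expression is associated with better survival on one side of p* and worse survival on the other, which is why neither variable need be prognostic on its own.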
CONCLUSIONS
We recommend these novel feature sets be considered when training other types of prognostic models as well, which may provide more comprehensive insights into personalized treatment decisions.
Topics: Biomarkers, Tumor; Breast Neoplasms; Computational Biology; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Humans; Prognosis; Pseudogenes; Survival Rate
PubMed: 32241256
DOI: 10.1186/s12920-020-0687-0 -
International Journal of Molecular... Dec 2020
Review
Protein aggregation is classically considered the main cause of neuronal death in neurodegenerative diseases (NDDs). However, increasing evidence suggests that alteration of RNA metabolism is a key factor in the etiopathogenesis of these complex disorders. Non-coding RNAs are the major contributor to the human transcriptome and are particularly abundant in the central nervous system, where they have been proposed to be involved in the onset and development of NDDs. Interestingly, some ncRNAs (such as lncRNAs, circRNAs and pseudogenes) share a common functionality in their ability to regulate gene expression by modulating miRNAs in a phenomenon known as the competing endogenous RNA mechanism. Moreover, ncRNAs are found in body fluids where their presence and concentration could serve as potential non-invasive biomarkers of NDDs. In this review, we summarize the ceRNA networks described in Alzheimer's disease, Parkinson's disease, multiple sclerosis, amyotrophic lateral sclerosis and spinocerebellar ataxia type 7, and discuss their potential as biomarkers of these NDDs. Although numerous studies have been carried out, further research is needed to validate these complex interactions between RNAs and the alterations in RNA editing that could provide specific ceRNET profiles for neurodegenerative disorders, paving the way to a better understanding of these diseases.
Topics: Animals; Biomarkers; Cell-Free Nucleic Acids; Gene Regulatory Networks; Humans; Neurodegenerative Diseases
PubMed: 33339180
DOI: 10.3390/ijms21249582 -
Methods in Molecular Biology (Clifton,... 2021
Competing endogenous RNA (ceRNA) molecules have emerged as key players in regulating gene expression, increasing the complexity of the range of possible dynamics within a cell. The actions of competing RNA typically are sponging behaviors, in a manner that fine-tunes gene expression, but there are particular network structures that may show destabilization due to ceRNA interactions. In this chapter, we discuss how these interactions can be modeled and probed from a mathematical, first-principles perspective.
Topics: Algorithms; Computational Biology; Gene Expression Regulation; Gene Regulatory Networks; MicroRNAs; Models, Theoretical; Pseudogenes; RNA, Long Noncoding; RNA, Messenger; RNA, Untranslated; Regulatory Sequences, Ribonucleic Acid
PubMed: 34165711
DOI: 10.1007/978-1-0716-1503-4_7 -
Trends in Genetics : TIG Sep 2019
Review
Constitutive heterochromatin represents a significant portion of eukaryotic genomes, but its functions still need to be elucidated. Even in the most up-to-date genetics and molecular biology textbooks, constitutive heterochromatin is portrayed mainly as the 'silent' component of eukaryotic genomes. However, there may be more complexity to the relationship between heterochromatin and gene expression. In the fruit fly Drosophila melanogaster, a model for heterochromatin studies, about one-third of the genome is heterochromatic and is concentrated in the centric, pericentric, and telomeric regions of the chromosomes. Recent findings indicate that hundreds of D. melanogaster genes can 'live and work' properly within constitutive heterochromatin. The genomic size of these genes is generally larger than that of euchromatic genes, and together they account for a significant fraction of the entire constitutive heterochromatin. Thus, this peculiar genome component, in spite of its ability to induce silencing, in fact has the means to be quite dynamic. A major aim of this review is to revisit the 'dogma of silent heterochromatin'.
Topics: Animals; Chromosomes, Insect; Drosophila melanogaster; Epigenesis, Genetic; Gene Dosage; Gene Expression Regulation; Genome, Insect; Heterochromatin; Pseudogenes; RNA, Circular; RNA, Small Interfering; Sex Chromosomes; Y Chromosome
PubMed: 31320181
DOI: 10.1016/j.tig.2019.06.002 -
Cancer Cell International Jan 2022
BACKGROUND
HPV has long been known to be the main cause of cervical cancer, but the detailed mechanism has not yet been elucidated. The role of testis/cancer antigens in cervical cancer has been described. However, there are no reports on testis/cancer-specific non-coding RNAs. In this study, we propose for the first time TCAM1P as a testis/cancer-specific pseudogene, use a series of experimental data to verify its relationship with HPV, and analyze its diagnostic value for high-grade cervical lesions and the mechanism of its high expression in cervical cancer. This provides a new direction for the prevention and treatment of cervical cancer.
METHODS
The tissue specificity of pseudogene expression was calculated with the "TAU" formula. A ROC curve was used to assess the diagnostic value of TCAM1P for high-grade lesions. Cell proliferation was measured by CCK8 assay. The expression of TCAM1P and HPV E6/E7 was detected by qRT-PCR. The binding of RBPs to TCAM1P was predicted with the starbase v2.0 database and then verified by RIP assay. In addition, Gene Ontology (GO) and KEGG enrichment analyses were performed with the "clusterprofiler" R package.
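The abstract does not spell out the "TAU" formula. Assuming it refers to the widely used tissue-specificity index, tau = sum(1 - x_i / x_max) / (n - 1) over n tissues, a minimal sketch might look like the following. The function name `tau_index` and the handling of all-zero profiles are illustrative choices, not details taken from the study.

```python
def tau_index(expression):
    """Tissue-specificity index tau for one gene.

    `expression` is a sequence of non-negative expression values,
    one per tissue. Returns a value in [0, 1]: 0 means uniform
    expression across tissues, 1 means expression in a single tissue.
    """
    values = [float(v) for v in expression]
    if len(values) < 2:
        raise ValueError("tau needs expression values for at least two tissues")
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")

    x_max = max(values)
    if x_max == 0:
        # Assumption: a gene not expressed anywhere is scored as non-specific.
        return 0.0

    # Sum of (1 - normalized expression), scaled by the number of tissues minus one.
    return sum(1 - v / x_max for v in values) / (len(values) - 1)
```

With this definition, a profile expressed in only one tissue scores 1, and a profile with equal expression in every tissue scores 0.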
RESULTS
TCAM1P was specifically highly expressed in normal testicular tissue and cervical cancer. Interestingly, as the severity of cervical lesions increased, the expression of TCAM1P increased, and TCAM1P could effectively diagnose high-grade cervical lesions. In addition, the expression of TCAM1P was HPV dependent, with the highest expression in HPV-positive cervical cancer tissues. Furthermore, RIP assay showed that EIF4A3 regulated the expression of TCAM1P by binding to it. CCK8 assay showed that TCAM1P promoted proliferation, and the Gene Ontology (GO) and KEGG pathway enrichment analyses likewise suggested that TCAM1P is involved in cell proliferation in multiple ways, including cell cycle and DNA replication.
CONCLUSIONS
In this study, we propose for the first time that TCAM1P is a cancer/testis pseudogene and is regulated by HPV E6/E7 and EIF4A3. TCAM1P promotes the proliferation of cervical cancer cells and acts as a promoter in cervical cancer. Moreover, TCAM1P promotes proliferation by regulating the cell cycle and DNA replication, but more evidence is needed to reveal the mechanism by which TCAM1P acts in cervical cancer.
PubMed: 35016697
DOI: 10.1186/s12935-021-02440-7 -
Pathology, Research and Practice Jan 2024
Review
This review examines and compares the diagnostic and prognostic capabilities of miRNAs and lncRNAs derived from pseudogenes in cancer patients. Additionally, it delves into their roles in cancer pathogenesis. Both miRNAs and pseudogene-derived lncRNAs have undergone thorough investigation as remarkably sensitive and specific cancer biomarkers, offering significant potential for cancer detection and monitoring. Extensive research is essential to gain a complete understanding of the precise roles these non-coding RNAs play in cancer, allowing the development of novel targeted therapies and biomarkers for improved cancer detection and treatment approaches.
Topics: Humans; MicroRNAs; RNA, Long Noncoding; Pseudogenes; Neoplasms; Prognosis; Biomarkers, Tumor
PubMed: 38128189
DOI: 10.1016/j.prp.2023.155014 -
International Journal of Molecular... Nov 2019
Review
Apolipoprotein C1 (apoC1), the smallest of all apolipoproteins, participates in lipid transport and metabolism. In humans, the APOC1 gene is in linkage disequilibrium with the APOE gene on chromosome 19, a proximity that spurred its investigation. Apolipoprotein C1 associates with triglyceride-rich lipoproteins and HDL and exchanges between lipoprotein classes. These interactions occur via amphipathic helix motifs, as demonstrated by biophysical studies on the wild-type polypeptide and representative mutants. Apolipoprotein C1 acts on lipoprotein receptors by inhibiting binding mediated by apolipoprotein E, and modulates the activities of several enzymes. Thus, apoC1 downregulates lipoprotein lipase, hepatic lipase, phospholipase A2, and cholesterylester transfer protein, and activates lecithin-cholesterol acyl transferase. By controlling the plasma levels of lipids, apoC1 relates directly to cardiovascular physiology, but its activity extends beyond this, to inflammation and immunity, sepsis, diabetes, cancer, viral infectivity, and, not least, cognition. Such correlations were established based on studies using transgenic mice, complemented in recent years by GWAS, transcriptomic and proteomic analyses. The presence of a duplicate gene, a pseudogene, stimulated evolutionary studies, and more recently the regulatory properties of the corresponding non-coding RNA are steadily emerging. Nonetheless, this prototypical apolipoprotein is still underexplored and deserves further research for understanding its physiology and exploiting its therapeutic potential.
Topics: Amino Acid Motifs; Apolipoprotein C-I; Apolipoproteins E; Chromosome Mapping; Gene Expression Regulation; Humans; Lipid Metabolism; Lipoproteins, HDL; Lipoproteins, VLDL; Protein Binding; Pseudogenes; Receptors, Lipoprotein
PubMed: 31779116
DOI: 10.3390/ijms20235939