-
Nature Genetics Aug 2020
Standardized gene naming is crucial for effective communication about genes, and as genomics becomes increasingly important in healthcare, the need for a consistent language for human genes becomes ever more vital. Here we present the current HUGO Gene Nomenclature Committee (HGNC) guidelines for naming not only protein-coding but also RNA genes and pseudogenes, and outline the changes in approach and ethos that have resulted from the discoveries of the last few decades.
Topics: Genes; Human Genetics; Humans; Pseudogenes; RNA
PubMed: 32747822
DOI: 10.1038/s41588-020-0669-3 -
Scientific Reports Oct 2022
Colorectal cancer (CRC) is one of the most common and malignant carcinomas. Many long noncoding RNAs (lncRNAs) have been reported to play important roles in the tumorigenesis of CRC by influencing the expression of some mRNAs via competing endogenous RNA (ceRNA) networks and interacting with miRNAs. Pseudogenes are one kind of lncRNA and can act as RNA sponges for miRNAs and regulate gene expression via ceRNA networks. However, there are few studies about pseudogenes in CRC. In this study, 31 differentially expressed (DE) pseudogenes, 17 DE miRNAs and 152 DE mRNAs were identified by analyzing the expression profiles of colon adenocarcinoma obtained from The Cancer Genome Atlas. A ceRNA network was constructed based on these RNAs. Kaplan-Meier analysis showed that 7 pseudogenes, 4 miRNAs and 30 mRNAs were significantly associated with overall survival. Then multivariate Cox regression analysis of the ceRNA-related DE pseudogenes was performed and a 5-pseudogene signature with the greatest prognostic value for CRC was identified. Moreover, the results were validated by the Gene Expression Omnibus database, and quantitative real-time PCR in 113 pairs of CRC tissues and colon cancer cell lines. This study provides a pseudogene-associated ceRNA network, 7 prognostic pseudogene biomarkers, and a 5-pseudogene prognostic risk signature that may be useful for predicting the survival of CRC patients.
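To make the survival terminology above concrete, the following is a minimal sketch of a Kaplan-Meier estimator and of a linear risk score of the kind a Cox-derived signature typically produces. It is not the authors' pipeline: the function names, the toy follow-up times, and the gene labels `PG_A` and `PG_B` are hypothetical, and the coefficients are illustrative values rather than results from the study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  -- follow-up durations for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


def risk_score(expression, coefficients):
    """Weighted sum of expression values (hypothetical signature weights)."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


# Toy data, purely illustrative.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
score = risk_score({"PG_A": 2.0, "PG_B": 1.0}, {"PG_A": 0.5, "PG_B": -1.0})
```

In practice, patients would usually be split into high- and low-risk groups by such a score (for example at the median) and the two Kaplan-Meier curves compared; a dedicated survival-analysis library would normally replace this hand-written estimator.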
Topics: Humans; RNA, Long Noncoding; Prognosis; Pseudogenes; Gene Regulatory Networks; Gene Expression Regulation, Neoplastic; Adenocarcinoma; Colonic Neoplasms; MicroRNAs; RNA, Messenger; Biomarkers
PubMed: 36272991
DOI: 10.1038/s41598-022-22768-y -
International Journal of Oncology Mar 2024
Increasing evidence suggests that pseudogenes play crucial roles in various cancers, yet their functions and regulatory mechanisms in glioma pathogenesis remain enigmatic. In the present study, a novel pseudogene was identified, UBDP1, which is significantly upregulated in glioblastoma and positively correlated with the expression of its parent gene, UBD. Additionally, high levels of these paired genes are linked with a poor prognosis for patients. In the present study, clinical samples were collected followed by various analyses including microarray for long non‑coding RNAs, reverse transcription‑quantitative PCR, fluorescence in situ hybridization and western blotting. Cell lines were authenticated and cultured then subjected to various assays for proliferation, migration, and invasion to investigate the molecular mechanisms. Bioinformatic tools identified miRNA targets, and luciferase reporter assays validated these interactions. A tumor xenograft model in mice was used for in vivo studies. In vitro and in vivo studies have demonstrated that UBDP1, localized in the cytoplasm, functions as a tumor‑promoting factor influencing cell proliferation, migration, invasion and tumor growth. Mechanistic investigations have indicated that UBDP1 exerts its oncogenic effects by decoying miR‑6072 from UBD mRNA, thus forming a competitive endogenous RNA network, which results in the enhanced oncogenic activity of UBD. The present findings offered new insights into the role of pseudogenes in glioma progression, suggesting that targeting the UBDP1/miR‑6072/UBD network may serve as a potential therapeutic strategy for glioma patients.
Topics: Animals; Humans; Mice; Brain Neoplasms; Cell Line, Tumor; Cell Movement; Cell Proliferation; Gene Expression Regulation, Neoplastic; Glioma; In Situ Hybridization, Fluorescence; MicroRNAs; Pseudogenes; RNA, Long Noncoding
PubMed: 38275102
DOI: 10.3892/ijo.2024.5617 -
BMC Plant Biology Mar 2023
BACKGROUND
Blueberries (Vaccinium section Cyanococcus) are an economically important fruit crop in the United States. Understanding genetic structure and relationships in blueberries is essential to advance the genetic improvement of horticulturally important traits. In the present study, we investigated the genomic and evolutionary relationships in 195 blueberry accessions from five species (comprising 33 V. corymbosum, 14 V. boreale, 81 V. darrowii, 29 V. myrsinites, and 38 V. tenellum) using single nucleotide polymorphisms (SNPs) mined from genotyping-by-sequencing (GBS) data.
RESULTS
GBS generated ~751 million raw reads, of which 79.7% were mapped to the reference genome V. corymbosum cv. Draper v1.0. After filtering (read depth > 3, minor allele frequency > 0.05, and call rate > 0.9), 60,518 SNPs were identified and used in further analyses. The 195 blueberry accessions formed three major clusters on the principal component (PC) analysis plot, in which the first two PCs accounted for 29.2% of the total genetic variance. Nucleotide diversity (π) was highest for V. tenellum and V. boreale (0.023 each), and lowest for V. darrowii (0.012). Using TreeMix analysis, we identified four migration events and deciphered gene flow among the selected species. In addition, we detected a strong V. boreale lineage in cultivated blueberry species. Pairwise SweeD analysis identified a wide sweep (encompassing 32 genes) as a strong signature of domestication on the scaffold VaccDscaff12. From this region, five genes encoded topoisomerases, six genes encoded CAP-gly domain linker (which regulates the dynamics of the microtubule cytoskeleton), and three genes coded for GSL8 (involved in the synthesis of the cell wall component callose). One of the genes, augustus_masked-VaccDscaff12-processed-gene-172.10, is a homolog of Arabidopsis AT2G25010 and encodes the protein MAINTENANCE OF MERISTEMS-like involved in root and shoot growth. Additional genomic stratification by admixture analysis identified genetic lineages and species boundaries in blueberry accessions. The results from this study indicate that V. boreale is a genetically distant outgroup, while V. darrowii, V. myrsinites, and V. tenellum are closely related.
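The SNP filtering thresholds quoted above (read depth > 3, minor allele frequency > 0.05, call rate > 0.9) can be illustrated with a short sketch. This is only an assumption about how such filters are commonly applied, not the study's actual pipeline: genotypes are encoded here as alternate-allele counts (0, 1, 2) with `None` for a missing call, read depth is treated as a single per-site mean, and all function names are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF for a diploid site; genotypes are alt-allele counts or None."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def passes_filters(genotypes, mean_depth,
                   min_depth=3, min_maf=0.05, min_call_rate=0.9):
    """Apply the three thresholds as strict inequalities, as in the text."""
    return (mean_depth > min_depth
            and minor_allele_frequency(genotypes) > min_maf
            and call_rate(genotypes) > min_call_rate)


# Toy sites, purely illustrative.
good_site = [0, 1, 2, 0, 0, 0, 0, 0, 0, 0]          # MAF 0.15, call rate 1.0
missing_site = [0, 1, 2, 0, 0, 0, 0, 0, 0, None]    # call rate exactly 0.9
monomorphic_site = [0] * 10                          # MAF 0.0
```

Because the thresholds are strict, a site with a call rate of exactly 0.9 is rejected in this sketch; whether the original analysis used strict or inclusive bounds is not stated beyond the inequalities quoted.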
CONCLUSION
Our study provides new insights into the evolution and genetic architecture of cultivated blueberries.
Topics: Blueberry Plants; Genomics; Pseudogenes; Arabidopsis; Cell Wall
PubMed: 36872311
DOI: 10.1186/s12870-023-04124-y -
Frontiers in Endocrinology 2021
Review
Copy Number Variations (CNVs) account for a large proportion of the human genome and are a primary contributor to human phenotypic variation, in addition to being the molecular basis of a wide spectrum of disease. Multiallelic CNVs represent a considerable fraction of large CNVs and are strictly related to segmental duplications according to their prevalent duplicate alleles. RCCX CNV is a complex, multiallelic and tandem CNV located in the major histocompatibility complex (MHC) class III region. RCCX structure is typically defined by the copy number of a DNA segment containing a series of genes that lie close to each other: the serine/threonine kinase 19 (STK19), the complement 4 (C4), the steroid 21-hydroxylase (CYP21), and the tenascin-X (TNX). In the Caucasian population, the most common RCCX haplotype (69%) consists of two segments containing the genes STK19-C4A-CYP21A1P-TNXA-STK19B-C4B-CYP21A2-TNXB, with a telomere-to-centromere orientation. Nonallelic homologous recombination (NAHR) plays a key role in the RCCX genetic diversity: unequal crossover facilitates large structural rearrangements and copy number changes, whereas gene conversion mediates relatively short sequence transfers. The results of these events increased the RCCX genetic diversity and are responsible for specific human diseases. This review provides an overview of RCCX complexity, pointing out the molecular bases of Congenital Adrenal Hyperplasia (CAH) due to CYP21A2 deficiency, CAH-X Syndrome and disorders related to CNV of complement component C4.
Topics: Adrenal Hyperplasia, Congenital; Complement C4; DNA Copy Number Variations; Genetic Variation; Humans; Nuclear Proteins; Protein Serine-Threonine Kinases; Pseudogenes; Steroid 21-Hydroxylase; Tenascin
PubMed: 34394006
DOI: 10.3389/fendo.2021.709758 -
BMC Medical Genomics Apr 2020
BACKGROUND
Given the vast range of molecular mechanisms giving rise to breast cancer, it is unlikely universal cures exist. However, by providing a more precise prognosis for breast cancer patients through integrative models, treatments can become more individualized, resulting in more successful outcomes. Specifically, we combine gene expression, pseudogene expression, miRNA expression, clinical factors, and pseudogene-gene functional networks to generate these models for breast cancer prognostics. Establishing a LASSO-generated molecular gene signature revealed that the increased expression of genes STXBP5, GALP and LOC387646 indicates a poor prognosis for a breast cancer patient. We also found that increased CTSLP8 and RPS10P20 and decreased HLA-K pseudogene expression indicate poor prognosis for a patient. Perhaps most importantly, we identified a pseudogene-gene interaction, GPS2-GPS2P1 (improved prognosis), that is prognostic where neither the gene nor pseudogene alone is prognostic of survival. Besides, miR-3923 was predicted to target GPS2 using miRanda, PicTar, and TargetScan, which implies modules of gene-pseudogene-miRNAs that are potentially functionally related to patient survival.
RESULTS
In our LASSO-based model, we take into account features including pseudogenes, genes and candidate pseudogene-gene interactions. Key biomarkers were identified from the features. The identification of key biomarkers in combination with significant clinical factors (such as stage and radiation therapy status) should be considered as well, enabling a specific prognostic prediction and future treatment plan for an individual patient. Here we used our PseudoFuN web application to identify the candidate pseudogene-gene interactions as candidate features in our integrative models. We further identified potential miRNAs targeting those features in our models using PseudoFuN as well. From this study, we present an interpretable survival model based on LASSO and decision trees. We also provide a novel feature set that includes pseudogene-gene interaction terms, which have been ignored by previous prognostic models. We find that some interaction terms for pseudogenes and genes are significantly prognostic of survival. These interactions are cross-over interactions, where the impact of the gene expression on survival changes with pseudogene expression and vice versa. These may imply more complicated regulation mechanisms than previously understood.
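The notion of a cross-over interaction described above can be shown with a small sketch. It assumes a linear predictor of the form b_gene·g + b_pseudo·p + b_inter·g·p, which is a common way to encode an interaction term but is not confirmed as the authors' exact parameterization; the function names and coefficient values are hypothetical and chosen only to show the sign change.

```python
def log_hazard(gene, pseudogene, b_gene, b_pseudo, b_inter):
    """Linear predictor with a gene x pseudogene interaction term."""
    return b_gene * gene + b_pseudo * pseudogene + b_inter * gene * pseudogene


def gene_effect(pseudogene, b_gene, b_inter):
    """Change in log hazard per unit of gene expression at a fixed
    pseudogene level (partial derivative with respect to the gene)."""
    return b_gene + b_inter * pseudogene


def crossover_point(b_gene, b_inter):
    """Pseudogene level at which the gene's effect changes sign."""
    return -b_gene / b_inter


# Illustrative coefficients only, not values from the study.
B_GENE, B_PSEUDO, B_INTER = 0.4, 0.1, -0.8
effect_low = gene_effect(0.0, B_GENE, B_INTER)    # gene raises hazard
effect_high = gene_effect(1.0, B_GENE, B_INTER)   # gene lowers hazard
flip_at = crossover_point(B_GENE, B_INTER)
```

With these made-up coefficients, higher gene expression increases the predicted hazard when pseudogene expression is low and decreases it when pseudogene expression is high, which is the pattern the term "cross-over" refers to.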
CONCLUSIONS
We recommend these novel feature sets be considered when training other types of prognostic models as well, which may provide more comprehensive insights into personalized treatment decisions.
Topics: Biomarkers, Tumor; Breast Neoplasms; Computational Biology; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Humans; Prognosis; Pseudogenes; Survival Rate
PubMed: 32241256
DOI: 10.1186/s12920-020-0687-0 -
Biochemistry. Biokhimiia Dec 2014
Review
Most of the mammalian genome consists of nucleotide sequences not coding for proteins. Exons of genes make up only 3% of the human genome, while the significance of most other sequences remains unknown. Recent genome studies with high-throughput methods demonstrate that the so-called noncoding part of the genome may perform important functions. This hypothesis is supported by three groups of experimental data: 1) approximately 10% of the sequences, most of which are located in noncoding parts of the genome, are evolutionarily conserved and thus can be of functional importance; 2) up to 99% of the mammalian genome is transcribed, forming short and long noncoding RNAs in addition to common mRNA; and 3) mutations in noncoding parts of the genome can be accompanied by progression of pathological states of the organism. In the light of these data, in this review we consider the functional role of numerous known sequences of noncoding parts of the genome including introns, DNA methylation regions, enhancers and locus control regions, insulators, S/MAR sequences, pseudogenes, and genes of noncoding RNAs, as well as transposons and simple repeats of centromeric and telomeric regions of chromosomes. The assumption is made that the intergenic noncoding sequences without definite/clear functions can be involved in spatial organization of genetic loci in interphase nuclei.
Topics: Animals; Centromere; DNA Transposable Elements; DNA, Intergenic; Genome; Humans; Mammals; Pseudogenes; RNA, Untranslated; Regulatory Sequences, Nucleic Acid; Telomere
PubMed: 25749159
DOI: 10.1134/S0006297914130021 -
Genome Biology Nov 2022
BACKGROUND
Pseudogenes are excellent markers for genome evolution, which are emerging as crucial regulators of development and disease, especially cancer. However, systematic functional characterization and evolution of pseudogenes remain largely unexplored.
RESULTS
To systematically characterize pseudogenes, we date the origin of human and mouse pseudogenes across vertebrates and observe a burst of pseudogene gain in these two lineages. Based on a hybrid sequencing dataset combining full-length PacBio sequencing, sample-matched Illumina sequencing, and public time-course transcriptome data, we observe that abundant mammalian pseudogenes could be transcribed, which contribute to the establishment of organ identity. Our analyses reveal that developmentally dynamic pseudogenes are evolutionarily conserved and show an increasing weight during development. Besides, they are involved in complex transcriptional and post-transcriptional modulation, exhibiting the signatures of functional enrichment. Coding potential evaluation suggests that 19% of human pseudogenes could be translated, thus serving as a new way for protein innovation. Moreover, pseudogenes carry disease-associated SNPs and contribute to cancer transcriptome perturbation.
CONCLUSIONS
Our discovery reveals an unexpectedly high abundance of mammalian pseudogenes that can be transcribed and translated, and these pseudogenes represent a novel regulatory layer. Our study also prioritizes developmentally dynamic pseudogenes with signatures of functional enrichment and provides a hybrid sequencing dataset for further unraveling their biological mechanisms in organ development and carcinogenesis in the future.
Topics: Humans; Mice; Animals; Pseudogenes; Genome; Mammals; Sequence Analysis, DNA; Neoplasms
PubMed: 36348461
DOI: 10.1186/s13059-022-02802-y -
Genome Biology May 2021
Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.
Topics: Cell Line; DNA, Complementary; Gene Deletion; Haploidy; Humans; Promoter Regions, Genetic; Pseudogenes; Sequence Analysis, DNA; Transcriptome
PubMed: 33971925
DOI: 10.1186/s13059-021-02369-0 -
Clinical Genetics Jul 2018
Review
DNA repair pathways are essential for cellular survival as our DNA is constantly under assault from both exogenous and endogenous DNA damaging agents. Five major mammalian DNA repair pathways exist within a cell to maintain genomic integrity. Of these, the DNA mismatch repair (MMR) pathway is highly conserved among species and is well documented in bacteria. In humans, the importance of MMR is underscored by the discovery that a single mutation in any 1 of 4 genes within the MMR pathway (MLH1, MSH2, MSH6 and PMS2) results in Lynch syndrome (LS). LS is an autosomal dominant condition that predisposes individuals to a higher incidence of many malignancies including colorectal, endometrial, ovarian, and gastric cancers. In this review, we discuss the role of PMS2 in the MMR pathway, the evolving testing criteria used to identify variants in the PMS2 gene, the LS phenotype as well as the autosomal recessive condition called constitutional mismatch repair deficiency syndrome, and current methods used to elucidate the clinical impact of PMS2 mutations.
Topics: Alleles; Colorectal Neoplasms, Hereditary Nonpolyposis; DNA Mismatch Repair; Genetic Association Studies; Genetic Predisposition to Disease; Genetic Testing; Humans; Mismatch Repair Endonuclease PMS2; Mutation; Phenotype; Pseudogenes; Structure-Activity Relationship
PubMed: 29286535
DOI: 10.1111/cge.13205