-
BMC Bioinformatics Feb 2022
BACKGROUND
Generating chromosome-scale haplotype resolved assembly is important for functional studies. However, current de novo assemblers are either haploid assemblers that discard allelic information, or diploid assemblers that can only tackle genomes of low complexity.
RESULTS
Here, using robust programs, we build a diploid genome assembly pipeline called gcaPDA (gamete cells assisted Phased Diploid Assembler), which exploits haploid gamete cells to assist in resolving haplotypes. We demonstrate the effectiveness of gcaPDA on simulated HiFi reads of the maize genome, which is highly heterozygous and repetitive, and on real data from rice.
CONCLUSIONS
Because it can cope with complex genomes and has fewer restrictions on its application than most diploid assemblers, gcaPDA is likely to find broad applications in studies of eukaryotic genomes.
Topics: Alleles; Chromosomes; Diploidy; Haploidy; Haplotypes; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
PubMed: 35164674
DOI: 10.1186/s12859-022-04591-4
-
Scientific Reports Nov 2022
Genetic drift is a basic evolutionary principle describing random changes in allelic frequencies, with far-reaching consequences in various topics ranging from species conservation efforts to speciation. The conventional approach assumes that genetic drift has the same effect on all populations undergoing the same changes in size, regardless of different non-reproductive behaviors and history of the populations. However, here we reason that processes leading to a systematic increase of individuals' chances of survival, such as learning or immunological memory, can mitigate loss of genetic diversity caused by genetic drift even if the overall mortality rate in the population does not change. We further test this notion in an agent-based model with overlapping generations, monitoring allele numbers in a population of prey, either able or not able to learn from successfully escaping predators' attacks. Importantly, both these populations start with the same effective size and have the same and constant overall mortality rates. Our results demonstrate that even under these conditions, learning can mitigate loss of genetic diversity caused by drift, by creating a pool of harder-to-die individuals that protect the alleles they carry from extinction. Furthermore, this effect holds regardless of whether the population is haploid or diploid and whether it reproduces sexually or asexually. These findings may be of importance not only for basic evolutionary theory but also for other fields using the concept of genetic drift.
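The kind of agent-based model described in this abstract can be illustrated with a minimal sketch. This is not the authors' model: the haploid asexual population, the escape probabilities (0.5 for naive prey, 0.9 for experienced prey), the fixed number of deaths per time step, and the function name `simulate` are all assumptions chosen for illustration. The fixed death count stands in for the constant overall mortality rate mentioned above.

```python
import random


def simulate(n=200, steps=300, deaths_per_step=10, learning=True, seed=0):
    """Return the number of distinct alleles left after `steps` time steps.

    Hypothetical toy model: every founder carries a unique allele, and a
    fixed number of prey die per step so overall mortality is constant.
    """
    rng = random.Random(seed)
    alleles = list(range(n))      # allele carried by each individual
    experienced = [False] * n     # has this prey escaped an attack before?

    for _ in range(steps):
        dead = set()
        # Predators keep attacking until the fixed death quota is reached.
        while len(dead) < deaths_per_step:
            i = rng.randrange(n)
            if i in dead:
                continue
            # Assumed escape probabilities; only learners benefit from experience.
            escape_p = 0.9 if (learning and experienced[i]) else 0.5
            if rng.random() < escape_p:
                experienced[i] = True   # survived an attack
            else:
                dead.add(i)

        # Overlapping generations: survivors stay, the dead are replaced
        # by naive offspring of randomly chosen survivors.
        survivors = [i for i in range(n) if i not in dead]
        for i in sorted(dead):
            parent = rng.choice(survivors)
            alleles[i] = alleles[parent]
            experienced[i] = False

    return len(set(alleles))
```

To compare the two scenarios, one could average `simulate(learning=True, seed=s)` and `simulate(learning=False, seed=s)` over many seeds `s`. A single run is too noisy to show the effect, and the size of any difference depends on the assumed parameters.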
Topics: Humans; Genetic Drift; Gene Frequency; Biological Evolution; Alleles; Diploidy
PubMed: 36437294
DOI: 10.1038/s41598-022-24748-8
-
PloS One 2014
Sequencing the transcriptome can answer various questions such as determining the transcripts expressed in a given species for a specific tissue or condition, evaluating differential expression, discovering variants, and evaluating allele-specific expression. Differential expression evaluates the expression differences between different strains, tissues, and conditions. Allele-specific expression evaluates expression differences between parental alleles. Both differential expression and allele-specific expression have been studied for heterosis (hybrid vigor), where the hybrid has improved performance over the parents for one or more traits. The Allele Workbench software was developed for a heterosis study that evaluated allele-specific expression for a mouse F1 hybrid using libraries from multiple tissues with biological replicates. This software has been made into a distributable package, which includes a pipeline, a Java interface to build the database, and a Java interface for query and display of the results. The required input is a reference genome, annotation file, and one or more RNA-Seq libraries with optional replicates. It evaluates allelic imbalance at the SNP and transcript level and flags transcripts with significant opposite directional allele-specific expression. The Java interface allows the user to view data from libraries, replicates, genes, transcripts, exons, and variants, including queries on allele imbalance for selected libraries. To determine the impact of allele-specific SNPs on protein folding, variants are annotated with their effect (e.g., missense), and the parental protein sequences may be exported for protein folding analysis. The Allele Workbench processing results in transcript files and read counts that can be used as input to the previously published Transcriptome Computational Workbench, which has a new algorithm for determining a trimmed set of gene ontology terms. 
The software with demo files is available from https://code.google.com/p/allele-workbench. Additionally, all software is ready for immediate use from an Atmosphere Virtual Machine Image available from the iPlant Collaborative (www.iplantcollaborative.org).
Topics: Algorithms; Alleles; Animals; Computational Biology; Computer Graphics; Data Mining; Databases, Genetic; Gene Expression Profiling; Heterozygote; Mice; Polymorphism, Single Nucleotide; Programming Languages; RNA, Messenger; Sequence Analysis; User-Computer Interface
PubMed: 25541944
DOI: 10.1371/journal.pone.0115740
-
Philosophical Transactions of the Royal... Feb 2019
Genomic imprinting, where an allele's expression pattern depends on its parental origin, is thought to result primarily from an intragenomic evolutionary conflict. Imprinted genes are widely expressed in the brain and have been linked to various phenotypes, including behaviours related to risk tolerance. In this paper, we analyse a model of evolutionary bet-hedging in a system with imprinted gene expression. Previous analyses of bet-hedging have shown that natural selection may favour alleles and traits that reduce reproductive variance, even at the expense of reducing mean reproductive success, with the trade-off between mean and variance depending on the population size. In species where the sexes have different reproductive variances, this bet-hedging trade-off differs between maternally and paternally inherited alleles. Where males have the higher reproductive variance, alleles are more strongly selected to reduce variance when paternally inherited than when maternally inherited. We connect this result to phenotypes connected with specific imprinted genes, including delay discounting and social dominance. The empirical patterns are consistent with paternally expressed imprinted genes promoting risk-averse behaviours that reduce reproductive variance. Conversely, maternally expressed imprinted genes promote risk-tolerant, variance-increasing behaviours. We indicate how future research might further test the hypotheses suggested by our analysis. This article is part of the theme issue 'Risk taking and impulsive behaviour: fundamental discoveries, theoretical perspectives and clinical implications'.
Topics: Alleles; Animals; Biological Evolution; Female; Gene Expression; Genomic Imprinting; Male; Models, Genetic; Phenotype; Selection, Genetic
PubMed: 30966914
DOI: 10.1098/rstb.2018.0142
-
BMC Genomics Jun 2018
BACKGROUND
Allele-specific transcriptional regulation, including of imprinted genes, is essential for normal mammalian development. While the regulatory regions controlling imprinted genes are associated with DNA methylation (DNAme) and specific histone modifications, the interplay between transcription and these epigenetic marks at allelic resolution is typically not investigated genome-wide due to a lack of bioinformatic packages that can process and integrate multiple epigenomic datasets with allelic resolution. In addition, existing ad-hoc software considers only SNVs for allele-specific read discovery. This limitation omits potentially informative INDELs, which constitute about one fifth of the number of SNVs in mice, and introduces a systematic reference bias in allele-specific analyses.
RESULTS
Here, we describe MEA, an INDEL-aware Methylomic and Epigenomic Allele-specific analysis pipeline which enables user-friendly data exploration, visualization and interpretation of allelic imbalance. Applying MEA to mouse embryonic datasets yields robust allele-specific DNAme maps and low reference bias. We validate allele-specific DNAme at known differentially methylated regions and show that automated integration of such methylation data with RNA- and ChIP-seq datasets yields an intuitive, multidimensional view of allelic gene regulation. MEA uncovers numerous novel dynamically methylated loci, highlighting the sensitivity of our pipeline. Furthermore, processing and visualization of epigenomic datasets from human brain reveals the expected allele-specific enrichment of H3K27ac and DNAme at imprinted as well as novel monoallelically expressed genes, highlighting MEA's utility for integrating human datasets of distinct provenance for genome-wide analysis of allelic phenomena.
CONCLUSIONS
Our novel pipeline for standardized allele-specific processing and visualization of disparate epigenomic and methylomic datasets enables rapid analysis and navigation with allelic resolution. MEA is freely available as a Docker container at https://github.com/julienrichardalbert/MEA .
Topics: Alleles; Animals; Chromatin Immunoprecipitation; CpG Islands; DNA Methylation; Epigenesis, Genetic; Epigenomics; Gene Expression Profiling; Germ Cells; Humans; INDEL Mutation; Mice; Mice, Inbred C57BL; Mice, Inbred DBA; Sequence Analysis, DNA; Sequence Analysis, RNA; Software; Transcription Initiation Site
PubMed: 29907088
DOI: 10.1186/s12864-018-4835-2
-
Scientific Reports May 2022
Allele-specific expression (ASE) represents differences in the magnitude of expression between alleles of the same gene. This is not straightforward for polyploids, especially autopolyploids, as knowledge about the dose of each allele is required for accurate estimation of ASE. This is the case for the genomically complex Saccharum species, characterized by high levels of ploidy and aneuploidy. We used a Beta-Binomial model to test for allelic imbalance in Saccharum, with adaptations for mixed-ploid organisms. The hierarchical Beta-Binomial model was used to test if allele expression followed the expectation based on genomic allele dosage. The highest frequencies of ASE occurred in sugarcane hybrids, suggesting a possible influence of interspecific hybridization in these genotypes. For all accessions, genes showing ASE (ASEGs) were less frequent than those with balanced allelic expression. These genes were related to a broad range of processes, mostly associated with general metabolism, organelles, responses to stress and responses to stimuli. In addition, the frequency of ASEGs in high-level functional terms was similar among the genotypes, with a few genes associated with more specific biological processes. We hypothesize that ASE in Saccharum is largely a genotype-specific phenomenon, as a large number of ASEGs were exclusive to individual accessions.
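The idea of testing read counts against an expectation set by genomic allele dosage can be sketched as follows. This is a plain, non-hierarchical Beta-Binomial test, not the hierarchical model used in the study. The overdispersion value `rho`, the function names, and the example numbers (ploidy 8, dose 2) are assumptions made for illustration.

```python
from math import exp, lgamma


def _log_beta(a, b):
    """Natural log of the Beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, alpha, beta):
    """Beta-Binomial probability of k reference-allele reads out of n."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + _log_beta(k + alpha, n - k + beta) - _log_beta(alpha, beta))


def ase_pvalue(k, n, dose, ploidy, rho=0.05):
    """Two-sided p-value for allelic imbalance given allele dosage.

    The expected allele fraction is dose / ploidy (requires 0 < dose < ploidy).
    rho in (0, 1) is an assumed overdispersion, with alpha + beta = (1 - rho) / rho.
    """
    p = dose / ploidy
    scale = (1.0 - rho) / rho
    alpha, beta = p * scale, (1.0 - p) * scale
    observed = betabinom_pmf(k, n, alpha, beta)
    # Sum the probability of every outcome at least as unlikely as the observed one.
    total = sum(
        pm
        for pm in (betabinom_pmf(j, n, alpha, beta) for j in range(n + 1))
        if pm <= observed * (1.0 + 1e-9)
    )
    return min(1.0, total)
```

For example, `ase_pvalue(25, 100, dose=2, ploidy=8)` asks whether 25 of 100 reads is compatible with an allele present in 2 of 8 copies. In practice `rho` would be estimated from the data rather than fixed.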
Topics: Alleles; Bias; Polymorphism, Single Nucleotide; Polyploidy; Saccharum
PubMed: 35610293
DOI: 10.1038/s41598-022-12725-0
-
Cold Spring Harbor Perspectives in... Jul 2014
Review
Sexual antagonism occurs when an allele is beneficial in one sex but costly in the other. Parental antagonism occurs when an allele is beneficial when inherited from one sex but costly when inherited from the other because of fitness interactions among kin. Sexual and parental antagonisms together define four genetic niches within the genome that favor different patterns of gene expression. Natural selection generates linkage disequilibrium among sexually and parentally antagonistic loci with male-beneficial alleles coupled to alleles that are beneficial when inherited from males and female-beneficial alleles coupled to alleles that are beneficial when inherited from females. Linkage disequilibrium also develops between sexually and parentally antagonistic loci and loci that influence sex determination. Genes evolve sex-specific expression to resolve sexual antagonism and evolve imprinted expression to resolve parental antagonism. Sex-specific chromosomes allow a gene to specialize in a single niche.
Topics: Alleles; Animal Migration; Animals; Biological Evolution; Female; Gene Expression; Genome; Haplotypes; Linkage Disequilibrium; Male; Meiosis; Phenotype; Selection, Genetic; Sex Determination Processes; Sex Factors
PubMed: 25059710
DOI: 10.1101/cshperspect.a017525
-
BMC Genomics Oct 2021
BACKGROUND
The current and future applications of genomic data may raise ethical and privacy concerns. Processing and storing this data introduces a risk of abuse by potential offenders, since the human genome contains sensitive personal information. For this reason, we have developed a privacy-preserving method, named Varlock, that provides secure storage of sequenced genomic data. We used a public set of population allele frequencies to mask the personal alleles detected in genomic reads. Each personal allele described by the public set is masked by a randomly selected population allele with respect to its frequency. Masked alleles are preserved in an encrypted confidential file that can be shared in whole or in part using public-key cryptography.
RESULTS
Our method masked the personal variants and introduced new variants detected in a personal masked genome. Alternative alleles with lower population frequency were masked and introduced more often. We performed a joint PCA analysis of personal and masked VCFs, showing that the VCFs between the two groups cannot be trivially mapped. Moreover, the method is reversible and personal alleles in specific genomic regions can be unmasked on demand.
CONCLUSION
Our method masks personal alleles within genomic reads while preserving valuable non-sensitive properties of sequenced DNA fragments for further research. Personal alleles in the desired genomic regions may be restored and shared with patients, clinics, and researchers. We suggest that the method can provide an additional security layer for storing and sharing of the raw aligned reads.
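The masking and unmasking steps described in this abstract can be sketched roughly as below. This is not Varlock's implementation or API: the data layout (position mapped to allele), the function names `mask_alleles` and `unmask`, and the plain dictionary standing in for the encrypted confidential file are all hypothetical. Encryption is omitted entirely.

```python
import random


def mask_alleles(personal, pop_freqs, rng):
    """Replace each personal allele with a frequency-weighted population allele.

    personal:  {position: allele} observed in one individual (assumed layout)
    pop_freqs: {position: {allele: frequency}} from a public set (assumed layout)
    Returns (masked, secret); `secret` stands in for the confidential file.
    """
    masked, secret = {}, {}
    for pos, allele in personal.items():
        freqs = pop_freqs.get(pos)
        if freqs is None or allele not in freqs:
            # Alleles not described by the public set are left untouched.
            masked[pos] = allele
            continue
        candidates = list(freqs)
        weights = [freqs[a] for a in candidates]
        masked[pos] = rng.choices(candidates, weights=weights, k=1)[0]
        secret[pos] = allele  # the original allele, to be stored encrypted
    return masked, secret


def unmask(masked, secret, positions=None):
    """Restore original alleles, optionally only at the given positions."""
    restored = dict(masked)
    for pos, allele in secret.items():
        if positions is None or pos in positions:
            restored[pos] = allele
    return restored
```

Passing a subset of positions to `unmask` mirrors the idea of restoring personal alleles only in selected genomic regions.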
Topics: Alleles; Gene Frequency; Genome, Human; Genomics; Humans; Privacy
PubMed: 34600465
DOI: 10.1186/s12864-021-07996-2
-
Nature Communications Nov 2023
Genetic and environmental variation are key contributors during organism development, but the influence of minor perturbations or noise is difficult to assess. This study focuses on the stochastic variation in allele-specific expression that persists through cell divisions in the nine-banded armadillo (Dasypus novemcinctus). We investigated the blood transcriptome of five wild monozygotic quadruplets over time to explore the influence of developmental stochasticity on gene expression. We identify an enduring signal of autosomal allelic variability that distinguishes individuals within a quadruplet despite their genetic similarity. This stochastic allelic variation, akin to X-inactivation but broader, provides insight into non-genetic influences on phenotype. The presence of stochastically canalized allelic signatures represents a novel axis for characterizing organismal variability, complementing traditional approaches based on genetic and environmental factors. We also developed a model to explain the inconsistent penetrance associated with these stochastically canalized allelic expressions. By elucidating mechanisms underlying the persistence of allele-specific expression, we enhance understanding of development's role in shaping organismal diversity.
Topics: Humans; Animals; Armadillos; Phenotype; Alleles; Penetrance
PubMed: 37940702
DOI: 10.1038/s41467-023-43024-5
-
Epigenetics & Chromatin Oct 2023
BACKGROUND
Epigenome editing refers to the targeted reprogramming of genomic loci using an EpiEditor, which may consist of an sgRNA/dCas9 complex that recruits DNMT3A/3L to the target locus. Methylation of the locus can lead to a modulation of gene expression. Allele-specific DNA methylation (ASM) refers to the targeted delivery of methylation to only one allele of a locus. In the context of diseases caused by a dominant mutation, the selective DNA methylation of the mutant allele could be used to repress its expression but retain the functionality of the normal gene.
RESULTS
To set up allele-specific targeted DNA methylation, target regions were selected from hypomethylated CGIs bearing a heterozygous SNP in their promoters in the HEK293 cell line. We aimed at delivering maximum DNA methylation with the highest allelic specificity in the targeted regions. Placing SNPs in the PAM or seed regions of the sgRNA, we designed 24 different sgRNAs targeting single alleles in 14 different gene loci. We achieved efficient ASM in multiple cases, such as ISG15, MSH6, GPD1L, MRPL52, PDE8A, NARF, DAP3, and GSPT1, which in the best cases led to five- to tenfold stronger average DNA methylation at the on-target allele and absolute differences in the DNA methylation gain at on- and off-target alleles of >50%. In general, loci with the allele-discriminatory SNP positioned in the PAM region showed a higher success rate of ASM and better specificity. The highest DNA methylation was observed on day 3 after transfection, followed by a gradual decline. In selected cases, ASM was stable for up to 11 days in HEK293 cells and led to up to a 3.6-fold change in allelic expression ratios.
CONCLUSIONS
We successfully delivered ASM at multiple genomic loci with high specificity, efficiency and stability. This form of super-specific epigenome editing could find applications in the treatment of diseases caused by dominant mutations, because it allows silencing of the mutant allele without repression of the expression of the normal allele thereby minimizing potential side-effects of the treatment.
Topics: Humans; DNA Methylation; RNA, Guide, CRISPR-Cas Systems; Epigenesis, Genetic; Alleles; HEK293 Cells; Epigenome; CRISPR-Cas Systems; Gene Editing
PubMed: 37864244
DOI: 10.1186/s13072-023-00515-5