Nucleic Acids Research Jul 2022
We present ANANASTRA, https://ananastra.autosome.org, a web server for the identification and annotation of regulatory single-nucleotide polymorphisms (SNPs) with allele-specific binding events. ANANASTRA accepts a list of dbSNP IDs or a VCF file and reports allele-specific binding (ASB) sites of particular transcription factors or in specific cell types, highlighting those with ASBs significantly enriched at SNPs in the query list. ANANASTRA is built on top of a systematic analysis of allelic imbalance in ChIP-Seq experiments and performs the ASB enrichment test against background sets of SNPs found in the same source experiments as ASB sites but not displaying significant allelic imbalance. We illustrate ANANASTRA usage with selected case studies and expect that ANANASTRA will help to conduct the follow-up of GWAS in terms of establishing functional hypotheses and designing experimental verification.
Topics: Alleles; Binding Sites; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Protein Binding; Transcription Factors; DNA-Binding Proteins
PubMed: 35446421
DOI: 10.1093/nar/gkac262
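The enrichment test described in this abstract can be illustrated with a small sketch. The abstract does not name the statistic, so the one-sided Fisher's exact (hypergeometric) test below is an assumption, and the function and argument names are hypothetical rather than part of ANANASTRA.

```python
from math import comb


def hypergeom_enrichment_p(asb_in_query, asb_total, bg_in_query, bg_total):
    """One-sided p-value that ASB SNPs are over-represented in the query list.

    asb_total / bg_total: all ASB and background (non-ASB) SNPs from the same
    source experiments. asb_in_query / bg_in_query: how many of each appear in
    the user's query list. All four counts are assumed inputs.
    """
    n_query = asb_in_query + bg_in_query   # query SNPs seen in the experiments
    n_total = asb_total + bg_total         # all candidate SNPs
    denom = comb(n_total, n_query)
    upper = min(asb_total, n_query)
    # Sum hypergeometric probabilities of seeing at least the observed ASB count.
    tail = sum(
        comb(asb_total, k) * comb(bg_total, n_query - k)
        for k in range(asb_in_query, upper + 1)
    )
    return tail / denom


# Example: 12 of 40 ASB SNPs and 30 of 400 background SNPs are in the query.
p_value = hypergeom_enrichment_p(12, 40, 30, 400)
```

A small p-value here would suggest that the query list hits ASB sites more often than expected from the background set; a real analysis would also correct for testing many transcription factors or cell types.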
Bioinformatics (Oxford, England) Nov 2021
SUMMARY
The sparse allele vectors file format is an efficient storage format for large-scale DNA variation data and is designed for high throughput association analysis by leveraging techniques for fast deserialization of data into computer memory. A command line interface has been developed to complement the storage format and supports basic features like importing, exporting and subsetting. Additionally, a C++ programming API is available allowing for easy integration into analysis software.
AVAILABILITY AND IMPLEMENTATION
https://github.com/statgen/savvy.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Alleles; Software
PubMed: 33989384
DOI: 10.1093/bioinformatics/btab378
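The idea behind a sparse allele vector can be shown with a toy in-memory sketch: only non-reference alleles are stored, together with their sample offsets. This does not reproduce the actual SAV on-disk layout or the savvy C++ API; all function names are hypothetical.

```python
def to_sparse(genotypes):
    """Keep only the positions and values of non-reference (non-zero) alleles."""
    offsets = [i for i, g in enumerate(genotypes) if g != 0]
    values = [genotypes[i] for i in offsets]
    return len(genotypes), offsets, values


def to_dense(sparse):
    """Rebuild the full per-sample allele vector from its sparse form."""
    n, offsets, values = sparse
    dense = [0] * n
    for i, v in zip(offsets, values):
        dense[i] = v
    return dense


def sparse_dot(sparse, per_sample):
    """Dot product with a dense per-sample vector, touching only carriers."""
    _, offsets, values = sparse
    return sum(v * per_sample[i] for i, v in zip(offsets, values))


genotypes = [0, 0, 1, 0, 2, 0]          # alt-allele counts for six samples
sparse = to_sparse(genotypes)           # (6, [2, 4], [1, 2])
```

Because most variants are rare, a dot product like `sparse_dot` visits only the few carriers, which is the kind of saving an association scan over many variants can benefit from.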
Journal of Integrative Plant Biology May 2021
In rice (Oryza sativa), amylose content (AC) is the major factor that determines eating and cooking quality (ECQ). The diversity in AC is largely attributed to natural allelic variation at the Waxy (Wx) locus. Here we identified a rare Wx allele, Wx , which combines a favorable AC, improved ECQ and grain transparency. Based on a phylogenetic analysis of Wx genomic sequences from 370 rice accessions, we speculated that Wx may have derived from recombination between two important natural Wx alleles, Wx and Wx . We validated the effects of Wx on rice grain quality using both transgenic lines and near-isogenic lines (NILs). When introgressed into the japonica Nipponbare (NIP) background, Wx resulted in a moderate AC that was intermediate between that of NILs carrying the Wx allele and NILs with the Wx allele. Notably, mature grains of NILs fixed for Wx had an improved transparent endosperm relative to soft rice. Further, we introduced Wx into a high-yielding japonica cultivar via molecular marker-assisted selection: the introgressed lines exhibited clear improvements in ECQ and endosperm transparency. Our results suggest that Wx is a promising allele to improve grain quality, especially ECQ and grain transparency of high-yielding japonica cultivars, in rice breeding programs.
Topics: Alleles; Gene Expression Regulation, Plant; Oryza; Plant Proteins
PubMed: 32886440
DOI: 10.1111/jipb.13010
DNA Research : An International Journal... Feb 2019
The current RNA-Seq method analyses fragments of mRNAs, from which it is occasionally difficult to reconstruct the entire transcript structure. Here, we performed and evaluated the recent procedure for full-length cDNA sequencing using the Nanopore sequencer MinION. We applied MinION RNA-Seq for various applications, which would not always be easy using the usual RNA-Seq by Illumina. First, we examined and found that even though the sequencing accuracy was still limited to 92.3%, practically useful RNA-Seq analysis is possible. Particularly, taking advantage of the long-read nature of MinION, we demonstrate the identification of splicing patterns and their combinations as a form of full-length cDNAs without losing precise information concerning their expression levels. Transcripts of fusion genes in cancer cells can also be identified and characterized. Furthermore, the full-length cDNA information can be used for phasing of the SNPs detected by WES on the transcripts, providing essential information to identify allele-specific transcriptional events. We constructed a catalogue of full-length cDNAs in seven major organs for two particular individuals and identified allele-specific transcription and splicing. Finally, we demonstrate that single-cell sequencing is also possible. RNA-Seq on the MinION platform should provide a novel approach that is complementary to the current RNA-Seq.
Topics: Alleles; DNA, Complementary; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Humans; Polymorphism, Single Nucleotide; RNA Splicing; Sequence Analysis, RNA
PubMed: 30462165
DOI: 10.1093/dnares/dsy038
Scientific Reports Nov 2022
Genetic drift is a basic evolutionary principle describing random changes in allelic frequencies, with far-reaching consequences in various topics ranging from species conservation efforts to speciation. The conventional approach assumes that genetic drift has the same effect on all populations undergoing the same changes in size, regardless of different non-reproductive behaviors and history of the populations. However, here we reason that processes leading to a systematic increase of individuals' chances of survival, such as learning or immunological memory, can mitigate loss of genetic diversity caused by genetic drift even if the overall mortality rate in the population does not change. We further test this notion in an agent-based model with overlapping generations, monitoring allele numbers in a population of prey, either able or not able to learn from successfully escaping predators' attacks. Importantly, both these populations start with the same effective size and have the same and constant overall mortality rates. Our results demonstrate that even under these conditions, learning can mitigate loss of genetic diversity caused by drift, by creating a pool of harder-to-die individuals that protect alleles they carry from extinction. Furthermore, this effect holds regardless of whether the population is haploid or diploid or whether it reproduces sexually or asexually. These findings may be of importance not only for basic evolutionary theory but also for other fields using the concept of genetic drift.
Topics: Humans; Genetic Drift; Gene Frequency; Biological Evolution; Alleles; Diploidy
PubMed: 36437294
DOI: 10.1038/s41598-022-24748-8
Philosophical Transactions of the Royal... Feb 2019
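A toy agent-based sketch in the spirit of this abstract is given below. It is not the authors' model: the haploid asexual population, the escape probabilities, and the rule that each step ends with exactly one death (so overall mortality is the same with and without learning) are all simplifying assumptions, and every name is hypothetical.

```python
import random


def simulate(n=200, steps=20000, learn=True, p_escape_naive=0.5,
             p_escape_trained=0.9, seed=0):
    """Return the number of distinct alleles left after `steps` deaths."""
    rng = random.Random(seed)
    alleles = list(range(n))   # every individual starts with a unique allele
    trained = [False] * n      # has this individual escaped an attack before?
    for _ in range(steps):
        # Predators keep attacking random prey until one attack succeeds.
        while True:
            victim = rng.randrange(n)
            if learn and trained[victim]:
                p_escape = p_escape_trained
            else:
                p_escape = p_escape_naive
            if rng.random() < p_escape:
                trained[victim] = True   # the survivor learns from the escape
                continue
            break
        # The dead individual is replaced by a naive offspring of another one.
        parent = rng.randrange(n - 1)
        if parent >= victim:
            parent += 1
        alleles[victim] = alleles[parent]
        trained[victim] = False
    return len(set(alleles))


with_learning = simulate(learn=True, seed=1)
without_learning = simulate(learn=False, seed=1)
```

Comparing `simulate(learn=True)` with `simulate(learn=False)` across many seeds would be the analogue of the comparison the abstract describes; a single pair of runs is too noisy to support a conclusion.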
Genomic imprinting, where an allele's expression pattern depends on its parental origin, is thought to result primarily from an intragenomic evolutionary conflict. Imprinted genes are widely expressed in the brain and have been linked to various phenotypes, including behaviours related to risk tolerance. In this paper, we analyse a model of evolutionary bet-hedging in a system with imprinted gene expression. Previous analyses of bet-hedging have shown that natural selection may favour alleles and traits that reduce reproductive variance, even at the expense of reducing mean reproductive success, with the trade-off between mean and variance depending on the population size. In species where the sexes have different reproductive variances, this bet-hedging trade-off differs between maternally and paternally inherited alleles. Where males have the higher reproductive variance, alleles are more strongly selected to reduce variance when paternally inherited than when maternally inherited. We connect this result to phenotypes connected with specific imprinted genes, including delay discounting and social dominance. The empirical patterns are consistent with paternally expressed imprinted genes promoting risk-averse behaviours that reduce reproductive variance. Conversely, maternally expressed imprinted genes promote risk-tolerant, variance-increasing behaviours. We indicate how future research might further test the hypotheses suggested by our analysis. This article is part of the theme issue 'Risk taking and impulsive behaviour: fundamental discoveries, theoretical perspectives and clinical implications'.
Topics: Alleles; Animals; Biological Evolution; Female; Gene Expression; Genomic Imprinting; Male; Models, Genetic; Phenotype; Selection, Genetic
PubMed: 30966914
DOI: 10.1098/rstb.2018.0142
Cell Systems Feb 2022
Pan-cancer studies sketched the genomic landscape of the tumor types spectrum. We delineated the purity- and ploidy-adjusted allele-specific profiles of 4,950 patients across 27 tumor types from the Cancer Genome Atlas (TCGA). Leveraging allele-specific data, we reclassified as loss of heterozygosity (LOH) 9% and 7% of apparent copy-number wild-type and gain calls, respectively, and overall observed more than 18 million allelic imbalance somatic events at the gene level. Reclassification of copy-number events revealed associations between driver mutations and LOH, pointing out the timings between the occurrence of point mutations and copy-number events. Integrating allele-specific genomics and matched transcriptomics, we observed that allele-specific gene status is relevant in the regulation of TP53 and its targets. Further, we disclosed the role of copy-neutral LOH in the impairment of tumor suppressor genes and in disease progression. Our results highlight the role of LOH in cancer and contribute to the understanding of tumor progression.
Topics: Alleles; Genomics; Humans; Loss of Heterozygosity; Neoplasms
PubMed: 34731645
DOI: 10.1016/j.cels.2021.10.001
Scientific Reports May 2022
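The reclassification mentioned in this abstract rests on looking at the two alleles separately rather than at total copy number alone. The sketch below shows the general idea under assumed rules (diploid baseline, LOH when the minor allele has zero copies); the category labels and thresholds are illustrative and are not taken from the study's pipeline.

```python
def classify_segment(major, minor, ploidy=2):
    """Classify a segment from allele-specific copy numbers.

    major / minor: integer copy numbers of the two alleles after purity and
    ploidy adjustment (assumed inputs). Returns (copy_number_call, has_loh).
    """
    if minor > major or minor < 0:
        raise ValueError("expected 0 <= minor <= major")
    total = major + minor
    if total == 0:
        return "homozygous deletion", False   # no allele left to be retained
    if total < ploidy:
        call = "loss"
    elif total == ploidy:
        call = "wild-type"
    else:
        call = "gain"
    # LOH: one parental allele is gone even if the total looks normal or gained.
    return call, minor == 0


copy_neutral_loh = classify_segment(2, 0)   # ("wild-type", True)
gain_with_loh = classify_segment(3, 0)      # ("gain", True)
balanced = classify_segment(1, 1)           # ("wild-type", False)
```

A total-copy-number caller would report the first two segments as wild-type and gain; only the allele-specific view exposes the loss of heterozygosity.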
Allele-specific expression (ASE) represents differences in the magnitude of expression between alleles of the same gene. This is not straightforward for polyploids, especially autopolyploids, as knowledge about the dose of each allele is required for accurate estimation of ASE. This is the case for the genomically complex Saccharum species, characterized by high levels of ploidy and aneuploidy. We used a Beta-Binomial model to test for allelic imbalance in Saccharum, with adaptations for mixed-ploid organisms. The hierarchical Beta-Binomial model was used to test if allele expression followed the expectation based on genomic allele dosage. The highest frequencies of ASE occurred in sugarcane hybrids, suggesting a possible influence of interspecific hybridization in these genotypes. For all accessions, genes showing ASE (ASEGs) were less frequent than those with balanced allelic expression. These genes were related to a broad range of processes, mostly associated with general metabolism, organelles, responses to stress and responses to stimuli. In addition, the frequency of ASEGs in high-level functional terms was similar among the genotypes, with a few genes associated with more specific biological processes. We hypothesize that ASE in Saccharum is largely a genotype-specific phenomenon, as a large number of ASEGs were exclusive to individual accessions.
Topics: Alleles; Bias; Polymorphism, Single Nucleotide; Polyploidy; Saccharum
PubMed: 35610293
DOI: 10.1038/s41598-022-12725-0
Nature Communications Nov 2023
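The dosage-aware test in this abstract can be approximated by a much simpler, non-hierarchical sketch: a Beta-Binomial whose mean is the genomic dose fraction of the allele. The fixed overdispersion `rho`, the two-sided p-value rule, and all names below are assumptions for illustration, not the authors' hierarchical model.

```python
from math import exp, lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """Beta-Binomial probability of k successes in n trials."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def ase_pvalue(alt_reads, total_reads, alt_dose, ploidy, rho=0.05):
    """Two-sided p-value for the read ratio deviating from genomic dosage.

    alt_dose / ploidy is the expected alt-read fraction with no ASE
    (requires 0 < alt_dose < ploidy); rho in (0, 1) is an assumed
    overdispersion, with a + b = (1 - rho) / rho.
    """
    mu = alt_dose / ploidy
    scale = (1.0 - rho) / rho
    a, b = mu * scale, (1.0 - mu) * scale
    p_obs = betabinom_pmf(alt_reads, total_reads, a, b)
    # Sum the probability of every outcome no more likely than the observed one.
    total = sum(
        p
        for p in (betabinom_pmf(k, total_reads, a, b)
                  for k in range(total_reads + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# An allele present in 2 of 8 genomic copies, seen in 45 of 60 RNA reads.
p_value = ase_pvalue(45, 60, alt_dose=2, ploidy=8)
```

Setting the expectation from `alt_dose / ploidy` instead of a fixed 0.5 is what adapts the test to polyploid or mixed-ploid genomes; in practice `rho` would be estimated from the data rather than fixed.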
Genetic and environmental variation are key contributors during organism development, but the influence of minor perturbations or noise is difficult to assess. This study focuses on the stochastic variation in allele-specific expression that persists through cell divisions in the nine-banded armadillo (Dasypus novemcinctus). We investigated the blood transcriptome of five wild monozygotic quadruplets over time to explore the influence of developmental stochasticity on gene expression. We identify an enduring signal of autosomal allelic variability that distinguishes individuals within a quadruplet despite their genetic similarity. This stochastic allelic variation, akin to X-inactivation but broader, provides insight into non-genetic influences on phenotype. The presence of stochastically canalized allelic signatures represents a novel axis for characterizing organismal variability, complementing traditional approaches based on genetic and environmental factors. We also developed a model to explain the inconsistent penetrance associated with these stochastically canalized allelic expressions. By elucidating mechanisms underlying the persistence of allele-specific expression, we enhance understanding of development's role in shaping organismal diversity.
Topics: Humans; Animals; Armadillos; Phenotype; Alleles; Penetrance
PubMed: 37940702
DOI: 10.1038/s41467-023-43024-5
Journal of Theoretical Biology May 2023
Understanding the role of natural selection in driving evolutionary change requires accurate estimates of the strength of selection acting at the genetic level in the wild. This is challenging to achieve but may be easier in the case of populations in migration-selection balance. When two populations are at equilibrium under migration-selection balance, there exist loci whose alleles are selected in different directions in the two populations. Such loci can be identified from genome sequencing by their high values of F_ST. This raises the question of what is the strength of selection on locally-adaptive alleles. To answer this question we analyse a 1-locus 2-allele model of a population distributed between two niches. We show by simulation of selected cases that the outputs from finite-population models are essentially the same as those from deterministic infinite-population models. We then derive theory for the infinite-population model showing the dependence of selection coefficients on equilibrium allele frequencies, migration rates, dominance and relative population sizes in the two niches. An Excel spreadsheet is provided for the calculation of selection coefficients and their approximate standard errors from observed values of population parameters. We illustrate our results with a worked example, with graphs showing the dependence of selection coefficients on equilibrium allele frequencies, and graphs showing how F_ST depends on the selection coefficients acting on the alleles at a locus. Given the extent of recent progress in ecological genomics, we hope our methods may help those studying migration-selection balance to quantify the advantages conferred by adaptive genes.
Topics: Genetics, Population; Gene Frequency; Selection, Genetic; Chromosome Mapping; Biological Evolution; Alleles; Models, Genetic
PubMed: 36914112
DOI: 10.1016/j.jtbi.2023.111463
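A deterministic 1-locus 2-allele model of the kind described in this abstract can be sketched by iterating selection and migration until allele frequencies settle. The sketch runs in the forward direction (from selection coefficients to equilibrium frequencies), whereas the paper derives the reverse. The fitness scheme, symmetric migration between two equal-sized niches, and all names are assumptions for illustration.

```python
def select(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one round of viability selection."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def equilibrium(s1, s2, h=0.5, m=0.05, p0=0.5, tol=1e-12, max_iter=1_000_000):
    """Iterate selection, then symmetric migration, between two niches.

    Allele A is favoured in niche 1 (AA fitness 1 + s1) and allele a in
    niche 2 (aa fitness 1 + s2); h is the heterozygote's share of the local
    advantage and m the fraction of each niche exchanged per generation.
    """
    p1 = p2 = p0
    for _ in range(max_iter):
        q1 = select(p1, 1 + s1, 1 + h * s1, 1.0)
        q2 = select(p2, 1.0, 1 + h * s2, 1 + s2)
        n1 = (1 - m) * q1 + m * q2
        n2 = (1 - m) * q2 + m * q1
        if abs(n1 - p1) < tol and abs(n2 - p2) < tol:
            return n1, n2
        p1, p2 = n1, n2
    return p1, p2


def fst(p1, p2):
    """F_ST for two equal-sized populations: Var(p) / (p_bar * (1 - p_bar))."""
    p_bar = (p1 + p2) / 2
    return ((p1 - p2) / 2) ** 2 / (p_bar * (1 - p_bar))


p1, p2 = equilibrium(s1=0.1, s2=0.1)
differentiation = fst(p1, p2)
```

Re-running `equilibrium` over a grid of `s1`, `s2` and `m` gives a rough picture of how differentiation between niches grows with selection and shrinks with migration, which is the relationship the paper's graphs are said to show.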