-
Proceedings of the National Academy of... Sep 2022
Selection accumulates information in the genome: it guides stochastically evolving populations toward states (genotype frequencies) that would be unlikely under neutrality. This can be quantified as the Kullback-Leibler (KL) divergence between the actual distribution of genotype frequencies and the corresponding neutral distribution. First, we show that this population-level information sets an upper bound on the information at the level of genotype and phenotype, limiting how precisely they can be specified by selection. Next, we study how the accumulation and maintenance of information is limited by the cost of selection, measured as the genetic load or the relative fitness variance, both of which we connect to the control-theoretic KL cost of control. The information accumulation rate is upper bounded by the population size times the cost of selection. This bound is very general, and applies across models (Wright-Fisher, Moran, diffusion) and to arbitrary forms of selection, mutation, and recombination. Finally, the cost of maintaining information depends on how it is encoded: Specifying a single allele out of two is expensive, but one bit encoded among many weakly specified loci (as in a polygenic trait) is cheap.
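The KL divergence named in this abstract can be illustrated with a minimal Python sketch. The two distributions below are hypothetical toy values over three genotype-frequency classes, not data from the paper, and the function name `kl_divergence_bits` is an assumption made for illustration.

```python
import math


def kl_divergence_bits(p, q):
    """Return the KL divergence D(p || q) in bits for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of classes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a class with zero probability under p contributes nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log2(p_i / q_i)
    return total


# Hypothetical toy example: a uniform "neutral" distribution and a
# distribution concentrated on one class, standing in for selection.
neutral = [1 / 3, 1 / 3, 1 / 3]
selected = [0.05, 0.15, 0.80]
info_bits = kl_divergence_bits(selected, neutral)
```

Under these assumptions `info_bits` is positive, and it falls to zero when the two distributions coincide, which matches the reading of the divergence as information accumulated relative to neutrality.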
Topics: Alleles; Biological Evolution; Gene Frequency; Genetics, Population; Models, Genetic; Selection, Genetic
PubMed: 36037343
DOI: 10.1073/pnas.2123152119
-
Nature Reviews. Genetics Aug 2022 (Review)
Genetic variation, which is generated by mutation, recombination and gene flow, can reduce the mean fitness of a population, both now and in the future. This 'genetic load' has been estimated in a wide range of animal taxa using various approaches. Advances in genome sequencing and computational techniques now enable us to estimate the genetic load in populations and individuals without direct fitness estimates. Here, we review the classic and contemporary literature of genetic load. We describe approaches to quantify the genetic load in whole-genome sequence data based on evolutionary conservation and annotations. We show that splitting the load into its two components - the realized load (or expressed load) and the masked load (or inbreeding load) - can improve our understanding of the population genetics of deleterious mutations.
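As a rough illustration of the quantity this review is about, the sketch below uses the classical definition of genetic load, the proportional reduction in mean fitness relative to the fittest genotype, at a single biallelic locus. The single-locus fitness scheme (1, 1 - hs, 1 - s), the parameter values, and the function names are assumptions chosen for the example. They are not the sequence-based estimation approaches the review describes.

```python
def mean_fitness(q, s, h):
    """Mean fitness at one locus under random mating.

    q: frequency of the deleterious allele
    s: selection coefficient against the deleterious homozygote
    h: dominance coefficient (0 = fully recessive, 1 = fully dominant)
    """
    p = 1.0 - q
    return p * p * 1.0 + 2.0 * p * q * (1.0 - h * s) + q * q * (1.0 - s)


def genetic_load(w_mean, w_max=1.0):
    """Classical genetic load: (w_max - w_mean) / w_max."""
    return (w_max - w_mean) / w_max


# Hypothetical values: a fully recessive allele at 10% frequency.
q, s, h = 0.1, 0.5, 0.0
load = genetic_load(mean_fitness(q, s, h))
```

With a fully recessive allele, only homozygotes lose fitness, so the load in this toy case reduces to q²s.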
Topics: Animals; Genetic Load; Genetic Variation; Genetics, Population; Genome; Genomics; Inbreeding; Mutation
PubMed: 35136196
DOI: 10.1038/s41576-022-00448-x
-
Genetics Nov 2022
Understanding the demographic history of populations is a key goal in population genetics, and with improving methods and data, ever more complex models are being proposed and tested. Demographic models of current interest typically consist of a set of discrete populations, their sizes and growth rates, and continuous and pulse migrations between those populations over a number of epochs, which can require dozens of parameters to fully describe. There is currently no standard format to define such models, significantly hampering progress in the field. In particular, the important task of translating the model descriptions in published work into input suitable for population genetic simulators is labor intensive and error prone. We propose the Demes data model and file format, built on widely used technologies, to alleviate these issues. Demes provides a well-defined and unambiguous model of populations and their properties that is straightforward to implement in software, and a text file format that is designed for simplicity and clarity. We provide thoroughly tested implementations of Demes parsers in multiple languages including Python and C, and showcase initial support in several simulators and inference methods. An introduction to the file format and a detailed specification are available at https://popsim-consortium.github.io/demes-spec-docs/.
Topics: Genetics, Population; Software; Demography
PubMed: 36173327
DOI: 10.1093/genetics/iyac131
-
Viruses Aug 2023
Aspen mosaic-associated virus (AsMaV) is a newly identified virus associated with mosaic symptoms in aspen trees. Aspen trees are widely distributed in Europe, and understanding the population structure of AsMaV may aid in the development of better management strategies. The virus genome consists of five negative-sense single-stranded RNA (-ssRNA) molecules. To investigate the genetic diversity and population parameters of AsMaV, different regions of the genome were amplified and analyzed, and full-length sequences of the divergent isolates were cloned and sequenced. The results show that RNA3 (nucleoprotein) is a good representative for studying genetic diversity in AsMaV. The RT-PCR-RFLP assay developed here was able to identify areas with a higher number of haplotypes and could be applied to screen large numbers of samples. In general, AsMaV has a conserved genome and, based on the phylogenetic studies, geographical structuring was observed in AsMaV isolates from Sweden and Finland, which could be attributed to founder effects. The genome of AsMaV is under purifying selection, which is not distributed uniformly across the genomic RNAs. Distant AsMaV isolates displayed amino acid sequence variations compared to other isolates, and bioinformatic analysis predicted potential post-translational modification sites in some viral proteins.
Topics: Satellite Viruses; Finland; Sweden; Phylogeny; Genetics, Population; Mosaic Viruses
PubMed: 37632020
DOI: 10.3390/v15081678
-
Molecular Ecology Jun 2021 (Review)
A key step in understanding the genetic basis of different evolutionary outcomes (e.g., adaptation) is to determine the roles played by different mutation types (e.g., SNPs, translocations and inversions). To do this we must simultaneously consider different mutation types in an evolutionary framework. Here, we propose a research framework that directly utilizes the most important characteristics of mutations, their population genetic effects, to determine their relative evolutionary significance in a given scenario. We review known population genetic effects of different mutation types and show how these may be connected to different evolutionary outcomes. We provide examples of how to implement this framework and pinpoint areas where more data, theory and synthesis are needed. Linking experimental and theoretical approaches to examine different mutation types simultaneously is a critical step towards understanding their evolutionary significance.
Topics: Adaptation, Physiological; Biological Evolution; Chromosome Inversion; Genetics, Population; Models, Genetic; Mutation; Mutation Rate; Population Density; Selection, Genetic
PubMed: 33955064
DOI: 10.1111/mec.15936
-
Nature Reviews. Genetics Jan 2024 (Review)
In population genetics, the emergence of large-scale genomic data for various species and populations has provided new opportunities to understand the evolutionary forces that drive genetic diversity using statistical inference. However, the era of population genomics presents new challenges in analysing the massive amounts of genomes and variants. Deep learning has demonstrated state-of-the-art performance for numerous applications involving large-scale data. Recently, deep learning approaches have gained popularity in population genetics; facilitated by the advent of massive genomic data sets, powerful computational hardware and complex deep learning architectures, they have been used to identify population structure, infer demographic history and investigate natural selection. Here, we introduce common deep learning architectures and provide comprehensive guidelines for implementing deep learning models for population genetic inference. We also discuss current challenges and future directions for applying deep learning in population genetics, focusing on efficiency, robustness and interpretability.
Topics: Deep Learning; Genomics; Genetics, Population; Genome; Biological Evolution
PubMed: 37666948
DOI: 10.1038/s41576-023-00636-3
-
Philosophical Transactions of the Royal... Jun 2022 (Review)
Over the past 50 years, geneticists have made great strides in understanding how our species' evolutionary history gave rise to current patterns of human genetic diversity classically summarized by Lewontin in his 1972 paper, 'The Apportionment of Human Diversity'. One evolutionary process that requires special attention in both population genetics and statistical genetics is admixture: gene flow between two or more previously separated source populations to form a new admixed population. The admixture process introduces ancestry-based structure into patterns of genetic variation within and between populations, which in turn influences the inference of demographic histories, identification of genetic targets of selection and prediction of complex traits. In this review, we outline some challenges for admixture population genetics, including limitations of applying methods designed for populations without recent admixture to the study of admixed populations. We highlight recent studies and methodological advances that aim to overcome such challenges, leveraging genomic signatures of admixture that occurred in the past tens of generations to gain insights into human history, natural selection and complex trait architecture. This article is part of the theme issue 'Celebrating 50 years since Lewontin's apportionment of human diversity'.
Topics: Gene Flow; Genetic Variation; Genetics, Population; Human Genetics; Humans; Metagenomics; Selection, Genetic
PubMed: 35430881
DOI: 10.1098/rstb.2020.0410
-
Molecular Ecology Aug 2023 (Review)
Advancements in environmental DNA (eDNA) approaches have allowed for rapid and efficient species detections in diverse environments. Although most eDNA research is focused on leveraging genetic diversity to identify taxa, some recent studies have explored the potential for these approaches to detect within-species genetic variation, allowing for population genetic assessments and abundance estimates from environmental samples. However, we currently lack a framework outlining the key considerations specific to generating, analysing and applying eDNA data for these two purposes. Here, we discuss how various genetic markers differ with regard to genetic information and detectability in environmental samples and how analysis of eDNA samples differs from common tissue-based analyses. We then outline how it may be possible to obtain species absolute abundance estimates from eDNA by detecting intraspecific genetic variation in mixtures of DNA under multiple scenarios. We also identify the major causes contributing to allele detection and frequency errors in eDNA data, discuss their consequences for population-level analyses and outline bioinformatic approaches to detect and remove erroneous sequences. This review summarizes the key advances required to harness the full potential of eDNA-based intraspecific genetic variation to inform population-level questions in ecology, evolutionary biology and conservation management.
Topics: DNA, Environmental; Biodiversity; DNA Barcoding, Taxonomic; Environmental Monitoring; Genetics, Population; Genetic Variation
PubMed: 37254233
DOI: 10.1111/mec.17031
-
Molecular Ecology Apr 2023
The field of biogeography unites landscape genetics and phylogeography under a common conceptual framework. Landscape genetics traditionally focuses on recent-time, population-based, spatial genetics processes at small geographical scales, while phylogeography typically investigates deep past, lineage- and species-based processes at large geographical scales. Here, we evaluate the link between landscape genetics and phylogeographical methods using the western fence lizard (Sceloporus occidentalis) as a model species. First, we conducted replicated landscape genetics studies across several geographical scales to investigate how population genetics inferences change depending on the spatial extent of the study area. Then, we carried out a phylogeographical study of population structure at two evolutionary scales informed by inferences derived from landscape genetics results to identify concordance and conflict between these sets of methods. We found significant concordance in landscape genetics processes at all but the largest geographical scale. Phylogeographical results indicate major clades are restricted to distinct river drainages or distinct hydrological regions. At a more recent timescale, we find minor clades are restricted to single river canyons in the majority of cases, while the remainder of river canyons include samples from at most two clades. Overall, the broad-scale pattern implicating stream and river valleys as key features linking populations in the landscape genetics results, and high degree of clade specificity within major topographic subdivisions in the phylogeographical results, is consistent. As landscape genetics and phylogeography share many of the same objectives, synthesizing theory, models and methods between these fields will help bring about a better understanding of ecological and evolutionary processes structuring genetic variation across space and time.
Topics: Biological Evolution; Genetics, Population; Phylogeography; Rivers; Genetic Variation; Phylogeny
PubMed: 36695049
DOI: 10.1111/mec.16861
-
Twin Research and Human Genetics : the... Jun 2021
The Hardy-Weinberg law of population genetics is usually associated with the notion of random mating of parents. A numerical example for a triallelic autosomal locus shows that an uncountable set of mating combinations can maintain Hardy-Weinberg proportions. Therefore, one cannot infer random mating in a population from the observation of Hardy-Weinberg equilibrium. The mating system which ensures that the genotypic distribution of offspring is the same as that of the parents is specified.
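For orientation, here is a minimal Python sketch of Hardy-Weinberg proportions at a triallelic locus. The allele frequencies are hypothetical and are not the numerical example from the paper. The sketch covers only the familiar random-mating case, in which parents already in Hardy-Weinberg proportions produce offspring in the same proportions. It does not construct the non-random mating systems that, according to the abstract, can maintain the same proportions.

```python
from itertools import combinations_with_replacement


def hardy_weinberg(allele_freqs):
    """Map each unordered genotype (i, j), i <= j, to its Hardy-Weinberg proportion."""
    proportions = {}
    for i, j in combinations_with_replacement(range(len(allele_freqs)), 2):
        p_i, p_j = allele_freqs[i], allele_freqs[j]
        # Homozygotes occur at p_i^2, heterozygotes at 2 * p_i * p_j.
        proportions[(i, j)] = p_i * p_i if i == j else 2.0 * p_i * p_j
    return proportions


def allele_freqs_from_genotypes(genotype_freqs, n_alleles):
    """Recover allele frequencies by counting the two alleles of each genotype."""
    freqs = [0.0] * n_alleles
    for (i, j), f in genotype_freqs.items():
        freqs[i] += f / 2.0
        freqs[j] += f / 2.0
    return freqs


# Hypothetical allele frequencies at a triallelic autosomal locus.
alleles = [0.5, 0.3, 0.2]
parents = hardy_weinberg(alleles)
# Under random mating, offspring genotypes follow from the parental allele pool.
offspring = hardy_weinberg(allele_freqs_from_genotypes(parents, len(alleles)))
```

Comparing `parents` and `offspring` shows the proportions are unchanged across a generation of random mating. The abstract's point is that observing this stability does not, by itself, imply that mating was random.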
Topics: Gene Frequency; Genetics, Population; Genotype; Humans; Models, Genetic; Reproduction
PubMed: 34291729
DOI: 10.1017/thg.2021.26