In this paper, we apply the Sato–Crutchfield equation to population genetics and study it both analytically and numerically. We consider several cases of a population with one locus and two alleles. We find that the Sato–Crutchfield equation does not affect the stability of the evolutionary equations, but it does affect the reinforcement of the choice of the heterozygote state.
It has been observed that a higher mutation load could be introduced into the genomes of children conceived by assisted reproduction technology (in vitro fertilization). This generates two effects: slightly higher mutational pressure on the whole genetic pool of the population, and inhomogeneity of mutation distributions in the genetic pool. Computer simulations of the Penna ageing model suggest that even a small fraction of births with an enhanced number of new mutations can negatively influence the whole population.
In this paper, we consider the evolution of an (infinitely large) population under recombination and additional evolutionary forces, modeled by a measure-valued ordinary differential equation. We provide a stochastic representation for the solution of this model via duality to a new labeled partitioning process with Markovian labels. In the special case of single-crossover, this leads to a recursive solution formula. This extends (and unifies) previous results on the selection–recombination equation. As a concrete example, we consider the selection–mutation–recombination equation.
A current major focus in genomics is the large-scale collection of genotype data in populations in order to detect variations in the population. The variation data are sought in order to address fundamental and applied questions in genetics that concern the haplotypes in the population. Since almost all the collected data is in the form of genotypes, but the downstream genetics questions concern haplotypes, the standard approach to this issue has been to first infer haplotypes from the genotypes, and then answer the downstream questions using the inferred haplotypes. That two-stage approach has potential deficiencies, giving rise to the general question of how well one can answer the downstream questions using genotype data without first inferring haplotypes, and giving rise to the goal of computing the range of downstream answers that would be obtained over the range of possible inferred haplotype solutions. This paper provides some tools for the study of those issues, and some partial answers. We present algorithms to solve downstream questions concerning the minimum amount of recombination needed to derive given genotypic data, without first fixing a choice of haplotypes. We apply these algorithms to the goal of finding recombination hotspots, obtaining results as good as those of a published method that first infers haplotypes; and to the case of estimating the minimum amount of recombination needed to derive the true haplotypes underlying the genotypic data, obtaining weaker results compared to first inferring haplotypes using the program PHASE.
While full-sibling group reconstruction from microsatellite data is a well-studied problem, reconstruction of half-sibling groups is much less studied, theoretically challenging, and computationally demanding. In this paper, we present a formulation of the half-sibling reconstruction problem and prove its APX-hardness. We also present exact solutions for this formulation and develop heuristics. Using biological and synthetic datasets we present experimental results and compare them with the leading alternative software COLONY. We show that our results are competitive and allow half-sibling group reconstruction in the presence of polygamy, which is prevalent in nature.
Association tests performed with the Likelihood-Ratio Test (LR Test) can be an alternative to FST, which is often used in population genetics to find variants of interest. Because the LR Test has several properties that could make it preferable to FST, we propose a novel approach for modeling unknown genotypes in highly-similar species. To show the effectiveness of this LR Test approach, we apply it to single-nucleotide polymorphisms (SNPs) associated with the recent speciation of the malaria vectors Anopheles gambiae and Anopheles coluzzii and compare to FST.
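To make the comparison between the two statistics concrete, the following is a minimal sketch that computes both for a single biallelic SNP in two populations. It assumes Wright's FST with equal population weights and a plain likelihood-ratio (G) statistic on a table of allele counts; it does not implement the abstract's approach for modeling unknown genotypes. The function names and the allele counts are hypothetical.

```python
import math

def wright_fst(p1, p2):
    """Two-population FST = (HT - HS) / HT for one biallelic SNP, equal weights."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                       # total heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

def g_statistic(table):
    """Likelihood-ratio (G) statistic for independence in a table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing to the sum
                g += obs * math.log(obs * n / (row_tot[i] * col_tot[j]))
    return 2 * g

# Hypothetical allele counts: rows are populations, columns are alleles.
counts = [[90, 10], [10, 90]]
fst = wright_fst(0.9, 0.1)
g = g_statistic(counts)
print(f"FST = {fst:.2f}, G = {g:.1f}")
```

For a 2x2 table, G is approximately chi-square distributed with one degree of freedom under the null hypothesis of no frequency difference, so it yields a p-value directly, whereas FST on its own is a descriptive measure of differentiation.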
The mismatch distribution is a useful descriptive summary statistic in population genetics. This article scans the mismatch distribution across the human genome using single nucleotide polymorphism (SNP) data from the International HapMap Project. An abnormal mismatch distribution was found to indicate special segments on some chromosomes. One of these segments, on chromosome 8, was confirmed to be an inversion. Other special segments may likewise reflect structural features of chromosomes, such as duplications. The conjectures about the remaining segments require further research.
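As a reminder of what the statistic measures, here is a minimal sketch of a mismatch distribution: the histogram of the number of differing sites over all pairs of haplotypes in a window. The function name and the toy haplotypes are hypothetical, and the abstract's genome-wide scanning procedure is not reproduced.

```python
from collections import Counter
from itertools import combinations

def mismatch_distribution(haplotypes):
    """Return {number of differing sites: number of haplotype pairs}."""
    if len({len(h) for h in haplotypes}) > 1:
        raise ValueError("all haplotypes must have the same length")
    counts = Counter()
    for a, b in combinations(haplotypes, 2):
        counts[sum(x != y for x, y in zip(a, b))] += 1
    return dict(sorted(counts.items()))

# Toy SNP haplotypes (hypothetical; 0/1 are the two alleles at each site).
toy = ["00000", "00001", "11110", "11111"]
print(mismatch_distribution(toy))  # → {1: 2, 4: 2, 5: 2}
```

The toy sample splits into two divergent groups, so the distribution has one mode at small distances (pairs within a group) and another at large distances (pairs between groups). A bimodal shape of this kind is one way a distribution can look abnormal for a segment.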
The Y chromosome contains short tandem repeats (STRs) that have numerous applications, including forensic investigations, male identification for legal purposes, and population genetics. However, commercially available Y-STR tests have limitations in their ability to differentiate closely related male individuals in forensic genetics. Recent studies have shown that rapidly mutating (RM) Y-STRs offer significantly greater haplotype diversity across worldwide populations than conventional Y-STRs, although some RM Y-STR loci are not included in current commercial kits. This research aimed to evaluate the effectiveness of RM Y-STR haplotype frequencies in distinguishing individuals with genetic variations in the Gilgit population and other Pakistani populations. The study involved analyzing several RM Y-STRs in 56 unrelated Gilgit men and 21 other Pakistani populations. Statistical analysis showed that most of the loci maintained haplotype values, while some varied in certain cases. The results indicated that RM Y-STRs were highly effective in distinguishing genetic differences among the Gilgit population and other Pakistani populations, as evidenced by the gene diversity (GD), average GD, discrimination capacity (DC), match probability (MP), and power of discrimination (PD) values calculated for the set of various RM Y-STRs in Table 1 for the Gilgit population and in Table 2 for other Pakistani populations. These findings highlight the potential of RM Y-STRs in forensic genetics and population genetics research and underscore the importance of including a diverse set of loci to maximize the discriminatory power of genetic markers. Further research could explore the utility of RM Y-STRs in other populations and their potential for use in other applications beyond forensic investigations and male identification.
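The abstract does not state how GD, DC, MP and PD were computed. The sketch below uses the definitions commonly applied to Y-STR haplotype data: MP as the sum of squared haplotype frequencies, GD as Nei's unbiased diversity n/(n-1) * (1 - MP), DC as the fraction of distinct haplotypes, and PD as 1 - MP. It is an assumption that the study used these forms. The function name and the sample haplotypes are hypothetical.

```python
from collections import Counter

def haplotype_stats(haplotypes):
    """Common forensic summary statistics for a sample of haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    mp = sum((c / n) ** 2 for c in counts.values())  # match probability
    return {
        "GD": n / (n - 1) * (1 - mp),  # gene (haplotype) diversity
        "DC": len(counts) / n,         # discrimination capacity
        "MP": mp,
        "PD": 1 - mp,                  # power of discrimination
    }

# Hypothetical sample: each haplotype is a tuple of repeat counts at two loci.
sample = [(12, 30), (12, 30), (13, 29), (11, 31)]
stats = haplotype_stats(sample)
print(stats)
```

Haplotypes are represented as tuples so that they are hashable and can be counted directly; any per-locus averaging, such as the average GD mentioned in the abstract, would be a separate step on single-locus allele lists.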
We study the problem of merging genetic maps, when the individual genetic maps are given as directed acyclic graphs. The problem is to build a consensus map, which includes and is consistent with all (or, the vast majority of) the markers in the individual maps. When markers in the input maps have ordering conflicts, the resulting consensus map will contain cycles. We formulate the problem of resolving cycles in a combinatorial optimization framework, which in turn is expressed as an integer linear program. A faster approximation algorithm is proposed, and an additional speed-up heuristic is developed. According to an extensive set of experimental results, our tool is consistently better than JOINMAP, both in terms of accuracy and running time.
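The way ordering conflicts turn into cycles can be sketched as follows. Each map is given as a list of directed "marker a precedes marker b" edges, the maps are merged by taking the union of edges, and Kahn's topological sort reports the markers that cannot be ordered. This only detects conflicts; the integer linear program and the approximation algorithm mentioned in the abstract are not implemented, and all names and marker labels are hypothetical.

```python
from collections import deque

def merge_maps(maps):
    """Union the precedence edges of several maps into one directed graph."""
    graph = {}
    for edges in maps:
        for a, b in edges:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set())  # make sure sink markers are nodes too
    return graph

def unorderable_markers(graph):
    """Markers left over by Kahn's algorithm: on, or downstream of, a cycle."""
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for s in successors:
            indegree[s] += 1
    queue = deque(node for node, d in indegree.items() if d == 0)
    placed = set()
    while queue:
        node = queue.popleft()
        placed.add(node)
        for s in graph[node]:
            indegree[s] -= 1
            if indegree[s] == 0:
                queue.append(s)
    return set(graph) - placed

# Two hypothetical maps that disagree on the order of markers B and C.
map1 = [("A", "B"), ("B", "C")]
map2 = [("A", "C"), ("C", "B"), ("B", "D")]
print(sorted(unorderable_markers(merge_maps([map1, map2]))))  # → ['B', 'C', 'D']
```

Marker D is reported even though it is not itself in conflict, because it can only be placed after B, and B is caught in the B/C cycle.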
Since their discovery in the early 1990s, the human arylamine N-acetyltransferase (NAT) genes have been the subject of a tremendous number of molecular anthropological studies describing their nucleotide diversity in a wide range of populations worldwide. While (HUMAN)NAT2 presents a high number of nucleotide substitutions, with seven of them reaching polymorphic frequencies in almost all human populations, and a high level of non-synonymous changes relative to synonymous, (HUMAN)NAT1 is much less diverse, particularly in its coding region. A pseudogene, NATP, the third member of this small gene family, harbours a diversity similar to NAT2. In accordance with that, selective neutrality tests suggest that (HUMAN)NAT1 and (HUMAN)NAT2 evolve under distinct selective regimes. An evolution of (HUMAN)NAT2 under positive population-specific pressures is proposed to be probably linked to the mode of subsistence and/or the chemical environment populations live in, as reflected by climatic zones and biomes; in contrast, for (HUMAN)NAT1, functional constraints determining the strength of purifying selection are generally invoked.
Many multi-cellular organisms exhibit remarkably similar patterns of aging and mortality. Because this phenomenon appears to arise from the complex interaction of many genes, it has been a challenge to explain it quantitatively as a response to natural selection. We survey attempts by the author and his collaborators to build a framework for understanding how mutation, selection and recombination acting on many genes combine to shape the distribution of genotypes in a large population. A genotype drawn at random from the population at a given time is described by a Poisson random measure on the space of loci and its distribution is characterized by the associated intensity measure. The intensity measures evolve according to a continuous-time measure-valued dynamical system. We present general results on the existence and uniqueness of this dynamical system and how it arises as a limit of discrete generation systems. We also discuss existence of equilibria.
This contribution is concerned with mathematical models for the dynamics of the genetic composition of populations evolving under recombination. Recombination is the genetic mechanism by which two parent individuals create the mixed type of their offspring during sexual reproduction. The corresponding models are large, nonlinear dynamical systems (for the deterministic treatment that applies in the infinite-population limit), or interacting particle systems (for the stochastic treatment required for finite populations). We review recent progress on these difficult problems. In particular, we present a closed solution of the deterministic continuous-time system, for the important special case of single crossovers; we extract an underlying linearity; we analyse how this carries over to the corresponding stochastic setting; and we provide a solution of the analogous deterministic discrete-time dynamics, in terms of its generalised eigenvalues and a simple recursion for the corresponding coefficients.
Neighbor-joining is one of the most widely used methods for constructing evolutionary trees. This approach from phylogenetics is often employed in population genetics, where distance matrices obtained from allele frequencies are used to produce a representation of population relationships in the form of a tree. In phylogenetics, the utility of neighbor-joining derives partly from a result that for a class of distance matrices including those that are additive or tree-like—generated by summing weights over the edges connecting pairs of taxa in a tree to obtain pairwise distances—application of neighbor-joining recovers exactly the underlying tree. For populations within a species, however, migration and admixture can produce distance matrices that reflect more complex processes than those obtained from the bifurcating trees typical in the multispecies context. Admixed populations—populations descended from recent mixture of groups that have long been separated—have been observed to be located centrally in inferred neighbor-joining trees, with short external branches incident to the path connecting their source populations. Here, using a simple model, we explore mathematically the behavior of an admixed population under neighbor-joining. We show that with an additive distance matrix, a population admixed among two source populations necessarily lies on the path between the sources. Relaxing the additivity requirement, we examine the smallest nontrivial case—four populations, one of which is admixed between two of the other three—showing that the two source populations never merge with each other before one of them merges with the admixed population. Furthermore, the distance on the constructed tree between the admixed population and either source population is always smaller than the distance between the source populations, and the external branch for the admixed population is always incident to the path connecting the sources. 
We define three properties that hold for four taxa and that we hypothesize are satisfied under more general conditions: antecedence of clustering, intermediacy of distances, and intermediacy of path lengths. Our findings can inform interpretations of neighbor-joining trees with admixed groups, and they provide an explanation for patterns observed in trees of human populations.
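The four-population case above can be explored numerically with the neighbor-joining Q criterion, Q(i,j) = (n-2) d(i,j) - sum_k d(i,k) - sum_k d(j,k), where the pair with the smallest Q is merged first. The sketch below assumes a simple linear admixture model, which may differ from the model used in the paper: the admixed population M draws a fraction alpha from source S1 and 1 - alpha from S2, and each of its distances is the corresponding weighted average of the source distances. The function names and numeric distances are hypothetical.

```python
from itertools import combinations

def nj_q(dist, names):
    """Neighbor-joining Q value for every pair of taxa."""
    n = len(names)
    total = {a: sum(dist[frozenset((a, b))] for b in names if b != a)
             for a in names}
    return {(a, b): (n - 2) * dist[frozenset((a, b))] - total[a] - total[b]
            for a, b in combinations(names, 2)}

def admixture_distances(d12, d1o, d2o, alpha):
    """Distances among S1, S2, outgroup O, and M = alpha*S1 + (1-alpha)*S2.

    Assumed model: d(M, X) = alpha*d(S1, X) + (1-alpha)*d(S2, X).
    """
    return {
        frozenset(("S1", "S2")): d12,
        frozenset(("S1", "O")): d1o,
        frozenset(("S2", "O")): d2o,
        frozenset(("M", "S1")): (1 - alpha) * d12,
        frozenset(("M", "S2")): alpha * d12,
        frozenset(("M", "O")): alpha * d1o + (1 - alpha) * d2o,
    }

names = ["S1", "S2", "M", "O"]
q = nj_q(admixture_distances(1.0, 1.2, 1.5, alpha=0.3), names)
first = min(q, key=q.get)
print("first pair merged:", first)
```

With four taxa, a pair and its complementary pair always share the same Q value, so there are only three distinct pairings to compare. Under the assumed model and the triangle inequality, the pairing that joins S1 with S2 never has the strictly smallest Q, which matches the antecedence-of-clustering property described in the abstract.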
Most deleterious mutations have very slight effects on total fitness, and it has become clear that below a certain fitness effect threshold, such low-impact mutations fail to respond to natural selection. The existence of such a selection threshold suggests that many low-impact deleterious mutations should accumulate continuously, resulting in relentless erosion of genetic information. In this paper, we use numerical simulation to examine this problem of selection threshold.
The objective of this research was to investigate the effect of various biological factors individually and jointly on mutation accumulation in a model human population. For this purpose, we used a recently-developed, biologically-realistic numerical simulation program, Mendel's Accountant. This program introduces new mutations into the population every generation and tracks each mutation through the processes of recombination, gamete formation, mating, and transmission to the new offspring. This method tracks which individuals survive to reproduce after selection, and records the transmission of each surviving mutation every generation. This allows a detailed mechanistic accounting of each mutation that enters and leaves the population over the course of many generations. We term this type of analysis genetic accounting.
Across all reasonable parameter settings, we observed that high impact mutations were selected away with very high efficiency, while very low impact mutations accumulated just as if there was no selection operating. There was always a large transitional zone, wherein mutations with intermediate fitness effects accumulated continuously, but at a lower rate than would occur in the absence of selection. To characterize the accumulation of mutations of different fitness effect we developed a new statistic, selection threshold (STd), which is an empirically determined value for a given population. A population's selection threshold is defined as that fitness effect wherein deleterious mutations are accumulating at exactly half the rate expected in the absence of selection. This threshold is mid-way between entirely selectable, and entirely unselectable, mutation effects.
Our investigations reveal that under a very wide range of parameter values, selection thresholds for deleterious mutations are surprisingly high. Our analyses of the selection threshold problem indicate that given even modest levels of noise affecting either the genotype-phenotype relationship or the genotypic fitness-survival-reproduction relationship, accumulation of low-impact mutations continually degrades fitness, and this degradation is far more serious than has been previously acknowledged. Simulations based on recently published values for mutation rate and effect-distribution in humans show a steady decline in fitness that is not even halted by extremely intense selection pressure (12 offspring per female, 10 selectively removed). Indeed, we find that under most realistic circumstances, the large majority of harmful mutations are essentially unaffected by natural selection and continue to accumulate unhindered. This finding has major theoretical implications and raises the question, “What mechanism can preserve the many low-impact nucleotide positions that constitute most of the information within a genome?”
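The definition of the selection threshold given above (the fitness effect at which mutations accumulate at half the rate expected without selection) can be read as an interpolation problem. The following is a minimal sketch, not the procedure used by Mendel's Accountant: it assumes the accumulation ratio (observed count divided by the count expected without selection) has already been tabulated per fitness-effect bin, and it interpolates linearly in log10 of the effect size, which is itself an assumption. The function name and the toy numbers are hypothetical.

```python
import math

def selection_threshold(effects, ratios, target=0.5):
    """Fitness effect at which the accumulation ratio crosses `target`.

    effects: positive fitness-effect magnitudes, one per bin.
    ratios:  observed / expected-without-selection accumulation per bin.
    Returns None if the ratio never crosses the target.
    """
    points = sorted(zip(effects, ratios))
    for (e0, r0), (e1, r1) in zip(points, points[1:]):
        if r0 != r1 and (r0 - target) * (r1 - target) <= 0:
            t = (target - r0) / (r1 - r0)
            log_e = math.log10(e0) + t * (math.log10(e1) - math.log10(e0))
            return 10 ** log_e
    return None

# Toy data: tiny effects accumulate freely, large effects are removed.
effects = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
ratios = [1.0, 0.9, 0.6, 0.4, 0.0]
st_d = selection_threshold(effects, ratios)
print(f"estimated selection threshold: {st_d:.2e}")
```

In the toy data the ratio passes through 0.5 halfway between the 1e-4 and 1e-3 bins on a log scale, so the estimate falls at about 3.2e-4.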
Background. In a companion paper, careful numerical simulation was used to demonstrate that there is a quantifiable selection threshold, below which low-impact deleterious mutations escape purifying selection and, therefore, accumulate without limit. In that study we developed the statistic, STd, which is the mid-point of the transition zone between selectable and un-selectable deleterious mutations. We showed that under most natural circumstances, STd values are surprisingly high, such that the large majority of all deleterious mutations are un-selectable. Does a similar selection threshold exist for beneficial mutations?
Methods. As in our companion paper we here employ what we describe as genetic accounting to quantify the selection threshold (STb) for beneficial mutations, and we study how various biological factors combine to determine its value.
Results. In all experiments that employ biologically reasonable parameters, we observe high STb values and a general failure of selection to preferentially amplify the large majority of beneficial mutations. High-impact beneficial mutations strongly interfere with selection for or against all low-impact mutations.
Conclusions. A selection threshold exists for beneficial mutations similar in magnitude to the selection threshold for deleterious ones, but the dynamics of that threshold are different. Our results suggest that for higher eukaryotes, minimal values for STb are in the range of 10^-4 to 10^-3. It appears very likely that most functional nucleotides in a large genome have fractional contributions to fitness much smaller than this. This means that, given our current understanding of how natural selection operates, we cannot explain the origin of the typical functional nucleotide.