Publications

Journal Article
Wakeley J. Pairwise differences under a general model of population subdivision. J. Genet. 1996;75:81-89.
A number of different migration and isolation models of population subdivision have been studied. In this paper I analyse a general model of two populations derived from a common ancestral population at some time in the past. The two populations may exchange migrants, but they may also be completely isolated from each other. I derive the expectation and variance of the number of differences between two sequences sampled from the two populations. These are then compared to the corresponding results from two other much-used models: equilibrium migration and complete isolation.
Wakeley J. Distinguishing migration from isolation using the variance of pairwise differences. Theoret. Pop. Biol. 1996;49(3):369-386.

Two demographic scenarios are considered: two populations with migration and two populations that have been completely isolated from each other for some period of time. The variance of the number of differences between pairs of sequences in a single sample is studied and forms the basis of a test of the isolation model. The migration model is one possible alternative to isolation. The isolation model is rejected when the proposed test statistic, which involves the variances of pairwise difference within and between populations, is larger than [...]. The power and realized significance of the test are investigated using simulations, and an example using mitochondrial DNA illustrates its application.

Wakeley J. The variance of pairwise nucleotide differences in two populations with migration. Theoretical Population Biology. 1996;49:39-57.

The variances of three measures of pairwise difference are derived for the case of two populations that exchange migrants. The resulting expressions can be used to place standard errors on estimates of population genetic parameters. The three measures considered are the average number of intrapopulation nucleotide differences, the average number of interpopulation nucleotide differences, and the net number of nucleotide differences between the two populations. The expectations of these statistics are previously known and suggest that they might be used to quantify the divergence between populations. However, the standard errors of all three statistics are shown to be quite large relative to their expectations. Thus, our ability to quantify divergence between populations with them is limited, at least using available data. An analysis of mitochondrial DNA sequences from grey-crowned babblers illustrates the application of the theory. The variances derived here for migration are compared to previously published results for two populations that have been completely isolated from one another for some length of time. All three variances are greater under migration than under isolation, suggesting that a test to distinguish these two demographic situations could be developed.

Cummings MP, Otto SP, Wakeley J. Sampling properties of DNA sequence data in phylogenetic analysis. Molecular Biology and Evolution. 1995;12(5):814-822.

We inferred phylogenetic trees from individual genes and random samples of nucleotides from the mitochondrial genomes of 10 vertebrates and compared the results to those obtained by analyzing the whole genomes. Individual genes are poor samples in that they infrequently lead to the whole-genome tree. A large number of nucleotide sites is needed to exactly determine the whole-genome tree. A relatively small number of sites, however, often results in a tree close to the whole-genome tree. We found that blocks of contiguous sites were less likely to lead to the whole-genome tree than samples composed of sites drawn individually from throughout the genome. Samples of contiguous sites are not representative of the entire genome, a condition that violates a basic assumption of the bootstrap method as it is applied in phylogenetic studies.

Wakeley J. Substitution-rate variation among sites and the estimation of transition bias. Mol. Biol. Evol. 1994;11(3):436-442.

Substitution-rate variation among sites and differences in the probabilities of change among the four nucleotides are conflated in DNA sequence comparisons. When variation in rate exists among sites but is ignored, biases in the rates of change among nucleotides are underestimated. This paper provides a quantification of this effect when the observed proportions of transitions, P, and transversions, Q, between two sequences are used to estimate transition bias. The utility of P/Q as an estimator is examined both with and without rate variation among sites. A gamma-distributed-rates model is used to illustrate the effect that variation among sites has on estimates of transition bias, but it is argued that the basic results should hold for any pattern of rate variation. Naive estimates of the extent of transition bias, those that ignore rate variation when it is present, can seriously underestimate its true value. The extent of this underestimation increases with the amount of rate variation among sites. An example using human mitochondrial DNA shows that a simple comparison of the proportions of transitions and transversions in recently diverged sequences underestimates the level of transition bias by approximately 15%. This does not depend on the use of P/Q to estimate transition bias; maximum-likelihood methods give similar results.
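As a rough illustration of the quantities named in this abstract, the sketch below counts the observed proportions of transitions (P) and transversions (Q) between two aligned sequences. It is not code from the paper: the function name and the choice to skip any position that is not A, C, G, or T are assumptions made for this example, and the result is the naive, uncorrected comparison that the abstract says underestimates transition bias when rates vary among sites.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_proportions(seq1, seq2):
    """Return (P, Q): observed proportions of transitions and transversions.

    Hypothetical helper for illustration. Positions where either sequence
    has a character other than A, C, G, or T (gaps, ambiguity codes) are
    skipped, which is an assumption of this sketch.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        # A<->G and C<->T are transitions; every other change is a transversion.
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared
```

For the toy pair "AAGGCCTTAC" and "AGGACTTTCC" (invented for this example), three of the ten sites differ by a transition and one by a transversion, so the function returns P = 0.3 and Q = 0.1.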

Wakeley J. Substitution rate variation among sites in hypervariable region 1 of human mitochondrial DNA. J. Mol. Evol. 1993;37:613-623.

More than an order of magnitude difference in substitution rate exists among sites within hypervariable region 1 of the control region of human mitochondrial DNA. A two-rate Poisson mixture and a negative binomial distribution are used to describe the distribution of the inferred number of changes per nucleotide site in this region. When three data sets are pooled, however, the two-rate model cannot explain the data. The negative binomial distribution always fits, suggesting that substitution rates are approximately gamma distributed among sites. Simulations presented here provide support for the use of a biased, yet commonly employed, method of examining rate variation. The use of parsimony in the method to infer the number of changes at each site introduces systematic errors into the analysis. These errors preclude an unbiased quantification of variation in substitution rate but make the method conservative overall. The method can be used to distinguish sites with highly elevated rates, and 29 such sites are identified in hypervariable region 1. Variation does not appear to be clustered within this region. Simulations show that biases in rates of substitution among nucleotides and non-uniform base composition can mimic the effects of variation in rate among sites. However, these factors contribute little to the levels of rate variation observed in hypervariable region 1.
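The link this abstract draws between a negative binomial fit and gamma-distributed rates can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's method: it draws true per-site counts directly (the paper infers them by parsimony), and the function names, site count, mean rate, and shape value are all hypothetical. When each site's rate is drawn from a gamma distribution and the number of changes at the site is Poisson given that rate, the counts follow a negative binomial distribution, whose variance exceeds its mean.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_changes(n_sites, mean_rate, shape, rng):
    """Simulate the number of changes at each site.

    shape=None gives every site the same rate (plain Poisson counts);
    otherwise each site's rate is gamma distributed with the given shape
    and with mean equal to mean_rate.
    """
    counts = []
    for _ in range(n_sites):
        if shape is None:
            rate = mean_rate
        else:
            # gammavariate(alpha, beta) has mean alpha * beta.
            rate = rng.gammavariate(shape, mean_rate / shape)
        counts.append(poisson_sample(rate, rng))
    return counts


def mean_and_variance(counts):
    """Return the sample mean and the unbiased sample variance."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, variance
```

With a single rate, the variance of the counts should be close to their mean. With gamma-distributed rates of mean m and shape a, the expected variance is m + m^2/a, so a small shape value (strong rate variation) gives counts that are far more dispersed than a Poisson model allows.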

