Scotland C Leman
Associate Professor, Virginia Polytechnic Institute and State University
On Evolutionary Theory, Inference, and Simulation: A Genealogical Perspective
This thesis discusses evolutionary inference from two perspectives: the modeling of genetic data and the algorithms used to perform statistical inference. Genetic data (DNA) takes a nontraditional form in that a single observation encompasses at least hundreds of base pairs and is nonnumeric in nature. Beyond this, DNA from individuals who share a common ancestry is similar in its genetic makeup, so the assumption of independent and identically distributed samples does not hold. In turn, a complex network of associations must be employed when modeling the data. The complexities involved in the modeling procedure directly relate to the complexities involved in reconstructing likelihood functions or posterior distributions. Many computational methods used for statistical inference involve drawing samples from proposal distributions. However, it is difficult to construct proposal distributions that closely match the true target distribution, which in turn hampers the efficiency of the overall sampling scheme. We describe a general approach to modeling the evolutionary past. Within this framework, we discuss specific models that address particular phenomena relating to genomic data (speciation, introgression, and paracentric inversions). The latter part of this thesis addresses two simulation methods used for statistical inference. The first pertains to direct likelihood construction under an Importance Sampling (IS) framework, and the second addresses a Markov Chain Monte Carlo (MCMC) procedure for posterior sampling.