Data from: A simulation–based evaluation of methods for inferring linear barriers to gene flow

  • Erin L. Landguth (Contributor)
  • Christopher Blair (Contributor)
  • Samuel A. Cushman (Contributor)
  • Matthew Balazik (Contributor)
  • Annika T H Keeley (Contributor)
  • Faith M. Walker (Contributor)
  • Melanie Murphy (Contributor)
  • Niko Balkenhol (Contributor)
  • Dana E. Weigel (Contributor)
  • Lisette Waits (Contributor)



Different analytical techniques used on the same data set may lead to different conclusions about the existence and strength of genetic structure. Therefore, reliable interpretation of the results from different methods depends on the efficacy and reliability of different statistical methods. In this paper we evaluate the performance of multiple analytical methods to detect the presence of a linear barrier dividing populations. We were specifically interested in determining if simulation conditions, such as dispersal ability and genetic equilibrium, affect the power of different analytical methods for detecting barriers. We evaluate two boundary–detection methods (Monmonier’s algorithm and WOMBLING), two spatial Bayesian clustering methods (TESS and GENELAND), an aspatial clustering approach (STRUCTURE), and two recently–developed, non–Bayesian clustering methods (PSMIX and DAPC). We found that clustering methods had higher success rates than boundary detection methods and also detected the barrier more quickly. All methods detected the barrier more quickly when dispersal was long distance in comparison to short distance dispersal scenarios. Bayesian clustering methods performed best overall, both in terms of highest success rates and lowest time to barrier detection, with GENELAND showing the highest power. None of the methods suggested a continuous linear barrier when the data were generated under an isolation by distance (IBD) model. However, the clustering methods had higher potential for leading to incorrect barrier inferences under IBD unless strict criteria for successful barrier detection were implemented. Based on our findings and those of previous simulation studies, we discuss the utility of different methods for detecting linear barriers to gene flow.

Data_MER120022R1

To evaluate the utility of different methods for genetic barrier detection, we used data sets from Landguth et al. (2010), who conducted spatially–explicit, individual–based genetic divergence simulations in the program CDPOP (Landguth & Cushman 2010). Landguth et al. (2010) simulated genotypes for 1 000 individuals of an animal species within a study landscape of 70 x 100 km. Simulations were initiated with 30 loci and 30 alleles maximum per locus (resulting in 900 total possible alleles and mean Ho = 0.967), a k–allele mutation rate of 0.0005 in a two–sex mating structure with sex assigned randomly with equal probability (see Landguth et al. 2010 for details). Landscape resistances to movement were homogeneous and controlled by isolation-by-distance on either side of a complete (i.e., impermeable) linear barrier that bisected the landscape into a western and eastern half (500 individuals on either side). We used data from 10 independent Monte Carlo simulations and under two dispersal distances. In the first scenario (10k), the maximum simulated dispersal distance was 10 kilometres, while in the second scenario (60k) the dispersal distance was set to 60 kilometres. These scenarios use the two most extreme dispersal distances simulated by Landguth et al. (2010) and correspond to species exhibiting short– versus long–range dispersal relative to the spatial extent of the study area.

Because we were interested in testing the performance of methods for inferring recent barriers to gene flow, we applied the methods only to the first 20 generations after barrier imposition. There are then 20 generations of raw genotypes for each individual stored in files called grid{generation}.csv. In addition, genetic distance matrices are calculated for each generation using proportion of shared alleles and stored in files called Gdmatrix{generation}.csv.
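As an illustration of the kind of calculation behind the Gdmatrix files, the sketch below computes a pairwise genetic distance from the proportion of shared alleles between diploid individuals, taken here as one minus the mean fraction of alleles shared per locus. It is a minimal sketch under stated assumptions and may differ in detail from the routine used in CDPOP. The function names and the in-memory genotype layout (one pair of allele labels per locus) are hypothetical. The column layout of grid{generation}.csv is not described on this page, so the example does not read those files.

```python
from collections import Counter


def proportion_shared_alleles(ind_a, ind_b):
    """Mean fraction of alleles shared between two diploid individuals.

    Each individual is a list of (allele1, allele2) tuples, one per locus.
    This layout is an assumption made for illustration only.
    """
    if len(ind_a) != len(ind_b):
        raise ValueError("individuals must be genotyped at the same loci")
    shared = 0
    for locus_a, locus_b in zip(ind_a, ind_b):
        # Multiset intersection: each allele counts min(copies in a, copies in b).
        shared += sum((Counter(locus_a) & Counter(locus_b)).values())
    # Two allele copies per locus in a diploid genotype.
    return shared / (2 * len(ind_a))


def dps_matrix(individuals):
    """Symmetric matrix of distances 1 - (proportion of shared alleles)."""
    n = len(individuals)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - proportion_shared_alleles(individuals[i], individuals[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

With the full data set (1 000 individuals), such a matrix would have 1 000 rows and columns per generation. The nested loop above favours readability over speed.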
Date made available: Jan 1 2012
