Description

Continuing advances in nucleotide sequencing technology are inspiring a suite of genomic approaches in studies of natural populations. Researchers are faced with data management and analytical scales that are increasing by orders of magnitude. With such dramatic advances comes a need to understand biases and error rates, which can be propagated and magnified in large-scale data acquisition and processing. Here we assess genomic sampling biases and the effects of various population-level data filtering strategies in a genotyping-by-sequencing (GBS) protocol. We focus on data from two species of Populus, because this genus has a relatively small genome and is emerging as a target for population genomic studies. We estimate the proportions and patterns of genomic sampling by examining the Populus trichocarpa genome (Nisqually-1), and demonstrate a pronounced bias towards coding regions when using the methylation-sensitive ApeKI restriction enzyme in this species. Using population-level data from a closely related species (P. tremuloides), we also investigate various approaches for filtering GBS data to retain high-depth, informative SNPs that can be used for population genetic analyses. We find that a data filter including the designation of ambiguous alleles resulted in metrics of population structure and Hardy-Weinberg equilibrium that were most consistent with previous studies of the same populations based on other genetic markers. Analyses of the filtered data (27,910 SNPs) also resulted in patterns of heterozygosity and population structure similar to those of a previous study using microsatellites. Our application demonstrates that technically and analytically simple approaches can readily be developed for population genomics of natural populations.

Author ORCID Identifier

Paul G. Wolf https://orcid.org/0000-0002-4317-6976

Aaron M. Duffy https://orcid.org/0000-0003-0530-6191

OCLC

985526082

Document Type

Dataset

DCMI Type

Dataset

File Format

.txt, .pdf

Viewing Instructions

***A zipped version of this dataset is available. Contact RDMS (researchdata@usu.edu) for more details.***

Publication Date

4-18-2014

Publisher

Utah State University

Embargo Period

2010

Referenced by

Schilling, M.P., Wolf, P.G., Duffy, A.M., Rai, H.S., Rowe, C.A., Richardson, B.A., Mock, K.E. Genotyping-by-sequencing for Populus population genomics: An assessment of genome sampling patterns and filtering approaches (2014) PLoS ONE, 9 (4), art. no. e95292. Available at 10.1371/journal.pone.0095292

Language

eng

Disciplines

Genomics

License

This work is licensed under a Creative Commons Attribution 4.0 License.

Checksum

http://di.lib.usu.edu/DATA_Wolf_20140418_ALL.zip

Additional Files

Schilling_et_al_Article_2014.pdf (406 kB)
MD5: f59b1969876cee8dfb29cf8c0afbe2ed

H_Rai_Plate_1.txt (1 kB)
MD5: b6e03eda414296e907c469c0addad4f4

H_Rai_Plate_2.txt (1 kB)
MD5: 94b497424a45c8534ff7c7c6808f9886

2051-GSR-1_sequence.txt.bz2 (11008672 kB)
MD5: bd70772a7995747c50ed52d0ea0385ec

2051-GSR-2_sequence.txt.bz2 (7707994 kB)
MD5: 2343c761b91b41a1d26149c1d200e8aa
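
The digests listed above can be used to check that downloaded files arrived intact. The following is a minimal Python sketch, not a tool supplied with the dataset. It assumes that all five listed digests are MD5 values and that the files are saved locally under the names shown on this page. The helper names `md5_of_file` and `verify_downloads` are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path

# Digests copied from the file list on this page. Assumption: every value
# is an MD5 digest and local file names match the listed names exactly.
EXPECTED_MD5 = {
    "Schilling_et_al_Article_2014.pdf": "f59b1969876cee8dfb29cf8c0afbe2ed",
    "H_Rai_Plate_1.txt": "b6e03eda414296e907c469c0addad4f4",
    "H_Rai_Plate_2.txt": "94b497424a45c8534ff7c7c6808f9886",
    "2051-GSR-1_sequence.txt.bz2": "bd70772a7995747c50ed52d0ea0385ec",
    "2051-GSR-2_sequence.txt.bz2": "2343c761b91b41a1d26149c1d200e8aa",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks so that the
    multi-gigabyte sequence archives are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Compare each expected file in `directory` against its listed digest.

    Returns a dict mapping file name to "ok", "mismatch", or "missing".
    """
    results = {}
    for name, listed in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == listed.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_downloads("path/to/download/folder")` returns one status per listed file. A "mismatch" most likely indicates an incomplete or corrupted download, in which case the file should be fetched again.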

Included in

Genomics Commons
