Description

Ipyrad pipeline parameters files, raw sequence data, barcodes and supporting scripts for clonal and cytotype sample assignment of Populus tremuloides.

OCLC

1143847094

Document Type

Dataset

DCMI Type

Dataset

File Format

.fastq, .py, .txt, .R

Viewing Instructions

This data set contains the following:

1. Raw compressed amplicon sequence data (fastq.gz) from 3 individual Illumina HiSeq2500 lanes ("BP01", "BP02" and "BP03"). Each lane is presented here as 7 individual sequencing files, "aa" to "ag", for ease of data transfer. Once downloaded, the command line instructions in the README can be used to compile the required fastq.gz files for downstream analysis (p1815-BP01_S40_L006_R1_001.fastq.gz, p1815-BP02_S41_L007_R1_001.fastq.gz and p1815-BP03_S42_L008_R1_001.fastq.gz).
2. Barcode files corresponding to each raw amplicon sequence data file (barcodes01.txt, barcodes02.txt and barcodes03.txt).
3. Ipyrad (http://ipyrad.readthedocs.io) parameters files used to run steps 2 through 7 of Ipyrad. The separate sequence files were demultiplexed in step 1 of Ipyrad and then merged prior to running the remaining steps. One parameters file is used for clonal assignment (params-BP123_XXXXXXXX_clone.txt) and the other for cytotype analysis (params-BP123_XXXXXXXX_ploidy.txt).
4. Python script to remove a list of individuals from .vcf files for downstream analysis (removeIndVcf.py).
5. R script to transform Jaccard pairwise similarity indices into clonal group assignments (clone_assignment.R).

Additionally, the following scripts were implemented as part of this pipeline:

1. vcf2Jaccard (https://github.com/carol-rowe666/vcf2Jaccard, Carol Rowe)
2. vcf2hetAlleleDepth (https://github.com/carol-rowe666/vcf2hetAlleleDepth, Carol Rowe)
3. gbs2ploidy (Gompert, Z., Mock, K. Detection of individual ploidy levels with genotyping-by-sequencing (GBS) analysis. Mol. Ecol. Resour. 17:1156-67.)
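The sample-removal step described above (removeIndVcf.py) can be pictured with a minimal sketch. This is not the deposited script, whose contents are not reproduced here; it only illustrates the general idea of dropping named sample columns from a VCF. The function name `drop_samples` and the sample names `indA`, `indB` and `indC` are hypothetical.

```python
import io


def drop_samples(vcf_in, vcf_out, to_drop):
    """Copy a VCF text stream, omitting the genotype columns of named samples.

    Hypothetical helper for illustration; not the deposited removeIndVcf.py.
    """
    to_drop = set(to_drop)
    keep = None  # column indices to retain, set once the #CHROM header is seen
    for line in vcf_in:
        if line.startswith("##"):
            vcf_out.write(line)  # meta-information lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # The first 9 columns (CHROM..FORMAT) are fixed; samples follow.
            keep = [i for i, name in enumerate(fields)
                    if i < 9 or name not in to_drop]
        if keep is None:
            raise ValueError("data line found before the #CHROM header")
        vcf_out.write("\t".join(fields[i] for i in keep) + "\n")


# Tiny made-up VCF with three hypothetical samples.
demo = io.StringIO(
    "##fileformat=VCFv4.0\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tindA\tindB\tindC\n"
    "locus_1\t12\t.\tA\tG\t13\tPASS\tNS=3\tGT\t0/1\t0/0\t1/1\n"
)
out = io.StringIO()
drop_samples(demo, out, ["indB"])
print(out.getvalue())
```

Working line by line keeps memory use flat, which matters for files the size of msl_10_lessDroppedInds.vcf.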

Publication Date

12-13-2019

Funder

Arizona State University School of Life Sciences

Publisher

Utah State University

Methodology

Leaf samples (n=503) were collected from three watersheds in southwestern Colorado. Genomic DNA was subsequently extracted and a ddRAD library was prepared following Parchman et al. 2012. Individually barcoded samples were then pooled and sequenced in three separate libraries using an Illumina HiSeq2500. Additional details can be found in the forthcoming publication.

Scientific Names

Populus tremuloides

Language

eng

Comments

Exact instructions for concatenating the files are included in the README file.
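For readers without a Unix shell, the concatenation can be sketched in Python. The README remains the authoritative procedure; this sketch assumes that the seven parts of each lane are simply joined byte for byte in suffix order ("aa" to "ag"). The helper name `join_parts` and the `demo_R1` file names are hypothetical, and the demonstration runs on a tiny made-up read rather than the deposited data.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def join_parts(directory, stem, out_name):
    """Append every '<stem>_a?.fastq.gz' part, in suffix order, to out_name.

    Hypothetical helper; assumes plain byte-wise concatenation is intended.
    """
    directory = Path(directory)
    parts = sorted(directory.glob(f"{stem}_a?.fastq.gz"))
    if not parts:
        raise FileNotFoundError(f"no parts matching {stem}_a?.fastq.gz")
    with open(directory / out_name, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # raw byte copy, no decompression
    return [p.name for p in parts]


# Demonstration: split a small gzip payload in two, rejoin it, and read it.
with tempfile.TemporaryDirectory() as tmp:
    payload = gzip.compress(b"@read1\nACGT\n+\nIIII\n")
    half = len(payload) // 2
    (Path(tmp) / "demo_R1_aa.fastq.gz").write_bytes(payload[:half])
    (Path(tmp) / "demo_R1_ab.fastq.gz").write_bytes(payload[half:])
    join_parts(tmp, "demo_R1", "demo_R1_001.fastq.gz")
    with gzip.open(Path(tmp) / "demo_R1_001.fastq.gz", "rb") as fh:
        print(fh.read().decode())
```

For the deposited data, a call such as `join_parts(".", "BP01_S40_L006_R1", "p1815-BP01_S40_L006_R1_001.fastq.gz")` would follow the same pattern, again assuming the README prescribes plain concatenation.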

Disciplines

Botany | Genetics

License

This work is licensed under a Creative Commons Attribution 4.0 License.

Checksum

f3be4ecad8e8f99ae11ce0d27d88eb1f

Additional Files

README.txt (4 kB)
MD5: 60852e01432ed70e06a2e0fb22226754

barcodes01.txt (4 kB)
MD5: 740ebcfd2c95f14a1a7f9e89156884f5

barcodes02.txt (4 kB)
MD5: 88deeceb758249a37df01677d7d0eda6

barcodes03.txt (3 kB)
MD5: 0d8a192bdff6865c4f34b9f356633a26

msl_10_Parameters.txt (3 kB)
MD5: f8e392a2d34133ba5c2fb430164b09ab

msl_250_Parameters.txt (3 kB)
MD5: f8e392a2d34133ba5c2fb430164b09ab

BP01_S40_L006_R1_aa.fastq.gz (1860416 kB)
MD5: fd1d44ad9bee923f0e4d38b32a02ac04

BP01_S40_L006_R1_ab.fastq.gz (1836351 kB)
MD5: a4f5b31ecbb3fc56e596b74edda52151

BP01_S40_L006_R1_ac.fastq.gz (1859417 kB)
MD5: 445d80b27aef759c03de59d545e63f12

BP01_S40_L006_R1_ad.fastq.gz (1956087 kB)
MD5: f9907a05e0a93a866c54a2436483446c

BP01_S40_L006_R1_ae.fastq.gz (2009013 kB)
MD5: 5f3863b3a28991849f38dee7c29cefc2

BP01_S40_L006_R1_af.fastq.gz (1986107 kB)
MD5: cdb480b3596aca03081e96c5a0695c32

BP01_S40_L006_R1_ag.fastq.gz (1226263 kB)
MD5: 22ae67abb06541a32a4de24e615875aa

BP02_S41_L007_R1_aa.fastq.gz (1850345 kB)
MD5: 6dd86972b6b99ec6ad22c7b9a555943b

BP02_S41_L007_R1_ab.fastq.gz (1826515 kB)
MD5: 2785b17dde54d6c21c0eba4bd7c5fd67

BP02_S41_L007_R1_ac.fastq.gz (1873673 kB)
MD5: 423ab2d06994f52fc9b448b1b576a352

BP02_S41_L007_R1_ad.fastq.gz (1990780 kB)
MD5: 0eba6f03722ffea665f97d7a72e191ab

BP02_S41_L007_R1_ae.fastq.gz (1968924 kB)
MD5: 14ac4afbbe3bb2fe6a783c81c78aab76

BP02_S41_L007_R1_af.fastq.gz (2032458 kB)
MD5: 7ce44b1b49a8e1bfb34b6446002b13db

BP02_S41_L007_R1_ag.fastq.gz (669386 kB)
MD5: 6d6e6d160ac0249d7293cea2541ec0f4

BP03_S42_L008_R1_aa.fastq.gz (1840608 kB)
MD5: b6060aaaf4251f31250e74b64b8ecc4f

BP03_S42_L008_R1_ab.fastq.gz (1845704 kB)
MD5: b67c8ce1fc06b78391a46909254d7a11

BP03_S42_L008_R1_ac.fastq.gz (1901644 kB)
MD5: c687199b97ac18857318775324dac7b0

BP03_S42_L008_R1_ad.fastq.gz (2020478 kB)
MD5: 9b68d3ab555e68181a5f4cb8cff60736

BP03_S42_L008_R1_ae.fastq.gz (2000875 kB)
MD5: 43dbdcc7c636e820b5c0832778962864

BP03_S42_L008_R1_af.fastq.gz (2117778 kB)
MD5: 0b519e03916f8f55efb502fc7a1ec8be

clone_assignment.R (1 kB)
MD5: c1e503ce99749650f3fb1e18083f177f

BP03_S42_L008_R1_ag.fastq.gz (209154 kB)
MD5: a31f17c8fcd2ee0fa4cae22e6c0380e9

removeIndVcf.py (1 kB)
MD5: b058790fd20812b7265d5b9a731cde82

msl_10_lessDroppedInds.vcf (1454636 kB)
MD5: 4e622dfc6f77ab3d182670b65821bbe3

msl_250_lessDroppedInds.vcf (108375 kB)
MD5: cd6095ef2d640585304f4b9717f3622b

Results.xlsx (123 kB)
MD5: ff3bd7f8203137ff0c5b2ac171e73700

estploidy_pp98.csv (35 kB)
MD5: 37ab3edd0f585c444070de6c426f30c8

2019-02-13-01 Walton_pre.pdf (264 kB)
MD5: 8b06813217ca3c3dc044a852c719c8bb

2019-02-14-01 Walton_post.pdf (244 kB)
MD5: 5bf02e0c4f2bc05036d61a06fd82ded9
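
The MD5 values listed above can be used to confirm that each download is intact. A minimal sketch follows, assuming the files sit in the current working directory under the names shown in the list. The helper names `md5_of` and `verify` are hypothetical, and only two entries from the list are included for brevity.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected):
    """Map each file name to True (match), False (mismatch) or None (absent)."""
    report = {}
    for name, md5 in expected.items():
        if not os.path.exists(name):
            report[name] = None
        else:
            report[name] = md5_of(name) == md5.lower()
    return report


# Two entries copied from the file list; extend with the remaining files.
EXPECTED = {
    "README.txt": "60852e01432ed70e06a2e0fb22226754",
    "barcodes01.txt": "740ebcfd2c95f14a1a7f9e89156884f5",
}

for name, ok in verify(EXPECTED).items():
    status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
    print(f"{name}: {status}")
```

Checking the seven parts of each lane before joining them makes it easier to tell a corrupted download from a concatenation mistake.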
