April 2015. Digitization of data from Wolf et al. 1991. Amer. J. Bot. 78: 515-526.

Files provided are isozyme allele frequency tables (60 populations, 23 loci) [1991_Wolf_60_pops_arrays.csv], rather than the original genotype numbers. In processing these tables, it appears that there may be some errors in the original genotype numbers, but only where N>32 for the population array. I still have the original data (drawings of banding patterns), so I can easily correct those few arrays. However, the tables posted here are consistent with the results reported in Wolf et al. 1991. Also provided are locality data with lat long [1991_Wolf_localities].

Please contact me at paul.wolf11@gmail.com or paul.wolf@usu.edu if you need any assistance or data in a different format. Tanner Robeson and Carol Rowe kindly assisted with digitization.

Steps of digitization:

Scan in as Paul_Wolf_Higher_Quality.pdf.

OCR by Tanner Robeson, using the ABBYY FineReader OCR:
a. Open the OCR software.
b. Select the PDF to Excel option and choose the file that you want read.
c. The OCR will try to automatically scan and convert every page of the document. You can either let this run (it doesn't work very well automatically with these data tables) or stop it and do each page manually (recommended).
d. When manually reading each page with the OCR, use "draw table area" on the right-hand side of the window.
e. After drawing the table area, hit "analyze page" at the bottom of the toolbar, just above the image of your scan. Confirm that the OCR has drawn the rows and columns of the table correctly.
f. At this point it may become apparent that there are distortions in the scan that make it hard for the OCR to draw the lines of the table. If this is the case, you can use the "edit image" function on the toolbar.
g. Using the "deskew" function on the right-hand side of the image editing page will let you straighten the image so that it can be read properly.
When finished editing, select "exit image editor" just below the toolbar.
h. Now select the "read page" function to finish the process. On the right you will have the data table in text form. You can copy and paste this directly into an Excel spreadsheet.

Edit in Google Drive.
Download as PaulWolfData_sheet1_corrected.xlsx.
Get all sheets into one.
Save as Ipo_60_pops_from_drive.csv.
Run ipo_data_processing.py. Check the input file name. Check the output population names and locus names! Note pops 107 and 100, locus 'I(N)' near MDH.
Now run make_locus_columns.py (check the name of the input file).
Take this file, change its name to Ipo_60_pops_arrays_correcting.csv, and run it through Carol's script (Paul_Ipo_1990_data.py) to get a table of sample sizes (N) for each locus and each population (pop_sample_sizes_each_locus.csv). The script also outputs the total allele frequency for all combinations (Allele_freqs_sum_to_1_check.csv).
Make corrections manually.
Run through Carol's scripts again.
Change the filename to Ipo_60_pops_arrays_complete.csv.
Post on Digital Commons.
Check lat long in GeoLocate; post to paulwolf/Documents/ZZ Reprint pdf files/1990 Ipo dissertation data/1991_Wolf_localities.csv (corrected 6:12 pm on 13 April 2015) [DONE]
Send to Becky Thoms at USU library for posting on Digital Commons.
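
For readers who want to repeat the "allele frequencies sum to 1" check on their own copy of the tables, a minimal sketch follows. It is not Carol's script (Paul_Ipo_1990_data.py), which is not reproduced here. The long-format layout (columns pop, locus, allele, freq), the function name check_allele_freq_sums, the tolerance, and the example values are all assumptions made for illustration; the posted CSV may be arranged differently.

```python
import csv
import io
from collections import defaultdict


def check_allele_freq_sums(rows, tol=0.01):
    """Return {(pop, locus): total} for each population/locus combination
    whose allele frequencies do not sum to 1 within `tol`.

    `rows` is any iterable of dicts with keys 'pop', 'locus', 'freq'
    (an assumed layout, not necessarily that of the posted files).
    """
    totals = defaultdict(float)
    for row in rows:
        totals[(row["pop"], row["locus"])] += float(row["freq"])
    return {
        key: round(total, 3)
        for key, total in totals.items()
        if abs(total - 1.0) > tol
    }


# Hypothetical example data, NOT values from Wolf et al. 1991.
example = """pop,locus,allele,freq
100,MDH,a,0.75
100,MDH,b,0.25
107,MDH,a,0.60
107,MDH,b,0.30
"""

bad = check_allele_freq_sums(csv.DictReader(io.StringIO(example)))
print(bad)  # combinations that need a manual correction
```

Any combination this reports would be a candidate for the "Make corrections manually" step; a small tolerance is used because published frequencies are rounded.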