Date of Award:


Document Type:


Degree Name:

Master of Science (MS)


Computer Science


Stephen W. Clyde


Testing an integrated information system that relies on data from multiple sources can be a challenge, particularly when the data is confidential. This thesis describes a novel test data extraction approach, called semantic-based test data extraction for integrated systems (iSTDE) that solves many of the problems associated with creating realistic test data for integrated information systems containing confidential data. iSTDE reads a consistent cross-section of data from the production databases, manipulates that data to obscure individual identities while still preserving overall semantic data characteristics that are critical to thorough system testing, and then moves that test data to an external test environment.

This thesis also presents a theoretical study that compares test-data extraction with a competing technique, named test-data generation. Specifically, this thesis a) describes a comparison method that includes a comprehensive list of characteristics essential for testing the database applications organized into seven different areas, b) presents an analysis of the relative strengths and weaknesses of the different test-data creation techniques, and c) reports a number of specific conclusions that will help testers make appropriate choices.




This work made publicly available electronically on August 8, 2011.