Date of Award:
12-2008
Document Type:
Thesis
Degree Name:
Master of Science (MS)
Department:
Computer Science
Committee Chair(s)
Robert F. Erbacher
Committee
Robert F. Erbacher
Stephen J. Allan
Chad Mano
Abstract
This research presents a new technique called SÁDI (statistical analysis data identification) for identifying the type of data on a digital device, and its storage format based on that data type, from the values of the bytes representing the data being examined. The research also incorporates the automation that specialized data identification tools need in order to be useful in real-world applications. Because SÁDI works from the byte values of the data stored on a digital storage device, its accuracy does not rely solely on potentially misleading metadata but rather on the values of the data itself. SÁDI thus makes it possible to identify what digitally stored data actually represents. Determining whether data is relevant often depends on first identifying the type of data being examined. File type identification is typically based on file extensions or magic keys. These techniques fail in many common forensic analysis scenarios, such as those involving file fragments or embedded data, as in the case of Microsoft Word files. They can also be easily circumvented, and individuals with nefarious purposes often do so.
The results from the development of this technique will greatly enhance the capabilities of legal forensic units and expand the knowledge base in the fields of computer forensics and digital security. The results presented here are promising, but they do not yet represent the complete capability of this new technique. They compare favorably with other techniques from recent research and with the capabilities and performance of the professional tools currently used in real-world forensic situations.
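To make the idea of working from byte values rather than metadata more concrete, the following is a minimal illustrative sketch and not the SÁDI implementation. It assumes a fixed-size sliding window and three generic statistics (mean, standard deviation, and Shannon entropy); the function name `window_stats`, the window size, and the choice of statistics are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def window_stats(data: bytes, window: int = 256, step: int = 256):
    """Yield (offset, mean, stddev, entropy) for each full window of bytes.

    Hypothetical helper for illustration only; the statistics and the
    window size are assumptions, not the ones defined by SÁDI.
    """
    for off in range(0, len(data) - window + 1, step):
        chunk = data[off:off + window]
        # Iterating over a bytes object yields integers in 0..255.
        mean = sum(chunk) / window
        variance = sum((b - mean) ** 2 for b in chunk) / window
        counts = Counter(chunk)
        # Shannon entropy in bits per byte (0 = constant, 8 = uniform).
        entropy = -sum(
            (c / window) * math.log2(c / window) for c in counts.values()
        )
        yield off, mean, math.sqrt(variance), entropy


if __name__ == "__main__":
    # A constant run followed by every byte value exactly once.
    sample = b"A" * 256 + bytes(range(256))
    for off, mean, std, ent in window_stats(sample):
        print(f"offset={off:4d} mean={mean:6.1f} std={std:5.1f} entropy={ent:.2f}")
```

Because the statistics are computed per window, a sketch like this can be run over a fragment or an embedded region without any header or file name, which is the situation where extension and magic-key checks have nothing to inspect. In general, plain ASCII text tends to keep its mean within the printable range, while compressed or encrypted data tends toward near-uniform byte distributions and high entropy.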
Checksum
2eae68a1af38f37b334966b6c6dc4ed3
Recommended Citation
Moody, Sarah Jean, "Automated Data Type Identification and Localization Using Statistical Analysis Data Identification" (2008). All Graduate Theses and Dissertations, Spring 1920 to Summer 2023. 9.
https://digitalcommons.usu.edu/etd/9
Copyright for this work is retained by the student.