Date of Award:


Document Type:


Degree Name:

Doctor of Philosophy (PhD)


Computer Science


Daniel Watson


Large datasets require efficient processing, storage and management to efficiently extract useful information for innovation and decision-making. This dissertation demonstrates novel approaches and algorithms using virtual memory approach, parallel computing and cyberinfrastructure. First, we introduce a tailored user-level virtual memory system for parallel algorithms that can process large raster data files in a desktop computer environment with limited memory. The application area for this portion of the study is to develop parallel terrain analysis algorithms that use multi-threading to take advantage of common multi-core processors for greater efficiency. Second, we present two novel parallel WaveCluster algorithms that perform cluster analysis by taking advantage of discrete wavelet transform to reduce large data to coarser representations so data is smaller and more easily managed than the original data in size and complexity. Finally, this dissertation demonstrates an HPC gateway service that abstracts away many details and complexities involved in the use of HPC systems including authentication, authorization, and data and job management.