Date of Award:

5-2015

Document Type:

Dissertation

Degree Name:

Doctor of Philosophy (PhD)

Department:

Computer Science

Committee Chair(s):

Daniel Watson

Committee:

David Tarboton

Vladimir Kulyukin

Minghui Jiang

Stephen Clyde

Abstract

Large datasets require high processing power to compute, high-speed network connections to transmit, or high storage capacity to archive. With the advent of the internet, many in the science community and the public at large need to manage, store, transmit, and process large datasets efficiently to create value for all concerned. For example, environmental researchers analyze large map data to extract hydrologic information from topography. However, processing these data is hard, and sometimes impossible, in minimal-resource environments such as desktop systems.

This dissertation demonstrates novel approaches and algorithms that address several issues associated with large datasets. We present a novel virtual memory system that performs raster-based calculations on map data too big to fit in memory. We then introduce parallel algorithms that reduce large data to smaller, more easily managed forms, and that efficiently find patterns within this smaller representation on large computer systems. Accessing large remote computer systems over the internet also requires software services. We present a software service named HydroGate that abstracts away many of the details and complexities involved in using large remote computer systems, including authentication, authorization, and data and job management.
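To make the out-of-core idea concrete, the sketch below processes a disk-backed grid one tile at a time, so that only a small window of the raster resides in memory at once. It is a minimal illustration of the general technique, not the dissertation's actual system; the file names, grid dimensions, and the placeholder per-tile operation are invented for illustration.

    # Minimal sketch of tile-based, out-of-core raster processing.
    # File names, sizes, and the per-tile operation are illustrative only.
    import numpy as np

    ROWS, COLS, TILE = 8192, 8192, 1024  # hypothetical grid and tile sizes

    # Disk-backed arrays standing in for a large elevation grid and its output.
    grid = np.memmap("dem.dat", dtype=np.float32, mode="w+", shape=(ROWS, COLS))
    out = np.memmap("out.dat", dtype=np.float32, mode="w+", shape=(ROWS, COLS))

    # Visit the grid one tile at a time; only about TILE x TILE cells of each
    # array need to be resident in RAM at any moment.
    for r in range(0, ROWS, TILE):
        for c in range(0, COLS, TILE):
            tile = grid[r:r + TILE, c:c + TILE]
            # Placeholder computation; a real system would evaluate a raster
            # operation (e.g., slope from elevation) over the tile here.
            out[r:r + TILE, c:c + TILE] = tile * 2.0

    out.flush()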
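A gateway service of this kind can be pictured as a small REST client workflow: authenticate once, stage input data, then submit and monitor a job, with the service hiding the cluster's own credentials and schedulers. The sketch below is hypothetical; the base URL, endpoint paths, and JSON field names are assumptions for illustration and do not document HydroGate's actual API.

    # Hypothetical client workflow for a gateway service in the spirit of
    # HydroGate. Endpoints and field names are invented for illustration.
    import requests

    BASE = "https://gateway.example.org"  # placeholder service address

    # 1. Authenticate once; the gateway hides the HPC system's credentials.
    token = requests.post(f"{BASE}/token",
                          data={"username": "alice", "password": "secret"}
                          ).json()["token"]
    auth = {"Authorization": f"Bearer {token}"}

    # 2. Upload the input dataset; the gateway stages it onto the cluster.
    with open("watershed.zip", "rb") as f:
        upload = requests.post(f"{BASE}/upload", headers=auth,
                               files={"file": f}).json()

    # 3. Submit a named program against the uploaded package, then poll.
    job = requests.post(f"{BASE}/jobs", headers=auth,
                        json={"program": "pitremove", "input": upload["id"]}
                        ).json()
    state = requests.get(f"{BASE}/jobs/{job['id']}", headers=auth).json()
    print(state["status"])  # e.g., "queued", "running", "done"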

Checksum:

c437a64f39f16090e8873a864c2f9371
