Hydrologic Terrain Processing Using Parallel Computing
Topography in the form of Digital Elevation Models (DEMs) is widely used to derive information for the modeling of hydrologic processes. Hydrologic terrain analysis augments the information content of digital elevation data by removing spurious pits, deriving a structured flow field, and calculating surfaces of hydrologic information derived from the flow field. The increasing availability of large terrain datasets with very small ground sample distance (GSD) poses a challenge for existing algorithms that extract this hydrologic information. This paper describes a parallel pit removal algorithm, implemented using the Message Passing Interface (MPI), that allows larger datasets to be processed more efficiently. This key functionality is used within the Terrain Analysis Using Digital Elevation Models (TauDEM) package to remove spurious elevation depressions that are artifacts of the raster representation of the terrain. The parallel algorithm decomposes the domain into stripes or tiles, each processed by a separate processor. This decomposition also reduces the memory requirement of each processor, so that larger grids can be processed. The parallel pit removal algorithm is adapted from the method of Planchon and Darboux, which initializes a water surface at a large elevation and then iteratively scans the grid, lowering each grid cell to the maximum of its original elevation and the water level of its lowest neighbor. The MPI implementation reconciles elevations along process domain edges after each scan. This parallel algorithm replaces a serial implementation that used a recursive search to identify the pour point outlet of each pit so that the elevation of grid cells within the pit could be raised to that level.
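The iterative scan at the heart of the method can be illustrated with a simplified serial sketch, assuming a single tile, an epsilon of zero, and 8-connected neighbors; the function name and grid layout are illustrative and do not reflect TauDEM's actual API or its MPI edge reconciliation:

```python
import numpy as np

def fill_pits(z, max_iters=1000):
    """Simplified Planchon-Darboux pit filling (serial, epsilon = 0).

    Starts the water surface w at a large elevation (infinity) everywhere
    except the boundary, then repeatedly scans the grid, lowering each cell
    to the maximum of its original elevation and its lowest neighbor's
    water level, until no cell changes.
    """
    nrows, ncols = z.shape
    w = np.full(z.shape, np.inf)
    # Boundary cells drain off the edge of the grid, so they keep
    # their original elevation.
    w[0, :] = z[0, :]; w[-1, :] = z[-1, :]
    w[:, 0] = z[:, 0]; w[:, -1] = z[:, -1]
    for _ in range(max_iters):
        changed = False
        for i in range(1, nrows - 1):
            for j in range(1, ncols - 1):
                if w[i, j] > z[i, j]:
                    # Lowest water level among the 8 neighbors.
                    nmin = min(w[i + di, j + dj]
                               for di in (-1, 0, 1) for dj in (-1, 0, 1)
                               if (di, dj) != (0, 0))
                    new = max(z[i, j], nmin)
                    if new < w[i, j]:
                        w[i, j] = new
                        changed = True
        if not changed:
            break
    return w

# Hypothetical 5x5 test grid: a pit (elevation 1) inside a ring of 4s,
# with a pour point of elevation 2 on the right edge. The pit fills to
# the level of the ring (4), which drains through the pour point.
z = np.array([[5, 5, 5, 5, 5],
              [5, 4, 4, 4, 5],
              [5, 4, 1, 4, 2],
              [5, 4, 4, 4, 5],
              [5, 5, 5, 5, 5]], dtype=float)
w = fill_pits(z)
# -> the center cell is raised from 1 to 4; no cell is lowered below z
```

In the MPI version described above, each processor would run such scans over its own stripe or tile and then exchange the water levels along shared edges with neighboring processors before the next scan, repeating until no process reports a change.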
Initial tests indicate that the MPI overhead within the algorithm results in slower run times for small problems but significantly improved processing speeds for large grids. We have also been able to process grids much larger than were possible using the memory-based single-processor implementation. Specifically, for a modest grid of 28 x 10^6 grid cells, the serial (base) fill algorithm required 71 seconds, the parallel implementation using 5 processors required 51 seconds, and the parallel implementation using 16 processors required only 20 seconds. For a much larger grid of 404 x 10^6 grid cells, the base algorithm required 1289 seconds, while the parallel algorithm required 954 seconds using 8 processors and 474 seconds using 16 processors.