Kernel Bandwidth Selection for a First Order Nonparametric Streamflow Simulation Model
A new approach for streamflow simulation using nonparametric methods was described in a recent publication (Sharma et al. 1997). Use of nonparametric methods has the advantage that they avoid the issue of selecting a probability distribution and can represent nonlinear features, such as asymmetry and bimodality that hitherto were difficult to represent, in the probability structure of hydrologic variables such as streamflow and precipitation. The nonparametric method used was kernel density estimation, which requires the selection of bandwidth (smoothing) parameters. This study documents some of the tests that were conduced to evaluate the performance of bandwidth estimation methods for kernel density estimation. Issues related to selection of optimal smoothing parameters for kernel density estimation with small samples (200 or fewer data points) are examined. Both reference to a Gaussian density and data based specifications are applied to estimate bandwidths for samples from bivariate normal mixture densities. The three data based methods studied are Maximum Likelihood Cross Validation (MLCV), Least Square Cross Validation (LSCV) and Biased Cross Validation (BCV2). Modifications for estimating optimal local bandwidths using MLCV and LSCV are also examined. We found that the use of local bandwidths does not necessarily improve the density estimate with small samples. Of the global bandwidth estimators compared, we found that MLCV and LSCV are better because they show lower variability and higher accuracy while Biased Cross Validation suffers from multiple optimal bandwidths for samples from strongly bimodal densities. These results, of particular interest in stochastic hydrology where small samples are common, may have importance in other applications of nonparametric density estimation methods with similar sample sizes and distribution shapes.