Watershed Sciences Faculty Publications

Selecting Discriminant Function Models for Predicting the Expected Richness of Aquatic Macroinvertebrates

J. Van Sickle
D. D. Huff
Charles P. Hawkins, Utah State University

Document Type

Article

Journal/Book Title/Conference

Freshwater Biology

Volume

51

Publication Date

1-1-2006

Keywords

richness, aquatic macroinvertebrates, discriminant function models

First Page

359

Last Page

372

Abstract

1. The predictive modelling approach to bioassessment estimates the macroinvertebrate assemblage expected at a stream site if it were in a minimally disturbed reference condition. The difference between expected and observed assemblages then measures the departure of the site from reference condition.

2. Most predictive models employ site classification, followed by discriminant function (DF) modelling, to predict the expected assemblage from a suite of environmental variables. Stepwise DF analysis is normally used to choose a single subset of DF predictor variables with a high accuracy for classifying sites. An alternative is to screen all possible combinations of predictor variables, in order to identify several 'best' subsets that yield good overall performance of the predictive model.

3. We applied best-subsets DF analysis to assemblage and environmental data from 199 reference sites in Oregon, U.S.A. Two sets of 66 best DF models containing between one and 14 predictor variables (that is, having model orders from one to 14) were developed, for five-group and 11-group site classifications.

4. Resubstitution classification accuracy of the DF models increased consistently with model order, but cross-validated classification accuracy did not improve beyond seventh- or eighth-order models, suggesting that the larger models were overfitted.

5. Overall predictive model performance at model training sites, measured by the root-mean-squared error of the observed/expected species richness ratio, also improved steadily with DF model order. But high-order DF models usually performed poorly at an independent set of validation sites, another sign of model overfitting.

6. Models selected by stepwise DF analysis showed evidence of overfitting and were outperformed by several of the best-subsets models.

7. The group separation strength of a DF model, as measured by Wilks' Λ, was more strongly correlated with overall predictive model performance at training sites than was DF classification accuracy.

8. Our results suggest improved strategies for developing reliable, parsimonious predictive models. We emphasise the value of independent validation data for obtaining a
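
As an illustration only, and not the authors' code, the two ideas the abstract leans on can be sketched in a few lines of Python: screening every predictor subset of each model order, and scoring a model by the root-mean-squared error of the observed/expected (O/E) richness ratio. The names `oe_rmse`, `best_subsets`, `score` and `keep` are hypothetical. The sketch also assumes that the error is measured as the deviation of O/E from 1.0, the value expected at a site in reference condition; the abstract does not state this detail.

```python
from itertools import combinations
from math import sqrt


def oe_rmse(observed, expected):
    """RMSE of O/E richness ratios about 1.0 (assumed target at reference sites)."""
    ratios = [o / e for o, e in zip(observed, expected)]
    return sqrt(sum((r - 1.0) ** 2 for r in ratios) / len(ratios))


def best_subsets(predictors, score, keep=5):
    """For each model order, keep the `keep` lowest-scoring predictor subsets.

    `score` is a caller-supplied (hypothetical) function that maps a tuple of
    predictor names to an error measure, where lower is better.
    """
    best = {}
    for order in range(1, len(predictors) + 1):
        # Enumerate every subset of this size and rank by the supplied score.
        ranked = sorted(combinations(predictors, order), key=score)
        best[order] = ranked[:keep]
    return best
```

In practice `score` would fit a DF model on the chosen predictors and return an error measure such as `oe_rmse` computed at validation sites rather than at training sites, which is the comparison the abstract uses to detect overfitting.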

Recommended Citation

Van Sickle, J., D.D. Huff, and C.P. Hawkins. 2006. Selecting discriminant function models for predicting the expected richness of aquatic macroinvertebrates. Freshwater Biology 51: 359-372.

DOI

https://doi.org/10.1111/j.1365-2427.2005.01487.x
