Document Type

Article

Author ORCID Identifier

Saichand Thota https://orcid.org/0009-0007-8452-1039

Ayman Nassar https://orcid.org/0000-0003-0878-5861

Soukaina Filali Boubrahimi https://orcid.org/0000-0001-5693-6383

Shah Muhammad Hamdi https://orcid.org/0000-0002-9303-7835

Pouya Hosseinzadeh https://orcid.org/0000-0001-8045-2709

Journal/Book Title/Conference

Hydrology

Volume

11

Issue

5

Publisher

MDPI AG

Publication Date

5-1-2024

Journal Article Version

Version of Record

First Page

1

Last Page

30

Creative Commons License

Creative Commons Attribution 4.0 License

Abstract

Streamflow prediction is crucial for planning future developments and safety measures along river basins, especially in the face of changing climate patterns. In this study, we used monthly streamflow data from the United States Bureau of Reclamation and meteorological data (snow water equivalent, temperature, and precipitation) from weather monitoring stations of the Snow Telemetry Network within the Upper Colorado River Basin to forecast monthly streamflow at Lees Ferry, a location along the Colorado River in the basin. Four machine learning models (Random Forest Regression, Long Short-Term Memory, Gated Recurrent Unit, and Seasonal AutoRegressive Integrated Moving Average) were trained using 30 years of monthly data (1991–2020), split into 80% for training (1991–2014) and 20% for testing (2015–2020). Initially, only historical streamflow data were used for predictions; meteorological factors were then added to assess their impact on streamflow. Subsequently, sequence analysis was conducted to explore various input-output sequence window combinations. We then evaluated the influence of each factor on streamflow by testing all possible combinations to identify the optimal feature combination for prediction. Our results indicate that the Random Forest Regression model consistently outperformed the others, especially after integrating all meteorological factors with historical streamflow data. The best performance was achieved with a 24-month look-back period to predict 12 months of streamflow, yielding a Root Mean Square Error of 2.25 and an R-squared (R²) of 0.80. Finally, to assess model generalizability, we tested the best model at other locations in the basin: Greenwood Springs (Colorado River), Maybell (Yampa River), and Archuleta (San Juan).
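The windowing and evaluation setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The function names (`make_windows`, `chronological_split`, `rmse`, `r_squared`), the synthetic sinusoidal series, and the seasonal-naive baseline are all assumptions made for illustration; only the 24-month look-back, the 12-month horizon, the 80/20 chronological split, and the two metrics come from the abstract.

```python
import math

def make_windows(series, look_back=24, horizon=12):
    """Pair each look_back-month input window with the horizon months that follow it."""
    pairs = []
    last_start = len(series) - look_back - horizon
    for start in range(last_start + 1):
        x = series[start:start + look_back]
        y = series[start + look_back:start + look_back + horizon]
        pairs.append((x, y))
    return pairs

def chronological_split(series, train_fraction=0.8):
    """Split without shuffling, so the test period follows the training period."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]

def rmse(actual, predicted):
    """Root Mean Square Error over paired values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical stand-in data: 30 years of monthly values with a pure annual cycle.
series = [10.0 + 5.0 * math.sin(2 * math.pi * month / 12) for month in range(360)]

train, test = chronological_split(series)   # 288 training months, 72 test months
train_pairs = make_windows(train)           # these would be used to fit a model
test_pairs = make_windows(test)

# Seasonal-naive baseline (an assumption, not one of the four models in the study):
# forecast the next 12 months as a copy of the last 12 months of the input window.
horizon = 12
actual, predicted = [], []
for x, y in test_pairs:
    predicted.extend(x[-horizon:])
    actual.extend(y)

print("test windows:", len(test_pairs))
print("RMSE:", rmse(actual, predicted))
print("R^2:", r_squared(actual, predicted))
```

In practice, the `(x, y)` pairs from the training period would be passed to a regressor in place of the baseline, and the same metrics would be computed on the test-period pairs.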
