Multi-step-ahead prediction of water levels using machine learning: A comparative analysis in the Vietnamese Mekong Delta

Nguyen Duc Hanh, Nguyen Tien Giang, Le Xuan Hoa, Tran Ngoc Vinh, Huu Duy Nguyen

Authors

  • Nguyen Duc Hanh, Faculty of Hydrology, Meteorology and Oceanography, VNU University of Science, Vietnam National University, 334 Nguyen Trai, Thanh Xuan district, Hanoi, Vietnam
  • Nguyen Tien Giang, Faculty of Hydrology, Meteorology and Oceanography, VNU University of Science, Vietnam National University, 334 Nguyen Trai, Thanh Xuan district, Hanoi, Vietnam
  • Le Xuan Hoa, Dong Thap Provincial Hydrometeorological Station, Southern Regional Hydrometeorological Station, General Department of Hydrometeorology, Ministry of Natural Resources and Environment
  • Tran Ngoc Vinh, Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, MI, USA
  • Huu Duy Nguyen, Faculty of Geography, VNU University of Science, Vietnam National University, Hanoi, Vietnam

DOI:

https://doi.org/10.15625/2615-9783/21067

Keywords:

Water level, multi-step-ahead prediction, machine learning, Vietnamese Mekong Delta

Abstract

This study evaluates five machine learning algorithms, namely Support Vector Regression (SVR), Decision Tree (DT), Random Forest (RF), Light Gradient Boosting Machine Regressor (LGBM), and Linear Regression (LR), for predicting water levels in the tidal river system of the Vietnamese Mekong Delta, where water-level dynamics are complex and nonlinear. Using daily maximum, minimum, and mean water level data from the Cao Lanh gauging station on the Tien River (2000-2020), models were developed to forecast water levels one, three, five, and seven days in advance. Performance was assessed using the Nash-Sutcliffe Efficiency (NSE), the coefficient of determination (R²), the Root Mean Square Error (RMSE), and the Mean Absolute Error (MAE). Results indicate that all models performed well, with SVR consistently outperforming the others, followed by RF, DT, and LGBM. The study demonstrates that machine learning can predict water levels from historical water level data alone, which could support flood warning systems, water resource management, and agricultural planning. These findings add to the growing body of knowledge on machine learning applications in hydrology and can inform sustainable water resource management strategies in delta regions.
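The forecasting setup and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a direct multi-step framing in which a window of past daily water levels is paired with the value a fixed number of days ahead; the abstract does not state the lag length or the implementation, so the function names, the `n_lags` parameter, and the toy series below are hypothetical. R² is computed here as the squared Pearson correlation, which is one common convention in hydrology and may differ from the definition used in the paper.

```python
import math


def make_supervised(series, n_lags, horizon):
    """Pair each window of n_lags past values with the value `horizon` steps ahead."""
    X, y = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        X.append(series[t - n_lags + 1 : t + 1])  # values at t-n_lags+1 .. t
        y.append(series[t + horizon])             # target at t+horizon
    return X, y


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def rmse(obs, sim):
    """Root Mean Square Error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean Absolute Error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def r2(obs, sim):
    """Coefficient of determination as squared Pearson correlation (assumed convention)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Toy example (made-up numbers, not data from the study): a persistence
# baseline that predicts the last observed value for each lead time.
toy_levels = [1.2, 1.5, 1.1, 1.6, 1.3, 1.7, 1.2, 1.8, 1.4, 1.9, 1.5, 2.0]
for lead in (1, 3, 5, 7):
    X, y = make_supervised(toy_levels, n_lags=3, horizon=lead)
    baseline = [window[-1] for window in X]
    print(lead, round(nse(y, baseline), 3), round(rmse(y, baseline), 3))
```

In a setup like this, one model would be trained per lead time on the `(X, y)` pairs, and the persistence baseline gives a reference score that any of the five algorithms should beat.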

Published

02-07-2024

How to Cite

Nguyen Duc, H., Nguyen Tien, G., Le Xuan, H., Tran Ngoc, V., & Nguyen Huu, D. (2024). Multi-step-ahead prediction of water levels using machine learning: A comparative analysis in the Vietnamese Mekong Delta. Vietnam Journal of Earth Sciences, 468–488. https://doi.org/10.15625/2615-9783/21067

Section

Articles
