Advancements in water quality prediction: a practical review of machine learning and deep learning approaches

Abstract

Water quality plays a pivotal role in ensuring the safety and sustainability of water resources, with significant implications for environmental protection, public health, and various industrial applications. This paper both reviews related state-of-the-art work and implements adapted versions of the reviewed models to predict water quality parameters on a new water dataset from Galicia, Spain. The reviewed studies encompass a range of predictive models applied to diverse water quality parameters, including dissolved oxygen levels, pH levels, and other complex water parameters. These models include various machine learning and deep learning (DL) methods such as Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Bidirectional LSTMs. This research contributes by implementing various models on the dataset and experimentally demonstrating the impact of key factors on model performance. These factors include model sophistication, imputation techniques, recurrent architectures, and customized approaches for water quality prediction using deep learning. Notably, K-Nearest Neighbors (KNN) imputation enhances performance by preserving local data relationships, while noise filtering further improves predictive accuracy. Additionally, we observe that smaller batch sizes and learning rates lead to better generalization in sparse datasets, outperforming traditional approaches. The conclusions are guided by comparing the performance of all models on the Galician dataset using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R²). This paper provides the first DL-based water quality analysis for Galicia, emphasizing the need for regional model adaptation. Our results guide future research directions, including the exploration of Transformer-based architectures for time-series data, more sophisticated feature selection techniques, and neural-network-based imputation strategies to enhance data completeness.
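
For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of their standard definitions in plain Python. It is not the authors' code: the function names and the sample `observed` and `predicted` values are hypothetical and chosen only for illustration, and the numbers are not taken from the Galician dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical dissolved-oxygen readings (mg/L) and model predictions.
observed = [8.0, 7.5, 9.0, 6.5]
predicted = [7.5, 7.5, 8.5, 7.0]

print("RMSE:", rmse(observed, predicted))
print("MAE: ", mae(observed, predicted))
print("R2:  ", r2(observed, predicted))
```

Lower RMSE and MAE indicate smaller prediction errors, while an R² closer to 1 indicates that the model explains more of the variance in the observed values. RMSE penalizes large errors more heavily than MAE, which is one reason the two are commonly reported together.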

Bibliographic citation

A. Helaly, M., Rady, S., Mabrouk, M. et al. Advancements in water quality prediction: a practical review of machine learning and deep learning approaches. Cluster Comput 28, 598 (2025). https://doi.org/10.1007/s10586-025-05221-3

Sponsors

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Rights

© The Author(s) 2025. This article is licensed under a Creative Commons Attribution 4.0 International License.