Published July 20, 2022
| Version v1
Publication
Trends and Applications of Machine Learning in Water Supply Networks Management
Description
Purpose: This study describes the trends and applications of machine learning systems in the
management of water supply networks. Machine learning is a field in constant development, with great
potential to deliver improvements in real industries. The recent tendency of companies that manage
water supply networks to store data has created a range of possibilities for applying
machine learning. One particular case is the prediction of pipe failures based on historical data, which can
help to optimally plan the renovation and maintenance tasks. The objective of this work is to define the
stages and main characteristics of machine learning systems, focusing on supervised learning methods.
Additionally, peculiarities that are usually found in data from water supply networks are highlighted.
Design/methodology/approach: For this purpose, thirteen studies presenting real cases from
around the world are discussed. The methods used in each study are reviewed, from data processing to
model validation. Moreover, the most popular models are briefly described together with the
mechanisms that best suit their performance.
Findings: The study found that the class imbalance problem is typical of data from
water supply networks, where only a small percentage of pipes fail. Consequently, sampling methods are
recommended when training classifiers; they are not necessary when training a regression system.
Additionally, scaling and transforming variables generally has a positive impact on model
performance. Currently, cross-validation is almost a requirement for obtaining reliable and representative
results. This technique is employed in most of the reviewed studies to train and validate their models.
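The findings above can be illustrated with a minimal sketch, which is not taken from any of the reviewed studies. It assumes a toy dataset with a single feature (pipe age) and a 5% failure rate. Random oversampling stands in for "sampling methods" in general. All function names (`stratified_folds`, `oversample`, `fit_scaler`) are hypothetical. The sketch shows resampling and scaling being fitted on the training fold only, inside a stratified cross-validation loop, so the held-out fold keeps the original class ratio.

```python
import random
from statistics import mean, pstdev

def stratified_folds(labels, k, seed=0):
    """Split indices into k folds that keep the class ratio roughly constant."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def oversample(indices, labels, seed=0):
    """Randomly duplicate minority-class indices until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced

def fit_scaler(values):
    """Return (mean, std) computed on training values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against a constant feature
    return mu, sigma

# Hypothetical toy data: pipe age in years and a failure flag (10 of 200 pipes fail).
data_rng = random.Random(42)
ages = [data_rng.uniform(1, 80) for _ in range(200)]
failed = [1] * 10 + [0] * 190

folds = stratified_folds(failed, k=5)
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(failed)) if i not in held_out]
    # Balance the classes in the training fold only; the test fold stays untouched.
    train_bal = oversample(train_idx, failed)
    # Scaling parameters come from the training fold and are reused on the test fold.
    mu, sigma = fit_scaler([ages[i] for i in train_bal])
    x_train = [(ages[i] - mu) / sigma for i in train_bal]
    x_test = [(ages[i] - mu) / sigma for i in test_idx]
    # A classifier would be fitted on x_train and scored on x_test here.
```

For a regression target such as a failure rate, the `oversample` step would simply be skipped, in line with the recommendation above.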
Originality/value: The use of machine learning systems to predict pipe failures in water supply networks
is still a developing field. This study sets out the advantages and disadvantages of different methods
for processing data from water supply networks, as well as for training and validating the models.
Abstract
Universidad de Sevilla VI PPIT-US
Additional details
Identifiers
- URL
- https://idus.us.es/handle//11441/135647
- URN
- urn:oai:idus.us.es:11441/135647