A Combined Statistical and Machine Learning Approach for Predicting Surface Water Quality in Burkina Faso
Issoufou OUEDRAOGO *
Mining Engineering Department, University Yembila Abdoulaye TOGUYENI, BP 54 Fada N'Gourma, Burkina Faso and Geosciences and Environment Laboratory (LaGE), Joseph KI-ZERBO University, BP 7021, Ouagadougou, Burkina Faso.
Issan KI
Geosciences and Environment Laboratory (LaGE), Joseph KI-ZERBO University, BP 7021, Ouagadougou, Burkina Faso; Direction de la Qualité des Eaux/Ministère de l'Environnement de l'Eau et de l'Assainissement, Burkina Faso; and Unité de Formation et de Recherche en Sciences Appliquées et Technologies, Université Daniel OUEZZIN-COULIBALY, Dédougou, BP 139, Burkina Faso.
Michelline M.R. KANSOLE
Mining Engineering Department, University Yembila Abdoulaye TOGUYENI, BP 54 Fada N'Gourma, Burkina Faso and Geosciences and Environment Laboratory (LaGE), Joseph KI-ZERBO University, BP 7021, Ouagadougou, Burkina Faso.
Baowendsom Judicael YANOGO
Center for Mineral Resource Studies (CERM), University of Quebec at Chicoutimi, 555 Bd de l’Université, Chicoutimi, QC G7H 2B1, Canada and Probe Gold, Exploration Company, Quebec, Canada.
*Author to whom correspondence should be addressed.
Abstract
Surface water in Burkina Faso is essential for domestic use, agriculture, and ecosystem services, yet it is increasingly affected by human activities and climate variability. This study combined the Water Quality Index (WQI), multivariate statistics, and a Multilayer Perceptron (MLP) neural network to assess and predict water quality. A total of 139 samples were analyzed for 17 physicochemical parameters. The results revealed slightly alkaline waters (pH 6.04–9.23), low-to-moderate mineralization (electrical conductivity [EC] 39–387 microsiemens per centimeter [µS/cm]; total dissolved solids [TDS] 39–1100 milligrams per liter [mg/L]), and spatially variable nutrient concentrations (ammonium [NH₄⁺], nitrate [NO₃⁻], and phosphate [PO₄³⁻]), indicating both natural and anthropogenic inputs. Correlation and factor analyses identified three main influences on water quality: geogenic weathering; nutrient and sediment inputs from human activities; and salinity and mineral contributions. MLP modelling showed that deeper architectures with two hidden layers, (12, 6) and (12, 12), achieved the highest predictive accuracy (R² ≈ 0.825, RMSE ≈ 61, and MAE ≈ 40), and the best model generalized well to test data (test R² = 0.95; test RMSE = 37.3). This integrated approach demonstrates the potential of combining statistical analysis and machine learning to monitor, manage, and predict surface water quality in Burkina Faso.
Keywords: Water quality index, statistical analysis, multilayer perceptron, surface water prediction, Burkina Faso
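The accuracy measures quoted in the abstract (R², RMSE, and MAE) follow standard definitions. The sketch below shows one way they could be computed from paired observed and predicted WQI values. The function name `regression_metrics` and the sample numbers are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot               # coefficient of determination
    rmse = math.sqrt(ss_res / n)             # root mean squared error
    mae = sum(abs(r) for r in residuals) / n  # mean absolute error
    return r2, rmse, mae


# Hypothetical WQI values for illustration only (not taken from the study).
observed = [50.0, 100.0, 150.0, 200.0]
predicted = [60.0, 90.0, 160.0, 190.0]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.2f} MAE={mae:.2f}")
# prints: R2=0.968 RMSE=10.00 MAE=10.00
```

RMSE and MAE are expressed in the same units as the predicted quantity, whereas R² is dimensionless, so the three values are best read together rather than compared with one another.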