Growing interest in water reuse is accompanied by concern about water quality. Excitation-emission matrix (EEM) measurements, which are widely used in laboratory analysis, are emerging as a promising tool for characterizing both the microbial and chemical quality of water in the online monitoring of water reuse systems. However, the robustness of EEM measurements has rarely been validated in actual online monitoring campaigns, where predictions are made for new samples independent of those used to establish EEM analysis models, including the popular parallel factor analysis (PARAFAC). In this study, two strategies for conducting PARAFAC were examined for the online monitoring of a greywater reuse system, using two EEM datasets from two monitoring periods for model establishment and model testing, respectively. With the first strategy, which is commonly used in laboratory analyses, an entire EEM dataset from one period was used to establish a single PARAFAC model, and the maximum fluorescence intensity (Fmax) of a PARAFAC component was used to predict total cell count (TCC) in the other period. However, disturbed by background dissolved organic matter (DOM) fluorescence, Fmax gave unreliable predictions in model testing. To address this problem, a second, novel strategy was proposed using an EEM clustering and PARAFAC component shift mining technique. This unsupervised algorithm, named K-PARAFACs, automatically groups EEMs into K clusters and establishes on each cluster a cluster-specific PARAFAC model with distinct component shapes. With this method, multiple PARAFAC models were established on one EEM dataset, with each model representing samples with certain TCC ranges and DOM compositions. In model testing, these cluster-specific PARAFAC models served as EEM classifiers. A new sample was characterized not by Fmax but by the cluster-specific model that best fitted the sample's EEM signal, i.e., with the least numerical error.
The proposed strategy demonstrates its robustness by successfully predicting the TCC trend in test datasets. Our findings suggest that K-PARAFACs is a promising tool that enables robust qualitative monitoring of water reuse systems with background DOM variability.
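The classification step described above, in which a new sample is assigned to the cluster-specific model that fits its EEM with the least error, can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes that each cluster-specific model is available as a pair of emission and excitation loading matrices, that component scores are obtained by unconstrained least squares (non-negativity is not enforced), and that the error is the sum of squared residuals. All function names, peak positions, and the synthetic data are hypothetical.

```python
import numpy as np


def component_design_matrix(em_loadings, ex_loadings):
    """Stack each component's vectorized rank-one EEM as one column.

    em_loadings: (n_em, n_comp) emission loadings (assumed fixed)
    ex_loadings: (n_ex, n_comp) excitation loadings (assumed fixed)
    """
    n_comp = em_loadings.shape[1]
    columns = [
        np.outer(em_loadings[:, r], ex_loadings[:, r]).ravel()
        for r in range(n_comp)
    ]
    return np.column_stack(columns)


def fit_error(eem, em_loadings, ex_loadings):
    """Project one EEM onto fixed loadings; return (squared error, scores)."""
    design = component_design_matrix(em_loadings, ex_loadings)
    scores, *_ = np.linalg.lstsq(design, eem.ravel(), rcond=None)
    residual = eem.ravel() - design @ scores
    return float(np.sum(residual ** 2)), scores


def assign_cluster(eem, models):
    """Return the index of the best-fitting model and all fit errors."""
    errors = [fit_error(eem, em, ex)[0] for em, ex in models]
    return int(np.argmin(errors)), errors


def gaussian(axis, center, width):
    """Hypothetical Gaussian-shaped spectral loading for the demo."""
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


# Synthetic wavelength axes (nm); values are illustrative only.
em_axis = np.linspace(300, 500, 41)
ex_axis = np.linspace(240, 400, 33)

# Two hypothetical cluster-specific models with shifted component peaks.
model_a = (
    np.column_stack([gaussian(em_axis, 340, 15), gaussian(em_axis, 430, 25)]),
    np.column_stack([gaussian(ex_axis, 280, 10), gaussian(ex_axis, 330, 20)]),
)
model_b = (
    np.column_stack([gaussian(em_axis, 355, 15), gaussian(em_axis, 450, 25)]),
    np.column_stack([gaussian(ex_axis, 285, 10), gaussian(ex_axis, 345, 20)]),
)

# A synthetic "new sample" generated from model_b plus small noise.
rng = np.random.default_rng(0)
sample = (
    2.0 * np.outer(model_b[0][:, 0], model_b[1][:, 0])
    + 0.5 * np.outer(model_b[0][:, 1], model_b[1][:, 1])
    + 0.01 * rng.standard_normal((em_axis.size, ex_axis.size))
)

best_cluster, errors = assign_cluster(sample, [model_a, model_b])
print("best-fitting model index:", best_cluster)
```

In this sketch the sample is labeled by the index of the best-fitting model rather than by a component intensity, mirroring the idea that the cluster-specific models act as EEM classifiers. How the cluster labels map to TCC ranges is outside the scope of the sketch.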
Keywords: Dissolved organic matter (DOM); Excitation-emission matrix (EEM); Microbial quality; PARAFAC; Water reuse, online monitoring.
Copyright © 2024. Published by Elsevier Ltd.