Chemical flow analysis (CFA) can be used to collect life-cycle inventory (LCI) data, estimate environmental releases, and identify potential exposure scenarios for chemicals of concern at the end-of-life (EoL) stage. Nonetheless, the demand for comprehensive data and the epistemic uncertainty about the pathways taken by chemical flows make CFA, LCI, and exposure assessment time-consuming and challenging tasks. Owing to the continuous growth of computing power and the appearance of more robust algorithms, data-driven modelling is an attractive tool for streamlining these tasks. However, deploying and serving data-driven models in the real world requires a data ingestion pipeline. Hence, this work contributes a chemical-centric and data-centric approach to extract, transform, and load comprehensive data for CFA at the EoL, integrating cross-year and cross-country data and their provenance as part of the data lifecycle. The framework is scalable and adaptable to production-level machine learning operations. It can supply data at an annual rate, making it possible to handle changes in the statistical distributions of model predictors (e.g., transferred amount) and target variables (e.g., EoL activity identification) and thereby avoid potential decay in data-driven model performance over time. For instance, it can detect that recycling transfers of 643 chemicals over the reporting years (1988 to 2020) are 29.87%, 17.79%, and 20.56% for Canada, Australia, and the U.S., respectively. Moreover, the developed approach enables research on data-driven modelling to connect easily with other data sources for economic information on industry sectors, the economic value of chemicals, and the environmental regulatory implications that may affect the occurrence of an EoL transfer class or activity, such as recycling of a chemical, across years and countries.
Finally, stakeholders gain more context about environmental regulation stringency and economic affairs that could affect environmental decision-making and EoL chemical exposure predictions.
Keywords: Chemical flow analysis; Data modelling; End-of-life; Exploratory data analysis; Exposure scenario; Life cycle inventory.