Accurate estimation of atmospheric chemical concentrations from multiple observations is crucial for assessing the health effects of air pollution. However, existing methods are limited by imbalanced observational samples. Here, we introduce a novel deep-learning model-measurement fusion method (DeepMMF), constrained by physical laws inferred from a chemical transport model (CTM), to estimate NO2 concentrations over the Continental United States (CONUS). By pretraining on spatiotemporally complete CTM simulations, fine-tuning with satellite and ground measurements, and employing a novel optimization strategy for selecting appropriate prior emissions, DeepMMF delivers improved NO2 estimates that are more consistent with observations and better aligned with their daily variation (the normalized mean bias, NMB, improves from -0.3 for the original CTM simulations to -0.1). More importantly, DeepMMF effectively addresses the sample imbalance issue that causes other methods to overestimate downwind or rural concentrations by more than 100%. Evaluated against surface NO2 observations, it achieves an R2 of 0.98 and an RMSE of 1.45 ppb, outperforming other approaches, which show R2 values of 0.4-0.7 and RMSEs of 3-6 ppb. The method also offers the synergistic advantage of adjusting the corresponding emissions, and the adjustments agree with the changes (-10% to -20%) reported in the National Emissions Inventory (NEI) between 2019 and 2020. Our results demonstrate the great potential of DeepMMF in data fusion to better support air pollution exposure estimation and forecasting.
Keywords: NO2; TROPOMI satellite; deep learning; model-measurement fusion; physically constrained.
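
For readers who want to reproduce the evaluation statistics quoted in the abstract (NMB, RMSE, and R2), the following is a minimal sketch of how such metrics are commonly computed from paired model and observation values. It is not the authors' evaluation code. The function names are hypothetical, the sample arrays are made-up illustrative values rather than data from this study, and R2 is assumed here to be the squared Pearson correlation; the abstract does not state which definition of R2 is used.

```python
import numpy as np


def nmb(model, obs):
    """Normalized mean bias: sum(model - obs) / sum(obs)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sum(model - obs) / np.sum(obs))


def rmse(model, obs):
    """Root-mean-square error, in the same units as the inputs (e.g. ppb)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((model - obs) ** 2)))


def r2(model, obs):
    """Squared Pearson correlation (an assumed definition of R2)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(model, obs)[0, 1]
    return float(r ** 2)


# Illustrative values only (ppb); NOT data from the study.
obs = np.array([10.0, 20.0, 30.0, 40.0])
model = np.array([8.0, 18.0, 27.0, 37.0])

print("NMB :", nmb(model, obs))
print("RMSE:", rmse(model, obs))
print("R2  :", r2(model, obs))
```

A negative NMB, as in this toy example, indicates that the estimates are on average lower than the observations, which is the direction of the bias reported for the original CTM simulations.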