We are seeking a highly skilled Machine Learning Researcher/Data Scientist to work on the retrieval of greenhouse gas (GHG) concentrations from satellite measurements [1,2,3], and on the uncertainties stemming from the interpolation of detected emissions in the concentration maps.
Satellite observations are hyperspectral images with two spatial dimensions (~30 m resolution) and one spectral dimension (~5 nm resolution). Each pixel is thus a vector of size N, where N is the number of measured spectral channels (~100). The spectral dimension makes it possible to identify atmospheric species from their known absorption spectra.
Existing retrieval methods [4] invert a state vector describing the atmospheric composition of a satellite pixel. This inversion requires many parameters estimated a priori and relies on a so-called Radiative Transfer Model (RTM) [5] as a forward model, which simulates the interaction of light with the atmosphere and the ground before it is measured by the satellite. The inversion is traditionally performed pixel by pixel, and the RTM is a heavy full-physics model, so processing can be slow and expensive in an operational setting.
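To make the per-pixel inversion concrete, here is a minimal sketch under strong simplifying assumptions. The forward model is a toy Beer-Lambert attenuation over a flat surface albedo, not a real RTM. The state vector has only two elements, and the fit is a plain Gauss-Newton least-squares loop without the prior term used in operational retrievals. All names (`SIGMA`, `forward`, `retrieve`) and numerical values are hypothetical.

```python
import math

N_CHANNELS = 100  # roughly the number of spectral channels mentioned above
# Hypothetical absorption cross-sections: a single Gaussian absorption line.
SIGMA = [math.exp(-((i - 50) / 8.0) ** 2) for i in range(N_CHANNELS)]


def forward(albedo, column):
    """Toy forward model (assumption): albedo times Beer-Lambert transmittance."""
    return [albedo * math.exp(-s * column) for s in SIGMA]


def retrieve(measured, albedo=0.5, column=0.0, iterations=20):
    """Gauss-Newton least-squares inversion of the state (albedo, column)."""
    for _ in range(iterations):
        transmittance = [math.exp(-s * column) for s in SIGMA]
        model = [albedo * t for t in transmittance]
        residual = [y - m for y, m in zip(measured, model)]
        # Jacobian columns: d(model)/d(albedo) and d(model)/d(column).
        j_albedo = transmittance
        j_column = [-s * m for s, m in zip(SIGMA, model)]
        # Normal equations (J^T J) dx = J^T r, solved in closed form (2x2).
        aa = sum(a * a for a in j_albedo)
        ac = sum(a * c for a, c in zip(j_albedo, j_column))
        cc = sum(c * c for c in j_column)
        ra = sum(a * r for a, r in zip(j_albedo, residual))
        rc = sum(c * r for c, r in zip(j_column, residual))
        det = aa * cc - ac * ac
        albedo += (cc * ra - ac * rc) / det
        column += (aa * rc - ac * ra) / det
    return albedo, column


# Simulate one noise-free pixel, then invert it.
truth = (0.3, 0.8)
measured = forward(*truth)
estimate = retrieve(measured)
```

Every iteration evaluates the forward model once per pixel, which is why a heavy full-physics RTM dominates the cost when this loop is repeated over a whole image.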
Two complementary approaches will be explored:
- Speed up and enhance the traditional retrieval framework by using neural networks as forward models. Such a surrogate model would be trained on synthetic data generated from an existing RTM.
- Learn an inverse model directly, with a neural network that produces a concentration map from a satellite observation.
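The first approach can be illustrated with a deliberately tiny sketch: a one-hidden-layer network trained by full-batch gradient descent to emulate a forward model from synthetic samples. The `rtm` function below is a hypothetical stand-in for an expensive RTM, and the architecture, learning rate, and step count are arbitrary assumptions. Real work would most likely use an automatic-differentiation framework and spectra rather than scalars.

```python
import math
import random


def rtm(column):
    # Hypothetical stand-in for an expensive full-physics RTM.
    return math.exp(-column)


def predict(params, x):
    """One-hidden-layer tanh network; returns the output and hidden activations."""
    hidden = [math.tanh(w * x + b) for w, b in zip(params["w1"], params["b1"])]
    output = sum(v * h for v, h in zip(params["w2"], hidden)) + params["b2"]
    return output, hidden


def mse(params, xs, ys):
    return sum((predict(params, x)[0] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, hidden_units=8, steps=1000, learning_rate=0.02, seed=0):
    rng = random.Random(seed)
    params = {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(hidden_units)],
        "b1": [0.0] * hidden_units,
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(hidden_units)],
        "b2": 0.0,
    }
    initial_loss = mse(params, xs, ys)
    for _ in range(steps):
        g_w1 = [0.0] * hidden_units
        g_b1 = [0.0] * hidden_units
        g_w2 = [0.0] * hidden_units
        g_b2 = 0.0
        for x, y in zip(xs, ys):
            output, hidden = predict(params, x)
            error = 2.0 * (output - y) / len(xs)  # d(MSE)/d(output)
            g_b2 += error
            for j in range(hidden_units):
                g_w2[j] += error * hidden[j]
                # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
                back = error * params["w2"][j] * (1.0 - hidden[j] ** 2)
                g_w1[j] += back * x
                g_b1[j] += back
        for j in range(hidden_units):
            params["w1"][j] -= learning_rate * g_w1[j]
            params["b1"][j] -= learning_rate * g_b1[j]
            params["w2"][j] -= learning_rate * g_w2[j]
        params["b2"] -= learning_rate * g_b2
    return params, initial_loss, mse(params, xs, ys)


# Synthetic training set generated from the stand-in RTM.
xs = [2.0 * i / 31 for i in range(32)]
ys = [rtm(x) for x in xs]
params, initial_loss, final_loss = train(xs, ys)
```

For the second approach, the same training loop would apply with inputs and targets swapped, so that the network maps simulated observations back to the state that produced them.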
From these GHG concentration maps, spot emissions are detected and quantified, and eventually used alongside other data to estimate total emissions over a continuous period of time. In this role, you will additionally apply state-of-the-art uncertainty quantification techniques, specifically Conformal Prediction [6], to this complex, multi-stage measurement pipeline. Your mission will be to ensure that our environmental data is not just highly accurate but also rigorously bounded by distribution-free confidence intervals.
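As a rough illustration of how conformal prediction yields distribution-free intervals, the sketch below applies split conformal prediction to a toy regression problem. The point predictor, the synthetic data, and every name (`predict`, `sample`, `conformal_quantile`) are hypothetical assumptions and do not represent the actual emission pipeline. The only requirement on the data is that calibration and test points are exchangeable.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def predict(x):
    # Hypothetical point predictor standing in for a retrieval model.
    return 2.0 * x


rng = random.Random(0)


def sample(n):
    # Synthetic data (assumption): y = 2x plus Gaussian noise.
    pairs = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        pairs.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))
    return pairs


calibration = sample(500)
test_points = sample(2000)
alpha = 0.1  # target miscoverage, i.e. 90% intervals

# Nonconformity score: absolute residual on held-out calibration data.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha)

# The interval for a new x is [predict(x) - q, predict(x) + q].
covered = sum(predict(x) - q <= y <= predict(x) + q for x, y in test_points)
coverage = covered / len(test_points)
```

The empirical `coverage` should land near the 1 - alpha target. The guarantee is marginal over calibration and test draws and does not depend on the noise distribution or on the quality of the predictor, although a poor predictor gives wider intervals.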
References
[1] C. Borger, S. Beirle, A. Butz, L. O. Scheidweiler, and T. Wagner, “High-resolution observations of NO2 and CO2 emission plumes from EnMAP satellite measurements,” Environ. Res. Lett., vol. 20, no. 4, p. 044034, Mar. 2025, doi: 10.1088/1748-9326/adc0b1.
[2] D. H. Cusworth et al., “Quantifying Global Power Plant Carbon Dioxide Emissions With Imaging Spectroscopy,” AGU Advances, vol. 2, no. 2, p. e2020AV000350, June 2021, doi: 10.1029/2020AV000350.
[3] M. Dogniaux et al., “The Adaptable 4A Inversion (5AI): description and first XCO2 retrievals from Orbiting Carbon Observatory-2 (OCO-2) observations,” Atmos. Meas. Tech., vol. 14, no. 6, pp. 4689–4706, June 2021, doi: 10.5194/amt-14-4689-2021.
[4] C. Frankenberg, U. Platt, and T. Wagner, “Iterative maximum a posteriori (IMAP)-DOAS for retrieval of strongly absorbing trace gases: Model studies for CH4 and CO2 retrieval from near infrared spectra of SCIAMACHY onboard ENVISAT,” Atmos. Chem. Phys., 2005, https://acp.copernicus.org/articles/5/9/2005/
[5] C. Rodgers, “Retrieval of atmospheric temperature and composition from remote measurements of thermal radiation,” Reviews of Geophysics, vol. 14, no. 4, pp. 609–624, 1976, doi: 10.1029/RG014i004p00609.
[6] A. Angelopoulos, R. Foygel Barber, and S. Bates, “Theoretical Foundations of Conformal Prediction,” Cambridge University Press, 2025, https://arxiv.org/abs/2411.11824