foregrounds-diffusion¶
A denoising diffusion probabilistic model (DDPM) pipeline for generating correlated CIB and tSZ extragalactic CMB foreground maps.
Built on denoising-diffusion-pytorch and developed as MPhil Project 7 (Data Intensive Science, Cambridge).
Getting started
API reference
Tutorials
- 01 — Halo Catalogue
- 02 — Masking
- 03 — Patch Extraction and Normalisation
- 04 — Model Architecture and Training
- 05 — Sampling and Post-Processing
- 06 — Power Spectra Comparison
- 07 — Higher-Order Statistics (Bispectrum and Trispectrum)
- 08 — Pixel Histograms and Minkowski Functionals
- 09 — tSZ Cluster Stacking
- 10 — Peak and Minima Counts
- ─────────────────────────────────────────────────────────────────────────────
- Imports and physical constants.
- pixel_res_arcmin is used by the Gaussian smoothing kernel inside
- compute_peak_minima_counts to convert FWHM in arcmin to pixels.
- ─────────────────────────────────────────────────────────────────────────────
- Load normalised CIB (_zero suffix = min-max to [0,1]) and tSZ (_norm suffix
- = z-score). norm_params stores (cib_mean, cib_std, tsz_mean, tsz_std) in
- physical µK, computed over the full training set during patch extraction.
- DDPM samples are stored channels-first: shape (N, 2, H, W).
- Gaussian baseline maps (channels-first, same shape) were generated by
- drawing realisations from the Agora-measured power spectrum.
- Denormalise both back to µK before computing statistics.
- compute_peak_minima_counts:
- maps_nhw : (N, H, W) array in physical units (µK)
- thresholds_peaks : 1D array of ν = T/σ values for peak binning
- thresholds_minima : 1D array of ν values for minima binning (typically < 0)
- smoothing_scales_arcmin: list of FWHM values for the pre-smoothing Gaussian
- Returns dict keyed by scale → {‘peaks’: (N, T), ‘minima’: (N, T)}
- 11 — Scattering Transforms
- Check for the Cheng et al. scattering backend and add it to sys.path.
- This must be done before importing scattering_stats, which tries to import
- the backend at module level and falls back to kymatio if not found.
- Load normalised arrays (same split and denormalisation as notebooks 06-10).
- Use float32 throughout to halve memory and avoid kymatio’s implicit f64 cast.
- compute_scattering_coefficients(maps, J, L):
- maps : (N, H, W) float32 array in µK
- J : number of dyadic scales (wavelet support ≈ 2^j pixels per scale)
- L : number of orientations (angular resolution = 180° / L)
- Returns dict with keys ‘S0’, ‘S1’ (N, J, L), ‘S2’ (N, J, J, L)
- 12 — Minkowski Tensors
- Load modules and set configuration.
- THRESHOLDS is in the normalised [0, 1] range (after apply_maxmin_normalization);
- avoid ν = 0 and ν = 1 where the excursion set is trivially full or empty.
- Load normalised patches. The Minkowski tensor code re-normalises internally
- via norm_fn, so the absolute pixel values here are less critical than for
- power spectra. Using the standard z-score / min-max loading ensures
- consistency with the train/test split.
- DDPM samples shape: (N, 2, H, W) channels-first.
- Index [:, 0] = CIB channel, [:, 1] = tSZ channel.
- Denormalise: CIB = raw * (cib_max - cib_min) + cib_min;
- tSZ = raw * tsz_std + tsz_mean.
- compute_minkowski_tensors(maps_nhw, norm_fn, thresholds, tensor_types, centred)
- norm_fn : normalisation applied per-map before thresholding
- (use apply_maxmin_normalization so ν ∈ [0,1] is meaningful)
- centred : if True, subtract excursion-set centroid from r⊗r tensors
- (affects W200, W201 only; default True)
- n_jobs : pass n_jobs=-1 to use all CPU cores via joblib
- 13 — Profiling and Benchmarking Baseline
- Paper figures
- Fig 1
- Minkowski Functionals