Data conventions
Array layout
Raw .npy files saved by the preprocessing pipeline are channels-last:
shape (N, H, W, C). PyTorch requires channels-first (N, C, H, W), so
train.py transposes on load.
C = 1 for single-channel files (separate CIB and tSZ files).
C = 2 for joint files (CIB channel 0, tSZ channel 1).
DDPM samples output by sample.py are channels-first (N, 2, H, W).
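The transpose described above can be sketched with NumPy as follows. The array is synthetic and the helper name `to_channels_first` is hypothetical; the actual loading code in train.py may differ.

```python
import numpy as np


def to_channels_first(x: np.ndarray) -> np.ndarray:
    """Reorder (N, H, W, C) -> (N, C, H, W). Hypothetical helper, not taken from train.py."""
    if x.ndim != 4:
        raise ValueError(f"expected a 4-D array, got shape {x.shape}")
    # Axis 3 (channels) moves to position 1. A contiguous copy is made so that
    # a later torch.from_numpy() call would see a dense (N, C, H, W) buffer.
    return np.ascontiguousarray(np.transpose(x, (0, 3, 1, 2)))


# Synthetic stand-in for a joint file: 4 patches, 256 x 256 pixels, 2 channels.
joint = np.zeros((4, 256, 256, 2), dtype=np.float32)
joint[..., 0] = 1.0  # channel 0 plays the role of CIB
joint[..., 1] = 2.0  # channel 1 plays the role of tSZ

chw = to_channels_first(joint)
print(chw.shape)
```

After the transpose, `chw[:, 0]` holds the CIB maps and `chw[:, 1]` the tSZ maps, matching the channel order of the sample.py output.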
Patch geometry
All patches are 6° × 6° flat-sky projections of HEALPix maps, pixelised onto a 256 × 256 grid.
| Parameter | Value |
|---|---|
| Grid size (pixels per side) | 256 |
| Pixel size (arcmin) | 1.40625 (= 6° × 60 / 256) |
| | |
Pass flatskymapparams to all flatmaps functions.
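As a quick check of the numbers above, the following sketch recomputes the pixel size and the flat-sky multipole range that this geometry implies. The variable names are illustrative only and do not reflect the contents or layout of flatskymapparams.

```python
import math

PATCH_SIDE_DEG = 6.0  # patch side length, from the text above
N_PIX = 256           # pixels per side, from the text above

# Pixel size in arcminutes: 6 deg * 60 arcmin/deg / 256 pixels.
pixel_arcmin = PATCH_SIDE_DEG * 60.0 / N_PIX

# Fundamental mode of the patch: ell = 2*pi / L, with L the side in radians.
ell_fundamental = 2.0 * math.pi / math.radians(PATCH_SIDE_DEG)

# Nyquist mode along one axis: ell = pi / (pixel size in radians).
ell_nyquist = math.pi / math.radians(pixel_arcmin / 60.0)

print(pixel_arcmin, round(ell_fundamental), round(ell_nyquist))
```

The fundamental mode comes out at ℓ = 60 and the per-axis Nyquist mode at ℓ = 7680, so no multipole spacing finer than 60 is resolved and no single-axis mode above 7680 exists on this grid.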
Normalisation
Warning
The normalisation scheme is currently contested between notebooks — see
inconsistency #7 in docs/paper_code_inconsistencies.md. Notebook 03 (the
data producer) z-scores both channels and saves _zscore_ files with
norm_params = [cib_mean, cib_std, tsz_mean, tsz_std], which matches the
denormalize_dm_maps (x·std+mean) paragraph below and the checkpoint
name v3_zscore_.... The table immediately below documents the legacy
min–max scheme still referenced by notebooks 06–14’s load filenames. Confirm
which .npy files exist on disk and how the checkpoint was trained before
relying on either.
Two normalisation schemes have been used, one per channel (legacy scheme shown; see the warning above):
| Channel | Scheme | File suffix | Notes |
|---|---|---|---|
| CIB 150 GHz | Min–max → [0, 1] | | CIB pixels are non-negative by construction |
| tSZ Compton-y | Z-score (μ=0, σ=1) | | tSZ spans negative and positive values |
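For orientation, the two legacy schemes in the table can be sketched as below. This is a minimal illustration on synthetic data, not the notebook code: the function names are hypothetical, and it assumes the statistics are computed globally over all patches rather than per patch.

```python
import numpy as np


def minmax_normalise(x):
    """Scale x to [0, 1]; returns the scaled array and the (min, max) used."""
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        raise ValueError("constant input cannot be min-max scaled")
    return (x - lo) / (hi - lo), (lo, hi)


def zscore_normalise(x):
    """Shift and scale x to zero mean and unit std; returns (mean, std) too."""
    mean, std = float(x.mean()), float(x.std())
    if std == 0.0:
        raise ValueError("constant input cannot be z-scored")
    return (x - mean) / std, (mean, std)


rng = np.random.default_rng(0)
cib = rng.gamma(2.0, size=(8, 16, 16, 1))  # synthetic, non-negative like CIB
tsz = rng.normal(size=(8, 16, 16, 1))      # synthetic, signed like filtered tSZ

cib_n, (cib_min, cib_max) = minmax_normalise(cib)
tsz_n, (tsz_mean, tsz_std) = zscore_normalise(tsz)
```

Either scheme is invertible only if the returned statistics are stored with the data, which is why the parameter file matters when switching between schemes.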
Normalisation parameters (cib_mean, cib_std, tsz_mean, tsz_std) are saved
alongside the patches as norm_params_{ptsrc}mJy.npy. Use
denormalize_dm_maps() to invert.
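The inversion x·std + mean can be sketched as follows for a channels-first (N, 2, H, W) array. The function `denormalise_zscore` is a hypothetical stand-in, not the real denormalize_dm_maps(), whose signature is not documented here; it assumes both channels were z-scored and that the parameters are ordered cib_mean, cib_std, tsz_mean, tsz_std.

```python
import numpy as np


def denormalise_zscore(maps, norm_params):
    """Invert per-channel z-scoring on an (N, 2, H, W) array.

    Hypothetical stand-in for denormalize_dm_maps(). Assumes norm_params is
    ordered [cib_mean, cib_std, tsz_mean, tsz_std].
    """
    cib_mean, cib_std, tsz_mean, tsz_std = np.asarray(norm_params, dtype=float)
    # Shape (1, 2, 1, 1) so the statistics broadcast over N, H and W.
    mean = np.array([cib_mean, tsz_mean]).reshape(1, 2, 1, 1)
    std = np.array([cib_std, tsz_std]).reshape(1, 2, 1, 1)
    return maps * std + mean


# Synthetic "physical" maps with different offsets and scales per channel.
rng = np.random.default_rng(1)
physical = rng.normal(size=(5, 2, 8, 8))
physical[:, 0] = 2.0 * physical[:, 0] + 3.0
physical[:, 1] = 0.5 * physical[:, 1] - 1.0

norm_params = [
    physical[:, 0].mean(), physical[:, 0].std(),
    physical[:, 1].mean(), physical[:, 1].std(),
]
normalised = np.stack(
    [
        (physical[:, 0] - norm_params[0]) / norm_params[1],
        (physical[:, 1] - norm_params[2]) / norm_params[3],
    ],
    axis=1,
)

recovered = denormalise_zscore(normalised, norm_params)
```

If a file was instead produced with the legacy min–max scheme for CIB, this inversion would silently return wrong amplitudes, so it is worth checking the round trip against a known patch first.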
Masking and filtering
Point-source mask: sources brighter than 2 mJy at 150 GHz are masked; masked pixels are set to zero (not NaN).
Low-pass filter: modes above ℓ = 7000 are removed; negative pixels arising from filtering artefacts are zeroed.
Patch rejection: patches where all pixels are zero (footprint entirely within the Galactic-plane mask) are discarded before training.
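The filtering and rejection steps above can be sketched as below. This is an illustration under stated assumptions, not the pipeline code: the function names are hypothetical, the low-pass filter is taken to be a sharp cut in flat-sky Fourier space, the pixel size is the 1.40625 arcmin quoted earlier, and the order of operations relative to point-source masking is not specified here.

```python
import numpy as np


def lowpass_flat_sky(patch, pixel_arcmin, ell_max):
    """Zero all Fourier modes with |ell| > ell_max in a 2-D patch."""
    n_y, n_x = patch.shape
    dx_rad = np.deg2rad(pixel_arcmin / 60.0)
    # fftfreq gives cycles per radian; multiplying by 2*pi gives multipole ell.
    ell_x = 2.0 * np.pi * np.fft.fftfreq(n_x, d=dx_rad)
    ell_y = 2.0 * np.pi * np.fft.fftfreq(n_y, d=dx_rad)
    ell = np.hypot(ell_x[None, :], ell_y[:, None])
    fourier = np.fft.fft2(patch)
    fourier[ell > ell_max] = 0.0
    return np.fft.ifft2(fourier).real


def clean_patch(patch, pixel_arcmin=1.40625, ell_max=7000.0):
    """Low-pass filter a patch, then zero negative pixels left by ringing."""
    filtered = lowpass_flat_sky(patch, pixel_arcmin, ell_max)
    return np.clip(filtered, 0.0, None)


def drop_empty_patches(patches):
    """Discard patches whose pixels are all zero; patches is (N, H, W, C)."""
    keep = np.any(patches != 0, axis=(1, 2, 3))
    return patches[keep]


# Three synthetic single-channel patches: random, all-zero, and constant.
rng = np.random.default_rng(2)
patches = np.zeros((3, 256, 256, 1))
patches[0, ..., 0] = rng.gamma(2.0, size=(256, 256))
patches[2, ..., 0] = 1.0

cleaned = np.stack([clean_patch(p[..., 0]) for p in patches])[..., None]
kept = drop_empty_patches(cleaned)
print(kept.shape)
```

The all-zero patch is dropped and the constant patch passes through unchanged, because a sharp cut at ℓ = 7000 leaves the zero mode untouched.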
Multipole conventions
map2cl() returns the unbinned power
spectrum \(C_\ell\) in map-unit² sr. To convert to
\(D_\ell = \ell(\ell+1)C_\ell / 2\pi\):
import numpy as np

el, cl, _ = mean_cls(maps, flatskymapparams, lmin=300, lmax=4000, binsize=60)
dl = el * (el + 1) / (2 * np.pi) * cl