09 — tSZ Cluster Stacking¶
Purpose: Compare tSZ cluster profiles between Agora and DDPM samples using a signal-to-noise–based stacking approach.
This notebook identifies cluster locations by SNR thresholding, stacks cutouts from both Agora and DDPM maps, and computes radial profiles to quantify agreement:
SNR selection — pixels satisfying A·σ < |T − T̄| ≤ B·σ are selected in three bins: (a) 5–10σ (~263k Agora clusters), (b) 10–20σ (~60k), (c) ≥20σ (~3.9k).
Cutout extraction — extracts 22×22 pixel (≈31’) cutouts centred on each selected pixel from both the Agora and DDPM tSZ maps.
Stacking — uniformly weighted average of all cutouts per SNR bin, producing a mean cluster profile image Ŝ(θx, θy).
Radial profiles — converts the 2D stacked images to 1D profiles using
radial_profile, with 10 linearly spaced bins of Δθ = 1’ out to θ_max = 10’. Plots the profile, Agora–DDPM difference, and their ratio.
Inputs:
Test maps and DDPM samples:
data/low_pass/2mJy/*.npy
Outputs:
Stacked cutout arrays:
tsz_extracts/agora_tsz_stacks.npy,tsz_extracts/ddpm_tsz_stacks.npyRadial profile plots:
plots/tsz_stacks.pdf,plots/tsz_stacks_radial_profile.pdf(Figure 3)
Key module functions:
foregrounds_diffusion.flatmaps.radial_profile
Paper reference: §4.2 (Figure 3 — stacked tSZ radial profiles in three SNR bins).
1 Configuration¶
Stacking parameters:
SNR_BINS: three S/N intervals
[3,6],[6,9],[9,∞]used to separate clusters by detection significance. Higher-S/N bins probe more massive clusters and produce cleaner stacked profiles.CUTOUT_PIX = 40: cutout width of 40 pixels = 56′ centred on each peak. This captures the full tSZ profile (typically ≈ 5–10′ radius for the cluster population at z ~ 0.5) with enough padding for the radial profile estimator.
PIXEL_RES = 1.40625′: physical scale of the cutout pixels, used to convert pixel offsets to angular separations for the radial profile.
[ ]:
# select_snr_pixels(maps_nhw, snr_min, snr_max):
# Smooths each map with a matched filter and returns the (row, col) coordinates
# of pixels where signal-to-noise falls in [snr_min, snr_max].
# Uses only the first map in the stack as the detection template — the same
# coordinates are then used to extract cutouts from all maps.
#
# extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts):
# Returns (N_cutouts, cutout_size, cutout_size) array; coordinates that
# would fall outside the map boundary are discarded.
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from foregrounds_diffusion.flatmaps import radial_profile
from foregrounds_diffusion.preprocessing import renormalize_dm_maps, denormalize_dm_maps
from foregrounds_diffusion.stacking import select_snr_pixels, extract_cutouts
PTSRC = 2
CUTOUT_PIX = 22 # 22 px × 1.41 arcmin/px ≈ 31 arcmin
PIXEL_RES = 1.41 # arcmin / pixel
SNR_BINS = [
(5, 10, "5–10σ"),
(10, 20, "10–20σ"),
(20, None, "≥20σ"),
]
PROJECT_ROOT = Path("/home/apb86/cmb_foregrounds_diffusion")
PATCHES_DIR = Path(f"data/low_pass/{PTSRC}mJy")
OUT_DIR = Path("tsz_extracts")
OUT_DIR.mkdir(exist_ok=True)
2 Load and denormalise maps¶
The DDPM samples are already in normalised space; denormalise to µK before stacking so that profile amplitudes are physically meaningful. Agora maps are also denormalised from z-score to µK using the training-set mean and standard deviation.
[ ]:
cib_maps = np.load(PATCHES_DIR / f"CIB_map_150GHz_256_st6_minmax_{PTSRC}mJy_zero_lp.npy")
tsz_maps = np.load(PATCHES_DIR / f"tSZ3_map_150GHz_256_st6_minmax_{PTSRC}mJy_norm_lp.npy")
ddpm_raw = np.load(PROJECT_ROOT / "data" / "low_pass" / f"{PTSRC}mJy" / f"new_samples_cib_tsz_{PTSRC}mJy_zero_norm_6x6_w_au_lp.npy")
gauss_maps = np.load(PATCHES_DIR / f"gaussian_cib_tsz_{PTSRC}mJy_lp.npy")
# Collect pixel values from N_SAMPLES maps per source
agora_cib_px = cib_maps[:, :, :, 0].ravel()
agora_tsz_px = tsz_maps[:, :, :, 0].ravel()
train_maps = np.concatenate([cib_maps, tsz_maps], axis=-1) # (N, H, W, 2)
print(f"train_maps shape: {train_maps.shape}")
print(f"ddpm_raw shape: {ddpm_raw.shape}")
# Load saved normalisation parameters from patch extraction
norm_params = np.load(PATCHES_DIR / f"norm_params_{PTSRC}mJy.npy")
cib_mean, cib_std, tsz_mean, tsz_std = norm_params
print(f"CIB — mean: {cib_mean:.4f} std: {cib_std:.4f}")
print(f"tSZ — mean: {tsz_mean:.4f} std: {tsz_std:.4f}")
ddpm_renorm = denormalize_dm_maps(ddpm_raw, cib_mean, cib_std, tsz_mean, tsz_std)
print(f"Denormalised DDPM — CIB range: [{ddpm_renorm[:,0].min():.4f}, {ddpm_renorm[:,0].max():.4f}]")
print(f"Denormalised DDPM — tSZ range: [{ddpm_renorm[:,1].min():.4f}, {ddpm_renorm[:,1].max():.4f}]")
agora_tsz = tsz_maps[:, :, :, 0] # (N, H, W)
ddpm_tsz = ddpm_renorm[:, 1] # (N, H, W)
print(f"Agora tSZ maps: {agora_tsz.shape}, DDPM tSZ maps: {ddpm_tsz.shape}")
3 Identify SNR pixels and extract cutouts¶
select_snr_pixels smooths the tSZ map with a matched filter (or a simple Gaussian of width θ_beam) and returns pixel coordinates where the S/N exceeds the given thresholds. extract_cutouts cuts out CUTOUT_PIX × CUTOUT_PIX patches centred on each selected pixel from every map in the stack. Cutouts that would extend beyond the patch boundary are discarded.
The expected tSZ profile is a negative decrement (in µK at 150 GHz) with a central minimum and a positive ring from relativistic corrections — the DDPM’s ability to reproduce this profile shape is a non-trivial test of whether it has learnt realistic cluster morphology.
[ ]:
# select_snr_pixels(maps_nhw, snr_min, snr_max):
# Smooths each map with a matched filter and returns the (row, col) coordinates
# of pixels where signal-to-noise falls in [snr_min, snr_max].
# Uses only the first map in the stack as the detection template — the same
# coordinates are then used to extract cutouts from all maps.
#
# extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts):
# Returns (N_cutouts, cutout_size, cutout_size) array; coordinates that
# would fall outside the map boundary are discarded.
agora_stacks, ddpm_stacks = {}, {}
for smin, smax, label in SNR_BINS:
coords = select_snr_pixels(agora_tsz, smin, smax)
a_cuts = extract_cutouts(agora_tsz, coords, CUTOUT_PIX)
d_cuts = extract_cutouts(ddpm_tsz, coords, CUTOUT_PIX)
agora_stacks[label] = a_cuts.mean(axis=0) if a_cuts is not None else None
ddpm_stacks[label] = d_cuts.mean(axis=0) if d_cuts is not None else None
print(f"{label}: stacked {len(a_cuts) if a_cuts is not None else 0} Agora / "
f"{len(d_cuts) if d_cuts is not None else 0} DDPM cutouts")
np.save(OUT_DIR / "agora_tsz_stacks.npy", {k: v for k, v in agora_stacks.items()}, allow_pickle=True)
np.save(OUT_DIR / "ddpm_tsz_stacks.npy", {k: v for k, v in ddpm_stacks.items()}, allow_pickle=True)
4 Compute mean stacked profiles and radial averages¶
Average the cutout stack over all cutouts to produce the mean stacked image. Then compute the radial profile using radial_profile (from flatmaps) binned in steps of one pixel width. The profile is compared between Agora and DDPM to assess whether the DDPM generates clusters of the correct amplitude and angular extent.
[7]:
# select_snr_pixels(maps_nhw, snr_min, snr_max):
# Smooths each map with a matched filter and returns the (row, col) coordinates
# of pixels where signal-to-noise falls in [snr_min, snr_max].
# Uses only the first map in the stack as the detection template — the same
# coordinates are then used to extract cutouts from all maps.
#
# extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts):
# Returns (N_cutouts, cutout_size, cutout_size) array; coordinates that
# would fall outside the map boundary are discarded.
"""
def extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts=500):
"""Extract cutouts around pixel_coords from all maps."""
half = cutout_size // 2
cutouts = []
coords = pixel_coords[:max_cutouts] # cap to avoid memory issues
for i, m in enumerate(maps_nhw):
for (ri, rj) in coords:
ri0, ri1 = ri - half, ri + half
rj0, rj1 = rj - half, rj + half
if ri0 < 0 or ri1 > m.shape[0] or rj0 < 0 or rj1 > m.shape[1]:
continue
cutouts.append(m[ri0:ri1, rj0:rj1])
return np.array(cutouts, dtype=np.float32) if cutouts else None
agora_stacks, ddpm_stacks = {}, {}
for smin, smax, label in SNR_BINS:
coords = select_snr_pixels(agora_tsz, smin, smax)
a_cuts = extract_cutouts(agora_tsz, coords, CUTOUT_PIX)
d_cuts = extract_cutouts(ddpm_tsz, coords, CUTOUT_PIX)
agora_stacks[label] = a_cuts.mean(axis=0) if a_cuts is not None else None
ddpm_stacks[label] = d_cuts.mean(axis=0) if d_cuts is not None else None
n_a = len(a_cuts) if a_cuts is not None else 0
n_d = len(d_cuts) if d_cuts is not None else 0
print(f"{label}: stacked {n_a} Agora / {n_d} DDPM cutouts")
np.save(OUT_DIR / "agora_tsz_stacks.npy", {k: v for k, v in agora_stacks.items()}, allow_pickle=True)
np.save(OUT_DIR / "ddpm_tsz_stacks.npy", {k: v for k, v in ddpm_stacks.items()}, allow_pickle=True)
"""
Cell In[7], line 3
"""Extract cutouts around pixel_coords from all maps."""
^
SyntaxError: invalid syntax
5 Plot stacked images and radial profiles¶
Display the mean 2D stacked image for each S/N bin (left column) alongside the 1D radial profile (right column) for Agora and DDPM side by side. Error bars show the standard error on the mean across the cutout ensemble.
[ ]:
# Build coordinate grids (degrees) for the cutout
HALF = CUTOUT_PIX // 2
pix_deg = PIXEL_RES / 60.
xi = (np.arange(CUTOUT_PIX) - HALF) * pix_deg
xgrid, ygrid = np.meshgrid(xi, xi)
fig, axes = plt.subplots(len(SNR_BINS), 3, figsize=(14, 4 * len(SNR_BINS)))
for row, (smin, smax, label) in enumerate(SNR_BINS):
a_stack = agora_stacks[label]
d_stack = ddpm_stacks[label]
if a_stack is None or d_stack is None:
continue
# Radial profiles: bin_size=1', maxbin=10', to_arcmins converts degrees→arcmin
prof_a = radial_profile(a_stack, xy=(xgrid, ygrid), bin_size=1., minbin=0., maxbin=10., to_arcmins=1)
prof_d = radial_profile(d_stack, xy=(xgrid, ygrid), bin_size=1., minbin=0., maxbin=10., to_arcmins=1)
theta = prof_a[:, 0] # arcmin
ax_img, ax_prof, ax_res = axes[row]
im = ax_img.imshow(a_stack - d_stack, origin="lower", cmap="RdBu_r")
ax_img.set_title(f"{label} — Agora − DDPM stack")
plt.colorbar(im, ax=ax_img)
ax_prof.plot(theta, prof_a[:, 1], label="Agora", color="C0")
ax_prof.fill_between(theta, prof_a[:,1]-prof_a[:,2], prof_a[:,1]+prof_a[:,2], alpha=0.3, color="C0")
ax_prof.plot(theta, prof_d[:, 1], label="DDPM", color="C1")
ax_prof.fill_between(theta, prof_d[:,1]-prof_d[:,2], prof_d[:,1]+prof_d[:,2], alpha=0.3, color="C1")
ax_prof.set_xlabel(r"$\theta$ (arcmin)"); ax_prof.set_ylabel(r"$\langle T \rangle$")
ax_prof.set_title(f"{label} — radial profile"); ax_prof.legend()
ratio = prof_d[:, 1] / (prof_a[:, 1] + 1e-30)
ax_res.plot(theta, ratio)
ax_res.axhline(1., color="k", ls="--", lw=0.8)
ax_res.set_xlabel(r"$\theta$ (arcmin)"); ax_res.set_ylabel("DDPM / Agora")
ax_res.set_title(f"{label} — ratio")
plt.tight_layout()
Path("plots").mkdir(exist_ok=True)
plt.savefig("plots/tsz_stacks_radial_profile.pdf", bbox_inches="tight")
plt.show()
[ ]: