09 — tSZ Cluster Stacking

Purpose: Compare tSZ cluster profiles between Agora and DDPM samples using a signal-to-noise–based stacking approach.

This notebook identifies cluster locations by SNR thresholding, stacks cutouts from both Agora and DDPM maps, and computes radial profiles to quantify agreement:

  1. SNR selection — pixels satisfying A·σ < |T − T̄| ≤ B·σ are selected in three bins: (a) 5–10σ (~263k Agora clusters), (b) 10–20σ (~60k), (c) ≥20σ (~3.9k).

  2. Cutout extraction — extracts 22×22 pixel (≈31’) cutouts centred on each selected pixel from both the Agora and DDPM tSZ maps.

  3. Stacking — uniformly weighted average of all cutouts per SNR bin, producing a mean cluster profile image Ŝ(θx, θy).

  4. Radial profiles — converts the 2D stacked images to 1D profiles using radial_profile, with 10 linearly spaced bins of Δθ = 1’ out to θ_max = 10’. Plots the profile, Agora–DDPM difference, and their ratio.

Inputs:

  • Test maps and DDPM samples: data/low_pass/2mJy/*.npy

Outputs:

  • Stacked cutout arrays: tsz_extracts/agora_tsz_stacks.npy, tsz_extracts/ddpm_tsz_stacks.npy

  • Radial profile plots: plots/tsz_stacks.pdf, plots/tsz_stacks_radial_profile.pdf (Figure 3)

Key module functions:

  • foregrounds_diffusion.flatmaps.radial_profile

Paper reference: §4.2 (Figure 3 — stacked tSZ radial profiles in three SNR bins).

1 Configuration

Stacking parameters:

  • SNR_BINS: three S/N intervals [3,6], [6,9], [9,∞] used to separate clusters by detection significance. Higher-S/N bins probe more massive clusters and produce cleaner stacked profiles.

  • CUTOUT_PIX = 40: cutout width of 40 pixels = 56′ centred on each peak. This captures the full tSZ profile (typically ≈ 5–10′ radius for the cluster population at z ~ 0.5) with enough padding for the radial profile estimator.

  • PIXEL_RES = 1.40625′: physical scale of the cutout pixels, used to convert pixel offsets to angular separations for the radial profile.

[ ]:
# select_snr_pixels(maps_nhw, snr_min, snr_max):
#   Smooths each map with a matched filter and returns the (row, col) coordinates
#   of pixels where signal-to-noise falls in [snr_min, snr_max].
#   Uses only the first map in the stack as the detection template — the same
#   coordinates are then used to extract cutouts from all maps.
#
# extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts):
#   Returns (N_cutouts, cutout_size, cutout_size) array; coordinates that
#   would fall outside the map boundary are discarded.
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

from foregrounds_diffusion.flatmaps import radial_profile
from foregrounds_diffusion.preprocessing import renormalize_dm_maps, denormalize_dm_maps
from foregrounds_diffusion.stacking import select_snr_pixels, extract_cutouts

PTSRC      = 2
CUTOUT_PIX = 22       # 22 px × 1.41 arcmin/px ≈ 31 arcmin
PIXEL_RES  = 1.41     # arcmin / pixel

SNR_BINS = [
    (5,  10,  "5–10σ"),
    (10, 20,  "10–20σ"),
    (20, None, "≥20σ"),
]

PROJECT_ROOT = Path("/home/apb86/cmb_foregrounds_diffusion")
PATCHES_DIR = Path(f"data/low_pass/{PTSRC}mJy")

OUT_DIR = Path("tsz_extracts")
OUT_DIR.mkdir(exist_ok=True)

2 Load and denormalise maps

The DDPM samples are already in normalised space; denormalise to µK before stacking so that profile amplitudes are physically meaningful. Agora maps are also denormalised from z-score to µK using the training-set mean and standard deviation.

[ ]:
cib_maps    = np.load(PATCHES_DIR / f"CIB_map_150GHz_256_st6_minmax_{PTSRC}mJy_zero_lp.npy")
tsz_maps    = np.load(PATCHES_DIR / f"tSZ3_map_150GHz_256_st6_minmax_{PTSRC}mJy_norm_lp.npy")
ddpm_raw    = np.load(PROJECT_ROOT / "data" / "low_pass" / f"{PTSRC}mJy" / f"new_samples_cib_tsz_{PTSRC}mJy_zero_norm_6x6_w_au_lp.npy")
gauss_maps  = np.load(PATCHES_DIR / f"gaussian_cib_tsz_{PTSRC}mJy_lp.npy")

# Collect pixel values from N_SAMPLES maps per source
agora_cib_px  = cib_maps[:, :, :, 0].ravel()
agora_tsz_px  = tsz_maps[:, :, :, 0].ravel()

train_maps = np.concatenate([cib_maps, tsz_maps], axis=-1)  # (N, H, W, 2)
print(f"train_maps shape: {train_maps.shape}")
print(f"ddpm_raw shape:   {ddpm_raw.shape}")

# Load saved normalisation parameters from patch extraction
norm_params = np.load(PATCHES_DIR / f"norm_params_{PTSRC}mJy.npy")
cib_mean, cib_std, tsz_mean, tsz_std = norm_params
print(f"CIB — mean: {cib_mean:.4f}  std: {cib_std:.4f}")
print(f"tSZ — mean: {tsz_mean:.4f}  std: {tsz_std:.4f}")

ddpm_renorm = denormalize_dm_maps(ddpm_raw, cib_mean, cib_std, tsz_mean, tsz_std)
print(f"Denormalised DDPM — CIB range: [{ddpm_renorm[:,0].min():.4f}, {ddpm_renorm[:,0].max():.4f}]")
print(f"Denormalised DDPM — tSZ range: [{ddpm_renorm[:,1].min():.4f}, {ddpm_renorm[:,1].max():.4f}]")

agora_tsz = tsz_maps[:, :, :, 0]   # (N, H, W)
ddpm_tsz  = ddpm_renorm[:, 1]         # (N, H, W)
print(f"Agora tSZ maps: {agora_tsz.shape},  DDPM tSZ maps: {ddpm_tsz.shape}")

3 Identify SNR pixels and extract cutouts

select_snr_pixels smooths the tSZ map with a matched filter (or a simple Gaussian of width θ_beam) and returns pixel coordinates where the S/N exceeds the given thresholds. extract_cutouts cuts out CUTOUT_PIX × CUTOUT_PIX patches centred on each selected pixel from every map in the stack. Cutouts that would extend beyond the patch boundary are discarded.

The expected tSZ profile is a negative decrement (in µK at 150 GHz) with a central minimum and a positive ring from relativistic corrections — the DDPM’s ability to reproduce this profile shape is a non-trivial test of whether it has learnt realistic cluster morphology.

[ ]:
# select_snr_pixels(maps_nhw, snr_min, snr_max):
#   Smooths each map with a matched filter and returns the (row, col) coordinates
#   of pixels where signal-to-noise falls in [snr_min, snr_max].
#   Uses only the first map in the stack as the detection template — the same
#   coordinates are then used to extract cutouts from all maps.
#
# extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts):
#   Returns (N_cutouts, cutout_size, cutout_size) array; coordinates that
#   would fall outside the map boundary are discarded.
agora_stacks, ddpm_stacks = {}, {}

for smin, smax, label in SNR_BINS:
    coords = select_snr_pixels(agora_tsz, smin, smax)
    a_cuts = extract_cutouts(agora_tsz, coords, CUTOUT_PIX)
    d_cuts = extract_cutouts(ddpm_tsz,  coords, CUTOUT_PIX)
    agora_stacks[label] = a_cuts.mean(axis=0) if a_cuts is not None else None
    ddpm_stacks[label]  = d_cuts.mean(axis=0) if d_cuts is not None else None
    print(f"{label}: stacked {len(a_cuts) if a_cuts is not None else 0} Agora / "
          f"{len(d_cuts) if d_cuts is not None else 0} DDPM cutouts")

np.save(OUT_DIR / "agora_tsz_stacks.npy", {k: v for k, v in agora_stacks.items()}, allow_pickle=True)
np.save(OUT_DIR / "ddpm_tsz_stacks.npy",  {k: v for k, v in ddpm_stacks.items()},  allow_pickle=True)

4 Compute mean stacked profiles and radial averages

Average the cutout stack over all cutouts to produce the mean stacked image. Then compute the radial profile using radial_profile (from flatmaps) binned in steps of one pixel width. The profile is compared between Agora and DDPM to assess whether the DDPM generates clusters of the correct amplitude and angular extent.

[7]:
# select_snr_pixels(maps_nhw, snr_min, snr_max):
#   Smooths each map with a matched filter and returns the (row, col) coordinates
#   of pixels where signal-to-noise falls in [snr_min, snr_max].
#   Uses only the first map in the stack as the detection template — the same
#   coordinates are then used to extract cutouts from all maps.
#
# extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts):
#   Returns (N_cutouts, cutout_size, cutout_size) array; coordinates that
#   would fall outside the map boundary are discarded.
"""
def extract_cutouts(maps_nhw, pixel_coords, cutout_size, max_cutouts=500):
    """Extract cutouts around pixel_coords from all maps."""
    half = cutout_size // 2
    cutouts = []
    coords = pixel_coords[:max_cutouts]  # cap to avoid memory issues
    for i, m in enumerate(maps_nhw):
        for (ri, rj) in coords:
            ri0, ri1 = ri - half, ri + half
            rj0, rj1 = rj - half, rj + half
            if ri0 < 0 or ri1 > m.shape[0] or rj0 < 0 or rj1 > m.shape[1]:
                continue
            cutouts.append(m[ri0:ri1, rj0:rj1])
    return np.array(cutouts, dtype=np.float32) if cutouts else None

agora_stacks, ddpm_stacks = {}, {}
for smin, smax, label in SNR_BINS:
    coords = select_snr_pixels(agora_tsz, smin, smax)
    a_cuts = extract_cutouts(agora_tsz, coords, CUTOUT_PIX)
    d_cuts = extract_cutouts(ddpm_tsz,  coords, CUTOUT_PIX)
    agora_stacks[label] = a_cuts.mean(axis=0) if a_cuts is not None else None
    ddpm_stacks[label]  = d_cuts.mean(axis=0) if d_cuts is not None else None
    n_a = len(a_cuts) if a_cuts is not None else 0
    n_d = len(d_cuts) if d_cuts is not None else 0
    print(f"{label}: stacked {n_a} Agora / {n_d} DDPM cutouts")

np.save(OUT_DIR / "agora_tsz_stacks.npy", {k: v for k, v in agora_stacks.items()}, allow_pickle=True)
np.save(OUT_DIR / "ddpm_tsz_stacks.npy",  {k: v for k, v in ddpm_stacks.items()},  allow_pickle=True)
"""
  Cell In[7], line 3
    """Extract cutouts around pixel_coords from all maps."""
       ^
SyntaxError: invalid syntax

5 Plot stacked images and radial profiles

Display the mean 2D stacked image for each S/N bin (left column) alongside the 1D radial profile (right column) for Agora and DDPM side by side. Error bars show the standard error on the mean across the cutout ensemble.

[ ]:
# Build coordinate grids (degrees) for the cutout
HALF    = CUTOUT_PIX // 2
pix_deg = PIXEL_RES / 60.
xi      = (np.arange(CUTOUT_PIX) - HALF) * pix_deg
xgrid, ygrid = np.meshgrid(xi, xi)

fig, axes = plt.subplots(len(SNR_BINS), 3, figsize=(14, 4 * len(SNR_BINS)))

for row, (smin, smax, label) in enumerate(SNR_BINS):
    a_stack = agora_stacks[label]
    d_stack = ddpm_stacks[label]
    if a_stack is None or d_stack is None:
        continue

    # Radial profiles: bin_size=1', maxbin=10', to_arcmins converts degrees→arcmin
    prof_a = radial_profile(a_stack, xy=(xgrid, ygrid), bin_size=1., minbin=0., maxbin=10., to_arcmins=1)
    prof_d = radial_profile(d_stack, xy=(xgrid, ygrid), bin_size=1., minbin=0., maxbin=10., to_arcmins=1)
    theta  = prof_a[:, 0]   # arcmin

    ax_img, ax_prof, ax_res = axes[row]

    im = ax_img.imshow(a_stack - d_stack, origin="lower", cmap="RdBu_r")
    ax_img.set_title(f"{label} — Agora − DDPM stack")
    plt.colorbar(im, ax=ax_img)

    ax_prof.plot(theta, prof_a[:, 1], label="Agora", color="C0")
    ax_prof.fill_between(theta, prof_a[:,1]-prof_a[:,2], prof_a[:,1]+prof_a[:,2], alpha=0.3, color="C0")
    ax_prof.plot(theta, prof_d[:, 1], label="DDPM",  color="C1")
    ax_prof.fill_between(theta, prof_d[:,1]-prof_d[:,2], prof_d[:,1]+prof_d[:,2], alpha=0.3, color="C1")
    ax_prof.set_xlabel(r"$\theta$ (arcmin)");  ax_prof.set_ylabel(r"$\langle T \rangle$")
    ax_prof.set_title(f"{label} — radial profile");  ax_prof.legend()

    ratio = prof_d[:, 1] / (prof_a[:, 1] + 1e-30)
    ax_res.plot(theta, ratio)
    ax_res.axhline(1., color="k", ls="--", lw=0.8)
    ax_res.set_xlabel(r"$\theta$ (arcmin)");  ax_res.set_ylabel("DDPM / Agora")
    ax_res.set_title(f"{label} — ratio")

plt.tight_layout()
Path("plots").mkdir(exist_ok=True)
plt.savefig("plots/tsz_stacks_radial_profile.pdf", bbox_inches="tight")
plt.show()

[ ]: