10 — Peak and Minima Counts¶
Purpose: Evaluate non-Gaussian structure in CIB and tSZ patches using peak and minima counts following Sabyr, Hill & Haiman (2024), arXiv:2410.21247.
This notebook computes counts of local maxima (peaks) and local minima as a function of amplitude threshold ν = T/σ for Agora, DDPM, and Gaussian baseline maps. This statistic is sensitive to the abundance of clusters and voids at different signal levels and has been shown to be highly informative for cosmological inference.
The pipeline:
Smooth patches at three angular scales θ_s = 1, 2.5, 5 arcmin
Identify local maxima and minima using scipy’s maximum/minimum filter
Bin counts as a function of ν ∈ [-4σ, 4σ] in 30 bins
Compare Agora, DDPM, and Gaussian distributions
No additional packages required beyond numpy and scipy.
Inputs:
Test maps:
data/low_pass/2mJy/*.npyDDPM samples:
data/low_pass/2mJy/new_samples_*.npyNorm params:
data/low_pass/2mJy/norm_params_2mJy.npy
Outputs: plots/peak_minima_counts.pdf
Key module functions:
foregrounds_diffusion.peak_counts.compute_peak_minima_counts
Paper reference: Sabyr, Hill & Haiman (2024), arXiv:2410.21247
─────────────────────────────────────────────────────────────────────────────¶
Imports and physical constants.¶
pixel_res_arcmin is used by the Gaussian smoothing kernel inside¶
compute_peak_minima_counts to convert FWHM in arcmin to pixels.¶
─────────────────────────────────────────────────────────────────────────────¶
1 Configuration¶
The peak and minima count statistic follows Sabyr, Hill & Haiman (2024, arXiv:2410.21247). Maps are smoothed at three angular scales before peak finding to probe structure at different physical sizes:
θ_s = 1′ — angular scale of typical tSZ cluster cores at z ~ 0.5.
θ_s = 2.5′ — intermediate scale probing extended cluster gas profiles.
θ_s = 5′ — large-scale SZ signal from merging clusters and the warm-hot intergalactic medium (WHIM).
Threshold arrays THRESHOLDS_PEAKS and THRESHOLDS_MINIMA are in units of the per-patch standard deviation σ (i.e. ν = T/σ), so they are directly comparable across different sample populations with different normalisation.
[ ]:
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from foregrounds_diffusion.peak_counts import compute_peak_minima_counts
PTSRC = 2
PIXEL_RES_ARCMIN = 1.40625 # 6 deg / 256 px
N_MAPS = 120 # number of test maps to use
# Smoothing scales matching Sabyr et al. (2024)
SMOOTHING_SCALES = [1.0, 2.5, 5.0] # arcmin FWHM
# Threshold ranges in units of σ
THRESHOLDS_PEAKS = np.linspace(-1, 5, 30)
THRESHOLDS_MINIMA = np.linspace(-5, 1, 30)
PATCHES_DIR = Path(f'data/low_pass/{PTSRC}mJy')
PROJECT_ROOT = Path('/home/apb86/cmb_foregrounds_diffusion')
Load normalised CIB (_zero suffix = min-max to [0,1]) and tSZ (_norm suffix¶
= z-score). norm_params stores (cib_mean, cib_std, tsz_mean, tsz_std) in¶
physical µK, computed over the full training set during patch extraction.¶
2 Load and denormalise maps¶
Load normalised patches and the DDPM samples. Denormalise CIB from min-max (divide by max − min, then multiply back) and tSZ from z-score (multiply by σ and add µ). Both channels must be in the same physical units (µK) so that the per-patch threshold ν = T/σ is meaningful and comparable.
[ ]:
# ---------------------------------------------------------------------------
# Load and denormalise maps
# ---------------------------------------------------------------------------
norm_params = np.load(PATCHES_DIR / f'norm_params_{PTSRC}mJy.npy')
cib_mean, cib_std, tsz_mean, tsz_std = norm_params
print(f'CIB — mean: {cib_mean:.4f} µK std: {cib_std:.4f} µK')
print(f'tSZ — mean: {tsz_mean:.4f} µK std: {tsz_std:.4f} µK')
cib_maps = np.load(PATCHES_DIR / f'CIB_map_150GHz_256_st6_zscore_{PTSRC}mJy_lp.npy')
tsz_maps = np.load(PATCHES_DIR / f'tSZ3_map_150GHz_256_st6_zscore_{PTSRC}mJy_lp.npy')
# 80/20 split (same seed as training)
rng = np.random.default_rng(seed=42)
indices = rng.permutation(len(cib_maps))
test_idx = indices[int(0.8 * len(cib_maps)):]
# Denormalise to physical units (µK)
cib_test = cib_maps[test_idx, :, :, 0] * cib_std + cib_mean # (N, H, W)
tsz_test = tsz_maps[test_idx, :, :, 0] * tsz_std + tsz_mean
print(f'Test set: {len(cib_test)} patches')
print(f'CIB pixel range: [{cib_test.min():.2f}, {cib_test.max():.2f}] µK')
print(f'tSZ pixel range: [{tsz_test.min():.2f}, {tsz_test.max():.2f}] µK')
DDPM samples are stored channels-first: shape (N, 2, H, W).¶
Gaussian baseline maps (channels-first, same shape) were generated by¶
drawing realisations from the Agora-measured power spectrum.¶
Denormalise both back to µK before computing statistics.¶
3 Load DDPM samples and Gaussian baseline¶
The Gaussian baseline is a set of maps generated by drawing independent Fourier coefficients from the AGORA power spectrum (via make_gaussian_realisation). These serve as a null hypothesis: a model that matches only the two-point statistics. Any excess in Agora peak counts relative to the Gaussian baseline reflects non-Gaussian structure (tSZ cluster peaks, CIB source clustering).
[ ]:
# Load DDPM samples and denormalise
ddpm_raw = np.load(
PROJECT_ROOT / 'data' / 'low_pass' / f'{PTSRC}mJy' /
f'new_samples_cib_tsz_{PTSRC}mJy_zero_norm_6x6_w_au_lp.npy'
) # (N, 2, H, W)
ddpm_cib = ddpm_raw[:, 0] * cib_std + cib_mean # (N, H, W)
ddpm_tsz = ddpm_raw[:, 1] * tsz_std + tsz_mean
print(f'DDPM samples: {len(ddpm_cib)} patches')
# Load Gaussian baseline
gauss_maps = np.load(PATCHES_DIR / f'gaussian_cib_tsz_{PTSRC}mJy_lp.npy') # (N, 2, H, W)
gauss_cib = gauss_maps[:, 0] if gauss_maps.shape[1] == 2 else gauss_maps[:, :, :, 0]
gauss_tsz = gauss_maps[:, 1] if gauss_maps.shape[1] == 2 else gauss_maps[:, :, :, 1]
print(f'Gaussian samples: {len(gauss_cib)} patches')
compute_peak_minima_counts:¶
maps_nhw : (N, H, W) array in physical units (µK)¶
thresholds_peaks : 1D array of ν = T/σ values for peak binning¶
thresholds_minima : 1D array of ν values for minima binning (typically < 0)¶
smoothing_scales_arcmin: list of FWHM values for the pre-smoothing Gaussian¶
Returns dict keyed by scale → {‘peaks’: (N, T), ‘minima’: (N, T)}¶
4 Compute peak and minima counts¶
compute_peak_minima_counts returns a nested dict keyed by smoothing scale:
result = {
1.0: {'peaks': ndarray(N, 30), 'minima': ndarray(N, 30)},
2.5: {'peaks': ..., 'minima': ...},
5.0: {'peaks': ..., 'minima': ...},
}
Each (N, 30) array gives the count of local maxima/minima in threshold bin j for map i, normalised by the patch area in arcmin². The computation takes ≈ 5–10 min for 120 maps × 3 scales; reduce N_MAPS for a quick test.
[ ]:
# ---------------------------------------------------------------------------
# Compute peak and minima counts at all smoothing scales
# NOTE: This cell takes ~5-10 minutes for 120 maps × 3 scales
# ---------------------------------------------------------------------------
N = min(N_MAPS, len(cib_test), len(ddpm_cib), len(gauss_cib))
print(f'Using {N} maps per source')
print('\nComputing Agora CIB counts...')
agora_cib_counts = compute_peak_minima_counts(
cib_test[:N], THRESHOLDS_PEAKS, THRESHOLDS_MINIMA,
smoothing_scales_arcmin=SMOOTHING_SCALES)
print('\nComputing Agora tSZ counts...')
agora_tsz_counts = compute_peak_minima_counts(
tsz_test[:N], THRESHOLDS_PEAKS, THRESHOLDS_MINIMA,
smoothing_scales_arcmin=SMOOTHING_SCALES)
print('\nComputing DDPM CIB counts...')
ddpm_cib_counts = compute_peak_minima_counts(
ddpm_cib[:N], THRESHOLDS_PEAKS, THRESHOLDS_MINIMA,
smoothing_scales_arcmin=SMOOTHING_SCALES)
print('\nComputing DDPM tSZ counts...')
ddpm_tsz_counts = compute_peak_minima_counts(
ddpm_tsz[:N], THRESHOLDS_PEAKS, THRESHOLDS_MINIMA,
smoothing_scales_arcmin=SMOOTHING_SCALES)
print('\nComputing Gaussian CIB counts...')
gauss_cib_counts = compute_peak_minima_counts(
gauss_cib[:N], THRESHOLDS_PEAKS, THRESHOLDS_MINIMA,
smoothing_scales_arcmin=SMOOTHING_SCALES)
print('\nComputing Gaussian tSZ counts...')
gauss_tsz_counts = compute_peak_minima_counts(
gauss_tsz[:N], THRESHOLDS_PEAKS, THRESHOLDS_MINIMA,
smoothing_scales_arcmin=SMOOTHING_SCALES)
print('Done.')
5 Plot peak and minima counts¶
Six-panel grid: three rows (smoothing scales θ_s = 1, 2.5, 5′) × four columns (CIB peaks, CIB minima, tSZ peaks, tSZ minima). Agora (blue solid), DDPM (orange solid), and Gaussian (green dashed). Error bands show the standard error on the mean across the N test maps.
The key diagnostic is whether the DDPM’s peak count distribution falls between the Agora and Gaussian curves (partial non-Gaussianity recovery) or matches Agora (full recovery). For tSZ at small scales, a prominent peak excess above the Gaussian baseline traces the cluster mass function.
[ ]:
# ---------------------------------------------------------------------------
# Plot: peaks and minima for each smoothing scale, CIB and tSZ
# ---------------------------------------------------------------------------
fig, axes = plt.subplots(
len(SMOOTHING_SCALES), 4,
figsize=(18, 4 * len(SMOOTHING_SCALES))
)
sources = [
('Agora', 'C0', '-', agora_cib_counts, agora_tsz_counts),
('DDPM', 'C1', '-', ddpm_cib_counts, ddpm_tsz_counts),
('Gaussian', 'C2', '--', gauss_cib_counts, gauss_tsz_counts),
]
col_titles = [
'CIB Peaks', 'CIB Minima',
'tSZ Peaks', 'tSZ Minima',
]
for row, fwhm in enumerate(SMOOTHING_SCALES):
for col, (title, counts_cib, counts_tsz, thresholds) in enumerate([
('CIB Peaks', 'peaks', 'peaks', THRESHOLDS_PEAKS),
('CIB Minima', 'minima', 'minima', THRESHOLDS_MINIMA),
('tSZ Peaks', 'peaks', 'peaks', THRESHOLDS_PEAKS),
('tSZ Minima', 'minima', 'minima', THRESHOLDS_MINIMA),
]):
ax = axes[row, col]
is_tsz = col >= 2
stat = 'peaks' if col % 2 == 0 else 'minima'
thresh = THRESHOLDS_PEAKS if col % 2 == 0 else THRESHOLDS_MINIMA
for label, color, ls, cib_c, tsz_c in sources:
counts = tsz_c[fwhm][stat] if is_tsz else cib_c[fwhm][stat]
mean = counts.mean(axis=0)
std = counts.std(axis=0)
ax.plot(thresh, mean, color=color, ls=ls, lw=1.5, label=label)
ax.fill_between(thresh, mean - std, mean + std,
alpha=0.2, color=color)
ax.set_xlabel(r'Threshold $\nu = T/\sigma$')
ax.set_ylabel('Count per map')
ax.set_title(f'{col_titles[col]} — θ_s = {fwhm}\'')
if col == 0 and row == 0:
ax.legend(fontsize=9)
plt.suptitle('Peak and Minima Counts', fontsize=14, y=1.01)
plt.tight_layout()
Path('plots').mkdir(exist_ok=True)
plt.savefig('plots/peak_minima_counts.pdf', dpi=200, bbox_inches='tight')
plt.show()
6 Residuals: (Agora − DDPM) / σ_Agora¶
Standardised residuals reveal whether any excess in Agora peak counts is reproduced by the DDPM, in units of the Agora sample variance. Values within ±2 indicate agreement at the 2σ level. Systematic offsets at particular threshold values would identify where the DDPM’s cluster population statistics deviate from the AGORA simulation.
[ ]:
# ---------------------------------------------------------------------------
# Residuals: (Agora - DDPM) / sigma_Agora
# ---------------------------------------------------------------------------
fig, axes = plt.subplots(len(SMOOTHING_SCALES), 2, figsize=(12, 4 * len(SMOOTHING_SCALES)))
for row, fwhm in enumerate(SMOOTHING_SCALES):
for col, (label, agora_c, ddpm_c, thresh) in enumerate([
('CIB peaks', agora_cib_counts[fwhm]['peaks'],
ddpm_cib_counts[fwhm]['peaks'], THRESHOLDS_PEAKS),
('tSZ peaks', agora_tsz_counts[fwhm]['peaks'],
ddpm_tsz_counts[fwhm]['peaks'], THRESHOLDS_PEAKS),
]):
ax = axes[row, col]
sigma = agora_c.std(axis=0) + 1e-10
resid = (agora_c.mean(axis=0) - ddpm_c.mean(axis=0)) / sigma
ax.plot(thresh, resid, color='C0', lw=1.5)
ax.axhline(0, color='k', lw=0.8, ls='--')
ax.axhline(1, color='gray', lw=0.6, ls=':')
ax.axhline(-1, color='gray', lw=0.6, ls=':')
ax.set_xlabel(r'Threshold $\nu$')
ax.set_ylabel(r'$(N^\mathrm{Agora} - N^\mathrm{DDPM})/\sigma$')
ax.set_title(f'{label} residual — θ_s = {fwhm}\'')
plt.tight_layout()
plt.savefig('plots/peak_minima_residuals.pdf', dpi=200, bbox_inches='tight')
plt.show()