Research index · 1 note · 2025–2026

Notes from an independent AI safety practice.

Misalignment write-ups, eval-design pieces, and occasional method notes. Most carry a Hugging Face artefact or a public repo; everything is reproducible.

01 · RESEARCH NOTE · 21 May 2026

Same scenario, two different deceptions: how o3 and GPT-5 diverged from a single elicitation.

Frontier model evals · Palisade Bounty