Blog - Paul Colognese

Buddhist Wisdom and the Challenge of AI Emotions

Talk Slides 14 Apr 2026

Automating AI Evaluation Production

My Experience 11 Dec 2025

Technical Post 28 Dec 2025

Old Blog

Automating AI Control Evaluations - Exploration Summary

22 January 2025
Ideal Responsible Scaling Policies

16 May 2024
Policy for Mitigating Catastrophic Risks from AI

16 May 2024
My Math PhD - Summary

28 March 2024
Aligned AI via monitoring objectives in AutoGPT-like systems

27 March 2024
Anomalous Concept Detection for Detecting Hidden Cognition

27 March 2024
Auditing games for high-level interpretability

27 March 2024
Deception?! I ain’t got time for that!

27 March 2024
Explaining the AI Alignment Problem to Tibetan Buddhist Monks

27 March 2024
Hidden Cognition Detection Methods and Benchmarks

27 March 2024
High-level interpretability: detecting an AI's objectives

27 March 2024
Internal Target Information for AI Oversight

27 March 2024
Kaczynski’s Self-Propagating Systems Theory and the Future of Humanity

27 March 2024
Notes on Internal Objectives in Toy Models of Agents

27 March 2024