“Explicitly approve model behavior for each launch, weighing both quantitative and qualitative signals: We’ll adjust our safety review process to formally consider behavior issues—such as hallucination, deception, reliability, and personality—as blocking concerns. Even if these issues aren’t perfectly quantifiable today, we commit to blocking launches based on proxy measurements or qualitative signals, even when metrics like ”
Source: Expanding on what we missed with sycophancy | OpenAI
Created at 8/8/2025, 10:07:08 AM