2025-05-01
Anthropic activates ASL-3 safeguards, creating the first operational AI safety framework tied to capability thresholds
事件摘要
Anthropic activated ASL-3 safety protocol in 2025 — the first operational AI safety framework tied to capability thresholds. How frontier labs are building structured safety responses.
影响评估
-
Paradigm Shift +2 · Long-term
First operational AI safety framework tied to concrete capability thresholds. Created a precedent that safety measures should scale with model capabilities. Influenced regulatory frameworks including SB 53 and EU AI Act implementation.
Affected Groups: AI safety researchers, AI labs, policymakers, regulators
-
Risk Creation -1 · Medium-term
Critics argued thresholds were self-defined and self-enforced, lacking independent oversight. Raised questions about whether AI labs could credibly self-regulate capability thresholds.
Affected Groups: public, ethicists, regulators
共识度与来源
重要度
L1
分类
Capability Breakthrough
共识度
Emerging Consensus
影响指数
5/10
-
1
We activated ASL-3 safeguards for relevant models in May 2025 and have been working to improve them ever since.Reference Evidence Citation logged Live source