专题
返回时间轴
2025-05-01

Anthropic activates ASL-3 safeguards, creating the first operational AI safety framework tied to capability thresholds

Capability Breakthrough

事件摘要

Anthropic activated ASL-3 safety protocol in 2025 — the first operational AI safety framework tied to capability thresholds. How frontier labs are building structured safety responses.

影响评估

  • Paradigm Shift +2 · Long-term

    First operational AI safety framework tied to concrete capability thresholds. Created a precedent that safety measures should scale with model capabilities. Influenced regulatory frameworks including SB 53 and EU AI Act implementation.

    Affected Groups: AI safety researchers, AI labs, policymakers, regulators

  • Risk Creation -1 · Medium-term

    Critics argued thresholds were self-defined and self-enforced, lacking independent oversight. Raised questions about whether AI labs could credibly self-regulate capability thresholds.

    Affected Groups: public, ethicists, regulators

共识度与来源

重要度 L1
分类 Capability Breakthrough
共识度 Emerging Consensus
影响指数 5/10
  • 1

    URL: https://www.anthropic.com/news/responsible-scaling-policy-v3

    We activated ASL-3 safeguards for relevant models in May 2025 and have been working to improve them ever since.
    Reference Evidence Citation logged Live source