Speed up responses with fast mode

Original link: https://code.claude.com/docs/en/fast-mode

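## Claude fast mode: summary

Claude now offers a "fast mode" for the Opus 4.6 model that prioritizes speed over cost. It is not a new model but a different API configuration that returns responses faster while keeping the same quality and capabilities as standard Opus 4.6. Activate it with `/fast` in the Claude Code CLI or the VS Code extension. Pricing starts at $30/150 MTok ($30 per MTok of input, $150 per MTok of output), with a 50% discount until February 16. On subscription plans (Pro/Max/Team/Enterprise), fast mode is billed as extra usage and is not included in the standard subscription limits.

**Key information:**

* **Cost:** Per-token pricing is significantly higher. Enabling fast mode mid-conversation costs more than starting in fast mode.
* **When to use:** Interactive tasks such as rapid coding iteration and live debugging.
* **Not available:** On third-party cloud providers (Bedrock, Vertex AI, Azure).
* **Rate limits:** Fast mode has separate rate limits; exceeding them falls back to standard Opus 4.6.

Fast mode is currently in research preview, and features and pricing may change.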

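## Claude launches "fast mode" - summary

Claude has launched a "fast mode" that delivers faster responses at extra cost. Usage in this mode is billed to extra usage regardless of remaining plan allowance, and the price is significantly higher: roughly 6x the standard rate ($30/150 MTok). For now, users can claim $50 of extra usage credit to try it.

Discussion centers on *how* the speedup is achieved. Theories include priority boosting, new hardware (such as Groq or Cerebras), or quantization. Some speculate it could eventually become the standard offering, similar to the price cuts on Opus models. Commenters raise concern about a potential "dark pattern", slowing standard responses to push users toward fast mode, similar to past criticism of companies such as Apple. Users also want to know the actual speedup (reportedly about 2.5x, with some sources citing more) and whether it is merely queue-jumping or a genuine performance gain. Many users say the mode costs too much.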

Original text

Fast mode is in research preview. The feature, pricing, and availability may change based on feedback.

Fast mode delivers faster Opus 4.6 responses at a higher cost per token. Toggle it on with /fast when you need speed for interactive work like rapid iteration or live debugging, and toggle it off when cost matters more than latency.

Fast mode is not a different model. It uses the same Opus 4.6 with a different API configuration that prioritizes speed over cost efficiency. You get identical quality and capabilities, just faster responses.

What to know:

Use /fast to toggle on fast mode in Claude Code CLI. Also available via /fast in Claude Code VS Code Extension.
Fast mode pricing for Opus 4.6 starts at $30 per MTok of input and $150 per MTok of output ($30/150 MTok). Fast mode is available at a 50% discount for all plans until 11:59pm PT on February 16.
Available to all Claude Code users on subscription plans (Pro/Max/Team/Enterprise) and Claude Console.
For Claude Code users on subscription plans (Pro/Max/Team/Enterprise), fast mode is available via extra usage only and not included in the subscription rate limits.

This page covers how to toggle fast mode, its cost tradeoff, when to use it, requirements, and rate limit behavior.

Toggle fast mode

Toggle fast mode in either of these ways:

Type /fast and press Tab to toggle on or off
Set "fastMode": true in your user settings file
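The settings-file option can also be scripted. The following is a minimal sketch, not an official tool: it assumes the user settings file is JSON and lives at ~/.claude/settings.json (this page does not state the path, so verify it for your installation), and `set_fast_mode` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Assumed location of the user settings file; this page does not state the path.
DEFAULT_SETTINGS_PATH = Path.home() / ".claude" / "settings.json"


def set_fast_mode(enabled: bool, path: Path = DEFAULT_SETTINGS_PATH) -> dict:
    """Set the "fastMode" key in a JSON settings file, keeping other keys."""
    settings = {}
    if path.exists():
        text = path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)
    settings["fastMode"] = enabled
    # Create the parent directory if it is missing, then write the file back.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `set_fast_mode(True)` would write `"fastMode": true` while leaving any other keys in the file untouched.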

Fast mode persists across sessions. For the best cost efficiency, enable fast mode at the start of a session rather than switching mid-conversation. See "Understand the cost tradeoff" for details.

When you enable fast mode:

If you’re on a different model, Claude Code automatically switches to Opus 4.6
You’ll see a confirmation message: “Fast mode ON”
A small ↯ icon appears next to the prompt while fast mode is active
Run /fast again at any time to check whether fast mode is on or off

When you disable fast mode with /fast again, you remain on Opus 4.6. The model does not revert to your previous model. To switch to a different model, use /model.
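The one-way model switch is easy to miss, so here is a toy model of it. This is only an illustration of the behavior described above, not how Claude Code is implemented; the `Session` class and the model labels `"opus-4.6"` and `"other-model"` are placeholders, not real API identifiers.

```python
from dataclasses import dataclass

FAST_MODE_MODEL = "opus-4.6"  # placeholder label, not a real API model ID


@dataclass
class Session:
    model: str
    fast_mode: bool = False

    def toggle_fast(self) -> bool:
        """Mimic /fast: enabling forces Opus 4.6; disabling keeps the current model."""
        self.fast_mode = not self.fast_mode
        if self.fast_mode:
            self.model = FAST_MODE_MODEL
        return self.fast_mode


session = Session(model="other-model")
session.toggle_fast()   # fast mode on, model switched to Opus 4.6
session.toggle_fast()   # fast mode off, model stays on Opus 4.6
```

After both calls the session is still on the Opus 4.6 label, which is why a separate /model step is needed to go back.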

Understand the cost tradeoff

Fast mode has higher per-token pricing than standard Opus 4.6:

Mode	Input (per MTok)	Output (per MTok)
Fast mode on Opus 4.6 (<200K)	$30	$150
Fast mode on Opus 4.6 (>200K)	$60	$225

Fast mode is compatible with the 1M token extended context window. When you switch into fast mode mid-conversation, you pay the full fast mode uncached input token price for the entire conversation context. This costs more than if you had enabled fast mode from the start.
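To make the tradeoff concrete, the sketch below estimates the uncached fast mode cost of a single request from the prices in the table above. It rests on assumptions the page does not confirm: the tier is chosen by input size, and a context of exactly 200K tokens is treated as the lower tier. Cached-token prices are not given here, so they are left out. `fast_mode_cost` is a hypothetical helper name.

```python
# Prices in USD per million tokens (MTok), copied from the table above.
FAST_PRICES = {
    "short": {"input": 30.0, "output": 150.0},  # context under 200K tokens
    "long": {"input": 60.0, "output": 225.0},   # context over 200K tokens
}
LONG_CONTEXT_THRESHOLD = 200_000


def fast_mode_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the uncached fast mode cost of one request in USD."""
    # Assumption: the tier is picked by input size; exactly 200K counts as short.
    tier = "long" if input_tokens > LONG_CONTEXT_THRESHOLD else "short"
    prices = FAST_PRICES[tier]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000
```

For example, switching to fast mode with 120,000 tokens of existing context and then generating 2,000 output tokens gives `fast_mode_cost(120_000, 2_000)`, about $3.90, of which $3.60 is the conversation context billed at the uncached input rate.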

Decide when to use fast mode

Fast mode is best for interactive work where response latency matters more than cost:

Rapid iteration on code changes
Live debugging sessions
Time-sensitive work with tight deadlines

Standard mode is better for:

Long autonomous tasks where speed matters less
Batch processing or CI/CD pipelines
Cost-sensitive workloads

Fast mode vs effort level

Fast mode and effort level both affect response speed, but differently:

Setting	Effect
Fast mode	Same model quality, lower latency, higher cost
Lower effort level	Less thinking time, faster responses, potentially lower quality on complex tasks

You can combine both: use fast mode with a lower effort level for maximum speed on straightforward tasks.

Requirements

Fast mode requires all of the following:

Not available on third-party cloud providers: fast mode is not available on Amazon Bedrock, Google Vertex AI, or Microsoft Azure Foundry. Fast mode is available through the Anthropic Console API and for Claude subscription plans using extra usage.
Extra usage enabled: your account must have extra usage enabled, which allows billing beyond your plan’s included usage. For individual accounts, enable this in your Console billing settings. For Teams and Enterprise, an admin must enable extra usage for the organization.

Fast mode usage is billed directly to extra usage, even if you have remaining usage on your plan. This means fast mode tokens do not count against your plan’s included usage and are charged at the fast mode rate from the first token.

Admin enablement for Teams and Enterprise: fast mode is disabled by default for Teams and Enterprise organizations. An admin must explicitly enable fast mode before users can access it.

If your admin has not enabled fast mode for your organization, the /fast command will show “Fast mode has been disabled by your organization.”

Enable fast mode for your organization

Admins can enable fast mode in:

Handle rate limits

Fast mode has separate rate limits from standard Opus 4.6. When you hit the fast mode rate limit or run out of extra usage credits:

Fast mode automatically falls back to standard Opus 4.6
The ↯ icon turns gray to indicate cooldown
You continue working at standard speed and pricing
When the cooldown expires, fast mode automatically re-enables

To disable fast mode manually instead of waiting for cooldown, run /fast again.
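The fallback behavior can be pictured as a small state machine. The sketch below is an illustrative model only, not Claude Code's implementation: `FastModeTracker`, its method names, and the `cooldown_seconds` parameter are hypothetical, and the page does not say how long the cooldown lasts.

```python
import enum


class FastState(enum.Enum):
    OFF = "off"            # user has fast mode disabled
    ACTIVE = "active"      # fast speed and fast mode pricing
    COOLDOWN = "cooldown"  # temporarily on standard Opus 4.6 (gray icon)


class FastModeTracker:
    def __init__(self, cooldown_seconds: float):
        # cooldown_seconds is hypothetical; the page gives no duration.
        self.cooldown_seconds = cooldown_seconds
        self.state = FastState.OFF
        self._cooldown_ends = 0.0

    def toggle(self) -> None:
        """Model of /fast: off -> active; active or cooldown -> off."""
        if self.state is FastState.OFF:
            self.state = FastState.ACTIVE
        else:
            self.state = FastState.OFF

    def on_limit_hit(self, now: float) -> None:
        """Rate limit hit or extra usage exhausted: fall back to standard."""
        if self.state is FastState.ACTIVE:
            self.state = FastState.COOLDOWN
            self._cooldown_ends = now + self.cooldown_seconds

    def tick(self, now: float) -> FastState:
        """Re-enable fast mode once the cooldown has expired."""
        if self.state is FastState.COOLDOWN and now >= self._cooldown_ends:
            self.state = FastState.ACTIVE
        return self.state
```

Timestamps are passed in explicitly so the model is easy to test; a real client would read a clock instead.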

Research preview

Fast mode is a research preview feature. This means:

The feature may change based on feedback
Availability and pricing are subject to change
The underlying API configuration may evolve

Report issues or feedback through your usual Anthropic support channels.