Guide

A weekly-cap workflow that actually fits inside seven days

Claude Pro and Max ship with two stacked rate limits: the familiar 5-hour session window, and a rolling 7-day weekly cap that started on August 28, 2025. Most write-ups stop at “here is the limit, wait it out.” This is the routine that fits the heavy work into the window the cap leaves you, plus the structured signal a builder can read so the unlock moment is a UNIX timestamp instead of a guess.

Matthew Diakonov, Written with AI

Published May 6, 20266 min

Direct answer (verified May 6, 2026)

How do I work around Claude's weekly cap?

Five moves, in order, that fit inside one rolling 7-day window:

Triage the turn. Scaffolding and edits go to Haiku, reasoning to Sonnet, Opus only on the hard turns.
Cluster the work. Stack heavy Sonnet turns inside a single 5-hour session instead of drip-feeding them across days.
OAuth the builder. Connect Pro or Max so a build session bills your existing plan, not a metered key.
Read the reset epoch. rateLimitType: "weekly" plus aresetsAt UNIX milliseconds field arrives in the streaming response. Convert to local time. That is the exact minute the cap unlocks.
Extra usage last. Only buy API overage when a deadline is real. Otherwise wait the rolling window out.

Sources: Claude API rate-limits docs, TechCrunch coverage of the July 28, 2025 announcement, and the published Pro and Max plan pages on claude.com/pricing.

The 5-step weekly-cap workflow

1
Triage the turn
Scaffolding and edits go to Haiku. Reasoning goes to Sonnet. Opus only on hard turns.
2
Cluster the work
Stack heavy Sonnet turns inside one 5-hour session. Stop drip-feeding across days.
3
OAuth the builder
Connect Pro or Max. The build session bills your existing plan, not a metered key.
4
Read the reset epoch
rateLimitType + resetsAt UNIX in the stream. The exact minute the cap unlocks.
5
Extra usage, last
Only buy API overage when a deadline is real. Otherwise wait the cap out.

Step 1: triage the turn before you send it

The single biggest lever on the weekly cap is which model handles which turn. Anthropic's published weekly ranges are stated in Sonnet hours and Opus hours. Haiku turns burn far less of that allowance for the same scaffolding work. A typical session is 60 to 80 percent turns that do not need Sonnet at all: rename a file, regenerate boilerplate, fix a small bug, write copy, edit a CSS class. Send those to Haiku.

mk0r exposes this directly. The model picker in src/components/header.tsx labels the available models default as “Fast”, haikuas “Scary” (because it is the only model an anonymous visitor can run, and it ships sub-30-second builds), sonnet[1m]as “Fast+”, and opus[1m]as “Smart”. A heavy user on a Pro plan who keeps the picker on Haiku for the routine turns and only flips up to Sonnet or Opus when a turn actually needs more brain trades a small amount of intelligence per turn for a meaningful expansion of the weekly window.

Step 2: cluster, do not drip-feed

The weekly cap and the 5-hour cap stack. The 5-hour cap is cheaper than people think, because it resets four to five times per work day, but the weekly cap is harsher: once you blow it, you wait days. The trick is to load each 5-hour session full so the prompts that share context also share the rolling window.

Practically: pick a feature, sit down for a focused two- or three-hour block, and stay in the same conversation. Prompt caching makes back-to-back turns dramatically cheaper than cold-starting a new session every twenty minutes; the same cached system prompt and history bills at 0.1x the base input rate when read versus 1x when fresh. A clustered session burns wall-clock minutes inside the 5-hour window but spends fewer of the weekly Sonnet hours per useful unit of work than a fragmented day of cold sessions.

Step 3: connect a builder via OAuth so it bills your plan

If you are paying Anthropic $20, $100, or $200 per month for Pro or Max, the worst outcome is paying that AND a metered API bill on the same builder for the same work. Anthropic shipped an OAuth scope called user:inference alongside Claude Code that lets an external client spend tokens against your existing Pro or Max plan. Builders that implement the same PKCE handshake (mk0r does, the constants live in src/lib/claude-oauth.ts) can route a chat turn through your subscription instead of a metered key.

On the rate-limit side, this means a connected builder consumes from the same weekly window as Claude.com itself. So the cap shows up faster, but every prompt you send through the builder also counts toward what you already pay for. The decision happens on a single line: src/app/api/chat/route.ts reads const apiKey = oauthToken ? "" : sharedApiKey ?? ""; If oauthToken is set, the metered key is wiped and the inference bills your plan.

Step 4: read the reset epoch instead of guessing

When a cap fires, Anthropic's streaming response carries a structured event that names the cap and tells you the exact unlock moment. Specifically, a rate_limit_event with a rate_limit_infoobject containing eight fields: status, resetsAt (UNIX milliseconds), rateLimitType, utilization, overageStatus, overageDisabledReason, isUsingOverage, surpassedThreshold. The status tells you whether the prompt went through with a warning or got rejected. The rateLimitType says “five_hour” or “weekly”. The resetsAt is the exact UNIX timestamp the cap unlocks.

mk0r forwards all eight fields. The passthrough lives at src/core/vm-scripts.ts lines 143 to 163, and the typed event in src/lib/chat-events.ts lines 73 to 83. The chat UI converts resetsAt to the user's local time and pins a banner: usage limit reached, resets at H:MM. No guessing, no email-support roundtrip, no tab-refresh-and-pray.

This matters more than it sounds because the stock @agentclientprotocol/claude-agent-acp wrapper, which is what most Claude-backed builders use, drops the rate_limit_event in its session-update translator. Builders that have not patched the wrapper surface a generic prompt error and nothing else, so the user has to refresh the Claude.com Settings → Usage page to know when to come back.

weekly

“rateLimitType is the field on the structured event that names the cap. Either 'five_hour' or 'weekly'. mk0r forwards it from src/core/vm-scripts.ts:153, paired with a UNIX resetsAt in milliseconds, so a hit on the rolling 7-day cap is a precise epoch in the chat stream, not a generic toast.”

src/core/vm-scripts.ts:143-163, src/lib/chat-events.ts:73-83

Step 5: buy extra usage only when a deadline is real

Max plans expose an extra-usage option at standard API rates. Pro plans do not, so a Pro user past the cap has three honest choices: wait the rolling window out, switch the work to the per-token API for the rest of the week, or upgrade. The cheapest of the three for most patterns is wait. The weekly cap, by Anthropic's own estimate at the August 2025 rollout, only affects roughly the top 5 percent of subscribers, so a single bad day usually does not repeat in the next 7-day window if you change Step 1 and Step 2.

When a deadline is real, the per-token API path with prompt caching is the move. Cache reads bill at 0.1x the base input rate, cache writes at 1.25x for the 5-minute cache or 2x for the 1-hour cache. A long Sonnet session with a stable system prompt and history hits the cache-read rate after the first turn, which is roughly an order of magnitude cheaper than fresh input. That is usually cheaper than turning on extra usage on the consumer plan, because the API path lets you control the cache strategy directly.

How the announced ranges line up across plans

Anthropic's announcement on July 28, 2025 published the ranges below. The ranges are wide because actual hours depend on prompt size, tool-call count, and output length. Use them as planning anchors, not contract numbers.

Feature	Max ($100 / $200)	Pro ($20/mo)
Headline price	$20 / month	$100 (Max 5x), $200 (Max 20x)
Sonnet hours / week (announced range)	40-80 hrs	140-280 hrs (5x), 240-480 hrs (20x)
Opus hours / week (announced range)	Effectively none on Pro	15-35 hrs (5x), 24-40 hrs (20x)
Reset cadence	Rolling 7 days from your first prompt	Same rolling 7 days, same clock
Extra usage past the cap	Not available on Pro	Yes, billed at standard API rates
Coexists with the 5-hour window	Yes	Yes

Numbers are Anthropic's announced ranges at the August 28, 2025 rollout. The 'Sonnet hours' and 'Opus hours' figures assume Claude Code style sessions; lighter chat-only usage stretches further. The weekly cap and the 5-hour cap stack, both visible in Settings then Usage on Claude.com.

Want a second pair of eyes on your Claude routing?

15 minutes on Cal. We will look at your typical session shape, figure out which turns belong on Haiku versus Sonnet, and tell you whether OAuth + Pro is the right setup for your weekly volume.

Frequently asked questions

How does Claude's weekly cap actually reset?

On a rolling 7-day window from your first prompt, not a fixed wall-clock time. So if you sent your first prompt of the week at 09:14 on a Tuesday, that prompt rolls off the window at 09:14 the following Tuesday. The cap is in addition to the 5-hour session window, which resets every five hours from the first message of a session. Both can fire on the same prompt: a single chat turn can be the one that pushes you over a 5-hour ceiling, a weekly ceiling, or both. The streaming response from Anthropic includes a rateLimitType field that tells you which one was hit ('five_hour' or 'weekly') plus a resetsAt UNIX timestamp for the exact unlock moment.

When did the weekly cap start, and who does it actually affect?

Anthropic announced the new weekly rate limits for Claude Pro and Max on July 28, 2025 and rolled them out on August 28, 2025. The official estimate at announcement time was that the new limits would apply to less than 5 percent of subscribers based on then-current usage, so most casual Pro users will hit the 5-hour window long before they ever brush against the weekly cap. The weekly limits matter for heavy Claude Code users, anyone running a long Sonnet or Opus session every working day, and people who script multiple agents against the same plan. Source: Anthropic on X, July 28 2025, plus the Pro/Max plan pages on claude.com.

How many hours of Sonnet do I get in a week on Pro versus Max?

Anthropic's published ranges at the August 2025 rollout: Pro ($20/mo) gets approximately 40 to 80 hours of Sonnet per week through Claude Code. Max 5x ($100/mo) gets 140 to 280 hours of Sonnet plus 15 to 35 hours of Opus. Max 20x ($200/mo) gets 240 to 480 hours of Sonnet plus 24 to 40 hours of Opus. The ranges are wide because the actual hours depend on your prompt size, tool-call count, and how much output you stream. A heavy prompt with a lot of cached input and a long output burns the same wall-clock minute faster than a one-shot generation. Anthropic does not publish a precise token-to-minute formula.

Do Haiku turns count against the weekly cap?

Haiku is on the same family of plans, but it is the cheapest model and burns the weekly hours far slower than Sonnet for the same work. Anthropic's published ranges are stated in Sonnet and Opus hours, not Haiku hours. Practically, a Haiku turn for scaffolding, file edits, copy tweaks, or trivial tool calls is the cheapest way to make forward progress when your Sonnet hours are running thin. mk0r leans on this directly: src/app/api/chat/model/route.ts pins FREE_MODEL = 'haiku' for anonymous and free-tier sessions, and the model picker in src/components/header.tsx exposes Haiku, Sonnet, and Opus separately so a Pro user can route the easy turns to Haiku and reserve Sonnet for the turns that actually need it.

Where do I see how much of my weekly cap I have used?

Inside Claude.com on a Pro, Max, Team, or seat-based Enterprise plan: Settings then Usage. There are progress bars for the 5-hour session window and the weekly limits. The page distinguishes Opus from non-Opus usage because the Opus weekly cap is separate from the overall weekly cap. Inside Claude Code or a builder that streams the structured rate_limit_event, you also get a real-time signal: the event includes a utilization field between 0 and 1 that tells you how close to the cap the most recent prompt left you. A status of 'allowed_warning' is the soft warning that you are approaching the wall.

What does the structured signal look like in code?

Anthropic's streaming response carries a rate_limit_event with a rate_limit_info object. The event carries eight fields: status (allowed, allowed_warning, rejected, unknown), resetsAt (UNIX milliseconds for the unlock moment), rateLimitType (e.g. 'five_hour' or 'weekly'), utilization (0 to 1), overageStatus, overageDisabledReason, isUsingOverage, surpassedThreshold. mk0r forwards all eight at src/core/vm-scripts.ts:143-163, typed in src/lib/chat-events.ts:73-83. The stock @agentclientprotocol/claude-agent-acp wrapper drops the message because it has no mapping for rate_limit_event yet, which is why most Claude-backed builders surface only a generic 'something went wrong' toast.

Should I buy extra usage when I hit the weekly cap?

Only when a deadline is real. Extra usage on Max plans is billed at standard API rates, which is fine for a single high-value session but adds up across a week. The cheaper move 90 percent of the time is to wait the rolling window out. Read the resetsAt epoch the streaming response gives you, convert it to your local time, and schedule the next heavy session for then. If the work cannot wait, the API path with prompt caching (cache reads at 0.1x base, cache writes at 1.25x base for the 5-minute cache) is usually cheaper than enabling extra usage on the consumer plan, because cache reads bill against the API account directly and you control the cache hit rate.

Can I avoid the weekly cap by switching to a different builder?

No, not if the new builder uses the same Anthropic credentials. The cap travels with the credentials, not the tool. If both builders authenticate against your Pro plan via OAuth, or both use your personal API key, you hit the same wall in both places. The only way to route around the weekly cap is to route through different credentials: a separate Anthropic API account, a different OAuth identity, or a different model provider entirely for the work that does not have to be Claude. The flip side: connecting a builder via OAuth instead of an API key means the build sessions bill your Pro plan, which is exactly what you want when Pro is otherwise sitting idle while the API spend climbs.

How does mk0r route around the cap by default?

Three things, all readable in the source. First, anonymous sessions are pinned to Haiku at src/app/api/chat/model/route.ts line 5 (FREE_MODEL = 'haiku'), so the cheapest model carries the first turn and the next-prompt iteration loop. Second, signed-in users with a connected Pro or Max plan get the OAuth path: src/app/api/chat/route.ts decides between the shared API key and the user's OAuth blob, and inference billed against an OAuth scope absorbs into the same plan they already pay for. Third, when a weekly cap actually fires, the patched ACP forwards the structured event so the chat UI can show 'usage limit reached, resets at H:MM' instead of a generic error toast. Three small things, but together they turn the weekly cap from a brick wall into a planned schedule.