Quota Usage

Agent Nine runs on a subscription with a transparent quota system. This page explains how your usage is measured, where to track it, and what happens when you hit the limit.

How usage is measured

Usage is measured in unit-tokens (UT), an internal metric that accounts for:

Model cost. Claude Sonnet = 1 UT per message (baseline weight), Haiku ≈ 0.2 UT (5× cheaper), Opus ≈ 6 UT (6× more expensive).
Context size. A message with a large attached file or long conversation history weighs more than a short question.
Output length. A long agent response costs more than a short one.

Pricing page

On agentnine.ru/pricing we show an approximate message count per month — a rough estimate at average load (≈ 60,000 UT per user message). Actual usage scales with content and chosen model.

Three limit levels

Every paid tier has three parallel limits:

1. Monthly cap

The main tier boundary. Resets on the 1st of each month at 00:00 UTC. Pick your tier based on this number — it roughly equals the amount of work you comfortably do in 30 days.
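As a minimal sketch, the reset rule above (1st of each month, 00:00 UTC) can be turned into a small client-side helper. The function name `next_monthly_reset` is hypothetical and not part of any Agent Nine API; it only illustrates the date arithmetic.

```python
from datetime import datetime, timezone


def next_monthly_reset(now: datetime) -> datetime:
    """Return the first 1st-of-month 00:00 UTC strictly after `now`.

    Expects a timezone-aware datetime; a naive value would be
    interpreted as local time by astimezone().
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


# Example: time left until the monthly cap resets.
remaining = next_monthly_reset(datetime.now(timezone.utc)) - datetime.now(timezone.utc)
```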

2. Weekly cap

Protection against the "I burned the whole month in three days" scenario. It lets you work at peak load for 2–3 days, but not 30 days straight at maximum. The cap resets on a rolling 7-day window.

3. 5-hour rolling window (burst protection)

A rolling 5-hour window. Anti-abuse protection against runaway workloads (e.g., an accidentally launched infinite loop). It is not tied to midnight: usage drops smoothly as earlier requests age out of the window.
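To illustrate how a rolling window differs from a fixed daily reset, here is a minimal sketch of a sliding-window counter. The class `RollingWindow`, its method names, and the cap of 100 UT are all hypothetical; the real limiter's implementation and cap values are not described on this page.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class RollingWindow:
    """Tracks usage inside a sliding time window (illustrative only)."""

    def __init__(self, window: timedelta, cap_ut: float) -> None:
        self.window = window
        self.cap_ut = cap_ut
        self._events = deque()  # (timestamp, cost_ut), oldest first
        self._used = 0.0

    def _evict(self, now: datetime) -> None:
        # Drop requests that have aged out of the window.
        while self._events and self._events[0][0] <= now - self.window:
            _, cost = self._events.popleft()
            self._used -= cost

    def used(self, now: datetime) -> float:
        self._evict(now)
        return self._used

    def try_spend(self, now: datetime, cost_ut: float) -> bool:
        """Record the request if it fits under the cap; otherwise refuse it."""
        self._evict(now)
        if self._used + cost_ut > self.cap_ut:
            return False
        self._events.append((now, cost_ut))
        self._used += cost_ut
        return True


t0 = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
burst = RollingWindow(timedelta(hours=5), cap_ut=100)  # hypothetical cap
first = burst.try_spend(t0, 60)                        # fits under the cap
second = burst.try_spend(t0 + timedelta(hours=1), 60)  # would exceed the cap
third = burst.try_spend(t0 + timedelta(hours=5), 60)   # first request has aged out
```

The same structure with `timedelta(days=7)` models the weekly cap; only the window length and the cap change.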

In practice, you'll almost always hit the monthly cap first. The other two are safety nets.
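Because the three limits apply in parallel, the one that matters at any moment is the one closest to exhaustion. The sketch below picks that limit from a set of (used, cap) pairs. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def binding_limit(usage: dict[str, tuple[float, float]]) -> tuple[str, float]:
    """Return (name, percent used) of the limit closest to exhaustion.

    `usage` maps a limit name to (used_ut, cap_ut).
    """
    name = max(usage, key=lambda k: usage[k][0] / usage[k][1])
    used, cap = usage[name]
    return name, 100.0 * used / cap


# Hypothetical snapshot: the monthly cap is the tightest of the three.
snapshot = {
    "monthly": (800, 1000),
    "weekly": (150, 300),
    "5h": (10, 100),
}
tightest = binding_limit(snapshot)
```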

Where to track usage

Open Settings → Usage in the web interface: app.agentnine.ru/settings/usage

There you'll see:

Progress bars for all three limits (as % of quota)
Daily usage chart
Breakdown by model (Sonnet / Haiku / Opus)
Current subscription and renewal date

What happens when you hit a limit

100% monthly

Agent Nine doesn't block you instantly. You get an emergency buffer of 1–2 messages, then:

Wait until the 1st of next month, or
Upgrade, and work resumes immediately on the new quota

Your chat history and all data stay intact regardless of which option you pick.

100% weekly

Wait until earlier usage ages out of the rolling 7-day window, or upgrade. If you hit this cap regularly, it's a signal you're in a crunch period, and taking a higher tier for that month usually makes sense.

100% 5h burst

Rare during normal work. It usually means automation, a bot, or an accidental loop. Capacity frees up automatically as earlier requests age out, and the window is fully clear within 5 hours of your last request. If you're genuinely working actively and hit the 5h cap often, that's a clear upgrade signal.

How to save tokens

Our balancer automatically uses prompt caching: repeated parts of the context (system prompt, tool definitions, conversation history) are billed at a heavily discounted rate (~10× cheaper than fresh input).

To maximize cache hits:

Work within a single session. Context is cached on the provider side and makes subsequent requests cheaper.
Avoid frequent context clears. Each new session starts with a cold cache.
Don't change the system prompt often. If you've customized agent instructions in settings, keep them stable.
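A rough way to see why a warm cache matters is to express input volume in "fresh-input equivalents". The sketch below uses the approximate 10× discount quoted above. The exact billing formula is not published here, so the constant, the function name, and the example sizes are assumptions.

```python
CACHED_DISCOUNT = 10  # "~10x cheaper" from the text; the exact rate is an assumption


def effective_input(fresh_units: int, cached_units: int) -> float:
    """Input volume expressed in fresh-input equivalents."""
    return fresh_units + cached_units / CACHED_DISCOUNT


# Hypothetical 20,000-unit context:
cold = effective_input(20_000, 0)      # new session, nothing cached
warm = effective_input(2_000, 18_000)  # same session, 90% served from cache
savings = 1 - warm / cold
```

Under these assumed numbers the warm request is billed at roughly a fifth of the cold one, which is why clearing context frequently is expensive.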

Models and their "weight"

Model	Weight (UT per message)	When to use
Claude Haiku	~0.2	Fast, simple tasks, autocomplete, summarization
Claude Sonnet	1.0	Baseline model — ~90% of work
Claude Opus	~6.0	Complex architectural analysis, large refactors, non-trivial debugging

Using Opus 5 times in a row equals roughly 30 Sonnet messages in quota spend. If the task doesn't require Opus — Sonnet will handle it.
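The arithmetic behind that comparison can be sketched as follows. The weights are the approximate per-message values from the table. The helper `sonnet_equivalent` is hypothetical, and it deliberately ignores context size and output length, which also affect real usage.

```python
# Relative per-message weights from the table (approximate for Haiku and Opus).
MODEL_WEIGHT = {"haiku": 0.2, "sonnet": 1.0, "opus": 6.0}


def sonnet_equivalent(messages: dict[str, int]) -> float:
    """Convert a mix of messages into Sonnet-message equivalents."""
    return sum(MODEL_WEIGHT[model] * n for model, n in messages.items())


five_opus = sonnet_equivalent({"opus": 5})
mixed_day = sonnet_equivalent({"haiku": 10, "sonnet": 20, "opus": 2})
```

Here `five_opus` comes to about 30 Sonnet messages, matching the figure above, while the mixed day of 32 messages costs about 34.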

TIP

All three models are available on every paid tier without selection restrictions. You decide when to reach for the heavier model — it's your trade-off between quality and quota economy.

Plans and comparison

The full tier comparison, limits, and prices are on agentnine.ru/pricing.

In brief (approximate messages per week):

Standard — ≈ 500 messages / week, baseline traffic priority. Hobby and part-time work.
Pro — ≈ 1,500 messages / week, elevated traffic priority. Daily productive work with the agent.
Ultra — ≈ 4,000 messages / week, maximum traffic priority. Full-time AI development, almost no quota anxiety.
Team — ≈ 300 messages / week per seat, elevated queue priority. Teams with shared quota pool and centralized billing.
Enterprise — custom limits, SLA, compliance, on-prem options, and dedicated queue (priority above all other tiers). On request.

What is traffic priority

During peak hours (when many users hit the agent simultaneously) requests are queued. The higher your tier, the sooner your request is picked up. Under normal load the difference is invisible, but during spikes (for example, when Anthropic itself is rate-limited) Ultra or Pro priority can save minutes of waiting.

Priority order (highest → lowest): Enterprise → Ultra → Pro → Team → Standard → Free.
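One simple way to model this ordering is a priority queue keyed by tier rank, with first-in-first-out order inside each tier. This is only a sketch under that assumption: the actual scheduler may use weighting or other rules, and `RequestQueue` is a hypothetical name.

```python
import heapq
from itertools import count

# Tier order as listed above, highest priority first.
TIER_ORDER = ["Enterprise", "Ultra", "Pro", "Team", "Standard", "Free"]
RANK = {tier: i for i, tier in enumerate(TIER_ORDER)}


class RequestQueue:
    """Strict tier priority, FIFO within a tier (assumed policy)."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = count()  # tie-breaker that preserves arrival order

    def push(self, tier: str, request_id: str) -> None:
        heapq.heappush(self._heap, (RANK[tier], next(self._seq), request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = RequestQueue()
for tier, request_id in [("Standard", "a"), ("Ultra", "b"), ("Pro", "c"), ("Ultra", "d")]:
    queue.push(tier, request_id)

served = [queue.pop() for _ in range(len(queue))]
```

With these four requests, both Ultra requests are served first in arrival order, then Pro, then Standard.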

TIP

The per-week message count is a rough estimate at average load. Actual usage depends on context size, selected model, and cache-hit rate. The monthly cap usually gives more flexibility than the weekly figure suggests, and the rolling weekly reset allows peak days without locking you out for the whole month.
