From a goal to deterministic, replayable action
Twin is the browser execution layer for LLM agents. Here is the full pipeline — how a plain-English goal becomes a compiled skill, and how the next similar request skips the model entirely.
- Open billing.acme.com [done]
- Compile DOM → indexed state (42 elements, ~3k tokens) [done]
- Plan: log in → open invoices → download latest [done]
- Act: fill #user · fill #pass · click “Sign in” [running]
- Freeze the path into skill sk_9f2c [queued]
One cold run, captured: open, compile the page into indexed state, plan, act, and freeze the path into a skill the next run replays for free.
Eight stages, one cost curve that bends down
The LLM does the hard thinking once, on the cold path. Everything after is cache and replay.
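As a rough illustration of why the curve bends, the sketch below averages the cost of one cold run followed by cache hits. It assumes every later run hits the cache and that a hit costs one fifth of a cold run, taken from the ~5× figure this page quotes for a cache hit. The function name and the normalised cost unit are made up for the example and are not Twin's pricing.

```python
def average_cost_per_run(runs: int, cold_cost: float = 1.0, hit_discount: float = 5.0) -> float:
    """Average cost per run: one cold run, then (runs - 1) cache hits.

    Assumption: a hit costs cold_cost / hit_discount, and every run after
    the first is a hit. Real hit rates will be lower than 100%.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    hit_cost = cold_cost / hit_discount
    total = cold_cost + (runs - 1) * hit_cost
    return total / runs


for n in (1, 2, 10, 100):
    # Cost is expressed as a fraction of a single cold run.
    print(n, round(average_cost_per_run(n), 3))
# 1 1.0
# 2 0.6
# 10 0.28
# 100 0.208
```

Under these assumptions the average approaches the cost of a hit as the run count grows, so the size of the drop depends on how cheap a hit really is and how often requests actually match a compiled skill.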
- 01
DOM → indexed-state compiler
A live page is compiled into a compact, numerically indexed map of just the interactive elements, kept under a token budget, instead of raw HTML. A 50-step flow becomes ~3k tokens of state the model can actually reason over.
- 02
Planner picks actions
The planner reads the indexed state and chooses the next action — click element 14, type into element 7, submit. This is the only stage that needs an LLM, and only on the cold path.
- 03
Successful run compiles into a skill
When a goal completes, the path is frozen into a reusable skill: a deterministic action plan keyed to the page’s structure, stored in your agent/skill library.
- 04
Semantic dispatch cache matches re-phrased requests
A new request is embedded and matched by meaning against compiled skills — so “schedule a call” finds the skill you built for “book a demo,” not just an exact string repeat.
- 05
Deterministic replay (zero LLM)
On a cache hit, the skill replays deterministically with no model call. This is where the cost curve bends: a hit is ~5× cheaper and the marginal cost of the next run trends toward zero.
- 06
Human-in-the-loop handoff
A blocked step — an approval, or MFA on a flow you’re authorized to run — pauses and hands off to a human via the live view, then resumes the skill where it left off.
- 07
Cross-tenant skill corpus reuse
A skill compiled once can be safely reused across tenants, so the cache-hit rate compounds as the whole network runs — you benefit from skills you never had to compile yourself.
- 08
Live view, session video, vault & audit
Every run streams a real-time live view and is recorded to durable session video. Credentials live in a per-tenant vault and every call is written to an audit log.
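To make stage 01 concrete, here is a minimal sketch of turning HTML into a numbered list of interactive elements, using only Python's standard `html.parser`. It is not Twin's compiler: the tag set, the `max_elements` cap (standing in for a real token budget), and the output format are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption: these tags are what counts as "interactive" for the sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class IndexedStateCompiler(HTMLParser):
    """Collects interactive elements and gives each one a numeric index."""

    def __init__(self, max_elements=50):
        super().__init__()
        self.max_elements = max_elements  # crude stand-in for a token budget
        self.elements = []
        self._open = None  # element whose inner text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS or len(self.elements) >= self.max_elements:
            return
        attr = dict(attrs)
        entry = {
            "index": len(self.elements),
            "tag": tag,
            "id": attr.get("id"),
            "label": attr.get("placeholder") or attr.get("aria-label") or "",
        }
        self.elements.append(entry)
        # <input> is a void element, so no inner text follows it.
        self._open = entry if tag != "input" else None

    def handle_data(self, data):
        text = data.strip()
        if self._open is not None and text:
            self._open["label"] = (self._open["label"] + " " + text).strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def compile_state(html, max_elements=50):
    """Render the indexed elements as one short line each."""
    parser = IndexedStateCompiler(max_elements)
    parser.feed(html)
    parser.close()
    lines = []
    for el in parser.elements:
        ident = "#" + el["id"] if el["id"] else ""
        lines.append(f"[{el['index']}] {el['tag']}{ident} {el['label']!r}")
    return "\n".join(lines)


page = """
<div class="hero"><h1>Billing</h1>
  <form>
    <input id="user" placeholder="Email">
    <input id="pass" type="password" placeholder="Password">
    <button type="submit"><span>Sign in</span></button>
  </form>
  <a href="/invoices">Invoices</a>
</div>
"""
print(compile_state(page))
# [0] input#user 'Email'
# [1] input#pass 'Password'
# [2] button 'Sign in'
# [3] a 'Invoices'
```

A planner working from this output can refer to "element 2" instead of a selector or a chunk of markup, which is the property stage 02 relies on.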
The cold run, then the cache hit
First call compiles a skill. The next similar call — even re-worded — replays it deterministically with near-zero LLM tokens.
curl -X POST https://twin-browser.com/api/v1/run \
-H "Authorization: Bearer $TWIN_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"goal": "Log in and download this month'\''s invoice",
"url": "https://billing.acme.com"
}'
# cold path: the planner compiles a skill,
# returns the result + a skill_id (llm_tokens: ~3120)

curl -X POST https://twin-browser.com/api/v1/dispatch \
-H "Authorization: Bearer $TWIN_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"goal": "Grab the latest invoice PDF",
"url": "https://billing.acme.com"
}'
# semantic cache HIT -> deterministic replay
# ~0 LLM tokens, ~5x cheaper than the cold run

Full reference on the API and docs pages, or drive the exact same pipeline from your editor over the MCP server.
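The two calls above differ in whether a compiled skill matches the new goal. The sketch below shows one plausible shape for that lookup: compare the embedding of the incoming goal against stored skills by cosine similarity and treat anything above a threshold as a hit. The three-dimensional vectors are hand-written placeholders for real embeddings, and the threshold, the URL scoping, and the `dispatch` function are assumptions for this example, not Twin's implementation. The skill ID and steps are taken from the sample run shown earlier on this page.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical skill library. "embedding" would come from an embedding model;
# here it is a hand-written toy vector.
SKILLS = [
    {
        "skill_id": "sk_9f2c",
        "url": "https://billing.acme.com",
        "embedding": [0.9, 0.1, 0.3],
        "steps": ["fill #user", "fill #pass", "click Sign in",
                  "open invoices", "download latest"],
    },
]


def dispatch(goal_embedding, url, threshold=0.85):
    """Return ("hit", skill) for the best match above threshold, else ("miss", None)."""
    candidates = [s for s in SKILLS if s["url"] == url]
    best = max(candidates,
               key=lambda s: cosine(goal_embedding, s["embedding"]),
               default=None)
    if best is not None and cosine(goal_embedding, best["embedding"]) >= threshold:
        return "hit", best  # replay best["steps"] with no model call
    return "miss", None     # fall back to the cold path


# Toy vector standing in for the re-worded goal "Grab the latest invoice PDF".
status, skill = dispatch([0.85, 0.2, 0.35], "https://billing.acme.com")
print(status, skill["skill_id"] if skill else None)

# Toy vector standing in for an unrelated goal on the same site.
print(dispatch([0.1, 0.9, 0.0], "https://billing.acme.com"))
```

The threshold is the main tuning knob in a design like this: set too low, an unrelated request replays the wrong skill; set too high, re-worded requests fall through to the cold path and pay for the model again.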
What runs on every call
Twin automates the web where you’re authorized — first-party sites, operator-approved automation, internal RPA, accessibility, and authorized testing.
The run’s target URL is the authorization signal. On every call, authentication, usage billing, and audit logging run before any action is taken. The backend is multi-tenant Supabase with default-deny RLS, per-tenant API keys, an audit log, and a credential vault for the secrets a flow needs.
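The ordering described above (authenticate, bill, log, and only then act) can be sketched as a single gate in front of the action. Everything here is hypothetical: the in-memory key table, credit counter, and audit list stand in for the real backend, and `handle_call` is an illustrative name rather than part of Twin's API.

```python
class Rejected(Exception):
    """Raised when a call is refused before any browser action runs."""


# Hypothetical in-memory stand-ins for per-tenant keys, credits, and the audit log.
API_KEYS = {"key_tenant_a": "tenant_a"}
CREDITS = {"tenant_a": 2}
AUDIT_LOG = []


def handle_call(api_key, goal, url, act):
    """Run the checks in order, then perform the action."""
    # 1. Authentication: map the key to a tenant or refuse.
    tenant = API_KEYS.get(api_key)
    if tenant is None:
        raise Rejected("unknown API key")

    # 2. Usage billing: refuse if the tenant has no credits, else charge one.
    if CREDITS.get(tenant, 0) < 1:
        raise Rejected("no credits left")
    CREDITS[tenant] -= 1

    # 3. Audit logging: record the call before anything touches the page.
    AUDIT_LOG.append({"tenant": tenant, "goal": goal, "url": url})

    # 4. Only now is the action allowed to run.
    return act(goal, url)


result = handle_call(
    "key_tenant_a",
    "Grab the latest invoice PDF",
    "https://billing.acme.com",
    act=lambda goal, url: f"ran {goal!r} on {url}",
)
print(result)
print(AUDIT_LOG)
```

Because the action is passed in and called last, a failed check leaves no way for the browser step to run first, and the audit entry exists even if the action itself later fails.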
Twin is not a CAPTCHA-bypass-for-hire or anti-bot evasion service. It’s the execution layer for the automation you’re allowed to run — and it keeps the receipts to prove it.
The pipeline, answered
Why compile the DOM into indexed state instead of sending raw HTML?
When does an LLM actually run?
How does the semantic cache match a re-worded request?
What happens when a step needs a human?
Is anything shared across tenants?
Run your first skill in minutes
Free to start. Usage-based credits from $29/mo, with LLM cost metered and passed through at 1×.