Use case

AI agents

Give your LLM agent a real browser it can drive — and stop paying the model on every single run.

The problem

Autonomous agents need to click, type, log in, and read live pages, not just call APIs. The usual fix is to hand the agent a raw cloud browser and let the LLM reason over raw HTML on every step. That works in a demo and falls apart in production: each run re-pays the model, raw DOM blows the context window, and the same task costs the same every time no matter how often it runs.

1. Receive goal from your agent (done)
2. Compile DOM → indexed state (running)
3. Match semantic dispatch cache (queued)
4. Replay skill with zero LLM calls (queued)
5. Return structured result (queued)

A Twin run for AI agents: compile once, then replay on a cache hit.

The wedge

How Twin solves it

Twin is the browser execution layer for LLM agents. It compiles a goal into a deterministic, replayable skill the first time, fuzzy-matches the next re-phrased request to that skill with a semantic dispatch cache, and replays it with zero LLM calls. Your agent keeps a clean, token-efficient view of the page instead of raw HTML, so marginal cost per run trends toward zero as your agents run more.

1. Point your agent at POST /api/v1/run with a natural-language goal; Twin returns a token-efficient, numerically-indexed map of the page instead of raw HTML.
2. The first successful run compiles into a skill: the planned action path, generalized and stored.
3. The next, differently-worded request hits the semantic dispatch cache and matches that skill, so it runs without re-invoking the planner LLM.
4. Matched skills replay deterministically; blocked steps (approval, MFA on an authorized flow) pause for human-in-the-loop handoff, then resume.
5. A cross-tenant skill corpus means a skill compiled once can be safely reused, so your hit rate climbs as the network runs.
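
For agents that call the REST endpoint directly rather than through the SDK, step 1 might look roughly like the sketch below. Only the POST /api/v1/run path and the natural-language goal come from the steps above; the base URL, the bearer-token header, and the request body fields are assumptions made for illustration, so check the API docs for the real contract.

```typescript
// Hypothetical request shapes: only the POST /api/v1/run path and the
// natural-language goal come from the text; everything else is assumed.
interface RunRequestBody {
  goal: string;
  url: string;
}

interface HttpRequest {
  method: 'POST';
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the HTTP request an agent would issue.
function buildRunRequest(baseUrl: string, apiKey: string, body: RunRequestBody): HttpRequest {
  if (body.goal.trim() === '') {
    throw new Error('goal must be a non-empty natural-language string');
  }
  return {
    method: 'POST',
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, '')}/api/v1/run`,
    headers: {
      'Content-Type': 'application/json',
      // Bearer auth is an assumption; the real scheme may differ.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}

// Placeholder base URL and key, used only to show the resulting request.
const request = buildRunRequest('https://twin.example/', 'test-key', {
  goal: 'Open the dashboard and read the latest order status',
  url: 'https://app.example.com',
});
```

The returned object can be handed to any HTTP client, for example `fetch(request.url, { method: request.method, headers: request.headers, body: request.body })`.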

In practice

One call, then it gets cheaper

Hand your agent a goal. The first run compiles a skill; the next re-phrased request hits the semantic cache and replays with zero LLM calls.

run.ts

import Twin from '@twin-browser/sdk';

const twin = new Twin({ apiKey: process.env.TWIN_API_KEY });

// Your agent passes a natural-language goal — Twin handles the browser.
const run = await twin.agents.run({
  goal: 'Open the dashboard and read the latest order status',
  url: 'https://app.example.com',
});

console.log(run.status);       // 'completed'
console.log(run.cached);       // true on a cache hit — no planner LLM
console.log(run.creditsUsed);  // ~1 on replay vs ~10 on cold compile
console.log(run.result);       // structured data, not raw HTML

What happens on this call

Twin compiles the goal into a deterministic, replayable skill.
The next re-phrased request matches it in the semantic dispatch cache.
Matched runs replay with zero LLM calls — credits drop back toward ~1.
Every call is authenticated, billed, and written to the audit log.

Under the hood

The machinery that bends the cost curve

Every use case runs on the same primitives — the wedge that makes browser work cheaper the more your agents run.

Semantic dispatch cache

Re-phrased requests fuzzy-match a skill you already compiled, so they skip the planner LLM entirely.
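
To make the idea concrete, here is a toy sketch of fuzzy-matching a re-phrased goal to a stored skill. It uses Jaccard overlap on word sets purely as a stand-in; the page does not describe Twin's actual matcher, and the `Skill` shape, stop-word list, and 0.5 threshold are all assumptions.

```typescript
// Toy stand-in for a semantic dispatch cache. Jaccard overlap on word sets is
// an assumption chosen for brevity; it is not Twin's actual matcher.
interface Skill {
  id: string;
  goal: string; // the goal the skill was originally compiled from
}

const STOP_WORDS = new Set(['the', 'a', 'an', 'and', 'to', 'of', 'my', 'me']);

// Lowercase, split into alphanumeric words, and drop stop words.
function tokenize(text: string): Set<string> {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words.filter((w) => !STOP_WORDS.has(w)));
}

// |A ∩ B| / |A ∪ B|, with 0 for two empty sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared += 1;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Return the best-matching skill, or null (a cache miss) below the threshold.
function dispatch(goal: string, skills: Skill[], threshold = 0.5): Skill | null {
  const goalTokens = tokenize(goal);
  let best: Skill | null = null;
  let bestScore = 0;
  for (const skill of skills) {
    const score = jaccard(goalTokens, tokenize(skill.goal));
    if (score > bestScore) {
      best = skill;
      bestScore = score;
    }
  }
  return bestScore >= threshold ? best : null;
}

const skills: Skill[] = [
  { id: 'order-status', goal: 'Open the dashboard and read the latest order status' },
  { id: 'invoice', goal: 'Download the latest invoice PDF' },
];

// A re-phrased request matches the compiled skill; an unrelated one misses.
const hit = dispatch('Read the latest order status on the dashboard', skills);
const miss = dispatch('Cancel my subscription', skills);
```

A production matcher would more plausibly compare embeddings than raw words, but the control flow is the same: a hit skips the planner, and a miss falls through to a cold compile.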

Deterministic replay

Matched skills replay the same way every time — a pass is a pass, and the marginal cost trends toward zero.

Token-efficient DOM state

A live page becomes a compact, numerically-indexed map of interactive elements instead of raw HTML.
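
As a rough illustration of what such a map could look like, the sketch below filters a page down to its interactive elements and numbers them. The `PageElement` shape and the `[index] tag "label"` output format are hypothetical; the format Twin actually returns is not shown on this page.

```typescript
// Hypothetical element shape and output format, for illustration only.
interface PageElement {
  tag: string;
  label: string;
  interactive: boolean;
}

// Keep only interactive elements and give each one a numeric index that an
// agent can refer to (for example "click 2") instead of a CSS selector.
function toIndexedState(elements: PageElement[]): string {
  return elements
    .filter((el) => el.interactive)
    .map((el, index) => `[${index}] ${el.tag} "${el.label}"`)
    .join('\n');
}

const state = toIndexedState([
  { tag: 'div', label: 'Welcome back', interactive: false },
  { tag: 'input', label: 'Email', interactive: true },
  { tag: 'input', label: 'Password', interactive: true },
  { tag: 'button', label: 'Sign in', interactive: true },
]);
```

Here `state` holds three short lines, one per interactive element, which is far less text than the markup for the same page would be.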

Human-in-the-loop handoff

Blocked steps — approvals, MFA on an authorized flow — pause for a person, then resume cleanly.
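
The pause-and-resume behavior can be pictured as a small replay loop. This is a minimal sketch under assumed names: the `Step` and `ReplayState` shapes and the `approved` flag are hypothetical and are not taken from Twin's API.

```typescript
// Hypothetical step and run shapes; the pause/resume contract is assumed.
interface Step {
  action: string;
  needsHuman?: boolean; // e.g. an approval or MFA prompt
}

interface ReplayState {
  status: 'completed' | 'paused';
  nextStep: number; // index to resume from
  log: string[]; // actions executed so far
}

// Replay steps in order from `start`. Stop at any step that needs a person,
// unless it is the step we are resuming at and it has just been approved.
function replay(steps: Step[], start = 0, approved = false, log: string[] = []): ReplayState {
  const executed = log.slice(); // copy so earlier states are not mutated
  for (let i = start; i < steps.length; i++) {
    const step = steps[i];
    if (step.needsHuman && !(approved && i === start)) {
      return { status: 'paused', nextStep: i, log: executed };
    }
    executed.push(step.action);
  }
  return { status: 'completed', nextStep: steps.length, log: executed };
}

const steps: Step[] = [
  { action: 'open login page' },
  { action: 'submit credentials' },
  { action: 'enter MFA code', needsHuman: true },
  { action: 'read order status' },
];

// First pass pauses at the MFA step; after a person completes it, resume.
const paused = replay(steps);
const resumed = replay(steps, paused.nextStep, true, paused.log);
```

Keeping the resume index and the log in the returned state means the second call needs nothing beyond what the first call handed back.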

The outcome

On raw-browser infrastructure, a repetitive, authenticated agent task costs the full model bill on every run. On Twin, the same task settles to a cache hit (illustratively ~5x cheaper per run after warmup) while staying fully observable through live view and session video.
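
One way to sanity-check an illustrative figure like this is to average the credit cost over cache hits and misses. The sketch below reuses the ~10 credits (cold compile) and ~1 credit (replay) from the sample earlier on this page; treating every raw-browser run as costing the same as a cold compile is an assumption, as are the function names.

```typescript
// Illustrative figures only: ~10 credits cold and ~1 on replay come from the
// run.ts sample; equating a raw-browser run with a cold run is an assumption.
function averageCredits(hitRate: number, cold = 10, replay = 1): number {
  if (hitRate < 0 || hitRate > 1) {
    throw new Error('hitRate must be between 0 and 1');
  }
  // Misses pay the cold price, hits pay the replay price.
  return cold * (1 - hitRate) + replay * hitRate;
}

// How many times cheaper the average run is than an always-cold baseline.
function savingsFactor(hitRate: number, cold = 10, replay = 1): number {
  return cold / averageCredits(hitRate, cold, replay);
}

const averageAtNinetyPercent = averageCredits(0.9);
const factorAtNinetyPercent = savingsFactor(0.9);
```

Under these assumptions, a 90% hit rate averages about 1.9 credits per run, a little over 5x cheaper than paying ~10 every time. The factor approaches 10x only as the hit rate approaches 100%.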

Go deeper

Guide: How to give an AI agent a browser
Glossary: Browser execution layer
Integrate over MCP

FAQ

AI agents on Twin — common questions

How is this different from giving my agent a raw cloud browser?

A raw browser re-runs the LLM on every execution, so cost scales linearly with usage. Twin adds a semantic dispatch cache and deterministic replay on top, so repeated and re-phrased tasks hit a compiled skill at a fraction of the LLM cost.

Which agent frameworks does Twin work with?

Twin exposes a REST API under /api/v1/*, an MCP server (tools: run, compile_skill, run_skill) for Cursor, Claude Desktop, Claude Code and Cline, and LangChain and AutoGen tool adapters. Any agent that can call an HTTP endpoint or load an MCP tool can drive a Twin browser.

Does my agent still control the browser step by step?

Yes. You can run goal-to-action end to end, or drive lower-level steps and read the indexed DOM state yourself. Twin handles the execution, caching, replay, vault, and handoff underneath.

More ways teams use Twin

RPA replacement

Replace brittle, selector-keyed RPA bots with skills that adapt to the page and get cheaper the more they run.

Internal workflow automation

Automate the internal tools and vendor portals that have no API — with audit logging and human approval built in.

Data extraction at scale

Extract from authenticated, multi-step pages — and stop re-paying the model to read the same site every run.

Put AI agents on autopilot.

Start free, compile your first skill, and watch the marginal cost per run trend toward zero.

