Use case

AI agents

Give your LLM agent a real browser it can drive — and stop paying the model on every single run.

The problem

Autonomous agents need to click, type, log in, and read live pages, not just call APIs. The usual fix is to hand the agent a raw cloud browser and let the LLM reason over raw HTML on every step. That works in a demo and falls apart in production: each run re-pays the model, raw DOM blows the context window, and the same task costs the same every time no matter how often it runs.

app.example.com
  1. Receive goal from your agentdone
  2. Compile DOM → indexed staterunning
  3. Match semantic dispatch cachequeued
  4. Replay skill — zero LLM callsqueued
  5. Return structured resultqueued

A Twin run for ai agents — compile once, then replay on a cache hit.

The wedge

How Twin solves it

Twin is the browser execution layer for LLM agents. It compiles a goal into a deterministic, replayable skill the first time, fuzzy-matches the next re-phrased request to that skill with a semantic dispatch cache, and replays it with zero LLM calls. Your agent keeps a clean, token-efficient view of the page instead of raw HTML, so marginal cost per run trends toward zero as your agents run more.

  1. 1Point your agent at POST /api/v1/run with a natural-language goal; Twin returns a token-efficient, numerically-indexed map of the page instead of raw HTML.
  2. 2The first successful run compiles into a skill — the planned action path, generalized and stored.
  3. 3The next, differently-worded request hits the semantic dispatch cache and matches that skill, so it runs without re-invoking the planner LLM.
  4. 4Matched skills replay deterministically; blocked steps (approval, MFA on an authorized flow) pause for human-in-the-loop handoff, then resume.
  5. 5A cross-tenant skill corpus means a skill compiled once can be safely reused, so your hit rate climbs as the network runs.
In practice

One call, then it gets cheaper

Hand your agent a goal. The first run compiles a skill; the next re-phrased request hits the semantic cache and replays with zero LLM calls.

run.tsts
import Twin from '@twin-browser/sdk';

const twin = new Twin({ apiKey: process.env.TWIN_API_KEY });

// Your agent passes a natural-language goal — Twin handles the browser.
const run = await twin.agents.run({
  goal: 'Open the dashboard and read the latest order status',
  url: 'https://app.example.com',
});

console.log(run.status);       // 'completed'
console.log(run.cached);       // true on a cache hit — no planner LLM
console.log(run.creditsUsed);  // ~1 on replay vs ~10 on cold compile
console.log(run.result);       // structured data, not raw HTML

What happens on this call

  • Twin compiles the goal into a deterministic, replayable skill.
  • The next re-phrased request matches it in the semantic dispatch cache.
  • Matched runs replay with zero LLM calls — credits drop back toward ~1.
  • Every call is authenticated, billed, and written to the audit log.
Read the API docs

The outcome

A repetitive, authenticated agent task that costs the full model bill every run on raw-browser infra instead settles to a cache hit — illustratively ~5x cheaper per run after warmup — while staying fully observable through live view and session video.

FAQ

AI agents on Twin — common questions

How is this different from giving my agent a raw cloud browser?
A raw browser re-runs the LLM on every execution, so cost scales linearly with usage. Twin adds a semantic dispatch cache and deterministic replay on top, so repeated and re-phrased tasks hit a compiled skill at a fraction of the LLM cost.
Which agent frameworks does Twin work with?
Twin exposes a REST API under /api/v1/*, an MCP server (tools: run, compile_skill, run_skill) for Cursor, Claude Desktop, Claude Code and Cline, and LangChain and AutoGen tool adapters. Any agent that can call an HTTP endpoint or load an MCP tool can drive a Twin browser.
Does my agent still control the browser step by step?
Yes. You can run goal-to-action end to end, or drive lower-level steps and read the indexed DOM state yourself. Twin handles the execution, caching, replay, vault, and handoff underneath.

Put ai agents on autopilot.

Start free, compile your first skill, and watch the marginal cost per run trend toward zero.