CAPABILITY · DOM

A compact, indexed map of the page — not raw HTML

Twin compiles a live page into a numerically indexed list of interactive elements under a token budget, so the planner sees signal, not markup.

x-twin-dom-tokens: 2,941
Built for the cost wedge

What token-efficient DOM state does

Feeding raw HTML to a model is slow and expensive: most of it is layout, scripts, and noise. Twin parses the live DOM into a compact, numerically indexed map of just the interactive elements, capped to a token budget. The planner acts on stable indices instead of brittle selectors, which is faster, cheaper, and more reliable than planning against raw markup.

Interactive elements only

Buttons, links, inputs, and elements with interactive roles are extracted; layout, scripts, and decoration are dropped before the model ever sees the page.
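
To make the idea concrete, here is a minimal sketch of interactive-only extraction. It is not Twin's implementation: it uses Python's standard html.parser on a static HTML string as a stand-in for a live DOM, and the tag and role sets (INTERACTIVE_TAGS, INTERACTIVE_ROLES) are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Assumed rule set: which tags and ARIA roles count as "interactive".
INTERACTIVE_TAGS = {"a": "link", "button": "button", "input": "input",
                    "select": "combobox", "textarea": "input"}
INTERACTIVE_ROLES = {"button", "link", "combobox", "checkbox", "textbox"}


class InteractiveExtractor(HTMLParser):
    """Collects (kind, label) pairs for interactive elements only."""

    def __init__(self):
        super().__init__()
        self.elements = []   # finished (kind, label) pairs, in document order
        self._open = None    # state for the interactive element being read

    def handle_starttag(self, tag, attrs):
        if self._open is not None:
            # Markup nested inside an interactive element: track depth only.
            if tag == self._open["tag"]:
                self._open["depth"] += 1
            return
        attr = dict(attrs)
        role = attr.get("role")
        if role in INTERACTIVE_ROLES:
            kind = role
        elif tag in INTERACTIVE_TAGS:
            kind = INTERACTIVE_TAGS[tag]
        else:
            return  # layout, scripts, decoration: ignored
        label = attr.get("aria-label") or attr.get("placeholder") or ""
        if tag == "input":
            self.elements.append((kind, label))  # void element, no end tag
        else:
            self._open = {"tag": tag, "kind": kind, "label": label,
                          "text": [], "depth": 1}

    def handle_data(self, data):
        # Text is only kept while inside an interactive element.
        if self._open is not None and data.strip():
            self._open["text"].append(data.strip())

    def handle_endtag(self, tag):
        if self._open is None or tag != self._open["tag"]:
            return
        self._open["depth"] -= 1
        if self._open["depth"] == 0:
            # An explicit label attribute wins over the visible text.
            label = self._open["label"] or " ".join(self._open["text"])
            self.elements.append((self._open["kind"], label))
            self._open = None


def extract_interactives(html: str) -> list:
    parser = InteractiveExtractor()
    parser.feed(html)
    parser.close()
    return parser.elements


PAGE = """
<div class="hero"><script>track()</script><p>Book a slot</p></div>
<a href="/calendar">Calendar</a>
<button><span>Tuesday</span></button>
<select aria-label="Time slot"><option>09:00</option></select>
<div role="button">Reserve</div>
<input placeholder="Notes">
"""

for kind, label in extract_interactives(PAGE):
    print(kind, repr(label))
```

The hero block, the script, and the paragraph never reach the output; only the five actionable elements survive.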

Stable numeric indices

The agent acts on [1], [2], [3] instead of fragile CSS/XPath selectors, so plans survive small page changes.

Hard token budget

The map is capped to fit a budget, so even huge pages stay cheap to reason about — a 50-step flow can run on a few thousand tokens.
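
A hard budget can be sketched as a simple trim over the map's lines. This is an illustration under stated assumptions, not Twin's algorithm: estimate_tokens uses a rough four-characters-per-token heuristic in place of a real tokenizer, and fit_to_budget keeps elements in document order and stops at the first one that does not fit, so the kept indices stay contiguous.

```python
def estimate_tokens(line: str) -> int:
    """Rough heuristic (assumption): about one token per four characters."""
    return max(1, -(-len(line) // 4))  # ceiling division


def fit_to_budget(lines: list, budget: int) -> list:
    """Keep lines in order until the next one would exceed the budget."""
    kept, used = [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if used + cost > budget:
            break  # hard cap: never exceed the budget
        kept.append(line)
        used += cost
    return kept


dom_map = [
    '[1] link "Calendar"',
    '[2] button "Tuesday"',
    '[3] combobox "Time slot"',
]
full_cost = sum(estimate_tokens(line) for line in dom_map)
print(fit_to_budget(dom_map, full_cost))      # everything fits
print(fit_to_budget(dom_map, full_cost - 1))  # last element is trimmed
```

A production version would likely rank elements by relevance before trimming rather than cutting from the end; that choice is outside what this page describes.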

Accessibility-aware

Roles and labels feed the map, which makes Twin a natural fit for accessibility automation as well as agents.

How it works

From a goal to deterministic action

  1. Load the page: Twin drives a real browser to the target URL and waits for the interactive state to settle.
  2. Extract interactives: The DOM is walked for actionable elements with their roles, labels, and positions.
  3. Index and budget: Elements are assigned stable numeric indices and trimmed to fit the token budget.
  4. Hand to the planner: The planner picks actions by index; the executor maps them back to real elements.
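
Steps 3 and 4 can be sketched as a small index table: the planner only ever sees the text lines, while the executor keeps the mapping from each index back to a real element. The names here (Element, build_index, planner_view, resolve) and the handle strings are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    role: str
    label: str
    handle: str  # hypothetical opaque reference to the real DOM node


def build_index(elements):
    """Assign 1-based numeric indices in document order."""
    return {i: el for i, el in enumerate(elements, start=1)}


def planner_view(index):
    """What the planner sees: indices, roles, and labels. No handles."""
    return [f'[{i}] {el.role} "{el.label}"' for i, el in index.items()]


def resolve(index, i):
    """What the executor does: map a planner-chosen index back to an element."""
    try:
        return index[i]
    except KeyError:
        raise LookupError(f"no element at index [{i}]") from None


index = build_index([
    Element("link", "Calendar", "node-12"),
    Element("button", "Tuesday", "node-17"),
    Element("combobox", "Time slot", "node-23"),
])
print("\n".join(planner_view(index)))
print(resolve(index, 2).handle)
```

Keeping the handle out of the planner's view is the point of the design: the model reasons over a few short lines, and only the executor touches the page.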
In practice

See it on a real call

A live page becomes a short indexed list — the planner acts on [2], [3], [4], not raw HTML.

dom-state.txt
# Indexed interactive map  (x-twin-dom-tokens: 2,941)
[1] link    "Calendar"
[2] button  "Tuesday"
[3] combobox "Time slot"
[4] button  "Reserve"
[5] input   "Notes" (optional)

→ plan: click[2] · select[3]="09:00" · click[4]
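
The plan line above can be read as a sequence of index-addressed actions. The sketch below parses that notation into structured steps; the grammar (a lowercase verb, an index in brackets, an optional quoted value, steps separated by "·") is inferred from this single example and is an assumption, not a documented Twin format.

```python
import re
from typing import NamedTuple, Optional


class Action(NamedTuple):
    verb: str
    index: int
    value: Optional[str]


# Assumed step grammar: verb[index] with an optional ="value" suffix.
STEP = re.compile(r'^(?P<verb>[a-z]+)\[(?P<index>\d+)\](?:="(?P<value>[^"]*)")?$')


def parse_plan(plan: str) -> list:
    """Split a plan on the middle dot and parse each step."""
    actions = []
    for step in plan.split("·"):
        match = STEP.match(step.strip())
        if match is None:
            raise ValueError(f"unrecognised step: {step.strip()!r}")
        actions.append(
            Action(match["verb"], int(match["index"]), match["value"])
        )
    return actions


for action in parse_plan('click[2] · select[3]="09:00" · click[4]'):
    print(action)
```

Each parsed step carries only a verb, an index, and an optional value, which is all an executor needs to look the target up in the indexed map.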
At a glance

What token-efficient DOM state is

The facts — how it works, what it costs, and the signal you get back on every call.

Property    Twin Browser
Input       Live DOM of the target page
Output      Indexed interactive map
Budget      Capped token count
Addressing  Stable numeric indices
Dropped     Layout, scripts, decoration
Signal      x-twin-dom-tokens header
FAQ

Token-efficient DOM state — common questions

Why not just send the HTML to the model?
Raw HTML is mostly noise and burns tokens fast. Twin extracts only interactive elements into a compact indexed map, which is cheaper to reason about and more robust to page changes.
What keeps plans from breaking on layout tweaks?
The agent acts on stable numeric indices and element roles rather than CSS/XPath selectors, so cosmetic changes rarely break a compiled skill.
How big can a page be?
The map is capped to a token budget regardless of page size, so even very large pages stay affordable to plan against.

Make every run cheaper than the last.

Start free, compile your first skill, and watch the marginal cost per run trend toward zero as your agents reuse what they have already learned.