CUP (Computer Use Protocol)

UI Trees

How CUP represents screen contents as a tree of accessible elements.

What is a UI tree?

Every desktop application exposes a hierarchy of UI elements through its operating system's accessibility API. A window contains panels, panels contain buttons and text fields, text fields contain text. This hierarchy is the accessibility tree.

CUP captures this tree, normalizes it into a universal schema, and serializes it into a compact text format optimized for LLM consumption.

Tree structure

A CUP tree is an array of nodes. Each node represents a UI element:

{
  "id": "e14",
  "role": "button",
  "name": "Submit",
  "bounds": { "x": 120, "y": 340, "w": 88, "h": 36 },
  "states": ["focused"],
  "actions": ["click"],
  "children": []
}

In compact format, the same node becomes:

[e14] btn "Submit" 120,340 88x36 {foc} [clk]
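The mapping from the JSON node to the compact line can be sketched in a few lines of Python. This is an illustration only: the abbreviation tables contain just the three abbreviations visible in the example above (btn, foc, clk), and the fallback to the full name, the comma separator for multiple states or actions, and the omission of empty groups are all assumptions rather than documented CUP behavior. The function name to_compact is hypothetical.

```python
# Abbreviations taken from the single example above. Anything not listed
# falls back to the full name (an assumption, not CUP's actual table).
ROLE_ABBR = {"button": "btn"}
STATE_ABBR = {"focused": "foc"}
ACTION_ABBR = {"click": "clk"}


def to_compact(node: dict) -> str:
    """Render one node as a single compact line (children are not handled)."""
    b = node["bounds"]
    parts = [
        f'[{node["id"]}]',
        ROLE_ABBR.get(node["role"], node["role"]),
        f'"{node["name"]}"',
        f'{b["x"]},{b["y"]}',
        f'{b["w"]}x{b["h"]}',
    ]
    states = [STATE_ABBR.get(s, s) for s in node.get("states", [])]
    actions = [ACTION_ABBR.get(a, a) for a in node.get("actions", [])]
    # Assumed: empty groups are omitted and multiple entries are comma-joined.
    if states:
        parts.append("{" + ",".join(states) + "}")
    if actions:
        parts.append("[" + ",".join(actions) + "]")
    return " ".join(parts)


node = {
    "id": "e14",
    "role": "button",
    "name": "Submit",
    "bounds": {"x": 120, "y": 340, "w": 88, "h": 36},
    "states": ["focused"],
    "actions": ["click"],
    "children": [],
}
print(to_compact(node))  # [e14] btn "Submit" 120,340 88x36 {foc} [clk]
```

How nested children are laid out in the compact format is not shown here, so the sketch deliberately handles a single node only.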

Node properties

| Property | Description | Example |
| --- | --- | --- |
| id | Unique identifier (ephemeral per snapshot) | e14 |
| role | ARIA-derived semantic role | button, textbox |
| name | Accessible name (max 200 chars) | "Submit" |
| bounds | Screen position and size | { x: 120, y: 340, w: 88, h: 36 } |
| states | Active state flags | ["focused", "expanded"] |
| actions | Available interactions | ["click", "type"] |
| children | Nested child nodes | [...] |
| value | Current value (for inputs/sliders) | "hello" |
| platform | Native platform properties | { windows: { ... } } |
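For readers who want to work with nodes in typed code, the properties listed above could be modeled roughly as follows. This is a sketch, not CUP's own type definitions: the class names Bounds and Node, the helper node_from_dict, the choice of which fields are optional, and truncating over-long names to 200 characters are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

MAX_NAME_LEN = 200  # from the table: accessible names are capped at 200 chars


@dataclass
class Bounds:
    x: int
    y: int
    w: int
    h: int


@dataclass
class Node:
    id: str
    role: str
    name: str
    bounds: Bounds
    states: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)
    value: Optional[str] = None  # assumed absent for non-input elements
    platform: dict[str, Any] = field(default_factory=dict)


def node_from_dict(raw: dict) -> Node:
    """Build a Node from a JSON-style dict, recursing into children."""
    return Node(
        id=raw["id"],
        role=raw["role"],
        # Truncation is an assumed way of enforcing the 200-char limit.
        name=raw.get("name", "")[:MAX_NAME_LEN],
        bounds=Bounds(**raw["bounds"]),
        states=list(raw.get("states", [])),
        actions=list(raw.get("actions", [])),
        children=[node_from_dict(c) for c in raw.get("children", [])],
        value=raw.get("value"),
        platform=dict(raw.get("platform", {})),
    )
```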

Element IDs

Element IDs follow the pattern e0, e1, e2, etc. They are ephemeral: valid only for the snapshot that generated them. After performing any action, you must re-capture the tree to get fresh IDs, because an ID from an earlier snapshot may no longer refer to the same element.

This design keeps the protocol stateless: every snapshot is a complete, self-contained representation of the UI.
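To see why IDs from an old snapshot cannot be reused, consider a sketch in which IDs are assigned by position during a traversal. Only the e0, e1, e2 pattern comes from the text above; the pre-order numbering, the function name assign_ids, and the sample trees are assumptions for illustration.

```python
from itertools import count


def assign_ids(roots: list[dict]) -> dict[str, dict]:
    """Number every node e0, e1, ... and return an id -> node lookup.

    Pre-order traversal is an assumption; the source only gives the pattern.
    """
    counter = count()
    index: dict[str, dict] = {}

    def visit(node: dict) -> None:
        node["id"] = f"e{next(counter)}"
        index[node["id"]] = node
        for child in node.get("children", []):
            visit(child)

    for root in roots:
        visit(root)
    return index


# Hypothetical snapshots of the same window before and after an action
# that makes an error message appear above the form fields.
before = [{"role": "window", "name": "Login", "children": [
    {"role": "textbox", "name": "Email", "children": []},
    {"role": "button", "name": "Submit", "children": []},
]}]
after = [{"role": "window", "name": "Login", "children": [
    {"role": "text", "name": "Invalid email", "children": []},
    {"role": "textbox", "name": "Email", "children": []},
    {"role": "button", "name": "Submit", "children": []},
]}]

old_ids = assign_ids(before)
new_ids = assign_ids(after)
print(old_ids["e2"]["name"])  # Submit
print(new_ids["e2"]["name"])  # Email
```

Under this numbering, a single new element shifts every ID after it, so acting on a stale e2 would hit the wrong element.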

Tree depth

Trees can be deep for complex UIs. CUP's pruning pipeline removes noise (decorative elements, zero-size nodes, redundant containers) to keep the tree focused on interactive and meaningful elements.
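A minimal sketch of what such pruning might look like is shown below. The rules here are guesses at the spirit of the description, not CUP's actual pipeline: a zero-size node is dropped together with its subtree, and a node with neither a name nor any actions is treated as decorative or as a redundant container and replaced by whichever of its children survive. The function names prune and count_nodes are hypothetical.

```python
def prune(node: dict) -> list[dict]:
    """Return the nodes that replace `node` after pruning (zero, one, or many)."""
    b = node["bounds"]
    if b["w"] == 0 or b["h"] == 0:
        return []  # zero-size: drop the node and its subtree (assumed rule)

    kept = [k for child in node.get("children", []) for k in prune(child)]

    if not node.get("name") and not node.get("actions"):
        # Unnamed and non-interactive: hoist surviving children in its place.
        # A leaf of this kind simply disappears.
        return kept

    return [{**node, "children": kept}]


def count_nodes(nodes: list[dict]) -> int:
    """Count all nodes in a forest, including descendants."""
    return sum(1 + count_nodes(n.get("children", [])) for n in nodes)


def box(w: int, h: int) -> dict:
    """Helper for building sample bounds at the origin."""
    return {"x": 0, "y": 0, "w": w, "h": h}


# Hypothetical raw tree: a window wrapping an unnamed group with three children.
raw = [{
    "role": "window", "name": "Login", "bounds": box(400, 300), "actions": [],
    "children": [{
        "role": "group", "name": "", "bounds": box(400, 300), "actions": [],
        "children": [
            {"role": "button", "name": "Submit", "bounds": box(88, 36),
             "actions": ["click"], "children": []},
            {"role": "button", "name": "Hidden", "bounds": box(0, 0),
             "actions": ["click"], "children": []},
            {"role": "image", "name": "", "bounds": box(16, 16),
             "actions": [], "children": []},
        ],
    }],
}]

pruned = [k for root in raw for k in prune(root)]
print(count_nodes(raw), "->", count_nodes(pruned))  # 5 -> 2
```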

A typical Spotify window goes from 280 raw nodes to 63 nodes after pruning, a reduction of roughly 78% in tree size (217 of 280 nodes removed). This comes on top of the ~75% token reduction from compact encoding.
