MCP Tools
Reference for all 9 CUP tools exposed via the MCP server.
Tools overview
CUP's MCP server exposes 9 tools for AI agents to perceive and interact with the UI.
snapshot
Capture the active window's accessibility tree.
snapshot()
Returns the foreground window's UI tree in compact text format, with a header containing platform, screen, and app metadata.
snapshot_app
Capture a specific app's window by title.
snapshot_app(app: string)

| Parameter | Type | Description |
|---|---|---|
| app | string | Case-insensitive substring match against window titles |

Use this when you need to interact with a window that is not in the foreground.
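As a rough illustration, an MCP client invokes a tool such as snapshot_app by sending a JSON-RPC `tools/call` request. The sketch below only builds the request payload. The helper name `build_tool_call` and the value "notepad" are assumptions made for illustration, and the transport (stdio, HTTP) is left out.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method.

    `build_tool_call` is a hypothetical helper, not part of CUP.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "notepad" is an example value. Matching is a case-insensitive substring
# test against window titles, so it would also match "Untitled - Notepad".
request = build_tool_call("snapshot_app", {"app": "notepad"})
print(json.dumps(request, indent=2))
```

The other tools on this page would use the same envelope with a different `name` and `arguments` object.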
overview
List all open windows. Near-instant, no tree walking.
overview()
Returns a lightweight window list showing app names, PIDs, and bounds. Use this to discover what apps are open before targeting one with snapshot_app.
snapshot_desktop
Capture the desktop surface (icons, widgets, shortcuts).
snapshot_desktop()
Returns the desktop accessibility tree for interacting with desktop items.
find
Search the last captured tree for elements matching criteria.
find(query?, role?, name?, state?)

| Parameter | Type | Description |
|---|---|---|
| query | string? | Freeform semantic query (e.g., "play button") |
| role | string? | Role filter with synonyms |
| name | string? | Fuzzy name match |
| state | string? | Exact state match |

Searches the full tree, including pruned elements, using semantic matching, and returns results sorted by relevance. See Session API > find() for usage examples.
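Because every find parameter is optional, a client would typically send only the filters it actually has. A minimal sketch of assembling such an arguments object follows. The helper `find_arguments` is hypothetical, and the role, name, and state values are illustrative rather than taken from CUP's documented vocabulary.

```python
def find_arguments(query=None, role=None, name=None, state=None):
    """Return a find() arguments dict containing only the supplied filters."""
    candidates = {"query": query, "role": role, "name": name, "state": state}
    # Drop unset optional parameters so they are not sent as nulls.
    return {key: value for key, value in candidates.items() if value is not None}


# Freeform semantic query, using the example from the parameter table.
by_query = find_arguments(query="play button")

# Structured filters; "button", "Play", and "disabled" are illustrative values.
by_filters = find_arguments(role="button", name="Play", state="disabled")
```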
action
Perform an action on a UI element or send a keyboard shortcut.
action(action, element_id?, value?, direction?, keys?)

| Parameter | Type | Description |
|---|---|---|
| action | string | Action name (click, type, press, etc.) |
| element_id | string? | Target element (e.g., "e14") |
| value | string? | Text for type or setvalue |
| direction | string? | Direction for scroll (up/down/left/right) |
| keys | string? | Key combo for press (e.g., "ctrl+s") |

See Actions Reference for all 15 actions and their parameters.
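The parameter table implies that each optional parameter pairs with particular actions. The sketch below checks those pairings on the client side before a call is made. It covers only the actions named on this page, the helper `action_arguments` is hypothetical, and the server's own validation may differ.

```python
# Extra parameter each action needs, as read from the parameter table.
# Only actions named on this page are listed; see the Actions Reference
# for the full set.
REQUIRED_EXTRA = {
    "type": "value",
    "setvalue": "value",
    "scroll": "direction",
    "press": "keys",
}
SCROLL_DIRECTIONS = {"up", "down", "left", "right"}


def action_arguments(action, element_id=None, value=None, direction=None, keys=None):
    """Build an action() arguments dict, rejecting obviously incomplete calls."""
    supplied = {
        "element_id": element_id,
        "value": value,
        "direction": direction,
        "keys": keys,
    }
    required = REQUIRED_EXTRA.get(action)
    if required is not None and supplied[required] is None:
        raise ValueError(f"action {action!r} needs {required!r}")
    if action == "scroll" and direction not in SCROLL_DIRECTIONS:
        raise ValueError(f"unknown scroll direction: {direction!r}")
    arguments = {"action": action}
    # Include only the optional parameters that were supplied.
    arguments.update({k: v for k, v in supplied.items() if v is not None})
    return arguments


click = action_arguments("click", element_id="e14")
save = action_arguments("press", keys="ctrl+s")
```

Note that the keyboard shortcut call carries no element_id, matching the description of action as either targeting an element or sending a shortcut.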
open_app
Open an application by name with fuzzy matching.
open_app(name: string)

| Parameter | Type | Description |
|---|---|---|
| name | string | App name (fuzzy matched against installed apps) |

Waits for the app window to appear before returning.
page
Page through clipped content in a scrollable container.
page(element_id, direction?, offset?, limit?)

| Parameter | Type | Description |
|---|---|---|
| element_id | string | Scrollable container element ID (e.g., "e5") |
| direction | string? | Page direction: up, down, left, or right |
| offset | int? | Jump to a specific child index (overrides direction) |
| limit | int? | Override page size (default: match viewport count) |

When a snapshot shows "N more items — page(...) to see", use this tool to retrieve the next batch of hidden children. This does not scroll the actual UI; the children are served from the cached tree. Pagination resets after any action or new snapshot.
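To make the parameter interplay concrete, here is a small model of paging over a cached list of children. It is an illustration only, not CUP's implementation: the function `page_children`, stepping by the page size, treating left/right like up/down, and clamping at the list edges are all assumptions.

```python
def page_children(children, start, viewport_count, direction="down", offset=None, limit=None):
    """Return (new_start, visible_children) for one page() call.

    Hypothetical model: `children` stands in for the cached container
    children and `start` for the index of the first child currently shown.
    """
    # limit overrides the default page size, which matches the viewport count.
    size = limit if limit is not None else viewport_count
    if offset is not None:
        new_start = offset  # offset overrides direction
    elif direction in ("down", "right"):
        new_start = start + size
    else:  # "up" or "left"
        new_start = start - size
    # Clamp so the window stays inside the cached list (an assumption).
    new_start = max(0, min(new_start, max(len(children) - 1, 0)))
    return new_start, children[new_start:new_start + size]


items = [f"item {i}" for i in range(10)]
next_page = page_children(items, start=0, viewport_count=3)
jumped = page_children(items, start=0, viewport_count=3, offset=8)
```

Nothing in this model touches the live UI, which mirrors the point that paging reads from the cached tree rather than scrolling.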
screenshot
Capture a screenshot of the screen.
screenshot(region_x?, region_y?, region_w?, region_h?)

| Parameter | Type | Description |
|---|---|---|
| region_x | int? | Left edge of capture region |
| region_y | int? | Top edge of capture region |
| region_w | int? | Width of capture region |
| region_h | int? | Height of capture region |

By default captures the full primary monitor. Returns a PNG image.
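A client might handle the returned image along the following lines. This sketch assumes the PNG arrives as a standard MCP image content block (base64 `data` with `mimeType` "image/png"), which this page does not confirm. The helper `extract_png`, the region values, and the stand-in result are illustrative.

```python
import base64

# The fixed 8-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def extract_png(result):
    """Return the decoded bytes of the first PNG image block in a tool result."""
    for block in result.get("content", []):
        if block.get("type") == "image" and block.get("mimeType") == "image/png":
            data = base64.b64decode(block["data"])
            if not data.startswith(PNG_SIGNATURE):
                raise ValueError("image block does not contain PNG data")
            return data
    raise ValueError("no PNG image block in result")


# Arguments for a region capture; the coordinates are example values.
region_arguments = {"region_x": 0, "region_y": 0, "region_w": 800, "region_h": 600}

# Stand-in result carrying only the PNG signature, so the sketch runs offline.
fake_result = {
    "content": [
        {
            "type": "image",
            "mimeType": "image/png",
            "data": base64.b64encode(PNG_SIGNATURE).decode("ascii"),
        }
    ]
}
png_bytes = extract_png(fake_result)
```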