Getting Started
Quick Start
Capture your first UI tree and perform an action in 30 seconds.
Your first snapshot
Capture the foreground window
import cup
screen = cup.snapshot()
print(screen)

import { snapshot } from "computeruseprotocol";
const screen = await snapshot();
console.log(screen);

This prints the current foreground window's UI tree in compact format.
Find an element
import cup

session = cup.Session()
# Capture the tree
screen = session.snapshot()
# Search for a button
results = session.find(query="submit button")
print(results)

import { Session } from "computeruseprotocol";
const session = await Session.create();
// Capture the tree
const screen = await session.snapshot();
// Search for a button
const results = await session.find({ query: "submit button" });
console.log(results);

Perform an action
# Click the element
result = session.action("e14", "click")
print(result) # ActionResult(success=True, message="Clicked Submit")
# Type into a field
session.action("e5", "type", value="hello world")
# Press a keyboard shortcut
session.press("ctrl+s")

// Click the element
const result = await session.action("e14", "click");
console.log(result); // { success: true, message: "Clicked Submit" }
// Type into a field
await session.action("e5", "type", { value: "hello world" });
// Press a keyboard shortcut
await session.press("ctrl+s");

The typical agent workflow
Most AI agents follow a simple loop:
1. snapshot() → capture the current UI state
2. find() → locate the target element
3. action() → interact with it
4. snapshot() → verify the result

Each snapshot() returns fresh element IDs. Always re-capture after performing actions, since the UI tree may have changed.
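The four steps above can be sketched as one helper function. This is a minimal illustration, not CUP's own implementation: `act_on`, `SessionLike`, and `FakeSession` are hypothetical names introduced here, and the shape of the `find()` result (a list of matches, each with an `"id"` key) is an assumption that this page does not confirm. The fake session stands in for a real one so the sketch runs without a live UI.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Tuple


@dataclass
class ActionResult:
    # Local stand-in with the two fields shown in the action example output.
    success: bool
    message: str


class SessionLike(Protocol):
    """The subset of session methods this sketch relies on."""

    def snapshot(self) -> str: ...
    def find(self, query: str) -> list: ...
    def action(self, element_id: str, action: str, **kwargs) -> ActionResult: ...


def act_on(session: SessionLike, query: str, action: str,
           **kwargs) -> Tuple[ActionResult, Optional[str]]:
    """Run one snapshot -> find -> action -> snapshot cycle."""
    session.snapshot()                    # 1. capture the current UI state
    matches = session.find(query=query)   # 2. locate the target element
    if not matches:
        return ActionResult(False, f"No element matched {query!r}"), None
    # ASSUMPTION: each match exposes its element ID under the "id" key.
    element_id = matches[0]["id"]
    result = session.action(element_id, action, **kwargs)  # 3. interact with it
    after = session.snapshot()            # 4. re-capture; element IDs are fresh
    return result, after


class FakeSession:
    """Hypothetical in-memory stand-in; element IDs change on every snapshot."""

    def __init__(self) -> None:
        self.generation = 0
        self.clicked = False

    def snapshot(self) -> str:
        self.generation += 1
        state = "submitted" if self.clicked else "idle"
        return f"[e{self.generation}] button 'Submit' ({state})"

    def find(self, query: str) -> list:
        if "submit" in query.lower():
            return [{"id": f"e{self.generation}", "name": "Submit"}]
        return []

    def action(self, element_id: str, action: str, **kwargs) -> ActionResult:
        # Reject IDs that do not come from the most recent snapshot.
        if element_id != f"e{self.generation}":
            return ActionResult(False, f"Stale element ID {element_id}")
        self.clicked = action == "click"
        return ActionResult(True, f"Performed {action} on {element_id}")


result, after = act_on(FakeSession(), "submit button", "click")
print(result.success, after)
```

Passing a real session in place of `FakeSession()` should work only if `find()` returns matches shaped as assumed above; adjust the ID lookup to whatever the library actually returns.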
What's next?
- Learn about UI trees — how CUP represents screens
- Explore all 15 actions — click, type, scroll, and more
- Set up MCP — connect CUP to Claude Code or Cursor