CUP: Computer Use Protocol

Capture Scopes

The four capture scopes control how much of the UI CUP captures, from the window list alone to every window on screen.

Scopes

CUP supports four capture scopes that control the breadth and depth of the UI snapshot.

| Scope | Captures | Tree walking | Use case |
|---|---|---|---|
| foreground | Active window + window list | Yes | Default; interact with the focused app |
| overview | Window list only | No (instant) | Discover what apps are open |
| desktop | Desktop surface | Yes | Interact with desktop icons/widgets |
| full | All windows | Yes | Multi-window workflows |

Usage

import cup

session = cup.Session()

# Default: capture the foreground window
screen = session.snapshot()

# Just get the window list (instant, no tree walking)
windows = session.snapshot(scope="overview")

# Desktop icons and taskbar
desktop = session.snapshot(scope="desktop")

# Every window on screen
everything = session.snapshot(scope="full")

# Filter by app name (full scope only)
discord = session.snapshot(scope="full", app="Discord")

Foreground (default)

The most common scope. Captures the active/focused window's full accessibility tree plus a window list header for context.

# CUP 0.1.0 | windows | 2560x1440
# app: VS Code
# windows: VS Code (focused) | Terminal | Slack | Chrome
# 45 nodes (180 before pruning)

[e0] win "VS Code" ...
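
The comment lines at the top of a snapshot carry useful metadata. The sketch below shows one way to read them from the text output. It is not part of CUP: the function name `parse_snapshot_header` is hypothetical, the field layout is inferred only from the sample above, and treating the second field of the first line as a platform name is an assumption.

```python
import re

FOCUSED = " (focused)"

def parse_snapshot_header(text):
    """Parse the leading '#' lines of a snapshot into a dict (layout assumed from the sample)."""
    info = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            if line.strip():
                break      # first non-blank, non-comment line ends the header
            continue       # blank lines inside the header are skipped
        body = line[1:].strip()

        # e.g. "CUP 0.1.0 | windows | 2560x1440"
        m = re.fullmatch(r"CUP (\S+) \| (\w+) \| (\d+)x(\d+)", body)
        if m:
            info["version"] = m.group(1)
            info["platform"] = m.group(2)  # assumption: this field names the platform
            info["screen"] = (int(m.group(3)), int(m.group(4)))
            continue

        # e.g. "45 nodes (180 before pruning)"
        m = re.fullmatch(r"(\d+) nodes \((\d+) before pruning\)", body)
        if m:
            info["nodes"] = int(m.group(1))
            info["nodes_before_pruning"] = int(m.group(2))
            continue

        # e.g. "app: VS Code" or "windows: A (focused) | B | C"
        key, sep, value = body.partition(":")
        if not sep:
            continue
        value = value.strip()
        if key == "windows":
            names = [n.strip() for n in value.split("|") if n.strip()]
            focused = [n[: -len(FOCUSED)] for n in names if n.endswith(FOCUSED)]
            info["windows"] = [
                n[: -len(FOCUSED)] if n.endswith(FOCUSED) else n for n in names
            ]
            info["focused"] = focused[0] if focused else None
        else:
            info[key] = value
    return info
```

Comparing `nodes` with `nodes_before_pruning` gives a quick sense of how much the tree was reduced before it was returned.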

Overview

Returns only the window list, with no tree walking. Because CUP does not traverse any accessibility tree, the call returns almost immediately. Use this scope when you only need to know which apps are open.

# CUP 0.1.0 | windows | 2560x1440
# scope: overview
# windows:

VS Code (focused) | pid: 1234 | 120,40 1680x1020
Terminal | pid: 5678 | 0,0 800x600
Slack | pid: 9012 | 900,100 1000x800
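
If you need the window list as structured data, the lines above are regular enough to parse. The following is a minimal sketch, not a CUP API: `WindowInfo` and `parse_overview` are hypothetical names, and reading the last field as `x,y` position followed by `width x height` is an assumption based on the sample output.

```python
import re
from dataclasses import dataclass

@dataclass
class WindowInfo:
    title: str
    focused: bool
    pid: int
    x: int       # assumed: left edge in screen pixels
    y: int       # assumed: top edge in screen pixels
    width: int
    height: int

# Matches lines like: "VS Code (focused) | pid: 1234 | 120,40 1680x1020"
_WINDOW_LINE = re.compile(
    r"^(?P<title>.+?)(?P<focused> \(focused\))?"
    r" \| pid: (?P<pid>\d+)"
    r" \| (?P<x>-?\d+),(?P<y>-?\d+) (?P<w>\d+)x(?P<h>\d+)$"
)

def parse_overview(text):
    """Return a WindowInfo for each window line; header and unrecognized lines are skipped."""
    windows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = _WINDOW_LINE.match(line)
        if m is None:
            continue
        windows.append(
            WindowInfo(
                title=m.group("title"),
                focused=m.group("focused") is not None,
                pid=int(m.group("pid")),
                x=int(m.group("x")),
                y=int(m.group("y")),
                width=int(m.group("w")),
                height=int(m.group("h")),
            )
        )
    return windows
```

A caller could then pick the focused window with `next(w for w in parse_overview(text) if w.focused)`.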

Desktop

Captures the desktop surface: icons, widgets, taskbar, and system tray. Useful for launching apps or interacting with desktop shortcuts.

Full

Captures every window on screen. Use the app parameter to restrict the snapshot to one application's windows:

# All windows
session.snapshot(scope="full")

# Just Discord's windows
session.snapshot(scope="full", app="Discord")
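
Because the app filter applies to the full scope only, it can help to validate arguments before calling snapshot. The helper below is a hypothetical sketch, not part of CUP. It assumes the four scope names from the table are the only valid values and that the default scope can also be passed explicitly as "foreground". How CUP itself reacts to an app filter on another scope is not documented here, so the helper simply rejects that combination.

```python
VALID_SCOPES = ("foreground", "overview", "desktop", "full")

def snapshot_kwargs(scope="foreground", app=None):
    """Build keyword arguments for a snapshot call, rejecting unsupported combinations."""
    if scope not in VALID_SCOPES:
        raise ValueError(
            f"unknown scope {scope!r}; expected one of {', '.join(VALID_SCOPES)}"
        )
    if app is not None and scope != "full":
        # The app filter is described as applying to the full scope only.
        raise ValueError("the app filter is only supported with scope='full'")
    kwargs = {"scope": scope}
    if app is not None:
        kwargs["app"] = app
    return kwargs
```

With a session in hand, the result can be unpacked directly, for example `session.snapshot(**snapshot_kwargs("full", app="Discord"))`.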
