Desktop Client Architecture

Stack

Layer	Technology
GUI	PyQt6 + qasync
Audio	sounddevice + numpy
Screen Capture	mss + Pillow
Input Automation	pyautogui
Auth	Firebase REST API (httpx)
Transport	WebSocket (websockets)

Plugin System

The desktop client uses a plugin-based architecture where each capability is a separate module:

desktop-client/src/
├── main.py               # Typer CLI entry point
├── ws_client.py          # WebSocket client with auto-reconnect
├── gui.py                # PyQt6 main window
├── config.py             # Pydantic settings
├── plugin_registry.py    # Plugin discovery + registration
├── plugins/
│   ├── file_plugin.py    # File read/write/list/info + search
│   ├── command_plugin.py # Shell command execution
│   ├── screen_plugin.py  # Screen capture
│   └── input_plugin.py   # Mouse/keyboard automation
├── files.py              # Sandboxed file access
├── screen.py             # Screen capture implementation
├── actions.py            # pyautogui wrappers
├── audio.py              # PCM16 audio streaming
├── login_dialog.py       # Firebase login UI
└── firebase_auth.py      # Firebase REST auth
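
As a rough illustration of what plugin discovery and registration could look like, here is a minimal sketch. The `Plugin` and `PluginRegistry` names, their methods, and the stub handlers are assumptions made for this example; they are not taken from `plugin_registry.py`, whose actual interface is not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

ToolHandler = Callable[..., Any]


@dataclass
class Plugin:
    """Hypothetical plugin shape: a capability name plus its tool handlers."""
    name: str
    tools: Dict[str, ToolHandler] = field(default_factory=dict)


class PluginRegistry:
    """Hypothetical registry mapping capabilities and tool names to handlers."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Plugin] = {}
        self._tools: Dict[str, ToolHandler] = {}

    def register(self, plugin: Plugin) -> None:
        # Reject duplicates up front so one plugin cannot shadow another.
        if plugin.name in self._plugins:
            raise ValueError(f"plugin already registered: {plugin.name}")
        for tool_name in plugin.tools:
            if tool_name in self._tools:
                raise ValueError(f"tool name collision: {tool_name}")
        self._plugins[plugin.name] = plugin
        self._tools.update(plugin.tools)

    def capabilities(self) -> List[str]:
        # Sorted plugin names, suitable for advertising to the backend.
        return sorted(self._plugins)

    def tool_names(self) -> List[str]:
        return sorted(self._tools)

    def call(self, tool_name: str, **params: Any) -> Any:
        # Raises KeyError if no plugin registered this tool.
        return self._tools[tool_name](**params)


# Stub handlers stand in for the real screen and input plugins.
registry = PluginRegistry()
registry.register(Plugin("screen", {"screenshot": lambda: "placeholder"}))
registry.register(Plugin("input", {"click": lambda x, y: {"clicked": (x, y)}}))
```

Keeping a flat tool-name index alongside the plugin index makes name collisions visible at registration time rather than at invocation time.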

T3 Tool Registration

When the desktop client connects, it advertises its capabilities and tool definitions to the backend. The backend registers these as T3 client tools that the agent can invoke via reverse-RPC:

sequenceDiagram
    participant DC as Desktop Client
    participant BE as Backend
    participant AG as Agent

    DC->>BE: auth {capabilities, local_tools}
    BE->>BE: Register T3 tools
    AG->>BE: tool_call: "click(x=100, y=200)"
    BE->>DC: tool_invocation: {action: "click", params: {x:100, y:200}}
    DC->>DC: pyautogui.click(100, 200)
    DC->>BE: tool_result: {ok: true}
    BE->>AG: function_response
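
The client side of this exchange can be sketched as a small dispatcher. The `action`, `params`, and `ok` fields follow the diagram above; the `type` envelope field, the `error` and `result` fields, and the function names are assumptions for illustration. A stub handler is used in place of `pyautogui.click` so the sketch runs without a display.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[..., Any]


def handle_tool_invocation(raw: str, handlers: Dict[str, Handler]) -> str:
    """Dispatch one tool_invocation message and return a tool_result as JSON."""
    msg = json.loads(raw)
    action = msg.get("action")
    params = msg.get("params", {})

    handler = handlers.get(action)
    if handler is None:
        # Unknown tools are reported back instead of raising on the client.
        return json.dumps(
            {"type": "tool_result", "ok": False, "error": f"unknown action: {action}"}
        )

    try:
        result = handler(**params)
    except Exception as exc:  # report any handler failure to the backend
        return json.dumps({"type": "tool_result", "ok": False, "error": str(exc)})

    return json.dumps({"type": "tool_result", "ok": True, "result": result})


# Stub standing in for pyautogui.click; it only records the coordinates.
recorded_clicks: List[Tuple[int, int]] = []


def fake_click(x: int, y: int) -> None:
    recorded_clicks.append((x, y))


invocation = json.dumps(
    {"type": "tool_invocation", "action": "click", "params": {"x": 100, "y": 200}}
)
reply = json.loads(handle_tool_invocation(invocation, {"click": fake_click}))
```

Returning a structured error rather than raising keeps the WebSocket session alive when the agent requests a tool the client does not provide.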

File Upload to E2B

The desktop client can upload local files to an E2B cloud desktop sandbox, enabling workflows like:

1. The user has a CSV file on their local machine.
2. The agent instructs the desktop client to upload it to E2B.
3. The agent runs analysis code in the E2B sandbox.
4. The results are displayed via GenUI on the dashboard.
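
The transfer mechanism itself is not described here, so the following is only a sketch of how the client might package a local file for the upload step. The `build_upload_payload` function, the `upload_file` action name, the payload fields, and the size limit are all hypothetical; no E2B API is called.

```python
import base64
import hashlib
from pathlib import Path
from typing import Any, Dict


def build_upload_payload(
    local_path: str,
    dest_path: str,
    max_bytes: int = 10 * 1024 * 1024,  # assumed limit, not from the source
) -> Dict[str, Any]:
    """Read a local file and wrap it in a JSON-safe upload message."""
    data = Path(local_path).read_bytes()
    if len(data) > max_bytes:
        raise ValueError(f"file too large: {len(data)} bytes > {max_bytes}")

    return {
        "action": "upload_file",   # hypothetical action name
        "dest_path": dest_path,    # target path inside the sandbox
        "size": len(data),
        # The digest lets the receiver verify the file arrived intact.
        "sha256": hashlib.sha256(data).hexdigest(),
        # Base64 keeps binary content safe inside a JSON text frame.
        "content_b64": base64.b64encode(data).decode("ascii"),
    }
```

Base64 increases the payload size by about a third, so a real implementation would likely stream or chunk large files rather than send them in a single message.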