4 changes: 4 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -148,6 +148,7 @@ Commands with JSON output support:
- `anthropic-computer-use` - Anthropic Computer Use prompt loop
- `openai-computer-use` - OpenAI Computer Use Agent sample
- `gemini-computer-use` - Implements a Gemini computer use agent (TypeScript only)
- `moondream-groq-computer-use` - Moondream + Groq computer use agent (TypeScript + Python)
- `openagi-computer-use` - OpenAGI Lux computer-use models (Python only)
- `magnitude` - Magnitude framework sample (TypeScript only)
- `claude-agent-sdk` - Claude Agent SDK browser automation agent
@@ -517,6 +518,9 @@ kernel create --name my-agent --language ts --template stagehand
# Create a Python Computer Use app
kernel create --name my-cu-app --language py --template anthropic-computer-use

# Create a Moondream + Groq Computer Use app (TypeScript or Python)
kernel create --name my-moondream-cu --language ts --template moondream-groq-computer-use

# Create a Claude Agent SDK app (TypeScript or Python)
kernel create --name my-claude-agent --language ts --template claude-agent-sdk
```
2 changes: 2 additions & 0 deletions pkg/templates/python/moondream-groq-computer-use/.env.example
@@ -0,0 +1,2 @@
MOONDREAM_API_KEY=
GROQ_API_KEY=
64 changes: 64 additions & 0 deletions pkg/templates/python/moondream-groq-computer-use/README.md
@@ -0,0 +1,64 @@
# Kernel Python Sample App - Moondream Computer Use

This Kernel app runs a lightweight computer-use agent powered by Moondream vision models, with a Groq-hosted LLM handling fast orchestration.

## Setup

1. Get your API keys:
- **Moondream**: [moondream.ai](https://moondream.ai)
- **Groq**: [console.groq.com](https://console.groq.com)

2. Deploy the app:
```bash
kernel login
cp .env.example .env # Add your MOONDREAM_API_KEY and GROQ_API_KEY
kernel deploy main.py --env-file .env
```
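
Before deploying, it can help to confirm that `.env` defines both keys. The following is a minimal sketch, assuming a plain `KEY=VALUE` file like `.env.example`. The helpers `parse_env` and `missing_keys` are hypothetical and not part of the template.

```python
from pathlib import Path

REQUIRED_KEYS = ("MOONDREAM_API_KEY", "GROQ_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    path = Path(".env")
    text = path.read_text() if path.exists() else ""
    print("Missing keys:", missing_keys(text) or "none")
```

Run against an unedited copy of `.env.example`, this reports both keys as missing, since their values are empty.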

## Usage

Natural-language query (Groq LLM orchestrates Moondream + Kernel):
```bash
kernel invoke python-moondream-cua cua-task --payload '{"query": "Navigate to https://example.com and describe the page"}'
```

Structured steps (optional fallback for deterministic automation):
```bash
kernel invoke python-moondream-cua cua-task --payload '{
"steps": [
{"action": "navigate", "url": "https://example.com"},
{"action": "caption"},
{"action": "click", "target": "More information link", "retries": 4},
{"action": "type", "target": "Search input", "text": "kernel", "press_enter": true}
]
}'
```
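
The same structured payload can be assembled programmatically, which avoids hand-escaping JSON in the shell. This sketch assumes only what the commands above show: the `python-moondream-cua` app name, the `cua-task` action, and the `--payload` flag. `build_invoke_command` is a hypothetical helper.

```python
import json
import shlex


def build_invoke_command(steps, app="python-moondream-cua", action="cua-task"):
    """Return a shell-safe `kernel invoke` command line for the given steps."""
    payload = json.dumps({"steps": steps}, separators=(",", ":"))
    # shlex.join quotes the JSON payload so it survives the shell intact.
    return shlex.join(["kernel", "invoke", app, action, "--payload", payload])


steps = [
    {"action": "navigate", "url": "https://example.com"},
    {"action": "caption"},
    {"action": "click", "target": "More information link", "retries": 4},
]

command = build_invoke_command(steps)
print(command)
```

The printed line can be pasted into a terminal or passed to a process runner after splitting it with `shlex.split`.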

## Step Actions

Each step is a JSON object with an `action` field. Supported actions:

- `navigate`: `{ "url": "https://..." }`
- `click`: `{ "target": "Button label or description" }`
- `type`: `{ "target": "Input field description", "text": "...", "press_enter": false }`
- `scroll`: `{ "direction": "down" }` or `{ "x": 0.5, "y": 0.5, "direction": "down" }`
- `query`: `{ "question": "Is there a login button?" }`
- `caption`: `{ "length": "short" | "normal" | "long" }`
- `wait`: `{ "seconds": 2.5 }`
- `key`: `{ "keys": "ctrl+l" }`
- `go_back`, `go_forward`, `search`, `open_web_browser`
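
The per-action fields listed above can be checked before a payload is sent. This is a minimal sketch: the required-field table is inferred from this list alone and is an assumption (for example, it treats `length` on `caption` as optional and does not model explicit `x`/`y` coordinates replacing `target`). `validate_step` is a hypothetical helper, not the template's own validation.

```python
# Assumed required fields per action, inferred from the README list.
REQUIRED_FIELDS = {
    "navigate": ("url",),
    "click": ("target",),
    "type": ("target", "text"),
    "scroll": ("direction",),
    "query": ("question",),
    "caption": (),
    "wait": ("seconds",),
    "key": ("keys",),
    "go_back": (),
    "go_forward": (),
    "search": (),
    "open_web_browser": (),
}


def validate_step(step: dict) -> list[str]:
    """Return a list of problems found in one step (empty if none)."""
    action = step.get("action")
    if action not in REQUIRED_FIELDS:
        return [f"unknown action: {action!r}"]
    return [
        f"{action}: missing field {name!r}"
        for name in REQUIRED_FIELDS[action]
        if name not in step
    ]
```

For instance, `validate_step({"action": "type", "target": "Search input"})` reports that `text` is missing, while a bare `{"action": "caption"}` passes.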

Optional step fields:
- `retries`: override retry attempts for point/click/type
- `retry_delay_ms`: wait between retries
- `x`, `y`: normalized (0-1) or pixel coordinates that bypass Moondream pointing (pixel coordinates are interpreted against the detected screenshot size)
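
One plausible reading of the `x`/`y` rule is that values in the 0-1 range are scaled by the screenshot size, and anything larger is already in pixels. The sketch below works under that assumption. The template's actual heuristic is not shown in this README, and `resolve_point` is a hypothetical name.

```python
def resolve_point(x: float, y: float, width: int, height: int) -> tuple[int, int]:
    """Convert normalized (0-1) or pixel coordinates to clamped pixel coordinates."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        # Treat both values as fractions of the screenshot size.
        px, py = x * width, y * height
    else:
        # Treat both values as pixel coordinates.
        px, py = x, y
    # Clamp so the point always lands inside the screenshot.
    px = min(max(int(round(px)), 0), width - 1)
    py = min(max(int(round(py)), 0), height - 1)
    return px, py
```

Note the ambiguity this rule carries: a pixel coordinate such as `(1, 1)` is indistinguishable from the normalized bottom-right corner, so this sketch resolves it as normalized.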

## Replay Recording

Add `"record_replay": true` to the payload to capture a video replay (paid Kernel plans only).

## Notes

- The agent uses Moondream for visual reasoning and pointing.
- Kernel screenshots are PNG; Moondream queries are sent as base64 data URLs.
- The Groq LLM must emit its actions as JSON; the agent repairs malformed output and parses it with `json-repair`.
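
The base64 data URL mentioned in the notes is a standard encoding and can be sketched with the standard library alone. `png_to_data_url` is a hypothetical helper, since this README does not show how the template builds its Moondream requests.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_to_data_url(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 data URL."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("expected PNG data")
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"
```

Checking the signature first catches the case where a screenshot call returned an error body instead of image data.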
6 changes: 6 additions & 0 deletions pkg/templates/python/moondream-groq-computer-use/_gitignore
@@ -0,0 +1,6 @@
.venv/
__pycache__/
*.pyc
.env
.env.local
uv.lock