Metadata-Version: 2.4
Name: shypmate
Version: 0.1.0.post20
Summary: Autonomous AI dev agents as Docker containers — add to any project as a submodule
Author-email: Paul R <paulr978@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/paulr978/shypmate
Project-URL: Repository, https://github.com/paulr978/shypmate
Project-URL: Issues, https://github.com/paulr978/shypmate/issues
Keywords: ai,agents,docker,autonomous,dev-agents,ci,llm
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Build Tools
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: click>=8.1.7
Requires-Dist: docker>=7.1.0
Requires-Dist: PyGithub>=2.5.0
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich>=13.7.0

# Shypmate — Your AI Development Multiplier

**Shypmate is not a copilot.** It's a workforce multiplier for developers.

While a copilot sits in your editor suggesting code as you type, shypmate takes tasks off your plate entirely. You assign work, and shypmate agents execute it in isolated Docker containers: cloning the repo, creating a branch, implementing the change, validating it, and opening a PR. You review and merge. Your capacity to multitask goes from 1x to Nx.

**You are the brain. Shypmate agents are your extra hands.**

```
Developer                          Shypmate Agents
─────────                          ────────────────
"Add pagination to /api/products"  → Agent 1 → PR #42
"Fix receipt parser bug"           → Agent 2 → PR #43
"Add unit tests for auth module"   → Agent 3 → PR #44
(continues working on own tasks)
```

Project-agnostic. Configure for any repo via `shypmate.yml` + `.env` — no code changes needed. Works with Python, Node.js, Go, Java, Rust, or any language with Docker support.

---

## Table of Contents

- [Quick Start](#quick-start)
- [Architecture Overview](#architecture-overview)
- [Directory Structure](#directory-structure)
- [CLI Reference](#cli-reference)
- [Agent Lifecycle](#agent-lifecycle)
- [PR Monitor (Legacy)](#pr-monitor-legacy)
- [Tools Reference](#tools-reference)
- [Configuration](#configuration)
- [Agent Hats & Task Pipeline](#agent-hats--task-pipeline)
- [Human Input](#human-input)
- [Development Mode](#development-mode)
- [PR Metadata Format](#pr-metadata-format)
- [Shared Services Policy](#shared-services-policy)
- [Conflict & Self-Healing Model](#conflict--self-healing-model)
- [Reusability Guide](#reusability-guide)
- [Security Model](#security-model)
- [Development Setup](#development-setup)
- [Extending the System](#extending-the-system)
- [Troubleshooting](#troubleshooting)

---

## Quick Start

### Prerequisites

- Docker installed and running
- Python 3.12+
- `GITHUB_TOKEN` with repo + PR permissions (stored in `.env`)
- `ANTHROPIC_API_KEY` for Claude API access (stored in `.env`)

### Install

```bash
pip install shypmate
```

### Setup

```bash
# 1. Interactive setup — creates shypmate.yml, detects project settings
shypmate init

# 2. Add secrets to .env (created by init if missing)
#    GITHUB_TOKEN=ghp_...
#    ANTHROPIC_API_KEY=sk-ant-...

# 3. Validate config, secrets, Docker, and quotas
shypmate validate

# 4. End-to-end test — builds image and runs a real agent in Docker
shypmate verify
```

### Run

```bash
# Add tasks and launch agents
shypmate task add "Add pagination to /api/products/" --priority high
shypmate work

# Launch in foreground (watch output live, interactive human input)
shypmate work --no-detach

# Or launch an agent directly (bypass task queue)
shypmate run-agent --task "Fix receipt parser bug" --no-detach

# Start the manager (monitors PRs, plans tasks, cleans up)
shypmate start

# Check running agents
shypmate list-agents

# Stop the manager
shypmate stop
```

---

## Architecture Overview

```
Your Machine (Host)
├── shypmate start
│   └── Manager Container (always-on)
│         ├── Generates project brief (shared with all agents)
│         ├── Plans pending tasks (one LLM call each)
│         ├── Monitors PRs and cleans up merged containers
│         └── Refreshes brief when base branch advances
│
├── shypmate work --all
│   ├── Worker Container (fresh clone, shypmate/dev/task-A)
│   ├── Worker Container (fresh clone, shypmate/dev/task-B)
│   └── ...
│
├── Docker Network (project_backend_net from shypmate.yml)
│   ├── db
│   ├── redis
│   ├── minio
│   └── ...
│
└── [Legacy] PR Monitor (polls GitHub, re-spawns stale agents)
```

### Manager / Worker Model

The **manager** (`shypmate start`) runs as a persistent container that:
- Generates a project brief via a single LLM call, shared with all workers
- Plans each pending task with a small LLM call
- Monitors open PRs and cleans up containers for merged/closed PRs
- Refreshes the brief when the base branch advances

**Workers** (`shypmate work`) are ephemeral agent containers. When a manager is running, workers automatically use the manager's pre-built context, skipping discovery and reducing LLM calls. Workers can also run without a manager.

### Core Principles

1. **One task = one container.** Full filesystem isolation. No agent can interfere with another.
2. **Shared services are external.** Agents connect to the project's existing Docker network (db, redis, minio) but never own or mutate them destructively.
3. **Each agent gets its own branch.** Branch naming: `shypmate/dev/<agent-id>-<task-slug>-<timestamp>`.
4. **Each agent opens one PR.** PR body contains hidden metadata so the monitor can reconstruct the task later.
5. **Self-healing via PR monitor.** When the base branch advances, the monitor detects stale AI PRs and re-spawns the original agent in rebase mode.

---

## Directory Structure

```
shypmate/
├── __init__.py
├── __main__.py                  # `python -m shypmate` entry point
├── main.py                      # Host-side CLI (click) — tasks, work, dashboard, agents
├── task_manager.py              # Task queue CRUD (shypmate_tasks.yml backend)
├── entrypoint.py                # Container lifecycle runner
├── Dockerfile                   # Agent container image
├── requirements.txt             # Python dependencies
├── docker-compose.monitor.yml   # Compose file for the PR monitor
├── setup_venv.sh                # Venv setup script (Linux/macOS/WSL)
├── setup_venv.bat               # Venv setup script (Windows)
├── .venv/                       # shypmate's own virtual environment (gitignored)
│
├── config/
│   └── project.py               # Loads config from env + shypmate.yml (framework code, don't edit)
│
├── context/
│   └── project_context.py       # Reads repo files → context string for the agent
│
├── agents/
│   └── dev_agent.py             # Agent definition (system prompt, tool registry)
│
├── tasks/
│   └── dev_task.py              # Task prompt builder (normal + rebase modes)
│
├── engine/
│   ├── __init__.py
│   ├── agent_loop.py            # Think → tool → repeat loop
│   ├── providers.py             # LLM provider abstraction (Anthropic, OpenAI)
│   └── tool_registry.py         # Tool schema generation + dispatch
│
├── crews/
│   └── dev_crew.py              # Dev crew assembly: provider + agent + task
│
├── tools/
│   ├── git_tools.py             # 12 git tools (branch, commit, rebase, push, diff...)
│   ├── github_tools.py          # 7 GitHub tools (create PR, comment, search issues, list AI PRs...)
│   ├── file_tools.py            # 5 file tools (read, write, list, search, patch)
│   └── shell_tools.py           # 3 shell tools (constrained commands, ruff check/format)
│
├── monitor/
│   ├── __init__.py
│   ├── Dockerfile               # Monitor container image
│   └── pr_monitor.py            # Polling service: detects stale PRs, re-spawns agents
│
└── docs/
    └── README.md                # This file
```

---

## CLI Reference

All commands run on the host via `shypmate <command>` (or `python -m shypmate <command>`).

### Task Management

#### `task add`

Add a task to the queue.

```bash
shypmate task add "Add pagination to /api/products/" --priority high
shypmate task add "Fix receipt parser bug" --priority critical --id fix-parser
shypmate task add "Refactor analytics module"  # defaults to normal priority
```

| Flag | Default | Description |
|------|---------|-------------|
| `--priority`, `-p` | `normal` | Priority: `critical`, `high`, `normal`, `low` |
| `--id` | auto-generated | Custom task ID |

#### `task list`

Show tasks in the queue.

```bash
shypmate task list              # pending + running only
shypmate task list --all        # include done/failed/cancelled
shypmate task list -s pending   # filter by status
shypmate task list -p high      # filter by priority
```

#### `task remove`

Remove a task.

```bash
shypmate task remove fix-parser
```

#### `task update`

Change a task's priority, status, or description.

```bash
shypmate task update fix-parser --priority critical
shypmate task update fix-parser --status cancelled
shypmate task update fix-parser --description "New description"
```

#### `work`

Pick up pending tasks (by priority) and launch agents.

```bash
shypmate work                # launch the next highest-priority task
shypmate work --all          # launch agents for ALL pending tasks
shypmate work -n 3           # launch the top 3 tasks
shypmate work --no-detach    # single task, foreground (watch output)
shypmate work --dev          # mount local src/ for live code changes
```

Tasks are picked in priority order: `critical > high > normal > low`, then FIFO within the same priority.
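
The selection rule can be expressed as a sort on a two-part key. The sketch below is illustrative only: the `Task` dataclass and its `created_seq` field are hypothetical stand-ins for whatever the task queue actually stores.

```python
from dataclasses import dataclass

PRIORITY_ORDER = {"critical": 0, "high": 1, "normal": 2, "low": 3}


@dataclass
class Task:
    id: str
    priority: str
    created_seq: int  # hypothetical insertion counter used for FIFO ordering
    status: str = "pending"


def next_tasks(tasks: list[Task], n: int = 1) -> list[Task]:
    """Return up to n pending tasks, highest priority first, FIFO within a tier."""
    pending = [t for t in tasks if t.status == "pending"]
    pending.sort(key=lambda t: (PRIORITY_ORDER[t.priority], t.created_seq))
    return pending[:n]
```

With this shape, `shypmate work -n 3` would correspond to `next_tasks(tasks, 3)` and `--all` to passing the queue length.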

#### `dashboard`

Live overview of tasks, running agents, and results.

```bash
shypmate dashboard            # show once
shypmate dashboard -w         # auto-refresh every 5 seconds
```

### Direct Agent Control

#### `run-agent`

Launch an agent directly, bypassing the task queue.

```bash
shypmate run-agent --task "Add pagination to /api/products/" --no-detach
shypmate run-agent --task "Fix bug" --id my-fix --rebase
```

| Flag | Default | Description |
|------|---------|-------------|
| `--task`, `-t` | (required) | Task description |
| `--id` | auto | Agent identifier |
| `--branch`, `-b` | auto | Git branch name |
| `--rebase` | off | Rebase/conflict-resolution mode |
| `--detach/--no-detach` | detach | Background or foreground |
| `--dev/--no-dev` | auto-detect | Dev mode: mount local src/ for live code changes |

### Setup & Validation

#### `init`

Interactive setup wizard. Auto-detects project settings (repo, branch, Docker network, tech stack, install command) and creates `shypmate.yml`.

```bash
shypmate init                    # interactive — prompts for each setting
shypmate init --non-interactive  # accept all auto-detected defaults
```

Re-run to update an existing `shypmate.yml`.

#### `validate`

Check configuration, secrets, Docker connectivity, and quotas. Reports errors and warnings.

```bash
shypmate validate
```

#### `verify`

End-to-end test. Builds the agent image (if needed) and runs a real agent in a Docker container with a small test task to confirm everything works.

```bash
shypmate verify                           # auto-generated test task
shypmate verify --task "Add a comment"    # custom test task
shypmate verify --skip-build              # skip image build
```

### Agent Operations

#### `review`

Review an agent's task, branch, container status, and recent logs.

```bash
shypmate review <id>                # agent ID, task ID, or container name
shypmate review <id> --logs 100    # show 100 lines of logs (default: 30)
```

#### `retry`

Re-launch an agent for the same task on the same branch. Useful after a failure.

```bash
shypmate retry <id>
shypmate retry <id> --no-detach    # foreground mode
```

#### `sync`

Reconcile task queue states with container and PR reality. Updates tasks whose containers have exited or whose PRs have been merged/closed.

```bash
shypmate sync
```

#### `clean`

Remove exited agent containers. By default, only removes containers whose PRs are merged or closed.

```bash
shypmate clean              # selective (merged/closed PRs only)
shypmate clean --all        # all exited containers
shypmate clean --dry-run    # preview without removing
```

### Human Input

#### `inbox`

Show agents waiting for human input. Scans running containers for pending questions.

```bash
shypmate inbox
```

#### `respond`

Respond to an agent's question.

```bash
shypmate respond <request-id> "your answer"
```

### Infrastructure

#### `build`

Build the agent Docker image.

```bash
shypmate build [--no-cache]
```

#### `start` / `stop`

Start or stop the shypmate manager container. The manager handles PR monitoring, task planning, and container cleanup. Replaces the legacy `monitor-start`/`monitor-stop` commands.

```bash
shypmate start                    # start manager
shypmate start --poll-interval 30 # custom poll interval
shypmate stop                     # stop manager
shypmate stop --remove-volume     # also remove shared brief volume
```

#### `list-agents`

List agent containers (running by default).

```bash
shypmate list-agents         # running only
shypmate list-agents --all   # include exited
```

#### `logs`

Show logs for an agent container.

```bash
shypmate logs <id>              # last 50 lines
shypmate logs <id> -f           # follow (live)
shypmate logs <id> -n 200      # last 200 lines
```

#### `test-loop`

Internal agent loop test. Runs the think-tool-repeat loop outside Docker for debugging.

```bash
shypmate test-loop --provider anthropic --model claude-sonnet-4-6
shypmate test-loop --mock          # mock LLM responses
shypmate test-loop --keep-workspace
```

#### `monitor-start` / `monitor-stop` (legacy)

Start/stop the standalone PR monitor container. **Prefer `shypmate start`/`stop` instead** -- the manager includes all monitor functionality plus task planning and brief generation.

```bash
shypmate monitor-start [--interval 60]
shypmate monitor-stop
```

---

## Agent Lifecycle

When you run `shypmate run-agent --task "..."`, this is what happens:

```
Host: main.py
  │
  ├── Generate agent ID, branch name
  ├── Validate GITHUB_TOKEN and ANTHROPIC_API_KEY
  ├── docker run shypmate-agent:latest with env vars
  │
  └── Container starts → entrypoint.py
        │
        ├── 1. Clone repo from GitHub (authenticated via GITHUB_TOKEN)
        ├── 2. Configure git identity (shypmate-<agent-id>)
        ├── 3. Create branch (or checkout existing for rebase)
        ├── 4. Install project dependencies (e.g. pip install -r backend/requirements.txt)
        ├── 5. Build project context (reads CLAUDE.md, decisions.md, etc.)
        ├── 6. Run dev agent (think → tool → repeat)
        │     │
        │     └── dev_agent.py executes the task:
        │           ├── Read existing code (file_tools)
        │           ├── Implement changes (file_tools)
        │           ├── Run validation (shell_tools: ruff check, ruff format)
        │           ├── Commit changes (git_tools)
        │           ├── Push branch (git_tools)
        │           └── Create PR with metadata (github_tools)
        │
        ├── 7. Log result
        └── 8. Exit (container stops)
```

### Rebase Mode

When `AGENT_REBASE=true` (set by the monitor or `--rebase` flag):

1. Checkout existing branch
2. Fetch latest from origin
3. Rebase onto `origin/master`
4. Resolve conflicts (simple ones automatically, complex ones flagged)
5. Re-run validation
6. Force-push the updated branch
7. Comment on the PR with a summary
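
The git side of these steps might look like the sequence below. It is a sketch under assumptions: `rebase_commands` is a hypothetical helper that only builds the argument lists, and the exact flags the agent's git tools pass are not documented here (only that `git_push` supports force-with-lease).

```python
def rebase_commands(branch: str, base_branch: str = "master") -> list[list[str]]:
    """Return the git invocations for steps 1-3 and 6, in order."""
    return [
        ["git", "checkout", branch],
        ["git", "fetch", "origin"],
        ["git", "rebase", f"origin/{base_branch}"],
        # Conflict resolution and validation (steps 4-5) happen here.
        ["git", "push", "--force-with-lease", "origin", branch],
    ]
```

`--force-with-lease` is the safer form of a force-push: it refuses to overwrite the remote branch if someone else has pushed to it since the last fetch.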

---

## PR Monitor (Legacy)

> **Note:** The PR monitor is superseded by the manager (`shypmate start`), which includes all monitor functionality plus task planning and brief generation. Use `shypmate start`/`stop` for new setups.

The PR monitor (`monitor/pr_monitor.py`) is a simple polling service — not an agent.

### What It Does

```
Every N seconds:
  │
  ├── Get latest master SHA from GitHub
  ├── If master hasn't changed → skip
  ├── If master advanced:
  │     ├── List all open PRs on shypmate/* branches
  │     ├── For each PR:
  │     │     ├── Compare branch HEAD to master
  │     │     ├── If stale (behind master):
  │     │     │     ├── Comment on PR: "Re-spawning agent to rebase"
  │     │     │     ├── docker run agent container in rebase mode
  │     │     │     └── Track as active respawn
  │     │     └── If up-to-date → skip
  │     └── Clean up finished respawns
  └── Sleep N seconds
```

### Running the Monitor

Option A — via CLI:
```bash
shypmate monitor-start --interval 60
```

Option B — via docker-compose:
```bash
cd shypmate
docker-compose -f docker-compose.monitor.yml up -d
```

The monitor container mounts `/var/run/docker.sock` so it can spawn sibling agent containers.

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `GITHUB_TOKEN` | (required) | GitHub PAT |
| `ANTHROPIC_API_KEY` | (required) | Passed to spawned agents |
| `GITHUB_REPO` | (required) | Repository to watch |
| `BASE_BRANCH` | `master` | Branch to monitor |
| `POLL_INTERVAL` | `60` | Seconds between polls |
| `DOCKER_NETWORK` | (required) | Network agents join |
| `AGENT_IMAGE` | `shypmate-agent:latest` | Image for spawned agents |

---

## Tools Reference

### Git Tools (`tools/git_tools.py`)

All tools operate on the local repo clone at `/workspace`.

| Tool | Description |
|------|-------------|
| `create_branch` | Create and checkout a new branch |
| `checkout_branch` | Checkout an existing branch |
| `git_fetch` | Fetch from remote |
| `git_rebase` | Rebase current branch onto another |
| `git_rebase_continue` | Continue rebase after conflict resolution |
| `git_rebase_abort` | Abort a rebase |
| `git_commit` | Stage all changes and commit |
| `git_push` | Push branch to origin (supports force-with-lease) |
| `git_diff` | Show stat diff against a base branch |
| `git_changed_files` | List files changed vs a base branch |
| `git_status` | Show working tree status |
| `git_log` | Show recent log entries |

### GitHub Tools (`tools/github_tools.py`)

| Tool | Description |
|------|-------------|
| `create_pull_request` | Create a PR with optional metadata block |
| `update_pull_request` | Update PR title/body |
| `comment_on_pr` | Add a comment to a PR |
| `search_issues` | Search GitHub issues for context before implementing (checks title + body) |
| `list_ai_pull_requests` | List all open PRs on `shypmate/*` branches |
| `get_pr_metadata` | Extract shypmate metadata from a PR body |
| `get_pr_changed_files` | List files changed in a PR |

### File Tools (`tools/file_tools.py`)

All paths are sandboxed to `/workspace`. Attempts to escape are rejected.

| Tool | Description |
|------|-------------|
| `read_file` | Read file contents (truncated at 50K chars) |
| `write_file` | Write/create a file |
| `list_directory` | Recursive directory listing (configurable depth) |
| `search_files` | Regex search across files (default: `*.py`) |
| `patch_file` | Find-and-replace a specific text block in a file |
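
A common way to implement this kind of path sandboxing is to resolve the requested path and check that it still lies under the workspace root. The sketch below shows that technique; `resolve_in_workspace` is a hypothetical name, and shypmate's actual check may differ.

```python
from pathlib import Path


def resolve_in_workspace(user_path: str, workspace: str = "/workspace") -> Path:
    """Resolve user_path under workspace; raise ValueError if it escapes."""
    root = Path(workspace).resolve()
    # Joining with an absolute user_path discards root, so "/etc/passwd" is caught too.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes workspace: {user_path}")
    return candidate
```

Resolving before comparing matters: a plain string-prefix check would accept `src/../../etc/passwd`.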

### Shell Tools (`tools/shell_tools.py`)

Shell access is **constrained by an allowlist**. The agent cannot run arbitrary commands.

| Tool | Description |
|------|-------------|
| `run_shell_command` | Run an allowlisted command |
| `run_ruff_check` | Run `ruff check` (with optional `--fix`) |
| `run_ruff_format` | Run `ruff format` |

**Built-in allowed prefixes:** `ruff `, `pytest `, `pip install`, `pip list`, `pip show`, `cat `, `head `, `tail `, `wc `, `diff `, `find `, `ls `

**Built-in blocked patterns:** `rm -rf`, `rm -r /`, `DROP`, `DELETE FROM`, `TRUNCATE`, `FLUSHALL`, `FLUSHDB`

**Project-specific additions** are configured in `shypmate.yml` under `shell.allowed_prefixes` and `shell.blocked_patterns`. These are merged with the built-in lists at runtime.
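
The allowlist check could be sketched as below. The two built-in lists are copied from this section; the function name `is_command_allowed`, the rule that blocked patterns take precedence, and the use of case-sensitive substring matching are assumptions for illustration.

```python
BUILTIN_ALLOWED = ["ruff ", "pytest ", "pip install", "pip list", "pip show",
                   "cat ", "head ", "tail ", "wc ", "diff ", "find ", "ls "]
BUILTIN_BLOCKED = ["rm -rf", "rm -r /", "DROP", "DELETE FROM", "TRUNCATE",
                   "FLUSHALL", "FLUSHDB"]


def is_command_allowed(command: str, extra_allowed=(), extra_blocked=()) -> bool:
    """Blocked patterns win; otherwise the command must start with an allowed prefix."""
    cmd = command.strip()
    # Project-specific entries from shypmate.yml are merged with the built-ins.
    blocked = [*BUILTIN_BLOCKED, *extra_blocked]
    allowed = [*BUILTIN_ALLOWED, *extra_allowed]
    if any(pattern in cmd for pattern in blocked):
        return False
    return any(cmd.startswith(prefix) for prefix in allowed)
```

Checking blocked patterns first means an allowed prefix cannot be used to smuggle in a destructive command, for example `find . | xargs rm -rf`.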

### Human Tools (`tools/human_tools.py`)

| Tool | Description |
|------|-------------|
| `request_human_input` | Ask a human for help when stuck or needing a decision. Blocks until a response arrives. Used automatically when soft quotas are reached. |

### Memory Tools (`tools/memory_tools.py`)

Agents build shared institutional knowledge that persists across runs. Memories are stored in `.shypmate/comms/memory/` and capped to prevent bloat.

| Tool | Description |
|------|-------------|
| `recall` | Read shared memories at the start of a task. Keyword search and category filter supported. |
| `remember` | Save a lesson, pattern, decision, or gotcha for future agents. Categories: `lesson`, `pattern`, `decision`, `context`. |
| `forget` | Remove outdated or incorrect memories by matching text. |

**CLI commands:**

| Command | Description |
|------|-------------|
| `shypmate memory list` | Show all shared memories (filter with `-c lesson`) |
| `shypmate memory add "fact" -c pattern` | Manually add a memory |
| `shypmate memory remove "substring"` | Remove entries matching text |
| `shypmate memory reset` | Clear all memories (with confirmation) |

Pinned entries survive eviction. When limits are reached, oldest unpinned entries are removed first.
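
The eviction rule can be sketched as follows. This is an illustration only: the `Memory` dataclass and `evict` function are hypothetical, and the sketch enforces just the entry-count limit, not the file-size limit.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    created_at: float  # e.g. a Unix timestamp
    pinned: bool = False


def evict(memories: list[Memory], max_entries: int = 50) -> list[Memory]:
    """Drop the oldest unpinned entries until at most max_entries remain."""
    excess = len(memories) - max_entries
    if excess <= 0:
        return list(memories)
    unpinned_oldest_first = sorted(
        (m for m in memories if not m.pinned), key=lambda m: m.created_at
    )
    # Compare by identity so equal-looking entries are not dropped together.
    doomed = {id(m) for m in unpinned_oldest_first[:excess]}
    return [m for m in memories if id(m) not in doomed]
```

Note that in this sketch pinned entries are never removed, so the result can exceed `max_entries` if there are more pinned entries than the limit allows.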

**Configuration in `shypmate.yml`:**
```yaml
memory:
  max_entries: 50    # max number of memories (default: 50)
  max_size: 8192     # max file size in bytes (default: 8KB)
```

**Or in `shypmate.py`:**
```python
config = {
    "memory": {"max_entries": 100, "max_size": 16384},
}

# Or as functions:
def memory_max_entries():
    return 100
```

---

## Configuration

shypmate is **project-agnostic**. All project-specific configuration comes from the sources described below:

### 1. `shypmate.yml` (project root)

This file lives at the root of your project (not inside `shypmate/`). It defines everything the agents need to know about your project: validation commands, context files, safety rules, shell allowlists, etc.

```yaml
github:
  repo: your-org/your-repo
  base_branch: main
  branch_prefix: shypmate/dev
  pr_title_prefix: "[AI]"
  pr_label: shypmate-agent

docker:
  network: your-project_backend_net

context_files:
  - README.md
  - docs/architecture.md

install_command: "pip install -r requirements.txt"

validation_commands:
  - command: "npm run lint"
    description: "Run linter"
  - command: "npm test"
    description: "Run tests"

shared_service_rules:
  - "Do NOT drop or truncate database tables."
  - "Do NOT flush Redis."

quota:
  soft:                          # pause and ask human
    max_cost: 0.50               # USD
    max_iterations: 15
  hard:                          # stop immediately (required)
    max_cost: 2.00               # USD
    max_iterations: 30

tech_stack:
  - "Node.js 20"
  - "PostgreSQL 16"

shell:
  allowed_prefixes:
    - "npm "
    - "npx "
  blocked_patterns:
    - "npm run db:reset"

llm:
  provider: anthropic            # or openai, gemini, groq, ollama, etc.
  model: claude-sonnet-4-6
  max_tokens: 8192
  # base_url: ""                 # auto-filled from provider preset
  # api_key_var: ANTHROPIC_API_KEY  # env var name for the API key
```

See the `shypmate.yml` at the root of this repo for a complete example.

#### Supported LLM Providers

| Provider | API key env var | Default Model | Notes |
|----------|-----|---------------|-------|
| `anthropic` | `ANTHROPIC_API_KEY` | claude-sonnet-4-6 | Native Anthropic SDK |
| `openai` | `OPENAI_API_KEY` | gpt-4o | Native OpenAI SDK |
| `gemini` | `GOOGLE_API_KEY` | gemini-2.5-pro | Via OpenAI-compatible endpoint |
| `azure` | `AZURE_OPENAI_API_KEY` | gpt-4o | Requires custom base_url |
| `mistral` | `MISTRAL_API_KEY` | mistral-large-latest | Via OpenAI-compatible endpoint |
| `groq` | `GROQ_API_KEY` | llama-3.3-70b-versatile | Via OpenAI-compatible endpoint |
| `deepseek` | `DEEPSEEK_API_KEY` | deepseek-chat | Via OpenAI-compatible endpoint |
| `together` | `TOGETHER_API_KEY` | Llama-3.3-70B-Instruct-Turbo | Via OpenAI-compatible endpoint |
| `ollama` | *(none)* | llama3.1 | Local — http://localhost:11434 |
| `lmstudio` | *(none)* | local-model | Local — http://localhost:1234 |
| `custom` | *(user-defined)* | *(user-defined)* | Any OpenAI-compatible API |

#### LLM Pool (multiple providers)

When running multiple agents in parallel, a single LLM provider can hit rate limits. The **LLM pool** distributes agents across multiple providers using round-robin assignment.

```yaml
llm:
  provider: anthropic           # fallback if no pool is defined
  pool:
    - provider: anthropic
      model: claude-sonnet-4-6
      api_key_var: ANTHROPIC_API_KEY

    - provider: openai
      model: gpt-4o
      api_key_var: OPENAI_API_KEY

    - provider: ollama           # local LLM in the mix
      model: llama3.1

    - provider: anthropic        # second Anthropic key to double rate limits
      model: claude-sonnet-4-6
      api_key_var: ANTHROPIC_API_KEY_2
```

When you run `shypmate work --all` or `shypmate work -n 4`, each agent is assigned the next provider in the pool:

- Agent 0 → Anthropic (key 1)
- Agent 1 → OpenAI
- Agent 2 → Ollama (local)
- Agent 3 → Anthropic (key 2)
- Agent 4 → wraps back to Anthropic (key 1)

Each pool entry can override `model`, `max_tokens`, `base_url`, and `api_key_var`. Values not specified fall back to the provider's defaults.

If no pool is defined, all agents use the single `llm:` config.

`shypmate validate` shows the pool status including which API keys are found:

```
LLM pool (4 providers — round-robin)
  [0] Anthropic (Claude) / claude-sonnet-4-6 (key found)
  [1] OpenAI (GPT) / gpt-4o (key found)
  [2] Ollama (local) / llama3.1 (no key needed)
  [3] Anthropic (Claude) / claude-sonnet-4-6 (key found)
```

#### Agent Quotas (required)

Every project **must** configure at least one hard quota to prevent runaway costs. Agents will refuse to run without a hard limit set.

Two tiers of limits:

- **Soft quota** — pauses the agent and asks the human whether to continue. Triggers once per run.
- **Hard quota** — stops the agent immediately. No questions asked. **Required.**

```yaml
quota:
  soft:
    max_cost: 0.50         # USD — pause and ask at $0.50
    max_iterations: 15     # pause at 15 API calls
    max_seconds: 300       # pause at 5 minutes
    max_tokens: 100000     # pause at 100K total tokens
  hard:
    max_cost: 2.00         # USD — hard stop at $2.00
    max_iterations: 30     # hard stop at 30 API calls
    max_seconds: 600       # hard stop at 10 minutes
    max_tokens: 500000     # hard stop at 500K total tokens
```

All costs are in **USD**. Cost is estimated using the configured provider's pricing (Sonnet: $3/M input, $15/M output).
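
As a worked example of the arithmetic, the sketch below estimates cost at the Sonnet rates quoted above and classifies usage against the two tiers. The function names, the dict shape, and the use of `>=` (taken from the log format shown later in this section) are assumptions; only cost and iteration limits are covered, and the "once per run" soft-quota behavior is not modeled.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_per_million: float = 3.0,
                  output_per_million: float = 15.0) -> float:
    """Estimated USD cost at per-million-token prices (defaults: the Sonnet rates)."""
    return (input_tokens * input_per_million
            + output_tokens * output_per_million) / 1_000_000


def check_quota(cost: float, iterations: int, soft: dict, hard: dict) -> str:
    """Return 'hard', 'soft', or 'ok' for the current usage."""
    usage = {"max_cost": cost, "max_iterations": iterations}
    # Check the hard tier first so it always takes precedence.
    for tier_name, tier in (("hard", hard), ("soft", soft)):
        if any(key in tier and usage[key] >= tier[key] for key in usage):
            return tier_name
    return "ok"
```

For instance, 100,000 input tokens and 10,000 output tokens come to $0.30 + $0.15 = $0.45, just under a $0.50 soft limit.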

When a soft quota is reached, the agent asks via `request_human_input`:
```
I've reached a resource quota: cost ($0.51 >= $0.50).
So far I've used 150,234 tokens ($0.51) in 12 iterations over 180s.
Should I continue working, or stop here?
Reply 'continue' to keep going or 'stop' to end.
```

When a hard quota is reached, the agent stops and logs:
```
AGENT STOPPED: HARD QUOTA: cost limit ($2.01 >= $2.00)
```

`shypmate validate` will show quota status and flag missing hard quotas as an error.

`shypmate init` prompts for quotas with sensible defaults ($0.50 soft / $2.00 hard).

### 2. `.env` (secrets + overrides)

Secrets and simple overrides go in your `.env` file (gitignored). **Never put API keys in `shypmate.yml`.**

**Required:**

| Variable | Purpose |
|----------|---------|
| `GITHUB_TOKEN` | Clone, push, PR operations |
| LLM API key | Depends on provider (see table above) |

Add the API key for each provider you use:

```
GITHUB_TOKEN=ghp_...
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY_2=sk-ant-...   # optional second key for pool
```

**Optional overrides** (take precedence over `shypmate.yml`):

| Variable | Overrides |
|----------|-----------|
| `SHYPMATE_GITHUB_REPO` | `github.repo` |
| `SHYPMATE_BASE_BRANCH` | `github.base_branch` |
| `SHYPMATE_DOCKER_NETWORK` | `docker.network` |
| `SHYPMATE_BRANCH_PREFIX` | `github.branch_prefix` |
| `SHYPMATE_LLM_PROVIDER` | `llm.provider` |
| `SHYPMATE_LLM_MODEL` | `llm.model` |
| `SHYPMATE_LLM_MAX_TOKENS` | `llm.max_tokens` |
| `SHYPMATE_LLM_BASE_URL` | `llm.base_url` |
| `SHYPMATE_POLL_INTERVAL` | `monitor.poll_interval_seconds` |
| `SHYPMATE_AGENT_IMAGE_NAME` | `agent.image_name` |
| `SHYPMATE_AGENT_IMAGE_TAG` | `agent.image_tag` |

### 3. `shypmate.py` (optional — programmatic config)

For configuration that requires logic — conditional settings, custom LLM selection strategies, or config derived from runtime state — create a `shypmate.py` at your project root.

`shypmate.py` uses the same config keys as `shypmate.yml`, but expressed as a Python dict. It also supports **hooks** — functions that customize shypmate's behavior at runtime.

You can provide config three ways in `shypmate.py` — use whichever fits:

**Option A: Config dict** (like shypmate.yml but in Python)
```python
config = {
    "github": {"repo": "myorg/myproject"},
    "tech_stack": ["Python", "Django"],
}
```

**Option B: Individual functions** (for dynamic/derived values)
```python
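import subprocess

def github_repo():
    """Derive "owner/repo" from the git remote URL."""
    result = subprocess.run(["git", "remote", "get-url", "origin"], capture_output=True, text=True)
    # Handles git@github.com:owner/repo.git and https://github.com/owner/repo.git
    url = result.stdout.strip().removesuffix(".git")
    parts = url.replace(":", "/").split("/")[-2:]
    # Fall back to a fixed value when no usable remote is configured
    return "/".join(parts) if len(parts) == 2 and all(parts) else "myorg/myproject"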

def tech_stack():
    return ["Python", "Django", "PostgreSQL"]

def install_command():
    return "pip install -r requirements.txt && npm ci"
```

**Option C: Both** — functions override the dict per-field
```python
config = {
    "tech_stack": ["Python", "Django"],
    "docker": {"network": "myproject_default"},
}

# Functions take precedence over the dict, so this supplies github.repo
def github_repo():
    return "myorg/myproject"
```

All three can coexist with `shypmate.yml` — yml always wins per-field.

#### Available config functions

Any of these function names will be called if defined, and their return value used:

| Function | Returns | Equivalent yml path |
|----------|---------|-------------------|
| `github_repo()` | `str` | `github.repo` |
| `base_branch()` | `str` | `github.base_branch` |
| `branch_prefix()` | `str` | `github.branch_prefix` |
| `pr_title_prefix()` | `str` | `github.pr_title_prefix` |
| `pr_label()` | `str` | `github.pr_label` |
| `docker_network()` | `str` | `docker.network` |
| `install_command()` | `str` | `install_command` |
| `tech_stack()` | `list[str]` | `tech_stack` |
| `context_files()` | `list[str]` | `context_files` |
| `validation_commands()` | `list[dict]` | `validation_commands` |
| `shared_service_rules()` | `list[str]` | `shared_service_rules` |
| `allowed_shell_prefixes()` | `list[str]` | `shell.allowed_prefixes` |
| `blocked_shell_patterns()` | `list[str]` | `shell.blocked_patterns` |
| `quota()` | `dict` | `quota` |
| `memory_max_entries()` | `int` | `memory.max_entries` |
| `memory_max_size()` | `int` | `memory.max_size` |
| `llm_provider()` | `str` | `llm.provider` |
| `llm_model()` | `str` | `llm.model` |
| `llm_pool()` | `list[dict]` | `llm.pool` |

#### Hooks

Hooks are functions that customize shypmate's runtime behavior (not just config values):

```python
def select_llm(task, pool, agent_index):
    """Custom LLM selection instead of round-robin.

    Each pool entry has: .provider, .model, .base_url, .api_key_var
    """
    if "refactor" in task.lower():
        return next((e for e in pool if e.provider == "anthropic"), pool[0])
    if "typo" in task.lower():
        return next((e for e in pool if e.provider == "ollama"), pool[0])
    return pool[agent_index % len(pool)]
```

| Hook | Signature | Purpose |
|------|-----------|---------|
| `select_llm` | `(task, pool, agent_index) -> pool_entry` | Custom LLM provider selection per task |

See `docs/shypmate_example.py` for a complete example with multiple strategies.

### 4. `config/project.py` (framework code — do not edit)

This file loads values from env vars, `shypmate.yml`, and `shypmate.py`. It has no hardcoded project values. You should never need to modify it.

### Priority order

```
Environment variable  >  shypmate.yml  >  shypmate.py  >  auto-detect  >  built-in default
```
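
The priority order above can be read as a first-match lookup across the layers. The following is a minimal sketch, not shypmate's actual loader: `resolve_setting` and all of its parameters are hypothetical names, and treating a `None` return from a `shypmate.py` function as "no value" is an assumption.

```python
import os
from typing import Any, Callable, Mapping, Optional

_MISSING = object()


def resolve_setting(
    env_var: str,
    yaml_key: str,
    yaml_config: Mapping[str, Any],
    py_override: Optional[Callable[[], Any]] = None,
    auto_detect: Optional[Callable[[], Any]] = None,
    default: Any = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Any:
    """Return the first value found, walking layers from highest to lowest priority."""
    environ = os.environ if environ is None else environ

    # 1. An environment variable wins over everything else.
    if env_var in environ:
        return environ[env_var]

    # 2. shypmate.yml, addressed with a dotted key such as "github.base_branch".
    node: Any = yaml_config
    for part in yaml_key.split("."):
        node = node.get(part, _MISSING) if isinstance(node, Mapping) else _MISSING
        if node is _MISSING:
            break
    if node is not _MISSING:
        return node

    # 3. Function from shypmate.py, then 4. auto-detection (None means "no value").
    for source in (py_override, auto_detect):
        if source is not None:
            value = source()
            if value is not None:
                return value

    # 5. Built-in default.
    return default
```

For example, with `{"github": {"base_branch": "develop"}}` loaded from YAML, the lookup returns `"develop"` unless `BASE_BRANCH` is set in the environment.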

---

## Agent Hats & Task Pipeline

Agents can wear different **hats** — specialized roles that form a pipeline per task. Instead of one agent doing everything, each role focuses on what it does best.

### Built-in Hats

| Hat | Role | Default | What it does |
|-----|------|---------|-------------|
| `developer` | Senior developer | Enabled | Implements the task, commits, pushes branch, writes test instructions |
| `qa` | QA engineer | Enabled | Validates the branch — runs tests, checks quality, passes or fails |
| `product_owner` | Product manager | **Disabled** | Reviews changes against requirements, approves or rejects |

**Default pipeline:** Developer → QA → PR opened for human review.

The human developer reviews and merges the PR. If you want an AI product owner in the loop, enable the `product_owner` hat in config.

### Pipeline Flow

```
Developer → QA → PR opened (for human review)
    ↑         |
    └─ FAIL ──┘  (developer retries with QA's feedback)
```

The developer provides test instructions to QA. QA doesn't need to know the implementation details — it runs the tests and validates. If QA fails, the developer gets specific feedback and retries.
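
The retry loop above can be sketched in a few lines. This is an illustration only: `run_dev_qa_loop`, `QAResult`, and the `max_attempts` cap are hypothetical, and the real pipeline may bound retries differently (for example through per-hat quotas).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QAResult:
    passed: bool
    feedback: str = ""


def run_dev_qa_loop(
    task: str,
    develop: Callable[[str, Optional[str]], str],
    validate: Callable[[str], QAResult],
    max_attempts: int = 3,  # assumption: the source does not state a retry limit
) -> bool:
    """Return True when QA passes (a PR can be opened), False once attempts run out."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        # Developer hat: implement (or retry with QA feedback), return test instructions.
        test_instructions = develop(task, feedback)
        # QA hat: receives only the test instructions, not implementation details.
        result = validate(test_instructions)
        if result.passed:
            return True
        feedback = result.feedback
    return False
```

The first call to `develop` receives `None` as feedback; each later call receives the feedback from the failed QA run before it.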

### Usage

```bash
# Run with default pipeline (developer → QA → PR)
shypmate run-pipeline --task "Add pagination to /api/products"

# Add product owner review before PR
shypmate run-pipeline --task "Fix auth bug" --hats "developer,qa,product_owner"

# Check pipeline progress
shypmate pipeline
shypmate pipeline <task-id>
```

### Adding More Hats

Enable a built-in hat or add custom ones with a few lines in `shypmate.yml`.

**Enable product owner:**
```yaml
hats:
  product_owner:
    enabled: true
```

**Add a custom hat:**
```yaml
hats:
  security_reviewer:
    prompt: "You are a security engineer. Review code for OWASP top 10 vulnerabilities."
    tools: [read_file, search_files, comment_on_pr]
```

**Conditional hats in `shypmate.py`:**
```python
def hats():
    return {
        "devops": {
            "prompt": "Review Dockerfile and CI/CD changes.",
            "tools": ["read_file", "run_shell_command"],
            "trigger": lambda task: "docker" in task.lower() or "ci" in task.lower(),
        },
    }
```

The `trigger` function makes hats activate only for relevant tasks. A DevOps hat only runs when the task mentions Docker or CI.
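
As a rough sketch of how trigger-based selection might work, the helper below filters a hat dictionary by task text. `active_hats` is a hypothetical name, and the rule that a hat without a `trigger` always runs is an assumption.

```python
from typing import Any, Dict, List


def active_hats(task: str, hats: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of hats that should run for this task, in declaration order."""
    selected = []
    for name, spec in hats.items():
        trigger = spec.get("trigger")
        # Assumption: a hat with no trigger is unconditional.
        if trigger is None or trigger(task):
            selected.append(name)
    return selected


hats = {
    "qa": {"prompt": "Validate the branch."},
    "devops": {
        "prompt": "Review Dockerfile and CI/CD changes.",
        "trigger": lambda task: "docker" in task.lower() or "ci" in task.lower(),
    },
}
```

Note that plain substring checks are broad: `"ci" in task.lower()` also matches words such as "pricing" or "specific", so a word-boundary regex may be a better fit for short keywords.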

### SDLC Workflow

The pipeline flow is described by an **SDLC prompt** — natural language that tells each agent the overall process and its role. This replaces hardcoded position numbers.

**Default SDLC (shipped with shypmate):**
```
1. Developer implements the task, commits, and pushes the branch.
   The developer provides test instructions for QA.
2. QA validates the branch — runs the developer's test instructions,
   checks code quality, and looks for obvious issues.
   If QA fails, developer gets specific feedback and retries.
3. Once QA passes, a PR is opened for the human developer to review and merge.
```

**Custom SDLC in `shypmate.yml`** — when you add hats, update the flow:
```yaml
sdlc: |
  1. Developer implements the task.
  2. Security reviewer checks for vulnerabilities (only for auth/payment code).
  3. QA validates — runs tests, coverage must be >80%.
  4. DevOps reviews if Dockerfile or CI files changed.
  5. Product owner verifies requirements are met.
  6. PR opened for human review.
  Failures at any stage loop back to developer with feedback.

hats:
  product_owner:
    enabled: true
  security_reviewer:
    prompt: "Review for OWASP top 10 vulnerabilities."
    tools: [read_file, search_files]
  devops:
    prompt: "Review infrastructure and CI/CD changes."
    tools: [read_file, run_shell_command]
    trigger: 'lambda task: "docker" in task.lower()'  # quoted: an unquoted ": " in the value is a YAML syntax error
```

**Or dynamically in `shypmate.py`:**
```python
import os

def sdlc():
    """Tighter process for production releases."""
    if os.environ.get("PROD_RELEASE"):
        return "1. Developer. 2. Security review. 3. QA. 4. PO. 5. PR."
    return "1. Developer. 2. QA. 3. PR."
```

Each agent receives the SDLC prompt so it understands the overall process and its specific role. The `--hats` flag on `run-pipeline` controls the order: `--hats "developer,security_reviewer,qa"`.

### Per-Hat Quotas

Each hat has its own cost/iteration limits since they do different amounts of work:

```yaml
hats:
  developer:
    max_cost: 1.00       # USD — does the most work
    max_iterations: 20
  qa:
    max_cost: 0.25       # mostly runs commands
    max_iterations: 10
  product_owner:
    enabled: true
    max_cost: 0.10       # just reads and evaluates
    max_iterations: 5
```

---

## Human Input

Agents can ask humans for help when stuck, facing ambiguous requirements, or hitting resource quotas.

### How It Works

1. The agent calls the `request_human_input` tool with a question and optional context.
2. The tool writes a JSON request file to a shared Docker volume (`shypmate-comms`) and blocks, polling for a response.
3. The human sees the question via `shypmate inbox` and replies with `shypmate respond`.
4. The response file appears on the shared volume, the tool returns the answer, and the agent continues.
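
The file-based handshake in steps 1 to 4 could look roughly like the sketch below. The file naming scheme (`<id>.request.json`, `<id>.response.json`), the JSON fields, and the function names are all assumptions for illustration; the actual tool may use a different layout on the `shypmate-comms` volume.

```python
import json
import os
import time
import uuid
from pathlib import Path
from typing import Optional


def _write_json_atomic(path: Path, payload: dict) -> None:
    """Write to a temp file, then rename, so a poller never reads a partial file."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(payload))
    os.replace(tmp, path)


def ask_human(comms_dir: Path, question: str, context: str = "",
              request_id: Optional[str] = None,
              poll_seconds: float = 2.0, timeout: float = 3600.0) -> str:
    """Agent side: write a request file, then block until the response appears."""
    request_id = request_id or uuid.uuid4().hex[:8]
    _write_json_atomic(comms_dir / f"{request_id}.request.json",
                       {"id": request_id, "question": question, "context": context})
    response_path = comms_dir / f"{request_id}.response.json"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if response_path.exists():
            return json.loads(response_path.read_text())["answer"]
        time.sleep(poll_seconds)
    raise TimeoutError(f"no response to request {request_id}")


def respond(comms_dir: Path, request_id: str, answer: str) -> None:
    """Human side: write the response file the agent is polling for."""
    _write_json_atomic(comms_dir / f"{request_id}.response.json", {"answer": answer})
```

Writing through a temporary file and renaming it matters here because both sides poll the same directory, and a plain write could be observed half-finished.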

### Commands

```bash
shypmate inbox                                  # list agents waiting for input
shypmate respond <request-id> "your answer"     # reply to an agent
```

### Interactive Mode (`--no-detach`)

When running an agent in foreground mode (`shypmate work --no-detach` or `shypmate run-agent --no-detach`), human input requests are displayed directly in the terminal. The CLI detects the `###SHYPMATE_HUMAN_INPUT###` marker in the container output and prompts you inline, so there is no need to use `inbox`/`respond`.

### Automatic Quota Prompts

When a soft quota is reached (cost, iterations, time, or tokens), the agent automatically calls `request_human_input` to ask whether to continue or stop. Hard quotas stop the agent immediately without asking.
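
The soft/hard distinction amounts to a three-way decision per metric. A minimal sketch, assuming each metric has one soft and one hard threshold (`QuotaAction` and `check_quota` are hypothetical names, not shypmate's API):

```python
from enum import Enum


class QuotaAction(Enum):
    CONTINUE = "continue"
    ASK_HUMAN = "ask_human"  # soft limit reached: ask whether to continue or stop
    STOP = "stop"            # hard limit reached: stop without asking


def check_quota(used: float, soft_limit: float, hard_limit: float) -> QuotaAction:
    """Classify current usage of one metric (cost, iterations, time, or tokens)."""
    # Check the hard limit first so it takes precedence over the soft one.
    if used >= hard_limit:
        return QuotaAction.STOP
    if used >= soft_limit:
        return QuotaAction.ASK_HUMAN
    return QuotaAction.CONTINUE
```

An agent loop would run this check once per metric after each iteration and act on the most severe result.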

---

## Development Mode

When developing shypmate itself, **dev mode** mounts your local `src/` directory into agent containers so code changes take effect without rebuilding the Docker image.

### Flags

```bash
shypmate work --dev         # force dev mode on
shypmate work --no-dev      # force dev mode off
shypmate run-agent --dev --task "..."
```

### Auto-Detection

If you installed shypmate in editable mode (`pip install -e .`), dev mode is enabled automatically. The CLI checks for a `pyproject.toml` in the source directory with `name = "shypmate"`. If it finds one, it mounts the local source as a read-only volume at `/opt/shypmate` inside the container and sets `SHYPMATE_DEV_MODE=1`.

### What It Does

- Mounts local `src/` to `/opt/shypmate` (read-only) inside the container, overriding the baked-in framework code.
- Sets the `SHYPMATE_DEV_MODE=1` environment variable in the container.
- Applies to both `work` and `run-agent` commands.

This means you can edit agent logic, tools, or the engine locally and test immediately without `docker build`.

---

## PR Metadata Format

Every AI-created PR contains a hidden metadata block in its body:

```html
<!-- SHYPMATE_META
{
  "task": "Add pagination to /api/products/",
  "agent_id": "dev-a1b2c3",
  "branch": "shypmate/dev/dev-a1b2c3-add-pagination-20260328-143022",
  "created_at": "2026-03-28T14:30:22.000000+00:00",
  "rebase_count": 0
}
SHYPMATE_META -->
```

This block is invisible when viewing the PR on GitHub but parseable by the monitor. It allows the monitor to:

- Reconstruct the original task description
- Re-spawn the exact same agent
- Track how many times a branch has been rebased

The `encode_pr_metadata()` and `decode_pr_metadata()` functions in `tools/github_tools.py` handle serialization.
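
For readers who want to parse the block outside shypmate, here is a minimal round-trip sketch. `encode_meta` and `decode_meta` are stand-ins written for this example and are not the functions in `tools/github_tools.py`, whose exact behavior may differ.

```python
import json
import re
from typing import Optional

_META_RE = re.compile(r"<!-- SHYPMATE_META\s*(\{.*?\})\s*SHYPMATE_META -->", re.DOTALL)


def encode_meta(meta: dict) -> str:
    """Serialize metadata as JSON inside an HTML comment."""
    payload = json.dumps(meta, indent=2)
    # "-->" inside a string value would end the HTML comment early, so escape
    # the ">" as a JSON unicode escape; json.loads turns it back into ">".
    payload = payload.replace("-->", "--\\u003e")
    return "<!-- SHYPMATE_META\n" + payload + "\nSHYPMATE_META -->"


def decode_meta(pr_body: str) -> Optional[dict]:
    """Extract the metadata dict from a PR body, or None if absent or malformed."""
    match = _META_RE.search(pr_body or "")
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
```

Returning `None` for a missing or malformed block lets a caller treat such PRs as not created by an agent instead of raising.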

---

## Shared Services Policy

Agents attach to the project's Docker network and can reach shared services. Strict rules apply:

### Allowed

- Read from the running database (inspect schema, query data)
- Read from Redis (check keys, inspect state)
- Run targeted validation commands (`manage.py check`, `manage.py showmigrations`)
- Run linting and formatting
- Run isolated tests if safe (must use per-agent test DB: `test_agent_{agent_id}`)

### Not Allowed

- `manage.py migrate` — never migrate the shared development database
- `manage.py flush` / `manage.py loaddata` / `manage.py dumpdata`
- `FLUSHALL` / `FLUSHDB` on Redis
- Delete or overwrite objects in MinIO
- Any destructive operation on shared infrastructure

These rules are enforced at two levels:
1. **Shell filtering**: the allowlist and blocked patterns in `shell_tools.py` prevent the commands from running
2. **Agent instructions**: safety rules are injected into the agent's backstory so the LLM avoids attempting them
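
The shell-level check can be pictured as a prefix allowlist combined with a pattern blocklist. The sketch below is illustrative only: `is_command_allowed` is a hypothetical name, and treating `shell.blocked_patterns` as case-insensitive regular expressions is an assumption.

```python
import re
from typing import Iterable


def is_command_allowed(command: str,
                       allowed_prefixes: Iterable[str],
                       blocked_patterns: Iterable[str]) -> bool:
    """Allow a command only if no blocked pattern matches and a prefix does."""
    cmd = command.strip()
    # Blocked patterns take precedence over the allowlist.
    if any(re.search(pattern, cmd, re.IGNORECASE) for pattern in blocked_patterns):
        return False
    # Prefixes such as "pytest " end with a space; appending one lets a bare
    # "pytest" match while still rejecting "pytestx".
    return any((cmd + " ").startswith(prefix) for prefix in allowed_prefixes)
```

A prefix check alone does not stop command chaining (for example `pytest; rm -rf /`), so a real implementation would also need to reject shell metacharacters or avoid invoking a shell at all.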

---

## Conflict & Self-Healing Model

### Automatic Resolution (agent handles)

- Simple merge conflicts in different parts of a file
- Migration numbering conflicts (e.g., two agents both create `0005_...`)
- Import ordering conflicts
- Naming conflicts for newly added files/classes if detectable
- Minor upstream refactors that don't change semantics

### Escalation (agent marks PR for human review)

- Semantic conflicts in the same function/method
- Contradictory schema changes
- Cases where the task intent is no longer valid after upstream changes
- Risky shared-service changes

When escalating, the agent:
1. Updates the PR body with a warning section
2. Comments on the PR explaining the conflict
3. Leaves the branch in a clean state (rebase aborted if necessary)

---

## Reusability Guide

shypmate is designed to work with any project. It has zero hardcoded project values.

### Step 1: Install shypmate

```bash
pip install shypmate
```

### Step 2: Initialize configuration

```bash
shypmate init        # interactive wizard — creates shypmate.yml and .env
shypmate validate    # check everything is configured correctly
```

### Step 3: Verify and run

```bash
shypmate verify                                    # end-to-end test in Docker
shypmate task add "Your first task" --priority high
shypmate work
```

### Files that are project-specific

| File | Purpose |
|------|---------|
| `shypmate.yml` | Project config (repo, network, validation, rules) |
| `.env` | Secrets and optional overrides |

---

## Security Model

### Secrets

- `GITHUB_TOKEN` and `ANTHROPIC_API_KEY` are passed as container env vars
- Never baked into Docker images
- Never committed to the repo

### GitHub Token Scopes

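The token needs:
- Classic token: the `repo` scope, which covers cloning, pushing branches, and creating, updating, and commenting on PRs
- Fine-grained token: read/write access to **Contents** and **Pull requests** on the target repository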

### Container Isolation

- Each agent runs in its own container with its own filesystem
- Workspace is sandboxed — file tools reject paths outside `/workspace`
- Shell access is allowlisted — only pre-approved commands run
- Destructive patterns are explicitly blocked
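
Workspace sandboxing of the kind described above is commonly done by resolving a path and checking that it stays under the root. A minimal sketch with hypothetical helper names (the file tools' real implementation is not shown in this document):

```python
from pathlib import Path

WORKSPACE = Path("/workspace")


def is_inside_workspace(user_path: str, workspace: Path = WORKSPACE) -> bool:
    """True if user_path, resolved against the workspace, stays inside it."""
    root = workspace.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check;
    # an absolute user_path replaces the root entirely and is rejected below.
    candidate = (root / user_path).resolve()
    return candidate.is_relative_to(root)


def resolve_in_workspace(user_path: str, workspace: Path = WORKSPACE) -> Path:
    """Return the resolved path, or raise if it escapes the workspace."""
    if not is_inside_workspace(user_path, workspace):
        raise PermissionError(f"path escapes workspace: {user_path}")
    return (workspace.resolve() / user_path).resolve()
```

Resolving before comparing is the important step: a plain string prefix check would accept `/workspace/../etc/passwd`.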

### Monitor Access

The monitor container mounts the Docker socket (`/var/run/docker.sock`) to spawn sibling containers. This is a privileged operation — the monitor should only run in trusted environments.

---

## Development Setup

If you're contributing to shypmate itself, install in editable mode after cloning:

```bash
git clone https://github.com/paulr978/shypmate.git
cd shypmate
pip install -e .
```

This makes the `shypmate` CLI point directly at your local `src/` files, so edits take effect immediately without reinstalling. Re-run `pip install -e .` only if you change `pyproject.toml` (new dependencies, entry points, etc.).

To verify your setup:

```bash
shypmate validate
```

---

## Extending the System

### Adding a New Tool

1. Create a function in the appropriate `tools/*.py` module
2. Decorate with `@tool("tool_name")`
3. Add it to the agent's tool list in `agents/dev_agent.py`

Example:

```python
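# tools/shell_tools.py
import shlex

@tool("run_mypy")
def run_mypy(path: str = ".") -> str:
    """Run mypy type checker."""
    # Quote the path so spaces or shell metacharacters cannot alter the command.
    return _run_command(f"mypy {shlex.quote(path)}")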
```

Then add `"mypy "` to `shell.allowed_prefixes` in `shypmate.yml` and add the tool import in `dev_agent.py`.

### Adding a New Agent Type

1. Create `agents/frontend_agent.py` (or similar)
2. Create a corresponding task in `tasks/frontend_task.py`
3. Optionally create a new crew in `crews/frontend_crew.py`
4. Add a new CLI command in `main.py` (e.g., `run-frontend-agent`)

### Adding New Validation Commands

Update `shypmate.yml` at your project root:

```yaml
validation_commands:
  - command: "ruff check --fix ."
    description: "Lint check and auto-fix"
  - command: "pytest backend/apps/myapp/tests/ -x"
    description: "Run targeted tests"

shell:
  allowed_prefixes:
    - "ruff "     # each validation command must match an allowlisted prefix
    - "pytest "
```

---

## Troubleshooting

### Agent container exits immediately

Check logs:
```bash
docker logs shypmate-<agent-id>
```

Common causes:
- Missing `GITHUB_TOKEN` or `ANTHROPIC_API_KEY`
- GitHub token lacks push/PR permissions
- Agent image not built (`shypmate build`)

### Agent can't reach shared services

Make sure the project's Docker network exists and services are running:
```bash
docker network ls | grep backend_net  # substitute your docker.network value
make up  # from project root, or however your project starts its services
```

### Monitor doesn't detect stale PRs

- Check monitor logs: `docker logs -f shypmate-monitor`
- Verify `GITHUB_REPO` and `BASE_BRANCH` are correct
- PRs must be on `shypmate/*` branches to be detected

### Clone fails inside container

- Verify `GITHUB_TOKEN` has `repo` scope
- For private repos, ensure the token has access

### Rebase fails with complex conflicts

The agent attempts automatic resolution for simple conflicts. If it can't resolve them, it will:
1. Abort the rebase
2. Comment on the PR explaining the conflict
3. Leave the PR open for human review

Check the PR comments for details.
