Metadata-Version: 2.4
Name: trapdoor-mcp
Version: 1.2.6
Summary: Work item + test management MCP server for AI agents
Project-URL: Homepage, https://pypi.org/project/trapdoor-mcp/
Project-URL: Documentation, https://github.com/nyecov/trapdoor-support
Project-URL: Issues, https://github.com/nyecov/trapdoor-support/issues
Requires-Python: <4.0,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Requires-Dist: pydantic<3.0,>=2.7
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.3.0; extra == "dev"
Requires-Dist: tomli<3.0,>=2.0; python_version < "3.11" and extra == "dev"
Provides-Extra: semantic
Requires-Dist: chromadb<2.0,>=1.5; extra == "semantic"
Requires-Dist: sentence-transformers<6.0,>=5.5; extra == "semantic"
Requires-Dist: requests<3.0,>=2.34; extra == "semantic"
Requires-Dist: opentelemetry-api<1.41,>=1.40; extra == "semantic"
Requires-Dist: opentelemetry-sdk<1.41,>=1.40; extra == "semantic"
Requires-Dist: opentelemetry-exporter-otlp-proto-grpc<1.41,>=1.40; extra == "semantic"
Dynamic: license-file

# Trapdoor — Work Item + Test Management for AI Agents

Trapdoor is a **black box work item and test management MCP server** for AI agents. Agents interact exclusively through MCP tools — never touch files, never run git, never leave the protocol.

```
Agent ──MCP──> Trapdoor ──git──> Private submodule (source of truth)
```

## Table of Contents

- [Quick Start](#quick-start)
- [Installation](#installation)
- [First-Run Setup](#first-run-setup)
- [MCP Client Configuration](#mcp-client-configuration)
- [All MCP Tools (30)](#all-mcp-tools-30)
- [CLI Commands](#cli-commands)
- [Key Concepts](#key-concepts)
- [TDD Workflow](#tdd-workflow)
- [Agent Integration](#agent-integration)
- [Configuration](#configuration)
- [Storage Layout](#storage-layout)
- [Multi-Agent Setup](#multi-agent-setup)
- [Development](#development)
- [Documentation](#documentation)
- [Issue Reporting](#issue-reporting)
- [License](#license)

---

## Quick Start

```bash
# Install from PyPI (stable release)
pip install trapdoor-mcp

# Or from GitHub (latest commit)
pip install git+https://github.com/nyecov/TrapDoor.git

# One-command first run (creates .trapdoor/ + git repo + starts server)
trapdoor-mcp --init
```

**AI agents** inside this repo can also run the auto-setup script:
```powershell
.\tools\Setup-Trapdoor.ps1          # from PyPI
.\tools\Setup-Trapdoor.ps1 -FromGitHub  # from GitHub
```

That's it. Your MCP client connects and discovers all 30 tools automatically.

---

## Installation

**Requirements:** Python 3.10+

Install from **PyPI** (stable release):

```bash
pip install trapdoor-mcp
```

Or from **GitHub** (latest commit, no local clone needed):

```bash
pip install git+https://github.com/nyecov/TrapDoor.git
```

Or from a **local source checkout** (editable, for development):

```bash
git clone https://github.com/nyecov/TrapDoor
cd TrapDoor
pip install -e .
```

There is no separate "non-semantic" version: the base package is the only package. It works fully out of the box with no optional extras installed.

**Optional — enable semantic search via ChromaDB:**

```bash
pip install trapdoor-mcp[semantic]
```

This adds ~130MB of dependencies (ChromaDB + `sentence-transformers`) for vector-based search. Without it, search defaults to SQLite FTS5 (bundled with Python, no install cost). The rest of Trapdoor is identical either way.
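
The fallback described above can be pictured as an import check. The following is only a sketch: `pick_search_backend` is a hypothetical helper, not Trapdoor's actual code, and it assumes the backend names `chromadb` and `fts` that the `search_items` tool description reports.

```python
import importlib.util


def pick_search_backend(configured: str = "chromadb") -> str:
    """Return "chromadb" only when it is requested and importable, else "fts"."""
    # find_spec returns None when the package is not installed.
    has_chromadb = importlib.util.find_spec("chromadb") is not None
    if configured == "chromadb" and has_chromadb:
        return "chromadb"
    # Hypothetical fallback name for the SQLite FTS5 backend.
    return "fts"
```

The `health_check` tool reports the configured and effective backend of a running server, so it is the authoritative way to see which one is active.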

**AI agent auto-setup:**

```powershell
.\tools\Setup-Trapdoor.ps1          # from PyPI
.\tools\Setup-Trapdoor.ps1 -FromGitHub  # from GitHub
```

Verify the install:

```bash
trapdoor --version       # trapdoor-mcp 1.2.6
trapdoor-mcp --help      # shows --init flag
```

---

## First-Run Setup

There are three approaches depending on your use case:

### A) Single Agent — Direct Tracking (Simplest)

If only one agent uses Trapdoor, track `.trapdoor/` directly in your project repo:

1. Remove `.trapdoor/` from `.gitignore` (see the explanatory comment in the file)
2. Run `trapdoor init` — creates `.trapdoor/` with default configs
3. Start the MCP server

### B) Multi-Agent — Private Submodule (Recommended for Teams)

Each agent gets a deploy key to a **separate private repo** — never to the project source:

1. Create a private repo (e.g. `my-project-trapdoor`)
2. Run `trapdoor init --submodule git@github.com:team/my-project-trapdoor.git`

This runs `git submodule add` and fills `.trapdoor/` with default configs.

### C) One-Command Auto-Init

For development or evaluation:

```bash
trapdoor-mcp --init       # auto-creates .trapdoor/ + git repo + starts server
trapdoor serve --init     # same via the CLI
```

If `.trapdoor/` is missing and `--init` is not passed, the server raises a clear error.

---

## MCP Client Configuration

Each MCP client needs a config pointing to the server. Examples:

**Claude Desktop (`claude_desktop_config.json`):**

```json
{
  "mcpServers": {
    "trapdoor": {
      "command": "trapdoor-mcp",
      "args": [],
      "cwd": "/path/to/your/project"
    }
  }
}
```

**opencode (`opencode.json`):**

```json
{
  "mcpServers": {
    "trapdoor": {
      "command": "trapdoor-mcp",
      "args": []
    }
  }
}
```

**Codex / Cursor / Gemini (`.codex.md` / `.cursorrules` / `.gemini.md`):**

```yaml
mcpServers:
  trapdoor:
    command: trapdoor-mcp
    args: []
```

The server discovers `.trapdoor/` by walking up from `cwd`. Ensure the working directory contains or is inside a project with `.trapdoor/`.
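
To check in advance whether a given working directory will resolve, the upward walk can be approximated with `pathlib`. This is a minimal sketch of the lookup as described here; `find_trapdoor_dir` is a hypothetical name, and the real server may apply additional rules.

```python
from pathlib import Path
from typing import Optional


def find_trapdoor_dir(start: Path) -> Optional[Path]:
    """Return the nearest `.trapdoor/` directory at or above `start`, else None."""
    start = start.resolve()
    # Check `start` itself first, then each ancestor up to the filesystem root.
    for candidate in (start, *start.parents):
        marker = candidate / ".trapdoor"
        if marker.is_dir():
            return marker
    return None
```

Calling `find_trapdoor_dir(Path.cwd())` from the directory your MCP client launches in shows which project, if any, the server would pick up.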

---

## All MCP Tools (30)

### Diagnostics (2)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `health_check` | Server health + search backend status (configured vs effective type, chromadb availability, index size) | — |
| `get_config` | Inspect running server configuration (storage_format, sync_mode, search_backend, duplicate_detection) | — |

### Knowledge Retrieval (5)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `search_items` | Search items — tool description shows active backend (`chromadb` or `fts`) at server start | `query`, `kind?`, `status?`, `include_terminal?`, `archived?`, `valid_before?`, `valid_after?`, `limit?`, `format?` |
| `get_chain` | Traverse parent/child, relates, tests, and bug links | `id`, `direction?`, `depth?`, `temporal?`, `aggregate?`, `include_test_stats?` |
| `get_neighbors` | All items within N hops — broad graph expansion | `id`, `depth?`, `relation?`, `direction?` (forward/backward/both) |
| `find_path` | Shortest relation chain between two items | `source`, `target`, `max_depth?` |
| `explain_chain` | Why an item is connected to its neighbors | `id` |

### Work Items (10)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `create_item` | Create any kind of work item | `title`, `kind?`, `priority?`, `description?`, `labels?`, `parent_id?`, `iteration_id?`, `relates?`, `metadata?`, `collaborators?`, `valid_from?`, `valid_until?`, `superseded_by?` |
| `show_item` | Full item with links block | `id`, `format?` (json or markdown) |
| `update_item` | Change status, assignee, fields | `id`, `title?`, `status?`, `priority?`, `assignee?`, `labels?`, `parent_id?`, `iteration_id?`, `relates?`, `metadata?`, `comment?`, `collaborators?`, `valid_from?`, `valid_until?`, `superseded_by?` |
| `list_items` | Filter items by any field | `status?`, `kind?`, `assignee?`, `label?`, `priority?`, `iteration_id?`, `collaborator?`, `archived?`, `valid_before?`, `valid_after?`, `include_terminal?`, `limit?`, `offset?`, `format?` |
| `claim_item` | Assign to agent with session tracking | `id`, `agent_id` |
| `complete_item` | Test-gated completion | `id`, `resolution` (done/cancelled/won_t_do), `comment?`, `force?` |
| `archive_item` | Soft-delete with reason | `id`, `reason?` |
| `unarchive_item` | Restore an archived item | `id` |
| `batch_update_items` | Bulk status/assignee/iteration changes | `status?`, `kind?`, `assignee?`, `label?`, `iteration_id?`, `new_status?`, `new_assignee?`, `new_iteration_id?` |
| `get_summary` | Project-wide stats | — |

### Tests (12)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `create_test_suite` | Group test designs | `name`, `item_id?`, `description?` |
| `create_test_design` | Write Gherkin scenarios | `title`, `level?`, `gherkin`, `suite_id?`, `item_id?`, `priority?`, `tags?` |
| `show_test_design` | Full design with latest execution | `id`, `format?` (json or markdown) |
| `update_test_design` | Update Gherkin, status, priority | `id`, `title?`, `gherkin?`, `status?`, `priority?`, `tags?` |
| `list_test_designs` | Filter by level, suite, status, tag, item | `level?`, `suite_id?`, `status?`, `tag?`, `item_id?`, `format?` |
| `execute_test` | Record a test execution | `test_design_id`, `result`, `run_id?`, `phase?`, `framework?`, `duration_ms?`, `step_results?`, `artifacts?`, `bug_ids?` |
| `create_test_run` | Create a named test run | `name`, `item_id?`, `description?` |
| `import_results` | Import Playwright/Cucumber/JUnit reports | `framework`, `report_path`, `run_id?`, `auto_create_designs?` |
| `export_feature_files` | Write `.feature` files to disk | `item_id?`, `suite_id?`, `output_dir?` |
| `import_gherkin` | Import `.feature` files | `glob`, `item_id?`, `suite_id?` |
| `list_executions` | Query execution history | `test_design_id?`, `run_id?`, `result?`, `framework?`, `environment?`, `limit?`, `offset?` |
| `export_obsidian` | Export items as Obsidian Markdown | `output_dir?` |

### Persistence (1)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `sync_items` | Commit + push all mutations; rebuilds index and links | `message?`, `force?` |

---

## CLI Commands

| Command | Description |
|---------|-------------|
| `trapdoor init` | Create `.trapdoor/` with default configs |
| `trapdoor init --submodule <url>` | Create as a git submodule linked to a remote |
| `trapdoor validate` | Full diagnostic scan |
| `trapdoor doctor` | Health check + auto-repair |
| `trapdoor show <id>` | Render item or test design as Markdown (supports `--format json`) |
| `trapdoor serve` | Start the MCP server via stdio |
| `trapdoor serve --init` | Auto-init and start in one command |
| `trapdoor export --json` | Export all items as JSON |
| `trapdoor export --dot` | Export as Graphviz DOT |
| `trapdoor export --graph` | Interactive HTML graph |
| `trapdoor export --timeline` | Lifecycle events as JSON |
| `trapdoor export --feature-files` | Generate `.feature` files |
| `trapdoor export --obsidian` | Export items as Obsidian Markdown notes |
| `trapdoor import --playwright <report>` | Import Playwright JSON report |
| `trapdoor import --cucumber <report>` | Import Cucumber JSON report |
| `trapdoor skill install` | Install agent integration skill |
| `trapdoor skill uninstall` | Remove agent integration skill |
| `trapdoor report` | Generate a sanitized public issue report (works without `.trapdoor/`) |
| `trapdoor-mcp --init` | Start MCP server, auto-init if missing |

---

## Key Concepts

### Work Items

Items have a `kind` (open string), a `status` (defined by the kind's state machine), and fields:

- **`priority`**: low / medium / high / critical
- **`labels`**: Free-form string tags
- **`assignee`**: Agent or human identifier
- **`parent_id`**: Link to parent item (creates hierarchy)
- **`iteration_id`**: Reference to an iteration item for sprint tracking
- **`relates`**: Cross-references to other items with relation types
- **`collaborators`**: Multi-agent pair mode
- **`metadata`**: Extensible dict (story_points, sprint, archived, etc.)
- **`valid_from` / `valid_until`**: Temporal validity window for decisions/experiments
- **`superseded_by`**: Link to replacement item when superseded
- **`comments`**: Append-only log of agent/human notes

Default kinds: `epic`, `story`, `bug`, `task`, `incident`, `iteration`, `plan`. Add custom kinds via `STATUS-MACHINES.json`.

### Temporal Validity

Items have a `valid_from` (defaults to creation time) and optional `valid_until`. Decisions, feature flags, and experiments use these to express lifespans. The `superseded_by` field links to the replacement item.

Querying:
- `list_items(valid_before="2026-07-01", valid_after="2026-06-01")` — items active in June 2026
- `search_items(query, valid_before=..., valid_after=...)` — same for search
- `get_chain(id, temporal=True)` — annotates chain with validity windows
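
The exact boundary semantics of `valid_before` and `valid_after` are not spelled out here. One plausible reading, offered purely as an assumption, is an overlap test between the item's validity window and the queried range. `is_active_in_window` below is a hypothetical helper illustrating that reading.

```python
from datetime import date
from typing import Optional


def is_active_in_window(
    valid_from: date,
    valid_until: Optional[date],
    valid_after: date,
    valid_before: date,
) -> bool:
    """True when the item's validity overlaps [valid_after, valid_before)."""
    # Assumption: the item must have started before the window closes.
    starts_in_time = valid_from < valid_before
    # An open-ended item (valid_until is None) never expires.
    still_valid = valid_until is None or valid_until >= valid_after
    return starts_in_time and still_valid
```

Under this reading, an open-ended item created in May 2026 counts as active in June 2026, while one whose `valid_until` fell in mid-May does not.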

### Relations

Cross-item relationships defined in `RELATIONS.json`. Each relation has a named `inverse`. The server auto-derives the reverse direction in `links.json`.

Built-in types:
| Relation | Inverse | Description |
|----------|---------|-------------|
| `blocks` | `blocked_by` | This item prevents another from progressing |
| `causes` | `caused_by` | This item introduces or causes the target |
| `duplicates` | `duplicated_by` | This item duplicates another |
| `relates_to` | `relates_to` | Symmetric relationship |
| `plans_for` | `planned_by` | This item is a plan for the target item |

Agents set the forward direction on the source item. The server auto-derives the inverse.
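
As an illustration of the inverse derivation, the built-in table can be treated as a lookup from each forward relation to its inverse. The triple format and the `derive_links` helper are assumptions made for this sketch; the actual layout of `links.json` is not documented here.

```python
# Forward relation -> inverse, copied from the built-in table above.
INVERSES = {
    "blocks": "blocked_by",
    "causes": "caused_by",
    "duplicates": "duplicated_by",
    "relates_to": "relates_to",
    "plans_for": "planned_by",
}


def derive_links(forward_links):
    """Expand (source, relation, target) triples with their inverse triples."""
    derived = set()
    for source, relation, target in forward_links:
        derived.add((source, relation, target))
        # The inverse points back from the target to the source.
        derived.add((target, INVERSES[relation], source))
    return sorted(derived)
```

For example, `derive_links([("WI-001", "blocks", "WI-002")])` yields the forward triple plus `("WI-002", "blocked_by", "WI-001")`, which is why agents only ever need to set one direction.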

### Iterations (Sprints)

Iterations are first-class work items of kind `iteration`:

```
planning → active → completed → closed
```

Features:
- Items join an iteration via `iteration_id`
- **Auto-rollover**: When an iteration is completed, unfinished items get their `iteration_id` cleared and a rollover comment is appended
- **Auto-close**: When the last item in an iteration is completed, the iteration closes automatically
- Query: `list_items(iteration_id="WI-042")`
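
The auto-rollover rule can be sketched over plain dictionaries. This is an illustration only: `complete_iteration` is a hypothetical helper, the comment text is invented, and treating `done`, `cancelled`, and `won_t_do` as the finished statuses is an assumption borrowed from the `complete_item` resolutions.

```python
def complete_iteration(iteration_id, items, terminal=("done", "cancelled", "won_t_do")):
    """Clear iteration_id on unfinished items and log a rollover comment."""
    rolled_over = []
    for item in items:
        if item.get("iteration_id") != iteration_id:
            continue  # belongs to another iteration
        if item["status"] in terminal:
            continue  # finished items stay attached to the iteration
        item["iteration_id"] = None
        item.setdefault("comments", []).append(
            f"Rolled over from iteration {iteration_id}"
        )
        rolled_over.append(item["id"])
    return rolled_over
```

Finished items keep their `iteration_id`, so `list_items(iteration_id=...)` still shows what was actually delivered in that iteration.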

### Archive Lifecycle

Items can be archived (soft-deleted) with a reason:

```python
archive_item("WI-001", reason="Superseded by WI-042")
# Returns { id, archived: true, archived_at, archived_reason }

unarchive_item("WI-001")
# Returns { id, archived: false }
```

Archived items are excluded from default `list_items` and `search_items` results unless explicitly filtered with `archived=True`.

### Plans

Plans are first-class work items of kind `plan`, used by agents to describe **how** a work item will be implemented. Plans are linked to their target item via the `plans_for`/`planned_by` relation.

Plan lifecycle:

```
drafting → planned → executing → complete
                 → abandoned
drafting → superseded
planned → superseded
executing → superseded
```

Workflow:

```python
# 1. Agent reads the item and enters /plan mode
show_item("WI-001")

# 2. Create a plan linked to the item
plan = create_item(
    kind="plan",
    title="Plan: Add rate limiting",
    description="## Implementation Steps\n1. ...\n2. ...\n3. ...",
    parent_id="WI-001",
)
update_item(plan["id"], relates=[{"id": "WI-001", "relation": "plans_for"}])

# 3. Finalize the plan
update_item(plan["id"], status="planned")

# 4. Execute
update_item(plan["id"], status="executing")
# (implementation happens outside Trapdoor)

# 5. Complete
complete_item(plan["id"], "done", force=True)
# Maps to status "complete" for plan kind
```

### Format Parameter

Several tools accept `format=` to control response shape:

| Tool | Formats |
|------|---------|
| `show_item` | default (json), `"markdown"` |
| `show_test_design` | default (json), `"markdown"` |
| `search_items` | default (full), `"ids"` |
| `list_items` | default (full), `"ids"` |
| `list_test_designs` | default (full), `"ids"` |

### Batch Operations

Update multiple items matching a filter in one call:

```python
batch_update_items(kind="bug", new_status="backlog")
# Returns { updated_count: 5, updated_ids: [...], errors: [...] }

batch_update_items(iteration_id="WI-042", new_iteration_id="WI-050")
# Moves all items from iteration WI-042 to iteration WI-050
```

Invalid transitions are reported per-item in `errors`; valid items still update.
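
The partial-success behaviour can be sketched as a loop that validates each item independently. The transition table below is hypothetical (the real one lives in `STATUS-MACHINES.json`), as are the `in_progress` status, the error message format, and the `batch_set_status` helper.

```python
# Hypothetical transition table; the real one lives in STATUS-MACHINES.json.
ALLOWED = {
    "backlog": {"in_progress"},
    "in_progress": {"backlog", "done"},
    "done": set(),
}


def batch_set_status(items, new_status):
    """Apply new_status where allowed; collect per-item errors for the rest."""
    updated_ids, errors = [], []
    for item in items:
        if new_status in ALLOWED.get(item["status"], set()):
            item["status"] = new_status
            updated_ids.append(item["id"])
        else:
            # A rejected item is reported but does not abort the batch.
            message = f"{item['status']} -> {new_status} not allowed"
            errors.append({"id": item["id"], "error": message})
    return {
        "updated_count": len(updated_ids),
        "updated_ids": updated_ids,
        "errors": errors,
    }
```

Callers should therefore inspect `errors` even when `updated_count` is non-zero.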

### Summary Statistics

```python
get_summary()
# Returns { total_items, by_kind: { task: 12, bug: 3 }, by_status: { backlog: 8, done: 4 }, by_assignee: {...} }
```

### Tests

Gherkin-based test designs with auto-parsed steps. Executions are append-only with step-level results.

**Lifecycle:**
1. `create_test_suite(name="Auth", item_id="WI-001")` — group tests
2. `create_test_design(title="Login", gherkin="Given...\nWhen...\nThen...", suite_id=..., item_id=...)`
3. `execute_test(test_design_id, result="pass")` — record result
4. `import_results(framework="playwright", report_path="./report.json")` — bulk import

**Flaky tests:** Mark a design as `flaky: true` and it won't block `complete_item`.

**Test-gated completion:** `complete_item(resolution="done")` checks all linked test designs. If any are failing and not flaky, the call is rejected.
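
The gating rule can be expressed in a few lines. This sketch assumes each linked design is a dictionary with a `latest_result` field and that a failing result is recorded as `"fail"`; both are illustrative assumptions, and the `force` parameter is not modelled.

```python
def blocking_designs(designs):
    """Return IDs of linked designs that should block completion."""
    return [
        d["id"]
        for d in designs
        # Flaky designs are ignored even when their latest result is a failure.
        if d.get("latest_result") == "fail" and not d.get("flaky", False)
    ]


def can_complete(designs):
    """Completion is allowed only when no non-flaky design is failing."""
    return not blocking_designs(designs)
```

Returning the blocking IDs rather than a bare boolean makes it easy to tell the agent which designs need attention before it retries.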

### Supported Import Formats

| Framework | Format | Auto-Create Designs |
|-----------|--------|---------------------|
| Playwright | JSON (`{ suites: [...] }`) | ✅ |
| Cucumber | JSON (`[{ name, steps }]`) | ✅ |
| JUnit | XML (`<testsuites><testsuite>...`) | — |

---

## TDD Workflow

```
1. show_item("WI-001")              Read requirement
2. create_test_suite("Auth")        Group tests
3. create_test_design("Login",      Write Gherkin scenario
     gherkin="Given...")
4. export_feature_files(...)        Write .feature files to disk
5. (implement code)                 Outside Trapdoor
6. (run tests)                      Outside Trapdoor
7. import_results("playwright",     Import pass/fail
     "./report.json")
8. list_executions(td_id)           Check what passed/failed
9. create_item(kind="bug", ...)     Create bug for failure
10. (fix code, re-run)              Outside Trapdoor
11. import_results(...)             Re-import
12. complete_item(id, "done")       Gated by passing tests
```

---

## Agent Integration

The agent connects to the MCP server and discovers tools automatically. For deeper integration, install the Trapdoor skill:

```bash
trapdoor skill install                   # auto-detect platform + agent
trapdoor skill install --platform codex  # explicit platform
trapdoor skill install --all-agents      # all roles under .agent/agents/
```

This copies the canonical skill folder (SKILL.md + references/) to your CLI's skill directory so the agent knows tool patterns, workflows, data model, and best practices.

The MCP server warns on startup if the skill is not installed (non-blocking).

---

## Configuration

`.trapdoor/config.json`:

```json
{
  "storage_format": "yaml",
  "sync_mode": "manual",
  "artifact_dir": ".trapdoor-artifacts",
  "auto_sync": {
    "max_mutations": 10,
    "max_interval_seconds": 300
  }
}
```

| Key | Values | Description |
|-----|--------|-------------|
| `storage_format` | `"yaml"` (default) or `"json"` | File format for item and test files |
| `sync_mode` | `"manual"` or `"auto"` | `"manual"` syncs only on explicit `sync_items` calls; `"auto"` also syncs when the `auto_sync` thresholds are reached |
| `artifact_dir` | Path string | Where test artifacts (screenshots, traces) are stored |

In `sync_mode: "auto"`, auto-sync triggers after:
- 10 mutations since last sync
- 5 minutes since last sync
- Any `complete_item` call
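
The three triggers above amount to a single "any of" check. The sketch below is hypothetical (`should_auto_sync` is not a Trapdoor function); its defaults mirror the `auto_sync` values in the example config.

```python
import time


def should_auto_sync(mutations_since_sync, last_sync_ts, just_completed=False,
                     max_mutations=10, max_interval_seconds=300, now=None):
    """Return True when any of the three triggers listed above fires."""
    # `now` is injectable so the check can be exercised without waiting.
    now = time.time() if now is None else now
    return (
        just_completed
        or mutations_since_sync >= max_mutations
        or now - last_sync_ts >= max_interval_seconds
    )
```

Raising `max_mutations` or `max_interval_seconds` in `config.json` would, under this reading, make syncs less frequent at the cost of more unsynced local state.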

---

## Storage Layout

```
.trapdoor/
  config.json                  # sync_mode, artifact_dir, storage_format
  SCHEMA.json                  # Work item JSON Schema
  TEST-SCHEMA.json             # Test design + execution JSON Schema
  STATUS-MACHINES.json         # Work item kinds, states, transitions
  RELATIONS.json               # Relation types with named inverses
  TEST-LEVELS.json             # Test levels (unit, integration, contract, ...)
  index.json                   # Cache: maps ID to kind/status/title
  items/
    WI-*.{json,yaml}           # Work item files
  tests/
    designs/TCD-*              # Test designs (Gherkin)
    suites/TCS-*               # Test suites
    executions/TCE-*           # Append-only execution records
  sessions/                    # Claim sessions
  links.json                   # Auto-managed bidirectional link registry
  exports/                     # Generated exports (feature files, Obsidian)
```

Item and test files are stored as `.json` or `.yaml` according to the `storage_format` setting.

---

## Multi-Agent Setup

Trapdoor supports multiple agents using the same `.trapdoor/` submodule simultaneously.

### Topologies

| Topology | How it works | Best for |
|----------|-------------|----------|
| **Single server** | One `trapdoor serve` process, multiple MCP clients via HTTP/SSE | Multiple agents, one machine |
| **Multi-server (same instance_id)** | Each agent runs its own `trapdoor` process, same `instance_id` | Agents on different machines, same submodule |
| **Multi-server (unique instance_id)** | Each process has unique `instance_id`, globally unique IDs | Fully isolated agents |

### Coordination

- **Counter isolation**: `instance_id` prefix prevents ID collisions (`WI-srvA-001` vs `WI-srvB-001`)
- **Claim coordination**: `claim_item` checks for active claims before assigning
- **Merge conflicts**: Last-writer-wins on same-item clashes. Index and links are rebuilt from filesystem on every `sync_items`
- **Session tracking**: Stale claims auto-release on `sync_items` when a claim has had no update for more than N minutes
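
To make the counter-isolation point concrete, here is a sketch of how prefixed IDs might be formatted. `make_item_id` is a hypothetical helper; the three-digit zero padding is inferred from the examples above and is an assumption, not a documented rule.

```python
def make_item_id(counter, instance_id=None, width=3):
    """Format a work item ID, adding the instance prefix when one is set."""
    number = f"{counter:0{width}d}"
    if instance_id:
        # Two servers with different prefixes can both be at counter 1 safely.
        return f"WI-{instance_id}-{number}"
    return f"WI-{number}"
```

With this scheme, `make_item_id(1, "srvA")` and `make_item_id(1, "srvB")` produce distinct IDs even though both servers are at the same counter value.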

### Sync Commands

| Command | When to use |
|---------|-------------|
| `sync_items()` | After batch mutations — commits and pushes |
| `sync_items(force=True)` | Bypass pre-sync validation warnings |

---

## Development

This project uses **pre-commit hooks** to enforce code quality. After cloning:

```powershell
.\tools\install-hooks.ps1          # idempotent — safe to re-run
```

Hooks run automatically on `git commit` and cover:

| Category | Hooks |
|----------|-------|
| **Linting & format** | `ruff`, `ruff-format` |
| **Validation** | YAML, JSON, TOML, merge conflict, large files, secret detection |
| **Convention** | Commit message format `[Tag] [WI-xxxx-xxx:]`, test file naming |
| **Project health** | `trapdoor validate` (advisory — warns only, never blocks) |

See [`tools/README.md`](tools/README.md) for the full hook inventory.

## Documentation

- [Development environment](docs/DEVELOPMENT.md) - isolated setup, dependency
  checks, warning policy, and supported Python versions

| Doc | What it covers |
|-----|----------------|
| `docs/REFINED-ARCHITECTURE.md` | Full design: all tools, data model, relations, integrations, multi-agent |
| `docs/DESIGN-NOTES.md` | Decision log and rationale |
| `docs/DEV-PLAN.md` | Development plan, backlog, completed milestones |
| `docs/PUBLISHING.md` | Step-by-step PyPI publishing guide |
| `docs/ISSUE_REPORTING.md` | AI-assisted and manual public issue reporting |
| `AGENTS.md` | Project rules for AI agents |
| `support-repo/` | Public issue intake repository, included as a submodule |

## Issue Reporting

PyPI users do not need access to the private source repository. Generate a
sanitized report:

```powershell
trapdoor report
```

Review `trapdoor-issue-report.md`, then submit it to the
[public Trapdoor support repository](https://github.com/nyecov/trapdoor-support/issues/new).
AI agents installed with `trapdoor skill install` receive detailed collection,
privacy-review, submission, and manual-fallback instructions. See
[`docs/ISSUE_REPORTING.md`](docs/ISSUE_REPORTING.md).

---

## License

MIT. The open-source, self-hosted version is free. A future hosted version will be paid above a revenue/headcount threshold.
