Metadata-Version: 2.4
Name: cairn-ai
Version: 0.2.0
Summary: Runtime signal capture and distillation for AI-assisted Python development
Project-URL: Homepage, https://github.com/Cope-Labs/cairn
Project-URL: Repository, https://github.com/Cope-Labs/cairn
Project-URL: Issues, https://github.com/Cope-Labs/cairn/issues
Author-email: Cope Labs LLC <seth@copelabs.dev>
License: AGPL-3.0-or-later
License-File: LICENSE
Keywords: agent,ai,instrumentation,mcp,observability,testing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Requires-Dist: click<9.0,>=8.1
Provides-Extra: compat
Requires-Dist: nox>=2024.3; extra == 'compat'
Provides-Extra: config
Requires-Dist: tomli<3.0,>=2.0; (python_version < '3.11') and extra == 'config'
Provides-Extra: coverage
Requires-Dist: coverage<8.0,>=7.3; extra == 'coverage'
Provides-Extra: dev
Requires-Dist: coverage<8.0,>=7.3; extra == 'dev'
Requires-Dist: margin<1.0,>=0.9.23; extra == 'dev'
Requires-Dist: nox>=2024.3; extra == 'dev'
Requires-Dist: pytest<9.0,>=7.4; extra == 'dev'
Requires-Dist: ruff<1.0,>=0.5; extra == 'dev'
Requires-Dist: sentence-transformers<4.0,>=2.2; extra == 'dev'
Requires-Dist: tomli<3.0,>=2.0; (python_version < '3.11') and extra == 'dev'
Provides-Extra: embed
Requires-Dist: sentence-transformers<4.0,>=2.2; extra == 'embed'
Provides-Extra: health
Requires-Dist: margin<1.0,>=0.9.23; extra == 'health'
Provides-Extra: pytest
Requires-Dist: pytest<9.0,>=7.4; extra == 'pytest'
Description-Content-Type: text/markdown

# Cairn

[![PyPI version](https://img.shields.io/pypi/v/cairn-ai.svg)](https://pypi.org/project/cairn-ai/)
[![PyPI downloads](https://img.shields.io/pypi/dm/cairn-ai.svg)](https://pypi.org/project/cairn-ai/)
[![Python versions](https://img.shields.io/pypi/pyversions/cairn-ai.svg)](https://pypi.org/project/cairn-ai/)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)

**Writes its own `CLAUDE.md` — from what your code actually does.**

Your AI coding agent starts every session cold. Cairn gives it memory of what this codebase has actually failed on — captured from your runtime, distilled into typed patterns, injected into the agent's context before the next session.

```bash
pip install cairn-ai
pytest                    # cairn's pytest plugin auto-registers
cairn install             # scaffold CLAUDE.md / AGENTS.md / Cursor rules
cairn status              # see what was captured
```

No config, no decorators, no changes to your tests. Captured records are
scrubbed of secrets and PII before they hit disk.

---

## What it captures

Every run, Cairn writes structured signal to a local SQLite store:

- **Tests** — pytest pass/fail, assertion values, local state at failure
- **Execution** — function calls, return values, exceptions, duration (Python 3.12+ `sys.monitoring`; `@cairn.watch` decorator fallback on older versions)
- **Structure** — AST scan for 9 anti-patterns, untested branches (via `coverage`)
- **Git** — commit → test delta → error delta correlation

Records are idempotent, vector-embedded for semantic search, and decay on a configurable half-life.

`cairn distill` clusters failures, names them (deterministic heuristics or LLM via OpenRouter), and produces:

- **Context snippets** for the agent's system prompt
- **Lint rules** encoded as ruff/flake8 rules so the pattern can't recur silently
- **Test templates** — property-based stubs from observed function behaviour
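
As a rough illustration of the deterministic naming path, the sketch below groups failures by exception type and file, then emits headings in the same shape as the context output shown later in this README. The `name_clusters` function, the grouping key, and the dict-based record shape are assumptions made for illustration. Cairn's real clustering may work differently, for example by using the stored embeddings.

```python
from collections import Counter
from pathlib import PurePosixPath


def name_clusters(failures: list[dict]) -> list[str]:
    """Group failures by (exception type, file name, source) and name each group.

    Hypothetical heuristic: one cluster per distinct key, most frequent first.
    """
    counts = Counter(
        (f["exception_type"], PurePosixPath(f["source_file"]).name, f["source"])
        for f in failures
    )
    return [
        f"## {exc} in {filename} ({n}×) [{source}]"
        for (exc, filename, source), n in counts.most_common()
    ]


failures = [
    {"exception_type": "ValueError", "source_file": "app/config.py", "source": "pytest"},
    {"exception_type": "KeyError", "source_file": "app/api.py", "source": "pytest"},
    {"exception_type": "ValueError", "source_file": "app/config.py", "source": "pytest"},
]
for heading in name_clusters(failures):
    print(heading)
```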

---

## Safety

Captured content (pytest locals, monitor return values, ingested JSONL) is
passed through a secret/PII scrubber before it is written to SQLite. The
scrubber redacts:

- sensitive env-var values (by harvesting env at import time)
- `key=value` / `key: value` pairs whose key matches a sensitive-name heuristic
  (`password`, `token`, `api_key`, `authorization`, `cookie`, `session`,
  `private_key`, `credential`, …)
- recognised token shapes (AWS, GitHub, Anthropic / OpenAI / OpenRouter, Slack,
  JWT, PEM private keys)
- common PII shapes (email local-parts, IPv4, credit-card-shaped digit runs)

Records are content-addressed after scrubbing, so two observations that differ
only by a secret collapse into a single row, and the stored hash cannot be
used to recover the original value.
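
The scrub-then-hash ordering can be sketched as follows. This is a minimal illustration, not Cairn's scrubber: the regex covers only a few of the key names listed above, and `scrub`, `record_id`, and the `[REDACTED]` placeholder are hypothetical names. The id layout follows the `sha256(source:domain:content)` comment in the `TruthRecord` schema.

```python
import hashlib
import re

# Hypothetical, deliberately small version of the sensitive-name heuristic:
# matches `key=value` or `key: value` where the key contains a sensitive word.
SENSITIVE_PAIR = re.compile(
    r"\b([\w-]*(?:password|token|api_key|credential)[\w-]*)(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def scrub(text: str) -> str:
    """Replace the value of any sensitive-looking key with a placeholder."""
    return SENSITIVE_PAIR.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


def record_id(source: str, domain: str, content: str) -> str:
    """Hash the scrubbed content, so the secret never influences the id."""
    scrubbed = scrub(content)
    return hashlib.sha256(f"{source}:{domain}:{scrubbed}".encode("utf-8")).hexdigest()


a = record_id("pytest", "myproject", "login failed: password=hunter2")
b = record_id("pytest", "myproject", "login failed: password=swordfish")
print(a == b)  # both observations map to the same id
```

Because hashing happens after redaction, the digest is computed over the placeholder rather than the secret.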

---

## How it differs

| Tool | What it does | What Cairn adds |
|---|---|---|
| Sentry / Datadog | Production errors | Feeds dev-time agent context |
| pytest-cov | Untested paths | Persists and distills over time |
| CLAUDE.md / AGENTS.md | Manual agent context | Writes itself from runtime |
| Cursor / Cline | Generates from context | Remembers what failed last session |

---

## Architecture

```
cairn/
├── capture/     # sys.monitoring, pytest plugin, AST scanner, git linker
├── store/       # TruthRecord schema, SQLite writer, semantic search
├── distill/     # clustering, naming, decay, health (margin-powered)
├── inject/      # context builder, MCP server (stdio + HTTP/SSE)
└── cli.py
```

---

## TruthRecord

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TruthRecord:
    id: str                      # sha256(source:domain:content)
    source: str                  # "pytest" | "monitor" | "ast" | "git" | "distill"
    domain: str                  # project namespace / module path
    content: str                 # the fact
    confidence: float            # 0.0–1.0, decays with age unless re-confirmed
    occurred_at: datetime
    source_file: str | None
    source_line: int | None
    exception_type: str | None
    pattern_cluster: str | None  # assigned by distillation
    embedding: list[float]       # semantic search
```

Re-running the same failure updates the existing record rather than inserting a duplicate.
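
One way to get this update-instead-of-insert behaviour is a SQLite upsert keyed on the content hash. The sketch below assumes a simplified `records` table and a hypothetical `hits` counter; neither is taken from Cairn's actual schema.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE records (
        id TEXT PRIMARY KEY,      -- sha256(source:domain:content)
        content TEXT NOT NULL,
        confidence REAL NOT NULL,
        occurred_at TEXT NOT NULL,
        hits INTEGER NOT NULL     -- hypothetical re-confirmation counter
    )
    """
)


def upsert(conn, source, domain, content, confidence, occurred_at):
    """Insert a record, or refresh the existing row if the id already exists."""
    rid = hashlib.sha256(f"{source}:{domain}:{content}".encode("utf-8")).hexdigest()
    conn.execute(
        """
        INSERT INTO records (id, content, confidence, occurred_at, hits)
        VALUES (?, ?, ?, ?, 1)
        ON CONFLICT(id) DO UPDATE SET
            confidence  = excluded.confidence,
            occurred_at = excluded.occurred_at,
            hits        = hits + 1
        """,
        (rid, content, confidence, occurred_at),
    )
    return rid


# The same failure observed on two different days ends up as one row.
upsert(conn, "pytest", "myproject", "ValueError in load_config", 1.0, "2025-01-01T10:00:00")
upsert(conn, "pytest", "myproject", "ValueError in load_config", 1.0, "2025-01-02T10:00:00")
rows = conn.execute("SELECT occurred_at, hits FROM records").fetchall()
print(rows)
```

`ON CONFLICT ... DO UPDATE` requires SQLite 3.24 or newer.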

---

## CLI

| Command | Description |
|---|---|
| `cairn status` | Summary: counts, active vs stale, recent failures, clusters |
| `cairn stats` | Detailed breakdown by source, domain, confidence, age |
| `cairn scan [PATH]` | AST scan, optionally with coverage + git linkage |
| `cairn distill` | Cluster + name patterns; `--use-llm` for richer labels |
| `cairn context` | Print Markdown context (respects `--max-tokens`, default 4000) |
| `cairn query Q` | Keyword / semantic search with source + age filters |
| `cairn watch TARGET` | Instrument and run a script or module (3.12+) |
| `cairn serve` | MCP server — stdio or HTTP+SSE |
| `cairn templates` | Generate property-based test stubs |
| `cairn ingest [FILE]` | Cross-language JSONL ingest |
| `cairn export` | JSON or CSV |
| `cairn sync SOURCE_DB` | Merge another store into this one |
| `cairn annotate [PATH]` | Insert / remove inline `# cairn:` comments |
| `cairn prune` | Delete stale or low-confidence records |
| `cairn embed --repair` | Backfill missing embeddings |
| `cairn install` | Scaffold agent instructions (CLAUDE.md, AGENTS.md, Copilot, Cursor, Continue) |
| `cairn install-hook` | Git post-commit hook for auto-distill |
| `cairn config show` | Print resolved config |

Run `cairn <command> --help` for full options.

---

## Configuration

Cairn reads `[tool.cairn]` from the nearest `pyproject.toml`, merging with defaults:

```toml
[tool.cairn]
db     = ".cairn/store.db"
domain = "myproject"

[tool.cairn.decay]    # half-life in days
pytest  = 21.0
monitor = 21.0
ast     = 14.0
git     = 30.0
distill = 60.0
```
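
To make the half-life values concrete, here is a sketch of exponential decay under the settings above: a record's confidence halves every `half_life` days since it was last confirmed. The `decayed_confidence` function is hypothetical, and the exact formula Cairn applies is an assumption.

```python
# Half-lives in days, mirroring the [tool.cairn.decay] example above.
HALF_LIFE_DAYS = {
    "pytest": 21.0,
    "monitor": 21.0,
    "ast": 14.0,
    "git": 30.0,
    "distill": 60.0,
}


def decayed_confidence(confidence: float, source: str, age_days: float) -> float:
    """Standard half-life decay: confidence * 0.5 ** (age / half_life)."""
    half_life = HALF_LIFE_DAYS[source]
    return confidence * 0.5 ** (max(age_days, 0.0) / half_life)


# A pytest record halves after 21 days; an AST record after 14.
print(decayed_confidence(1.0, "pytest", 21.0))
print(decayed_confidence(1.0, "ast", 28.0))
```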

---

## Install extras

```bash
pip install cairn-ai                     # core
pip install "cairn-ai[pytest]"           # pytest auto-registration (recommended)
pip install "cairn-ai[coverage]"         # untested-branch capture
pip install "cairn-ai[health]"           # margin drift / anomaly / causality
```

---

## Health analysis (margin integration)

With `cairn-ai[health]` installed, distillation gains trajectory-aware analysis from [margin](https://github.com/Cope-Labs/margin):

- **Drift** — each cluster as a weekly time series: STABLE / DRIFTING / ACCELERATING / DECELERATING / REVERTING / OSCILLATING with polarity-normalized direction.
- **Anomaly** — latest week classified against history: EXPECTED / UNUSUAL / ANOMALOUS / NOVEL.
- **Causality** — lag-aware temporal correlations across clusters.
- **Escalation** — HALT / ALERT / LOG based on combined health state.

Context output changes from:

```markdown
## ValueError in config.py (12×) [pytest]
```

to:

```markdown
## HALT — ValueError in config.py (12×) [pytest]
_Health: ABLATED | Drift: ACCELERATING WORSENING | Anomaly: NOVEL_
_Causality: --CORRELATES--> KeyError::validate_request (strength 1.00, lag 2 weeks)_
```

Without margin installed, all health fields are `None` and exponential decay runs unchanged.

---

## MCP server

```bash
cairn serve                                             # stdio (Claude Code, Cline)
cairn serve --transport http --port 8765                # HTTP + SSE
cairn serve --transport http --token "$CAIRN_TOKEN"     # with bearer auth
```

Tools exposed: `cairn_status`, `cairn_search`, `cairn_context`, `cairn_recent_failures`, `cairn_health`, `cairn_synthesis`.

Add to your agent's MCP config:

```json
{
  "servers": {
    "cairn": {
      "command": "cairn",
      "args": ["serve"],
      "cwd": "/path/to/your/project"
    }
  }
}
```

The HTTP transport also exposes REST endpoints:

| Method | Path | Auth |
|---|---|---|
| `POST` | `/ingest` | bearer if `--token` set |
| `POST` | `/query` | bearer if `--token` set |
| `GET` / `POST` | `/sse` · `/message` | MCP handshake |
| `GET` | `/health` | none |
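
A client for these endpoints might be sketched with the standard library as below. The paths and the bearer scheme come from the table above. The request body (`{"q": ...}`), the `build_request` helper, and the base URL are assumptions, since the payload schema is not documented here.

```python
import json
import urllib.request


def build_request(base_url, path, payload=None, token=None):
    """Build a GET (no payload) or JSON POST request, with optional bearer auth."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        headers["Authorization"] = f"Bearer {token}"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


base = "http://127.0.0.1:8765"  # matches `cairn serve --transport http --port 8765`
health = build_request(base, "/health")
# Hypothetical query body: the real field names may differ.
query = build_request(base, "/query", payload={"q": "ValueError config"}, token="s3cret")

# To actually send a request against a running server:
# with urllib.request.urlopen(health) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```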

---

## Cross-language signal

Any language can feed Cairn via JSONL — only `content` is required:

```bash
nasti-js apply src/ | cairn ingest --source nasti-js --domain myapp/src
cairn ingest records.jsonl --source custom --domain myservice
cairn embed --repair                         # backfill embeddings after direct inserts
```

Records from any source are semantically searchable together via `cairn query` or the MCP server.
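
A producer only needs to emit one JSON object per line. In the sketch below, `content` is the only field taken from the requirement above. The extra `source_file` and `source_line` keys mirror `TruthRecord` field names, but whether `cairn ingest` reads them from JSONL is an assumption, and `to_jsonl` is a hypothetical helper.

```python
import json


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, rejecting any record without content."""
    lines = []
    for record in records:
        if not record.get("content"):
            raise ValueError("every record needs a non-empty 'content' field")
        lines.append(json.dumps(record, ensure_ascii=False) + "\n")
    return "".join(lines)


records = [
    {"content": "TypeError: cannot read property 'id' of undefined"},
    # Optional keys below are assumed, not confirmed by this README.
    {"content": "retry loop never terminates", "source_file": "src/client.ts", "source_line": 88},
]
output = to_jsonl(records)
print(output, end="")
```

Saved as a script (the name `make_records.py` is hypothetical), its output can be piped the same way as the commands above: `python make_records.py | cairn ingest --source custom --domain myservice`.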

---

## Python support

Python 3.10+. `sys.monitoring` capture requires 3.12+; earlier versions fall back to the `@cairn.watch` decorator. All other features work on every supported version.
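
The version split can be illustrated with a small sketch that records return values through `sys.monitoring` on 3.12+ and through a plain decorator otherwise. The `watch` decorator here is a hypothetical stand-in, not the real `@cairn.watch`, and the `calls` list and `run_monitored` helper are likewise illustrative.

```python
import functools
import sys

calls: list[tuple[str, object]] = []  # (function name, return value)


def watch(func):
    """Decorator fallback: record the return value of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        calls.append((func.__name__, result))
        return result
    return wrapper


def run_monitored(func, *args):
    """Call func once, capturing its return value without modifying it on 3.12+."""
    if sys.version_info < (3, 12):
        return watch(func)(*args)

    mon = sys.monitoring
    tool = mon.PROFILER_ID
    mon.use_tool_id(tool, "capture-sketch")  # raises ValueError if the id is taken

    def on_return(code, instruction_offset, retval):
        calls.append((code.co_name, retval))

    mon.register_callback(tool, mon.events.PY_RETURN, on_return)
    # Local events: only this function's code object is instrumented.
    mon.set_local_events(tool, func.__code__, mon.events.PY_RETURN)
    try:
        return func(*args)
    finally:
        mon.set_local_events(tool, func.__code__, mon.events.NO_EVENTS)
        mon.register_callback(tool, mon.events.PY_RETURN, None)
        mon.free_tool_id(tool)


def add(a, b):
    return a + b


result = run_monitored(add, 1, 2)
print(result, calls)
```

Either path leaves the same entry in `calls`, which is why the fallback can stand in for the monitor on older interpreters.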

---

## Development

```bash
git clone https://github.com/Cope-Labs/cairn
cd cairn
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
```

The backtest suite (`tests/test_backtest.py`) is a machine-readable spec of Cairn's core contracts. If you change an algorithm, it tells you what changed.

---

## Name

A cairn is a stack of stones left on a trail by previous travelers. Found by the next one.

---

## License

AGPL-3.0-or-later. Commercial licensing available for closed-source use — open an issue.
