Metadata-Version: 2.4
Name: margin
Version: 0.10.0
Summary: Typed health classification, uncertainty algebra, and correction auditing for any system that measures things and needs to explain what happened.
Author: Cope Labs LLC
License: MIT
Project-URL: Homepage, https://copelabs.dev
Project-URL: Repository, https://github.com/Cope-Labs/margin
Project-URL: Issues, https://github.com/Cope-Labs/margin/issues
Project-URL: Changelog, https://github.com/Cope-Labs/margin/blob/main/CHANGELOG.md
Project-URL: Documentation, https://github.com/Cope-Labs/margin#readme
Keywords: health,monitoring,uncertainty,observability,typed,classification,threshold,polarity,correction,audit,ledger,policy
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Monitoring
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Dynamic: license-file

# margin

[![PyPI version](https://img.shields.io/pypi/v/margin.svg)](https://pypi.org/project/margin/)
[![Downloads](https://static.pepy.tech/badge/margin)](https://pepy.tech/project/margin)
[![Downloads/month](https://static.pepy.tech/badge/margin/month)](https://pepy.tech/project/margin)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

**Typed primitives for system self-awareness.**

When a system has to answer *"how am I doing, and what should I do about it?"* — an LLM agent between steps, an ML pipeline mid-inference, a data job deciding whether to ship — margin gives it the vocabulary. Typed `Health`, `Confidence`, `Absence`, `Issue`, `Flow`, `Intent`, `Contract` — composable, serializable, zero-dependency Python.

```python
from adapters.agent import agent_parser, AgentStep, agent_flow

parser = agent_parser()
flow   = agent_flow()                 # plan → act → reflect

# Record what you already measure after one LLM + tool step:
step = AgentStep(latency_ms=2400, tokens_used=1800,
                 tool_success_rate=0.9, retry_count=1, confidence=0.72)

expr     = parser.parse(step.to_values())
guidance = flow.observe(step.to_values())

print(expr.to_string())
# [latency_ms:INTACT(-0.60σ)] [tokens_used:INTACT(-1.25σ)]
# [tool_success_rate:INTACT(-0.05σ)] [retry_count:INTACT(+0.00σ)]
# [confidence:INTACT(-0.10σ)]

print(guidance.to_atom())
# flow:agent:plan[1/3] → ADVANCE
```

Every metric is polarity-aware (more tokens = worse, higher confidence = better), sigma-normalised so you can compare them, and emits typed diagnostics when something goes wrong. Nothing is stringly typed; your decision loop reads typed enums.

Not building an agent? Margin works on any scalar with a threshold:

```python
from margin import Parser, Thresholds

parser = Parser(
    baselines={"throughput": 500.0, "error_rate": 0.002},
    thresholds=Thresholds(intact=400.0, ablated=150.0),
    component_thresholds={
        "error_rate": Thresholds(intact=0.005, ablated=0.05, higher_is_better=False),
    },
)
expr = parser.parse({"throughput": 480.0, "error_rate": 0.03})
print(expr.to_string())
# [throughput:INTACT(-0.04σ)] [error_rate:DEGRADED(-14.00σ)]
```

Throughput and error rate on the same sigma scale: one higher-is-better, the other lower-is-better, both classified correctly.

## Install

```bash
pip install margin
```

Zero dependencies. Pure Python. 3.10+.

## What it does

A number comes in. Margin gives it:

- **Health** — INTACT / DEGRADED / ABLATED / RECOVERING / OOD
- **Polarity** — higher-is-better or lower-is-better, handled correctly everywhere
- **Sigma** — dimensionless deviation from baseline; positive always means healthier
- **Confidence** — how much the uncertainty interval overlaps the threshold
- **Provenance** — where this value came from, for correlation detection
- **Validity** — how the measurement ages (static, decaying, event-invalidated)
- **Drift** — trajectory classification: STABLE / DRIFTING / ACCELERATING / DECELERATING / REVERTING / OSCILLATING
- **Anomaly** — statistical outlier detection: EXPECTED / UNUSUAL / ANOMALOUS / NOVEL
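
The sigma convention above can be sketched in a few lines of plain Python. This is an illustration of the sign convention only, not margin's implementation; the `sigma` helper and its argument names are hypothetical:

```python
def sigma(value, baseline, scale, higher_is_better=True):
    """Dimensionless deviation from baseline; positive = healthier."""
    s = (value - baseline) / scale
    return s if higher_is_better else -s

# Throughput above baseline is good; latency above baseline is bad:
print(round(sigma(520.0, 500.0, 50.0), 2))                        # 0.4
print(round(sigma(80.0, 50.0, 10.0, higher_is_better=False), 2))  # -3.0
```

The polarity flip happens inside the normalisation, so downstream comparisons never need to know which direction "better" is.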

Then the correction loop:

- **Policy** — typed rules that decide what to do (RESTORE / SUPPRESS / AMPLIFY)
- **Constraints** — alpha clamping, cooldown, rate limiting
- **Escalation** — LOG / ALERT / HALT when the policy can't act
- **Contract** — typed success criteria ("reach INTACT within 5 steps")
- **Causal** — dependency graphs ("api is DEGRADED because db is ABLATED")
- **Auto-correlation** — discover which components move together from data, with lag detection
- **Streaming** — incremental trackers: `Monitor.update(values)` updates health + drift + anomaly + correlation in one call
- **Flow** — ordered stages with typed live guidance: ADVANCE / HOLD / RETRY / ROLLBACK / ESCALATE for multi-stage processes
- **Issue** — typed diagnostics: library surfaces its own misconfigurations, degraded assumptions, recovered failures
- **Config** — define everything in YAML/JSON: `margin.load_config("margin.yaml")`
- **Persistence** — save/restore Monitor state across restarts, batch replay from CSV
- **Intent** — goal feasibility: `intent.evaluate_monitor(monitor)` → FEASIBLE / AT_RISK / INFEASIBLE with ETA
- **CLI** — `python -m margin status`, `monitor`, `replay` — no Python code required
- **Ledger** — full audit trail of every correction, serializable, replayable
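
The cooldown constraint, for instance, is just rate-limiting a correction. A plain-Python sketch of the idea, not margin's `Constraint` API (the `apply_with_cooldown` helper is a hypothetical name):

```python
import time

def apply_with_cooldown(correct, state, cooldown_s=30.0):
    """Rate-limit a correction: skip if one fired within the cooldown window."""
    now = time.monotonic()
    if now - state.get("last", float("-inf")) < cooldown_s:
        return "SKIPPED"
    state["last"] = now
    return correct()

state = {}
print(apply_with_cooldown(lambda: "RESTORE", state))  # RESTORE
print(apply_with_cooldown(lambda: "RESTORE", state))  # SKIPPED
```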

All stages in one call:

```python
from margin import full_step

result = full_step(monitor, values, policy, graph=graph, contract=contract, intent=intent)
# result.expression    — current health
# result.drift         — per-component trajectories
# result.anomaly       — per-component outlier states
# result.step.explanations — why it happened (causal)
# result.step.correction   — what to do (policy)
# result.step.contract     — are we meeting our goals?
# result.intent        — can we still make it?
```

## The polarity bug

Every health system you've written has this bug. You check `if value >= threshold` and it works for throughput. Then you add error-rate monitoring and the same check says a 15% error rate is "healthy" because 0.15 >= 0.02.

Margin handles both polarities:

```python
# Higher is better (throughput, signal strength)
Thresholds(intact=80.0, ablated=30.0)

# Lower is better (error rate, latency)
Thresholds(intact=0.02, ablated=0.10, higher_is_better=False)
```

One flag. It threads through every comparison, every sigma calculation, every correction decision, every recovery ratio. You never think about it again.
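
The bug and the fix, side by side in plain Python (hypothetical helpers for illustration, not margin's internals):

```python
def naive_healthy(value, threshold):
    return value >= threshold  # right for throughput, silently wrong for error rate

def polarity_healthy(value, threshold, higher_is_better=True):
    """Flip the comparison when lower values are healthier."""
    return value >= threshold if higher_is_better else value <= threshold

print(naive_healthy(0.15, 0.02))                             # True: the bug
print(polarity_healthy(0.15, 0.02, higher_is_better=False))  # False: caught
```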

## Multi-stage process guidance

For processes where each stage has distinct entry/quality/exit criteria — deployment pipelines, fermentation phases, clinical trials, incident response:

```python
from margin import Flow, Stage, Parser, Thresholds

flow = Flow(stages=[
    Stage(
        name="lag",
        parser=Parser(baselines={"biomass": 0.1}, thresholds=Thresholds(intact=0.08, ablated=0.02)),
        advance_when=lambda expr: expr.observations[0].value >= 0.15,
    ),
    Stage(name="exponential", parser=..., advance_when=...),
    Stage(name="stationary", parser=...),
], label="ferment")

result = flow.observe({"biomass": 0.12})
print(result.to_atom())
# flow:ferment:lag[1/3] → HOLD

# When the advance_when predicate fires:
result = flow.observe({"biomass": 0.18})
# result.guidance == Guidance.ADVANCE
flow.advance()
```

Predicate exceptions are caught as ERROR issues, guidance degrades to HOLD, and the flow stays alive: fail-safe semantics for safety-critical contexts.
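
The guard pattern can be sketched in plain Python. `guarded_guidance`, the issue-tuple shape, and the `PREDICATE_FAILED` code are all hypothetical illustration names, not margin's internals:

```python
def guarded_guidance(advance_when, expr, issues):
    """Run a stage predicate; any exception logs an ERROR and degrades to HOLD."""
    try:
        return "ADVANCE" if advance_when(expr) else "HOLD"
    except Exception as exc:
        issues.append(("ERROR", "PREDICATE_FAILED", str(exc)))
        return "HOLD"  # fail safe: never advance on a broken predicate

issues = []
bad_predicate = lambda expr: expr["missing_key"] > 0.15  # raises KeyError
print(guarded_guidance(bad_predicate, {"biomass": 0.12}, issues))  # HOLD
print(issues[0][0])  # ERROR
```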

## Typed diagnostics

The library surfaces its own operational state the same way it classifies observed state:

```python
monitor.update({"mystery": 1.0})       # emits WARNING: BASELINE_MISSING
monitor.drift("unknown")                # emits WARNING: UNKNOWN_COMPONENT

# Query it programmatically
monitor.issues.by_code("BASELINE_MISSING")       # [Issue(...)]
monitor.issues.at_least(IssueLevel.WARNING)       # all severe enough
monitor.diagnose()                                # JSON-safe dashboard summary

# Bridge to your logger
import logging
monitor.issues.attach_logger(logging.getLogger("margin"))
```

**Harvest-safe on hard fail:** if `Monitor.update()` or `Flow.observe()` raises, a `FATAL(INTERNAL_ERROR)` issue is recorded *before* the exception propagates. `monitor.issues` survives crashes intact.
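
The pattern is record-then-reraise. A plain-Python sketch under assumed names (the helper and the issue-tuple shape are hypothetical, not margin's internals):

```python
def harvest_safe_update(update, values, issues):
    """Record a FATAL issue before letting any internal crash propagate."""
    try:
        return update(values)
    except Exception as exc:
        issues.append(("FATAL", "INTERNAL_ERROR", repr(exc)))
        raise  # caller still sees the exception; the issue buffer survives

issues = []
def broken_update(values):
    raise RuntimeError("boom")

try:
    harvest_safe_update(broken_update, {"x": 1.0}, issues)
except RuntimeError:
    pass
print(issues[0][:2])  # ('FATAL', 'INTERNAL_ERROR')
```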

## Auto-calibrate from data

Don't guess thresholds. Derive them from healthy measurements:

```python
from margin import parser_from_calibration

parser = parser_from_calibration(
    {"rps": [490, 510, 505, 495], "latency": [48, 52, 50, 51]},
    polarities={"latency": False},
)
```
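
Under the hood this is baseline statistics. One plausible rule, sketched in plain Python with cut-offs k standard deviations from the healthy mean; margin's actual derivation in `parser_from_calibration` may differ, and `thresholds_from_samples` is a hypothetical name:

```python
from statistics import mean, stdev

def thresholds_from_samples(samples, k_intact=2.0, k_ablated=4.0,
                            higher_is_better=True):
    """Derive intact/ablated cut-offs k standard deviations from the mean."""
    mu, sd = mean(samples), stdev(samples)
    sign = -1.0 if higher_is_better else 1.0  # healthy band is above or below
    return mu + sign * k_intact * sd, mu + sign * k_ablated * sd

intact, ablated = thresholds_from_samples([490, 510, 505, 495])
print(ablated < intact < 500)  # True: both cut-offs sit below the healthy band
```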

## Architecture

| Layer | Question | Key types |
|---|---|---|
| **Foundation** | What was measured? | `Health`, `Observation`, `Expression`, `UncertainValue` |
| **Observability** | What changed? When will it cross? | `diff()`, `forecast()`, `track()`, `calibrate()` |
| **Streaming** | Is it drifting or anomalous? | `Monitor`, `DriftTracker`, `AnomalyTracker`, `CorrelationTracker` |
| **Policy** | What should we do? | `PolicyRule`, `Action`, `Constraint`, `Escalation` |
| **Contract** | Are we meeting our goals? | `HealthTarget`, `SustainHealth`, `RecoveryThreshold` |
| **Causal** | Why did this happen? | `CausalGraph`, `CausalLink`, `Explanation` |
| **Intent** | Can we still get there? | `Intent`, `Feasibility`, `IntentResult` |
| **Flow** | What stage are we in, and what should we do? | `Flow`, `Stage`, `Guidance`, `FlowResult` |
| **Issue** | What did the library notice about itself? | `Issue`, `IssueLevel`, `IssueBuffer`, `IssueCode` |

`full_step()` orchestrates the first seven in one call. `Flow` composes them into a sequence. `Issue` is ambient — every layer emits diagnostics into the same buffer.

## Adapters

Core adapters ship in the package. Each is a Parser factory + optional Flow / Intent / Monitor helpers with domain-appropriate thresholds:

| Adapter | What it monitors |
|---|---|
| [**agent**](https://github.com/Cope-Labs/margin/tree/main/adapters/agent/) | LLM / agent step health — latency, tokens, tool success rate, retries, confidence. Includes a three-stage `plan → act → reflect` Flow and a default feasibility Intent. |
| [**dataframe**](https://github.com/Cope-Labs/margin/tree/main/adapters/dataframe/) | Data quality (completeness, null rate, drift, freshness, schema) |
| [**numpy**](https://github.com/Cope-Labs/margin/tree/main/adapters/numpy/) | Array health (NaN rate, drift, distribution shape) |
| [**fastapi**](https://github.com/Cope-Labs/margin/tree/main/adapters/fastapi/) | Endpoint health (latency, error rate, throughput, queue depth) — ASGI middleware |
| [**pytest**](https://github.com/Cope-Labs/margin/tree/main/adapters/pytest/) | Test suite health (pass rate, flake rate, coverage, duration) — conftest plugin |

Additional domain profiles (healthcare, ros2, transformer circuit interpretability, homeassistant, aquarium, greenhouse, evcharging, fitness, infrastructure, weather, printer3d, neuro, godot, celery, database) live in [`contrib/`](https://github.com/Cope-Labs/margin/tree/main/contrib/) — reference material, not shipped in the pip package.

## Docs

Full specification: [margin-language.md](https://github.com/Cope-Labs/margin/blob/main/margin/margin-language.md) — a typed grammar for system self-awareness.

## License

MIT
