Metadata-Version: 2.4
Name: stronk-mask
Version: 0.1.0
Summary: Security-first self-hosted privacy masking and routing proxy for LLM traffic.
Author: EYYCHEEV
License: Apache-2.0
Requires-Python: >=3.11
Requires-Dist: fastapi<1.0,>=0.115
Requires-Dist: httpx<1.0,>=0.27
Requires-Dist: presidio-analyzer<3.0,>=2.2
Requires-Dist: pydantic-settings<3.0,>=2.6
Requires-Dist: pydantic<3.0,>=2.9
Requires-Dist: spacy<4.0,>=3.8
Requires-Dist: uvicorn<1.0,>=0.32
Requires-Dist: websockets<16,>=13
Provides-Extra: dev
Requires-Dist: mypy<2.0,>=1.13; extra == 'dev'
Requires-Dist: pytest-cov<7.0,>=6.0; extra == 'dev'
Requires-Dist: pytest<9.0,>=8.3; extra == 'dev'
Requires-Dist: ruff<1.0,>=0.8; extra == 'dev'
Description-Content-Type: text/markdown

# stronk-mask

Security-first self-hosted masking and rehydration proxy for large language model traffic.

`stronk-mask` sits between callers and upstream providers. It detects sensitive input, applies per-type policy, forwards only masked payloads, and rehydrates model output before returning it to the caller.

This repo now also includes a separate operator control plane:

- `proxy-api`: the caller-facing masking proxy
- `admin-api`: sanitized monitoring API
- `admin-ui`: modern light/dark monitoring workspace
- `admin-gateway`: authenticated reverse-proxy entrypoint for the control plane

## Implemented Provider Surfaces

- `POST /openai/v1/chat/completions`
- `POST /openai/v1/responses`
- `GET /openai/v1/responses` websocket upgrade for OpenAI Responses-style realtime clients
- `POST /anthropic/v1/messages`
- `GET /health`

Supported behavior:

- non-streaming request masking and response rehydration
- streaming SSE rehydration for the supported provider families
- downstream OpenAI Responses websocket compatibility for `response.create` and `response.append`
- real `previous_response_id` continuations are preserved for upstream `/responses` requests
- local `generate:false` prewarm stays local and uses memory-only turn-state recovery instead of leaking synthetic response IDs upstream
- request-scoped opaque placeholders
- config-driven policy per detector type: `allow`, `mask`, `block`, `route_local`
- deny-by-default behavior when upstream egress or local routing is unsafe
- sanitized audit-event persistence for the optional control plane

## Safety Guarantees

- Raw sensitive values are never forwarded upstream when a detection is masked or blocked.
- Raw request bodies, response bodies, rehydrated text, headers, provider credentials, and placeholder maps are not persisted by default.
- Upstream egress is fixed to configured provider base URLs; the caller cannot choose arbitrary upstream targets.
- Secret classes default to `block` rather than `mask`.
- If policy requires `route_local` and no local handler exists, the request is rejected instead of falling back upstream.
- The admin plane is separate from proxy routes and is disabled by default.
- The admin plane requires trusted identity headers, allowed roles, and a shared gateway secret when enabled.

## Control Plane

The operator surface is intentionally read-only in this phase.

What it shows:

- request counts, mask/block/rehydration totals, and error counts
- detector and policy action mix
- sanitized per-request events with endpoint, model, latency, counts, and touched JSON paths
- safe config posture and recent control-plane access logs

What it does not show:

- raw request bodies
- raw response bodies
- rehydrated plaintext
- upstream `Authorization` or `X-API-Key` headers
- placeholder-to-original mappings

## Detection Coverage

Deterministic detectors are implemented first and augmented with a local Presidio + spaCy layer by default. The detector interface stays pluggable so richer local NER can still be added later without redesigning the pipeline.

Current detector set:

- English and Chinese person names
- Company and organization names
- English and Chinese addresses
- Emails
- US and China phone numbers
- API keys, including `sk-...`, `sk-proj-...`, `sk-ant-...`, and common provider key prefixes
- Bearer tokens
- JWT-like tokens

## Architecture

- `src/stronk_mask/redaction/` - detection, masking, placeholder vault, and structured payload traversal
- `src/stronk_mask/policy/` - per-detector policy resolution
- `src/stronk_mask/providers/` - fixed provider endpoint specs
- `src/stronk_mask/proxy/` - upstream transport, header controls, SSE rehydration, and audit writes
- `src/stronk_mask/admin/` - trusted-header auth, SQLite-backed sanitized event store, and UI lookup helpers
- `src/stronk_mask/app.py` - proxy app factory
- `src/stronk_mask/admin_app.py` - separate admin app factory
- `web/` - React/Vite operator UI with light and dark mode
- `compose/` - local proxy + admin + Caddy stack
- `docs/` - architecture, threat model, and execution plans

## Quick Start

### Local Python + frontend

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
python -m spacy download en_core_web_sm
python -m spacy download zh_core_web_sm
make ui-install
make ui-build
make test
```

Run the proxy:

```bash
make run
```

Run the admin API/UI separately:

```bash
STRONK_MASK_AUDIT_STORAGE_ENABLED=true \
STRONK_MASK_AUDIT_DB_PATH=./data/stronk-mask-audit.sqlite3 \
STRONK_MASK_ENABLE_ADMIN_API=true \
STRONK_MASK_ENABLE_ADMIN_UI=true \
STRONK_MASK_ADMIN_PROXY_SECRET=local-dev-shared-secret \
STRONK_MASK_ADMIN_UI_DIR=./web/dist \
make admin-run
```

The admin app expects trusted identity headers plus the shared gateway secret. In normal operation, place it behind Caddy plus basic auth locally, or behind `oauth2-proxy` plus OIDC in production. Do not publish the admin backend directly.

### Local Docker Compose

1. Generate a Caddy bcrypt hash:

```bash
docker run --rm caddy:2.8-alpine caddy hash-password --plaintext 'change-me'
```

2. Export it:

```bash
export STRONK_MASK_ADMIN_PASSWORD_HASH='<bcrypt hash>'
```

3. Start the stack:

```bash
docker compose -f compose/docker-compose.yml up --build
```

Default local endpoints:

- proxy: `http://127.0.0.1:8787`
- admin UI: `http://127.0.0.1:8788`

## Release Publishing

The preferred release path now uses GitHub Actions for both GHCR image publishing and PyPI Trusted Publishing. The recorded release flow, plus the Bitwarden-backed local PyPI fallback, lives in [docs/release-publishing.md](/Users/eyy/Documents/Work/Dev/repos/stronk-mask/docs/release-publishing.md).

The short version:

```bash
cd /Users/eyy/Documents/Work/Dev/repos/stronk-mask
uv build
BWS_ACCESS_TOKEN="$BWS_STRONK_TERMINAL_ACCESS_TOKEN" \
bws run -- uv publish
```

Keep GHCR publishing in GitHub Actions; do not add local container-registry tokens to this flow.

## Configuration

Core proxy environment variables:

- `STRONK_MASK_OPENAI_UPSTREAM_BASE_URL`
- `STRONK_MASK_ANTHROPIC_UPSTREAM_BASE_URL`
- `STRONK_MASK_ALLOW_INSECURE_UPSTREAMS=false`
- `STRONK_MASK_ALLOW_NONSTANDARD_UPSTREAM_HOSTS=false`
- `STRONK_MASK_ENABLE_DEBUG_MASK_ENDPOINT=false`
- `STRONK_MASK_PRESIDIO_ENABLED=true`
- `STRONK_MASK_PRESIDIO_ENGLISH_MODEL=en_core_web_sm`
- `STRONK_MASK_PRESIDIO_CHINESE_MODEL=zh_core_web_sm`
- `STRONK_MASK_*_ACTION`

Admin plane environment variables:

- `STRONK_MASK_AUDIT_STORAGE_ENABLED=false`
- `STRONK_MASK_AUDIT_DB_PATH`
- `STRONK_MASK_AUDIT_MAX_EVENTS=2000`
- `STRONK_MASK_ENABLE_ADMIN_API=false`
- `STRONK_MASK_ENABLE_ADMIN_UI=false`
- `STRONK_MASK_ADMIN_USER_HEADER=X-Stronk-Admin-User`
- `STRONK_MASK_ADMIN_ROLES_HEADER=X-Stronk-Admin-Roles`
- `STRONK_MASK_ADMIN_PROXY_SECRET_HEADER=X-Stronk-Admin-Proxy-Secret`
- `STRONK_MASK_ADMIN_PROXY_SECRET`
- `STRONK_MASK_ADMIN_ALLOWED_ROLES=admin,operator,auditor`
- `STRONK_MASK_ADMIN_ACCESS_LOG_MAX_ENTRIES=500`
- `STRONK_MASK_ADMIN_UI_DIR=./web/dist`

Default policy:

- `email`, `phone`, `person_name`, `organization`, `address` -> `mask`
- `api_key`, `bearer_token`, `jwt` -> `block`

## Tests

The suite includes:

- unit coverage for detector behavior, overlap resolution, placeholder generation, policy parsing, audit summaries, event-store behavior, and SSE rehydration
- integration coverage for all supported provider endpoints
- websocket regression coverage for OpenAI Responses `response.create`, `response.append`, prewarm, invalid events, and incomplete upstream streams
- websocket regression coverage for real `previous_response_id` continuations, fresh-chain resets, reconnect recovery via `x-codex-turn-state`, and binary-frame rejection
- regression checks proving raw values do not leak into forwarded upstream payloads
- admin-plane coverage for `401/403` auth enforcement and sanitized proxy-to-admin event flow
- streaming tests for OpenAI chat, OpenAI responses, and Anthropic messages
- bypass and collision canaries including zero-width-key variants and literal placeholder collisions

## Stronger Than PasteGuard

This repo now claims stronger behavior only where it is implemented and tested:

- Explicit OpenAI `responses` endpoint coverage, not just chat completions.
- OpenAI `responses` websocket compatibility on the same public path used by Codex-style clients.
- Fixed upstream egress targets with request-header allowlists and redirect refusal.
- Official upstream host pinning is on by default; non-standard compatible hosts require explicit opt-in.
- Default secret handling is `block`, not best-effort masking.
- Raw detection values are not reflected back through the debug path.
- Streaming rehydration is covered for all three supported provider surfaces.
- Rehydration is limited to human-readable assistant text paths; tool arguments stay masked by default.
- Regression tests cover placeholder collisions, bypass attempts, and `/responses` as a first-class path.
- The monitoring plane is separate from proxy routes, disabled by default, and stores sanitized telemetry only.

## Threat Model Summary

- Caller credentials for upstream providers are part of the data plane and must not be reused for admin authentication.
- The admin plane trusts reverse-proxy identity headers and a shared gateway secret, so production deployments should terminate auth at the edge with OIDC or another identity-aware proxy and keep the backend private.
- Local Compose uses Caddy basic auth for convenience only; it is appropriate for loopback and small-team local testing, not for broad internet exposure.
- The SQLite event store persists sanitized events only. It is not a safe place for raw prompts, raw completions, or placeholder vault state.

## Known Limitations

- Name, organization, and address detection is heuristic. It is materially stronger than regex-only email and key detection, but it is not equivalent to a full local NER model.
- Presidio + spaCy are enabled by default in this repo. Fresh environments must install `en_core_web_sm` and `zh_core_web_sm` or explicitly disable Presidio.
- `route_local` is a clean policy boundary today, but the local-model execution path is still a scaffold and fails closed.
- WebSocket support is intentionally scoped to OpenAI Responses-style JSON text events. `stronk-mask` is not a generic websocket tunnel and does not currently support binary or audio frames.
- The websocket bridge keeps the upstream side on HTTP plus Server-Sent Events (SSE). It does not yet proxy upstream websocket transports.
- The admin plane is read-only in this phase. There is no browser-based policy editor, request replay, or break-glass raw reveal workflow.
- The admin app expects trusted identity headers. If it is exposed without a proper reverse proxy or identity-aware gateway, requests should fail rather than silently trust the caller.
- The admin backend should stay on a private network. The shared gateway secret is a second trust signal, not a substitute for edge authentication.
- The frontend build currently uses npm-managed assets and should be built as part of image creation or CI before enabling the admin UI.
