Security Whitepaper - v0.2666

Defense in depth for AI agent workflows.

Sandcastle applies layered security controls at every stage - from API authentication and sandbox isolation to EU data residency enforcement and provider audit trails. Your workflows run in ephemeral sandboxes with zero trust by default.

9 Security Layers
6 AI Providers
4 Sandbox Backends
30+ Credential Patterns
9400+ Passing Tests

Layered security model

Every request passes through multiple independent security layers. A failure in one layer is caught by the next.

Perimeter: CORS Policy, Rate Limiting, Security Headers, CSP, SSRF Prevention, Input Validation
Auth: HMAC-SHA256 Keys, Key Rotation, Key Expiry, IP Allowlisting, Tenant Isolation, Admin Scoping
Policy: PII Redaction, Secret Blocking, Cost Limits, Approval Gates, Data Residency, SLO Routing
Execution: Ephemeral Sandboxes, Seccomp Profiles, CapDrop ALL, Circuit Breaker, Auto-Failover, Timeout Enforcement, Resource Limits
Data: Credential Encryption, Policy Violations, Approval Trail, Provider Audit Trail, Run History, Key Usage Tracking

What's built in

Every feature listed below ships with Sandcastle out of the box. No plugins, no paid tiers, no configuration required for the defaults.

Sandbox Isolation

AI agent code runs in ephemeral sandboxes that are destroyed after each execution. No shared state, no persistent access to host resources. Docker containers run with seccomp profiles and dropped capabilities.

  • E2B - Cloud-hosted microVMs with SOC 2 Type II certification
  • Docker - CapDrop: ALL, seccomp syscall allowlist, PID limits, CPU quotas, non-root user (UID 1000), bridge networking, auto-remove
  • Seccomp profiles - Blocks ptrace, mount, kexec_load, keyctl, reboot, swapon and other dangerous syscalls
  • Cloudflare Workers - Edge V8 isolates with per-request timeout
  • Local - Subprocess mode for development only (no isolation)
  • Runner file validation - blocks path traversal (..) and absolute paths before execution
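As a rough illustration of the Docker options listed above, the sketch below assembles a `docker run` argument list. The function name, the profile path, and the concrete PID and CPU values are assumptions for the example, not Sandcastle's actual defaults.

```python
def docker_run_args(image: str, seccomp_profile: str = "seccomp.json") -> list[str]:
    """Build a hardened `docker run` command line (illustrative values only)."""
    return [
        "docker", "run",
        "--rm",                                        # auto-remove the container on exit
        "--cap-drop", "ALL",                           # drop every Linux capability
        "--security-opt", f"seccomp={seccomp_profile}",  # syscall allowlist profile
        "--pids-limit", "128",                         # hypothetical PID limit
        "--cpus", "1.0",                               # hypothetical CPU quota
        "--user", "1000",                              # non-root user (UID 1000)
        "--network", "bridge",                         # bridge networking
        image,
    ]


if __name__ == "__main__":
    print(" ".join(docker_run_args("python:3.12-slim")))
```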

Authentication

API key authentication with HMAC-SHA256 hashing, key rotation with grace periods, expiry enforcement, and per-key IP allowlisting.

  • HMAC-SHA256 with configurable server-side pepper (API_KEY_PEPPER)
  • Keys accepted via X-API-Key header, Authorization: Bearer, or a query parameter (for SSE connections)
  • Key rotation - POST /api/api-keys/{id}/rotate generates new key, old key enters configurable grace period
  • Key expiry - expires_at enforced in auth middleware, returns 401 KEY_EXPIRED
  • IP allowlisting - per-key CIDR allowlist (IPv4 + IPv6), empty list allows all IPs
  • sc_ prefix with 32-byte URL-safe random token, first 8 chars stored as key_prefix
  • last_used_at timestamp updated on every authenticated request
  • Soft deletion via is_active flag - preserves audit trail
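The key format and hashing scheme described above could look roughly like the following. This is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the 8-character key_prefix includes the sc_ prefix, which the text does not specify.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, key_prefix). Only the hash should be persisted."""
    key = "sc_" + secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
    return key, key[:8]                      # assumption: prefix counts the "sc_" part


def hash_api_key(key: str, pepper: bytes) -> str:
    """HMAC-SHA256 of the key, keyed with the server-side pepper (API_KEY_PEPPER)."""
    return hmac.new(pepper, key.encode(), hashlib.sha256).hexdigest()


def verify_api_key(presented: str, stored_hash: str, pepper: bytes) -> bool:
    """Constant-time comparison of the presented key's hash with the stored hash."""
    return hmac.compare_digest(hash_api_key(presented, pepper), stored_hash)
```

Storing only the peppered hash means a database leak alone does not reveal usable keys, as long as the pepper is kept outside the database.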

Multi-Tenant Isolation

Each API key is scoped to a tenant. Tenant-scoped queries filter all data access to prevent cross-tenant leaks.

  • tenant_id set from API key on every request via middleware
  • Admin keys (no tenant_id) can access all data
  • Tenant keys only see their own runs, workflows, schedules
  • Per-key cost limits (max_cost_per_run_usd)

Policy Engine

Declarative policy rules that evaluate step outputs in real-time. Automatically redact PII, block secrets, and trigger approvals.

  • PII patterns - email, phone, SSN, credit card regex detection
  • 30+ credential patterns - Slack, GitHub, AWS, Stripe, OpenAI, and more
  • Actions - redact, block, inject_approval, alert, log
  • Severity levels - critical, high, medium, low
  • Safe expression evaluation via simpleeval (no Python eval/exec)
  • Auto-generated credential policies when tools are used
  • Redacted output stored separately - originals preserved for LLM context

SSRF Prevention

All webhook callback URLs are validated against private network ranges before any HTTP request is made.

  • Blocks loopback (127.0.0.0/8, ::1)
  • Blocks private networks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
  • Blocks link-local (169.254.0.0/16) and IPv6 ULA (fc00::/7)
  • DNS resolution performed before the IP check, so hostnames that resolve to private addresses are rejected (mitigates DNS rebinding and TOCTOU attacks)
  • Scheme validation - only http:// and https://
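The checks above can be sketched with the standard library as follows. The function name and the default port are assumptions, and this is not Sandcastle's actual validator; it only shows the order of operations (scheme check, DNS resolution, then range check on every resolved address).

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Ranges named in the list above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(n)
    for n in (
        "127.0.0.0/8", "::1/128",                         # loopback
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # private networks
        "169.254.0.0/16", "fc00::/7",                     # link-local, IPv6 ULA
    )
]


def is_safe_webhook_url(url: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # Resolve first, then check every address the name maps to.
        infos = socket.getaddrinfo(
            parsed.hostname, parsed.port or 443, proto=socket.IPPROTO_TCP
        )
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if any(addr in net for net in BLOCKED_NETWORKS):
            return False
    return bool(infos)
```

To close the gap between check and use, a caller would typically connect to one of the validated addresses rather than resolving the hostname a second time.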

Path Traversal Protection

All file operations are sandboxed within base directories. Resolved paths are validated to prevent escape.

  • Path.resolve() + is_relative_to() check on all storage operations
  • Raises ValueError on traversal attempts (../../etc/passwd)
  • Dashboard SPA fallback rejects paths containing ..
  • Runner file validation blocks .. and absolute paths
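A minimal sketch of the Path.resolve() + is_relative_to() check described above (Python 3.9+). The helper name safe_path is hypothetical.

```python
from pathlib import Path


def safe_path(base_dir, user_path) -> Path:
    """Resolve user_path inside base_dir, raising ValueError if it escapes."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check below
    # also rejects inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept `base/../../etc/passwd`, whereas the resolved path makes the escape visible.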

Rate Limiting

Pluggable rate limiting with in-memory and distributed Redis backends. Sliding window counter per tenant or IP prevents abuse of sandbox endpoints.

  • Default: 10 requests per 60 seconds on execution endpoints
  • Keyed by tenant:{id} when authenticated, ip:{addr} when anonymous
  • In-memory backend - default, zero config, sliding window with stale entry pruning
  • Redis backend - distributed sorted set pipeline, auto-selected when REDIS_URL is set
  • HTTP 429 response with Retry-After, X-RateLimit-Limit, X-RateLimit-Remaining headers
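An in-memory sliding window with stale-entry pruning could be sketched as below. The class and function names are assumptions, and the injectable clock exists only to make the example testable; the real backend may differ.

```python
import time
from collections import defaultdict, deque


def rate_limit_key(tenant_id, ip: str) -> str:
    """tenant:{id} when authenticated, ip:{addr} when anonymous."""
    return f"tenant:{tenant_id}" if tenant_id else f"ip:{ip}"


class SlidingWindowLimiter:
    def __init__(self, limit: int = 10, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests

    def check(self, key: str) -> tuple[bool, int, float]:
        """Return (allowed, remaining, retry_after_seconds)."""
        now = self._clock()
        hits = self._hits[key]
        while hits and now - hits[0] >= self.window:  # prune stale entries
            hits.popleft()
        if len(hits) >= self.limit:
            return False, 0, self.window - (now - hits[0])
        hits.append(now)
        return True, self.limit - len(hits), 0.0
```

The three returned values map naturally onto the response: a denied request becomes HTTP 429 with Retry-After, and the remaining count feeds X-RateLimit-Remaining.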

Credential Management

Tool credentials are encrypted at rest with Fernet symmetric encryption, resolved from environment variables or database-stored named connections. Never logged, never exposed in outputs.

  • Encryption at rest - Fernet (AES-128-CBC + HMAC-SHA256) via CREDENTIAL_ENCRYPTION_KEY
  • Graceful fallback - no key configured means plaintext passthrough (backwards compatible)
  • TOOL_* prefix convention for environment variables
  • Named connections (postgresql:analytics) stored in DB with unique constraint
  • Credential masking for UI display - shows first 4 + last 4 chars only
  • Credentials injected into sandbox env at runtime, not stored in workflow YAML
  • Export API (/workflows/{name}/export) sanitizes credentials from YAML
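The display masking rule (first 4 + last 4 characters) might be implemented along these lines. The function name and the handling of very short values are assumptions; the text does not say what happens when a credential has 8 characters or fewer.

```python
def mask_credential(value: str) -> str:
    """Show only the first 4 and last 4 characters of a credential."""
    if len(value) <= 8:
        # Assumption: short values are fully masked so nothing leaks.
        return "*" * len(value)
    return f"{value[:4]}{'*' * (len(value) - 8)}{value[-4:]}"
```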

Telemetry Anonymization

Opt-in error reporting via Sentry with aggressive anonymization. Disabled by default. No user data, no secrets, no hostnames.

  • Opt-in only: TELEMETRY_ENABLED=true + SENTRY_DSN
  • send_default_pii=False, server_name=None, zero tracing
  • Before-send hook strips: API keys, tokens, secrets, file paths, request data, user data, stack frame variables
  • 40+ sensitive env var names blocklisted from reporting
  • Local fallback: failed sends saved to data/error_reports/

Webhook Signing

Outgoing webhooks are signed with HMAC-SHA256. Recipients can verify authenticity using the signature header.

  • HMAC-SHA256 signature over JSON payload
  • Header: X-Sandcastle-Signature
  • Timing-safe comparison via hmac.compare_digest()
  • 10-second timeout on webhook delivery
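For recipients, verification could look like the sketch below. It assumes the X-Sandcastle-Signature header carries a hex-encoded HMAC-SHA256 of the raw request body; the encoding and the exact JSON serialization are not stated above, so treat both as assumptions.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, secret: bytes) -> tuple[bytes, str]:
    """Serialize the payload and return (body_bytes, hex_signature)."""
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode()
    return body, hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(body: bytes, header_value: str, secret: bytes) -> bool:
    """Recompute the HMAC over the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes so unexpected header content cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

Verification should run over the bytes exactly as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.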

Security Headers

Middleware injects hardened HTTP headers on every response. Content Security Policy applied to dashboard routes, with a report-only mode for gradual rollout.

  • X-Content-Type-Options: nosniff - prevents MIME sniffing
  • X-Frame-Options: DENY - blocks clickjacking
  • Referrer-Policy: strict-origin-when-cross-origin
  • Permissions-Policy - disables camera, microphone, geolocation, payment
  • CSP - default-src 'self' with scoped overrides for styles, scripts, images, fonts
  • CSP applied to dashboard paths only (not /api) to avoid breaking JSON responses
  • CSP_REPORT_ONLY=true uses Content-Security-Policy-Report-Only header for safe testing
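The header set above could be produced by a helper like this one. It is a sketch: the function name is hypothetical, the CSP value is shortened to its default-src directive, and it assumes every path outside /api counts as a dashboard route.

```python
BASE_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=(), payment=()",
}


def security_headers(path: str, csp_report_only: bool = False) -> dict[str, str]:
    """Return the response headers for a request path."""
    headers = dict(BASE_HEADERS)
    if not path.startswith("/api"):  # CSP only on dashboard routes
        name = (
            "Content-Security-Policy-Report-Only"
            if csp_report_only
            else "Content-Security-Policy"
        )
        headers[name] = "default-src 'self'"  # scoped overrides omitted here
    return headers
```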

Circuit Breaker

Three-state circuit breaker protects against cascading backend failures. Automatically recovers when the backend stabilizes.

  • CLOSED - normal operation, requests pass through
  • OPEN - after 5 consecutive failures, reject immediately
  • HALF_OPEN - after 30s cooldown, allow one test request
  • Success in HALF_OPEN resets to CLOSED
  • Metrics tracking: total queries, failures, failovers, CB rejections
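The state machine above can be sketched in a few lines. Class and method names are assumptions, metrics are omitted, and the behavior on a failed HALF_OPEN probe (reopening the circuit) is an assumption, since the list only describes the success case.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, cooldown: float = 30.0,
                 clock=time.monotonic):
        self.threshold = failure_threshold
        self.cooldown = cooldown
        self._clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a request may be sent to the backend."""
        if self.state == "OPEN":
            if self._clock() - self.opened_at >= self.cooldown:
                self.state = "HALF_OPEN"  # let exactly one test request through
                return True
            return False
        if self.state == "HALF_OPEN":
            return False                  # a probe is already in flight
        return True

    def record_success(self) -> None:
        self.failures = 0
        self.state = "CLOSED"

    def record_failure(self) -> None:
        self.failures += 1
        if self.state == "HALF_OPEN" or self.failures >= self.threshold:
            self.state = "OPEN"
            self.opened_at = self._clock()
```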

Audit Trail

SHA-256 hash chain audit trail. Every event links to the previous via cryptographic hash - tamper-evident and independently verifiable.

  • Hash chain - each AuditEvent stores entry_hash (SHA-256 of payload) and prev_hash linking to previous event
  • 7 executor hooks - workflow_started, step_started, step_completed, step_failed, workflow_completed, workflow_failed, approval_requested
  • 9 admin hooks - emergency_stop, emergency_reset, api_key_created, api_key_rotated, api_key_revoked, credential_stored, credential_deleted, config_changed, compliance_mode_changed
  • GET /audit/runs/{run_id} - full event log for a run
  • GET /audit/verify/{run_id} - verify chain integrity, returns broken links
  • GET /audit/admin - paginated admin action log
  • PolicyViolation - run_id, step_id, policy_id, severity, trigger_details, action_taken
  • ApprovalRequest - status, reviewer_id, reviewer_comment, full request/response data

Automatic secret scrubber

The policy engine automatically detects and redacts credentials from 30+ services in step outputs. Two-layer defense: PEM key blocks first, then token regex. Idempotent - safe to run twice. Patterns applied before data reaches storage, webhooks, or logs.

Category | Services / Patterns | Examples
Communication | Slack, Discord, Twilio, SendGrid, Resend, WhatsApp, Intercom | xoxb-, xoxp-, SG.
AI Providers | OpenAI, Anthropic, ElevenLabs, Tavily | sk-, sk-ant-
Cloud & DevOps | AWS, Vercel, Datadog, PagerDuty, Cloudflare | AKIA, Bearer
Version Control | GitHub, Jira, Linear | ghp_, gho_
Data & Storage | Supabase, Pinecone, Airtable, Snowflake, Redis | eyJ (JWT), UUID patterns
Payments & ERP | Stripe, Shopify, Plaid, QuickBooks, DocuSign | sk_live_, sk_test_
CRM | HubSpot, Salesforce, Zendesk | pat-, Bearer tokens
PII | Email, Phone, SSN, Credit Card | Regex patterns for common PII formats
Connection URLs | PostgreSQL, Redis, MySQL, MongoDB connection strings | postgres://user:pass@host, redis://:pass@host
PEM Private Keys | RSA, EC, DSA, ENCRYPTED private key blocks | -----BEGIN RSA PRIVATE KEY-----
Cloud Credentials | Azure Storage AccountKey, AWS compound keywords | AccountKey=..., aws_secret_access_key
JSON Secrets | JSON-quoted key/value pairs containing secrets | "password": "value", "secret": "..."
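The two-layer approach (PEM blocks first, then token patterns) can be illustrated as follows. The regexes and placeholder strings are a small, hypothetical subset written for this example, not the 30+ patterns Sandcastle ships.

```python
import re

# Layer 1: whole PEM private key blocks, including the body between markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN (?:RSA |EC |DSA |ENCRYPTED )?PRIVATE KEY-----"
    r".*?"
    r"-----END (?:RSA |EC |DSA |ENCRYPTED )?PRIVATE KEY-----",
    re.DOTALL,
)

# Layer 2: prefix-based token patterns (illustrative lengths).
TOKEN_PATTERNS = [
    re.compile(r"xox[bp]-[A-Za-z0-9-]{10,}"),         # Slack
    re.compile(r"ghp_[A-Za-z0-9]{36}"),               # GitHub
    re.compile(r"AKIA[0-9A-Z]{16}"),                  # AWS access key ID
    re.compile(r"sk_(?:live|test)_[A-Za-z0-9]{16,}"),  # Stripe
]


def scrub(text: str) -> str:
    text = PEM_BLOCK.sub("[REDACTED_PRIVATE_KEY]", text)
    for pattern in TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Because neither placeholder matches any pattern, running scrub on already-scrubbed text leaves it unchanged, which is what makes the operation idempotent.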

Your data. Your borders. Enforced.

v0.2666 adds hard enforcement where previous versions had soft controls. Data residency, provider failover, and cost tracking - all with full audit trail.

EU Data Residency Enforcement

Set DATA_RESIDENCY=eu and all AI processing routes through EU-based providers (Mistral) or stays local (Ollama). This is not a policy document - it's hard enforcement at the routing layer.

  • Routing layer rejects non-EU providers when EU mode is active
  • Failover chain respects residency - only EU/local providers in the chain
  • Provider audit trail logs region for every AI call
  • Dashboard shows residency status per provider
  • Auto-generated GDPR privacy notice via GET /compliance/privacy-notice
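Conceptually, enforcement at the routing layer reduces to a region check before any provider call. In this sketch the region table, the example-us-provider entry, and the function names are all hypothetical; only Mistral (EU) and Ollama (local) come from the text above.

```python
# Hypothetical region labels for illustration.
PROVIDER_REGIONS = {
    "mistral": "eu",
    "ollama": "local",
    "example-us-provider": "us",
}
EU_ALLOWED_REGIONS = {"eu", "local"}


class ResidencyError(Exception):
    """Raised when a provider is not permitted under the active residency mode."""


def allowed_providers(chain, data_residency=None) -> list[str]:
    """Filter a failover chain down to providers permitted in the current mode."""
    if data_residency != "eu":
        return list(chain)
    return [p for p in chain if PROVIDER_REGIONS.get(p) in EU_ALLOWED_REGIONS]


def route(provider: str, data_residency=None) -> str:
    """Reject the call outright instead of silently rerouting it."""
    if data_residency == "eu" and PROVIDER_REGIONS.get(provider) not in EU_ALLOWED_REGIONS:
        raise ResidencyError(f"{provider} is not permitted when DATA_RESIDENCY=eu")
    return provider
```

Note that unknown providers are rejected in EU mode, since a missing region is treated as not allowed.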

Smart Auto-Failover

Provider hits rate limit or returns 5xx? Sandcastle switches to the next provider in the failover chain automatically. Per-provider cooldown prevents hammering a failing endpoint.

  • Triggers on HTTP 429, 5xx, and connection timeout
  • Per-provider cooldown (default 60s) prevents retry storms
  • Failover chain configurable per provider
  • Respects data residency constraints during failover
  • All failover events logged in provider audit trail
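A simplified version of the failover loop with per-provider cooldown is sketched below. The FailoverRouter class and the send callback contract are assumptions for the example; the chain passed in is expected to be residency-filtered already, and audit logging is left out.

```python
import time


class FailoverRouter:
    def __init__(self, chain, cooldown: float = 60.0, clock=time.monotonic):
        self.chain = list(chain)
        self.cooldown = cooldown
        self._clock = clock
        self._cooling_until = {}  # provider -> time when it may be retried

    def call(self, send):
        """send(provider) returns (status, body) or raises TimeoutError."""
        for provider in self.chain:
            if self._clock() < self._cooling_until.get(provider, 0.0):
                continue  # still cooling down, skip without sending
            try:
                status, body = send(provider)
            except TimeoutError:
                status, body = None, None
            if status is not None and status != 429 and status < 500:
                return provider, body
            # 429, 5xx, or timeout: start the cooldown and try the next provider
            self._cooling_until[provider] = self._clock() + self.cooldown
        raise RuntimeError("all providers in the failover chain are unavailable")
```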

Provider Audit Trail

Every AI call logs which provider handled it, which model was used, which region the data was processed in, and whether it was a failover event.

  • Provider, model, region logged per advisor call
  • Failover events include original provider and reason
  • Cost tracked per provider for billing transparency
  • Integrated with existing SHA-256 hash chain audit

SLO-Aware Model Routing

Critical operations automatically get the most capable model. Simple tasks get the cheapest. No manual configuration - quality tier matching is automatic per task purpose.

  • Quality tiers: high (generation), medium (analysis), low (formatting)
  • Per-provider model mapping for each tier
  • Override with ADVISOR_QUALITY_MODE=always_best or always_cheapest
  • Audit trail includes quality tier selection reason
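Tier selection as described above amounts to two lookups plus an override. In this sketch the model names, the example-provider key, the "auto" default mode name, and the medium fallback for unknown purposes are all assumptions.

```python
PURPOSE_TIERS = {"generation": "high", "analysis": "medium", "formatting": "low"}

# Hypothetical per-provider mapping from tier to model name.
MODEL_MAP = {
    "example-provider": {
        "high": "large-model",
        "medium": "medium-model",
        "low": "small-model",
    },
}


def select_model(provider: str, purpose: str, quality_mode: str = "auto"):
    """Return (tier, model, reason); the reason is what an audit entry could record."""
    if quality_mode == "always_best":
        tier, reason = "high", "override: always_best"
    elif quality_mode == "always_cheapest":
        tier, reason = "low", "override: always_cheapest"
    else:
        tier = PURPOSE_TIERS.get(purpose, "medium")  # assumed fallback tier
        reason = f"purpose: {purpose}"
    return tier, MODEL_MAP[provider][tier], reason
```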

EU AI Act compliance - built in

Deadline: August 2, 2026, the date from which most EU AI Act obligations apply. Sandcastle is the first AI orchestrator with native EU AI Act support.

EU AI Act Ready

Full compliance toolkit in every workflow

Risk classification, tamper-evident audit trail, transparency reports, technical documentation generation, PII redaction, and emergency stop - all configured in YAML, enforced at runtime.

Risk Classification

Classify workflows as minimal, limited, high, or unacceptable per EU AI Act Annex III. High-risk workflows require human approval gates.

risk_level: high

Tamper-Evident Audit Trail

SHA-256 hash chain on every event. Each entry links to the previous via cryptographic hash. Chain integrity verifiable via API.

GET /audit/verify/{run_id}

Transparency Reports

Article 13 compliant reports generated per-run. AI models used, human oversight, policy violations, cost breakdown.

GET /runs/{id}/transparency-report

Annex IV Documentation

Auto-generated technical documentation stubs covering intended purpose, AI models, risk classification, testing evidence, and data handling.

GET /workflows/{name}/annex-iv

Global Emergency Stop

One API call halts all running and queued workflows. In-memory + Redis flag checked by executor before every step.

POST /admin/emergency-stop

Compliance Mode

Set COMPLIANCE_MODE=eu_ai_act to enforce: high-risk workflows without approval steps are blocked, not just warned.

compliance_mode: eu_ai_act
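The difference between warning and blocking can be sketched as a validation step run before execution. The workflow dictionary shape, the "approval" step type, and the function name are assumptions for the example.

```python
class ComplianceError(Exception):
    """Raised when a workflow is blocked by the active compliance mode."""


def validate_workflow(workflow: dict, compliance_mode=None) -> list[str]:
    """Return warnings, or raise ComplianceError when compliance mode blocks the run."""
    warnings = []
    is_high_risk = workflow.get("risk_level") == "high"
    has_approval = any(
        step.get("type") == "approval" for step in workflow.get("steps", [])
    )
    if is_high_risk and not has_approval:
        message = "high-risk workflow has no approval step"
        if compliance_mode == "eu_ai_act":
            raise ComplianceError(message)  # blocked, not just warned
        warnings.append(message)
    return warnings
```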

Privacy Router - PII redaction at every boundary

7 PII patterns detected and redacted before data leaves your infrastructure.

Pattern | Example | Redacted As
Email | user@company.com | [EMAIL]
Phone | +1 (555) 123-4567 | [PHONE]
SSN | 123-45-6789 | [SSN]
Credit Card | 4111 1111 1111 1111 | [CREDIT_CARD]
IP Address | 192.168.1.1 | [IP_ADDRESS]
IBAN | DE89 3704 0044 0532 | [IBAN]
Date of Birth | 15/03/1990 | [DOB]

Per-Workflow Config

privacy:
  enabled: true
  mode: redact
  entities: [email, phone, ssn]
  apply_to: [outputs, webhooks]

Per-Server Config

PRIVACY_ENABLED=true
PRIVACY_ENTITIES=email,phone,ssn,credit_card
PRIVACY_APPLY_TO=outputs,webhooks
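As an illustration of how an entity list such as PRIVACY_ENTITIES might drive redaction, the sketch below covers three of the seven patterns. The regexes are simplified assumptions, not the patterns Sandcastle uses, and the function names are hypothetical.

```python
import re

# Simplified, illustrative patterns keyed by entity name.
PII_PATTERNS = {
    "email": (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    "credit_card": (re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"), "[CREDIT_CARD]"),
}


def parse_entities(env_value: str) -> list[str]:
    """Parse a comma-separated value such as 'email,phone,ssn'."""
    return [name.strip() for name in env_value.split(",") if name.strip()]


def redact(text: str, entities: list[str]) -> str:
    """Replace each enabled entity with its placeholder; unknown names are skipped."""
    for name in entities:
        if name in PII_PATTERNS:
            pattern, placeholder = PII_PATTERNS[name]
            text = pattern.sub(placeholder, text)
    return text
```

Only the entities that are enabled get redacted, so a workflow configured with `[email, phone, ssn]` would leave card numbers untouched unless credit_card is added.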

Compliance posture

Sandcastle's security controls align with industry compliance frameworks. E2B's cloud sandbox backend holds SOC 2 Type II certification.

SOC 2 Type II

E2B cloud sandbox infrastructure is SOC 2 Type II certified

E2B Backend

Data Isolation

Multi-tenant scoping on every API query. Ephemeral sandboxes with zero shared state

Active

Audit Logging

Policy violations, approval decisions, and key usage tracked in database

Active

Secret Management

Fernet encryption at rest. Key rotation with grace periods. Never logged, masked for display

Active

PII Protection

Automatic redaction of email, phone, SSN, credit card patterns in outputs

Active

Telemetry Privacy

Opt-in only. Aggressive anonymization strips all secrets and user data

Active

GDPR Ready

No PII in telemetry. Tenant data isolation. PII redaction in policy engine

Active

Report a vulnerability

Found a security issue? Please report it responsibly to security@sandcastle-ai.eu. We respond within 24 hours and aim to fix critical issues within 72 hours.