v0.2666 was the vision. v0.2666b1 is the proof. We shipped Universal Advisor, EU Data Residency, and Smart Auto-Failover. Then we spent a week trying to break it. 1,875 new tests. 90 bug fixes. 10 security patches. Every edge case we found - fixed.
AI Assistant got smarter
Generates workflows, finds errors, and fixes them automatically. Suggests matching templates. Learns your style from existing workflows. Knows which tools you have configured.
First 60 seconds redesigned
Auto-detects Ollama and API keys on first load. One-click demo workflow runs in 10 seconds. Onboarding wizard: 2 steps, not 5.
Failover Dashboard
See when and why providers failed over. Per-workflow cost recommendations with savings amounts.
Agent Memory Page
Search, browse, delete agent memories. Importance scores and decay visualization.
Security hardened
Admin auth on evolution endpoints, template injection prevention, EU residency enforced on every LLM call, rate limiting, credentials sanitized in logs.
Faster everywhere
GZip compression, 2-phase dashboard loading, cache headers, database indexes, search debounce.
The release where Sandcastle stopped being an orchestrator and became a platform. You used to choose a provider and hope for the best. Now you describe what you want and the system figures out the rest - which model, which provider, what it costs, where the data lives, and what happens when something breaks.
Universal Advisor - one setting changes everything
Set SANDCASTLE_ADVISOR_PROVIDER=mistral and every AI feature - generation, evolution, evaluation, quality scoring - switches to Mistral. Tomorrow you want Claude? Change one line. Ollama on your laptop? Same line. Six providers, one setting, zero code changes.
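A minimal sketch of how one setting can drive every feature, assuming the provider is read from the environment each time a feature asks for it. Only SANDCASTLE_ADVISOR_PROVIDER and the value "mistral" come from these notes. The function names, the default, and the other provider identifiers are hypothetical, and this is not Sandcastle's actual resolver.

```python
import os

# Only "mistral" is spelled out in the notes; the other identifiers are guesses.
KNOWN_PROVIDERS = {"mistral", "claude", "ollama"}

# The four AI features the notes say follow the advisor setting.
FEATURES = ("generation", "evolution", "evaluation", "quality scoring")


def advisor_provider(env=None, default="claude"):
    """Resolve the advisor provider from SANDCASTLE_ADVISOR_PROVIDER."""
    env = os.environ if env is None else env
    name = env.get("SANDCASTLE_ADVISOR_PROVIDER", default).strip().lower()
    if name not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown advisor provider: {name!r}")
    return name


def providers_by_feature(env=None):
    """Every feature asks the same resolver, so one setting switches them all."""
    return {feature: advisor_provider(env) for feature in FEATURES}
```

Because no feature keeps its own provider setting, changing the one variable is enough to move all four.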
EU Data Residency
Set DATA_RESIDENCY=eu. That's it. All AI processing routes through EU providers or stays local. Try sending data to a US provider with EU mode on - Sandcastle won't let you. Not a promise. Enforcement.
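Enforcement rather than promise can be sketched as a guard that runs before every LLM call and refuses anything outside the allowed region. The region table, the exception, and the function name below are hypothetical. Only the DATA_RESIDENCY=eu setting and the "EU providers or local" rule come from the notes.

```python
class ResidencyError(RuntimeError):
    """Raised when a call would leave the allowed region."""


# Hypothetical region table; "local" means the request never leaves the machine.
PROVIDER_REGION = {"mistral": "eu", "ollama": "local", "claude": "us"}


def check_residency(provider, data_residency):
    """Return the provider if it satisfies the residency mode, else raise."""
    region = PROVIDER_REGION.get(provider)
    # Unknown providers have no region, so they are blocked too (fail closed).
    if data_residency == "eu" and region not in ("eu", "local"):
        raise ResidencyError(f"{provider} is not allowed with DATA_RESIDENCY=eu")
    return provider
```

Failing closed on unknown providers is a design choice of this sketch: a provider missing from the table is treated as non-compliant instead of being waved through.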
Smart Auto-Failover
Provider hits a rate limit at 3 AM? Sandcastle switches to the next one. No error. No alert. No intervention. You wake up, check the dashboard, see "Failover activated 2x overnight. $0.12 additional cost." That's the whole story.
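The overnight story above boils down to an ordered list of providers and a loop that moves on when one is rate limited. Here is a rough sketch under that assumption. The exception class, the function, and the two fake providers are all hypothetical stand-ins, not Sandcastle's failover code.

```python
class RateLimited(Exception):
    """Stand-in for a provider's rate-limit error."""


def call_with_failover(providers, prompt):
    """Try each (name, call) pair in order.

    Returns (result, events), where events records every provider skipped.
    """
    events = []
    for name, call in providers:
        try:
            return call(prompt), events
        except RateLimited as exc:
            events.append({"provider": name, "reason": str(exc)})
    raise RuntimeError("every provider was rate limited")


def primary(prompt):
    # Fake provider that is always rate limited.
    raise RateLimited("429 from primary")


def backup(prompt):
    # Fake provider that always answers.
    return f"backup answered: {prompt}"


result, events = call_with_failover(
    [("primary", primary), ("backup", backup)], "summarize"
)
```

The caller gets a normal result. The events list is the kind of record a dashboard could later show as "failover activated".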
SLO-Aware Routing
Critical analysis gets Claude Opus. Simple formatting gets Haiku. Not because you configured 47 rules - because Sandcastle matches model quality to task importance automatically. You set the priorities. It picks the brain.
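In its simplest form, matching model quality to task importance is a lookup from priority to model tier. The priority labels and the middle tier below are assumptions made for illustration. The notes only say that critical work gets Opus and simple formatting gets Haiku, and the real routing is presumably richer than a table.

```python
# Priority labels are hypothetical; Opus and Haiku are the tiers named above.
MODEL_BY_PRIORITY = {"low": "haiku", "normal": "sonnet", "critical": "opus"}


def pick_model(priority, default="sonnet"):
    """Map a task's priority to a model tier, falling back to a middle tier."""
    return MODEL_BY_PRIORITY.get(priority, default)
```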
Cost Intelligence - you'll finally know where the money goes
Per-provider cost breakdown. "Last 30 days: $120 via Claude. Same workloads via Mistral: $45." One click to see the math. And proactive recommendations that actually make sense - not just data, but advice. "Switch these 3 workflows to Mistral and save $75/month with EU residency included."
OpenClaw Integration
New step type: type: openclaw. Your workflows can now call OpenClaw agents as a step in any pipeline. One more thing you don't have to build yourself.
Document Parser
New step type: type: parse. PDF, DOCX, XLSX at 576 pages per second. No external service. No API key. No monthly bill. Just pip install sandcastle-ai[parse] and go.
Dashboard that gets out of your way
Getting Started checklist for new users. Quick Run cards - jump straight to results. Collapsible sidebar. Step palette with categories. Backend configurator with copy-paste .env snippets. Lazy loading. The dashboard got simpler by getting smarter.
Workflow Evolution
Your workflows optimize themselves. Set an eval suite, click "Evolve", and Sandcastle autonomously mutates prompts, swaps models, and simplifies steps - keeping only changes that improve your score.
Composite Scoring
Score = quality * confidence - cost_penalty - latency_penalty. Every mutation is scored by the same formula. No subjective judgments.
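The formula translates directly into code. This sketch takes the two penalties as ready-made numbers, because the notes do not say how they are derived from cost and latency. The function name and the example values are illustrative.

```python
def composite_score(quality, confidence, cost_penalty=0.0, latency_penalty=0.0):
    """quality * confidence - cost_penalty - latency_penalty."""
    return quality * confidence - cost_penalty - latency_penalty


# A cheaper, faster candidate can win even with slightly lower quality.
baseline = composite_score(0.90, 0.80, cost_penalty=0.10, latency_penalty=0.05)
candidate = composite_score(0.88, 0.80, cost_penalty=0.03, latency_penalty=0.02)
```

With these made-up inputs the candidate scores higher than the baseline, which is the kind of trade-off a single mechanical score makes visible.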
Mutation Operators
Three strategies: prompt refinement (LLM-guided), model swapping (haiku <-> sonnet based on quality/cost), and simplification (the AI learns that less is more).
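"Keeping only changes that improve your score" is a hill-climbing loop: mutate the current best, score the result, keep it only if it wins. The sketch below shows that loop with a toy workflow (a single number) and toy mutate and score functions. All names are hypothetical, and the real mutation operators are the three strategies listed above, not this random step.

```python
import random


def evolve(baseline, mutate, score, iterations=20, seed=0):
    """Greedy search: accept a mutation only when it beats the best score."""
    rng = random.Random(seed)
    best, best_score = baseline, score(baseline)
    history = [best_score]
    for _ in range(iterations):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, best_score, history


def mutate(value, rng):
    # Toy mutation: nudge the value up or down by one.
    return value + rng.choice([-1, 1])


def score(value):
    # Toy score: the closer to 5, the better.
    return -abs(value - 5)


best, best_score, history = evolve(0, mutate, score, iterations=50)
```

The history list never goes down, which is what a "baseline vs best" chart relies on.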
Evolution Dashboard
Track every iteration. See the score evolve. Compare baseline vs best. Accept the winner with one click.
Workflow as API
Your workflows are now production APIs. One click to publish, one curl to call. Your customers hit the endpoint, you see every run in the dashboard.
curl -X POST "https://api.example.com/api/v1/lead-enrichment" \
  -H "Authorization: Bearer $SANDCASTLE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"company": "Acme Corp", "domain": "acme.com"}'
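The same call from Python, using only the standard library. The URL, headers, and payload mirror the curl example. The helper names are hypothetical, and the shape of the response is not documented here, so the sketch returns the raw bytes instead of guessing at fields.

```python
import json
import os
import urllib.request


def build_request(url, payload, api_key):
    """Build the POST request that the curl example sends."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


request = build_request(
    "https://api.example.com/api/v1/lead-enrichment",
    {"company": "Acme Corp", "domain": "acme.com"},
    os.environ.get("SANDCASTLE_KEY", ""),
)
# body = send(request)  # uncomment once the URL points at a real deployment
```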
Living Dashboard
The dashboard shows real data now. Real-time sparklines update on every run. Anomaly detection catches cost spikes and error streaks automatically.
Heatmap shows 6 months of activity - click any day to drill down into individual runs.
Agent Marketplace
Publish your workflows to the community hub. Others rate them, install them, remix them.
Think npm for AI agents. One command to publish, one click to install.
Multi-Agent Delegation
Agents that call other agents. Dynamic routing: the output of one step picks which workflow runs next.
Full depth tracking, real-time progress events at every level of delegation.
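Dynamic routing with depth tracking can be sketched as a recursive runner: each workflow returns its output plus the name of the workflow to run next, and the runner records the depth of every hop. The workflow functions, the depth limit, and the event format are assumptions for illustration, not Sandcastle's delegation API.

```python
class DelegationDepthError(RuntimeError):
    """Raised when delegation nests deeper than the allowed limit."""


def run_workflow(name, payload, workflows, depth=0, max_depth=3, events=None):
    """Run a workflow, then whichever workflow its output routes to."""
    events = [] if events is None else events
    if depth > max_depth:
        raise DelegationDepthError(f"depth {depth} exceeds limit {max_depth}")
    events.append((depth, name))  # one progress event per delegation level
    output, next_name = workflows[name](payload)
    if next_name is None:
        return output, events
    return run_workflow(next_name, output, workflows, depth + 1, max_depth, events)


def triage(payload):
    # The output of this step picks which workflow runs next.
    route = "sales" if "pricing" in payload else "support"
    return payload, route


def sales(payload):
    return f"sales handled: {payload}", None


def support(payload):
    return f"support handled: {payload}", None


workflows = {"triage": triage, "sales": sales, "support": support}
output, events = run_workflow("triage", "pricing question", workflows)
```

The depth limit is there so that a workflow that routes to itself fails loudly instead of looping forever.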
File Upload
Workflows accept real documents now. Drop a PDF, CSV, or image into the run modal.
Text files are inlined into prompts. Binary files are passed as references to your agent steps.
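One way to picture the text-versus-binary split: try to decode the upload as UTF-8, inline it on success, and pass a reference otherwise. The UTF-8 test, the function name, and the inline format are assumptions. The notes do not say how Sandcastle tells the two kinds of file apart.

```python
def attach_file(prompt, filename, data):
    """Return (prompt, references) after attaching one uploaded file."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Binary file: leave the prompt alone and hand the step a reference.
        return prompt, [filename]
    # Text file: inline the content under a small header.
    return f"{prompt}\n\n--- {filename} ---\n{text}", []


inlined, refs_a = attach_file("Summarize this.", "notes.csv", b"id,name\n1,Ada\n")
unchanged, refs_b = attach_file("Describe this.", "scan.png", b"\x89PNG\xff\xfe")
```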
EU AI Act Compliance
Risk classification per EU AI Act categories. Compliance mode that blocks high-risk workflows without human approval.
Transparency reports (Article 13), Annex IV technical documentation, global emergency stop.
Tamper-Evident Audit Trail
SHA-256 hash chain on every event. Each entry stores the hash of the previous one, so if anyone modifies a record, the chain breaks from that point on.
Verify integrity via API. 7 executor hooks + 9 admin action hooks. 3 query endpoints.
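A hash chain is short enough to sketch in full. Each entry hashes the previous entry's hash together with its own event, and verification recomputes every link. The entry layout, the genesis value, and the function names are assumptions. Only "SHA-256" and "each entry links to the previous" come from the notes.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, event):
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, event):
    """Add an event, linking it to the last entry in the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"event": event, "prev": prev_hash, "hash": entry_hash(prev_hash, event)}
    )


def verify(chain):
    """Recompute every link; any edited record makes this return False."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append(chain, {"action": "run.started", "run_id": 1})
append(chain, {"action": "run.completed", "run_id": 1})
```

Serializing with sorted keys matters: the same event must always produce the same bytes, or honest records would fail verification.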
Privacy Router
7 PII patterns: email, phone, SSN, credit card, IP address, IBAN, date of birth.
Per-workflow or per-server config. Redact sensitive data before it touches your LLM, or run in audit-only mode.
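Redact mode and audit-only mode differ in one line: both find matches, only one rewrites the text. The sketch covers three of the seven pattern types with deliberately simplified regexes. The patterns, labels, and function name are illustrative and not the ones Sandcastle ships.

```python
import re

# Simplified patterns for three of the seven PII types listed above.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def route(text, mode="redact"):
    """Return (text, findings); the text is rewritten only in redact mode."""
    findings = []
    for label, pattern in PATTERNS.items():
        findings.extend((label, match.group()) for match in pattern.finditer(text))
        if mode == "redact":
            text = pattern.sub(f"[{label.upper()}]", text)
    return text, findings


sample = "Mail ada@example.com from 10.0.0.1, SSN 123-45-6789"
redacted, found = route(sample)
audited, found_again = route(sample, mode="audit")
```

In audit mode the prompt reaches the LLM untouched, but the findings list still says what would have been redacted.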
Browser Modes
LightPanda (10x faster CDP, zero memory bloat) and Browserbase (cloud, zero cold-start) join E2B, Docker, and local as sandbox options.
Playwright pre-baked in the Dockerfile. Switch backends with one env var.
OpenTelemetry
Workflow and step-level OTLP spans. Cost, duration, and token counts on every trace. Compatible with Jaeger, Tempo, Honeycomb, Datadog.
pip install sandcastle-ai[otel]
5 New Connectors
Langfuse (LLM observability), Qdrant (vector search), GCS (Google Cloud Storage), Azure Blob Storage, and Exa (semantic web search).
Cost Estimation
POST /runs/estimate tells you what a workflow will cost before you run it. Per-step breakdown with model pricing baked in.
Set budget guardrails. Workflows that would exceed the budget are blocked before a single token is spent.
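The arithmetic behind an estimate is tokens times price, summed per step, then compared to the budget before anything runs. The sketch below uses integer micro-dollars to keep the sums exact. The prices are placeholders, not real provider pricing, and the step format and function names are hypothetical. It does not reproduce the actual POST /runs/estimate payload.

```python
class BudgetExceeded(RuntimeError):
    """Raised before any step runs when the estimate is over budget."""


# Placeholder prices in micro-dollars per token as (input, output).
# These are NOT real provider prices.
PRICE_MICRO_USD = {"small": (1, 5), "large": (3, 15)}


def estimate(steps):
    """Return (per-step breakdown, total) in micro-dollars."""
    breakdown = []
    for step in steps:
        price_in, price_out = PRICE_MICRO_USD[step["model"]]
        cost = step["input_tokens"] * price_in + step["output_tokens"] * price_out
        breakdown.append({"step": step["name"], "micro_usd": cost})
    return breakdown, sum(item["micro_usd"] for item in breakdown)


def check_budget(steps, budget_micro_usd):
    """Block the run when the estimated total exceeds the budget."""
    breakdown, total = estimate(steps)
    if total > budget_micro_usd:
        raise BudgetExceeded(f"estimated {total} > budget {budget_micro_usd}")
    return breakdown, total


steps = [
    {"name": "extract", "model": "small", "input_tokens": 2000, "output_tokens": 500},
    {"name": "analyze", "model": "large", "input_tokens": 4000, "output_tokens": 1000},
]
breakdown, total = estimate(steps)  # total is 31500 micro-dollars, i.e. $0.0315
```

With a budget of 30,000 micro-dollars, check_budget would raise for this workflow before a single token is spent.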
Secret Scrubber
Catches credential URLs, PEM keys, Azure AccountKey, and JSON-quoted secrets before they reach logs or LLM context.
Two-layer defense. Idempotent - safe to run multiple times on the same payload.
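Idempotence is easy to show with a small regex scrubber: running it twice gives the same result as running it once, because the masked output still matches the pattern and is replaced by itself. The three rules below are simplified illustrations of the categories named above, not Sandcastle's actual rules, and the sketch does not attempt the two-layer design.

```python
import re

RULES = [
    # scheme://user:password@host -> keep the user, mask the password
    (re.compile(r"(\w+://[^\s:/@]+:)[^\s@]+(@)"), r"\1***\2"),
    # Azure connection strings: AccountKey=<value>;
    (re.compile(r"(AccountKey=)[^;\s]+"), r"\1***"),
    # PEM blocks, including everything between BEGIN and END
    (
        re.compile(r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----", re.S),
        "[PEM REDACTED]",
    ),
]


def scrub(text):
    """Apply every rule in order and return the masked text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


line = (
    "postgres://app:s3cret@db.internal/prod "
    "AccountKey=abc123==;EndpointSuffix=core.windows.net"
)
clean = scrub(line)
```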
Eval Framework
Detects quality regressions by comparing scores as integer basis points instead of IEEE 754 floats. No more floating-point false positives that cause flaky CI.
Define expected outputs, run evals on every commit, catch regressions at a precision of one basis point (0.01 percentage points).
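Why integers help: float arithmetic can report a difference where none exists, while basis points computed from counts compare exactly. This is a minimal sketch of the idea. The function names and the tolerance parameter are hypothetical, not the Eval Framework's API.

```python
def pass_rate_bps(passed, total):
    """Pass rate in basis points (1 bp = 0.01 percentage point), integers only."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed * 10_000 // total


def regressed(baseline_bps, current_bps, tolerance_bps=0):
    """True when the current score fell by more than the tolerance."""
    return baseline_bps - current_bps > tolerance_bps


# The float pitfall: these two "equal" scores compare as different.
float_false_positive = (0.1 + 0.2) != 0.3

# The same comparison in basis points is exact.
baseline = pass_rate_bps(29, 30)
current = pass_rate_bps(87, 90)
```

Here baseline and current are the same integer, so regressed(baseline, current) is False and CI stays green.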
Composio Integration
500+ business app actions through a single step type. Gmail, Slack, Notion, Salesforce, HubSpot, and more - all wired in via the Composio connector.
One auth flow, any tool, zero custom code.