Tokenly Core runs in a Docker container to ensure consistent routing logic and model benchmarks. Ensure Docker Desktop is running.
Install the Continue extension in VS Code or JetBrains. This is the recommended way to use Tokenly for coding.
Once the extension is installed and Tokenly is running, simply select "Tokenly" from the model selector dropdown.
Tokenly supports all major providers and intelligently routes between them based on complexity.
Initializes your environment, validates API keys, and creates the persistent configuration in ~/.tokenly/.
Launches the local proxy server (port 8001) and the analytics dashboard. Use --port to override.
Quickly opens this dashboard in your default browser to view your savings and request logs.
Streams the real-time logs from the proxy service. Useful for debugging connection issues.
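When debugging connection issues, it can help to first confirm that something is listening on the proxy port at all. The sketch below is a generic TCP check, not part of Tokenly itself; it assumes the proxy binds to 127.0.0.1 and uses the default port 8001 mentioned above. The function name `proxy_is_listening` is hypothetical.

```python
import socket

# Default proxy port stated above; change it if you started the proxy with --port.
DEFAULT_PORT = 8001


def proxy_is_listening(host: str = "127.0.0.1", port: int = DEFAULT_PORT,
                       timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    The host is an assumption: a local proxy is expected on the loopback address.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    state = "reachable" if proxy_is_listening() else "not reachable"
    print(f"Tokenly proxy on port {DEFAULT_PORT} is {state}")
```

This only shows that a process is accepting connections on the port; it does not verify that the process is Tokenly or that your API keys are valid.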
Local Development
Build and test AI-powered applications without worrying about high API costs. Tokenly handles the routing to cheaper models for simple tasks.
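To make the idea of routing simple tasks to cheaper models concrete, here is a minimal sketch of a complexity-based router. It is an illustration only: the tier names, keyword list, and word-count threshold are all assumptions, and Tokenly's actual routing logic is not described here.

```python
from dataclasses import dataclass

# Hypothetical tier names; Tokenly's real tiers and thresholds are not documented here.
CHEAP, PREMIUM = "cheap", "premium"

# Assumed signals that a prompt may need a stronger model.
COMPLEX_HINTS = ("refactor", "prove", "architecture", "debug", "optimize")


@dataclass
class Route:
    tier: str
    reason: str


def route_prompt(prompt: str, max_simple_words: int = 60) -> Route:
    """Pick a tier from prompt length and a few exact-match keyword hints."""
    words = prompt.lower().split()
    if len(words) > max_simple_words:
        return Route(PREMIUM, "long prompt")
    if any(hint in words for hint in COMPLEX_HINTS):
        return Route(PREMIUM, "complexity keyword")
    return Route(CHEAP, "short prompt with no complexity hints")
```

For example, `route_prompt("What is the capital of France?")` falls through to the cheap tier, while a prompt containing the word "refactor" is sent to the premium tier. A production router would likely use richer signals than word counts and keywords.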
Benchmarking
Compare how different models perform on your specific prompts using the Benchmark view. See exactly how much you'd pay OpenAI vs Anthropic.
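The arithmetic behind a per-provider cost comparison can be sketched as follows. The provider names and prices are placeholders, not real rates, and the function names are hypothetical; substitute current values from each provider's pricing page.

```python
# Placeholder prices in USD per million tokens. These are NOT real rates.
PRICES_PER_MTOK = {
    "provider_a": {"input": 3.00, "output": 15.00},
    "provider_b": {"input": 1.00, "output": 4.00},
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the placeholder rates above."""
    price = PRICES_PER_MTOK[provider]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def compare(input_tokens: int, output_tokens: int) -> dict[str, float]:
    """Cost of the same request on every provider in the price table."""
    return {name: request_cost(name, input_tokens, output_tokens)
            for name in PRICES_PER_MTOK}


if __name__ == "__main__":
    for name, cost in compare(input_tokens=2_000, output_tokens=500).items():
        print(f"{name}: ${cost:.4f}")
```

Input and output tokens are priced separately because providers commonly charge different rates for each.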
Context Compression
For long conversations or codebases, Tokenly automatically compresses your prompt context to save up to 80% on input tokens.
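How Tokenly compresses context is not specified here. As a rough illustration of the general idea of fitting a conversation into a token budget, the sketch below uses plain truncation: it keeps the first message (assumed to be the system prompt) and as many of the newest messages as fit. The word-based token estimate and both function names are assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Keep the first message plus the newest messages that fit within `budget`.

    The first message is always kept, even if it alone exceeds the budget.
    """
    if not messages:
        return []
    head, tail = messages[0], messages[1:]
    remaining = budget - estimate_tokens(head["content"])
    kept = []
    for message in reversed(tail):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    return [head] + kept[::-1]  # restore chronological order
```

Truncation discards older turns entirely; real compression schemes may instead summarize or deduplicate them, so treat this as a baseline rather than a description of the product.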
CI/CD Integration
Run your automated tests through the Tokenly proxy to monitor cost regressions and performance across different model tiers.
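A cost-regression gate in CI can be as simple as comparing the cost of the current run against a stored baseline. The sketch below assumes you already have both figures from your own records; the function name, the 10% tolerance, and the sample numbers are hypothetical.

```python
import sys


def cost_regressed(baseline_usd: float, current_usd: float,
                   tolerance: float = 0.10) -> bool:
    """True if the current run costs more than baseline plus the allowed tolerance."""
    return current_usd > baseline_usd * (1 + tolerance)


if __name__ == "__main__":
    # Hypothetical figures; in practice these would come from your own cost records.
    baseline, current = 1.20, 1.25
    if cost_regressed(baseline, current):
        print(f"Cost regression: ${current:.2f} vs baseline ${baseline:.2f}")
        sys.exit(1)  # a non-zero exit code fails the CI job
    print("Cost within budget")
```

The tolerance absorbs normal run-to-run variation so that small fluctuations do not fail the build.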