00 · OVERVIEW
One base_url.
Every provider behind it.
Krindel is a drop-in, OpenAI-compatible gateway in front of every LLM provider you use: routing and failover, per-key budgets and rate limits, PII redaction, shared caching, and a durable async queue, all governed in one place. Adopting it means changing one line.
01 · LIVE DEMO
What the provider would see (PII mode: redact)
Loading the engine…
02 · OPERATION
Three steps. Then it's boring.
- Point your SDK at Krindel. Change
base_url, use a Krindel-minted virtual key, and touch nothing else. This is tested, not assumed: the official, unmodified OpenAI SDKs (openai‑python 2.44.0, openai‑node 6.45.0, versions pinned inscripts/sdk_smoke.sh) pass non-streaming calls, SSE streaming, and typed error handling against a live gateway. The completion paths also run end-to-end through a real provider. - Mint virtual keys. Workspace-scoped keys with per-key RPM/TPM limits and monthly budgets; raw keys are shown once and stored only as hashes.
- Route and govern. Priority, round-robin, cheapest, or cascade routing, with retries and circuit breakers. Budgets are enforced atomically, even across a cluster.
03 · CAPABILITIES
Everything between your apps and the providers
Routing & failover
One public model name fans out to prioritized targets across providers, with retries, backoff, and per-target circuit breakers. A dead provider is skipped mid-request.
Budgets that hold under load
Worst-case cost gets reserved atomically before a request is routed. On the clustered (Postgres) backend, a concurrent burst across multiple gateway nodes still can't collectively beat a monthly budget. Reservations orphaned by a crash self-heal inside the 15-minute lease window.
Cluster-consistent rate limits
Per-key RPM and TPM token buckets. On the clustered (Redis) backend, N nodes enforce one limit, not N of them: the admission decision executes atomically on shared state.
One-way PII redaction
Emails, phone numbers, SSNs, credit cards, and more get replaced with [REDACTED_*] before anything leaves the gateway — it's the same engine running in the demo above. Off, log, redact, or block, set per deployment. There's no un-redaction, and that's deliberate.
Durable async jobs
POST /v1/jobs runs the same gate and router as the sync path. On the clustered backend, the queue is durable: kill a node mid-traffic and its accepted jobs are reclaimed and finished by the others. We test exactly that.
Shared response cache
Exact-match, workspace-isolated caching. On the clustered (Redis) backend, a completion cached by one node is a hit on all of them. Tenants never share cache entries.
Audit trail & export
Every request is recorded: tokens, cost, latency, status, PII types, retries. Stream it out as JSONL or CSV, full history on the clustered backend. Retention windows are opt-in, for when you need to prune.
Role-based admin access
Static config identities. A viewer/operator/admin role matrix over the admin API. Fail-closed. The shared admin key remains.
Zero-dependency core
The core is pure Go standard library. The one exception is the PostgreSQL driver, confined to a single package and enforced by an architecture test that fails CI on any leak. Redis gets spoken natively too, through a RESP client we wrote ourselves in about 400 lines.
04 · MEASURED
Numbers this repository produced
Krindel's constitution forbids publishing a performance claim the repo
didn't measure itself. These come from make bench, an
open-loop harness that boots the full gateway in-process against an
identical-stack baseline.
added latency, sync completions at 1,000 RPS sustained, 0 errors
the same run with PII redaction on (vs 149µs off) at 1,000 RPS
live smoke checks green on a single node, and the same suite (18/18, minus the node-local TLS section and the RBAC identity checks, which need identities provisioned on the target) green through a load balancer over two nodes
accepted jobs after SIGKILLing a node mid-traffic in the cluster validation harness
Caveat: benchmark numbers were measured on a shared 2-vCPU VM. A reference-hardware re-run is on the roadmap before these numbers appear anywhere money changes hands.
05 · AVAILABILITY
The gateway is real. The licensing is being finalized.
Krindel is under active development, with its commercial license under counsel review. Want an early conversation, or a live demo against your own provider keys (fully offline if you prefer)? Get in touch.