Open-Source AI Infrastructure

Routing Is Intelligence

Data and models are commodity. Routing is intelligence.
We build open-source infrastructure for local-first AI routing, autonomous agents, and compound learning systems. Ship models that get smarter overnight without touching a cloud bill.

ModelCascade ↗ View on GitHub

ROUTE LOCAL/ ESCALATE SMART/ NEVER OVERSPEND/ COMPOUND NIGHTLY/ YOUR KEYS/ YOUR HARDWARE/ OPEN SOURCE/ MIT LICENSE/ ROUTE LOCAL/ ESCALATE SMART/ NEVER OVERSPEND/ COMPOUND NIGHTLY/ YOUR KEYS/ YOUR HARDWARE/ OPEN SOURCE/ MIT LICENSE/

01 Thesis

74%

Local Coverage

Most requests don't need a cloud model. A 1.2B parameter model running on consumer hardware handles 74% of agentic requests at $0 cost with sub-100ms latency. The remaining 26% escalate intelligently.

(1+λ)ⁿ

Compound Learning

Every overnight session makes the next one smarter. Shadow DPO pairs, graph memory, and behavioral distillation create a compounding loop. Small lambda, large n, exponential result.

Dependency Tax

Your keys. Your hardware. Your data. No telemetry. No proxy. No API keys shared with a routing service. The core is MIT-licensed. Self-host forever. Paid tiers add operational tooling, not routing capability.

02 Projects

Flagship

ModelCascade

Open-source multi-model cascade routing for autonomous agents. Route 74% of requests locally at $0. Escalate to cloud only when stuck. Three lines of code. Any framework. BYOK. No telemetry.

10K+ Production Dispatches

$0 Local Tier Cost

6 Model Backends

MIT License

View Product Page ↗

Open Source

Local Model Router

Multi-model routing on Linux with AMD and NVIDIA GPUs. GGUF serving, KV cache management, and consumer-hardware inference optimization.

GitHub ↗

Research

Compound Learning

Autonomous overnight training loops. Shadow DPO collection, behavioral distillation, and graph-memory compression for agents that improve while you sleep.

Coming Soon

INTELLIGENCE(t) = ROUTING(t) × (1+λ)ⁿ / DEPENDENCY(t)

The Cornstalk Equation — maximize routing, compound learning, minimize dependency