Inference Gateway

Overview

The inference gateway is the core data plane of an agentic AI platform. A two-tier architecture is recommended: kgateway (Tier 1) handles authentication, rate limiting, and guardrails, while Bifrost (Tier 2) handles model routing, fallback, and cost tracking. The documents in this section cover routing strategies, production deployment, Cascade Routing tuning, and OpenClaw implementation examples.
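The division of responsibilities between the two tiers can be illustrated with a small, self-contained sketch. It does not use the kgateway or Bifrost APIs. Every name in it (`Tier1Gateway`, `Tier2Router`, `handle`, the example handlers, and the cost figures) is hypothetical and only models which tier makes which decision.

```python
from dataclasses import dataclass


@dataclass
class Request:
    api_key: str
    prompt: str


class Tier1Gateway:
    """Hypothetical stand-in for the Tier 1 role: auth, rate limiting, guardrails."""

    def __init__(self, valid_keys, max_requests, blocked_terms):
        self.valid_keys = set(valid_keys)
        self.max_requests = max_requests          # simple per-key request budget
        self.blocked_terms = [t.lower() for t in blocked_terms]
        self.request_counts = {}

    def admit(self, req):
        if req.api_key not in self.valid_keys:
            return False, "unauthenticated"
        used = self.request_counts.get(req.api_key, 0)
        if used >= self.max_requests:
            return False, "rate_limited"
        if any(term in req.prompt.lower() for term in self.blocked_terms):
            return False, "blocked_by_guardrail"
        self.request_counts[req.api_key] = used + 1  # only admitted requests count
        return True, "ok"


class Tier2Router:
    """Hypothetical stand-in for the Tier 2 role: routing, fallback, cost tracking."""

    def __init__(self, models):
        # models: list of (name, cost_per_call, handler) in priority order
        self.models = models
        self.total_cost = 0.0

    def route(self, prompt):
        for name, cost, handler in self.models:
            try:
                answer = handler(prompt)
            except RuntimeError:
                continue  # this model failed, fall back to the next one
            self.total_cost += cost
            return name, answer
        raise RuntimeError("all models failed")


def handle(req, tier1, tier2):
    """Pass a request through Tier 1, then Tier 2 if it was admitted."""
    ok, reason = tier1.admit(req)
    if not ok:
        return {"status": reason}
    model, answer = tier2.route(req.prompt)
    return {"status": "ok", "model": model, "answer": answer}


def flaky_primary(prompt):
    # Simulates an unavailable primary model.
    raise RuntimeError("primary unavailable")


def stable_backup(prompt):
    return f"echo: {prompt}"
```

In this sketch a request rejected at Tier 1 never reaches a model, so it incurs no model cost, and a failure of the primary model is absorbed inside Tier 2 without the caller seeing it. That separation is the design point of the two-tier layout. A real deployment would configure these behaviors in the gateways themselves rather than in application code.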

Document List