Skip to main content

4 docs tagged with "cascade-routing"

View all tags

Cascade Routing Production Tuning

Guide to tuning Inference Gateway Cascade Routing classification thresholds, Canary rollout, Fallback, and cost drift alerts based on production traces

Custom Model Pipeline Guide

Building a domain-optimized model serving pipeline with LoRA Fine-tuning, Multi-LoRA Hot-swap, and SLM Cascade Routing

Inference Gateway

Routing strategies, deployment, cascade tuning, and implementation examples for kgateway and Bifrost-based 2-Tier inference gateways