21 docs tagged with "scope:impl"

Advanced Features

LLM Classifier, CloudFront/WAF, Semantic Caching configuration

Basic Deployment

kgateway installation, HTTPRoute configuration, Bifrost Gateway Mode setup

Coding Tool Integration & Cost Analysis

Aider, Cline, Continue.dev integration + Bedrock vs Kiro vs self-hosting cost comparison

Continuous Training Pipeline

EKS-based 5-stage pipeline that automatically promotes Langfuse traces to training data and connects GRPO/DPO preference tuning with Canary deployment.

Custom Model Deployment Guide

Hands-on guide to deploying large open-source models on EKS, based on the GLM-5.1 experience

Custom Model Pipeline Guide

Building a domain-optimized model serving pipeline with LoRA Fine-tuning, Multi-LoRA Hot-swap, and SLM Cascade Routing

Threshold verification of trained checkpoints, kgateway-based gradual Canary deployment, MLflow Registry version management, automatic rollback on regression, cost and quality KPI dashboard configuration.

GRPO/DPO Training Job

Production configuration for running NeMo-RL (GRPO) and TRL (DPO) training jobs with labeled preference datasets on Karpenter Spot node pools and Volcano Gang Scheduling.

Inference Gateway & LLM Gateway Routing Strategy

kgateway + Bifrost/LiteLLM 2-Tier architecture with Cascade Routing, Semantic Router, and Hybrid Routing design patterns

Inference Gateway Deployment Guide

Step-by-step deployment guide for kgateway-based Inference Gateway (basic/advanced/troubleshooting)

MLOps Pipeline on EKS

End-to-end ML lifecycle management with Kubeflow + MLflow + vLLM + ArgoCD GitOps

Monitoring & Observability Setup Guide

Hands-on setup guide for integrated monitoring with Prometheus to AMP, AMG, Langfuse, and Bifrost OTel

Open-Weight Model Deployment Guide

A customer-facing decision guide for evaluating and choosing self-hosted open-weight LLM deployment from the perspectives of token economics and data sovereignty.

OpenClaw AI Agent Gateway Deployment & Full Observability

Deploy OpenClaw AI Agent Gateway on EKS with cost optimization, and achieve full observability using Bifrost Auto-Router + Cilium Hubble + Langfuse

Reference Architecture

Production deployment and configuration reference architecture for the Agentic AI Platform

Request Cascading — Intelligent Model Routing

Complexity-based automatic model routing — comparison of LLM Classifier, LiteLLM, and vLLM Semantic Router approaches, RouteLLM research reference, and cost savings

SageMaker-EKS Hybrid ML Architecture

A hybrid ML architecture that trains on SageMaker and serves on EKS

Tiered Gateway Architecture

Single definition of the Agentic AI Platform gateway layers: Tier 1 Ingress, Tier 2 Inference Routing (Inference Extension) and LLM API Gateway, and the Agent Data Plane — their role separation and how to fill each layer

Trace → Dataset Materializer

Load Langfuse OTel traces into S3 Parquet/Iceberg and automatically construct GRPO/DPO training datasets by labeling rewards with Ragas + LLM Judge Fleet.

Troubleshooting Guide

Common issues and solutions during Inference Gateway deployment and operations

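Open-Weight Model Automated Deployment & Management Pipeline Architecture

Presents an architecture that automates open-weight model onboarding as a 7-stage pipeline, from HuggingFace leaderboard scanning through benchmark reproduction, per-instance performance profiling, multi-target deployment guide generation, and global Spot capacity acquisition, with humans intervening only at approval gates.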