Router Recipes
llm-d uses the llm-d Router to make intelligent routing decisions for inference requests. The router can be deployed in two modes:
Standalone (Default)
Use this mode when you do not want to deploy a proxy through the Kubernetes Gateway API. The standalone chart deploys the llm-d Router with an Envoy sidecar that proxies traffic directly.
Chart: oci://registry.k8s.io/gateway-api-inference-extension/charts/standalone
helm install <release-name> \
oci://registry.k8s.io/gateway-api-inference-extension/charts/standalone \
-f ${REPO_ROOT}/guides/recipes/router/base.values.yaml \
-f ${REPO_ROOT}/guides/recipes/router/features/monitoring.values.yaml \
-f ${REPO_ROOT}/guides/<your-guide>/router/<your-guide>.values.yaml \
--set provider.name=<gke|istio|none> \
-n ${NAMESPACE} \
--version v1.5.0
With Kubernetes Gateway API
Use this mode when you want to route traffic through a proxy managed by the Kubernetes Gateway API (e.g., GKE Gateway, Istio, Agentgateway). This mode requires:
- An installed Gateway control plane (see prereq/gateway-provider)
- A Gateway resource (see recipes/gateway)
- The inferencepool chart, deployed as shown below
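Before installing the chart, it can help to confirm that the first prerequisite is actually met. The following is a minimal sketch, not part of llm-d: gateway_api_ready is a hypothetical helper, and the KUBECTL override is an assumption added here so that a wrapper can be substituted. It assumes that an installed provider registers at least one GatewayClass, which may not hold for every setup.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of llm-d).
# KUBECTL may point at a wrapper; it defaults to plain kubectl.
gateway_api_ready() {
  local kubectl="${KUBECTL:-kubectl}" classes

  # The Gateway API CRDs must exist before a Gateway resource can be created.
  if ! "$kubectl" get crd gateways.gateway.networking.k8s.io >/dev/null 2>&1; then
    echo "Gateway API CRDs not found" >&2
    return 1
  fi

  # Assumption: an installed gateway provider registers at least one GatewayClass.
  classes="$("$kubectl" get gatewayclass -o name 2>/dev/null)"
  if [ -z "$classes" ]; then
    echo "no GatewayClass found; is a gateway provider installed?" >&2
    return 1
  fi

  # Print the GatewayClass names so the caller can see what is available.
  echo "$classes"
}
```

Running gateway_api_ready against the target cluster prints the available GatewayClass names on success and returns a non-zero status otherwise, so it can gate the helm install step in a script.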
Chart: oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
helm install <release-name> \
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool \
-f ${REPO_ROOT}/guides/recipes/router/base.values.yaml \
-f ${REPO_ROOT}/guides/recipes/router/features/monitoring.values.yaml \
-f ${REPO_ROOT}/guides/<your-guide>/router/<your-guide>.values.yaml \
--set provider.name=<gke|istio|none> \
-n ${NAMESPACE} \
--version v1.5.0
Values Layering
Both modes share a common base.values.yaml that sets the router image, ports, and common pod selector labels. Feature values (monitoring, tracing) and guide-specific values are layered on top. Helm gives precedence to the rightmost -f file, so each layer overrides the ones before it:
base.values.yaml # shared defaults (this directory)
+ features/monitoring.values.yaml # optional feature toggles
+ <guide>/router/<guide>.values.yaml # guide-specific overrides
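Because the order of the -f flags determines which layer wins, it can be convenient to generate them rather than type them by hand. The sketch below is illustrative only: layered_values_files and values_args are hypothetical helpers, and the assumption that every feature lives at features/<name>.values.yaml is inferred from the monitoring example rather than confirmed for other features.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of llm-d).

# Print the values files in layering order, one path per line:
# base first, then each named feature, then the guide-specific file.
layered_values_files() {
  local repo_root="$1" guide="$2" feature
  shift 2
  printf '%s\n' "${repo_root}/guides/recipes/router/base.values.yaml"
  for feature in "$@"; do
    # Assumption: each feature follows the features/<name>.values.yaml pattern.
    printf '%s\n' "${repo_root}/guides/recipes/router/features/${feature}.values.yaml"
  done
  printf '%s\n' "${repo_root}/guides/${guide}/router/${guide}.values.yaml"
}

# Print "-f" and a path on alternating lines, in layering order.
# Nothing is printed if any file is missing, so a partial list is never used.
values_args() {
  local out="" f
  while IFS= read -r f; do
    if [ ! -f "$f" ]; then
      echo "missing values file: $f" >&2
      return 1
    fi
    out="${out}-f
${f}
"
  done <<EOF
$(layered_values_files "$@")
EOF
  printf '%s' "$out"
}
```

In bash, the output of values_args "${REPO_ROOT}" <your-guide> monitoring can be read into an array with mapfile -t and passed to helm install in place of the three hand-written -f lines. Keeping the guide file last means it has the final say over both the base and any feature toggles.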