Skip to main content

Router Recipes

llm-d uses the llm-d Router to make intelligent request routing decisions for inference requests. There are two deployment modes:

Standalone (Default)

Use this when you do not want to deploy a proxy via Kubernetes Gateway APIs. The standalone chart deploys the llm-d Router with an Envoy sidecar to proxy the traffic directly.

Chart: oci://registry.k8s.io/gateway-api-inference-extension/charts/standalone

helm install <release-name> \
oci://registry.k8s.io/gateway-api-inference-extension/charts/standalone \
-f ${REPO_ROOT}/guides/recipes/router/base.values.yaml \
-f ${REPO_ROOT}/guides/recipes/router/features/monitoring.values.yaml \
-f ${REPO_ROOT}/guides/<your-guide>/router/<your-guide>.values.yaml \
--set provider.name=<gke|istio|none> \
-n ${NAMESPACE} \
--version v1.5.0

With Kubernetes Gateway API

Use this when you want to route traffic through a proxy managed by the Kubernetes Gateway API (e.g., GKE Gateway, Istio, Agentgateway). This requires:

  1. A Gateway control plane installed (see prereq/gateway-provider)
  2. Creating a Gateway resource (see recipes/gateway)
  3. Deploying the inferencepool chart (below)

Chart: oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool

helm install <release-name> \
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool \
-f ${REPO_ROOT}/guides/recipes/router/base.values.yaml \
-f ${REPO_ROOT}/guides/recipes/router/features/monitoring.values.yaml \
-f ${REPO_ROOT}/guides/<your-guide>/router/<your-guide>.values.yaml \
--set provider.name=<gke|istio|none> \
-n ${NAMESPACE} \
--version v1.5.0

Values Layering

Both modes share a common base.values.yaml containing the router image, ports, and common pod selector labels. Feature values (monitoring, tracing) and guide-specific values are layered on top:

base.values.yaml # shared defaults (this directory)
+ features/monitoring.values.yaml # optional feature toggles
+ <guide>/router/<guide>.values.yaml # guide-specific overrides