Agentgateway

This guide shows how to deploy llm-d with agentgateway as your inference gateway. By the end, inference requests will flow from an agentgateway-managed Gateway to your model servers via the llm-d EPP.

Note: This guide assumes familiarity with Gateway API and llm-d.

Prerequisites

The environment variables ${GUIDE_NAME}, ${MODEL_NAME} and ${NAMESPACE} should be set as part of deploying one of the well-lit path guides.
A Kubernetes cluster running one of the three most recent Kubernetes releases
Helm
jq
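
A quick preflight check can catch a missing variable or tool before any cluster changes are made. The sketch below is illustrative only: `missing_vars` and `missing_tools` are hypothetical helper names, not part of llm-d, and the tool list assumes `kubectl` is needed alongside the Helm and jq prerequisites above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; the names are illustrative, not part of llm-d.

# Print the name of every required variable that is unset or empty.
missing_vars() {
  [ -n "${GUIDE_NAME:-}" ] || echo "GUIDE_NAME"
  [ -n "${MODEL_NAME:-}" ] || echo "MODEL_NAME"
  [ -n "${NAMESPACE:-}" ] || echo "NAMESPACE"
  return 0
}

# Print the name of every required CLI tool that is not on PATH.
missing_tools() {
  for tool in kubectl helm jq; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
  return 0
}

missing="$(missing_vars; missing_tools)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found."
fi
```

The script only reports what is missing and does not exit, so it can be sourced into an interactive shell without closing it.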

Step 1: Install Gateway API and Gateway API Inference Extension CRDs

Install the required Gateway API and Gateway API Inference Extension CRDs:

GATEWAY_API_VERSION=v1.5.1
GAIE_VERSION=v1.5.0

kubectl apply -k "https://github.com/kubernetes-sigs/gateway-api/config/crd?ref=${GATEWAY_API_VERSION}"
kubectl apply -k "https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd?ref=${GAIE_VERSION}"

Verify the APIs are available:

kubectl api-resources --api-group=gateway.networking.k8s.io
kubectl api-resources --api-group=inference.networking.k8s.io
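
If later steps run from a script, it may help to block until the CRDs are actually established rather than only listing them. The following is a minimal sketch; `wait_for_crds` is a hypothetical helper, and the three CRD names are assumptions derived from the API groups above and the resource kinds this guide uses.

```shell
# Sketch: wait until the CRDs used in this guide report the Established condition.
# The CRD names are assumptions based on the API groups verified above.
wait_for_crds() {
  for crd in \
    gateways.gateway.networking.k8s.io \
    httproutes.gateway.networking.k8s.io \
    inferencepools.inference.networking.k8s.io; do
    kubectl wait --for=condition=Established "crd/${crd}" --timeout=60s || return 1
  done
}

# Usage:
#   wait_for_crds && echo "CRDs are ready"
```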

Step 2: Install Agentgateway

Install the agentgateway CRDs and control plane with inference extension support enabled:

AGENTGATEWAY_VERSION=v1.1.0

helm upgrade --install agentgateway-crds \
  oci://cr.agentgateway.dev/charts/agentgateway-crds \
  --namespace agentgateway-system \
  --create-namespace \
  --version ${AGENTGATEWAY_VERSION}

helm upgrade --install agentgateway \
  oci://cr.agentgateway.dev/charts/agentgateway \
  --namespace agentgateway-system \
  --create-namespace \
  --version ${AGENTGATEWAY_VERSION} \
  --set inferenceExtension.enabled=true

Verify the installation:

kubectl get pods -n agentgateway-system
kubectl get gatewayclass agentgateway

Expected output:

NAME           CONTROLLER                      ACCEPTED   AGE
agentgateway   agentgateway.dev/agentgateway   True       30s
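
Instead of polling the commands above by hand, the same checks can be expressed as blocking waits. This is a sketch under assumptions: `wait_for_agentgateway` is a hypothetical helper, the Deployment name `agentgateway` is taken from the troubleshooting commands later in this guide, and the timeouts are arbitrary.

```shell
# Sketch: wait for the control plane rollout, then for the GatewayClass
# to report the Accepted condition. Timeouts are illustrative values.
wait_for_agentgateway() {
  kubectl rollout status deployment/agentgateway \
    -n agentgateway-system --timeout=120s &&
  kubectl wait --for=condition=Accepted \
    gatewayclass/agentgateway --timeout=60s
}

# Usage:
#   wait_for_agentgateway || echo "agentgateway is not ready yet"
```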

Step 3: Deploy the Gateway

Agentgateway

This recipe deploys a Gateway that uses the agentgateway GatewayClass. It is the preferred self-installed inference gateway recipe in llm-d.

kubectl apply -k ./guides/recipes/gateway/agentgateway -n ${NAMESPACE}

Agentgateway (OpenShift)

This deploys the preferred OpenShift-oriented agentgateway recipe. The rendered Gateway uses the agentgateway GatewayClass and an OpenShift-oriented AgentgatewayParameters resource.

kubectl apply -k ./guides/recipes/gateway/agentgateway-openshift -n ${NAMESPACE}

Verify the Gateway is programmed:

kubectl get gateway llm-d-inference-gateway -n ${NAMESPACE}

Expected output:

NAME                      CLASS          ADDRESS         PROGRAMMED   AGE
llm-d-inference-gateway   agentgateway   10.xx.xx.xx     True         30s

Wait until PROGRAMMED shows True before proceeding.
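
The wait can also be scripted with `kubectl wait`, which returns once the Gateway reports the `Programmed` condition or the timeout expires. The helper name `wait_for_gateway` and the default timeout of 300 seconds are assumptions made for this sketch.

```shell
# Sketch: block until the Gateway is programmed.
# The optional first argument overrides the (arbitrary) default timeout.
wait_for_gateway() {
  kubectl wait --for=condition=Programmed \
    gateway/llm-d-inference-gateway \
    -n "${NAMESPACE}" --timeout="${1:-300s}"
}

# Usage:
#   wait_for_gateway        # wait up to 300s
#   wait_for_gateway 60s    # wait up to 60s
```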

Step 4: Send a Request

Important: Before sending requests, you must deploy a well-lit path guide. The guide sets up a model server deployment, an InferencePool, and an HTTPRoute that connects the Gateway to the pool.

Get the Gateway external address:

export IP=$(kubectl get gateway llm-d-inference-gateway -n ${NAMESPACE} -o jsonpath='{.status.addresses[0].value}')

Send an inference request via the managed Gateway:

curl -X POST "http://${IP}/v1/completions" \
    -H 'Content-Type: application/json' \
    -H "X-Gateway-Base-Model-Name: ${GUIDE_NAME}" \
    -d '{
        "model": "'"${MODEL_NAME}"'",
        "prompt": "How are you today?"
    }' | jq
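
Splicing a shell variable into a JSON string by hand is easy to get wrong. One alternative, sketched here, is to let jq build the request body so that the model name and prompt are always escaped correctly. `build_payload` is a hypothetical helper name, not part of llm-d.

```shell
# Sketch: build the JSON request body with jq instead of manual quoting.
# $1 is the model name, $2 is the prompt.
build_payload() {
  jq -cn --arg model "$1" --arg prompt "$2" \
    '{model: $model, prompt: $prompt}'
}

# Usage (requires IP, GUIDE_NAME and MODEL_NAME to be set):
#   build_payload "${MODEL_NAME}" "How are you today?" |
#     curl -X POST "http://${IP}/v1/completions" \
#       -H 'Content-Type: application/json' \
#       -H "X-Gateway-Base-Model-Name: ${GUIDE_NAME}" \
#       -d @- | jq
```

Here `-d @-` tells curl to read the request body from standard input.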

Cleanup

kubectl delete gateway llm-d-inference-gateway -n ${NAMESPACE}
helm uninstall agentgateway -n agentgateway-system
helm uninstall agentgateway-crds -n agentgateway-system
kubectl delete namespace agentgateway-system
kubectl delete gatewayclass agentgateway
kubectl delete -k "https://github.com/kubernetes-sigs/gateway-api/config/crd?ref=${GATEWAY_API_VERSION}"
kubectl delete -k "https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd?ref=${GAIE_VERSION}"

Troubleshooting

Gateway not showing `PROGRAMMED=True`

kubectl describe gateway llm-d-inference-gateway -n ${NAMESPACE}
kubectl get pods -n agentgateway-system
kubectl logs -n agentgateway-system deployment/agentgateway --tail=20

Verify the agentgateway GatewayClass is present and accepted:

kubectl get gatewayclass agentgateway

HTTPRoute not accepted

kubectl describe httproute ${GUIDE_NAME} -n ${NAMESPACE}

Verify that parentRefs matches the Gateway name and backendRefs matches the InferencePool name.
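
To compare the references without reading the full `describe` output, the relevant fields can be printed directly. This sketch assumes the HTTPRoute is named `${GUIDE_NAME}`, as in the command above; `show_route_refs` is a hypothetical helper name.

```shell
# Sketch: print the Gateway and backend names the HTTPRoute points at,
# followed by the InferencePools that exist in the namespace.
show_route_refs() {
  echo "parentRefs:"
  kubectl get httproute "${GUIDE_NAME}" -n "${NAMESPACE}" \
    -o jsonpath='{.spec.parentRefs[*].name}{"\n"}'
  echo "backendRefs:"
  kubectl get httproute "${GUIDE_NAME}" -n "${NAMESPACE}" \
    -o jsonpath='{.spec.rules[*].backendRefs[*].name}{"\n"}'
  echo "InferencePools:"
  kubectl get inferencepool -n "${NAMESPACE}" -o name
}

# Usage:
#   show_route_refs
```

The parentRefs line should read `llm-d-inference-gateway`, and each backendRefs name should appear in the InferencePools list.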

No response from Gateway IP

kubectl get gateway llm-d-inference-gateway -n ${NAMESPACE} -o jsonpath='{.status.addresses[0].value}'

If the address is empty, your Gateway may still be waiting for a LoadBalancer service. Check that your cluster supports external load balancers.
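
One way to check load balancer support is to list every Service of type LoadBalancer together with the address it was assigned. The sketch below searches all namespaces because this guide does not state where the Gateway's Service is created; `list_loadbalancers` is a hypothetical helper name.

```shell
# Sketch: list LoadBalancer Services as "namespace/name<TAB>address".
# An empty address column means no IP or hostname has been provisioned yet.
list_loadbalancers() {
  kubectl get svc -A -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}/{.metadata.name}{"\t"}{.status.loadBalancer.ingress[*].ip}{.status.loadBalancer.ingress[*].hostname}{"\n"}{end}'
}

# Usage:
#   list_loadbalancers
```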
