# llm-d on minikube

## Prerequisites

### Platform Setup
This can be run on an EC2 instance as small as a g6e.12xlarge (4x NVIDIA L40S 48GB GPUs, of which only 2 are used by default) to serve the meta-llama/Llama-3.2-3B-Instruct model that will be spun up.
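Before continuing, you can confirm that the host exposes the expected GPUs. This is a minimal sketch using standard `nvidia-smi` query flags; the output noted in the comment assumes the g6e.12xlarge instance mentioned above.

```bash
# List the GPUs visible on the host along with their total memory.
# On a g6e.12xlarge you should see four L40S entries, each with ~46 GB.
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
```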
Verify that you have properly installed the NVIDIA Container Toolkit and configured it for the container runtime of your choice:
```bash
# Podman
podman run --rm --security-opt=label=disable --device=nvidia.com/gpu=all ubuntu nvidia-smi

# Docker
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```
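With the toolkit verified, the minikube cluster itself can be created. The following is a sketch of a `minikube start` invocation with GPU passthrough; the `--gpus all` flag requires a recent minikube (v1.32+) with the docker driver, and the CPU/memory sizes are assumptions you should adjust to your instance.

```bash
# Start a single-node minikube cluster with access to all host GPUs.
# Driver, runtime, CPU, and memory values below are illustrative assumptions.
minikube start \
  --driver docker \
  --container-runtime docker \
  --gpus all \
  --cpus 16 \
  --memory 64g
```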
## llm-d-infra Installation
TBD