InferenceObjective

InferenceObjective represents the desired state of a specific model use case. It allows "Inference Workload Owners" to define performance and latency goals for a model within an InferencePool.

Group: inference.networking.x-k8s.io
Version: v1alpha2


InferenceObjective

  • apiVersion: inference.networking.x-k8s.io/v1alpha2
  • kind: InferenceObjective
  • metadata (metav1.ObjectMeta)
  • spec (InferenceObjectiveSpec): Spec represents the desired state of the model use case.
  • status (InferenceObjectiveStatus): Status defines the observed state of the InferenceObjective.

InferenceObjectiveSpec

InferenceObjectiveSpec defines the priority and the pool reference for the model workload.

  • priority (int, optional): Defines how important it is to serve the request compared to others in the same pool. Higher values have higher priority; an unset value is treated as 0. When resources are scarce, requests of higher priority are served first.
  • poolRef (PoolObjectReference, required): Reference to the inference pool. The pool must exist in the same namespace.
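A minimal sketch of a manifest using these fields; the names sample-objective and sample-pool are hypothetical placeholders, not values from this reference:

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: sample-objective   # hypothetical name
  namespace: default
spec:
  priority: 10             # optional; unset is treated as 0
  poolRef:
    kind: InferencePool
    name: sample-pool      # must exist in the same namespace
```

Because poolRef cannot cross namespaces, the referenced InferencePool must be created in the same namespace as the InferenceObjective.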

InferenceObjectiveStatus

InferenceObjectiveStatus defines the observed state of InferenceObjective.

  • conditions ([]metav1.Condition): Conditions track the state of the InferenceObjective. Known condition type: Accepted.

PoolObjectReference

PoolObjectReference identifies an API object within the same namespace.

  • group (string): Group of the referent. Defaults to inference.networking.k8s.io.
  • kind (string): Kind of the referent. Defaults to InferencePool.
  • name (string, required): Name of the referent.

Condition Types and Reasons

Accepted

Indicates whether the objective configuration has been accepted.

  • True Reasons:
    • Accepted: Model conforms to the state of the pool.
  • Unknown Reasons:
    • Pending: Initial state, controller has not yet reconciled the resource.
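The two states above can be illustrated with status fragments; this is a hedged sketch of the shape a controller would write, following the standard metav1.Condition fields, not output captured from a real cluster:

```yaml
# After successful reconciliation: the objective conforms to the pool.
status:
  conditions:
  - type: Accepted
    status: "True"
    reason: Accepted

# Before the controller has reconciled the resource.
status:
  conditions:
  - type: Accepted
    status: "Unknown"
    reason: Pending
```

Consumers should key off type and status rather than reason, since reasons are informational and may be extended over time.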