Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
1 parent 56533ad, commit a9df4df
Showing 1 changed file with 175 additions and 59 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,90 +1,206 @@ | ||
# Kubernetes Deployment Installation Guide | ||
KServe supports `RawDeployment` mode to enable `InferenceService` deployment with Kubernetes resources [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment), [`Service`](https://kubernetes.io/docs/concepts/services-networking/service), [`Ingress`](https://kubernetes.io/docs/concepts/services-networking/ingress) and [`Horizontal Pod Autoscaler`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale). Comparing to serverless deployment it unlocks Knative limitations such as mounting multiple volumes, on the other hand `Scale down and from Zero` is not supported in `RawDeployment` mode. | ||
KServe supports `RawDeployment` mode, which deploys an `InferenceService` using the Kubernetes resources [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment), [`Service`](https://kubernetes.io/docs/concepts/services-networking/service), [`Ingress`](https://kubernetes.io/docs/concepts/services-networking/ingress) / [`Gateway API`](https://kubernetes.io/docs/concepts/services-networking/gateway/) and [`Horizontal Pod Autoscaler`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale). Compared to serverless deployment, it removes Knative limitations such as the inability to mount multiple volumes. On the other hand, `Scale down and from Zero` is not supported in `RawDeployment` mode. | ||
|
||
Kubernetes 1.28 is the minimum required version. Check the recommended Istio versions below for the corresponding | ||
Kubernetes version. | ||
|
||
!!! warning | ||
Kubernetes `Ingress` support is deprecated and will be removed in the future. `Gateway API` is the recommended option for KServe. | ||
Follow the [Gateway API migration guide](gatewayapi_migration.md) to migrate from Kubernetes Ingress to Gateway API. | ||
|
||
## Recommended Version Matrix | ||
| Kubernetes Version | Recommended Istio Version | | ||
| :----------------- | :------------------------ | | ||
| 1.28 | 1.22 | | ||
| 1.29 | 1.22, 1.23 | | ||
| 1.30 | 1.22, 1.23 | | ||
|
||
## 1. Install Ingress Controller | ||
|
||
In this guide we choose to install Istio as ingress controller. The minimally required Istio version is 1.22 and you can refer to the [Istio install guide](https://istio.io/latest/docs/setup/install). | ||
|
||
Once Istio is installed, create `IngressClass` resource for istio. | ||
```yaml | ||
apiVersion: networking.k8s.io/v1 | ||
kind: IngressClass | ||
metadata: | ||
name: istio | ||
spec: | ||
controller: istio.io/ingress-controller | ||
``` | ||
!!! note | ||
Istio ingress is recommended, but you can choose to install with other [Ingress controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) and create `IngressClass` resource for your Ingress option. | ||
|
||
|
||
## 2. Install Cert Manager | ||
## 1. Install Cert Manager | ||
The minimum required Cert Manager version is 1.15.0; refer to the [Cert Manager installation guide](https://cert-manager.io/docs/installation/). | ||
|
||
!!! note | ||
Cert Manager is required to provision webhook certs for a production-grade installation. Alternatively, you can run the self-signed certs generation script. | ||
|
||
## 3. Install KServe | ||
!!! note | ||
The default KServe deployment mode is `Serverless` which depends on Knative. The following step changes the default deployment mode to `RawDeployment` before installing KServe. | ||
## 2. Install Network Controller | ||
|
||
=== "Install using Helm" | ||
=== "Gateway API" | ||
|
||
I. Install KServe CRDs | ||
Install the Gateway API CRDs, as they are not part of the Kubernetes installation. KServe implements Gateway API version 1.2.1. | ||
|
||
```shell | ||
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }} | ||
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/standard-install.yaml | ||
``` | ||
|
||
II. Install KServe Resources | ||
Then, create a `GatewayClass` resource using your preferred network controller. For this example, we will use [Envoy Gateway](https://gateway.envoyproxy.io/docs/) as the network controller. | ||
|
||
Set the `kserve.controller.deploymentMode` to `RawDeployment` and `kserve.controller.gateway.ingressGateway.className` to point to the `IngressClass` | ||
name created in [step 1](#1-install-ingress-controller). | ||
|
||
```shell | ||
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \ | ||
--set kserve.controller.deploymentMode=RawDeployment \ | ||
--set kserve.controller.gateway.ingressGateway.className=your-ingress-class | ||
```yaml | ||
apiVersion: gateway.networking.k8s.io/v1 | ||
kind: GatewayClass | ||
metadata: | ||
name: envoy | ||
spec: | ||
controllerName: gateway.envoyproxy.io/gatewayclass-controller | ||
``` | ||
Create a `Gateway` resource to expose the `InferenceService`. This example uses the `envoy` `GatewayClass` created above. If you already have a `Gateway` resource, you can skip this step and configure KServe to use the existing `Gateway`. | ||
|
||
=== "Install using YAML" | ||
|
||
I. Install KServe: | ||
`--server-side` option is required as the InferenceService CRD is large, see [this issue](https://github.com/kserve/kserve/issues/3487) for details. | ||
|
||
```bash | ||
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml | ||
```yaml | ||
apiVersion: gateway.networking.k8s.io/v1 | ||
kind: Gateway | ||
metadata: | ||
name: kserve-ingress-gateway | ||
namespace: kserve | ||
spec: | ||
gatewayClassName: envoy | ||
listeners: | ||
- name: http | ||
protocol: HTTP | ||
port: 80 | ||
allowedRoutes: | ||
namespaces: | ||
from: All | ||
- name: https | ||
protocol: HTTPS | ||
port: 443 | ||
tls: | ||
mode: Terminate | ||
certificateRefs: | ||
- kind: Secret | ||
name: my-secret | ||
namespace: kserve | ||
allowedRoutes: | ||
namespaces: | ||
from: All | ||
infrastructure: | ||
labels: | ||
serving.kserve.io/gateway: kserve-ingress-gateway | ||
``` | ||
!!! note | ||
KServe comes with a default `Gateway` named kserve-ingress-gateway. You can enable the default gateway by setting Helm value `kserve.controller.gateway.ingressGateway.createGateway` to `true`. | ||
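The HTTPS listener in the `Gateway` above references a TLS `Secret` named `my-secret` in the `kserve` namespace. As a minimal sketch, assuming the Cert Manager installed in step 1 is used to issue that certificate, a `Certificate` resource along the following lines could populate the secret. The resource name `kserve-gateway-cert`, the issuer `my-cluster-issuer`, and the host `example.com` are hypothetical placeholders, not values defined by KServe.

```yaml
# Sketch only: issues a certificate into the Secret referenced by the Gateway's HTTPS listener.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kserve-gateway-cert      # hypothetical name
  namespace: kserve
spec:
  secretName: my-secret          # must match certificateRefs[].name in the Gateway
  dnsNames:
    - example.com                # hypothetical host; replace with your inference domain
  issuerRef:
    kind: ClusterIssuer
    name: my-cluster-issuer      # hypothetical issuer; must already exist in the cluster
```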
|
||
II. Install KServe default serving runtimes: | ||
=== "Kubernetes Ingress" | ||
|
||
This guide installs Istio as the ingress controller. The minimum required Istio version is 1.22; refer to the [Istio install guide](https://istio.io/latest/docs/setup/install). | ||
|
||
Once Istio is installed, create an `IngressClass` resource for Istio. | ||
|
||
```bash | ||
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve-cluster-resources.yaml | ||
```yaml | ||
apiVersion: networking.k8s.io/v1 | ||
kind: IngressClass | ||
metadata: | ||
name: istio | ||
spec: | ||
controller: istio.io/ingress-controller | ||
``` | ||
|
||
!!! note | ||
Istio ingress is recommended, but you can choose to install with other [Ingress controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) and create `IngressClass` resource for your Ingress option. | ||
|
||
III. Change default deployment mode and ingress option | ||
|
||
First in the ConfigMap `inferenceservice-config` modify the `defaultDeploymentMode` from the `deploy` section to `RawDeployment`, | ||
|
||
## 3. Install KServe | ||
!!! note | ||
The default KServe deployment mode is `Serverless` which depends on Knative. The following step changes the default deployment mode to `RawDeployment` before installing KServe. | ||
|
||
```bash | ||
kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}' | ||
``` | ||
=== "Gateway API" | ||
|
||
then modify the `ingressClassName` from `ingress` section to the `IngressClass` name created in [step 1](#1-install-ingress-controller). | ||
```yaml | ||
ingress: |- | ||
{ | ||
"ingressClassName" : "your-ingress-class", | ||
} | ||
``` | ||
=== "Install using Helm" | ||
|
||
I. Install KServe CRDs | ||
|
||
```shell | ||
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }} | ||
``` | ||
II. Install KServe Resources | ||
|
||
Set `kserve.controller.deploymentMode` to `RawDeployment`, `kserve.controller.gateway.ingressGateway.enableGatewayApi` to `true`, and `kserve.controller.gateway.ingressGateway.kserveGateway` to point to the `Gateway` | ||
created in [step 2](#2-install-network-controller). | ||
|
||
```shell | ||
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \ | ||
--set kserve.controller.deploymentMode=RawDeployment \ | ||
--set kserve.controller.gateway.ingressGateway.enableGatewayApi=true \ | ||
--set kserve.controller.gateway.ingressGateway.kserveGateway=<gateway-namespace>/<gateway-name> | ||
``` | ||
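The same settings can be kept in a Helm values file instead of repeated `--set` flags. The following is a minimal sketch, assuming the `Gateway` from step 2 (`kserve-ingress-gateway` in the `kserve` namespace). The file name `raw-deployment-values.yaml` is a hypothetical choice, and the keys mirror the `--set` paths shown above.

```yaml
# raw-deployment-values.yaml (hypothetical file name)
kserve:
  controller:
    # Switch the default deployment mode from Serverless to RawDeployment.
    deploymentMode: RawDeployment
    gateway:
      ingressGateway:
        # Route traffic through Gateway API instead of Kubernetes Ingress.
        enableGatewayApi: true
        # Format: <gateway-namespace>/<gateway-name>
        kserveGateway: kserve/kserve-ingress-gateway
```

Passing the file with Helm's `-f raw-deployment-values.yaml` option in place of the three `--set` flags should yield the same configuration.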
|
||
=== "Install using YAML" | ||
|
||
I. Install KServe: | ||
`--server-side` option is required as the InferenceService CRD is large, see [this issue](https://github.com/kserve/kserve/issues/3487) for details. | ||
|
||
```bash | ||
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml | ||
``` | ||
|
||
II. Install KServe default serving runtimes: | ||
|
||
```bash | ||
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve-cluster-resources.yaml | ||
``` | ||
|
||
III. Change default deployment mode and ingress option | ||
|
||
First in the ConfigMap `inferenceservice-config` modify the `defaultDeploymentMode` from the `deploy` section to `RawDeployment`. | ||
|
||
```bash | ||
kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}' | ||
``` | ||
|
||
Then, in the `ingress` section, set `enableGatewayApi` to `true` and set `kserveIngressGateway` to point to the Gateway created in [step 2](#2-install-network-controller). | ||
```yaml | ||
ingress: |- | ||
{ | ||
"enableGatewayApi": true, | ||
"kserveIngressGateway": "<gateway-namespace>/<gateway-name>" | ||
} | ||
``` | ||
|
||
=== "Kubernetes Ingress" | ||
|
||
=== "Install using Helm" | ||
|
||
I. Install KServe CRDs | ||
|
||
```shell | ||
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }} | ||
``` | ||
II. Install KServe Resources | ||
|
||
Set the `kserve.controller.deploymentMode` to `RawDeployment` and `kserve.controller.gateway.ingressGateway.className` to point to the `IngressClass` | ||
name created in [step 2](#2-install-network-controller). | ||
|
||
```shell | ||
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \ | ||
--set kserve.controller.deploymentMode=RawDeployment \ | ||
--set kserve.controller.gateway.ingressGateway.className=your-ingress-class | ||
``` | ||
|
||
=== "Install using YAML" | ||
|
||
I. Install KServe: | ||
`--server-side` option is required as the InferenceService CRD is large, see [this issue](https://github.com/kserve/kserve/issues/3487) for details. | ||
```bash | ||
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml | ||
``` | ||
II. Install KServe default serving runtimes: | ||
```bash | ||
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve-cluster-resources.yaml | ||
``` | ||
III. Change default deployment mode and ingress option | ||
First, in the ConfigMap `inferenceservice-config`, change `defaultDeploymentMode` in the `deploy` section to `RawDeployment`. | ||
```bash | ||
kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}' | ||
``` | ||
Then, in the `ingress` section, set `ingressClassName` to the `IngressClass` name created in [step 2](#2-install-network-controller). | ||
```yaml | ||
ingress: |- | ||
{ | ||
"ingressClassName" : "your-ingress-class" | ||
} | ||
``` |
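
The steps above change the cluster-wide default. As a hedged illustration, an individual `InferenceService` can also request this mode through the `serving.kserve.io/deploymentMode` annotation, which is useful when the default stays `Serverless`. The service name and `storageUri` below are illustrative sample values, not part of this change.

```yaml
# Sketch only: per-service override of the deployment mode.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                 # illustrative name
  annotations:
    # Deploy this service with plain Kubernetes resources instead of Knative.
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model   # illustrative sample model
```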