Skip to content

Commit

Permalink
Add gateway api installation guide
Browse files Browse the repository at this point in the history
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
  • Loading branch information
sivanantha321 committed Jan 20, 2025
1 parent f9ca32e commit c084ff5
Showing 1 changed file with 143 additions and 59 deletions.
202 changes: 143 additions & 59 deletions docs/admin/kubernetes_deployment.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# Kubernetes Deployment Installation Guide
KServe supports `RawDeployment` mode to enable `InferenceService` deployment with Kubernetes resources [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment), [`Service`](https://kubernetes.io/docs/concepts/services-networking/service), [`Ingress`](https://kubernetes.io/docs/concepts/services-networking/ingress) and [`Horizontal Pod Autoscaler`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale). Comparing to serverless deployment it unlocks Knative limitations such as mounting multiple volumes, on the other hand `Scale down and from Zero` is not supported in `RawDeployment` mode.
KServe supports `RawDeployment` mode to enable `InferenceService` deployment with Kubernetes resources [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment), [`Service`](https://kubernetes.io/docs/concepts/services-networking/service), [`Ingress`](https://kubernetes.io/docs/concepts/services-networking/ingress) / [`Gateway API`](https://kubernetes.io/docs/concepts/services-networking/gateway/) and [`Horizontal Pod Autoscaler`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale). Comparing to serverless deployment it unlocks Knative limitations such as mounting multiple volumes, on the other hand `Scale down and from Zero` is not supported in `RawDeployment` mode.

Kubernetes 1.28 is the minimally required version and please check the following recommended Istio versions for the corresponding
Kubernetes version.
Expand All @@ -11,80 +11,164 @@ Kubernetes version.
| 1.29 | 1.22, 1.23 |
| 1.30 | 1.22, 1.23 |

## 1. Install Ingress Controller

In this guide we choose to install Istio as ingress controller. The minimally required Istio version is 1.22 and you can refer to the [Istio install guide](https://istio.io/latest/docs/setup/install).

Once Istio is installed, create `IngressClass` resource for istio.
```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
name: istio
spec:
controller: istio.io/ingress-controller
```
!!! note
Istio ingress is recommended, but you can choose to install with other [Ingress controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) and create `IngressClass` resource for your Ingress option.


## 2. Install Cert Manager
## 1. Install Cert Manager
The minimally required Cert Manager version is 1.15.0 and you can refer to [Cert Manager installation guide](https://cert-manager.io/docs/installation/).

!!! note
Cert manager is required to provision webhook certs for production grade installation, alternatively you can run self signed certs generation script.

## 3. Install KServe
!!! note
The default KServe deployment mode is `Serverless` which depends on Knative. The following step changes the default deployment mode to `RawDeployment` before installing KServe.
## 2. Install Network Controller

=== "Install using Helm"
=== "Gateway API"

I. Install KServe CRDs
Install the Gateway API CRD as it is not part of the Kubernetes installation. KServe implements Gateway API version 1.2.1.

```shell
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }}
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/standard-install.yaml
```

II. Install KServe Resources
Then, create a `GatewayClass` resource using your preferred network controller. For this example, we will use [Envoy Gateway](https://gateway.envoyproxy.io/docs/) as the network controller.

Set the `kserve.controller.deploymentMode` to `RawDeployment` and `kserve.controller.gateway.ingressGateway.className` to point to the `IngressClass`
name created in [step 1](#1-install-ingress-controller).

```shell
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \
--set kserve.controller.deploymentMode=RawDeployment \
--set kserve.controller.gateway.ingressGateway.className=your-ingress-class
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
name: envoy
spec:
controllerName: gateway.envoyproxy.io/gatewayclass-controller
```
Create a `Gateway` resource to expose the `InferenceService`. In this example, you will use the `envoy` `GatewayClass` that we created above. If you already have a `Gateway` resource, you can skip this step and can configure KServe to use the existing `Gateway`.

=== "Install using YAML"

I. Install KServe:
`--server-side` option is required as the InferenceService CRD is large, see [this issue](https://github.com/kserve/kserve/issues/3487) for details.

```bash
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: kserve-ingress-gateway
namespace: kserve
spec:
gatewayClassName: envoy
listeners:
- name: http
protocol: HTTP
port: 80
allowedRoutes:
namespaces:
from: All
- name: https
protocol: HTTPS
port: 443
tls:
mode: Terminate
certificateRefs:
- kind: Secret
name: my-secret
namespace: kserve
allowedRoutes:
namespaces:
from: All
infrastructure:
labels:
serving.kserve.io/gateway: kserve-ingress-gateway
```
!!! note
KServe comes with a default `Gateway` named kserve-ingress-gateway. You can enable the default gateway by setting Helm value `kserve.controller.gateway.ingressGateway.createGateway` to `true`.

II. Install KServe default serving runtimes:
!!! note
Follow the [migration guide](gatewayapi_migration.md) to migrate from Kubernetes Ingress to Gateway API.

```bash
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve-cluster-resources.yaml
```
=== "Kubernetes Ingress"

In this guide we choose to install Istio as ingress controller. The minimally required Istio version is 1.22 and you can refer to the [Istio install guide](https://istio.io/latest/docs/setup/install).

Once Istio is installed, create `IngressClass` resource for istio.

III. Change default deployment mode and ingress option
```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
name: istio
spec:
controller: istio.io/ingress-controller
```

!!! note
Istio ingress is recommended, but you can choose to install with other [Ingress controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) and create `IngressClass` resource for your Ingress option.

First in the ConfigMap `inferenceservice-config` modify the `defaultDeploymentMode` from the `deploy` section to `RawDeployment`,

## 3. Install KServe
!!! note
The default KServe deployment mode is `Serverless` which depends on Knative. The following step changes the default deployment mode to `RawDeployment` before installing KServe.

```bash
kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}'
```
=== "Gateway API"

then modify the `ingressClassName` from `ingress` section to the `IngressClass` name created in [step 1](#1-install-ingress-controller).
```yaml
ingress: |-
{
"ingressClassName" : "your-ingress-class",
}
```
=== "Install using Helm"

I. Install KServe CRDs

```shell
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }}
```
II. Install KServe Resources

Set the `kserve.controller.deploymentMode` to `RawDeployment` and `kserve.controller.gateway.ingressGateway.kserveGateway` to point to the `Gateway`
created in [step 2](#2-install-network-controller).

```shell
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \
--set kserve.controller.deploymentMode=RawDeployment \
--set kserve.controller.gateway.ingressGateway.enableGatewayApi=true
--set kserve.controller.gateway.ingressGateway.kserveGateway=kserve/kserve-ingress-gateway
```

=== "Install using YAML"

I. Install KServe:
`--server-side` option is required as the InferenceService CRD is large, see [this issue](https://github.com/kserve/kserve/issues/3487) for details.

```bash
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml
```

II. Install KServe default serving runtimes:

```bash
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve-cluster-resources.yaml
```

III. Change default deployment mode and ingress option

First in the ConfigMap `inferenceservice-config` modify the `defaultDeploymentMode` from the `deploy` section to `RawDeployment`.

```bash
kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}'
```

Then from `ingress` section, modify the `enableGatewayApi` to the `true` and modify the `kserveIngressGateway` to point to the Gateway created in [step 2](#2-install-network-controller).
```yaml
ingress: |-
{
"enableGatewayApi": true,
"kserveIngressGateway": "kserve/kserve-ingress-gateway",
}
```

=== "Kubernetes Ingress"

=== "Install using Helm"

I. Install KServe CRDs

```shell
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }}
```
II. Install KServe Resources

Set the `kserve.controller.deploymentMode` to `RawDeployment` and `kserve.controller.gateway.ingressGateway.className` to point to the `IngressClass`
name created in [step 1](#1-install-ingress-controller).

```shell
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \
--set kserve.controller.deploymentMode=RawDeployment \
--set kserve.controller.gateway.ingressGateway.className=your-ingress-class
```

0 comments on commit c084ff5

Please sign in to comment.