Commit

Merge pull request #1032 from openziti/ha-docs-overview
Add controller clustering reference documentation
plorenz authored Feb 10, 2025
2 parents 0adfab6 + fa07731 commit 32032c0
Showing 18 changed files with 901 additions and 18 deletions.
2 changes: 1 addition & 1 deletion docusaurus/docs/reference/30-configuration/_category_.yml
@@ -1,5 +1,5 @@
label: Configuration
position: 40
position: 15
link:
  type: doc
  id: reference/configuration/conventions
22 changes: 9 additions & 13 deletions docusaurus/docs/reference/30-configuration/controller.md
@@ -15,6 +15,7 @@ The controller configuration file has several top level configuration sections t
related configuration settings.

- [`ctrl`](#ctrl) - define control channel listener
- [`cluster`](#cluster) - allows configuring the controller in a controller cluster
- [`db`](#db) - specifies database file location
- [`edge`](#edge) - configures edge specific functionality
- [`events`](#events) - allows configuration of event output
@@ -23,7 +24,6 @@ related configuration settings.
listening, and CA bundles
- [`network`](#network) - set network level cost values
- [`profile`](#profile) - enables profiling of controller memory and CPU statistics
- [`raft`](#raft) - allows configuring the controller in an HA cluster
- [`trace`](#trace) - adds a peek handler to all controller messaging for debug purposes
- [`web`](#web) - configures API presentation exposure
- [`v`](#v) - A special section to note the version of the configuration file, only `v: 3` is
@@ -32,13 +32,13 @@ related configuration settings.
The standard OpenZiti experience minimally requires the following sections:

- `ctrl`
- `db` or `raft`
- `db` or `cluster`
- `identity`
- `edge`
- `web`
- `v`

Of those values, to start the controller only the `ctrl`, `db` or `raft`, `v`, and `identity`
Of those values, to start the controller only the `ctrl`, `db` or `cluster`, `v`, and `identity`
sections are required. However, not including the `edge` section will start the controller in "
fabric only" mode and will not support any edge functionality or concepts (identities, JWT
enrollment, 3rd Party CAs, policies, etc.). Not including the `web` section will result in none of
@@ -91,7 +91,7 @@ This includes the protocol(s) used for router connections and how those connecti
See [addressing](./conventions.md#addressing).
- `options` - a set of options, which includes the options below and those defined
in [channel options](./conventions.md#channel)
- `advertiseAddress` - (required when raft is enabled) - configures the address at which this
- `advertiseAddress` - (required when controller clustering is enabled) - configures the address at which this
controller should be reachable by other controllers in the cluster
- `newListener` - (optional) a `<protocol>:<interface>:<port>` address that is sent to routers
to indicate a controller address migration. Should only be specified when the new listener
@@ -473,12 +473,10 @@ profile:
intervalMs: 150000
```

### `raft`
### `cluster`

The raft section enables running multiple controllers in a cluster.
The cluster section enables running multiple controllers in a cluster.

- `bootstrapMembers` - (optional) Only used when bootstrapping the cluster. List of initial clusters
members. Should only be set on one of the controllers in the cluster.
- `commandHandler` - (optional)
- `maxQueueSize` - (optional, 1000) max size of the queue for processing incoming raft log
entries
@@ -510,13 +508,11 @@ The raft section enables running multiple controllers in a cluster.
be used to bring other nodes up to date that are only slightly behind, without having to send the
full snapshot. This is a cluster wide value and should be consistent across nodes in the cluster.
Otherwise the value from the most recently started controller will win.
- `warnWhenLeaderlessFor` - (optional, 1m) - Emits a warning log message if a controller is part of
a cluster with no leader for a duration which exceeds this threshold.

```text
raft:
bootstrapMembers:
- tls:127.0.0.1:6262
- tls:127.0.0.1:6363
- tls:127.0.0.1:6464
cluster:
commandHandler:
maxQueueSize: 1000
commitTimeout: 50ms
6 changes: 5 additions & 1 deletion docusaurus/docs/reference/30-configuration/router.md
@@ -122,6 +122,8 @@ The `ctrl` section configures how the router will connect to the controller.
See [heartbeats](./conventions.md#heartbeats).
- `options` - a set of options, which includes the options below and those defined
in [channel options](conventions.md#channel)
- `endpointsFile` - (optional, 'config file dir'/endpoints) - File location to save the current
known set of controller endpoints, when an endpoints update has been received from a controller.

Example:

@@ -164,6 +166,9 @@ Each dialer currently supports a number of [shared options](conventions.md#xgres
The `edge` section contains configuration that pertains to edge functionality. This section must be
present to enable edge functionality (e.g. listening for edge SDK connections, tunnel binding modes).

- `db` - (optional, `<path-to-config-file>.proto.gzip`) - Configures where the router data model will be snapshotted to
- `dbSaveIntervalSeconds` - (optional, 30s) - Configures how often the router data model will be snapshotted

Example:

```text
@@ -210,7 +215,6 @@ routers at least one valid SAN must be provided.
- `uri` - (optional) - an array of URI SAN entries
- `email` - (optional) - an array of email SAN entries


### `forwarder`

The `forwarder` section controls options that affect how a router forwards payloads across links to
2 changes: 1 addition & 1 deletion docusaurus/docs/reference/_category_.yml
@@ -1,2 +1,2 @@
label: Reference
position: 40
position: 10
2 changes: 1 addition & 1 deletion docusaurus/docs/reference/config-types/index.md
@@ -1,6 +1,6 @@
---
title: Builtin Config Types
sidebar_position: 10
sidebar_position: 20
---

## Overview
2 changes: 2 additions & 0 deletions docusaurus/docs/reference/ha/_category_.yml
@@ -0,0 +1,2 @@
label: Controller Clustering
position: 22
6 changes: 6 additions & 0 deletions docusaurus/docs/reference/ha/bootstrapping/_category_.yml
@@ -0,0 +1,6 @@
label: Bootstrapping
position: 10
link:
  type: doc
  id: reference/ha/bootstrapping/overview

95 changes: 95 additions & 0 deletions docusaurus/docs/reference/ha/bootstrapping/certificates.md
@@ -0,0 +1,95 @@
---
sidebar_label: Certificates
sidebar_position: 20
---

# Controller Certificates

For controllers to communicate and trust one another, they need certificates that have
been generated with the correct attributes and relationships.

## Glossary

### SPIFFE ID

A SPIFFE ID is a specially formatted URI which is intended to be embedded in a certificate. Applications
use these identifiers to determine the following about the peers connecting to them:

1. What organization the peer belongs to
1. What type of application the peer is
1. The application's unique identifier

Controller certificates use SPIFFE IDs to allow the controllers to identify each other during mTLS negotiation.

See [SPIFFE IDs](https://spiffe.io/docs/latest/spiffe-about/spiffe-concepts/#spiffe-id) for more information.

### Trust domain

A [trust domain](https://spiffe.io/docs/latest/spiffe-about/spiffe-concepts/#trust-domain)
is the part of a SPIFFE ID that indicates the organization that an identity belongs to.

## Requirements

1. The certificates must have a shared root of trust.
1. The controller client and server certificates must contain a SPIFFE ID.
1. The SPIFFE ID must be set as the only URI in the `X509v3 Subject Alternative Name` field in the
certificate.
1. The SPIFFE ID must have the following format: `spiffe://<trust domain>/controller/<controller id>`

So if the trust domain is `example.com` and the controller id is `ctrl1`, then the SPIFFE ID
would be:

```
spiffe://example.com/controller/ctrl1
```
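
As a rough illustration of the format and of the single-URI requirement above, the following shell sketch builds a controller SPIFFE ID and checks Subject Alternative Name text for exactly one URI entry. The function names (`spiffe_id`, `is_controller_spiffe_id`, `san_has_single_spiffe_uri`) are hypothetical helpers, not part of the OpenZiti CLI, and the SAN check assumes text in the comma-separated form that `openssl x509 -noout -ext subjectAltName` prints.

```shell
# Hypothetical helpers; none of these are provided by the ziti CLI.

# Build spiffe://<trust domain>/controller/<controller id>.
spiffe_id() {
  trust_domain="$1"
  controller_id="$2"
  printf 'spiffe://%s/controller/%s\n' "$trust_domain" "$controller_id"
}

# Succeed only if the argument has the controller SPIFFE ID shape.
is_controller_spiffe_id() {
  printf '%s\n' "$1" | grep -Eq '^spiffe://[^/]+/controller/[^/]+$'
}

# Read SAN text on stdin; succeed only if it contains exactly one URI
# entry and that entry is a controller SPIFFE ID.
san_has_single_spiffe_uri() {
  uris=$(grep -o 'URI:[^,[:space:]]*' | sed 's/^URI://')
  [ -n "$uris" ] || return 1
  [ "$(printf '%s\n' "$uris" | grep -c '^')" -eq 1 ] || return 1
  is_controller_spiffe_id "$uris"
}

# Possible usage (the certificate path is a placeholder):
#   openssl x509 -in <path/to/cert> -noout -ext subjectAltName | san_has_single_spiffe_uri
```

The check is purely textual, so it only complements an actual mTLS test between controllers.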

## Steps to Certificate Creation
There are many ways to set up certificates, so this section covers just one example configuration.

The primary thing to ensure is that controllers have a shared root of trust.
One way of generating certs would be as follows:

1. Create a root CA
1. Create an intermediate CA for each controller
1. Issue a server cert using the intermediate CA for each controller
1. Issue a client cert using the intermediate CA for each controller

## Example

* The OpenZiti CLI supports creating SPIFFE IDs in certificates
* Use the `--trust-domain` flag when creating CAs
* Use the `--spiffe-id` flag when creating server or client certificates

Using the OpenZiti PKI tool, certificates for a three node cluster could be created as follows:

```bash
# Create the trust root, a self-signed CA
ziti pki create ca --trust-domain ha.test --pki-root ./pki --ca-file ca --ca-name 'HA Example Trust Root'

# Create the controller 1 intermediate/signing cert
ziti pki create intermediate --pki-root ./pki --ca-name ca --intermediate-file ctrl1 --intermediate-name 'Controller One Signing Cert'

# Create the controller 1 server cert
ziti pki create server --pki-root ./pki --ca-name ctrl1 --dns "localhost,ctrl1.ziti.example.com" --ip "127.0.0.1,::1" --server-name ctrl1 --spiffe-id 'controller/ctrl1'

# Create the controller 1 client cert
ziti pki create client --pki-root ./pki --ca-name ctrl1 --client-name ctrl1 --spiffe-id 'controller/ctrl1'

# Create the controller 2 intermediate/signing cert
ziti pki create intermediate --pki-root ./pki --ca-name ca --intermediate-file ctrl2 --intermediate-name 'Controller Two Signing Cert'

# Create the controller 2 server cert
ziti pki create server --pki-root ./pki --ca-name ctrl2 --dns "localhost,ctrl2.ziti.example.com" --ip "127.0.0.1,::1" --server-name ctrl2 --spiffe-id 'controller/ctrl2'

# Create the controller 2 client cert
ziti pki create client --pki-root ./pki --ca-name ctrl2 --client-name ctrl2 --spiffe-id 'controller/ctrl2'

# Create the controller 3 intermediate/signing cert
ziti pki create intermediate --pki-root ./pki --ca-name ca --intermediate-file ctrl3 --intermediate-name 'Controller Three Signing Cert'

# Create the controller 3 server cert
ziti pki create server --pki-root ./pki --ca-name ctrl3 --dns "localhost,ctrl3.ziti.example.com" --ip "127.0.0.1,::1" --server-name ctrl3 --spiffe-id 'controller/ctrl3'

# Create the controller 3 client cert
ziti pki create client --pki-root ./pki --ca-name ctrl3 --client-name ctrl3 --spiffe-id 'controller/ctrl3'
```
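
The per-controller commands above differ only in the controller id, so they can be wrapped in a small function. The following is a minimal sketch, not an official script: `run` and `create_controller_certs` are hypothetical helper names, the `DRY_RUN` variable is an assumption added for previewing, and the intermediate name uses the controller id instead of the spelled-out "One", "Two", "Three". It uses only the `ziti pki` flags shown above and assumes the root CA has already been created.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the intermediate, server, and client certs for one controller id.
create_controller_certs() {
  id="$1"

  # Intermediate/signing cert, issued by the root CA named 'ca'
  run ziti pki create intermediate --pki-root ./pki --ca-name ca \
    --intermediate-file "$id" --intermediate-name "Controller $id Signing Cert"

  # Server cert, issued by this controller's intermediate
  run ziti pki create server --pki-root ./pki --ca-name "$id" \
    --dns "localhost,$id.ziti.example.com" --ip "127.0.0.1,::1" \
    --server-name "$id" --spiffe-id "controller/$id"

  # Client cert, issued by the same intermediate
  run ziti pki create client --pki-root ./pki --ca-name "$id" \
    --client-name "$id" --spiffe-id "controller/$id"
}

# Possible usage, after the root CA exists:
#   for id in ctrl1 ctrl2 ctrl3; do create_controller_certs "$id"; done
```

Setting `DRY_RUN=1` first prints the commands without running them, which makes it easy to compare them against the explicit listing above.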
51 changes: 51 additions & 0 deletions docusaurus/docs/reference/ha/bootstrapping/configuration.md
@@ -0,0 +1,51 @@
---
sidebar_label: Configuration
sidebar_position: 30
---

# Controller Configuration

### Config File

The controller requires a `cluster` section.

```yaml
cluster:
  dataDir: /path/to/data/dir
```
The `dataDir` will be used to store the following:

* `ctrl-ha.db` - the OpenZiti data model bbolt database
* `raft.db` - the Raft bbolt database
* `snapshots/` - a directory to store Raft snapshots

Controllers use the control channel listener to communicate with each other. Unlike
routers, they need to know how to reach each other, so an advertise address must
be configured.

```yaml
ctrl:
  listener: tls:0.0.0.0:1280
  options:
    advertiseAddress: tls:ctrl1.ziti.example.com:1280
```

Finally, cluster-capable SDK clients use OIDC for authentication, so an OIDC endpoint must be configured.

```yaml
web:
  - name: all-apis-localhost
    bindPoints:
      - interface: 0.0.0.0:1280
        address: ctrl1.ziti.example.com:1280
    options:
      minTLSVersion: TLS1.2
      maxTLSVersion: TLS1.3
    apis:
      - binding: health-checks
      - binding: fabric
      - binding: edge-management
      - binding: edge-client
      - binding: edge-oidc
```
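
Before starting a controller, it can help to confirm that the config file contains the three pieces described on this page. The sketch below is a plain-text check using `grep`, not a YAML parser, so it can be fooled by comments or unusual formatting. `check_cluster_config` is a hypothetical helper name and is not part of the ziti CLI.

```shell
# Hypothetical pre-flight check: looks for a top-level cluster section,
# an advertiseAddress value, and an edge-oidc API binding in the file.
check_cluster_config() {
  cfg="$1"
  status=0

  grep -Eq '^cluster:' "$cfg" || {
    echo "missing: top-level cluster section"
    status=1
  }

  grep -Eq '^[[:space:]]*advertiseAddress:[[:space:]]*[^[:space:]]' "$cfg" || {
    echo "missing: ctrl.options.advertiseAddress"
    status=1
  }

  grep -Eq '^[[:space:]]*-[[:space:]]*binding:[[:space:]]*edge-oidc' "$cfg" || {
    echo "missing: edge-oidc API binding"
    status=1
  }

  return "$status"
}

# Possible usage:
#   check_cluster_config /path/to/controller-config.yml && echo "looks cluster-ready"
```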
37 changes: 37 additions & 0 deletions docusaurus/docs/reference/ha/bootstrapping/initialization.md
@@ -0,0 +1,37 @@
---
sidebar_label: Initialization
sidebar_position: 40
---

# Initializing the First Controller

First, start the controller:

```shell
ziti controller run </path/to/controller-config.yml>
```

Since this controller has not yet been initialized, it does not have an administrator
identity that can be used to manage the network. The controller will pause startup
and wait for initialization. While waiting it will periodically emit a message:

```buttonless title="Output"
[ 3.323] WARNING ziti/controller/server.(*Controller).checkEdgeInitialized: the
controller has not been initialized, no default admin exists. Add this node to a
cluster using 'ziti agent cluster add tls:ctrl1.ziti.example.com:1280' against an existing
cluster member, or if this is the bootstrap node, run 'ziti agent controller init'
to configure the default admin and bootstrap the cluster
```

As this is the first node in the cluster, there's no existing cluster to add it to.

To add the default administrator, run:

```
ziti agent cluster init <admin username> <admin password> <admin identity name>
```

This initializes an admin user that can be used to manage the network.

Once the admin user is created, the controller should be up and running. This is
now a functional HA cluster, albeit with a cluster size of one.
14 changes: 14 additions & 0 deletions docusaurus/docs/reference/ha/bootstrapping/overview.md
@@ -0,0 +1,14 @@
---
sidebar_label: Overview
sidebar_position: 10
---

# Bootstrapping A Cluster

To bring up a controller cluster, one starts with a single node.

Bootstrapping a cluster has the following steps:

1. [Creating Certificates](certificates.md)
1. [Setting Controller Config](configuration.md)
1. [Controller Initialization](initialization.md)