[nexus] Managing local rack -> managing local fleet #1276

@smklein

Some operations within Nexus are implemented as "manage the state which may exist within my local rack". This includes:

  • Awaiting handoff from RSS
  • Ensuring a rack-wide CRDB instance exists
  • Ensuring sufficient redundancy for services exists within a rack

However, longer-term, we would ideally migrate many of these operations to be "fleet-wide" instead of "rack-wide". This way, a single Nexus could control multiple racks simultaneously, ensure that CRDB nodes are distributed across racks within an AZ, and ensure that service redundancy is sufficient to survive multi-rack failure scenarios.
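
To make the distinction concrete, here is a minimal sketch of how a rack-wide redundancy check differs from a fleet-wide one. Everything in it is hypothetical and for illustration only: `RackId`, `ServiceInstance`, `rack_has_redundancy`, and `fleet_survives_rack_loss` are not existing Nexus types or functions, and the fleet-wide check only models the loss of a single rack.

```rust
use std::collections::BTreeMap;

/// Hypothetical rack identifier; not an actual Omicron type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct RackId(u32);

/// Hypothetical record of one running service instance.
#[derive(Clone, Debug)]
struct ServiceInstance {
    kind: &'static str,
    rack: RackId,
}

/// Rack-wide view: does `rack` by itself run at least `min` instances of `kind`?
fn rack_has_redundancy(
    services: &[ServiceInstance],
    rack: RackId,
    kind: &str,
    min: usize,
) -> bool {
    services
        .iter()
        .filter(|s| s.rack == rack && s.kind == kind)
        .count()
        >= min
}

/// Fleet-wide view: would at least `min` instances of `kind` remain
/// if any single rack in the fleet were lost?
fn fleet_survives_rack_loss(
    services: &[ServiceInstance],
    kind: &str,
    min: usize,
) -> bool {
    // Count instances of `kind` per rack.
    let mut per_rack: BTreeMap<RackId, usize> = BTreeMap::new();
    for s in services.iter().filter(|s| s.kind == kind) {
        *per_rack.entry(s.rack).or_insert(0) += 1;
    }
    let total: usize = per_rack.values().sum();
    // Worst case is losing the rack that hosts the most instances.
    let worst = per_rack.values().copied().max().unwrap_or(0);
    total - worst >= min
}
```

With three CRDB instances on one rack and one on another, the rack-wide check passes for the first rack with `min = 3`, but the fleet-wide check fails, because losing that rack leaves only one instance. Spreading the instances evenly across racks is what makes the fleet-wide check pass.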

For additional context, see: https://github.com/oxidecomputer/omicron/pull/1234/files/28d87f51ab88cce3d8ff2560a8996904e8c78f81#diff-5a93a4691987ea1b28d848375a2728abcb26cec85d477d051243cb1863198392

Labels: customer, multi-rack
