Attach VM root volumes as disk devices #14532
base: main
Conversation
Currently when booting a VM with two root disk devices attached, the kernel chooses which partitions to mount at
Still investigating the cause. EDIT: The above scenarios all occurred with
Force-pushed from 1a573ed to 679879e
A little more info on the bootorder bug; this is worse than I thought. When two VMs are launched from the same cloud image, the UUIDs for the filesystems and partitions are all the same:
The UUID for the root filesystem is passed via kernel cmdline in Jammy:
But
TL;DR, VM root volume attachment as implemented here is only safe between machines deployed using different cloud images, using Jammy or older as the recovery machine. Working on figuring out how(/if) the public clouds get around this.
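For illustration, the duplication can be checked roughly like this (a sketch; the image alias and device path are assumptions, not taken from the PR):

```sh
# Launch two VMs from the same cloud image (illustrative names).
lxc launch ubuntu:22.04 vm1 --vm
lxc launch ubuntu:22.04 vm2 --vm

# Inside each guest, the root filesystem UUID comes from the image,
# so both VMs report the same value (device path is an assumption):
sudo blkid /dev/sda1

# On Jammy, the kernel selects its root device by that UUID:
cat /proc/cmdline    # ... root=UUID=<same value in both guests> ...
```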
679879e
to
0c3707d
Compare
Done some more reading:
I need to do some more reading on how we control the UEFI boot entries in LXD. More Friday.
In OpenStack, the procedure for getting access to another VM's root volume is to create an image from the VM, a volume from the image, and then mount the volume on another VM:
This does suffer from the same duplicate UUID problem as described above. A VM can be created based on the modified volume.
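Roughly, with the openstack client (a sketch; names and sizes are placeholders):

```sh
# Snapshot the broken server into an image.
openstack server image create --name broken-root-image broken-vm

# Create a volume from that image.
openstack volume create --image broken-root-image --size 10 broken-root-vol

# Attach the volume to a recovery server.
openstack server add volume recovery-vm broken-root-vol
```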
Force-pushed from 5ec4d3b to d3143c7
Force-pushed from d3143c7 to 094830b
Converting to draft until TODO items are marked as completed.
Force-pushed from 094830b to 729f788
Force-pushed from 729f788 to 9c9ac1f
Please can you rebase.
Force-pushed from 9c9ac1f to 2f64287
Rebase is complete @tomponline
doc/explanation/storage.md (outdated)

```diff
@@ -122,7 +122,7 @@ Storage volumes can be of the following types:
 : LXD automatically creates one of these storage volumes when you launch an instance.
   It is used as the root disk for the instance, and it is destroyed when the instance is deleted.
 
-This storage volume is created in the storage pool that is specified in the profile used when launching the instance (or the default profile, if no profile is specified).
+This storage volume is created in the storage pool that is specified when launching the instance (or in the default profile, if no pool or profile is specified).
```
Suggested change:

```diff
-This storage volume is created in the storage pool that is specified when launching the instance (or in the default profile, if no pool or profile is specified).
+This storage volume is created in the storage pool that is specified when launching the instance. This pool can be specified explicitly or by specifying a profile. If no pool or profile is specified, LXD uses the storage pool of the default profile.
```
I find this suggested version to be easier to understand because it explains that the storage pool can be specified explicitly or via the profile. The current version only implies (in "if no pool or profile is specified") that a profile can be used to specify the pool.
Without this edit, it's also difficult to parse what clause the parenthesized content modifies; that is, what does the part following "or" replace? I know, only after reading it a couple of times, that it's meant to replace "when launching the instance". However, at first glance, it's natural to parse it as "This storage volume is created in the default profile, if no pool or profile is specified" due to the symmetry of "in the storage pool" and "in the default profile", and that doesn't make sense.
I totally agree RE clarity. Having looked at this again today I'm not sure that this is the right spot in the docs to nail down the semantics of which pool is selected for a root disk device at all; the section above is probably more appropriate. I've pared this down some (necessarily removing some nuance). Let me know what you think.

I'm wondering if the relationship between the root disk device and the `container`/`virtual-machine` volume type is a bit of a docs gap; figuring out the relationship between those two took me some time. Probably out of scope for this PR.
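To make the pool-selection behaviour concrete, it amounts to something like this (a sketch; pool, profile, and image names are made up):

```sh
# Pool specified explicitly at launch time:
lxc launch ubuntu:24.04 v1 --storage mypool

# Pool taken from the root disk device of a given profile:
lxc launch ubuntu:24.04 v2 --profile myprofile

# Neither specified: the root disk device of the default profile decides:
lxc launch ubuntu:24.04 v3
```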
Signed-off-by: Wesley Hershberger <[email protected]>
2f64287
to
0229c19
Compare
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
These keys can only be live-updated on containers. Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
virtual-machine/container volumes in the default project do not include the project name. Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
The change to the source property makes `vol1` and `custom/vol1` semantically identical even though they are not syntactically identical. It's not correct to simply compare the strings anymore. This is the only instance of this comparison in the lxc and client packages. We also don't need to check the volume type as an incorrect volume type will just give a "No device found for this storage volume" error. Signed-off-by: Wesley Hershberger <[email protected]>
… used Signed-off-by: Wesley Hershberger <[email protected]>
…tart This allows a VM root disk to be attached to another instance without setting `security.shared`. If we only allow vm roots to be attached when security.shared is set on the volume, it makes it possible to forget to unset security.shared when the volume is detached. Forgetting to unset security.protection.start is harder :) Signed-off-by: Wesley Hershberger <[email protected]>
I'm having a hard time coming up with a scenario where this would be desirable. Signed-off-by: Wesley Hershberger <[email protected]>
We can no longer short-circuit here because a VM's root disk might be attached to another instance. I fixed this proactively for containers as well, but it does incur a performance penalty. Signed-off-by: Wesley Hershberger <[email protected]>
...from a virtual machine when the VM's root disk device is attached to another instance. This works when the key is set on a profile or instance, since it checks the expanded config. Signed-off-by: Wesley Hershberger <[email protected]>
Will allow us to check when updating `virtual-machine` volumes Signed-off-by: Wesley Hershberger <[email protected]>
If a virtual-machine volume is attached to more than one instance, don't allow removing security.shared. Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Force-pushed from 0229c19 to a4ebc1e
```diff
@@ -206,7 +206,6 @@ func DiskVolumeSourceParse(source string) (volType drivers.VolumeType, dbVolType
 	case cluster.StoragePoolVolumeTypeNameContainer:
 		err = errors.New("Using container storage volumes is not supported")
 	case cluster.StoragePoolVolumeTypeNameVM:
```
Suggested change:

```diff
-	case cluster.StoragePoolVolumeTypeNameVM:
+	case cluster.StoragePoolVolumeTypeNameVM, cluster.StoragePoolVolumeTypeNameCustom:
```

Can be simplified as such, while removing the `case cluster.StoragePoolVolumeTypeNameCustom:` a few lines below.
```diff
@@ -971,6 +971,40 @@ func InstanceContentType(inst instance.Instance) drivers.ContentType {
 	return contentType
 }
 
+// volumeIsUsedByDevice
```
Is this comment incomplete?
```diff
+## `instance_root_volume_attachment`
+
+Adds support for instance root volumes to be attached to other instances as disk
```
If I understood the scope of this PR correctly, this should state `virtual machines` instead of `instances`, as `container` volumes won't be available for attaching.
```go
	rootVolumeType := cluster.StoragePoolVolumeTypeNameContainer
	if inst.Type == instancetype.VM {
		rootVolumeType = cluster.StoragePoolVolumeTypeNameVM
	}

	if inst.Name == vol.Name && rootVolumeType == vol.Type {
		return true, nil
```
You could use something like

```go
	volumeType, err := InstanceTypeToVolumeType(inst.Type)
	if err != nil {
		return false, err
	}

	volumeDBType, _ := VolumeTypeToDBType(volumeType)
```

and compare using `cluster.StoragePoolVolumeTypeNames[volumeDBType]`.
```diff
@@ -257,7 +259,7 @@ func (c *cmdStorageVolumeAttach) run(cmd *cobra.Command, args []string) error {
 	device := map[string]string{
 		"type":   "disk",
 		"pool":   resource.name,
-		"source": volName,
+		"source": args[1],
```
What if the user provides just the volume name without the type?
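For context, the two invocations in question would look something like this (assuming a custom volume `vol1` in pool `default`; per the commit message above, both forms are meant to be semantically identical):

```sh
# Explicit type prefix is passed through as-is:
lxc storage volume attach default custom/vol1 c1

# Bare volume name, defaulting to a custom volume:
lxc storage volume attach default vol1 c1
```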
```go
	}

	apiInst.ExpandedConfig = instancetype.ExpandInstanceConfig(d.state.GlobalConfig.Dump(), apiInst.Config, inst.Profiles)
```
I think we could put a comment here explaining why `security.protection.start` makes it okay to attach the root volume on another instance.
```go
	if shared.IsFalseOrEmpty(changedConfig["security.shared"]) && volDBType == cluster.StoragePoolVolumeTypeVM {
		err = allowRemoveSecurityShared(b.state, inst.Project().Name, &curVol.StorageVolume)
		if err != nil {
			return err
		}
	}
```
For VM volume sharing we should also consider the usage of `security.protection.start` here, right? Say a VM volume is attached to more than one instance but one of them has `security.protection.start` enabled; then we should allow unsetting `security.shared`, isn't that correct?
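As a concrete version of that scenario (a hedged sketch; the instance names and the attach syntax for `virtual-machine` volumes are assumptions based on this PR):

```sh
# vm1's root volume is shared and attached to one other instance.
lxc storage volume set default virtual-machine/vm1 security.shared=true
lxc config device add vm2 vm1-root disk pool=default source=virtual-machine/vm1

# vm1 itself can no longer start:
lxc config set vm1 security.protection.start=true

# The volume has two users (vm1 and vm2), but only vm2 can actually run,
# so arguably this should now be allowed:
lxc storage volume unset default virtual-machine/vm1 security.shared
```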
```go
	// Handle instance volumes.
	if vol.Type == cluster.StoragePoolVolumeTypeNameContainer || vol.Type == cluster.StoragePoolVolumeTypeNameVM {
		volName, snapName, isSnap := api.GetParentAndSnapshotName(vol.Name)
		if isSnap {
			return []string{api.NewURL().Path(version.APIVersion, "instances", volName, "snapshots", snapName).Project(vol.Project).String()}, nil
		}

		return []string{api.NewURL().Path(version.APIVersion, "instances", volName).Project(vol.Project).String()}, nil
	}
```
For containers, doesn't it make sense to return early? You are claiming it doesn't affect performance, but how can that be if we are running `VolumeUsedByInstanceDevices` and `VolumeUsedByProfileDevices` unnecessarily for containers? I may have missed something here; if so, sorry about this.
```go
	// If vol is the instance's root volume and it is defined in a profile,
	// it won't be added to the list by VolumeUsedByInstanceDevices.
	instancePath := api.NewURL().Path(version.APIVersion, "instances", volName).Project(vol.Project).String()
	if !slices.Contains(volumeUsedBy, instancePath) {
		volumeUsedBy = append(volumeUsedBy, instancePath)
	}
```
If the volume is being used in a profile device as an additional drive, instances with this device won't be added to `volumeUsedBy`, right? This is probably expected anyway; I am asking more for my own understanding.
```go
			return err
		}
	}

	// Load storage volume from database.
	dbVol, err := VolumeDBGet(b, inst.Project().Name, inst.Name(), volType)
```
Not related to your changes, but are we getting the same volume from the database again? On line 2895 we have `curVol, err := VolumeDBGet(b, inst.Project().Name, inst.Name(), volType)`.
Includes commits from #14491
This PR enables attaching a virtual machine's root storage volume to another virtual machine via a disk device. This has some constraints because simultaneous access to storage volumes with content-type `block` is unsafe:

- `vm1`'s root volume can be attached to exactly one other instance if `vm1` has `security.protection.start: true`.
- `vm1`'s root volume can be attached to any number of other instances if the storage volume `virtual-machine/vm1` has `security.shared: true`.

`security.protection.start` is recommended for interactive use; e.g. a user temporarily needs to access a bricked machine's root volume to fix it or recover data. `security.shared` can be used if more than one running instance must have access to the block volume.
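As a usage sketch (pool and instance names are illustrative, and the exact disk device syntax is inferred from this PR, not confirmed by it):

```sh
# Recover data from a bricked VM: stop it and protect it from starting.
lxc stop -f vm1
lxc config set vm1 security.protection.start=true

# Attach vm1's root volume to a recovery VM as an extra disk device.
lxc config device add recovery vm1-root disk pool=default source=virtual-machine/vm1

# Alternatively, allow any number of attachments by marking the volume shared:
lxc storage volume set default virtual-machine/vm1 security.shared=true
```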
TODO