
"Failed to start ContainerManager" err="failed to get rootfs info: failed to get mount point for device..." #3839

Open
rnnr opened this issue Jan 5, 2025 · 11 comments

Comments

@rnnr

rnnr commented Jan 5, 2025

I'm getting the error as described in Known Issues (https://kind.sigs.k8s.io/docs/user/known-issues/), but creating and using the cluster config file did not change anything:

Jan 05 23:26:32 kind-control-plane kubelet[1763]: E0105 23:26:32.106420 1763 kubelet.go:1649] "Failed to start ContainerManager" err="failed to get rootfs info: failed to get mount point for device "/dev/nvme0n1p2": no partition info for device "/dev/nvme0n1p2""
Jan 05 23:26:32 kind-control-plane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE

My cluster YAML looks like this (I'm starting it with kind create cluster --config ~/.kind/cluster.yaml); the partition has the F2FS filesystem:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /dev/nvme0n1p2
      containerPath: /dev/nvme0n1p2
      propagation: HostToContainer

kind version:
kind v0.26.0 go1.23.4 linux/amd64

docker version:

Client:
 Version:           26.1.0
 API version:       1.45
 Go version:        go1.23.1
 Git commit:        9714adc6c797755f63053726c56bc1c17c0c9204
 Built:             Sun Dec  8 21:43:42 2024
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          26.1.0
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.23.3
  Git commit:       061aa95809be396a6b5542618d8a34b02a21ff77
  Built:            Thu Dec 12 15:02:12 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.15
  GitCommit:        926c9586fe4a6236699318391cd44976a98e31f1
 runc:
  Version:          1.1.12
  GitCommit:        51d5e94601ceffbbd85688df1c928ecccbfa4685
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad007797e0dcd8b7126f27bb87401d224240

Is there something else I should check or another workaround?

@stmcginnis
Contributor

I think we need to know a little more about your environment. Can you include the output from docker info?

You can also run kind create cluster --config ~/.kind/cluster.yaml --retain to keep the node container around after failure to inspect it for config issues or look for log messages by exec'ing in and running commands. You can also do kind export logs to collect up the various logs of interest from the node.
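For example, a rough sketch of that debugging flow (kind-control-plane is the default node name, and ./kind-logs is an arbitrary output directory):

kind create cluster --config ~/.kind/cluster.yaml --retain
docker exec kind-control-plane journalctl -u kubelet --no-pager | tail -n 50
kind export logs ./kind-logs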

@BenTheElder
Member

the partition has the F2FS filesystem

I'm not familiar with this one, but using a more common filesystem, e.g. an ext4 partition, will probably fix it.

@rnnr
Author

rnnr commented Jan 6, 2025

docker info:

Client:
 Version:    26.1.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  0.14.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.28.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 6
  Running: 3
  Paused: 0
  Stopped: 3
 Images: 31
 Server Version: 26.1.0
 Storage Driver: overlay2
  Backing Filesystem: f2fs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 926c9586fe4a6236699318391cd44976a98e31f1
 runc version: 51d5e94601ceffbbd85688df1c928ecccbfa4685
 init version: de40ad007797e0dcd8b7126f27bb87401d224240
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.6.62-gentoo-dist
 Operating System: Gentoo Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 24
 Total Memory: 188.5GiB
 Name: shodan
 ID: DLAE:EMXQ:UF4S:N7LR:JXGR:V5YJ:RLBU:FDIJ:C6FZ:C3X5:F7NM:HU5M
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: rnnr
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

@rnnr
Author

rnnr commented Jan 6, 2025

the partition has the F2FS filesystem

I'm not familiar with this one, but using a more common filesystem, e.g. an ext4 partition, will probably fix it.

My rootfs is on this filesystem, and kind wants to know about the rootfs, no?
How should I change the disk it uses? The cluster config file seems to be ignored.

@rnnr
Author

rnnr commented Jan 6, 2025

You can also run kind create cluster --config ~/.kind/cluster.yaml --retain to keep the node container around after failure to inspect it for config issues or look for log messages by exec'ing in and running commands. You can also do kind export logs to collect up the various logs of interest from the node.

I did this (or maybe without the --retain, but it does not matter): the message I posted initially appears repeatedly in kubelet.log, and kind create cluster gets stuck on it for a while.

Attaching the whole file. I can provide any of the others too; I just do not want to flood the thread with useless data, so please guide me.
kubelet.log

@BenTheElder
Member

My rootfs is on this and kind wants to know about rootfs, or no?

kubelet is looking for stats, but from its POV the "rootfs" will be whatever the storage for the "node" container is on.

The logs from kubelet don't make sense in this context, because kubelet expects to be running directly on a "real" host (machine, VM), not in a container (which is not technically supported upstream).

So the rootfs in this case would be whatever filesystem docker's data root is on with your volumes and containers.

This code is not in kind, and the filesystem stats need to work inside the container.
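To see which filesystem that actually is on your host, something like this works (Docker Root Dir also shows up in your docker info output above, /var/lib/docker in your case):

docker info -f '{{.DockerRootDir}}'
df -T "$(docker info -f '{{.DockerRootDir}}')"

df -T should report f2fs there, which is what kubelet/cadvisor is tripping over.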

@BenTheElder
Member

How should I change the disk it uses - the cluster config file seems to be ignored.

https://docs.docker.com/engine/daemon/#daemon-data-directory
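A minimal sketch of moving it, assuming /mnt/ext4/docker is a path on an ext4-backed filesystem (the path is hypothetical, this overwrites any existing daemon.json, and existing images/containers will not move automatically):

sudo systemctl stop docker
sudo mkdir -p /mnt/ext4/docker
echo '{ "data-root": "/mnt/ext4/docker" }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
docker info -f '{{.DockerRootDir}}'   # should now print /mnt/ext4/docker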

@BenTheElder
Member

In theory we'd like kind to work with all of these, but in practice the container ecosystem is best tested with ext4 and possibly a few others, and definitely not with all filesystems (and most of the relevant code is not in kind).

In the future the stats may come from kubelet and CRI (containerd here) instead of cadvisor.

See also: https://github.com/kubernetes-sigs/kind/pull/1464/files (not sure if this sort of thing is relevant for f2fs)

@rnnr
Author

rnnr commented Jan 7, 2025

Thanks for the pointers, I'll hopefully look at it more closely soon. I appreciate the info; some more pressing things just came up.

@rnnr
Author

rnnr commented Jan 15, 2025

See also: https://github.com/kubernetes-sigs/kind/pull/1464/files (not sure if this sort of thing is relevant for f2fs)

I've checked the code. I'm not sure how the mountDevMapper function is supposed to be used, but the docker info -f "{{.Driver}}" command I see it calling returns "overlay2" on my machine, so the function would return false.

@BenTheElder
Member

Yes, we make no attempt to support F2FS specifically (and I'm not sure what it would require), but you could try manually configuring the equivalent /dev/mapper mount on the off chance we have the same problem here.

https://kind.sigs.k8s.io/docs/user/configuration/#extra-mounts
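A hypothetical sketch of that manual mount, analogous to the automatic one in the PR linked above (whether this helps for f2fs at all is an open question):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    # mirror the host's device-mapper devices into the node, as the
    # linked PR does automatically for some storage drivers
    - hostPath: /dev/mapper
      containerPath: /dev/mapper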
