Mlx5 offload enable #2676

Open · mrbojangles3 wants to merge 7 commits into main

Conversation

@mrbojangles3 commented Feb 19, 2025

Enable Hardware Offloads in Mellanox driver

The configuration changes in this PR will allow for hardware offloading of traffic control and connection tracking operations. There is a netdev conf paper that talks about these features.

How to use

These kernel configuration changes allow the hardware offload features of Mellanox NICs to be used. The offloads themselves are driven from user space. For those interested in containerized workloads, one possible approach is the way Red Hat OpenShift enables this configuration (via Open vSwitch). Outside of OVS, these offloads can be configured via the tc or nft commands, as sketched below.
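
For illustration only, a rough sketch of driving these offloads from user space with ethtool, devlink, and tc; the interface name, PCI address, and representor name below are assumptions for the example, not values from this PR:

    # Enable TC hardware offload on the mlx5 uplink (interface name is an example).
    ethtool -K enp3s0f0 hw-tc-offload on

    # Put the NIC eswitch into switchdev mode so flows can be offloaded to hardware.
    devlink dev eswitch set pci/0000:03:00.0 mode switchdev

    # Attach a clsact qdisc and add a flower rule that must live in hardware (skip_sw).
    tc qdisc add dev enp3s0f0 clsact
    tc filter add dev enp3s0f0 ingress protocol ip flower skip_sw \
        dst_ip 10.0.0.2 action mirred egress redirect dev enp3s0f0_0

    # Check which filters ended up in hardware.
    tc -s filter show dev enp3s0f0 ingress

Connection tracking offload can similarly be expressed with the tc ct action, or with an nftables flowtable declared with "flags offload".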

Testing done

I have compiled these changes and booted a physical node equipped with a Mellanox CX-7.

  • Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, etc.

@mrbojangles3 (Author)
A link to the thread that started this PR.

@ader1990 (Contributor)

The pipelines failed with:

2025-02-19T18:47:35.6812600Z INFO    grub_install.sh: Installing GRUB x86_64-xen in flatcar_production_image.bin
2025-02-19T18:47:35.7002936Z INFO    grub_install.sh: Compressing modules in flatcar/grub/x86_64-xen
2025-02-19T18:47:37.5624582Z INFO    grub_install.sh: Generating flatcar/grub/x86_64-xen/load.cfg
2025-02-19T18:47:37.5794400Z INFO    grub_install.sh: Generating xen/pvboot-x86_64.elf
2025-02-19T18:47:37.5900771Z INFO    grub_install.sh: Installing default x86_64 Xen bootloader.
2025-02-19T18:47:37.6644329Z INFO    grub_install.sh: Elapsed time (grub_install.sh): 0m3s
2025-02-19T18:47:37.7125747Z INFO    build_image: Generating flatcar_production_image_pcr_policy.zip
2025-02-19T18:47:38.0202747Z INFO    build_image: Writing flatcar_production_image_contents.txt
2025-02-19T18:47:38.8309417Z INFO    build_image: Writing flatcar_production_image_contents_wtd.txt
2025-02-19T18:47:39.0791751Z cpio: premature end of file
2025-02-19T18:47:39.0798053Z rmdir: failed to remove '/home/sdk/trunk/src/scripts/artifacts/amd64-usr/developer-4249.0.0+nightly-20250217-2100-5-gad11c4677c-a1/tmp_initrd_contents/rootfs-0': Directory not empty
2025-02-19T18:47:39.0943167Z ERROR   build_image: script called: build_image '--board=amd64-usr' '--group=developer' '--output_root=/home/sdk/trunk/src/scripts/artifacts' 'prodtar' 'container' 'sysext'
2025-02-19T18:47:39.0948625Z ERROR   build_image: Backtrace:  (most recent call is last)
2025-02-19T18:47:39.0962922Z ERROR   build_image:   file build_image, line 176, called: create_prod_image 'flatcar_production_image.bin' 'base' 'developer' 'coreos-base/coreos' 'containerd-flatcar:app-containers/containerd,docker-flatcar:app-containers/docker&app-containers/docker-cli&app-containers/docker-buildx'
2025-02-19T18:47:39.0977229Z ERROR   build_image:   file prod_image_util.sh, line 169, called: finish_image 'flatcar_production_image.bin' 'base' '/home/sdk/trunk/src/scripts/artifacts/amd64-usr/developer-4249.0.0+nightly-20250217-2100-5-gad11c4677c-a1/rootfs' 'flatcar_production_image_contents.txt' 'flatcar_production_image_contents_wtd.txt' 'flatcar_production_image.vmlinuz' 'flatcar_production_image_pcr_policy.zip' 'flatcar_production_image.grub' 'flatcar_production_image.shim' 'flatcar_production_image_kernel_config.txt' 'flatcar_production_image_initrd_contents.txt' 'flatcar_production_image_initrd_contents_wtd.txt' 'flatcar_production_image_disk_usage.txt'
2025-02-19T18:47:39.0988460Z ERROR   build_image:   file build_image_util.sh, line 869, called: die_err_trap '"${BUILD_LIBRARY_DIR}/extract-initramfs-from-vmlinuz.sh" "${root_fs_dir}/boot/flatcar/vmlinuz-a" "${BUILD_DIR}/tmp_initrd_contents"' '1'
2025-02-19T18:47:39.0993394Z ERROR   build_image: 
2025-02-19T18:47:39.0999869Z ERROR   build_image: Command failed:
2025-02-19T18:47:39.1006572Z ERROR   build_image:   Command '"${BUILD_LIBRARY_DIR}/extract-initramfs-from-vmlinuz.sh" "${root_fs_dir}/boot/flatcar/vmlinuz-a" "${BUILD_DIR}/tmp_initrd_contents"' exited with nonzero code: 1

This is because the initrd now contains a phantom cpio piece. I solved this in my kernel upgrade PR, and that fix will probably also solve this issue: 2be94c2.
You can rebase and cherry-pick this commit, and I can rerun the pipeline.
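
For reference, a minimal sketch of that workflow on the PR branch (the remote name and branch layout are assumptions):

    git fetch origin                # refresh refs from flatcar/scripts
    git rebase origin/main          # rebase the PR branch onto current main
    git cherry-pick 2be94c2         # pick the initramfs extraction fix
    git push --force-with-lease     # update the PR branch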

github-actions bot commented Feb 20, 2025

Build action triggered: https://github.com/flatcar/scripts/actions/runs/13496833507

@mrbojangles3 (Author)

On my local machine I was able to run build_packages and build_image successfully. I am hopeful that this will also be the case in CI.
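
For context, a rough sketch of that local flow inside the flatcar/scripts SDK container; the flags mirror the CI invocation shown in the log above and may not match the exact local setup:

    ./build_packages --board=amd64-usr
    ./build_image --board=amd64-usr --group=developer prodtar container sysext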

Commits added to the PR (message excerpts):

Signed-off-by: Jeremi Piotrowski <[email protected]>

Requires for mlx5 hardware offload on arm64 too.
Signed-off-by: Jeremi Piotrowski <[email protected]>

These options are x86 only.
Signed-off-by: Jeremi Piotrowski <[email protected]>
@jepio (Member) commented Feb 24, 2025

The arm64 build failed due to CONFIG_SWITCHDEV being in the amd64-only config, and CONFIG_VFIO_PCI_{VGA,IGD} in commonconfig actually being x86 only.
I've pushed a couple of commits to fix this up (hope you don't mind @mrbojangles3) and to make as many of the options modules as possible. With that, I hope we can review the size impact of all these options so that we can judge whether we can afford to enable all of them.
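
Purely as an illustration of that split (the symbols below are the standard kernel names; the exact options and values in this PR may differ), architecture-neutral offload options would live in the common config, built as modules where the kernel allows it, while the x86-only VFIO options stay in the amd64 config:

    # common config (amd64 and arm64), illustrative values only
    CONFIG_NET_SWITCHDEV=y      # bool, cannot be built as a module
    CONFIG_NET_CLS_FLOWER=m
    CONFIG_NET_ACT_CT=m
    CONFIG_MLX5_ESWITCH=y       # bool feature flags on the mlx5_core module
    CONFIG_MLX5_CLS_ACT=y
    CONFIG_MLX5_TC_CT=y

    # amd64-only config
    CONFIG_VFIO_PCI_VGA=y
    CONFIG_VFIO_PCI_IGD=y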

@jepio (Member) commented Feb 24, 2025

Also executed the sort_config script to sort the new options.

@mrbojangles3 (Author)

> (hope you don't mind @mrbojangles3)

I do not. Thank you for the help.

Right now the tests are failing; does that mean things have gotten too big? Did I miss documentation somewhere about a size limitation?
