[BUG]: abp_pcap_detection pipeline running slowly on AArch64 #2120

Closed
2 tasks done
dagardner-nv opened this issue Jan 21, 2025 · 6 comments
Labels
bug Something isn't working

Comments

@dagardner-nv
Contributor

Version

25.02

Which installation method(s) does this occur on?

Source

Describe the bug.

This pipeline typically executes in under a minute (20s on my system); on an Arm64 system, however, it takes between 5 and 9 minutes.

The first run took 5m38s, and I initially suspected the ONNX conversion was the culprit. However, a second run took 9m2s.

Minimum reproducible example

Run the `examples/abp_pcap_detection` example workflow, as documented, on an Arm64 system.

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@dagardner-nv dagardner-nv added the bug Something isn't working label Jan 21, 2025
@dagardner-nv
Contributor Author

dagardner-nv commented Jan 21, 2025

The ransomware_detection pipeline shows a similar performance difference: 13m21.449s vs 1m29.207s (#2124).

This might be an issue with a specific stage.

@dagardner-nv dagardner-nv changed the title [BUG]: abp_pcap_detection pipeline running slowly on AArch64 [BUG]: abp_pcap_detection pipeline running slowly on AArch64 Jan 23, 2025
@dagardner-nv dagardner-nv self-assigned this Jan 23, 2025
@dagardner-nv dagardner-nv moved this from Todo to In Progress in Morpheus Boards Jan 23, 2025
@dagardner-nv
Contributor Author

On the x86_64 side the slowest stage is the preprocessing stage.

@dagardner-nv
Contributor Author

On ARM the preprocessing stage is also the slowest, but it takes 3m38s vs 13s on x86_64.

@dagardner-nv
Contributor Author

The issue appears to be the number of threads being used. The ARM system I've been testing on has 80 cores, resulting in a default thread count of 80.

This pipeline contains 12 stages; setting `--num_threads=12` results in a runtime of 1m10.253s.
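
The idea of capping the thread count at the number of stages can be sketched as below. This is only an illustration, not Morpheus code: `pick_num_threads` is a hypothetical helper, and the cap itself is a heuristic taken from the timings in this issue rather than a documented rule.

```python
import os


def pick_num_threads(num_stages, cpu_count=None):
    """Return a thread count capped at the number of pipeline stages.

    Hypothetical helper: the cap is a heuristic based on the timings
    reported in this issue, not a Morpheus default.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if cpu_count is None:
        # os.cpu_count() can return None when the count cannot be determined
        cpu_count = os.cpu_count() or 1
    return max(1, min(cpu_count, num_stages))


if __name__ == "__main__":
    # 80 cores and 12 stages, as on the ARM system described above
    print(pick_num_threads(num_stages=12, cpu_count=80))
```

The returned value could then be passed to the pipeline through `--num_threads`.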

@dagardner-nv
Contributor Author

Running with `--num_threads=1` results in an execution time of 27s.
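
Comparisons like this can be repeated with a small timing harness. The sketch below assumes the pipeline is started from a command that accepts `--num_threads=N`; `time_command` and `sweep_thread_counts` are hypothetical helpers, and the command in the `__main__` block is a placeholder to be replaced with the real run command.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def sweep_thread_counts(base_cmd, thread_counts):
    """Time base_cmd once per thread count, appending --num_threads=N each time."""
    results = {}
    for n in thread_counts:
        results[n] = time_command(base_cmd + [f"--num_threads={n}"])
    return results


if __name__ == "__main__":
    # Placeholder command that does nothing; substitute the pipeline's run command.
    base_cmd = [sys.executable, "-c", "pass"]
    for n, seconds in sweep_thread_counts(base_cmd, [1, 12, 80]).items():
        print(f"--num_threads={n}: {seconds:.1f}s")
```

A single run per setting is noisy, so it is worth repeating each measurement before drawing conclusions.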

@dagardner-nv
Contributor Author

In testing, we found that the performance problems occurred under Ubuntu 22.04 but not under Ubuntu 24.04.
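
When comparing hosts, it helps to record the architecture, kernel release, and distribution version alongside each timing. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and it deliberately does not encode any kernel-version threshold, since none is stated here.

```python
import platform


def is_arm64(machine=None):
    """Return True if the machine string names a 64-bit ARM architecture."""
    machine = (machine or platform.machine()).lower()
    return machine in ("aarch64", "arm64")


def distro_version():
    """Return VERSION_ID from os-release (e.g. "22.04"), or None if unavailable."""
    try:
        # Requires Python 3.10+ and an os-release file on the host
        info = platform.freedesktop_os_release()
    except (AttributeError, OSError):
        return None
    return info.get("VERSION_ID")


def describe_host():
    """Collect the host details worth attaching to a timing report."""
    return {
        "machine": platform.machine(),
        "arm64": is_arm64(),
        "kernel": platform.release(),
        "distro_version": distro_version(),
    }


if __name__ == "__main__":
    for key, value in describe_host().items():
        print(f"{key}: {value}")
```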

rapids-bot bot pushed a commit that referenced this issue Jan 31, 2025
…es for DFP (#2162)

* Document Arm64 performance issues on older Linux Kernels
* Manually install PyTorch in the DFP container, allowing Arm64 users to run the pipeline and working around issue #2095 (fix slated for PyTorch 2.6 / Morpheus 25.06).

Closes #2120
Closes #2124

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Eli Fajardo (https://github.com/efajardo-nv)
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #2162
@github-project-automation github-project-automation bot moved this from In Progress to Done in Morpheus Boards Jan 31, 2025