Forbid inefficient TensorDescriptor initialization #3393

CAHEK7 · 2024-11-17T22:43:26Z

Initializing TensorDescriptor from std::vector<int> is very inefficient: std::vector<size_t> is used internally, so every such initialization incurs extra checks and multiple intermediate vectors.

Changed all the initializations to native size_t, removed the constructors taking std::vector<int>, and added workarounds for legacy descriptor initializations with ints.
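The approach described above can be illustrated with a minimal sketch. It is not MIOpen's actual code: `SketchDescriptor` and `ToSizeT` are hypothetical names, and the sketch only assumes that lengths are stored as std::vector<size_t>, as the description states. The native constructor takes the vector by value and moves it into place, the int overload is deleted so that legacy call sites fail to compile, and a single explicit helper covers the legacy int path.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor descriptor; not MIOpen's actual class.
class SketchDescriptor
{
public:
    // Native path: the vector is moved into place, with no element conversion.
    explicit SketchDescriptor(std::vector<std::size_t> lens) : lens_(std::move(lens)) {}

    // Deleted overload: code that passes std::vector<int> no longer compiles,
    // instead of silently paying for a conversion on every call.
    explicit SketchDescriptor(const std::vector<int>&) = delete;

    const std::vector<std::size_t>& GetLengths() const { return lens_; }

private:
    std::vector<std::size_t> lens_;
};

// Hypothetical workaround for legacy int data: one explicit, checked conversion.
inline std::vector<std::size_t> ToSizeT(const std::vector<int>& in)
{
    std::vector<std::size_t> out;
    out.reserve(in.size());
    for(int v : in)
    {
        if(v < 0)
            throw std::invalid_argument("negative tensor length");
        out.push_back(static_cast<std::size_t>(v));
    }
    return out;
}
```

A legacy call site would then spell the conversion out, for example `SketchDescriptor d(ToSizeT(legacy_lens));`, which keeps the cost visible where it is paid.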

This improved the performance of the current RNN implementation by a few percent.

averinevg

LGTM, although there are a couple of minor comments.

src/tensor.cpp

CAHEK7 · 2024-11-21T22:25:57Z

The majority of the tests rely heavily on vector<int>.

* Improve 'size_t' support for the tests (related to #3393)
* move tensor_scale/set tests to gtest
* add binary and ternary subtensor operations and speed everything up
  unary: 2.1-3.8x
  binary: 1.8-2.7x
  ternary: 1.3-1.7x
* move tensor_cast/_copy tests to gtest, fix obvious bugs
* use lambdas instead of functor structs
* replace inefficient loop
* fix compilation error
* remove removed tests
* fix build
* fix test names
* clang-format
* fix tensor initialization
* fix msvc macro expansion

randyspauldingamd

I like it. A lot. I'd like it even more if we abandoned using std::vector directly, created objects for lengths and strides, and wrapped a lot of the conversions and checks into them.
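One possible shape of that suggestion is sketched below. The `Lengths` class, its methods, and the non-zero check are all hypothetical and only illustrate the idea of a dedicated type that owns the checks and derived values; nothing here is taken from MIOpen.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical wrapper type; name and interface are illustrative only.
class Lengths
{
public:
    // The validation lives in one place instead of at every call site.
    // Rejecting zero lengths is an assumption made for this sketch.
    explicit Lengths(std::vector<std::size_t> values) : values_(std::move(values))
    {
        for(std::size_t v : values_)
        {
            if(v == 0)
                throw std::invalid_argument("tensor length must be non-zero");
        }
    }

    std::size_t Rank() const { return values_.size(); }
    std::size_t operator[](std::size_t i) const { return values_.at(i); }

    // Product of all lengths (1 for an empty list).
    std::size_t ElementCount() const
    {
        return std::accumulate(values_.begin(),
                               values_.end(),
                               std::size_t{1},
                               std::multiplies<std::size_t>{});
    }

    // Strides of a packed layout whose last dimension is contiguous.
    std::vector<std::size_t> PackedStrides() const
    {
        std::vector<std::size_t> strides(values_.size(), 1);
        for(std::size_t i = values_.size(); i > 1; --i)
            strides[i - 2] = strides[i - 1] * values_[i - 1];
        return strides;
    }

private:
    std::vector<std::size_t> values_;
};
```

With such a type, a descriptor constructor could accept `Lengths` rather than a raw vector, so an int-based vector could not reach it without an explicit conversion step.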

averinevg · 2024-12-16T11:47:12Z

@randyspauldingamd I believe this is now your PR 😄. Does it pass the CI?

CAHEK7 · 2024-12-17T00:17:51Z

> @randyspauldingamd I believe this is now your PR 😄. Does it pass the CI?

It doesn't.
The tests, especially the old ctests, rely heavily on int, and that's why I put the comment at #3424 (comment): we must not bring the old legacy code into the GTests, and there are no excuses (@Vsevolod1983, @BrianHarrisonAMD, could you please check this for any upcoming PRs?)

Fixing the old CTests is a pain in the everywhere. First of all, the template code in the test driver has to be adapted (hard but manageable); next, almost all the ctests have to be modified (easy but lots of changes). The good news is that I've already updated various test data configs to be compatible with both int and size_t here https://github.com/ROCm/MIOpen/pull/3416/files#diff-1daaa3cdf3d99199e44b03aea4f80477db07637a62713338f21dd1eec71e8fcd
But we can probably get rid of the old tests soon, so the problem will disappear by itself.
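As a rough illustration of test data that works with both int and size_t, a config generator can be templated on the element type. This is only a sketch under that assumption: `GetSketchTensorLengths` and the values it returns are made up for the example and are not the configs changed in #3416.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical test-data generator, parameterized on the length type so that
// legacy int-based tests and size_t-based tests can share one definition.
template <class T>
std::vector<std::vector<T>> GetSketchTensorLengths()
{
    static_assert(std::is_integral<T>::value, "lengths must be an integral type");
    // Illustrative values only; the literals convert to T without narrowing
    // because they are small non-negative constants.
    return {{1, 8, 16}, {2, 4, 32}, {4, 4, 4, 4}};
}
```

A legacy test would request `GetSketchTensorLengths<int>()`, while a converted test would request `GetSketchTensorLengths<std::size_t>()` and pass the result to the descriptor without any conversion.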

randyspauldingamd · 2024-12-17T15:45:21Z

I just had to open my big mouth...lol. Yeah, I can take this one. I expect I'll get a few of those ctest->gtest conversion tasks after the new year anyway :)

forbid 'vector<int>' descriptor initialization

b5e4971

CAHEK7 requested review from BrianHarrisonAMD, junliume and BradPepersAMD as code owners November 17, 2024 22:43

CAHEK7 added performance quality complexity_low labels Nov 17, 2024

averinevg reviewed Nov 18, 2024

src/tensor.cpp: two review threads, both resolved

averinevg approved these changes Nov 18, 2024


CAHEK7 marked this pull request as draft November 21, 2024 22:25

CAHEK7 added a commit that referenced this pull request Dec 1, 2024

Improve 'size_t' support for the tests (related to #3393)

0d605e0

This was referenced Dec 1, 2024

Move tensor_set/scale/cast/copy to gtest #3416

Merged

[GTests] Convert TensorOps CTest to GTest #3424

Merged

Merge branch 'develop' into C7/vector_size_t

9b3d669

randyspauldingamd approved these changes Dec 14, 2024

