Commit
6.4 version fix and mergeback 6.3 hotfixes (#650)
* Remove website URL from comments (#600)
  Referencing or using code from some websites is prohibited in this repository. This change removes an informational reference in the comments.
* Fix rare memory access faults when using internal serial merge (#597)
* test: add tests for internal serial merge function
* refactor(detail/merge_path.hpp): removed code duplication
* fix(detail/merge_path.hpp): stricter boundary checking in serial merge
* fix(detail/block_sort_merge.hpp): fix missing block-wide sync
  After a previous refactor, serial_merge no longer does a block sync. This has now been re-added.
* feat: add unsafe variant of serial merge
* fix: use bounded version for serial merge to fix rare page faults
* test(test_internal_merge_path): clean up internal merge path tests
* style: standardize range_t<> construction
* fix(detail/merge_path.hpp): fix 'range_t<>::count1()' and 'range_t<>::count2()' return types to be same as encapsulated type
* perf(detail/merge_path.hpp): use const ref in function parameters
* refactor(detail/merge_path.hpp): replace redundant use of 'OffsetT' with 'unsigned int'
* chore: update changelog
* fix: restore missing thread sync
  This got removed during a rebase.
* Add gfx1151 target (#601) (#603)
  Co-authored-by: Stanley Tsang <[email protected]>
* Merge back 6.2 hotfixes (#607) (#620)
* Update dependency names for static builds (#557)
  This also removes the line setting `BUILD_SHARED_LIBS` to `ON`, which was previously required to get the correctly named packages when not specifically compiling for a static build. Updates to the ROCmCMakeBuildTools (rocm-cmake) should mean this is no longer necessary.
* Fix BUILD_SHARED_LIBS for packaging (#558)
* Fix the dependencies of the static packages (#563)
* cmake: don't set CMAKE_C_COMPILER, as rocPRIM is a CXX project (#567)
* add developer guidelines (#555) (#574)
* Update Read the Docs config to Python 3.10 and latest rocm-docs-core (#564) (#579)
* Cherry-pick: Optimize block_reduce_warp_reduce when block size is the same as warp size (#599)
* Optimize block_reduce_warp_reduce when block size == warp size
* Make conditional constexpr
* Fix conflict in concepts.rst

---------

Co-authored-by: Lauren Wrubleski <[email protected]>
Co-authored-by: Steve Leung <[email protected]>
Co-authored-by: randyh62 <[email protected]>
Co-authored-by: Nol Moonen <[email protected]>
Co-authored-by: Sam Wu <[email protected]>

* Changed precondition for edge case in serial_merge to prevent assertion error (#622)
* added std::min to ensure no out-of-bounds access
* fixed typo keys->keys1
* updated changelog
* reverted std::min
* implemented suggested logic
* edited to conform to standards (#618)
* Memory leak fix for multiple rocPRIM unit tests (#614)
* fixed mem leak in test_config_dispatch.cpp
* added missing hipFree for method==4 in test_block_scan.kernels
* added GraphHelper class that does not cause memory leak due to using hipGraphCreate
* replaced old hipGraph helpers with new class in device_bin_search
* changed HIP_CHECK_NON_VOID to HIP_CHECK
* fixed mem leak in device_bin_search
* added additional functions
* changed out old calls to hipGraphCreate to new GraphHelper class
* added missing stream sync for hipgrag_algs
* n
* added missing hipFree and HIP_CHECK for lookback_reproducibility
* added missing hipFree in test_discard_iterator
* fixed test failures
* removed extra hipFree
* removed unused variables
* updated changelog
* removed redundant function

---------

Co-authored-by: Your Name <[email protected]>
Co-authored-by: root <[email protected]>

* updated the changelog for 6.3 (#632)
* updated default gpu to include gfx12 and gfx1151
* updated changelog
* fixed minor grammar mistake in changelog
* Update CHANGELOG.md
  Co-authored-by: spolifroni-amd <[email protected]>
* Remove gfx940,gfx941 targets (#639)
* Update rocPRIM version

---------

Co-authored-by: Wayne Franz <[email protected]>
Co-authored-by: Nara <[email protected]>
Co-authored-by: amd-garydeng <[email protected]>
Co-authored-by: Lauren Wrubleski <[email protected]>
Co-authored-by: Steve Leung <[email protected]>
Co-authored-by: randyh62 <[email protected]>
Co-authored-by: Nol Moonen <[email protected]>
Co-authored-by: Sam Wu <[email protected]>
Co-authored-by: Di Nguyen <[email protected]>
Co-authored-by: spolifroni-amd <[email protected]>
Co-authored-by: Your Name <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Val Movsik <[email protected]>