Releases: rapidsai/rmm
v25.08.00
🚨 Breaking Changes
- Update requirements to CUDA 12.0+ (#1984) @bdice
- Remove CUDA 11 from dependencies.yaml (#1934) @KyleFromNVIDIA
- stop uploading packages to downloads.rapids.ai (#1929) @jameslamb
🐛 Bug Fixes
- Temporarily disable failing test on HMM systems. (#1950) @bdice
- Fix race conditions and deadlocks in REPLAY_BENCH (#1940) @wence-
📖 Documentation
- Update Python build instructions to include librmm wheel (#1978) @gmarkall
- Fix Python path in CONTRIBUTING.md (#1936) @bdice
🚀 New Features
🛠️ Improvements
- Use size_type in device_uvector (#1992) @bdice
- chore: remove unused line from update-version.sh (#1989) @gforsyth
- Revert "Update branches that trigger nightlies (#1954)" (#1988) @gforsyth
- fix(docker): use versioned -latest tag for all rapidsai images (#1987) @gforsyth
- Move more implementations to precompiled shared library (#1980) @bdice
- [pre-commit.ci] pre-commit autoupdate (#1979) @pre-commit-ci[bot]
- Add managed memory resource to replay benchmark (#1938) @pentschev
- Remove CUDA 11 from dependencies.yaml (#1934) @KyleFromNVIDIA
- Remove CUDA 11 devcontainers and update CI scripts (#1933) @bdice
- refactor(rattler): remove cuda11 options and general cleanup (#1932) @gforsyth
- stop uploading packages to downloads.rapids.ai (#1929) @jameslamb
- Forward-merge branch-25.06 into branch-25.08 (#1925) @gforsyth
- Branch 25.08 merge branch 25.06 (#1914) @vyasr
- Forward-merge branch-25.06 into branch-25.08 (#1905) @gforsyth
[NIGHTLY] v25.10.00
🐛 Bug Fixes
- Fix aligned resource adaptor bug when alignment is less than 256. (#2003) @bdice
- Skip test on HMM systems with older CUDA drivers (#1944) @bdice
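The notes for #2003 do not describe the bug itself, only that it appeared for alignments below 256 bytes. As background, the round-up arithmetic that aligned allocation generally relies on can be sketched as follows. The function name `align_up` and its behavior here are illustrative assumptions, not RMM's actual code.

```python
def align_up(value: int, alignment: int) -> int:
    """Round `value` up to the nearest multiple of `alignment`.

    Hypothetical helper for illustration only; `alignment` is assumed
    to be a positive power of two.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and masking off the low bits rounds up.
    return (value + alignment - 1) & ~(alignment - 1)


# A 100-byte request padded to two different alignments.
padded_256 = align_up(100, 256)
padded_64 = align_up(100, 64)
```

With a 64-byte alignment the padded size is smaller than with 256, so any code that silently assumes a 256-byte granularity can mis-size or mis-place a block. That is the general class of issue this kind of fix tends to address.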
🚀 New Features
- Update to use CUDA runtime APIs that work in CUDA 12 and 13 (#2004) @robertmaynard
🛠️ Improvements
- Update rapids-build-backend to 0.4.1 (#2007) @KyleFromNVIDIA
- ci(labeler): update labeler action to @v5 (#2006) @gforsyth
- [StepSecurity] Apply security best practices (#2000) @stepsecurity-app[bot]
- Use std::for_each instead of std::equal in test (#1999) @vyasr
- Allow latest OS in devcontainers (#1997) @bdice
- Update build infra to support new branching strategy (#1994) @robertmaynard
- Move more implementations to precompiled shared library (part 2) (#1983) @bdice
- Use GCC 14 in conda builds. (#1963) @vyasr
v25.06.00
🚨 Breaking Changes
- Convert part of RMM to a precompiled library (#1896) @bdice
- Move RMM C++ code into cpp directory. (#1883) @bdice
🐛 Bug Fixes
- Run system MR tests in isolation. (#1945) @bdice
- Use auditwheel to properly retag the wheel (#1913) @vyasr
- Fix logger macros (#1884) @vyasr
📖 Documentation
- Move docs to top level. (#1917) @bdice
- Update Readme for the logging set_level (#1911) @JigaoLuo
- Fixed documentation example for DeviceBuffer.to_device (#1881) @TomAugspurger
🚀 New Features
- Convert part of RMM to a precompiled library (#1896) @bdice
- Set mempool hw_decompress flag if driver supports it (#1875) @bdice
- Expose option to enable fabric memory handle support to Python (#1787) @pentschev
🛠️ Improvements
- fix(pytest): disable warning that gets raised to INTERNALERROR in pytest8.4.0 (#1942) @gforsyth
- use 'rapids-init-pip' in wheel CI, other CI changes (#1926) @jameslamb
- Finish CUDA 12.9 migration and use branch-25.06 workflows (#1921) @bdice
- Update to clang 20 (#1918) @bdice
- Quote head_rev in conda recipes (#1915) @bdice
- Build and test with CUDA 12.9.0 (#1907) @bdice
- Fix cpp wheel name to librmm. (#1903) @bdice
- Revert "Publish wheels and conda packages from Github Artifacts" (#1898) @bdice
- Publish wheels and conda packages from Github Artifacts (#1897) @VenkateshJaya
- Download build artifacts from Github for CI (#1895) @VenkateshJaya
- remove mkdir and test corresponding shared workflow (#1892) @msarahan
- Revert "Auto-sync draft PRs" (#1891) @bdice
- Auto-sync draft PRs (#1890) @bdice
- Add ARM conda environments (#1889) @bdice
- Vendor RAPIDS.cmake to avoid network call. (#1886) @bdice
- [pre-commit.ci] pre-commit autoupdate (#1885) @pre-commit-ci[bot]
- Move RMM C++ code into cpp directory. (#1883) @bdice
- refactor(rattler): enable strict channel priority for builds (#1867) @gforsyth
- Add support for Python 3.13 (#1851) @bdice
- Streamlining wheel builds to use fixed location and uploading build artifacts to Github (#1810) @VenkateshJaya
v25.04.00
🚨 Breaking Changes
- Add OOM fail reason, attempted allocation size to exception messages (retry) (#1844) @pmattione-nvidia
- Use new rapids-logger library (#1808) @vyasr
🐛 Bug Fixes
- Revert "Set mempool hw_decompress flag if driver supports it (#1854)" (#1873) @wence-
- Fix run export on cudatoolkit (#1862) @vyasr
- Fix dependencies.yaml for update-version.sh (#1859) @raydouglass
- Embed __FILE__ as C-string for prefix replacement (#1858) @jakirkham
- Add OOM fail reason, attempted allocation size to exception messages (retry) (#1844) @pmattione-nvidia
- Revert "Add OOM fail reason, attempted allocation size to exception messages" (#1843) @pmattione-nvidia
- fix GITHUB_WORKSPACE not being present locally (#1841) @msarahan
- Add telemetry setup to build workflows (#1838) @bdice
- Use static gbench (#1837) @bdice
- Fixes for rattler recipe (#1835) @bdice
- Depend on rapids-logger in host to prevent redistribution (#1834) @bdice
- Add OOM fail reason, attempted allocation size to exception messages (#1827) @pmattione-nvidia
📖 Documentation
🚀 New Features
- Add async view memory resource bindings to Python. (#1864) @bdice
- Run examples in CI (#1850) @bdice
- Add tests for RMM internal macros. (#1847) @bdice
- Add basic example. (#1800) @bdice
🛠️ Improvements
- Set mempool hw_decompress flag if driver supports it (#1854) @wence-
- Error if LIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE is not defined. (#1852) @bdice
- Fix for -fdebug-prefix-map breaking sccache (#1846) @bdice
- fix(rattler): force cuda_major and date_string to be strings (#1842) @gforsyth
- use gha-tools rapids-telemetry-setup for mkdir -p (#1839) @msarahan
- fix(rattler): resolve all overlinking errors (#1836) @gforsyth
- Update rattler-build recipe with assorted small fixes (#1832) @gforsyth
- Sccache stats telemetry (#1830) @msarahan
- Consolidate more Conda solves in CI (#1828) @KyleFromNVIDIA
- Require CMake 3.30.4 (#1826) @robertmaynard
- Create Conda CI test env in one step (#1824) @KyleFromNVIDIA
- Apply IWYU changes and fix deprecated GTest usage (#1821) @bdice
- Remove unnecessary index (#1820) @vyasr
- Use shared-workflows branch-25.04 (#1816) @bdice
- Use rapids-pip-retry in CI jobs that might need retries (#1814) @gforsyth
- Use nightly matrix for branch tests. (#1813) @bdice
- Use build_type input (#1812) @bdice
- Add build_type to workflow inputs (#1811) @gforsyth
- Use new rapids-logger library (#1808) @vyasr
- Forward-merge branch-25.02 to branch-25.04 (#1806) @bdice
- disallow fallback to Make in wheel builds (#1804) @jameslamb
- Migrate to NVKS for amd64 CI runners (#1803) @bdice
- Branch 25.04 merge branch 25.02 (#1799) @vyasr
- Port to rattler-build (#1796) @gforsyth
v25.02.00
🚨 Breaking Changes
- Switch to using separate rapids-logger repo (#1774) @vyasr
- Remove deprecated factory functions from resource adaptors. (#1767) @bdice
- Remove rmm._lib (#1765) @Matt711
- Remove memory access flags from cuda_async_memory_resource (#1754) @abellina
- Create logger wrapper around spdlog that can be easily reused in other libraries (#1722) @vyasr
🐛 Bug Fixes
- Add missing array header include (#1771) @robertmaynard
- Remove memory access flags from cuda_async_memory_resource (#1754) @abellina
- Update build.sh (#1749) @vyasr
- Fix some logger issues (#1739) @vyasr
- Use consistent signature for target_link_libraries (#1738) @vyasr
📖 Documentation
🚀 New Features
- Make the stream module a part of the public API (#1775) @Matt711
- Remove deprecated factory functions from resource adaptors. (#1767) @bdice
- Remove rmm._lib (#1765) @Matt711
- Reduce dependencies on numba. (#1761) @bdice
- Use ruff, remove isort and black. (#1759) @bdice
- Use bindings layout for all cuda-python imports. (#1756) @bdice
- Add configuration for pre-commit.ci, update pre-commit hooks (#1746) @bdice
- Adds fabric handle and memory protection flags to cuda_async_memory_resource (#1743) @abellina
- Remove upper bounds on cuda-python to allow 12.6.2 and 11.8.5 (#1729) @bdice
🛠️ Improvements
- Revert CUDA 12.8 shared workflow branch changes (#1805) @vyasr
- Build and test with CUDA 12.8.0 (#1797) @bdice
- Disable exec checks for device_uvector::operator= (#1790) @miscco
- Add upper bound to prevent usage of numba 0.61.0 (#1789) @galipremsagar
- Add shellcheck to pre-commit and fix warnings (#1788) @gforsyth
- Add spdlog back as a requirement for now (#1780) @vyasr
- [pre-commit.ci] pre-commit autoupdate (#1778) @pre-commit-ci[bot]
- Use rapids-cmake for the logger (#1776) @vyasr
- Switch to using separate rapids-logger repo (#1774) @vyasr
- Use GCC 13 in CUDA 12 conda builds. (#1773) @bdice
- Check if nightlies have succeeded recently enough (#1772) @vyasr
- Fix codespell behavior. (#1769) @bdice
- Remove ignored cuda-python deprecation warning. (#1768) @bdice
- Forward-merge branch-24.12 to branch-25.02 (#1766) @bdice
- Update version references in workflow (#1757) @AyodeAwe
- gate telemetry dispatch calls on TELEMETRY_ENABLED env var (#1752) @msarahan
- Update cuda-python lower bounds to 12.6.2 / 11.8.5 (#1751) @bdice
- remove certs and simplify telemetry summarize (#1750) @msarahan
- stop installing 'wheel' in wheel-building script (#1748) @jameslamb
- Require approval to run CI on draft PRs (#1737) @bdice
- Create logger wrapper around spdlog that can be easily reused in other libraries (#1722) @vyasr
- Add breaking change workflow trigger (#1719) @AyodeAwe
v24.12.00
🚨 Breaking Changes
🐛 Bug Fixes
- Query total memory in failure_callback_resource_adaptor tests (#1734) @harrism
- Treat deprecation warnings as errors and fix deprecation warnings in replay benchmark (#1728) @harrism
- Disallow cuda-python 12.6.1 and 11.8.4 (#1720) @bdice
- Fix typos in .gitignore (#1697) @charlesbluca
- Fix rmm._lib imports (#1693) @Matt711
📖 Documentation
🚀 New Features
- Correct rmm tests for validity of device pointers (#1714) @robertmaynard
- Update rmm tests to use rapids_cmake_support_conda_env (#1707) @robertmaynard
- adding telemetry (#1692) @msarahan
- Make cudaMallocAsync logic non-optional as we require CUDA 11.2+ (#1667) @robertmaynard
🛠️ Improvements
- enforce wheel size limits, README formatting in CI (#1726) @jameslamb
- Remove all explicit usage of fmtlib (#1724) @harrism
- WIP: put a ceiling on cuda-python (#1723) @jameslamb
- use rapids-generate-pip-constraints to pin to oldest dependencies in CI (#1716) @jameslamb
- Deprecate rmm._lib (#1713) @Matt711
- print sccache stats in builds (#1712) @jameslamb
- [fea] Expose the arena mr to the Python interface. (#1711) @trivialfis
- devcontainer: replace VAULT_HOST with AWS_ROLE_ARN (#1708) @jjacobelli
- make conda installs in CI stricter (part 2) (#1703) @jameslamb
- Add BUILD_SHARED_LIBS option defaulting to ON (#1702) @wence-
- make conda installs in CI stricter (#1696) @jameslamb
- Prune workflows based on changed files (#1695) @KyleFromNVIDIA
- Deprecate support for directly accessing logger (#1690) @vyasr
- Use rmm::percent_of_free_device_memory in arena test (#1689) @wence-
- exclude 'gcovr' from list of development pip packages (#1688) @jameslamb
- [Improvement] Reorganize Cython to separate C++ bindings and make Cython classes public (#1676) @Matt711
v24.10.00
🚨 Breaking Changes
- Inline functions that return static references must have default visibility (#1653) @wence-
- Hide visibility of non-public symbols (#1644) @jameslamb
- Deprecate adaptor factories. (#1626) @bdice
🐛 Bug Fixes
- Add missing include to resource_ref.hpp (#1677) @miscco
- Remove the friend declaration with an attribute (#1669) @kingcrimsontianyu
- Fix build.sh clean to delete python build directory (#1658) @rongou
- Stream synchronize before deallocating SAM (#1655) @rongou
- Explicitly mark RMM headers with RMM_EXPORT (#1654) @robertmaynard
- Inline functions that return static references must have default visibility (#1653) @wence-
- Use tool.scikit-build.cmake.version (#1637) @KyleFromNVIDIA
📖 Documentation
- Recommend miniforge for conda install. (#1681) @bdice
- Fix docs cross reference in DeviceBuffer.prefetch (#1636) @bdice
🚀 New Features
- [FEA] Allow setting *_pool_size with human-readable string (#1670) @Matt711
- Update RMM adaptors, containers and tests to use get/set_current_device_resource_ref() (#1661) @harrism
- Deprecate adaptor factories. (#1626) @bdice
- Allow testing of earliest/latest dependencies (#1613) @seberg
- Add resource_ref versions of get/set_current_device_resource (#1598) @harrism
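For context on #1670 above, "human-readable string" sizes are typically values such as "1GiB" or "512 MB" converted to a byte count. The notes do not say which formats RMM accepts, so the sketch below is only an assumption of how such parsing could work. The `parse_size` helper and its unit table are hypothetical, not RMM's API.

```python
import re

# Hypothetical unit table: decimal (KB, MB, ...) and binary (KiB, MiB, ...) suffixes.
_UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30, "TIB": 2**40,
}


def parse_size(text: str) -> int:
    """Convert a string such as '1GiB' or '512 MB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # a bare number is treated as bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(float(number) * _UNITS[unit])


pool_bytes = parse_size("1GiB")
```

Keeping decimal and binary suffixes distinct matters for pool sizing: "1GB" and "1GiB" differ by roughly 7%.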
🛠️ Improvements
- Update update-version.sh to use packaging lib (#1685) @AyodeAwe
- Use CI workflow branch 'branch-24.10' again (#1683) @jameslamb
- Update fmt (to 11.0.2) and spdlog (to 1.14.1). (#1678) @jameslamb
- Attempt to address oom failures in test suite (#1672) @wence-
- Add support for Python 3.12 (#1666) @jameslamb
- Update rapidsai/pre-commit-hooks (#1663) @KyleFromNVIDIA
- Drop Python 3.9 support (#1659) @jameslamb
- Remove NumPy <2 pin (#1650) @seberg
- Hide visibility of non-public symbols (#1644) @jameslamb
- Update pre-commit hooks (#1643) @KyleFromNVIDIA
- Improve update-version.sh (#1640) @bdice
- Install headers into ${CMAKE_INSTALL_INCLUDEDIR} (#1633) @KyleFromNVIDIA
- Merge branch-24.08 into branch-24.10 (#1631) @jameslamb
v24.08.00
🚨 Breaking Changes
🐛 Bug Fixes
- Rename .devcontainers for CUDA 12.5 (#1615) @jakirkham
- Avoid accessing statistics_resource_adaptor stack top if it is empty (#1588) @harrism
- Avoid --find-links. (#1583) @bdice
- Fix test_python matrix (#1579) @KyleFromNVIDIA
- Allow anonymous user in devcontainer name (#1576) @bdice
📖 Documentation
- Instruct to create associated issue in PR template. (#1624) @harrism
- add rapids-build-backend to docs (#1614) @jameslamb
- Revert "Remove HTML builds of librmm (#1415)" (#1604) @bdice
- Add documentation for CPM usage (#1600) @pauleonix
- Update Thrust CMake Guide link in README.md (#1593) @pauleonix
🚀 New Features
- Prefetch resource adaptor (#1608) @bdice
- Add python wrapper for system memory resource (#1605) @rongou
- Refactor mr_ref_tests to not depend on MR base classes (#1589) @harrism
- Add system memory resource (#1581) @rongou
- Add rmm::prefetch() and DeviceBuffer.prefetch() (#1573) @harrism
🛠️ Improvements
- split up CUDA-suffixed dependencies in dependencies.yaml (#1627) @jameslamb
- Remove prefetch factory. (#1625) @bdice
- Use workflow branch 24.08 again (#1617) @KyleFromNVIDIA
- Build and test with CUDA 12.5.1 (#1607) @KyleFromNVIDIA
- skip CMake 3.30.0 (#1603) @jameslamb
- Add RMM_USE_NVTX cmake option to provide localized control of NVTX for RMM (#1602) @jlowe
- Use verify-alpha-spec hook (#1601) @KyleFromNVIDIA
- Avoid --find-links in wheel jobs (#1586) @jameslamb
- resolve dependency-file-generator warning, remove unnecessary rapids-build-backend configuration (#1582) @jameslamb
- Remove THRUST_WRAPPED_NAMESPACE and tests (#1578) @harrism
- Remove text builds of documentation (#1575) @vyasr
- ensure update-version.sh preserves alpha specs (#1572) @jameslamb
- Add available_device_memory to fetch free amount of memory on a GPU (#1567) @galipremsagar
- Add a stack to the statistics resource (#1563) @madsbk
- Use rapids-build-backend. (#1502) @bdice
v24.06.00
🚨 Breaking Changes
- Refactor polymorphic allocator to use device_async_resource_ref (#1555) @harrism
- Remove deprecated functionality (#1537) @harrism
- Remove deprecated cuda_async_memory_resource constructor that takes thrust::optional parameters (#1535) @harrism
- Remove deprecated supports_streams and get_mem_info methods. (#1519) @harrism
🐛 Bug Fixes
- rmm needs to link to nvtx3::nvtx3-cpp to support installed nvtx3 (#1569) @robertmaynard
- Make sure rmm wheel dependency on librmm is updated [skip ci] (#1565) @raydouglass
- Don't ignore GCC-specific warning under Clang (#1557) @aaronmondal
- Add publish jobs for C++ wheels (#1554) @vyasr
- Explicitly use the current device resource in DeviceBuffer (#1514) @wence-
📖 Documentation
- Allow specifying mr in DeviceBuffer construction, and document ownership requirements in Python/C++ interfacing (#1552) @wence-
- Fix Python install instruction (#1547) @wence-
- Update multi-gpu discussion for device_buffer and device_vector dtors (#1524) @wence-
- Fix ordering / heading levels in README.md and python example in guide.md (#1513) @harrism
🚀 New Features
- Add NVTX support and RMM_FUNC_RANGE() macro (#1558) @harrism
- Always use a static gtest (#1532) @robertmaynard
- Build C++ wheel (#1529) @vyasr
- Remove deprecated supports_streams and get_mem_info methods. (#1519) @harrism
🛠️ Improvements
- update copyright dates (#1564) @jameslamb
- Overhaul ops-codeowners (#1561) @raydouglass
- Adding support for cupy.cuda.stream.ExternalStream (#1559) @lilohuang
- Refactor polymorphic allocator to use device_async_resource_ref (#1555) @harrism
- add RAPIDS copyright pre-commit hook (#1553) @jameslamb
- Enable warnings as errors for Python tests (#1551) @mroeschke
- Remove header existence tests. (#1550) @bdice
- Only use functions in the limited API (#1545) @vyasr
- Migrate to {{ stdlib("c") }} (#1543) @hcho3
- Fix cuda11.8 nvcc dependency (#1542) @trxcllnt
- add --rm and --name to devcontainer run args (#1539) @trxcllnt
- Remove deprecated functionality (#1537) @harrism
- Remove deprecated cuda_async_memory_resource constructor that takes thrust::optional parameters (#1535) @harrism
- Make thrust_allocator deallocate safe in multi-device setting (#1533) @wence-
- Move rmm Python package to subdirectory (#1526) @vyasr
- Remove a file not being used (#1521) @galipremsagar
- Remove unneeded update-version.sh update (#1520) @AyodeAwe
- Enable all tests for arm arch (#1510) @galipremsagar
v24.04.00
🚨 Breaking Changes
- Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
- Replace all internal usage of get_upstream with get_upstream_resource (#1491) @miscco
- Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
- Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
- Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
- Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism
🐛 Bug Fixes
- Fix search path for torch allocator in editable installs and ensure CUDA support is available (#1498) @vyasr
- Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
- Run STATISTICS_TEST and TRACKING_TEST in serial to avoid OOM errors. (#1487) @bdice
📖 Documentation
🚀 New Features
- Replace all internal usage of get_upstream with get_upstream_resource (#1491) @miscco
- Add complete set of resource ref aliases (#1479) @nvdbaranec
- Automate include grouping using clang-format (#1463) @harrism
- Add get_upstream_resource to resource adaptors (#1456) @miscco
- Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
- Remove duplicated memory_resource_tests (#1451) @miscco
- Change rmm::exec_policy to take async_resource_ref (#1449) @miscco
- Change device_scalar to take async_resource_ref (#1447) @miscco
- Add device_async_resource_ref convenience alias (#1441) @harrism
- Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
- Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
- Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism
- Support CUDA 12.2 (#1419) @jameslamb
🛠️ Improvements
- Use conda env create --yes instead of --force (#1509) @bdice
- Add upper bound to prevent usage of NumPy 2 (#1501) @bdice
- Remove hard-coding of RAPIDS version where possible (#1496) @KyleFromNVIDIA
- Require NumPy 1.23+ (#1488) @jakirkham
- Use rmm::device_async_resource_ref in multi_stream_allocation benchmark (#1482) @miscco
- Update devcontainers to CUDA Toolkit 12.2 (#1470) @trxcllnt
- Add support for Python 3.11 (#1469) @jameslamb
- target branch-24.04 for GitHub Actions workflows (#1468) @jameslamb
- [FEA]: Use std::optional instead of thrust::optional (#1464) @miscco
- Add environment-agnostic scripts for running ctests and pytests (#1462) @trxcllnt
- Ensure that ctest is called with --no-tests=error. (#1460) @bdice
- Update ops-bot.yaml (#1458) @AyodeAwe
- Adopt the rmm::device_async_resource_ref alias (#1454) @miscco
- Refactor error.hpp out of detail (#1439) @lamarrr