- Disable NN Descent Batch tests temporarily (#2453) @divyegala
- Fix sed syntax in `update-version.sh` (#2441) @raydouglass
- Use runtime check of cudart version for eig (#2430) @lowener
- [BUG] Fix bitset function visibility (#2429) @lowener
- Exclude any kernel symbol that uses cutlass (#2425) @robertmaynard
- [Feat] add `repeat`, `sparsity`, `eval_n_elements` APIs to `bitset` (#2439) @rhdong
- [Opt] Enforce the UT Coverity and add benchmark for `transpose` (#2438) @rhdong
- [FEA] Support for half-float mixed precise in brute-force (#2382) @rhdong
- bump NCCL floor to 2.19 (#2458) @jameslamb
- Deprecating vector search APIs and updating README accordingly (#2448) @cjnolet
- Update update-version.sh to use packaging lib (#2447) @AyodeAwe
- Switch traceback to `native` (#2446) @galipremsagar
- bump NCCL floor to 2.18.1.1 (#2443) @jameslamb
- Add missing `cuda_suffixed: true` (#2440) @trxcllnt
- Use CI workflow branch 'branch-24.10' again (#2437) @jameslamb
- Update to flake8 7.1.1. (#2435) @bdice
- Update fmt (to 11.0.2) and spdlog (to 1.14.1). (#2433) @jameslamb
- Allow coo_sort to work on int64_t indices (#2432) @benfred
- Adding NCCL clique to the RAFT handle (#2431) @viclafargue
- Add support for Python 3.12 (#2428) @jameslamb
- Update rapidsai/pre-commit-hooks (#2420) @KyleFromNVIDIA
- Drop Python 3.9 support (#2417) @jameslamb
- Use CUDA math wheels (#2415) @KyleFromNVIDIA
- Remove NumPy <2 pin (#2414) @seberg
- Update pre-commit hooks (#2409) @KyleFromNVIDIA
- Improve update-version.sh (#2408) @bdice
- Use tool.scikit-build.cmake.version, set scikit-build-core minimum-version (#2406) @jameslamb
- [FEA] Batching NN Descent (#2403) @jinsolp
- Update pip devcontainers to UCX v1.17.0 (#2401) @jameslamb
- Merge branch-24.08 into branch-24.10 (#2397) @jameslamb
- [Refactor] move `popc` to under util (#2394) @rhdong
- [Opt] Expose the `detail::popc` as public API (#2346) @rhdong
- Add timeout to UCXX generic operations (#2398) @pentschev
- [Fix] bitmap set/test issue (#2371) @rhdong
- Fix 0 recall issue in `raft_cagra_hnswlib` ANN benchmark (#2369) @divyegala
- Fix `ef` setting in HNSW wrapper (#2367) @divyegala
- Fix cagra graph opt bug (#2365) @enp1s0
- Fix a bug where the wrong API is used to free the memory (#2361) @PointKernel
- Allow anonymous user in devcontainer name (#2355) @bdice
- Fix compilation error when _CLK_BREAKDOWN is defined in cagra. (#2350) @jiangyinzuo
- ensure raft-dask wheel tests install pylibraft wheel from the same CI run, fix wheel dependencies (#2349) @jameslamb
- Change --config-setting to --config-settings (#2342) @KyleFromNVIDIA
- Add workaround for syevd in CUDA 12.0 (#2332) @lowener
- [FEA] add the support of `masked_matmul` (#2362) @rhdong
- [FEA] Dice Distance for Dense Inputs (#2359) @aamijar
- [Opt] Expose the `detail::popc` as public API (#2346) @rhdong
- Enable distance return for NN Descent (#2345) @jinsolp
- [Refactor] move `popc` to under util (#2394) @rhdong
- split up CUDA-suffixed dependencies in dependencies.yaml (#2388) @jameslamb
- Use workflow branch 24.08 again (#2385) @KyleFromNVIDIA
- Add cusparseSpMV_preprocess to cusparse wrapper (#2384) @Kh4ster
- Consolidate SUM reductions (#2381) @mfoerste4
- Use slicing kernel to copy distances inside NN Descent (#2380) @jinsolp
- Build and test with CUDA 12.5.1 (#2378) @KyleFromNVIDIA
- Add CUDA_STATIC_MATH_LIBRARIES (#2376) @KyleFromNVIDIA
- skip CMake 3.30.0 (#2375) @jameslamb
- Use verify-alpha-spec hook (#2373) @KyleFromNVIDIA
- Binarize Dice Distance for Dense Inputs (#2370) @aamijar
- [FEA] Add distance epilogue for NN Descent (#2364) @jinsolp
- resolve dependency-file-generator warning, other rapids-build-backend followup (#2360) @jameslamb
- Remove text builds of documentation (#2354) @vyasr
- Use default init in reduction (#2351) @akifcorduk
- ensure update-version.sh preserves alpha spec, add tests on version constants (#2344) @jameslamb
- remove unnecessary 'setuptools' dependencies (#2343) @jameslamb
- Use rapids-build-backend (#2331) @KyleFromNVIDIA
- Add FAISS with RAFT enabled Benchmarking to raft-ann-bench (#2026) @tarang-jain
- Rename raft-ann-bench module to raft_ann_bench (#2333) @KyleFromNVIDIA
- Scaling workspace resources (#2322) @achirkin
- [REVIEW] Adjust UCX dependencies (#2304) @pentschev
- Convert device_memory_resource* to device_async_resource_ref (#2269) @harrism
- Fix import of VERSION file in raft-ann-bench (#2338) @KyleFromNVIDIA
- Rename raft-ann-bench module to raft_ann_bench (#2333) @KyleFromNVIDIA
- Support building faiss main statically (#2323) @robertmaynard
- Refactor spectral scale_obs to use existing normalization function (#2319) @ChuckHastings
- Correct initializer list order found by cuvs (#2317) @robertmaynard
- ANN_BENCH: enable move semantics for configured_raft_resources (#2311) @achirkin
- Revert "Build C++ wheel (#2264)" (#2305) @vyasr
- Revert "Add `compile-library` by default on pylibraft build" (#2300) @vyasr
- Add VERSION to raft-ann-bench package (#2299) @KyleFromNVIDIA
- Remove nonexistent job from workflow (#2298) @vyasr
- `libucx` should be run dependency of `raft-dask` (#2296) @divyegala
- Fix clang intrinsic warning (#2292) @aaronmondal
- Replace too long index file name with hash in ANN bench (#2280) @tfeher
- Fix build command for C++ compilation (#2270) @lowener
- Fix a compilation error in CAGRA when enabling log output (#2262) @enp1s0
- Correct member initialization order (#2254) @robertmaynard
- Fix time computation in CAGRA notebook (#2231) @lowener
- Scaling workspace resources (#2322) @achirkin
- ANN_BENCH: AnnGPU::uses_stream() for optional algo GPU sync (#2314) @achirkin
- [FEA] Split Bitset code (#2295) @lowener
- [FEA] support of prefiltered brute force (#2294) @rhdong
- Always use a static gtest and gbench (#2265) @robertmaynard
- Build C++ wheel (#2264) @vyasr
- InnerProduct Distance Metric for CAGRA search (#2260) @tarang-jain
- [FEA] Add support for `select_k` on CSR matrix (#2140) @rhdong
- ANN_BENCH: common AnnBase::index_type (#2315) @achirkin
- ANN_BENCH: split instances of RaftCagra into multiple files (#2313) @achirkin
- ANN_BENCH: a global pool of result buffers across benchmark cases (#2312) @achirkin
- Remove the shared state and the mutex from NVTX internals (#2310) @achirkin
- docs: update README.md (#2308) @eltociear
- [REVIEW] Reenable raft-dask wheel tests requiring UCX-Py (#2307) @pentschev
- [REVIEW] Adjust UCX dependencies (#2304) @pentschev
- Overhaul ops-codeowners (#2303) @raydouglass
- Make thrust nosync execution policy the default thrust policy (#2302) @abc99lr
- InnerProduct testing for CAGRA+HNSW (#2297) @divyegala
- Enable warnings as errors for Python tests (#2288) @mroeschke
- Normalize dataset vectors in the CAGRA InnerProduct tests (#2287) @enp1s0
- Use dynamic version for raft-ann-bench (#2285) @KyleFromNVIDIA
- Make 'librmm' a 'host' dependency for conda packages (#2284) @jameslamb
- Fix comments in cpp/include/raft/neighbors/cagra_serialize.cuh (#2283) @jiangyinzuo
- Only use functions in the limited API (#2282) @vyasr
- define 'ucx' pytest marker (#2281) @jameslamb
- Migrate to `{{ stdlib("c") }}` (#2278) @hcho3
- add --rm and --name to devcontainer run args (#2275) @trxcllnt
- Update pip devcontainers to UCX v1.15.0 (#2274) @trxcllnt
- `#ifdef` out pragma deprecation warning messages (#2271) @trxcllnt
- Convert device_memory_resource* to device_async_resource_ref (#2269) @harrism
- Update the developer's guide with new copyright hook (#2266) @KyleFromNVIDIA
- Improve coalesced reduction performance for tall and thin matrices (up to 2.6x faster) (#2259) @Nyrio
- Adds missing files to `update-version.sh` (#2255) @AyodeAwe
- Enable all tests for `arm64` jobs (#2248) @galipremsagar
- Update nvtx3 link in cmake (#2246) @lowener
- Add CAGRA-Q subspace dim = 4 support (#2244) @enp1s0
- Get rid of `cuco::sentinel` namespace (#2243) @PointKernel
- Replace usages of raw `get_upstream` with `get_upstream_resource()` (#2207) @miscco
- Set the import mode for dask tests (#2142) @vyasr
- Add UCXX support (#1983) @pentschev
- Update pre-commit-hooks to v0.0.3 (#2239) @KyleFromNVIDIA
- MAINT: Simplify NCCL worker rank identification (#2228) @VibhuJawa
- Fix bug in blockRankedReduce (#2226) @akifcorduk
- Fix illegal access mean/stdev, sum add Kahan Summation (#2223) @mfoerste4
- Batch cutlass distance kernels along N matrix dim (#2215) @mdoijade
- Fix out of bounds access in sum kernel (#2183) @tfeher
- Fix ANN bench ground truth generation for k>1024 (#2180) @tfeher
- Fixing cusparse aligned address issue and adding note (#2179) @cjnolet
- Launch `neighborhood_recall` kernel on CUDA stream (#2156) @divyegala
- Add `compile-library` by default on pylibraft build (#2090) @lowener
- Add CAGRA-Q to ANN benchmarks (#2233) @achirkin
- Add CAGRA-Q build (compression) (#2213) @achirkin
- CAGRA-Q search (#2206) @enp1s0
- Demangle backtrace symbols on raft error (#2188) @achirkin
- Reapply: Support for fp16 in CAGRA and IVF-PQ (#2172) @achirkin
- Remove supports_streams from custom RAFT memory resources (#2121) @harrism
- [FEA] Add support for bitmap_view & the API of `bitmap_to_csr` (#2109) @rhdong
- Use `conda env create --yes` instead of `--force` (#2247) @bdice
- Align ucx version pinning with ucx-py/ucxx. (#2227) @bdice
- Add upper bound to prevent usage of NumPy 2 (#2222) @bdice
- Performance optimization of IVF-flat / select_k (#2221) @mfoerste4
- Replace local copyright check with pre-commit-hooks verify-copyright (#2220) @KyleFromNVIDIA
- Remove hard-coding of RAPIDS version where possible (#2219) @KyleFromNVIDIA
- Fix style. (#2214) @bdice
- Add explicit instantiations for IVF-PQ search kernels used in tests (#2212) @tfeher
- Improve RBC eps-neighborhood query performance (#2211) @mfoerste4
- Add test for spmm (#2210) @mfoerste4
- Only install necessary components in conda packages. (#2209) @bdice
- Automate C++ include file grouping and ordering using clang-format (#2202) @harrism
- Add support for Python 3.11, require NumPy 1.23+ (#2200) @jameslamb
- Pass `std::optional` instead of `thrust::optional` to RMM (#2199) @trxcllnt
- Update devcontainers to CUDA Toolkit 12.2 (#2192) @trxcllnt
- target branch-24.04 for GitHub Actions workflows (#2189) @jameslamb
- Fixing workaround for cuSPARSE bug with correct copy dimensions (#2185) @mfoerste4
- Allow topk larger than 1024 in CAGRA (#2181) @benfred
- IVF-FLAT support k > 256 (#2169) @mfoerste4
- Add environment-agnostic scripts for running ctests and pytests (#2165) @trxcllnt
- Ensure that `ctest` is called with `--no-tests=error`. (#2163) @bdice
- Update ops-bot.yaml (#2158) @AyodeAwe
- random sampling of dataset rows with improved memory utilization (#2155) @tfeher
- [FIX] Ensure hnswlib can be found from RAFT's build dir (#2145) @trxcllnt
- Improve analysis experience for ANN benchmarks (#2139) @achirkin
- Enable CAGRA index building without adding dataset to the index (#2126) @tfeher
- Add fused cosine 1-NN cutlass based kernel (#2125) @mdoijade
- Update raft for compatibility with the latest cuco (#2118) @PointKernel
- Support CUDA 12.2 (#2092) @jameslamb
- Cache IVF-PQ and select-warpsort kernel launch parameters to reduce latency (#1786) @achirkin
- Switch to scikit-build-core (#2051) @vyasr
- Update to CCCL 2.2.0. (#2049) @bdice
- Update `raft-ann-bench` output filenames and add features to plotting (#2043) @divyegala
- Remove selection_faiss (#2027) @benfred
- fix is_row/col_order for strided layouts (#2173) @mfoerste4
- Fix failing C++ tests and revert #2097, #2085. (#2168) @cjnolet
- Exclude tests from builds (#2162) @vyasr
- [HOTFIX] 24.02 Revert Random Sampling (#2144) @cjnolet
- Pin to pytest 7. (#2137) @bdice
- Conditionally include `hnsw` wrapper source in CMake (#2135) @divyegala
- [BUG] Fix `SPMM` strided view (#2124) @lowener
- Fixing small bug in CUSPARSE spmm w/ CUDA 12.2 (#2117) @cjnolet
- [BUG] Fix `num_cta_per_query` div (#2107) @lowener
- Remove extraneous host pinnings from libraft-headers-only. (#2102) @bdice
- Remove unneeded CI symbol excludes (#2098) @robertmaynard
- Properly taking ownership of nccl subcomm (and destroying it) (#2094) @cjnolet
- Fix `max_queries` for CAGRA (#2081) @lowener
- Fix compile failure on RTX 4090 (#2076) @JieFengWang
- Fix a crash in FAISS benchmark wrapper introduced in #2021 (#2062) @achirkin
- Correct function that wasn't returning a value (#2045) @robertmaynard
- Fixing small bug in raft-ann-bench (#2041) @cjnolet
- Make device_resources accessed from device_resources_manager thread-safe (#2030) @wphicks
- Fix ann-bench multithreading (#2021) @achirkin
- Fix `ci/checks/copyright.py` to mirror RAPIDS reference (#2008) @divyegala
- Fix pyproject versions (#2002) @vyasr
- Adding license info for wiki-all dataset (#2129) @cjnolet
- [DOC] Documentation updates for release 24.02 (#2093) @cjnolet
- Fix errors with ingroup exposed by doxygen 1.10 (#2079) @wphicks
- Fix a typo (#2070) @narangvivek10
- Add usage example for brute_force::build (#2029) @benfred
- Add filtering to vector search tutorial (#1996) @lowener
- Update to use rapids-cmake for all deps (#2096) @robertmaynard
- Add IVF-PQ example into the template project (#2091) @achirkin
- Support for fp16 in CAGRA and IVF-PQ (#2085) @achirkin
- Add random subsampling for IVF methods (#2077) @tfeher
- Update `raft-ann-bench` output filenames and add features to plotting (#2043) @divyegala
- Add brute_force index serialization (#2036) @wphicks
- Add eps-neighbor search via RBC (#2028) @mfoerste4
- `libraft` and `pylibraft` API for CAGRA build and HNSW search (#2022) @divyegala
- Export Pareto frontier in `raft-ann-bench.data_export` (#2009) @divyegala
- Implement maybe-owning multi-dimensional container (mdbuffer) (#1999) @wphicks
- Add support for 1024+ dim vectors in CAGRA search (#1994) @enp1s0
- Replace GEMM backend: cublas.gemm -> cublaslt.matmul (#1736) @achirkin
- Remove get_mem_info functions from RAFT custom memory resources (#2108) @harrism
- Replace call to mr::get_mem_info() (#2099) @harrism
- Allow topk larger than 1024 in CAGRA (#2097) @benfred
- Remove usages of rapids-env-update (#2095) @KyleFromNVIDIA
- Provide explicit pool size for pool_memory_resources and clean up includes (#2088) @harrism
- refactor CUDA versions in dependencies.yaml (#2086) @jameslamb
- ANN bench fix latency measurement overhead (#2084) @tfeher
- Remove hardcoded limit in `print_results` function (#2080) @narangvivek10
- [FEA] Add support for SDDMM by wrapping the cusparseSDDMM (#2067) @rhdong
- Benchmark brute force knn (#2063) @benfred
- [BUG] fix empty initialization of device_ndarray in pylibraft (#2061) @mfoerste4
- Improve parallelism of refine host (#2059) @anaruse
- Subsampling for IVF-PQ codebook generation (#2052) @abc99lr
- Switch to scikit-build-core (#2051) @vyasr
- Update to CCCL 2.2.0. (#2049) @bdice
- Use cuda::proclaim_return_type on device lambda. (#2048) @bdice
- Removing code that explicitly compares equality of rmm memory resources (#2047) @cjnolet
- Add public enum for select-k algorithm selection (#2046) @benfred
- Update dependencies.yaml to new pip index (#2042) @vyasr
- Remove RAFT_BUILD_WHEELS and standardize Python builds (#2040) @vyasr
- Fix ucx-py version pinning in dependencies.yaml. (#2035) @bdice
- [REVIEW] Fix typos in parameter tuning guide (#2034) @abc99lr
- Add AIR-Top-k reference (#2031) @tfeher
- Remove selection_faiss (#2027) @benfred
- Fixing json parse error in `raft-ann-bench.data_export` (#2025) @cjnolet
- Updating cagra build constraint (#2016) @cjnolet
- Update to fmt 10.1.1 and spdlog 1.12.0. (#1957) @bdice
- Enable host dataset for IVF-Flat (#1635) @tfeher
- add half/bfloat support to myInf and abs (#1592) @Kh4ster
- Update actions/labeler to v4 (#2037) @raydouglass
- pylibraft only depends on numpy at runtime, not build time. (#2013) @bdice
- Fixes to update-version.sh (#1991) @raydouglass
- Adjusting end-to-end start time so it doesn't include stream creation time (#1989) @cjnolet
- CAGRA graph optimizer: clamp rev_graph_count (#1987) @tfeher
- Catching conversion errors in data_export instead of fully failing (#1979) @cjnolet
- Fix syncing mechanism in `raft-ann-bench` C++ search (#1961) @divyegala
- Fixing hnswlib in latency mode (#1959) @cjnolet
- Fix `ucx-py` alpha version update for `raft-dask` (#1953) @divyegala
- Reduce NN Descent test threshold (#1946) @divyegala
- Fixes to new YAML config `raft-bench-ann` (#1945) @divyegala
- Set RNG seeds in NN Descent to diagnose flaky tests (#1931) @divyegala
- Fix FAISS CPU algorithm names in `raft-ann-bench` (#1916) @divyegala
- Increase iterations in NN Descent tests to avoid flakiness (#1915) @divyegala
- Fix filepath in `raft-ann-bench/split_groundtruth` module (#1911) @divyegala
- Remove dynamic entry-points from raft-ann-bench (#1910) @benfred
- Remove unnecessary dataset path check in ANN bench (#1908) @tfeher
- Fixing Googletests and re-enabling in CI (#1904) @cjnolet
- Fix NN Descent overflows (#1875) @divyegala
- Build fix for CUDA 12.2 (#1870) @benfred
- [BUG] Fix a bug in NN descent (#1869) @enp1s0
- Brute Force Index documentation fix (#1944) @lowener
- Add `wiki_all` dataset config and documentation. (#1918) @cjnolet
- Updates to raft-ann-bench docs (#1905) @cjnolet
- End-to-end vector search tutorial in docs (#1776) @cjnolet
- Adding `dry-run` option to `raft-ann-bench` (#1970) @cjnolet
- Add ANN bench scripts to generate ground truth (#1967) @tfeher
- CAGRA build + HNSW search (#1956) @divyegala
- Verify conda-cpp-post-build-checks (#1935) @robertmaynard
- Make all cuda kernels have hidden visibility (#1898) @robertmaynard
- Update rapids-cmake functions to non-deprecated signatures (#1884) @robertmaynard
- [FEA] Helpers for identifying contiguous layouts. (#1861) @trivialfis
- Add `raft::stats::neighborhood_recall` (#1860) @divyegala
- [FEA] Helpers and CodePacker for IVF-PQ (#1826) @tarang-jain
- Pinning fmt and spdlog for raft-ann-bench-cpu (#2018) @cjnolet
- Build concurrency for nightly and merge triggers (#2011) @bdice
- Using `EXPORT_SET` in `rapids_find_package_root` (#2006) @cjnolet
- Remove static checks for serialization size (#1997) @cjnolet
- Skipping bad json parse (#1990) @cjnolet
- Update select-k heuristic (#1985) @benfred
- ANN bench: use different offset for each thread (#1981) @tfeher
- Allow `raft-ann-bench/run` to continue after encountering bad YAML configs (#1980) @divyegala
- Add build and search params to `raft-ann-bench.data_export` CSVs (#1971) @divyegala
- Use new `rapids-dask-dependency` metapackage for managing dask versions (#1968) @galipremsagar
- Remove unused header (#1960) @wphicks
- Adding pool back in and fixing cagra benchmark params (#1951) @cjnolet
- Add constraints to `hnswlib` in `raft-bench-ann` (#1949) @divyegala
- Add support for iterating over batches in bfknn (#1947) @benfred
- Fix ANN bench latency (#1940) @tfeher
- Add YAML config files to run parameter sweeps for ANN benchmarks (#1929) @divyegala
- Relax ucx pinning (#1927) @vyasr
- Try using contiguous rank to fix cuda_visible_devices (#1926) @VibhuJawa
- Unpin `dask` and `distributed` for `23.12` development (#1925) @galipremsagar
- Adding `throughput` and `latency` modes to `raft-ann-bench` (#1920) @cjnolet
- Providing `aarch64` yaml environment files (#1914) @cjnolet
- CAGRA ANN bench: parse build options for IVF-PQ build algo (#1912) @tfeher
- Fix python script location in ANN bench description (#1906) @tfeher
- Refactor install/build guide. (#1899) @cjnolet
- Check return values of raft-ann-bench subprocess calls (#1897) @benfred
- ANN bench options to specify CAGRA graph and dataset locations (#1896) @cjnolet
- Add check-json to pre-commit linters, and fix invalid ann-bench JSON config (#1894) @benfred
- Use branch-23.12 workflows. (#1886) @bdice
- Setup Consistent Nightly Versions for Pip and Conda (#1880) @divyegala
- Fix and improve one-block radix select (#1878) @yong-wang
- [FEA] Improvements on bitset class (#1877) @lowener
- Branch 23.12 merge 23.10 (#1873) @AyodeAwe
- Branch 23.12 merge 23.10 (#1868) @cjnolet
- Replace `raft::random` calls to not use deprecated API (#1867) @lowener
- raft: Build CUDA 12.0 ARM conda packages. (#1853) @bdice
- Documentation for raft ANN benchmark containers. (#1833) @dantegd
- [FEA] Support vector deletion in ANN IVF (#1831) @lowener
- Provide a raft::copy overload for mdspan-to-mdspan copies (#1818) @wphicks
- Adding FAISS cpu to `raft-ann-bench` (#1814) @cjnolet
- Change CAGRA auto mode selection (#1830) @enp1s0
- Update CAGRA serialization (#1755) @benfred
- Improvements to ANN Benchmark Python scripts and docs (#1734) @divyegala
- Update to Cython 3.0.0 (#1688) @vyasr
- ANN-benchmarks: switch to use gbench (#1661) @achirkin
- [BUG] Fix a bug in the filtering operation in CAGRA multi-kernel (#1862) @enp1s0
- Fix conf file for benchmarking glove datasets (#1846) @dantegd
- raft-ann-bench package fixes for plotting and conf files (#1844) @dantegd
- Fix update-version.sh for all pyproject.toml files (#1839) @raydouglass
- Make RMM a run dependency of the raft-ann-bench conda package (#1838) @dantegd
- Printing actual exception in `require base set` (#1816) @cjnolet
- Adding rmm to `raft-ann-bench` dependencies (#1815) @cjnolet
- Use `conda mambabuild` not `mamba mambabuild` (#1812) @bdice
- Fix `raft-dask` naming in wheel builds (#1805) @divyegala
- neighbors::refine_host: check the dataset bounds (#1793) @achirkin
- [BUG] Fix search parameter check in CAGRA (#1784) @enp1s0
- IVF-Flat: fix search batching (#1764) @achirkin
- Using expanded distance computations in `pylibraft` (#1759) @cjnolet
- Fix ann-bench Documentation (#1754) @divyegala
- Make get_cache_idx a weak symbol with dummy template (#1733) @ahendriksen
- Fix IVF-PQ fused kernel performance problems (#1726) @achirkin
- Fix build.sh to enable NEIGHBORS_ANN_CAGRA_TEST (#1724) @enp1s0
- Fix template types for create_descriptor function. (#1680) @csadorf
- Fix the CAGRA paper citation (#1788) @enp1s0
- Add citation info for the CAGRA paper preprint (#1787) @enp1s0
- [DOC] Fix grouping for ANN in C++ doxygen (#1782) @lowener
- Update RAFT documentation (#1717) @lowener
- Additional polishing of README and docs (#1713) @cjnolet
- [FEA] Add `bitset_filter` for CAGRA indices removal (#1837) @lowener
- ann-bench: miscellaneous improvements (#1808) @achirkin
- [FEA] Add bitset for ANN pre-filtering and deletion (#1803) @lowener
- Adding config files for remaining (relevant) ann-benchmarks million-scale datasets (#1761) @cjnolet
- Port NN-descent algorithm to use in `cagra::build()` (#1748) @divyegala
- Adding conda build for libraft static (#1746) @cjnolet
- [FEA] Provide device_resources_manager for easy generation of device_resources (#1716) @wphicks
- Add option to brute_force index to store non-owning reference to norms (#1865) @benfred
- Pin `dask` and `distributed` for `23.10` release (#1864) @galipremsagar
- Update image names (#1835) @AyodeAwe
- Fixes for OOM during CAGRA benchmarks (#1832) @benfred
- Change CAGRA auto mode selection (#1830) @enp1s0
- Update to clang 16.0.6. (#1829) @bdice
- Add IVF-Flat C++ example (#1828) @tfeher
- matrix::select_k: extra tests and benchmarks (#1821) @achirkin
- Add index class for brute_force knn (#1817) @benfred
- [FEA] Add pre-filtering to CAGRA (#1811) @enp1s0
- More updates to ann-bench docs (#1810) @cjnolet
- Add best deep-100M configs for IVF-PQ to ANN benchmarks (#1807) @tfeher
- A few fixes to `raft-ann-bench` recipe and docs (#1806) @cjnolet
- Simplify wheel build scripts and allow alphas of RAPIDS dependencies (#1804) @divyegala
- Various fixes to reproducible benchmarks (#1800) @cjnolet
- ANN-bench: more flexible cuda_stub.hpp (#1792) @achirkin
- Add RAFT devcontainers (#1791) @trxcllnt
- Cagra memory optimizations (#1790) @benfred
- Fixing a couple security concerns in `raft-dask` nccl unique id generation (#1785) @cjnolet
- Don't serialize dataset with CAGRA bench (#1781) @benfred
- Use `copy-pr-bot` (#1774) @ajschmidt8
- Add GPU and CPU packages for ANN benchmarks (#1773) @dantegd
- Improvements to raft-ann-bench scripts, docs, and benchmarking implementations. (#1769) @cjnolet
- [REVIEW] Introducing host API for PCG (#1767) @vinaydes
- Unpin `dask` and `distributed` for `23.10` development (#1760) @galipremsagar
- Add ivf-flat notebook (#1758) @tfeher
- Update CAGRA serialization (#1755) @benfred
- Remove block size template parameter from CAGRA search (#1740) @enp1s0
- Add NVTX ranges for cagra search/serialize functions (#1737) @benfred
- Improvements to ANN Benchmark Python scripts and docs (#1734) @divyegala
- Fixing forward merger for 23.08 -> 23.10 (#1731) @cjnolet
- [FEA] Use CAGRA in C++ template (#1730) @lowener
- fixed box around raft image (#1710) @nwstephens
- Enable CUTLASS-based distance kernels on CTK 12 (#1702) @ahendriksen
- Update bench-ann configuration (#1696) @lowener
- Update to Cython 3.0.0 (#1688) @vyasr
- Update CMake version (#1677) @vyasr
- Branch 23.10 merge 23.08 (#1672) @vyasr
- ANN-benchmarks: switch to use gbench (#1661) @achirkin
- Separate CAGRA index type from internal idx type (#1664) @tfeher
- Stop using setup.py in build.sh (#1645) @vyasr
- CAGRA max_queries auto configuration (#1613) @enp1s0
- Rename the CAGRA prune function to optimize (#1588) @enp1s0
- CAGRA pad dataset for 128bit vectorized load (#1505) @tfeher
- Sparse Pairwise Distances API Updates (#1502) @divyegala
- Cagra index construction without copying device mdarrays (#1494) @tfeher
- [FEA] Masked NN for connect_components (#1445) @tarang-jain
- Limiting workspace memory resource (#1356) @achirkin
- Remove push condition on docs-build (#1693) @raydouglass
- IVF-PQ: Fix illegal memory access with large max_samples (#1685) @achirkin
- Fix missing parameter for select_k (#1682) @ucassjy
- Separate CAGRA index type from internal idx type (#1664) @tfeher
- Add rmm to pylibraft run dependencies, since it is used by Cython. (#1656) @bdice
- Hotfix: wrong constant in IVF-PQ fp_8bit2half (#1654) @achirkin
- Fix sparse KNN for large batches (#1640) @viclafargue
- Fix uploading of RAFT nightly packages (#1638) @dantegd
- Fix cagra multi CTA bug (#1628) @enp1s0
- pass correct stream to cutlass kernel launch of L2/cosine pairwise distance kernels (#1597) @mdoijade
- Fix launchconfig y-gridsize too large in epilogue kernel (#1586) @mfoerste4
- Fix update version and pinnings for 23.08. (#1556) @bdice
- Fix for function exposing KNN merge (#1418) @viclafargue
- Critical doc fixes and updates for 23.08 (#1705) @cjnolet
- Fix the documentation about changing the logging level (#1596) @enp1s0
- Fix raft::bitonic_sort small usage example (#1580) @enp1s0
- Use rapids-cmake new parallel testing feature (#1623) @robertmaynard
- Add support for row-major slice (#1591) @lowener
- IVF-PQ tutorial notebook (#1544) @achirkin
- [FEA] Masked NN for connect_components (#1445) @tarang-jain
- raft: Build CUDA 12 packages (#1388) @vyasr
- Limiting workspace memory resource (#1356) @achirkin
- Pin `dask` and `distributed` for `23.08` release (#1711) @galipremsagar
- Add algo parameter for CAGRA ANN bench (#1687) @tfeher
- ANN benchmarks python wrapper for splitting billion-scale dataset groundtruth (#1679) @divyegala
- Rename CAGRA parameter num_parents to search_width (#1676) @tfeher
- Renaming namespaces to promote CAGRA from experimental (#1666) @cjnolet
- CAGRA Python wrappers (#1665) @dantegd
- Add notebook for Vector Search - Question Retrieval (#1662) @lowener
- Fix CMake CUDA support for pylibraft when raft is found. (#1659) @bdice
- Cagra ANN benchmark improvements (#1658) @tfeher
- ANN-benchmarks: avoid using the dataset during search when possible (#1657) @achirkin
- Revert CUDA 12.0 CI workflows to branch-23.08. (#1652) @bdice
- ANN: Optimize host-side refine (#1651) @achirkin
- Cagra template instantiations (#1650) @tfeher
- Modify comm_split to avoid ucp (#1649) @ChuckHastings
- Stop using setup.py in build.sh (#1645) @vyasr
- IVF-PQ: Add a (faster) direct conversion fp8->half (#1644) @achirkin
- Simplify `bench/ann` scripts to Python based module (#1642) @divyegala
- Further removal of uses-setup-env-vars (#1639) @dantegd
- Drop blank line in `raft-dask/meta.yaml` (#1637) @jakirkham
- Enable conservative memory allocations for RAFT IVF-Flat benchmarks. (#1634) @tfeher
- [FEA] Codepacking for IVF-flat (#1632) @tarang-jain
- Fixing ann bench cmake (and docs) (#1630) @cjnolet
- [WIP] Test CI issues (#1626) @VibhuJawa
- Set pool memory resource for raft IVF ANN benchmarks (#1625) @tfeher
- Adding sort option to matrix::select_k api (#1615) @cjnolet
- CAGRA max_queries auto configuration (#1613) @enp1s0
- Use exceptions instead of `exit(-1)` (#1594) @benfred
- [REVIEW] Add scheduler_file argument to support MNMG setup (#1593) @VibhuJawa
- Rename the CAGRA prune function to optimize (#1588) @enp1s0
- This PR adds support to __half and nv_bfloat16 to myAtomicReduce (#1585) @Kh4ster
- [IMP] move core CUDA RT macros to cuda_rt_essentials.hpp (#1584) @MatthiasKohl
- preprocessor syntax fix (#1582) @AyodeAwe
- use rapids-upload-docs script (#1578) @AyodeAwe
- Unpin `dask` and `distributed` for development and fix `merge_labels` test (#1574) @galipremsagar
- Remove documentation build scripts for Jenkins (#1570) @ajschmidt8
- Add support to __half and nv_bfloat16 to most math functions (#1554) @Kh4ster
- Add RAFT ANN benchmark for CAGRA (#1552) @enp1s0
- Update CAGRA knn_graph_sort to use Raft::bitonic_sort (#1550) @enp1s0
- Add identity matrix function (#1548) @lowener
- Unpin scikit-build upper bound (#1547) @vyasr
- Migrate wheel workflow scripts locally (#1546) @divyegala
- Add sample filtering for ivf_flat. Filtering code refactoring and cleanup (#1541) @alexanderguzhva
- CAGRA pad dataset for 128bit vectorized load (#1505) @tfeher
- Sparse Pairwise Distances API Updates (#1502) @divyegala
- Add CAGRA gbench (#1496) @tfeher
- Cagra index construction without copying device mdarrays (#1494) @tfeher
- ivf-pq::search: fix the indexing type of the query-related mdspan arguments (#1539) @achirkin
- Dropping Python 3.8 (#1454) @divyegala
- [HOTFIX] Fix distance metrics L2/cosine/correlation when X & Y are same buffer but with different shape and add unit test for such case. (#1571) @mdoijade
- Using raft::resources in rsvd (#1543) @cjnolet
- ivf-pq::search: fix the indexing type of the query-related mdspan arguments (#1539) @achirkin
- Check python brute-force knn inputs (#1537) @benfred
- Fix failing TiledKNNTest unittest (#1533) @benfred
- ivf-flat: fix incorrect recomputed size of the index (#1525) @achirkin
- ivf-flat: limit the workspace size of the search via batching (#1515) @achirkin
- Support uint64_t in CAGRA index data type (#1514) @enp1s0
- Workaround for cuda 12 issue in cusparse (#1508) @cjnolet
- Un-scale output distances (#1499) @achirkin
- Inline get_cache_idx (#1492) @ahendriksen
- Pin to scikit-build<17.2 (#1487) @vyasr
- Remove pool_size() calls from debug printouts (#1484) @tfeher
- Add missing ext declaration for log detail::format (#1482) @tfeher
- Remove include statements from inside namespace (#1467) @robertmaynard
- Use pin_compatible to ensure that lower CTKs can be used (#1462) @vyasr
- fix ivf_pq n_probes (#1456) @benfred
- The glog project root CMakeLists.txt is where we should build from (#1442) @robertmaynard
- Add missing resource factory virtual destructor (#1433) @cjnolet
- Removing cuda stream view include from mdarray (#1429) @cjnolet
- Fix dim param for IVF-PQ wrapper in ANN bench (#1427) @tfeher
- Remove MetricProcessor code from brute_force::knn (#1426) @benfred
- Fix is_min_close (#1419) @benfred
- Have consistent compile lines between BUILD_TESTS enabled or not (#1401) @robertmaynard
- Fix ucx-py pin in raft-dask recipe (#1396) @vyasr
- Various updates to the docs for 23.06 release (#1538) @cjnolet
- Rename kernel arch finding function for dispatch (#1536) @mdoijade
- Adding bfknn and ivf-pq python api to docs (#1507) @cjnolet
- Add RAPIDS cuDF as a library that supports cuda_array_interface (#1444) @miguelusque
- IVF-PQ: manipulating individual lists (#1298) @achirkin
- Gram matrix support for sparse input (#1296) @mfoerste4
- [FEA] Add randomized svd from cusolver (#1000) @lowener
- Require Numba 0.57.0+ (#1559) @jakirkham
- remove device_resources include from linalg::map (#1540) @benfred
- Learn heuristic to pick fastest select_k algorithm (#1523) @benfred
- [REVIEW] make raft::cache::Cache protected to allow overrides (#1522) @mfoerste4
- [REVIEW] Fix padding assertion in sparse Gram evaluation (#1521) @mfoerste4
- run docs nightly too (#1520) @AyodeAwe
- Switch back to using primary shared-action-workflows branch (#1519) @vyasr
- Python API for IVF-Flat serialization (#1516) @tfeher
- Introduce sample filtering to IVFPQ index search (#1513) @alexanderguzhva
- Migrate from raft::device_resources -> raft::resources (#1510) @benfred
- Use rmm allocator in CAGRA prune (#1503) @enp1s0
- Update recipes to GTest version >=1.13.0 (#1501) @bdice
- Remove raft/matrix/matrix.cuh includes (#1498) @benfred
- Generate dataset of select_k times (#1497) @benfred
- Re-use memory pool between benchmark runs (#1495) @benfred
- Support CUDA 12.0 for pip wheels (#1489) @divyegala
- Update cupy dependency (#1488) @vyasr
- Enable sccache hits from local builds (#1478) @AyodeAwe
- Build wheels using new single image workflow (#1477) @vyasr
- Revert shared-action-workflows pin (#1475) @divyegala
- CAGRA: Separate graph index sorting functionality from prune function (#1471) @enp1s0
- Add generic reduction functions and separate reductions/warp_primitives (#1470) @akifcorduk
- [ENH] [FINAL] Header structure: combine all PRs into one (#1469) @ahendriksen
- use `matrix::select_k` in brute_force::knn call (#1463) @benfred
- Dropping Python 3.8 (#1454) @divyegala
- Fix linalg::map to work with non-power-of-2-sized types again (#1453) @ahendriksen
- [ENH] Enable building with clang (limit strict error checking to GCC) (#1452) @ahendriksen
- Remove usage of rapids-get-rapids-version-from-git (#1436) @jjacobelli
- Minor Updates to Sparse Structures (#1432) @divyegala
- Use nvtx3 includes. (#1431) @bdice
- Remove wheel pytest verbosity (#1424) @sevagh
- Add python bindings for matrix::select_k (#1422) @benfred
- Using `raft::resources` across `raft::random` (#1420) @cjnolet
- Generate build metrics report for test and benchmarks (#1414) @divyegala
- Update clang-format to 16.0.1. (#1412) @bdice
- Use ARC V2 self-hosted runners for GPU jobs (#1410) @jjacobelli
- Remove uses-setup-env-vars (#1406) @vyasr
- Resolve conflicts in auto-merger of `branch-23.06` and `branch-23.04` (#1403) @galipremsagar
- Adding base header-only conda package without cuda math libs (#1386) @cjnolet
- Fix IVF-PQ API to use `device_vector_view` (#1384) @lowener
- Branch 23.06 merge 23.04 (#1379) @vyasr
- Forward merge branch 23.04 into 23.06 (#1350) @cjnolet
- Fused L2 1-NN based on cutlass 3xTF32 / DMMA (#1118) @mdoijade
- Pin `dask` and `distributed` for release (#1399) @galipremsagar
- Remove faiss_mr.hpp (#1351) @benfred
- Removing FAISS from build (#1340) @cjnolet
- Generic linalg::map (#1337) @achirkin
- Consolidate pre-compiled specializations into single `libraft` binary (#1333) @cjnolet
- Generic linalg::map (#1329) @achirkin
- Update and standardize IVF indexes API (#1328) @viclafargue
- IVF-Flat index splitting (#1271) @lowener
- IVF-PQ: store cluster data in individual lists and reduce templates (#1249) @achirkin
- Fix for svd API (#1190) @lowener
- Remove deprecated headers (#1145) @lowener
- Fix primitives benchmarks (#1389) @ahendriksen
- Fixing index-url link on pip install docs (#1378) @cjnolet
- Adding some functions back in that seem to be a copy/paste error (#1373) @cjnolet
- Remove usage of Dask's `get_worker` (#1365) @pentschev
- Remove MANIFEST.in use auto-generated one for sdists and package_data for wheels (#1348) @vyasr
- Revert "Generic linalg::map (#1329)" (#1336) @cjnolet
- Small follow-up to specializations cleanup (#1332) @cjnolet
- Fixing select_k specializations (#1330) @cjnolet
- Fixing remaining bug in ann_quantized (#1327) @cjnolet
- Fixing a couple small kmeans bugs (#1274) @cjnolet
- Remove no longer instantiated templates from list of extern template declarations (#1272) @vyasr
- Bump pinned deps to 23.4 (#1266) @vyasr
- Fix the destruction of interruptible token registry (#1229) @achirkin
- Expose raft::handle_t in the public header (#1192) @vyasr
- Fix for svd API (#1190) @lowener
- Adding architecture diagram to README.md (#1370) @cjnolet
- Adding small readme image (#1354) @cjnolet
- Fix serialize documentation of ivf_flat (#1347) @lowener
- Small updates to docs (#1339) @cjnolet
- Add Options to Generate Build Metrics Report (#1369) @divyegala
- Generic linalg::map (#1337) @achirkin
- Generic linalg::map (#1329) @achirkin
- matrix::select_k specializations (#1268) @achirkin
- Use rapids-cmake new COMPONENT exporting feature (#1154) @robertmaynard
- Pin `dask` and `distributed` for release (#1399) @galipremsagar
- Pin cupy in wheel tests to supported versions (#1383) @vyasr
- CAGRA (#1375) @tfeher
- add a distance epilogue function to the bfknn call (#1371) @benfred
- Relax UCX pin to allow 1.14 (#1366) @pentschev
- Generate pyproject dependencies with dfg (#1364) @vyasr
- Add nccl to dependencies.yaml (#1361) @benfred
- Add extern template for ivfflat_interleaved_scan (#1360) @ahendriksen
- Stop setting package version attribute in wheels (#1359) @vyasr
- Fix ivf flat specialization header IdxT from uint64_t -> int64_t (#1358) @ahendriksen
- Remove faiss_mr.hpp (#1351) @benfred
- Rename optional helper function (#1345) @viclafargue
- Pass minimum target compile options through `raft::raft` (#1341) @cjnolet
- Removing FAISS from build (#1340) @cjnolet
- Add dispatch based on compute architecture (#1335) @ahendriksen
- Consolidate pre-compiled specializations into single `libraft` binary (#1333) @cjnolet
- Update and standardize IVF indexes API (#1328) @viclafargue
- Using int64_t specializations for `ivf_pq` and `refine` (#1325) @cjnolet
- Migrate as much as possible to pyproject.toml (#1324) @vyasr
- Pass `AWS_SESSION_TOKEN` and `SCCACHE_S3_USE_SSL` vars to conda build (#1321) @ajschmidt8
- Numerical stability fixes for l2 pairwise distance (#1319) @benfred
- Consolidate linter configuration into pyproject.toml (#1317) @vyasr
- IVF-Flat Python wrappers (#1316) @tfeher
- Add stream overloads to `ivf_pq` serialize/deserialize methods (#1315) @divyegala
- Temporary buffer to view host or device memory in device (#1313) @divyegala
- RAFT skeleton project template (#1312) @cjnolet
- Fix docs build to be `pydata-sphinx-theme=0.13.0` compatible (#1311) @galipremsagar
- Update to GCC 11 (#1309) @bdice
- Reduce compile times of distance specializations (#1307) @ahendriksen
- Fix docs upload path (#1305) @AyodeAwe
- Add end-to-end CUDA ann-benchmarks to raft (#1304) @cjnolet
- Make docs builds less verbose (#1302) @AyodeAwe
- Stop using versioneer to manage versions (#1301) @vyasr
- Adding util to get the device id for a pointer address (#1297) @cjnolet
- Enable dfg in pre-commit. (#1293) @vyasr
- Python API for brute-force KNN (#1292) @cjnolet
- support k up to 2048 in faiss select (#1287) @benfred
- CI: Remove specification of manual stage for check_style.sh script. (#1283) @csadorf
- New Sparse Matrix APIs (#1279) @cjnolet
- fix build on cuda 11.5 (#1277) @benfred
- IVF-Flat index splitting (#1271) @lowener
- Remove duplicate `librmm` runtime dependency (#1264) @ajschmidt8
- build.sh: Add option to log nvcc compile times (#1262) @ahendriksen
- Reduce error handling verbosity in CI tests scripts (#1259) @AjayThorve
- Update shared workflow branches (#1256) @ajschmidt8
- Keeping only compute similarity specializations for uint64_t for now (#1255) @cjnolet
- Fix compile time explosion for minkowski distance (#1254) @ahendriksen
- Unpin `dask` and `distributed` for development (#1253) @galipremsagar
- Remove gpuCI scripts. (#1252) @bdice
- IVF-PQ: store cluster data in individual lists and reduce templates (#1249) @achirkin
- Fix inconsistency between the building doc and CMakeLists.txt (#1248) @yong-wang
- Consolidating ANN benchmarks and tests (#1243) @cjnolet
- mdspan view for IVF-PQ API (#1236) @viclafargue
- Remove uint32 distance idx specializations (#1235) @cjnolet
- Add innerproduct to the pairwise distance api (#1226) @benfred
- Move date to build string in `conda` recipe (#1223) @ajschmidt8
- Replace faiss bfKnn (#1202) @benfred
- Expose KMeans `init_plus_plus` in pylibraft (#1198) @betatim
- Fix `ucx-py` version (#1184) @ajschmidt8
- Improve the performance of radix top-k (#1175) @yong-wang
- Add docs build job (#1168) @AyodeAwe
- Remove deprecated headers (#1145) @lowener
- Simplify distance/detail to make it easier to dispatch to different kernel implementations (#1142) @ahendriksen
- Initial port of auto-find-k (#1070) @cjnolet
- Remove faiss ANN code from knnIndex (#1121) @benfred
- Use `GenPC` (Permuted Congruential) as the default random number generator everywhere (#1099) @Nyrio
- Reverting a few commits from 23.02 and speeding up end-to-end build time (#1232) @cjnolet
- Update README.md: fix a missing word (#1185) @achirkin
- balanced-k-means: fix a too large initial memory pool size (#1148) @achirkin
- Catch signal handler change error (#1147) @tfeher
- Squared norm fix follow-up (change was lost in merge conflict) (#1144) @Nyrio
- IVF-Flat bug fix: the squared norm is required for expanded distance calculations (#1141) @Nyrio
- build.sh switch to use `RAPIDS` magic value (#1132) @robertmaynard
- Fix `euclidean_dist` in IVF-Flat search (#1122) @Nyrio
- Update handle docstring (#1103) @dantegd
- Pin libcusparse and libcusolver to avoid CUDA 12 (#1095) @wphicks
- Fix race condition in `raft::random::discrete` (#1094) @Nyrio
- Fixing libraft conda recipes (#1084) @cjnolet
- Ensure that we get the cuda version of faiss. (#1078) @vyasr
- Fix double definition error in ANN refinement header (#1067) @tfeher
- Specify correct global targets names to raft_export (#1054) @robertmaynard
- Fix concurrency issues in k-means++ initialization (#1048) @Nyrio
- Adding small comms tutorial to docs (#1204) @cjnolet
- Separating more namespaces into easier-to-consume sections (#1091) @cjnolet
- Paying down some tech debt on docs, runtime API, and cython (#1055) @cjnolet
- Add function to convert mdspan to a const view (#1188) @lowener
- Internal library to share headers between test and bench (#1162) @achirkin
- Add public API and tests for hierarchical balanced k-means (#1113) @Nyrio
- Export NCCL dependency as part of raft::distributed. (#1077) @vyasr
- Serialization of IVF Flat and IVF PQ (#919) @tfeher
- Pin `dask` and `distributed` for release (#1242) @galipremsagar
- Update shared workflow branches (#1241) @ajschmidt8
- Removing interruptible from basic handle sync. (#1224) @cjnolet
- pre-commit: Update isort version to 5.12.0 (#1215) @wence-
- Pin wheel dependencies to same RAPIDS release (#1200) @sevagh
- Serializer for mdspans (#1173) @hcho3
- Use CTK 118/cp310 branch of wheel workflows (#1169) @sevagh
- Enable shallow copy of `handle_t`'s resources with different workspace_resource (#1165) @cjnolet
- Protect balanced k-means out-of-memory in some cases (#1161) @achirkin
- Use squeuclidean for metric name in ivf_pq python bindings (#1160) @benfred
- ANN tests: make the min_recall check strict (#1156) @achirkin
- Make cutlass use static ctk (#1155) @sevagh
- Fix various build errors (#1152) @hcho3
- Remove faiss bfKnn call from fused_l2_knn unittest (#1150) @benfred
- Fix `unary_op` docs and add `map_offset` as an improved version of `write_only_unary_op` (#1149) @Nyrio
- Improvement of the math API wrappers (#1146) @Nyrio
- Changing handle_t to device_resources everywhere (#1140) @cjnolet
- Add L2SqrtExpanded support to ivf_pq (#1138) @benfred
- Adding workspace resource (#1137) @cjnolet
- Add raft::void_op functor (#1136) @ahendriksen
- IVF-PQ: tighten the test criteria (#1135) @achirkin
- Fix documentation author (#1134) @bdice
- Add L2SqrtExpanded support to ivf_flat ANN indices (#1133) @benfred
- Improvements in `matrix::gather`: test coverage, compilation errors, performance (#1126) @Nyrio
- Adding ability to use an existing stream in the pylibraft Handle (#1125) @cjnolet
- Remove faiss ANN code from knnIndex (#1121) @benfred
- Update builds for CUDA `11.8` and Python `3.10` (#1120) @ajschmidt8
- Update workflows for nightly tests (#1119) @ajschmidt8
- Enable `Recently Updated` Check (#1117) @ajschmidt8
- Build wheels alongside conda CI (#1116) @sevagh
- Allow host dataset for IVF-PQ (#1114) @tfeher
- Decoupling raft handle from underlying resources (#1111) @cjnolet
- Fixing an index error introduced in PR #1109 (#1110) @vinaydes
- Fixing the sample-without-replacement test failures (#1109) @vinaydes
- Remove faiss dependency from fused_l2_knn.cuh, selection_faiss.cuh, ball_cover.cuh and haversine_distance.cuh (#1108) @benfred
- Remove redundant operators in sparse/distance and move others to raft/core (#1105) @Nyrio
- Speedup `make_blobs` by up to 2x by fixing inefficient kernel launch configuration (#1100) @Nyrio
- Use `GenPC` (Permuted Congruential) as the default random number generator everywhere (#1099) @Nyrio
- Cleanup faiss includes (#1098) @benfred
- matrix::select_k: move selection and warp-sort primitives (#1085) @achirkin
- Exclude changelog from pre-commit spellcheck (#1083) @benfred
- Add GitHub Actions Workflows. (#1076) @bdice
- Adding uninstall option to build.sh (#1075) @cjnolet
- Use doctest for testing python example docstrings (#1073) @benfred
- Minor cython fixes / cleanup (#1072) @benfred
- IVF-PQ: tweak launch configuration (#1069) @achirkin
- Unpin `dask` and `distributed` for development (#1068) @galipremsagar
- Bifurcate Dependency Lists (#1065) @ajschmidt8
- Add support for 64bit svdeig (#1060) @lowener
- switch mma instruction shape to 1684 from current 1688 for 3xTF32 L2/cosine kernel (#1057) @mdoijade
- Make IVF-PQ build index in batches when necessary (#1056) @achirkin
- Remove unused setuputils modules (#1053) @vyasr
- Branch 23.02 merge 22.12 (#1051) @benfred
- Shared-memory-cached kernel for `reduce_cols_by_key` to limit atomic conflicts (#1050) @Nyrio
- Unify use of common functors (#1049) @Nyrio
- Replace k-means++ CPU bottleneck with a `random::discrete` prim (#1039) @Nyrio
- Add python bindings for kmeans fit (#1016) @benfred
- Add MaskedL2NN (#838) @ahendriksen
- Move contractions tiling logic outside of Contractions_NT (#837) @ahendriksen
- Make ucx linkage explicit and add a new CMake target for it (#1032) @vyasr
- IVF-Flat: make adaptive-centers behavior optional (#1019) @achirkin
- Remove make_mdspan template for memory_type enum (#1005) @wphicks
- ivf-pq performance tweaks (#926) @achirkin
- fusedL2NN: Add input alignment checks (#1045) @achirkin
- Fix fusedL2NN bug that can happen when the same point appears in both x and y (#1040) @Nyrio
- Fix trivial deprecated header includes (#1034) @achirkin
- Suppress ptxas stack size warning in Debug mode (#1033) @tfeher
- Don't use CMake 3.25.0 as it has a FindCUDAToolkit show stopping bug (#1029) @robertmaynard
- Fix for gemmi deprecation (#1020) @lowener
- Remove make_mdspan template for memory_type enum (#1005) @wphicks
- Add `except +` to cython extern cdef declarations (#1001) @benfred
- Changing Overloads for GCC 11/12 bug (#995) @divyegala
- Changing Overloads for GCC 11/12 bugs (#992) @divyegala
- Fix pylibraft docstring example code (#980) @benfred
- Update raft tests to compile with C++17 features enabled (#973) @robertmaynard
- Making ivf flat gtest invoke mdspanified APIs (#955) @cjnolet
- Updates to kmeans public API to fix cuml (#932) @cjnolet
- Fix logger (vsnprintf consumes args) (#917) @Nyrio
- Adding missing include for device mdspan in `mean_squared_error.cuh` (#906) @cjnolet
- Add links to the docs site in the README (#1042) @benfred
- Moving contributing and developer guides to main docs (#1006) @cjnolet
- Update compiler flags in build docs (#999) @cjnolet
- Updating minimum required gcc version (#993) @cjnolet
- important doc updates for core, cluster, and neighbors (#933) @cjnolet
- ANN refinement Python wrapper (#1052) @tfeher
- Add ANN refinement method (#1038) @tfeher
- IVF-Flat: make adaptive-centers behavior optional (#1019) @achirkin
- Add wheel builds (#1013) @vyasr
- Update cuSparse wrappers to avoid deprecated functions (#989) @wphicks
- Provide memory_type enum (#984) @wphicks
- Add Tests for kmeans API (#982) @lowener
- mdspanifying `weighted_mean` and add `raft::stats` tests (#910) @lowener
- Implement `raft::stats` API with mdspan (#802) @lowener
- Pin `dask` and `distributed` for release (#1062) @galipremsagar
- IVF-PQ: use device properties helper (#1035) @achirkin
- Make ucx linkage explicit and add a new CMake target for it (#1032) @vyasr
- Fixing broken doc functions and improving coverage (#1030) @cjnolet
- Expose cluster_cost to python (#1028) @benfred
- Adding lightweight cai_wrapper to reduce boilerplate (#1027) @cjnolet
- Change `raft` docs theme to `pydata-sphinx-theme` (#1026) @galipremsagar
- Revert "Pin `dask` and `distributed` for release" (#1023) @galipremsagar
- Pin `dask` and `distributed` for release (#1022) @galipremsagar
- Replace `dots_along_rows` with `rowNorm` and improve `coalescedReduction` performance (#1011) @Nyrio
- Moving TestDeviceBuffer to `pylibraft.common.device_ndarray` (#1008) @cjnolet
- Add codespell as a linter (#1007) @benfred
- Fix environment channels (#996) @bdice
- Automatically sync handle when not passed to pylibraft functions (#987) @benfred
- Replace `normalize_rows` in `ann_utils.cuh` by a new `rowNormalize` prim and improve performance for thin matrices (small `n_cols`) (#979) @Nyrio
- Forward merge 22.10 into 22.12 (#978) @vyasr
- Use new rapids-cmake functionality for rpath handling. (#976) @vyasr
- Update cuda-python dependency to 11.7.1 (#975) @galipremsagar
- IVF-PQ Python wrappers (#970) @tfeher
- Remove unnecessary requirements for raft-dask. (#969) @vyasr
- Expose `linalg::dot` in public API (#968) @benfred
- Fix kmeans cluster templates (#966) @lowener
- Run linters using pre-commit (#965) @benfred
- linewiseop padded span test (#964) @mfoerste4
- Add unittest for `linalg::mean_squared_error` (#961) @benfred
- Exposing fused l2 knn to public APIs (#959) @cjnolet
- Remove a left over print statement from pylibraft (#958) @betatim
- Switch to using rapids-cmake for gbench. (#954) @vyasr
- Some cleanup of k-means internals (#953) @cjnolet
- Remove stale labeler (#951) @raydouglass
- Adding optional handle to each public API function (along with example) (#947) @cjnolet
- Improving documentation across the board. Adding quick-start to breathe docs. (#943) @cjnolet
- Add unittest for `linalg::axpy` (#942) @benfred
- Add cutlass 3xTF32,DMMA based L2/cosine distance kernels for SM 8.0 or higher (#939) @mdoijade
- Calculate max cluster size correctly for IVF-PQ (#938) @tfeher
- Add tests for `raft::matrix` (#937) @lowener
- Add fusedL2NN benchmark (#936) @Nyrio
- ivf-pq performance tweaks (#926) @achirkin
- Adding `fused_l2_nn_argmin` wrapper to Pylibraft (#924) @cjnolet
- Moving kernel gramm primitives to `raft::distance::kernels` (#920) @cjnolet
- kmeans improvements: random initialization on GPU, NVTX markers, no batching when using fusedL2NN (#918) @Nyrio
- Moving `raft::spatial::knn` -> `raft::neighbors` (#914) @cjnolet
- Create cub-based argmin primitive and replace `argmin_along_rows` in ANN kmeans (#912) @Nyrio
- Replace `map_along_rows` with `matrixVectorOp` (#911) @Nyrio
- Integrate `accumulate_into_selected` from ANN utils into `linalg::reduce_rows_by_keys` (#909) @Nyrio
- Re-enabling Fused L2 NN specializations and renaming `cub::KeyValuePair` -> `raft::KeyValuePair` (#905) @cjnolet
- Unpin `dask` and `distributed` for development (#886) @galipremsagar
- Adding padded layout 'layout_padded_general' (#725) @mfoerste4
- Separating mdspan/mdarray infra into host_* and device_* variants (#810) @cjnolet
- Remove type punning from TxN_t (#781) @wphicks
- ivf_flat::index: hide implementation details (#747) @achirkin
- ivf-pq integration: hotfixes (#891) @achirkin
- Removing cub symbol from libraft-distance instantiation. (#887) @cjnolet
- ivf-pq post integration hotfixes (#878) @achirkin
- Fixing a few compile errors in new APIs (#874) @cjnolet
- Include knn.cuh in knn.cu benchmark source for finding brute_force_knn (#855) @teju85
- Do not use strcpy to copy 2 char (#848) @mhoemmen
- rng_state not including necessary cstdint (#839) @MatthiasKohl
- Fix integer overflow in ANN kmeans (#835) @Nyrio
- Add alignment to the TxN_t vectorized type (#792) @achirkin
- Fix adj_to_csr_kernel (#785) @ahendriksen
- Use rapids-cmake 22.10 best practice for RAPIDS.cmake location (#784) @robertmaynard
- Remove type punning from TxN_t (#781) @wphicks
- Various fixes for build.sh (#771) @vyasr
- Fix target names in build.sh help text (#879) @Nyrio
- Document that minimum required CMake version is now 3.23.1 (#841) @robertmaynard
- mdspanify raft::random functions uniformInt, normalTable, fill, bernoulli, and scaled_bernoulli (#897) @mhoemmen
- mdspan-ify several raft::random rng functions (#857) @mhoemmen
- Develop new mdspan-ified multi_variable_gaussian interface (#845) @mhoemmen
- Mdspanify permute (#834) @mhoemmen
- mdspan-ify rmat_rectangular_gen (#833) @mhoemmen
- mdspanify sampleWithoutReplacement (#830) @mhoemmen
- mdspan-ify make_regression (#811) @mhoemmen
- Updating `raft::linalg` APIs to use `mdspan` (#809) @divyegala
- Integrate KNN implementation: ivf-pq (#789) @achirkin
- Some fixes for build.sh (#901) @cjnolet
- Revert recent fused l2 nn instantiations (#899) @cjnolet
- Update Python build instructions (#898) @betatim
- Adding ninja and cxx compilers to conda dev dependencies (#893) @cjnolet
- Output non-normalized distances in IVF-PQ and brute-force KNN (#892) @Nyrio
- Readme updates for 22.10 (#884) @cjnolet
- Breaking apart benchmarks into individual binaries (#883) @cjnolet
- Pin `dask` and `distributed` for release (#858) @galipremsagar
- Mdspanifying (currently tested) `raft::matrix` (#846) @cjnolet
- Separating _RAFT_HOST and _RAFT_DEVICE macros (#836) @cjnolet
- Updating cpu job in hopes it speeds up python cpu builds (#828) @cjnolet
- Mdspan-ifying `raft::spatial` (#827) @cjnolet
- Fixing init.py for handle and stream (#826) @cjnolet
- Moving a few more things around (#822) @cjnolet
- Use fusedL2NN in ANN kmeans (#821) @Nyrio
- Separating test executables (#820) @cjnolet
- Separating mdspan/mdarray infra into host_* and device_* variants (#810) @cjnolet
- Fix malloc/delete mismatch (#808) @mhoemmen
- Renaming `pyraft` -> `raft-dask` (#801) @cjnolet
- Branch 22.10 merge 22.08 (#800) @cjnolet
- Statically link all CUDA toolkit libraries (#797) @trxcllnt
- Minor follow-up fixes for ivf-flat (#796) @achirkin
- KMeans benchmarks (cuML + ANN implementations) and fix for IndexT=int64_t (#795) @Nyrio
- Optimize fusedL2NN when data is skinny (#794) @ahendriksen
- Complete the deprecation of duplicated hpp headers (#793) @ahendriksen
- Prepare parts of the balanced kmeans for ivf-pq (#788) @achirkin
- Unpin `dask` and `distributed` for development (#783) @galipremsagar
- Exposing python wrapper for the RMAT generator logic (#778) @teju85
- Device, Host, Managed Accessor Types for `mdspan` (#776) @divyegala
- Fix Forward-Merger Conflicts (#768) @ajschmidt8
- Fea 2208 kmeans use specializations (#760) @cjnolet
- ivf_flat::index: hide implementation details (#747) @achirkin
- Update `mdspan` to account for changes to `extents` (#751) @divyegala
- Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
- Integrate KNN implementation: ivf-flat (#652) @achirkin
- Moving kmeans from cuml to Raft (#605) @lowener
- Relax ivf-flat test recall thresholds (#766) @achirkin
- Restrict the use of `]` to CXX 20 only. (#764) @trivialfis
- Update rapids-cmake version for pyraft in update-version.sh (#749) @vyasr
- Use documented header template for doxygen (#773) @galipremsagar
- Switch `language` from `None` to `"en"` in docs build (#721) @galipremsagar
- Update `mdspan` to account for changes to `extents` (#751) @divyegala
- Implement matrix transpose with mdspan. (#739) @trivialfis
- Implement unravel_index for row-major array. (#723) @trivialfis
- Integrate KNN implementation: ivf-flat (#652) @achirkin
- Use common `js` and `css` code (#779) @galipremsagar
- Pin `dask` & `distributed` for release (#772) @galipremsagar
- Move cmake to the build section. (#763) @vyasr
- Adding old kmeans impl back in (as kmeans_deprecated) (#761) @cjnolet
- Fix for KMeans raw pointers API (#758) @lowener
- Fix KMeans (#756) @divyegala
- Add inline to nccl_sync_stream() (#750) @seunghwak
- Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
- Add wrapper functions for ncclGroupStart() and ncclGroupEnd() (#742) @seunghwak
- Fix variadic template type check for mdarrays (#741) @hlinsen
- RMAT rectangular graph generator (#738) @teju85
- Update conda recipes to UCX 1.13.0 (#736) @pentschev
- Add warp-aggregated atomic increment (#735) @ahendriksen
- fix logic bug in include_checker.py utility (#734) @grlee77
- Support 32bit and unsigned indices in bruteforce KNN (#730) @achirkin
- Ability to use ccache to speedup local builds (#729) @teju85
- Pin max version of `cuda-python` to `11.7.0` (#728) @Ethyling
- Always add `raft::raft_nn_lib` and `raft::raft_distance_lib` aliases (#727) @trxcllnt
- Add several type aliases and helpers for creating mdarrays (#726) @achirkin
- fix nans in naive kl divergence kernel introduced by div by 0. (#724) @mdoijade
- Use rapids-cmake for cuco (#722) @vyasr
- Update Python classifiers. (#719) @bdice
- Fix sccache (#718) @Ethyling
- Introducing raft::mdspan as an alias (#715) @divyegala
- Update cuco version (#714) @vyasr
- Update conda environment pinnings and update-versions.sh. (#713) @bdice
- Branch 22.08 merge branch 22.06 (#712) @cjnolet
- Testing conda compilers (#705) @cjnolet
- Unpin `dask` & `distributed` for development (#704) @galipremsagar
- Avoid shadowing CMAKE_ARGS variable in build.sh (#701) @vyasr
- Use unique ptr in `print_device_vector` (#695) @lowener
- Add missing Thrust includes (#678) @bdice
- Consolidate C++ conda recipes and add libraft-tests package (#641) @Ethyling
- Moving kmeans from cuml to Raft (#605) @lowener
- Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
- Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
- Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
- For fixing the cuGraph test failures with PCG (#690) @vinaydes
- Fix excessive memory used in selection test (#689) @achirkin
- Revert print vector changes because of std::vector<bool> (#681) @lowener
- fix race in fusedL2knn smem read/write by adding a syncwarp (#679) @mdoijade
- gemm: fix parameter C mistakenly set as const (#664) @achirkin
- Fix SelectionTest: allow different indices when keys are equal. (#659) @achirkin
- Revert recent cmake updates (#657) @cjnolet
- Don't install component dependency files in raft-header only mode (#655) @robertmaynard
- Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
- Fixing raft compile bug w/ RNG changes (#634) @cjnolet
- Get `libcudacxx` from `cuco` (#632) @trxcllnt
- RNG API fixes (#630) @MatthiasKohl
- Fix mdspan accessor mixin offset policy. (#628) @trivialfis
- Branch 22.06 merge 22.04 (#625) @cjnolet
- fix issue in fusedL2knn which happens when rows are multiple of 256 (#604) @mdoijade
- Restore changes from #653 and #655 and correct cmake component dependencies (#686) @robertmaynard
- Adding handle and stream to pylibraft (#683) @cjnolet
- Map CMake install components to conda library packages (#653) @robertmaynard
- Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
- mdspan/mdarray template functions and utilities (#601) @divyegala
- Change build.sh to find C++ library by default (#697) @vyasr
- Pin `dask` and `distributed` for release (#693) @galipremsagar
- Pin `dask` & `distributed` for release (#680) @galipremsagar
- Improve logging (#673) @achirkin
- Fix minor errors in CMake configuration (#662) @vyasr
- Pulling mdspan fork (from official rapids repo) into raft to remove dependency (#649) @cjnolet
- Fixing the unit test issue(s) in RAFT (#646) @vinaydes
- Build pyraft with scikit-build (#644) @vyasr
- Some fixes to pairwise distances for cupy integration (#643) @cjnolet
- Require UCX 1.12.1+ (#638) @jakirkham
- Updating raft rng host public API and adding docs (#636) @cjnolet
- Build pylibraft with scikit-build (#633) @vyasr
- Add `cuda_lib_dir` to `library_dirs`, allow changing `UCX`/`RMM`/`Thrust`/`spdlog` locations via envvars in `setup.py` (#624) @trxcllnt
- Remove perf prints from MST (#623) @divyegala
- Enable components installation using CMake (#621) @Ethyling
- Allow nullptr as input-indices argument of select_k (#618) @achirkin
- Update CMake pinning to allow newer CMake versions (#617) @vyasr
- Unpin `dask` & `distributed` for development (#616) @galipremsagar
- Improve performance of select-top-k RADIX implementation (#615) @achirkin
- Moving more prims benchmarks to RAFT (#613) @cjnolet
- Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
- Improve performance of select-top-k WARP_SORT implementation (#606) @achirkin
- Enable building static libs (#602) @trxcllnt
- Update `ucx-py` version (#596) @ajschmidt8
- Fix merge conflicts (#587) @ajschmidt8
- Making cuco, thrust, and mdspan optional dependencies. (#585) @cjnolet
- Some RBC3D fixes (#530) @cjnolet
- Moving some of the remaining linalg prims from cuml (#502) @cjnolet
- Fix badly merged cublas wrappers (#492) @achirkin
- Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
- Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
- Cleaning up cusparse_wrappers (#441) @cjnolet
- Improvements to RNG (#434) @vinaydes
- Remove RAFT memory management (#400) @viclafargue
- LinAlg impl in detail (#383) @divyegala
- Pin cmake in conda recipe to <3.23 (#600) @dantegd
- Fix make_device_vector_view (#595) @lowener
- Update cuco version. (#592) @vyasr
- Fixing raft headers dir (#574) @cjnolet
- Update update-version.sh (#560) @raydouglass
- find_package(raft) can now be called multiple times safely (#532) @robertmaynard
- Allocate sufficient memory for Hungarian if number of batches > 1 (#531) @ChuckHastings
- Adding lap.hpp back (with deprecation) (#529) @cjnolet
- raft-config is idempotent no matter RAFT_COMPILE_LIBRARIES value (#516) @robertmaynard
- Call initialize() in mpi_comms_t constructor. (#506) @seunghwak
- Improve row-major meanvar kernel via minimizing atomicCAS locks (#489) @achirkin
- Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
- Add benchmarks (#549) @achirkin
- Unify weighted mean code (#514) @lowener
- single-pass raft::stats::meanvar (#472) @achirkin
- Move `random` package of cuML to RAFT (#449) @divyegala
- mdspan integration. (#437) @trivialfis
- Interruptible execution (#433) @achirkin
- make raft sources compilable with clang (#424) @MatthiasKohl
- Span implementation. (#399) @trivialfis
- Adding build script for docs (#589) @cjnolet
- Temporarily disable new `ops-bot` functionality (#586) @ajschmidt8
- Fix commands to get conda output files (#584) @Ethyling
- Link to `cuco` and add faiss `EXCLUDE_FROM_ALL` option (#583) @trxcllnt
- exposing faiss::faiss (#582) @cjnolet
- Pin `dask` and `distributed` version (#581) @galipremsagar
- removing exclude_from_all from cuco (#580) @cjnolet
- Adding INSTALL_EXPORT_SET for cuco, rmm, thrust (#579) @cjnolet
- Thrust package name case (#576) @trxcllnt
- Add missing thrust includes to transpose.cuh (#575) @zbjornson
- Use unanchored clang-format version check (#573) @zbjornson
- Fixing accidental removal of thrust target from cmakelists (#571) @cjnolet
- Don't add gtest to build export set or generate a gtest-config.cmake (#565) @trxcllnt
- Set `main` label by default (#559) @galipremsagar
- Add local conda channel while looking for conda outputs (#558) @Ethyling
- Updated dask and distributed to >=2022.02.1 (#557) @rlratzel
- Upload packages using testing label for nightlies (#556) @Ethyling
- Add `.github/ops-bot.yaml` config file (#554) @ajschmidt8
- Disabling benchmarks building by default. (#553) @cjnolet
- KNN select-top-k variants (#551) @achirkin
- Adding logger (#550) @cjnolet
- clang-tidy support: improved clang run scripts with latest changes (see cugraph-ops) (#548) @MatthiasKohl
- Pylibraft for pairwise distances (#540) @cjnolet
- mdspan PoC for distance make_blobs (#538) @cjnolet
- Include thrust/sort.h in ball_cover.cuh (#526) @akifcorduk
- Increase parallelism in allgatherv (#525) @seunghwak
- Moving device functions to cuh files and deprecating hpp (#524) @cjnolet
- Use `dynamic_extent` from `stdex`. (#523) @trivialfis
- Updating some of the ci check scripts (#522) @cjnolet
- Use shfl_xor in warpReduce for broadcast (#521) @akifcorduk
- Fixing Python conda package and installation (#520) @cjnolet
- Adding instructions to install from conda and build using CPM (#519) @cjnolet
- Implement span storage optimization. (#515) @trivialfis
- RNG test fixes and improvements (#513) @vinaydes
- Moving scores and metrics over to raft::stats (#512) @cjnolet
- Random ball cover in 3d (#510) @cjnolet
- Initializing memory in RBC (#509) @cjnolet
- Adjusting conda packaging to remove duplicate dependencies (#508) @cjnolet
- Moving remaining stats prims from cuml (#507) @cjnolet
- Correcting the namespace (#505) @vinaydes
- Passing stream through commsplit (#503) @cjnolet
- Moving some of the remaining linalg prims from cuml (#502) @cjnolet
- Fixing spectral APIs (#496) @cjnolet
- Fix badly merged cublas wrappers (#492) @achirkin
- Fix integer overflow in distances (#490) @RAMitchell
- Reusing shared libs in gpu ci builds (#487) @cjnolet
- Adding fatbin to shared libs and fixing conda paths in cpu build (#485) @cjnolet
- Add CMake `install` rule for tests (#483) @ajschmidt8
- Adding cpu ci for conda build (#482) @cjnolet
- Updating codeowners to use new raft codeowners (#480) @cjnolet
- Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
- Define PTDS via `-D` to fix cache misses in sccache (#476) @trxcllnt
- Unpin dask and distributed (#474) @galipremsagar
- Replace `ccache` with `sccache` (#471) @ajschmidt8
- More README updates (#467) @cjnolet
- CUBLAS wrappers with switchable host/device pointer mode (#453) @achirkin
- Cleaning up cusparse_wrappers (#441) @cjnolet
- Adding conda packaging for libraft and pyraft (#439) @cjnolet
- Improvements to RNG (#434) @vinaydes
- Hiding implementation details for comms (#409) @cjnolet
- Remove RAFT memory management (#400) @viclafargue
- LinAlg impl in detail (#383) @divyegala
- Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
- One cudaStream_t instance per raft::handle_t (#291) @divyegala
- Removing extra logging from faiss mr (#463) @cjnolet
- Pin `dask` & `distributed` versions (#455) @galipremsagar
- Replace RMM CUDA Python bindings with those provided by CUDA-Python (#451) @shwina
- Fix comms memory leak (#436) @seunghwak
- Fix C++ doxygen documentation (#426) @achirkin
- Fix clang-format style errors (#425) @achirkin
- Fix using incorrect macro RAFT_CHECK_CUDA in place of RAFT_CUDA_TRY (#415) @achirkin
- Fix CUDA_CHECK_NO_THROW compatibility define (#414) @zbjornson
- Disabling fused l2 knn from bfknn (#407) @cjnolet
- Disabling expanded fused l2 knn to unblock cuml CI (#404) @cjnolet
- Reverting default knn distance to L2Unexpanded for now. (#403) @cjnolet
- README and build fixes before release (#459) @cjnolet
- Updates to Python and C++ Docs (#442) @cjnolet
- error macros: determining buffer size instead of fixed 2048 chars (#420) @MatthiasKohl
- NVTX range helpers (#416) @achirkin
- Splitting fused l2 knn specializations (#461) @cjnolet
- Update cuCollection git tag (#447) @seunghwak
- Remove libcudacxx patch needed for nvcc 11.4 (#446) @robertmaynard
- Unpin `dask` and `distributed` (#440) @galipremsagar
- Public apis for remainder of matrix and stats (#438) @divyegala
- Fix bug in producer-consumer buffer exchange which occurs in UMAP test on GV100 (#429) @mdoijade
- Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
- Update ucx-py version on release using rvc (#422) @Ethyling
- Disabling fused l2 knn again. Not sure how this got added back. (#421) @cjnolet
- Adding no throw macro variants (#417) @cjnolet
- Remove `IncludeCategories` from `.clang-format` (#412) @codereport
- fix nan issues in L2 expanded sqrt KNN distances (#411) @mdoijade
- Consistent renaming of CHECK_CUDA and *_TRY macros (#410) @cjnolet
- Faster matrix-vector-ops (#401) @achirkin
- Adding dev conda environment files. (#397) @cjnolet
- Update to UCX-Py 0.24 (#392) @pentschev
- Branch 21.12 merge 22.02 (#386) @cjnolet
- Hiding implementation details for sparse API (#381) @cjnolet
- Adding distance specializations (#376) @cjnolet
- Use FAISS with RMM (#363) @viclafargue
- Add Fused L2 Expanded KNN kernel (#339) @mdoijade
- Update `.clang-format` to be consistent with all other RAPIDS repos (#300) @codereport
- One cudaStream_t instance per raft::handle_t (#291) @divyegala
- Fixing bad host->device copy (#375) @cjnolet
- Fix coalesced access checks in matrix_vector_op (#372) @achirkin
- Port libcudacxx patch from cudf (#370) @dantegd
- Fixing overflow in expanded distances (#365) @cjnolet
- Upgrade `clang` to `11.1.0` (#394) @galipremsagar
- Fix Changelog Merge Conflicts for `branch-21.12` (#390) @ajschmidt8
- Pin max `dask` & `distributed` (#388) @galipremsagar
- Removing conflict w/ CUDA_CHECK (#378) @cjnolet
- Update RAFT test directory (#359) @viclafargue
- Update to UCX-Py 0.23 (#358) @pentschev
- Hiding implementation details for random, stats, and matrix (#356) @divyegala
- README updates (#351) @cjnolet
- Use 64 bit CuSolver API for Eigen decomposition (#349) @lowener
- Hiding implementation details for distance primitives (dense + sparse) (#344) @cjnolet
- Unpin `dask` & `distributed` in CI (#338) @galipremsagar
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Accounting for rmm::cuda_stream_pool not having a constructor for 0 streams (#329) @divyegala
- Fix wrong lda parameter in gemv (#327) @achirkin
- Fix `matrixVectorOp` to verify promoted pointer type is still aligned to vectorized load boundary (#325) @viclafargue
- Pin rmm to branch-21.10 and remove warnings from kmeans.hpp (#322) @dantegd
- Temporarily pin RMM while refactor removes deprecated calls (#315) @dantegd
- Fix more warnings (#311) @harrism
- Add Hamming, Jensen-Shannon, KL-Divergence, Russell rao and Correlation distance metrics support (#306) @mdoijade
- Pin max `dask` and `distributed` versions to `2021.09.1` (#334) @galipremsagar
- Make sure we keep the rapids-cmake and raft cal version in sync (#331) @robertmaynard
- Add broadcast with const input iterator (#328) @seunghwak
- Fused L2 (unexpanded) kNN kernel for NN <= 64, without using temporary gmem to store intermediate distances (#324) @mdoijade
- Update with rapids cmake new features (#320) @robertmaynard
- Update to UCX-Py 0.22 (#319) @pentschev
- Fix Forward-Merge Conflicts (#318) @ajschmidt8
- Enable CUDA device code warnings as errors (#307) @harrism
- Remove max version pin for dask & distributed on development branch (#303) @galipremsagar
- Warnings are errors (#299) @harrism
- Use the new RAPIDS.cmake to fetch rapids-cmake (#298) @robertmaynard
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#295) @dillon-cullinan
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Random Ball Cover Algorithm for 2D Haversine/Euclidean (#213) @cjnolet
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix support for different input and output types in linalg::reduce (#296) @Nyrio
- Const raft handle in sparse bfknn (#280) @cjnolet
- Add `cuco::cuco` to list of linked libraries (#279) @trxcllnt
- Use nested include in destination of install headers to avoid docker permission issues (#263) @dantegd
- Update UCX-Py version to 0.21 (#255) @pentschev
- Fix mst knn test build failure due to RMM device_buffer change (#253) @mdoijade
- Add chebyshev, canberra, minkowski and hellinger distance metrics (#276) @mdoijade
- Move FAISS ANN wrappers to RAFT (#265) @cjnolet
- Remaining sparse semiring distances (#261) @cjnolet
- removing divye from codeowners (#257) @divyegala
- Pinning cuco to a specific commit hash for release (#304) @rlratzel
- Pin max `dask` & `distributed` versions (#301) @galipremsagar
- Overlap epilog compute with ldg of next grid stride in pairwise distance & fusedL2NN kernels (#292) @mdoijade
- Always add faiss library alias if it's missing (#287) @trxcllnt
- Use `NVIDIA/cuCollections` repo again (#284) @trxcllnt
- Use the 21.08 branch of rapids-cmake as rmm requires it (#278) @robertmaynard
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix `21.08` forward-merge conflicts (#274) @ajschmidt8
- Add lds and sts inline ptx instructions to force vector instruction generation (#273) @mdoijade
- Move ANN to RAFT (additional updates) (#270) @cjnolet
- Sparse semirings cleanup + hash table & batching strategies (#269) @divyegala
- Revert "pin dask versions in CI (#260)" (#264) @ajschmidt8
- Pass stream to device_scalar::value() calls. (#259) @harrism
- Update get_rmm.cmake to better support CalVer (#258) @harrism
- Add Grid stride pairwise dist and fused L2 NN kernels (#250) @mdoijade
- Fix merge conflicts (#236) @ajschmidt8
- Update UCX-Py version to 0.20 (#254) @pentschev
- cuco git tag update (again) (#248) @seunghwak
- Revert PR #232 for 21.06 release (#246) @dantegd
- Python comms to hold onto server endpoints (#241) @cjnolet
- Fix Thrust 1.12 compile errors (#231) @trxcllnt
- Make sure we use CalVer when checking out rapids-cmake (#230) @robertmaynard
- Loss of Precision in MST weight alteration (#223) @divyegala
- cuco git tag update (#243) @seunghwak
- Update `CHANGELOG.md` links for calver (#233) @ajschmidt8
- Add Grid stride pairwise dist and fused L2 NN kernels (#232) @mdoijade
- Updates to enable HDBSCAN (#208) @cjnolet
- Exposing spectral random seed property (#193) @cjnolet
- Fix pointer arithmetic in spmv smem kernel (#183) @lowener
- Modify default value for rowMajorIndex and rowMajorQuery in bf-knn (#173) @viclafargue
- Remove setCudaMallocWarning() call for libfaiss@v1.7.0 (#167) @trxcllnt
- Add const to KNN handle (#157) @hlinsen
- Fixing codeowners (#194) @cjnolet
- Adjust Hellinger pairwise distance to avoid NaNs (#189) @lowener
- Add column major input support in contractions_nt kernels with new kernel policy for it (#188) @mdoijade
- Dice formula correction (#186) @lowener
- Scaling knn graph fix connectivities algorithm (#181) @cjnolet
- Fixing RAFT CI & a few small updates for SLHC Python wrapper (#178) @cjnolet
- Add Precomputed to the DistanceType enum (for cuML DBSCAN) (#177) @Nyrio
- Enable matrix::copyRows for row major input (#176) @tfeher
- Add Dice distance to distancetype enum (#174) @lowener
- Porting over recent updates to distance prim from cuml (#172) @cjnolet
- Update KNN (#171) @viclafargue
- Adding translations parameter to brute_force_knn (#170) @viclafargue
- Update Changelog Link (#169) @ajschmidt8
- Map operation (#168) @viclafargue
- Updating sparse prims based on recent changes (#166) @cjnolet
- Prepare Changelog for Automation (#164) @ajschmidt8
- Update 0.18 changelog entry (#163) @ajschmidt8
- MST symmetric/non-symmetric output for SLHC (#162) @divyegala
- Pass pre-computed colors to MST (#154) @divyegala
- Streams upgrade in RAFT handle (RMM backend + create handle from parent's pool) (#148) @afender
- Merge branch-0.18 into 0.19 (#146) @dantegd
- Add device_send, device_recv, device_sendrecv, device_multicast_sendrecv (#144) @seunghwak
- Adding SLHC prims. (#140) @cjnolet
- Moving cuml sparse prims to raft (#139) @cjnolet
- Make NCCL root initialization configurable. (#120) @drobison00
- Add idx_t template parameter to matrix helper routines (#131) @tfeher
- Eliminate CUDA 10.2 as valid for large svd solving (#129) @wphicks
- Update check to allow svd solver on CUDA>=10.2 (#125) @wphicks
- Updating gpu build.sh and debugging threads CI issue (#123) @dantegd
- Adding additional distances (#116) @cjnolet
- Update stale GHA with exemptions & new labels (#152) @mike-wendt
- Add GHA to mark issues/prs as stale/rotten (#150) @Ethyling
- Prepare Changelog for Automation (#135) @ajschmidt8
- Adding Jensen-Shannon and BrayCurtis to DistanceType for Nearest Neighbors (#132) @lowener
- Add brute force KNN (#126) @hlinsen
- Make NCCL root initialization configurable. (#120) @drobison00
- Auto-label PRs based on their content (#117) @jolorunyomi
- Add gather & gatherv to raft::comms::comms_t (#114) @seunghwak
- Adding canberra and chebyshev to distance types (#99) @cjnolet
- Gpuciscripts clean and update (#92) @msadang
- PR #65: Adding cuml prims that break circular dependency between cuml and cumlprims projects
- PR #101: MST core solver
- PR #93: Incorporate Date/Nagi implementation of Hungarian Algorithm
- PR #94: Allow generic reductions for the map then reduce op
- PR #95: Cholesky rank one update prim
- PR #108: Remove unused old-gpubuild.sh
- PR #73: Move DistanceType enum from cuML to RAFT
- PR #92: Cleanup gpuCI scripts
- PR #98: Adding InnerProduct to DistanceType
- PR #103: Epsilon parameter for Cholesky rank one update
- PR #100: Add divyegala as codeowner
- PR #111: Cleanup gpuCI scripts
- PR #120: Update NCCL init process to support root node placement.
- PR #106: Specify dependency branches to avoid pip resolver failure
- PR #77: Fixing CUB include for CUDA < 11
- PR #86: Missing headers for newly moved prims
- PR #102: Check alignment before binaryOp dispatch
- PR #104: Fix update-version.sh
- PR #109: Fixing Incorrect Deallocation Size and Count Bugs
- PR #63: Adding MPI comms implementation
- PR #70: Adding CUB to RAFT cmake
- PR #59: Adding csrgemm2 to cusparse_wrappers.h
- PR #61: Add cusparsecsr2dense to cusparse_wrappers.h
- PR #62: Adding `get_device_allocator` to `handle.pxd`
- PR #67: Remove dependence on run-time type info
- PR #56: Fix compiler warnings.
- PR #64: Remove `cublas_try` from `cusolver_wrappers.h`
- PR #66: Fixing typo `get_stream` to `getStream` in `handle.pyx`
- PR #68: Change the type of recvcounts & displs in allgatherv from size_t[] to size_t* and int[] to size_t*, respectively.
- PR #69: Updates for RMM being header only
- PR #74: Fix std_comms::comm_split bug
- PR #79: remove debug print statements
- PR #81: temporarily expose internal NCCL communicator
- PR #12: Spectral clustering.
- PR #7: Migrating cuml comms -> raft comms_t
- PR #18: Adding commsplit to cuml communicator
- PR #15: add exception based error handling macros
- PR #29: Add ceildiv functionality
- PR #44: Add get_subcomm and set_subcomm to handle_t
- PR #13: Add RMM_INCLUDE and RMM_LIBRARY options to allow linking to non-conda RMM
- PR #22: Preserve order in comms workers for rank initialization
- PR #38: Remove #include <cudar_utils.h> from `raft/mr/`
- PR #39: Adding a virtual destructor to `raft::handle_t` and `raft::comms::comms_t`
- PR #37: Clean-up CUDA related utilities
- PR #41: Upgrade to `cusparseSpMV()`, alg selection, and rectangular matrices.
- PR #45: Add Ampere target to cuda11 cmake
- PR #47: Use gtest conda package in CMake/build.sh by default
- PR #17: Make destructor inline to avoid redeclaration error
- PR #25: Fix bug in handle_t::get_internal_streams
- PR #26: Fix bug in RAFT_EXPECTS (add parentheses surrounding cond)
- PR #34: Fix issue with incorrect docker image being used in local build script
- PR #35: Remove #include <nccl.h> from `raft/error.hpp`
- PR #40: Preemptively fixed future CUDA 11 related errors.
- PR #43: Fixed CUDA version selection mechanism for SpMV.
- PR #46: Fix for cpp file extension issue (nvcc-enforced).
- PR #48: Fix gtest target names in cmake build gtest option.
- PR #49: Skip raft comms test if raft module doesn't exist
- Initial RAFT version
- PR #3: defining raft::handle_t, device_buffer, host_buffer, allocator classes
- PR #5: Small build.sh fixes