D3D12
A headless demo for D3D12; no safety rails for the host code whatsoever. Release as a package is planned, but not in the immediate future.
GPUSortingD3D12 includes:
-
DeviceRadixSort: a general purpose 8-bit LSD radix sort using "reduce-then-scan" to perform the inter-threadblock prefix sum of digit counts. Although slower than OneSweep, use this whenever portability is a concern.
-
OneSweep: a general purpose 8-bit LSD radix sort using "chained-scan-with-decoupled-lookback" to perform the inter-threadblock prefix sum of digit counts. Use this if targeting desktop setups with somewhat recent hardware.
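For readers unfamiliar with the terminology, the following is a minimal single-threaded CPU sketch of what an 8-bit LSD radix sort computes. It is not the GPUSortingD3D12 implementation, and the function name `RadixSortLsd8` is hypothetical. Each pass histograms one 8-bit digit, takes an exclusive prefix sum of the 256 digit counts, and scatters keys stably. On the GPU, that prefix sum has to be computed across threadblocks, which is the step where the two sorts above differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference sketch of an 8-bit LSD radix sort on 32-bit keys.
// Hypothetical helper for illustration; not part of GPUSortingD3D12.
void RadixSortLsd8(std::vector<uint32_t>& keys)
{
    constexpr uint32_t kRadix = 256;            // 2^8 possible digit values
    constexpr uint32_t kRadixMask = kRadix - 1;
    std::vector<uint32_t> alt(keys.size());     // alternate (ping-pong) buffer

    // Four passes, least significant digit first.
    for (uint32_t shift = 0; shift < 32; shift += 8)
    {
        // 1) Histogram: count how often each digit value occurs.
        std::array<std::size_t, kRadix> offsets{};
        for (uint32_t k : keys)
            ++offsets[(k >> shift) & kRadixMask];

        // 2) Exclusive prefix sum: turn counts into starting offsets.
        std::size_t sum = 0;
        for (std::size_t& o : offsets)
        {
            const std::size_t count = o;
            o = sum;
            sum += count;
        }

        // 3) Stable scatter: keys with equal digits keep their relative order.
        for (uint32_t k : keys)
            alt[offsets[(k >> shift) & kRadixMask]++] = k;

        keys.swap(alt);
    }
}
```

Because the number of passes is even, the sorted result ends up back in the caller's vector. A reference like this can also serve as a host-side check when validating GPU output.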
GPUSortingD3D12 currently supports:
- keys only sorting
- key-value pair sorting
- ascending order sorting
- descending order sorting
- a maximum sorting size of $2^{32} - 128$ for DeviceRadixSort
- a maximum sorting size of $2^{30} - 128$ for OneSweep
- Thearling and Smith entropy-controlled benchmarking
GPUSortingD3D12 currently supports the following data types for keys and values:
uint32_t
int32_t
float
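A radix sort orders keys by their raw bits, so signed integers and floats are usually remapped to unsigned values first. The sketch below shows one common way to do that on the host. Whether GPUSortingD3D12 uses exactly this transform is an assumption, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; not taken from GPUSortingD3D12.

// Map a float to a uint32_t whose unsigned order matches the float's
// numeric order: flip all bits of negative values, and only the sign
// bit of non-negative values.
inline uint32_t FloatToSortableBits(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));   // well-defined bit reinterpretation
    const uint32_t mask = (bits & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u;
    return bits ^ mask;
}

// Inverse of FloatToSortableBits.
inline float SortableBitsToFloat(uint32_t bits)
{
    // A set top bit means the original value was non-negative.
    const uint32_t mask = (bits & 0x80000000u) ? 0x80000000u : 0xFFFFFFFFu;
    bits ^= mask;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Map an int32_t to a uint32_t with the same ordering by flipping the sign bit.
inline uint32_t Int32ToSortableBits(int32_t i)
{
    return static_cast<uint32_t>(i) ^ 0x80000000u;
}
```

With these mappings, sorting the transformed values as plain `uint32_t` keys and then applying the inverse yields the numerically sorted sequence. NaN handling is not addressed in this sketch.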
-
Clone or fork the repository.
-
Download and install Visual Studio 2019 or greater.
-
Open the solution file with Visual Studio. If on Visual Studio 2022, updating the platform toolset from v142 to v143 should work without issue.
-
Build and run the solution.
To use GPUSortingD3D12, create a device and poll its capabilities like so:
winrt::com_ptr<ID3D12Device> device = InitDevice();
DeviceInfo deviceInfo = GetDeviceInfo(device.get());
Then use the device pointer and information struct to create a sorting object. Use the enums found in GPUSorting.h to define the capabilities of the sorting object. These parameters cannot be changed without creating a new object.
DeviceRadixSort* dvr = new DeviceRadixSort(
device,
deviceInfo,
GPU_SORTING_ASCENDING,
GPU_SORTING_KEY_FLOAT32,
GPU_SORTING_PAYLOAD_UINT32);
Now run TestAll() to ensure the sort is running correctly:
dvr->TestAll();
To time the sort's performance on a uniform random distribution, run BatchTiming(). To control the distribution of the keys generated for the tests a la Thearling and Smith, use the ENTROPY_PRESET enum. If no change to the distribution is desired, use ENTROPY_PRESET_1.
uint32_t testSize = 1 << 20;
uint32_t testSeed = 12345;
uint32_t testIterations = 100;
dvr->BatchTiming(testSize, testIterations, testSeed, ENTROPY_PRESET_1);
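As background on entropy-controlled benchmarking, the Thearling and Smith approach lowers key entropy by bitwise-ANDing several uniformly random values together, so each additional AND makes set bits rarer. The host-side sketch below illustrates the idea only. The function name and the `andCount` parameter are hypothetical, and how the ENTROPY_PRESET values map to a number of AND operations is an assumption not confirmed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper for illustration; not part of GPUSortingD3D12.
// Generates `count` 32-bit keys, each the bitwise AND of `andCount`
// uniformly random values. andCount == 1 leaves the uniform distribution
// unchanged; larger values reduce the entropy of the keys.
std::vector<uint32_t> GenerateEntropyReducedKeys(
    std::size_t count, uint32_t andCount, uint32_t seed)
{
    std::mt19937 rng(seed);                 // deterministic for a given seed
    std::vector<uint32_t> keys(count);
    for (uint32_t& k : keys)
    {
        k = 0xFFFFFFFFu;                    // identity element for AND
        for (uint32_t i = 0; i < andCount; ++i)
            k &= static_cast<uint32_t>(rng());
    }
    return keys;
}
```

Lower-entropy inputs concentrate keys into fewer digit values, which stresses a radix sort differently from uniform input, so timing across several settings gives a fuller picture than the uniform case alone.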
OneSweep does not work on WARP (the Windows Advanced Rasterization Platform software adapter).