Avoid scalar indexing with GPU arrays #40

jipolanco · 2022-01-27T09:44:07Z

For GPU arrays, transpositions and other operations are now performed entirely on the GPU (as far as I can tell...), which avoids slow scalar indexing.

So far, this has only been tested with JLArray, the reference implementation of GPUArrays.jl, which runs on the CPU.

It would be nice to test things with CuArrays. For that, one just needs to add CuArray to the list of array types tested in test/array_types.jl. @corentin-dev let me know if you can try that out.

For now I have no idea how the transposition of GPU arrays actually performs, and it would be nice to have some benchmarks. There are still some things that can be improved. In particular, when using dimension permutations (enabled by default in PencilFFTs), there are some additional allocations that should be taken care of.
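To make the distinction between scalar indexing and whole-array operations concrete, here is a minimal sketch in plain Julia. It uses an ordinary `Array` as a stand-in for a GPU array, and the helper names `permute_scalar!` and `permute_whole!` are hypothetical, not functions from this PR. The first version touches one element at a time, which is the pattern that becomes slow on GPU array types. The second delegates to Base's `permutedims!`, which an array package is free to implement as a single whole-array operation.

```julia
# Hypothetical helpers for illustration only; not the PencilArrays implementation.
# A plain Array stands in for a GPU array so the sketch runs without extra packages.

# Element-by-element permutation: every dest[J] = src[I] is a scalar index.
function permute_scalar!(dest::AbstractArray{T,3}, src::AbstractArray{T,3},
                         perm::NTuple{3,Int}) where {T}
    for I in CartesianIndices(src)
        # Same convention as permutedims: dest[i_perm[1], i_perm[2], i_perm[3]] = src[i_1, i_2, i_3]
        J = CartesianIndex(ntuple(d -> I[perm[d]], 3))
        dest[J] = src[I]
    end
    return dest
end

# Whole-array permutation: a single call, with no explicit element access.
permute_whole!(dest, src, perm) = permutedims!(dest, src, perm)

src = reshape(collect(1.0:24.0), 2, 3, 4)
perm = (3, 1, 2)
dest_a = similar(src, 4, 2, 3)  # size(dest, d) == size(src, perm[d])
dest_b = similar(src, 4, 2, 3)

permute_scalar!(dest_a, src, perm)
permute_whole!(dest_b, src, perm)

@assert dest_a == dest_b
```

Both versions produce the same result on the CPU. The point of the sketch is only that the second form gives the array type a chance to run the permutation without per-element access from the host.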

This PR closes #21 (the issue can be reopened if anything is missing).

codecov-commenter · 2022-01-27T09:55:45Z

Codecov Report

Merging #40 (30474ed) into master (aaa806b) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #40      +/-   ##
==========================================
+ Coverage   97.15%   97.17%   +0.02%     
==========================================
  Files          17       18       +1     
  Lines         983     1026      +43     
==========================================
+ Hits          955      997      +42     
- Misses         28       29       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Transpositions/Transpositions.jl | `98.09% <100.00%> (+0.30%)` | ⬆️ |
| src/gather.jl | `100.00% <100.00%> (ø)` | |
| src/random.jl | `100.00% <100.00%> (ø)` | |
| src/arrays.jl | `95.14% <0.00%> (-0.98%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jipolanco added 5 commits January 27, 2022 15:19

- Avoid scalar indexing in rand! / randn! (18bc59e)
- Avoid scalar indexing in gather (73dc3f6)
- Simplify signatures of transposition functions (19d2a64)
- Add copy_range! definition for GPU arrays (7be6b40)
- First attempt at transpositions on GPUs (e6f69a4): For now scalar indexing can be avoided when dimension permutations are disabled...

jipolanco force-pushed the gpu-indexing branch from dc356a9 to e6f69a4 Compare January 27, 2022 14:19

jipolanco marked this pull request as ready for review January 27, 2022 14:46

- Perform dimension permutations on the GPU (30474ed)

jipolanco merged commit 938cbb3 into master Jan 28, 2022

jipolanco deleted the gpu-indexing branch January 28, 2022 14:00
