-
Notifications
You must be signed in to change notification settings - Fork 2
nchong/scan
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Scan implementations using OpenCL ================================= This project contains some simple exclusive scan (re)implementations for arbitrary length arrays: - harris[0] is a vanilla scan - sengupta[1] is a segmented scan. SCAN IN A NUTSHELL ------------------ Suppose you have a bunch of threads that each produce an arbitrary number of outputs. For example, thread 0 outputs 3 values (a,b,c) thread 1 outputs 0 values () thread 2 outputs 2 values (i,j) thread 3 outputs 1 values (x). It is not known statically now many values a thread will produce (but you do know how many will be produced in total, in this case 6). You want to store these outputs in a contiguous array: 0 1 2 3 4 5 +---+---+---+---+---+---+ | a | b | c | i | j | x | +---+---+---+---+---+---+ How can you do this? You need an exclusive scan operation! Taking the array containing the number of outputs for each thread: t0 t1 t2 t3 +---+---+---+---+ | 3 | 0 | 2 | 1 | +---+---+---+---+ An exclusive scan operation calculates, for each index i, the sum of the elements preceding it. That is, for our example: t0 t1 t2 t3 +---+---+---+---+ | 0 | 3 | 3 | 5 | +---+---+---+---+ Which is exactly the offset into our contiguous array that each thread needs to write. It turns out this is a very useful operation (see Blelloch[2]) and can be put to good use for all sorts of goodness. Fortunately, there are work-efficient algorithms for computing scans in parallel. READMORE -------- http://en.wikipedia.org/wiki/Prefix_sum [0] HARRIS [Parallel Prefix Sum (Scan) with CUDA](http://developer.download.nvidia.com/compute/cuda/1_1/Website/projects/scan/doc/scan.pdf) [1] SENGUPTA ET AL [Scan Primitives for GPU Computing](http://www.google.co.uk/url?sa=t&source=web&cd=1&ved=0CCgQFjAA&url=http%3A%2F%2Fciteseer.ist.psu.edu%2Fviewdoc%2Fdownload%3Bjsessionid%3D1190FF7DA52704424448D3AFDDF1AE40%3Fdoi%3D10.1.1.131.3326%26rep%3Drep1%26type%3Dpdf&ei=CpOMTqa4HoWWhQfRoIHgAw&usg=AFQjCNHqpkKwQMHosfwVNEmWm1dFI9CM0g) [2] BLELLOCH [Prefix Sums and Their Applications](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.6230&rep=rep1&type=pdf)
About
Scan (prefix-sum) on OpenCL
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published