Fix and update a few things in the documentation
bernhardmgruber committed Sep 1, 2023
1 parent ecf917f commit 95cab88
Showing 6 changed files with 35 additions and 62 deletions.
46 changes: 18 additions & 28 deletions docs/source/basic/cheatsheet.rst
@@ -40,9 +40,12 @@ Define accelerator type (CUDA, OpenMP, etc.)
.. code-block:: c++

AccGpuCudaRt,
AccGpuHipRt,
AccCpuSycl,
AccFpgaSyclIntel,
AccGpuSyclIntel,
AccCpuOmp2Blocks,
AccCpuOmp2Threads,
AccCpuOmp4,
AccCpuTbbBlocks,
AccCpuThreads,
AccCpuSerial
@@ -126,9 +129,9 @@ Create a view to host memory represented by a pointer
.. code-block:: c++

using Dim = alpaka::DimInt<1u>;
Vec<Dim, Idx> extent = value;
DataType* date = new DataType[extent[0]];
auto hostView = createView(devHost, data, extent);
Vec<Dim, Idx> extent = size;
DataType* ptr = ...;
auto hostView = createView(devHost, ptr, extent);

Create a view to host std::vector
.. code-block:: c++
@@ -139,7 +142,7 @@ Create a view to host std::vector
Create a view to host std::array
.. code-block:: c++

std::vector<DataType, 2> array = {42u, 23};
std::array<DataType, 2> array = {42u, 23};
auto hostView = createView(devHost, array);
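
The view snippets above wrap memory that already exists instead of allocating new memory. The following is a minimal standalone sketch of that idea in plain C++17, not alpaka's implementation: ``HostView1D`` and ``makeView`` are hypothetical names introduced only for illustration, and the sketch assumes that a view is non-owning, i.e. it stores a pointer and an extent and leaves the lifetime of the memory to the caller.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D host view (not an alpaka type): it stores a
// pointer and an extent, and never allocates or frees the memory it refers to.
template<typename T>
struct HostView1D
{
    T* ptr;
    std::size_t extent;

    T& operator[](std::size_t i) const
    {
        assert(i < extent); // bounds check, active in debug builds only
        return ptr[i];
    }
};

// View over raw memory described by a pointer and an element count.
template<typename T>
HostView1D<T> makeView(T* ptr, std::size_t extent)
{
    return {ptr, extent};
}

// View over a std::array; the extent is taken from the array type.
template<typename T, std::size_t N>
HostView1D<T> makeView(std::array<T, N>& array)
{
    return {array.data(), N};
}

// View over a std::vector; the extent is the current size of the vector.
template<typename T>
HostView1D<T> makeView(std::vector<T>& vec)
{
    return {vec.data(), vec.size()};
}
```

Because the sketch only borrows the memory, writes through the view are visible in the original container, and the container must outlive the view.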

Get a raw pointer to a buffer or view initialization, etc.
@@ -148,11 +151,6 @@ Get a raw pointer to a buffer or view initialization, etc.
DataType* raw = view::getPtrNative(bufHost);
DataType* rawViewPtr = view::getPtrNative(hostView);

Get an accessor to a buffer and the accessor's type (experimental)
.. code-block:: c++

experimental::BufferAccessor<Acc, Elem, N, AccessTag> a = experimental::access(buffer);

Allocate a buffer in device memory
.. code-block:: c++

@@ -230,21 +228,14 @@ Access multi-dimensional indices and extents of blocks, threads, and elements

auto idx = getIdx<Origin, Unit>(acc);
auto extent = getWorkDiv<Origin, Unit>(acc);
// Origin: Grid, Block, Thread
// Unit: Blocks, Threads, Elems

Origin:
.. code-block:: c++

Grid, Block, Thread

Unit:
.. code-block:: c++

Blocks, Threads, Elems

Access components of multi-dimensional indices and extents
Access components of and destructure multi-dimensional indices and extents
.. code-block:: c++

auto idxX = idx[0];
auto [z, y, x] = extent3D;
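
The destructuring line above relies on C++17 structured bindings. As a rough standalone sketch, the same syntax works on a ``std::array``, which is used here as a stand-in for a 3D alpaka vector. The alias ``Vec3`` and the function ``linearize`` are hypothetical names for illustration, and the sketch assumes the component order ``(z, y, x)`` shown above with ``x`` as the fastest-varying dimension.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for a 3D vector (not an alpaka type); components are ordered
// (z, y, x), matching the destructuring order in the cheatsheet snippet.
using Vec3 = std::array<std::size_t, 3>;

// Maps a 3D index to a linear offset, assuming x varies fastest.
std::size_t linearize(Vec3 const& idx, Vec3 const& extent)
{
    auto const [z, y, x] = idx;       // structured bindings (C++17)
    auto const [ez, ey, ex] = extent;
    assert(z < ez && y < ey && x < ex); // index must lie inside the extent
    return (z * ey + y) * ex + x;
}
```

For an extent of ``(4, 5, 6)``, the index ``(1, 2, 3)`` maps to ``(1 * 5 + 2) * 6 + 3 = 45``, and the last valid index ``(3, 4, 5)`` maps to ``119``.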

Linearize multi-dimensional vectors
.. code-block:: c++
@@ -258,7 +249,8 @@ Linearize multi-dimensional vectors
Allocate static shared memory variable
.. code-block:: c++

Type & var = declareSharedVar<Type, __COUNTER__>(acc);
Type& var = declareSharedVar<Type, __COUNTER__>(acc); // scalar
auto& arr = declareSharedVar<float[256], __COUNTER__>(acc); // array

Get dynamic shared memory pool, requires the kernel to specialize
.. code-block:: c++
@@ -275,12 +267,10 @@ Atomic operations
.. code-block:: c++

auto result = atomicOp<Operation>(acc, arguments);

Operations:
.. code-block:: c++

AtomicAdd, AtomicSub, AtomicMin, AtomicMax, AtomicExch,
AtomicInc, AtomicDec, AtomicAnd, AtomicOr, AtomicXor, AtomicCas
// Operation: AtomicAdd, AtomicSub, AtomicMin, AtomicMax, AtomicExch,
// AtomicInc, AtomicDec, AtomicAnd, AtomicOr, AtomicXor, AtomicCas
// Also dedicated functions available, e.g.:
auto old = atomicAdd(acc, ptr, 1);
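
The variable name ``old`` in the snippet above suggests that the atomic functions return the value stored before the operation. The following host-only sketch shows that convention with ``std::atomic`` from the C++ standard library. It is an analogy under that assumption, not alpaka code: ``atomicAddSketch`` and ``atomicMaxSketch`` are hypothetical helper names.

```cpp
#include <atomic>
#include <cassert>

// Adds `value` to `target` and returns the value held before the addition.
inline int atomicAddSketch(std::atomic<int>& target, int value)
{
    return target.fetch_add(value);
}

// Stores max(target, value) using a compare-and-swap loop and returns the
// value held before the operation.
inline int atomicMaxSketch(std::atomic<int>& target, int value)
{
    int old = target.load();
    while(old < value && !target.compare_exchange_weak(old, value))
    {
        // On failure, `old` now holds the current value; the loop retries
        // only while the stored value is still smaller than `value`.
    }
    return old;
}
```

Returning the previous value lets the caller detect whether its own update took effect, which is the usual reason for building operations such as a maximum on top of compare-and-swap.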

Memory fences on block-, grid- or device level (guarantees LoadLoad and StoreStore ordering)
.. code-block:: c++
29 changes: 8 additions & 21 deletions docs/source/basic/example.rst
@@ -27,33 +27,25 @@ The following example shows a minimal example of a ``CMakeLists.txt`` that uses
:caption: CMakeLists.txt
cmake_minimum_required(VERSION 3.22)
set(_TARGET_NAME myProject)
project(${_TARGET_NAME})
project("myexample" CXX)
find_package(alpaka REQUIRED)
alpaka_add_executable(${_TARGET_NAME} helloWorld.cpp)
target_link_libraries(
${_TARGET_NAME}
PUBLIC alpaka::alpaka)
alpaka_add_executable(${PROJECT_NAME} helloWorld.cpp)
target_link_libraries(${PROJECT_NAME} PUBLIC alpaka::alpaka)
In the CMake configuration phase of the project, you must activate the accelerator you want to use:

.. code-block:: bash
cd <path/to/the/project/root>
mkdir build && cd build
# enable the CUDA accelerator
cmake .. -Dalpaka_ACC_GPU_CUDA_ENABLE=ON
# compile and link
cmake --build .
# execute application
./myProject
./myexample
A complete list of CMake flags for the accelerator can be found :doc:`here </advanced/cmake>`.

If the configuration was successful and CMake found the CUDA SDK, the C++ template accelerator type ``alpaka::acc::AccGpuCudaRt`` is available.
If the configuration was successful and CMake found the CUDA SDK, the C++ template accelerator type ``alpaka::AccGpuCudaRt`` is available.

Use alpaka via ``add_subdirectory``
-----------------------------------
@@ -64,15 +56,10 @@ The ``add_subdirectory`` method does not require alpaka to be installed. Instead
:caption: CMakeLists.txt
cmake_minimum_required(VERSION 3.22)
set(_TARGET_NAME myProject)
project(${_TARGET_NAME})
project("myexample" CXX)
add_subdirectory(thirdParty/alpaka)
alpaka_add_executable(${_TARGET_NAME} helloWorld.cpp)
target_link_libraries(
${_TARGET_NAME}
PUBLIC alpaka::alpaka)
alpaka_add_executable(${PROJECT_NAME} helloWorld.cpp)
target_link_libraries(${PROJECT_NAME} PUBLIC alpaka::alpaka)
The CMake configure and build commands are the same as for the ``find_package`` approach.
9 changes: 2 additions & 7 deletions docs/source/basic/install.rst
@@ -44,12 +44,7 @@ By default, no accelerator is enabled because some combinations of compilers and

.. code-block::
# create build folder
mkdir build && cd build
# run cmake configure with enable CUDA backend
cmake -Dalpaka_ACC_GPU_CUDA_ENABLE=ON ..
# compile source code
cmake --build .
cmake -Dalpaka_ACC_GPU_CUDA_ENABLE=ON ...
In the overview of :doc:`cmake arguments </advanced/cmake>` you will find all CMake flags for activating the different accelerators. How to select an accelerator in the source code is described on the :doc:`example page </basic/example>`.

@@ -60,4 +55,4 @@ In the overview of :doc:`cmake arguments </advanced/cmake>` you will find all CM

.. hint::

When the test or examples are activated, the alpaka build system automatically activates the ``serial backend``, as it is needed for many tests. Therefore, the tests are run with the ``serial backend`` by default. If you want to test another backend, you have to activate it at CMake configuration time, for example the ``HIP`` backend: ``cmake .. -DBUILD_TESTING=ON -Dalpaka_ACC_GPU_HIP_ENABLE=ON``. The alpaka tests use a selector algorithm to choose a specific accelerator for the test cases. The selector works with accelerator priorities. Therefore, it is recommended to enable only one accelerator for a build to make sure that the right one is used.
When the test or examples are activated, the alpaka build system automatically activates the ``serial backend``, as it is needed for many tests. Therefore, the tests are run with the ``serial backend`` by default. If you want to test another backend, you have to activate it at CMake configuration time, for example the ``HIP`` backend: ``cmake .. -DBUILD_TESTING=ON -Dalpaka_ACC_GPU_HIP_ENABLE=ON``. Some alpaka tests use a selector algorithm to choose a specific accelerator for the test cases. The selector works with accelerator priorities. Therefore, it is recommended to enable only one accelerator for a build to make sure that the right one is used.
6 changes: 3 additions & 3 deletions docs/source/basic/intro.rst
@@ -4,9 +4,9 @@ Introduction
The *alpaka* library defines and implements an abstract interface for the *hierarchical redundant parallelism* model.
This model exploits task- and data-parallelism as well as memory hierarchies at all levels of current multi-core architectures.
This allows to achieve performance portability across various types of accelerators by ignoring specific unsupported levels and utilizing only the ones supported on a specific accelerator.
All hardware types (multi- and many-core CPUs, GPUs and other accelerators) are treated and can be programmed in the same way.
The *alpaka* library provides back-ends for *CUDA*, *OpenMP*, *HIP* and other methods.
The policy-based C++ template interface provided allows for straightforward user-defined extension of the library to support other accelerators.
All hardware types (CPUs, GPUs and other accelerators) are treated and can be programmed in the same way.
The *alpaka* library provides back-ends for *CUDA*, *OpenMP*, *HIP*, *SYCL* and other technologies.
The trait-based C++ template interface provided allows for straightforward user-defined extension of the library to support other accelerators.

The library name *alpaka* is an acronym standing for **A**\ bstraction **L**\ ibrary for **Pa**\ rallel **K**\ ernel **A**\ cceleration.

2 changes: 1 addition & 1 deletion docs/source/basic/library.rst
@@ -117,7 +117,7 @@ Kernels can also be defined via lambda expressions.
}
.. attention::
The Nvidia ``nvcc`` does not support generic lambdas which are marked with `__device__`, which is what `ALPAKA_FN_ACC` expands to (among others) when the CUDA backend is active.
NVIDIA's ``nvcc`` compiler does not support generic lambdas which are marked with `__device__`, which is what `ALPAKA_FN_ACC` expands to (among others) when the CUDA backend is active.
Therefore, a workaround is required. The type of the ``acc`` must be defined outside the lambda.

.. code-block:: cpp
5 changes: 3 additions & 2 deletions docs/source/index.rst
@@ -10,10 +10,11 @@

*alpaka - An Abstraction Library for Parallel Kernel Acceleration*

The alpaka library is a header-only C++14 abstraction library for accelerator development. Its aim is to provide performance portability across accelerators through the abstraction (not hiding!) of the underlying levels of parallelism.
The alpaka library is a header-only C++17 abstraction library for accelerator development.
Its aim is to provide performance portability across accelerators through the abstraction (not hiding!) of the underlying levels of parallelism.

.. CAUTION::
The readthedocs pages are work in progress and contain outdated sections.
The readthedocs pages are provided with best effort, but may contain outdated sections.

alpaka - How to Read This Document
----------------------------------