Commit 0f6afbd

Updated README and CMake Version Requirement (GridTools#76)
Fixed out-of-date information in the README and updated the CMake minimum version requirement to match that of GridTools 2.2.
1 parent 0a9eded commit 0f6afbd

2 files changed (+9 -9)

CMakeLists.txt (+1 -1)

@@ -1,4 +1,4 @@
-cmake_minimum_required(VERSION 3.14.5)
+cmake_minimum_required(VERSION 3.18.1)
 
 file(STRINGS "version.txt" _gtbench_version)
 project(GTBench VERSION ${_gtbench_version} LANGUAGES CXX)
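Since the minimum required CMake version is raised to 3.18.1 here, configuring with an older CMake will stop at the `cmake_minimum_required` check. A minimal sketch of verifying the installed version before an out-of-source configure (the build directory name is chosen arbitrarily):

```console
$ cmake --version          # should report 3.18.1 or newer
$ mkdir -p build && cd build
$ cmake ..
```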

README.md (+8 -8)

@@ -10,8 +10,8 @@ be installed automatically when building GT Bench with cmake unless specified ot
 
 Further external dependencies are listed below:
 Required:
-- [CMake](https://cmake.org/) (minimum version 3.14.5)
-- [Boost](https://www.boost.org/) (minimun version 1.73.0)
+- [CMake](https://cmake.org/) (minimum version 3.18.1)
+- [Boost](https://www.boost.org/) (minimum version 1.73.0)
 - MPI (for example [OpenMPI](https://github.com/open-mpi/ompi))
 
 Optional:
@@ -35,7 +35,7 @@ The backend can be selected by setting the `GTBENCH_BACKEND` option when configu
 ```console
 $ cmake -DGTBENCH_BACKEND=<BACKEND> ..
 ```
-where `<BACKEND>` must be either `cpu_kfirst`, `cpu_ifirst`, or `gpu`. The `cpu_kfirst` and `cpu_ifirst` backends are two different CPU-backends of GridTools. On modern CPUs with large vector width and/or many cores, the `cpu_ifirst` backend might perform significantly better. On CPUs without vectorization or small vector width and limited parallelism, the `cpu_kfirst` backend might perform better. The `hip` backend currently supports running NVIDIA CUDA-capable GPUs and AMD HIP-capable GPUs.
+where `<BACKEND>` must be either `cpu_kfirst`, `cpu_ifirst`, or `gpu`. The `cpu_kfirst` and `cpu_ifirst` backends are two different CPU-backends of GridTools. On modern CPUs with large vector width and/or many cores, the `cpu_ifirst` backend might perform significantly better. On CPUs without vectorization or small vector width and limited parallelism, the `cpu_kfirst` backend might perform better. The `gpu` backend currently supports running NVIDIA CUDA-capable GPUs and AMD HIP-capable GPUs.
 
 ### Selecting the GPU Compilation Framework
 
@@ -56,7 +56,7 @@ where `RUNTIME` can be `ghex_comm`, `gcl`, `simple_mpi`, `single_node`.
 - The `simple_mpi` implementation uses a simple MPI 2 sided communication for halo exchanges.
 - The `gcl` implementation uses a optimized MPI based communication library shipped with [GridTools](https://gridtools.github.io/gridtools/latest/user_manual/user_manual.html#halo-exchanges).
 - The `ghex_comm` option will use highly optimized distributed communication via the GHEX library, designed for best performance at scale.
-Additionally, this option will enable a multi-threaded version of the benchmark, where a rank may have more than one sub-domain (over-subscription), which are delegated to separate threads. **Note:** The gridtools computations use openmp threads on the CPU targets which will not be affected by this parameter.
+Additionally, this option will enable a multi-threaded version of the benchmark, where a rank may have more than one sub-domain (over-subscription), which are delegated to separate threads. **Note:** The gridtools computations use OpenMP threads on the CPU targets which will not be affected by this parameter.
 
 #### Selecting the Transport Layer for GHEX
 
@@ -88,9 +88,9 @@ To enable xpmem support, pass additionally the following flags
 
 ### Benchmark
 
-The benchmark executable requires the global horizontal domain size as a command line parameter. The simulation will then be performed on a total domain size of `NX×NY×60` grid points. To launch the benchmark use the appropriate MPI launcher (`mpirun`, `mpiexec`, `srun`, or similar):
+The benchmark executable requires the domain size as a command line parameter. The simulation will then be performed on a total domain size of `NX×NY×NZ` grid points. To launch the benchmark use the appropriate MPI launcher (`mpirun`, `mpiexec`, `srun`, or similar):
 ```console
-$ mpi_launcher <LAUNCHER_OPTIONS> ./benchmark --domain-size <NX> <NY>
+$ mpi_launcher <LAUNCHER_OPTIONS> ./benchmark --domain-size <NX> <NY> <NZ>
 ```
 
 Example output of a single-node benchmark run:
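As a concrete instance of the updated three-argument `--domain-size`, a launch with OpenMPI's `mpirun` on four ranks might look as follows (rank count and domain size chosen arbitrarily for illustration):

```console
$ mpirun -np 4 ./benchmark --domain-size 24000 24000 60
```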
@@ -106,11 +106,11 @@ Columns per second: 50484.1 (95% confidence: 49908.1 - 50622.6)
 
 For testing, the number of runs (and thus the run time) can be reduced as follows:
 ```console
-$ mpi_launcher <LAUNCHER_OPTIONS> ./benchmark --domain-size <N> <NY> --runs <RUNS>
+$ mpi_launcher <LAUNCHER_OPTIONS> ./benchmark --domain-size <NX> <NY> <NZ> --runs <RUNS>
 ```
 For example, run only once:
 ```console
-$ mpi_launcher ./benchmark --domain-size 24000 24000 --runs 1
+$ mpi_launcher ./benchmark --domain-size 24000 24000 60 --runs 1
 Running GTBENCH
 Domain size: 24000x24000x60
 Floating-point type: float
