Radial Basis Functions with Finite Differencing for Shallow Water Equations


Last update: Richelle Streater, September 7, 2018


Getting started

Once all folders are on the device, navigate to the top directory: Gen3/

On HPCL:
Compile with the following command: sbatch
Run with the following command: sbatch run/HPCL/ from Gen3/ directory
Expected output: output_cfdl.txt, output_sfdl.txt, output_cfdl_sfdl.txt, output_default.txt in run/HPCL/output/

On personal device:
Compile with the following command: . ./
Run with the following command: . ./run/openCL/
Expected output: output_cfdl.txt, output_sfdl.txt, output_cfdl_sfdl.txt, output_default.txt in run/openCL/output/



To run with OpenCL:
--> Requires OpenCL version 2.0 or higher
--> Add -L and -I flags to OCL_LIBS/OCL_FLAGS in config.swe if necessary
--> Set SPLIT_DEV=0 if the device does not support subdividing
--> Set OPENCL=1 in config.swe before compiling
--> Set SWE_USE_OCL=1 in run script

To run with OpenMP:
--> Set OPENMP=1 in config.swe before compiling
--> Set OMP_NUM_THREADS in run script
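
As a rough illustration of what the OPENMP=1 build enables, the sketch below sums one value per node with an OpenMP reduction. The function name sum_over_nodes and its arguments are hypothetical and are not taken from swe_code. When compiled with OpenMP support (for example -fopenmp with gcc), the loop is shared among the threads requested through OMP_NUM_THREADS; otherwise the pragma is ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from swe_code): sums one value per node.
   With OpenMP enabled the loop is split across OMP_NUM_THREADS threads;
   without it the pragma is ignored and the loop runs serially. */
double sum_over_nodes(const double *values, long n_nodes)
{
    assert(values != NULL && n_nodes >= 0);
    double total = 0.0;
#pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n_nodes; i++) {
        total += values[i];
    }
    return total;
}
```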

To run with MPI:
--> Set MPI=1 in config.swe before compiling
--> OpenCL with multiple tasks is possible, but device splitting with MPI is not implemented
--> Change LD_LIBRARY_PATH and PATH in run/hpcl/ or run/hpcl/ if necessary

To use NetCDF:
--> Set NCIO=1 in config.swe before compiling
--> Set SWE_USE_NETCDF=1 in run script and set SWE_INPUT_FILE to a .nc file
--> Change NETCDF variable in if necessary

To run with Intel Compiler:
--> Set MPICC=mpiicc and CC=icc in config.swe
--> Load icc module before compiling if necessary
--> Change LD_LIBRARY_PATH and PATH in run/hpcl/ or run/hpcl/ if necessary


Replicating test results in read_output_file/results.xlsx on NCAR HPCL

For all:
--> To compile: "sbatch"
--> To run: "sbatch run/hpcl/" or "sbatch run/hpcl/" from Gen3/ directory
--> To get output files: Copy output files into read_output_file/output, then compile and run read_output_file/read_script.cpp

Tabs 10242 through 655362:
--> Set OPT_FLAGS = -O3 in arch/hpcl/config.swe
--> Compile with "CC=gcc" and "MPICC=mpicc" in arch/hpcl/config.swe
--> In run/HPCL/, set SWE_NODES to desired number (ex. 10242)
--> In run/HPCL/, Layout array should be "cfdl sfdl cfdl_sfdl default" and SWE_NODES=40962
--> Run with

OpenMP gcc tab:
--> Set OPT_FLAGS = -O3 in arch/hpcl/config.swe
--> Compile with "CC=gcc" and "MPICC=mpicc" in arch/hpcl/config.swe
--> In run/HPCL/, Layout array should be "cfdl sfdl cfdl_sfdl default" and SWE_NODES=40962
--> Run

OpenMP icc tab:
--> Set OPT_FLAGS = -O3 -xHost in arch/hpcl/config.swe
--> Compile with "CC=icc" and "MPICC=mpiicc" in arch/hpcl/config.swe
--> Repeat the layout and run steps from "OpenMP gcc tab"

OpenMP KMP_aff tab:
--> Compile with "CC=icc" and "MPICC=mpiicc" in arch/hpcl/config.swe
--> In run/HPCL/, set Layout array to "cfdl" and SWE_NODES=40962
--> Set KMP_AFFINITY=compact and run
--> Set KMP_AFFINITY=disabled and run
--> Set KMP_AFFINITY=scatter and run
--> Set KMP_AFFINITY=balanced and run

Aliasing tab:
--> Set "CC=gcc" and "MPICC=mpicc" in arch/hpcl/config.swe
--> Compile with OPT_FLAGS=-fstrict-aliasing in arch/hpcl/config.swe
--> In run/HPCL/, set Layout array to "cfdl" and SWE_NODES=40962
--> Run
--> Compile with OPT_FLAGS=-fno-strict-aliasing in arch/hpcl/config.swe and run with


Directory structure

Top Directory Folders:

Gen3/arch: Contains configuration parameters for hpcl and pascal testing and for general 
GNU or Intel setup

Gen3/inputFiles: Contains all binary/NetCDF input files for the code

Gen3/read_output_file: Contains code to read eval_rhs values from output files

Gen3/run: Contains run scripts for HPCL and general OpenCL setup

Gen3/swe_code: Contains all C and OpenCL (.c/.cl) source code


swe_code folder structure:

--> input.c: Reads input files in either binary or NetCDF format and fills all differentiation matrices, 
state variable matrices, ordering, and constants
--> nc2bin.c: Converts a .nc file to binary format (not called by main function)

--> layout.c: Calls padding/reordering functions for differentiation matrices/state variable matrices
--> matrix_transformations.c: Functions to pad matrices (to allow for tiling/vectorizing) and rearrange based 
on CFDL/SFDL options
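
The padding step can be pictured with the sketch below, which copies a row-major matrix into a buffer whose row length is rounded up to a multiple of a chosen width. The function names, the row-major storage, and the zero fill are assumptions made for illustration; the actual routines in matrix_transformations.c may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of width. */
size_t padded_length(size_t n, size_t width)
{
    assert(width > 0);
    return ((n + width - 1) / width) * width;
}

/* Hypothetical helper: copy a row-major rows x cols matrix into a new
   buffer whose row length is a multiple of width. The extra entries are
   zero. The caller frees the result; NULL is returned if allocation fails. */
double *pad_rows(const double *src, size_t rows, size_t cols,
                 size_t width, size_t *padded_cols)
{
    assert(src != NULL && padded_cols != NULL);
    size_t pc = padded_length(cols, width);
    double *dst = calloc(rows * pc, sizeof *dst);
    if (dst == NULL) {
        return NULL;
    }
    for (size_t r = 0; r < rows; r++) {
        /* Copy the real entries; the tail of each row stays zero. */
        memcpy(dst + r * pc, src + r * cols, cols * sizeof *dst);
    }
    *padded_cols = pc;
    return dst;
}
```

With rows of a fixed multiple of the vector or tile width, inner loops need no remainder handling, at the cost of some extra memory.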

--> main.c: Calls reading/reordering functions and calls patch initialization functions (for MPI). Declares
OpenCL objects and compiles kernels, opens device/platform, loads buffers, and sets kernel arguments for 
OpenCL. For n attempts and time steps, calls Runge-Kutta stepping function. Compares results to known array.
--> profiling.c: Uses arrays of loop times to determine average time, min/max, and std dev for all operations.
Prints timing results.
--> rk4_rbffd_swe.c: Computes Runge-Kutta step with radial basis function finite differencing algorithm.
--> runtime_params.c: Processes external variables set in run script.
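
Variables such as SWE_NODES or SWE_USE_OCL are set in the run script and read at run time. One plausible way to do that in C is through getenv, as in the minimal sketch below. The helper names parse_long_or and env_long and the fallback behavior are hypothetical; runtime_params.c may handle these variables differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a decimal integer; return fallback if text is NULL, empty,
   or contains anything other than a single integer. */
long parse_long_or(const char *text, long fallback)
{
    if (text == NULL || *text == '\0') {
        return fallback;
    }
    char *end = NULL;
    long value = strtol(text, &end, 10);
    return (*end == '\0') ? value : fallback;
}

/* Hypothetical helper: read an integer setting from the environment,
   e.g. env_long("SWE_NODES", 10242). */
long env_long(const char *name, long fallback)
{
    assert(name != NULL);
    return parse_long_or(getenv(name), fallback);
}
```

A call such as env_long("SWE_USE_OCL", 0) would then yield 0 unless the run script exported a different value.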

--> halos.c: Function for exchanging neighbor node information so that state variable matrix can be divided
among MPI ranks
--> init_patches.c: Creates divided matrices and copies read and reordered arrays from rank 0 to other 
MPI ranks.
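
To make the idea of dividing the nodes into patches concrete, the sketch below computes a contiguous block of nodes for each MPI process. This is a common block partition and is only an assumption here; the type patch_range and the function patch_for_rank are hypothetical, and init_patches.c may divide the arrays differently.

```c
#include <assert.h>

/* Half-open range [begin, end) of nodes owned by one MPI process. */
typedef struct {
    long begin;
    long end;
} patch_range;

/* Hypothetical helper: split n_nodes as evenly as possible among
   n_ranks processes. The first (n_nodes % n_ranks) processes receive
   one extra node. */
patch_range patch_for_rank(long n_nodes, int n_ranks, int rank)
{
    assert(n_nodes >= 0 && n_ranks > 0 && rank >= 0 && rank < n_ranks);
    long base = n_nodes / n_ranks;
    long extra = n_nodes % n_ranks;
    patch_range p;
    p.begin = rank * base + (rank < extra ? rank : extra);
    p.end = p.begin + base + (rank < extra ? 1 : 0);
    return p;
}
```

Any stencil neighbor that falls outside a process's own range belongs to another patch, which is the information a halo exchange has to supply.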

--> buffers.c: Converts arrays into OpenCL buffer objects to be passed into the kernels and frees buffers
--> device_setup.c: Creates all OpenCL objects: kernels, devices, platforms, and command queues
--> RK_ocl.c: Version of rk4_rbffd_swe.c that uses OpenCL and calls OpenCL kernels
--> All Runge-Kutta step functions; vectorized along nodes and works for all layouts
--> All RK step functions; vectorized along u,v,w,h in state variable matrices. Only valid for
CFDL layout.
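
The phrase "vectorized along nodes" can be illustrated in plain C: in the sketch below every stage of a classical four-stage Runge-Kutta step is a simple loop over the node index, which is the loop a compiler or an OpenCL kernel would vectorize. The classical RK4 variant is an assumption suggested by the file name rk4_rbffd_swe.c, and toy_rhs is a hypothetical stand-in for the RBF-FD right-hand side.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical right-hand side used only for illustration:
   dy/dt = -y at every node. */
static void toy_rhs(const double *y, double *dydt, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dydt[i] = -y[i];
    }
}

/* One classical RK4 step of size dt, updating y in place.
   Returns 0 on success, -1 if scratch memory cannot be allocated. */
int rk4_step(double *y, size_t n, double dt)
{
    assert(y != NULL && n > 0);
    double *work = malloc(5 * n * sizeof *work);
    if (work == NULL) {
        return -1;
    }
    double *k1 = work, *k2 = work + n, *k3 = work + 2 * n;
    double *k4 = work + 3 * n, *tmp = work + 4 * n;

    toy_rhs(y, k1, n);
    for (size_t i = 0; i < n; i++) tmp[i] = y[i] + 0.5 * dt * k1[i];
    toy_rhs(tmp, k2, n);
    for (size_t i = 0; i < n; i++) tmp[i] = y[i] + 0.5 * dt * k2[i];
    toy_rhs(tmp, k3, n);
    for (size_t i = 0; i < n; i++) tmp[i] = y[i] + dt * k3[i];
    toy_rhs(tmp, k4, n);

    /* Weighted combination of the four stage slopes. */
    for (size_t i = 0; i < n; i++) {
        y[i] += dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
    }
    free(work);
    return 0;
}
```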

--> rcm.c: Calls all reorder functions for Reverse Cuthill-McKee ordering scheme
--> reorder_nodes: Defines mapping for Reverse Cuthill-McKee ordering scheme
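
For readers unfamiliar with Reverse Cuthill-McKee, the sketch below shows a textbook version of the ordering: a breadth-first search from a minimum-degree node, visiting new neighbors in order of increasing degree, followed by a reversal. The CSR input format and the function names are assumptions for illustration and do not describe the interfaces in rcm.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of neighbors of node v in a CSR adjacency structure. */
static int degree(const int *row_ptr, int v)
{
    return row_ptr[v + 1] - row_ptr[v];
}

/* Textbook Reverse Cuthill-McKee for an undirected graph in CSR form.
   row_ptr has n + 1 entries; col_idx lists the neighbors of each node.
   On success perm[k] is the original index of the node placed at
   position k. Returns 0 on success, -1 on allocation failure. */
int rcm_order(int n, const int *row_ptr, const int *col_idx, int *perm)
{
    assert(n > 0 && row_ptr != NULL && col_idx != NULL && perm != NULL);
    char *visited = calloc((size_t)n, 1);
    if (visited == NULL) {
        return -1;
    }
    int count = 0; /* nodes placed so far */

    while (count < n) {
        /* Start each connected component at an unvisited node of
           minimum degree. */
        int start = -1;
        for (int v = 0; v < n; v++) {
            if (!visited[v] &&
                (start < 0 || degree(row_ptr, v) < degree(row_ptr, start))) {
                start = v;
            }
        }
        visited[start] = 1;
        perm[count++] = start;

        /* Breadth-first search; perm doubles as the queue. */
        for (int head = count - 1; head < count; head++) {
            int u = perm[head];
            int first_new = count;
            for (int e = row_ptr[u]; e < row_ptr[u + 1]; e++) {
                int w = col_idx[e];
                if (!visited[w]) {
                    visited[w] = 1;
                    perm[count++] = w;
                }
            }
            /* Insertion sort of the newly added neighbors by degree. */
            for (int a = first_new + 1; a < count; a++) {
                int node = perm[a];
                int b = a - 1;
                while (b >= first_new &&
                       degree(row_ptr, perm[b]) > degree(row_ptr, node)) {
                    perm[b + 1] = perm[b];
                    b--;
                }
                perm[b + 1] = node;
            }
        }
    }

    /* Reverse the Cuthill-McKee order. */
    for (int a = 0, b = n - 1; a < b; a++, b--) {
        int t = perm[a];
        perm[a] = perm[b];
        perm[b] = t;
    }
    free(visited);
    return 0;
}
```

For a path graph whose nodes are labeled out of order, the resulting permutation places connected nodes next to each other, which is what reduces the bandwidth of the differentiation matrices.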


