Introduction

While reading the NVIDIA NCCL documentation, I noticed it says that NCCL does not define specific verbs for sendrecv, gather, gatherv, scatter, scatterv, alltoall, alltoallv, alltoallw, nor neighbor collectives. All of those operations can be expressed using a combination of ncclSend, ncclRecv, and ncclGroupStart/ncclGroupEnd, similarly to how they can be expressed with MPI_Isend, MPI_Irecv, and MPI_Waitall. So I used NCCL's ncclSend, ncclRecv, ncclGroupStart, and ncclGroupEnd APIs to implement the following functions:

  1. NCCLSendrecv
  2. NCCLGather
  3. NCCLScatter
  4. NCCLAlltoall

I used Open MPI's API as a reference when designing these functions.

Build

  1. Use a Linux machine.
  2. Make sure that Open MPI and NCCL are installed.
  3. Make sure your CUDA toolkit (nvcc) supports C++17 (-std=c++17); I use CUDA 11.4.
  4. Clone this repo.
  5. cd into any one of the three directories, then run:
    • make or make all to build the binary
    • make test to run the examples
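Before running make, it may help to confirm that the required tools are on your PATH. The snippet below is a small sketch; the helper name `check_tool` is made up for this example, and the tool list (nvcc, mpicxx, make) is an assumption about what the Makefiles invoke, so adjust it to match your setup.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found"
    fi
}

# Assumed prerequisites: the CUDA compiler, the Open MPI C++ wrapper, and make.
check_tool nvcc
check_tool mpicxx
check_tool make
```

If nvcc is found, `nvcc --version` prints the installed CUDA release, which you can compare against the C++17 requirement above.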

All of the newly added functions are in ncclEnhance.h.