Markdown fixes

neon60 committed May 23, 2024
1 parent d8dfffc commit dbfb2c8
Showing 5 changed files with 107 additions and 70 deletions.
18 changes: 9 additions & 9 deletions docs/how-to/faq.md
@@ -48,7 +48,7 @@ HIP code provides the same performance as native CUDA code, plus the benefits of

## What specific version of CUDA does HIP support?

-HIP APIs and features do not map to a specific CUDA version. HIP provides a strong subset of the functionality provided in CUDA, and the hipify tools can scan code to identify any unsupported CUDA functions * this is useful for identifying the specific features required by a given application.
+HIP APIs and features do not map to a specific CUDA version. HIP provides a strong subset of the functionality provided in CUDA, and the hipify tools can scan code to identify any unsupported CUDA functions - this is useful for identifying the specific features required by a given application.

However, we can provide a rough summary of the features included in each CUDA SDK and the support level in HIP. Each bullet below lists the major new language features in each CUDA release and then indicates which are supported/not supported in HIP:

@@ -143,11 +143,11 @@ Yes. HIP's CUDA path only exposes the APIs and functionality that work on both
"Extra" APIs, parameters, and features which exist in CUDA but not in HIP-Clang will typically result in compile-time or run-time errors.
Developers need to use the HIP API for most accelerator code and bracket any CUDA-specific code with preprocessor conditionals.
Developers concerned about portability should, of course, run on both platforms, and should expect to tune for performance.
-In some cases, CUDA has a richer set of modes for some APIs, and some C++ capabilities such as virtual functions * see the HIP @API documentation for more details.
+In some cases, CUDA has a richer set of modes for some APIs, and some C++ capabilities such as virtual functions - see the HIP @API documentation for more details.

## Can I develop HIP code on an AMD HIP-Clang platform?

-Yes. HIP's HIP-Clang path only exposes the APIs and functions that work on AMD runtime back ends. "Extra" APIs, parameters and features that appear in HIP-Clang but not CUDA will typically cause compile* or run-time errors. Developers must use the HIP API for most accelerator code and bracket any HIP-Clang specific code with preprocessor conditionals. Those concerned about portability should, of course, test their code on both platforms and should tune it for performance. Typically, HIP-Clang supports a more modern set of C++11/C++14/C++17 features, so HIP developers who want portability should be careful when using advanced C++ features on the HIP-Clang path.
+Yes. HIP's HIP-Clang path only exposes the APIs and functions that work on AMD runtime back ends. "Extra" APIs, parameters and features that appear in HIP-Clang but not CUDA will typically cause compile- or run-time errors. Developers must use the HIP API for most accelerator code and bracket any HIP-Clang specific code with preprocessor conditionals. Those concerned about portability should, of course, test their code on both platforms and should tune it for performance. Typically, HIP-Clang supports a more modern set of C++11/C++14/C++17 features, so HIP developers who want portability should be careful when using advanced C++ features on the HIP-Clang path.

## How to use HIP-Clang to build HIP programs?

@@ -164,9 +164,9 @@ NOTE: If HIP_ROCCLR_HOME is set, there is no need to set HIP_CLANG_PATH since hi

AMD clr (Common Language Runtime) is a repository for the AMD platform, which contains the source code for AMD's compute language runtimes, as follows:

-* hipamd * contains implementation of HIP language for AMD GPU.
-* rocclr * contains virtual device interfaces that compute runtimes interact with backends, such as ROCr on Linux and PAL on Windows.
-* opencl * contains implementation of OpenCL™ on the AMD platform.
+* hipamd - contains implementation of HIP language for AMD GPU.
+* rocclr - contains virtual device interfaces that compute runtimes interact with backends, such as ROCr on Linux and PAL on Windows.
+* opencl - contains implementation of OpenCL™ on the AMD platform.

## What is hipother?

@@ -194,7 +194,7 @@ HIP is a C++ runtime API that supports C-style applications as well.
Some C-style applications (and interfaces to other languages such as FORTRAN or Python) call certain HIP APIs but do not use kernel programming.
They can be compiled with a C compiler and run correctly; however, small details must be considered in the code. For example, initialization, as shown in the simple application below, uses the HIP struct dim3 with the file name "test.hip.cpp"

-```C++
+```cpp
#include "hip/hip_runtime_api.h"
#include "stdio.h"

@@ -219,7 +219,7 @@ dim3 grid2 = {1,1,1}; x=1, y=1, z=1
In which "dim3 grid1;" will yield a dim3 grid with all dimensional members x,y,z initialized to 1, as the default constructor behaves that way.
Further, if written:

-```C++
+```cpp
dim3 grid(2); // yields {2,1,1}
dim3 grid(2,3); // yields {2,3,1}
```
@@ -236,7 +236,7 @@ dim3 grid2 = {1,1,1}; x=1, y=1, z=1
In which "dim3 grid;" does not imply any initialization, no constructor is called, and dimensional values x,y,z of grid are undefined.
NOTE: To get the C++ default behavior, C programmers must additionally specify the right-hand side as shown below,

-```C++
+```cpp
dim3 grid = {1,1,1}; // initialized as in C++
```
37 changes: 18 additions & 19 deletions docs/how-to/hip_porting_guide.md
@@ -16,7 +16,8 @@ and provides practical suggestions on how to port CUDA code and work through com
### Scanning existing CUDA code to scope the porting effort

The **[hipexamine-perl.sh](https://github.com/ROCm/HIPIFY/blob/amd-staging/bin/hipexamine-perl.sh)** tool will scan a source directory to determine which files contain CUDA code and how much of that code can be automatically hipified.
-```
+
+```shell
> cd examples/rodinia_3.0/cuda/kmeans
> $HIP_DIR/bin/hipexamine-perl.sh .
info: hipify ./kmeans.h =====>
@@ -40,7 +41,7 @@ hipexamine-perl scans each code file (cpp, c, h, hpp, etc.) found in the specifi
* Files with no CUDA code (i.e., kmeans.h) print a one-line summary just listing the source file name.
* Files with CUDA code print a summary of what was found - for example the kmeans_cuda_kernel.cu file:

-```bash
+```shell
info: hipify ./kmeans_cuda_kernel.cu =====>
info: converted 40 CUDA->HIP refs( dev:0 mem:0 kern:0 builtin:37 math:0 stream:0 event:0
```
@@ -69,7 +70,6 @@ For each input file FILE, this script will:
This is useful for testing improvements to the hipify toolset.
The [hipconvertinplace-perl.sh](https://github.com/ROCm/HIPIFY/blob/amd-staging/bin/hipconvertinplace-perl.sh) script will perform inplace conversion for all code files in the specified directory.
This can be quite handy when dealing with an existing CUDA code base since the script preserves the existing directory structure
and filenames, so existing #include paths still work. After converting in-place, you can review the code to add additional parameters to
@@ -119,21 +119,21 @@ Note, `HIP_PLATFORM_NVCC` was previously defined if the HIP platform targeted NV
Often, it's useful to know whether the underlying compiler is HIP-Clang or nvcc. This knowledge can guard platform-specific code or aid in platform-specific performance tuning.
-```C++
+```cpp
#ifdef __HIP_PLATFORM_AMD__
// Compiled with HIP-Clang
#endif
```
-```C++
+```cpp
#ifdef __HIP_PLATFORM_NVIDIA__
// Compiled with nvcc
// Could be compiling with CUDA language extensions enabled (for example, a ".cu" file)
// Or could be in pass-through mode to an underlying host compiler (for example, a .cpp file)
```
-```C++
+```cpp
#ifdef __CUDACC__
// Compiled with nvcc (CUDA language extensions enabled)
```
@@ -146,7 +146,7 @@ nvcc makes two passes over the code: one for host code and one for device code.
HIP-Clang will have multiple passes over the code: one for the host code, and one for each architecture on the device code.
`__HIP_DEVICE_COMPILE__` is set to a nonzero value when the compiler (HIP-Clang or nvcc) is compiling code for a device inside a `__global__` kernel or for a device function. `__HIP_DEVICE_COMPILE__` can replace #ifdef checks on the `__CUDA_ARCH__` define.
-```C++
+```cpp
// #ifdef __CUDA_ARCH__
#if __HIP_DEVICE_COMPILE__
```
@@ -178,7 +178,7 @@ Unlike `__CUDA_ARCH__`, the `__HIP_DEVICE_COMPILE__` value is 1 or undefined, an
Some CUDA code tests `__CUDA_ARCH__` for a specific value to determine whether the machine supports a certain architectural feature. For instance,
-```C++
+```cpp
#if (__CUDA_ARCH__ >= 130)
// doubles are supported
```
@@ -187,7 +187,7 @@ This type of code requires special attention, since AMD and CUDA devices have di
The `__HIP_ARCH_*` defines can replace comparisons of `__CUDA_ARCH__` values:
-```C++
+```cpp
//#if (__CUDA_ARCH__ >= 130) // non-portable
if (__HIP_ARCH_HAS_DOUBLES__) { // portable HIP feature query
// doubles are supported
@@ -200,7 +200,7 @@ For host code, the `__HIP_ARCH__*` defines are set to 0. You should only use the
Host code should query the architecture feature flags in the device properties that hipGetDeviceProperties returns, rather than testing the "major" and "minor" fields directly:
-```C++
+```cpp
hipGetDeviceProperties(&deviceProp, device);
//if ((deviceProp.major == 1 && deviceProp.minor < 2)) // non-portable
if (deviceProp.arch.hasSharedInt32Atomics) { // portable HIP feature query
@@ -209,6 +209,7 @@ if (deviceProp.arch.hasSharedInt32Atomics) { // portable HIP feature
```
### Table of Architecture Properties
The table below shows the full set of architectural properties that HIP supports.
|Define (use only in device code) | Device Property (run-time query) | Comment |
@@ -292,7 +293,6 @@ hipcc adds the necessary libraries for HIP as well as for the accelerator compil
hipcc adds -lm by default to the link command.
## Linking Code With Other Compilers
CUDA code often uses nvcc for accelerator code (defining and launching kernels, typically defined in .cu or .cuh files).
@@ -302,7 +302,6 @@ In some cases, you must take care to ensure the data types and alignment of the
HIP-Clang generates both device and host code using the same Clang-based compiler. The code uses the same ABI as gcc, which allows code generated by different gcc-compatible compilers to be linked together. For example, code compiled using HIP-Clang can link with code compiled using "standard" compilers (such as gcc, ICC and Clang). Take care to ensure all compilers use the same standard C++ header and library formats.
### libc++ and libstdc++
hipcc links to libstdc++ by default. This provides better compatibility between g++ and HIP.
@@ -372,7 +371,7 @@ Kernel code should use ``` __attribute__((amdgpu_flat_work_group_size(<min>,<max
For example:
-```C++
+```cpp
__global__ void dot(double *a,double *b,const int n) __attribute__((amdgpu_flat_work_group_size(1, 512)))
```
@@ -388,7 +387,7 @@ For example:
Device Code:
-```C++
+```cpp
#include<hip/hip_runtime.h>
#include<hip/hip_runtime_api.h>
#include<iostream>
@@ -438,7 +437,7 @@ To get pointer's memory type in HIP/HIP-Clang, developers should use hipPointerG
For example:
-```C++
+```cpp
double * ptr;
hipMalloc(reinterpret_cast<void**>(&ptr), sizeof(double));
hipPointerAttribute_t attr;
@@ -453,7 +452,7 @@ Please note, hipMemoryType enum values are different from cudaMemoryType enum va
For example, on AMD platform, hipMemoryType is defined in hip_runtime_api.h,
-```C++
+```cpp
typedef enum hipMemoryType {
hipMemoryTypeHost = 0, ///< Memory is physically located on host
hipMemoryTypeDevice = 1, ///< Memory is physically located on device. (see deviceId for specific device)
@@ -464,7 +463,7 @@ typedef enum hipMemoryType {
```
Looking into the CUDA toolkit, it defines cudaMemoryType as follows:
-```C++
+```cpp
enum cudaMemoryType
{
cudaMemoryTypeUnregistered = 0, // Unregistered memory.
@@ -509,7 +508,7 @@ On an AMD platform, set the AMD_LOG_LEVEL environment variable to log HIP applic
The value of the setting controls the logging level:
-```C++
+```cpp
enum LogLevel {
LOG_NONE = 0,
LOG_ERROR = 1,
@@ -522,7 +521,7 @@ LOG_DEBUG = 4
The logging mask selects which types of functionality are logged during the execution of a HIP application.
It can be set to one of the following values:
-```C++
+```cpp
enum LogMask {
LOG_API = 1, //!< (0x1) API call
LOG_CMD = 2, //!< (0x2) Kernel and Copy Commands and Barriers