Markdown fixes

neon60 committed May 23, 2024
1 parent d8dfffc commit dbfb2c8
Showing 5 changed files with 107 additions and 70 deletions.
18 changes: 9 additions & 9 deletions docs/how-to/faq.md
@@ -48,7 +48,7 @@ HIP code provides the same performance as native CUDA code, plus the benefits of

## What specific version of CUDA does HIP support?

-HIP APIs and features do not map to a specific CUDA version. HIP provides a strong subset of the functionality provided in CUDA, and the hipify tools can scan code to identify any unsupported CUDA functions * this is useful for identifying the specific features required by a given application.
+HIP APIs and features do not map to a specific CUDA version. HIP provides a strong subset of the functionality provided in CUDA, and the hipify tools can scan code to identify any unsupported CUDA functions - this is useful for identifying the specific features required by a given application.

However, we can provide a rough summary of the features included in each CUDA SDK and the support level in HIP. Each bullet below lists the major new language features in each CUDA release and then indicates which are supported/not supported in HIP:

@@ -143,11 +143,11 @@ Yes. HIP's CUDA path only exposes the APIs and functionality that work on both
"Extra" APIs, parameters, and features which exist in CUDA but not in HIP-Clang will typically result in compile-time or run-time errors.
Developers need to use the HIP API for most accelerator code and bracket any CUDA-specific code with preprocessor conditionals.
Developers concerned about portability should, of course, run on both platforms, and should expect to tune for performance.
-In some cases, CUDA has a richer set of modes for some APIs, and some C++ capabilities such as virtual functions * see the HIP @API documentation for more details.
+In some cases, CUDA has a richer set of modes for some APIs, and some C++ capabilities such as virtual functions - see the HIP @API documentation for more details.

## Can I develop HIP code on an AMD HIP-Clang platform?

-Yes. HIP's HIP-Clang path only exposes the APIs and functions that work on AMD runtime back ends. "Extra" APIs, parameters and features that appear in HIP-Clang but not CUDA will typically cause compile* or run-time errors. Developers must use the HIP API for most accelerator code and bracket any HIP-Clang specific code with preprocessor conditionals. Those concerned about portability should, of course, test their code on both platforms and should tune it for performance. Typically, HIP-Clang supports a more modern set of C++11/C++14/C++17 features, so HIP developers who want portability should be careful when using advanced C++ features on the HIP-Clang path.
+Yes. HIP's HIP-Clang path only exposes the APIs and functions that work on AMD runtime back ends. "Extra" APIs, parameters and features that appear in HIP-Clang but not CUDA will typically cause compile- or run-time errors. Developers must use the HIP API for most accelerator code and bracket any HIP-Clang specific code with preprocessor conditionals. Those concerned about portability should, of course, test their code on both platforms and should tune it for performance. Typically, HIP-Clang supports a more modern set of C++11/C++14/C++17 features, so HIP developers who want portability should be careful when using advanced C++ features on the HIP-Clang path.

## How to use HIP-Clang to build HIP programs?

@@ -164,9 +164,9 @@ NOTE: If HIP_ROCCLR_HOME is set, there is no need to set HIP_CLANG_PATH since hi

AMD clr (Common Language Runtime) is a repository for the AMD platform, which contains the source code for AMD's compute language runtimes, as follows:

-* hipamd * contains implementation of HIP language for AMD GPU.
-* rocclr * contains virtual device interfaces that compute runtimes interact with backends, such as ROCr on Linux and PAL on Windows.
-* opencl * contains implementation of OpenCL™ on the AMD platform.
+* hipamd - contains implementation of HIP language for AMD GPU.
+* rocclr - contains virtual device interfaces that compute runtimes interact with backends, such as ROCr on Linux and PAL on Windows.
+* opencl - contains implementation of OpenCL™ on the AMD platform.

## What is hipother?

@@ -194,7 +194,7 @@ HIP is a C++ runtime API that supports C-style applications as well.
Some C-style applications (and interfaces to other languages such as FORTRAN or Python) call certain HIP APIs but do not use kernel programming.
They can be compiled with a C compiler and run correctly; however, small details must be considered in the code. For example, initialization, as shown in the simple application below, uses the HIP struct dim3 with the file name "test.hip.cpp"

-```C++
+```cpp
#include "hip/hip_runtime_api.h"
#include "stdio.h"

@@ -219,7 +219,7 @@ dim3 grid2 = {1,1,1}; x=1, y=1, z=1
In which "dim3 grid1;" will yield a dim3 grid with all dimensional members x,y,z initialized to 1, as the default constructor behaves that way.
Further, if written:

-```C++
+```cpp
dim3 grid(2); // yields {2,1,1}
dim3 grid(2,3); // yields {2,3,1}
```
@@ -236,7 +236,7 @@ dim3 grid2 = {1,1,1}; x=1, y=1, z=1
In which "dim3 grid;" does not imply any initialization, no constructor is called, and dimensional values x,y,z of grid are undefined.
NOTE: To get the C++ default behavior, C programmers must additionally specify the right-hand side as shown below,

-```C++
+```cpp
dim3 grid = {1,1,1}; // initialized as in C++
```
37 changes: 18 additions & 19 deletions docs/how-to/hip_porting_guide.md
@@ -16,7 +16,8 @@ and provides practical suggestions on how to port CUDA code and work through com
### Scanning existing CUDA code to scope the porting effort

The **[hipexamine-perl.sh](https://github.com/ROCm/HIPIFY/blob/amd-staging/bin/hipexamine-perl.sh)** tool will scan a source directory to determine which files contain CUDA code and how much of that code can be automatically hipified.
-```
+
+```shell
> cd examples/rodinia_3.0/cuda/kmeans
> $HIP_DIR/bin/hipexamine-perl.sh .
info: hipify ./kmeans.h =====>
@@ -40,7 +41,7 @@ hipexamine-perl scans each code file (cpp, c, h, hpp, etc.) found in the specifi
* Files with no CUDA code (i.e., kmeans.h) print a one-line summary just listing the source file name.
* Files with CUDA code print a summary of what was found - for example the kmeans_cuda_kernel.cu file:

-```bash
+```shell
info: hipify ./kmeans_cuda_kernel.cu =====>
info: converted 40 CUDA->HIP refs( dev:0 mem:0 kern:0 builtin:37 math:0 stream:0 event:0
```
@@ -69,7 +70,6 @@ For each input file FILE, this script will:
This is useful for testing improvements to the hipify toolset.
The [hipconvertinplace-perl.sh](https://github.com/ROCm/HIPIFY/blob/amd-staging/bin/hipconvertinplace-perl.sh) script will perform inplace conversion for all code files in the specified directory.
This can be quite handy when dealing with an existing CUDA code base since the script preserves the existing directory structure
and filenames, so existing #include paths still work. After converting in-place, you can review the code to add additional parameters to
@@ -119,21 +119,21 @@ Note, `HIP_PLATFORM_NVCC` was previously defined if the HIP platform targeted NV
Often, it's useful to know whether the underlying compiler is HIP-Clang or nvcc. This knowledge can guard platform-specific code or aid in platform-specific performance tuning.
-```C++
+```cpp
#ifdef __HIP_PLATFORM_AMD__
// Compiled with HIP-Clang
#endif
```
-```C++
+```cpp
#ifdef __HIP_PLATFORM_NVIDIA__
// Compiled with nvcc
// Could be compiling with CUDA language extensions enabled (for example, a ".cu" file)
// Or could be in pass-through mode to an underlying host compiler (for example, a .cpp file)
```
-```C++
+```cpp
#ifdef __CUDACC__
// Compiled with nvcc (CUDA language extensions enabled)
```
@@ -146,7 +146,7 @@ nvcc makes two passes over the code: one for host code and one for device code.
HIP-Clang will have multiple passes over the code: one for the host code, and one for each architecture on the device code.
`__HIP_DEVICE_COMPILE__` is set to a nonzero value when the compiler (HIP-Clang or nvcc) is compiling code for a device inside a `__global__` kernel or for a device function. `__HIP_DEVICE_COMPILE__` can replace #ifdef checks on the `__CUDA_ARCH__` define.
-```C++
+```cpp
// #ifdef __CUDA_ARCH__
#if __HIP_DEVICE_COMPILE__
```
@@ -178,7 +178,7 @@ Unlike `__CUDA_ARCH__`, the `__HIP_DEVICE_COMPILE__` value is 1 or undefined, an
Some CUDA code tests `__CUDA_ARCH__` for a specific value to determine whether the machine supports a certain architectural feature. For instance,
-```C++
+```cpp
#if (__CUDA_ARCH__ >= 130)
// doubles are supported
```
@@ -187,7 +187,7 @@ This type of code requires special attention, since AMD and CUDA devices have di
The `__HIP_ARCH_*` defines can replace comparisons of `__CUDA_ARCH__` values:
-```C++
+```cpp
//#if (__CUDA_ARCH__ >= 130) // non-portable
if (__HIP_ARCH_HAS_DOUBLES__) { // portable HIP feature query
// doubles are supported
@@ -200,7 +200,7 @@ For host code, the `__HIP_ARCH__*` defines are set to 0. You should only use the
Host code should query the architecture feature flags in the device properties that hipGetDeviceProperties returns, rather than testing the "major" and "minor" fields directly:
-```C++
+```cpp
hipGetDeviceProperties(&deviceProp, device);
//if ((deviceProp.major == 1 && deviceProp.minor < 2)) // non-portable
if (deviceProp.arch.hasSharedInt32Atomics) { // portable HIP feature query
@@ -209,6 +209,7 @@ if (deviceProp.arch.hasSharedInt32Atomics) { // portable HIP feature
```
### Table of Architecture Properties
The table below shows the full set of architectural properties that HIP supports.
|Define (use only in device code) | Device Property (run-time query) | Comment |
@@ -292,7 +293,6 @@ hipcc adds the necessary libraries for HIP as well as for the accelerator compil
hipcc adds -lm by default to the link command.
## Linking Code With Other Compilers
CUDA code often uses nvcc for accelerator code (defining and launching kernels, typically defined in .cu or .cuh files).
@@ -302,7 +302,6 @@ In some cases, you must take care to ensure the data types and alignment of the
HIP-Clang generates both device and host code using the same Clang-based compiler. The code uses the same ABI as gcc, which allows code generated by different gcc-compatible compilers to be linked together. For example, code compiled using HIP-Clang can link with code compiled using "standard" compilers (such as gcc, ICC and Clang). Take care to ensure all compilers use the same standard C++ header and library formats.
### libc++ and libstdc++
hipcc links to libstdc++ by default. This provides better compatibility between g++ and HIP.
@@ -372,7 +371,7 @@ Kernel code should use ``` __attribute__((amdgpu_flat_work_group_size(<min>,<max
For example:
-```C++
+```cpp
__global__ void dot(double *a,double *b,const int n) __attribute__((amdgpu_flat_work_group_size(1, 512)))
```
@@ -388,7 +387,7 @@ For example:
Device Code:
-```C++
+```cpp
#include<hip/hip_runtime.h>
#include<hip/hip_runtime_api.h>
#include<iostream>
@@ -438,7 +437,7 @@ To get pointer's memory type in HIP/HIP-Clang, developers should use hipPointerG
For example:
-```C++
+```cpp
double * ptr;
hipMalloc(reinterpret_cast<void**>(&ptr), sizeof(double));
hipPointerAttribute_t attr;
@@ -453,7 +452,7 @@ Please note, hipMemoryType enum values are different from cudaMemoryType enum va
For example, on AMD platform, hipMemoryType is defined in hip_runtime_api.h,
-```C++
+```cpp
typedef enum hipMemoryType {
hipMemoryTypeHost = 0, ///< Memory is physically located on host
hipMemoryTypeDevice = 1, ///< Memory is physically located on device. (see deviceId for specific device)
@@ -464,7 +463,7 @@ typedef enum hipMemoryType {
```
Looking into the CUDA toolkit, it defines cudaMemoryType as follows:
-```C++
+```cpp
enum cudaMemoryType
{
cudaMemoryTypeUnregistered = 0, // Unregistered memory.
@@ -509,7 +508,7 @@ On an AMD platform, set the AMD_LOG_LEVEL environment variable to log HIP applic
The value of the setting controls the logging level:
-```C++
+```cpp
enum LogLevel {
LOG_NONE = 0,
LOG_ERROR = 1,
@@ -522,7 +521,7 @@ LOG_DEBUG = 4
The logging mask selects which types of functionality are logged during the execution of a HIP application.
It can be set to one of the following values:
-```C++
+```cpp
enum LogMask {
LOG_API = 1, //!< (0x1) API call
LOG_CMD = 2, //!< (0x2) Kernel and Copy Commands and Barriers