Landing page update

ROCm · Nov 20, 2024 · 41cb111
1 parent a53c773
commit 41cb111
Showing 5 changed files with 107 additions and 31 deletions.
diff --git a/.wordlist.txt b/.wordlist.txt
@@ -116,6 +116,8 @@ omnitrace
 overindex
 overindexing
 oversubscription
+overutilized
+parallelizable
 pixelated
 pragmas
 preallocated
@@ -154,6 +156,7 @@ texels
 tradeoffs
 templated
 toolkits
+transfering
 typedefs
 unintuitive
 UMM

diff --git a/docs/index.md b/docs/index.md
@@ -1,34 +1,29 @@
-# HIP documentation
-
-The Heterogeneous-computing Interface for Portability (HIP) is a C++ runtime API and kernel language that lets you create portable applications for AMD and NVIDIA GPUs from a single source code. For more information, see [What is HIP?](./what_is_hip)
-
-Installation instructions are available from:
+<head>
+  <meta charset="UTF-8">
+  <meta name="description" content="HIP documentation and programming guide.">
+  <meta name="keywords" content="HIP, Heterogeneous-computing Interface for Portability, HIP programming guide">
+</head>
 
-* [Installing HIP](./install/install)
-* [Building HIP from source](./install/build)
-
-HIP enabled GPUs:  
+# HIP documentation
 
-* [Supported AMD GPUs on Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-gpus)
-* [Supported AMD GPUs on Windows](https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html#windows-supported-gpus)
-* [Supported NVIDIA GPUs](https://developer.nvidia.com/cuda-gpus)
+The Heterogeneous-computing Interface for Portability (HIP) is a C++ runtime API
+and kernel language that lets you create portable applications for AMD and
+NVIDIA GPUs from a single source code. For more information, see [What is HIP?](./what_is_hip)
 
 The HIP documentation is organized into the following categories:
 
 ::::{grid} 1 2 2 2
 :gutter: 3
 
-:::{grid-item-card} Conceptual
+:::{grid-item-card} Programming guide
 
+* [Introduction](./programming_guide)
 * {doc}`./understand/programming_model`
 * {doc}`./understand/hardware_implementation`
-* {doc}`./understand/amd_clr`
 * {doc}`./understand/compilers`
-
-:::
-
-:::{grid-item-card} How to
-
+* {doc}`./how-to/performance_guidelines`
+* [Debugging with HIP](./how-to/debugging)
+* {doc}`./how-to/logging`
 * {doc}`./how-to/hip_runtime_api`
   * {doc}`./how-to/hip_runtime_api/initialization`
   * {doc}`./how-to/hip_runtime_api/memory_management`
@@ -40,9 +35,7 @@ The HIP documentation is organized into the following categories:
 * [HIP porting guide](./how-to/hip_porting_guide)
 * [HIP porting: driver API guide](./how-to/hip_porting_driver_api)
 * {doc}`./how-to/hip_rtc`
-* {doc}`./how-to/performance_guidelines`
-* [Debugging with HIP](./how-to/debugging)
-* {doc}`./how-to/logging`
+* {doc}`./understand/amd_clr`
 
 :::
 

diff --git a/docs/programming_guide.rst b/docs/programming_guide.rst
@@ -0,0 +1,79 @@
+.. meta::
+    :description: HIP programming guide introduction
+    :keywords: HIP programming guide introduction, HIP programming guide
+
+.. _hip-programming-guide:
+
+********************************************************************************
+HIP programming guide introduction
+********************************************************************************
+
+This topic introduces key HIP programming concepts and links to more detailed information.
+
+Write GPU Kernels for Parallel Execution
+================================================================================
+
+To make the most of the parallelism inherent to GPUs, a thorough understanding
+of the :ref:`programming model <programming_model>` is helpful. The HIP
+programming model is designed to make it easy to map data-parallel algorithms to
+the architecture of GPUs. HIP employs the SIMT (Single Instruction, Multiple
+Threads) model with a multi-layered thread hierarchy for efficient
+execution.
+
+Understand the Target Architecture (CPU and GPU)
+================================================================================
+
+The :ref:`hardware implementation <hardware_implementation>` topic outlines the
+GPUs supported by HIP. In general, GPUs are made up of Compute Units that excel
+at executing parallelizable, computationally intensive workloads without complex
+control flow.
+
+Increase parallelism on multiple levels
+================================================================================
+
+To maximize performance and keep all system components fully utilized, the
+application should expose and efficiently manage as much parallelism as possible.
+:ref:`Parallel execution <parallel execution>` can be achieved at the
+application, device, and multiprocessor levels.
+
+The application’s host and device operations can achieve parallel execution
+through asynchronous calls, streams, or HIP graphs. On the device level,
+multiple kernels can execute concurrently when resources are available, and at
+the multiprocessor level, developers can overlap data transfers with
+computations to further optimize performance.
+
+Memory management
+================================================================================
+
+GPUs generally have their own distinct memory, also called :ref:`device
+memory <device_memory>`, separate from the :ref:`host memory <host_memory>`.
+Device memory needs to be managed separately from the host memory. This includes
+allocating the memory and transferring data between the host and the device. These
+operations can be performance critical, so it's important to know how to use
+them effectively. For more information, see :ref:`Memory management <memory_management>`.
+
+Synchronize CPU and GPU Workloads
+================================================================================
+
+Tasks on the host and devices run asynchronously, so proper synchronization is
+needed when dependencies between those tasks exist. The asynchronous execution of
+tasks is useful for fully utilizing the available resources. Even when only a
+single device is available, memory transfers and the execution of tasks can be
+overlapped with asynchronous execution.
+
+Error Handling
+================================================================================
+
+All functions in the HIP runtime API return an error value of type
+:cpp:enum:`hipError_t` that can be used to verify whether the function was
+successfully executed. It's important to check these return values to catch
+and handle errors where possible. Kernel launches are an exception: they
+don't return a value, so errors from a launch have to be retrieved with
+functions like :cpp:func:`hipGetLastError()`.
+
+Multi-GPU and Load Balancing
+================================================================================
+
+Large-scale applications that need more compute power can use multiple GPUs in
+the system. This requires distributing workloads across the GPUs to balance
+the load and prevent some GPUs from being overutilized while others are idle.
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
@@ -22,15 +22,16 @@ subtrees:
   - url: https://developer.nvidia.com/cuda-gpus
     title: NVIDIA supported GPUs
 
-- caption: Conceptual
+- caption: Programming guide
   entries:
+  - file: programming_guide
+    title: Introduction
   - file: understand/programming_model
   - file: understand/hardware_implementation
-  - file: understand/amd_clr
   - file: understand/compilers
-
-- caption: How to
-  entries:
+  - file: how-to/performance_guidelines
+  - file: how-to/debugging
+  - file: how-to/logging
   - file: how-to/hip_runtime_api
     subtrees:
     - entries:
@@ -55,9 +56,7 @@ subtrees:
   - file: how-to/hip_porting_guide
   - file: how-to/hip_porting_driver_api
   - file: how-to/hip_rtc
-  - file: how-to/performance_guidelines
-  - file: how-to/debugging
-  - file: how-to/logging
+  - file: understand/amd_clr
 
 - caption: Reference
   entries:

diff --git a/docs/understand/programming_model.rst b/docs/understand/programming_model.rst
@@ -2,7 +2,9 @@
   :description: This chapter explains the HIP programming model, the contract
                 between the programmer and the compiler/runtime executing the
                 code, how it maps to the hardware.
-  :keywords: AMD, ROCm, HIP, CUDA, API design
+  :keywords: ROCm, HIP, CUDA, API design, programming model
+
+.. _programming_model:
 
 *******************************************************************************
 HIP programming model