PTX: Add tcgen05 instructions #3607

Merged Jan 30, 2025 (11 commits)
10 changes: 10 additions & 0 deletions docs/libcudacxx/ptx/instructions.rst
@@ -25,6 +25,16 @@ PTX Instructions
   instructions/red_async
   instructions/st_async
   instructions/st_bulk
   instructions/tcgen05_alloc
   instructions/tcgen05_commit
   instructions/tcgen05_cp
   instructions/tcgen05_fence
   instructions/tcgen05_ld
   instructions/tcgen05_mma
   instructions/tcgen05_mma_ws
   instructions/tcgen05_shift
   instructions/tcgen05_st
   instructions/tcgen05_wait
   instructions/tensormap_replace
   instructions/tensormap_cp_fenceproxy
   instructions/special_registers
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_alloc.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-alloc:

tcgen05.alloc
=============

- PTX ISA:
  `tcgen05.alloc <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-alloc-tcgen05-dealloc-tcgen05-relinquish-alloc-permit>`__

.. include:: generated/tcgen05_alloc.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_commit.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-commit:

tcgen05.commit
==============

- PTX ISA:
  `tcgen05.commit <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-commit>`__

.. include:: generated/tcgen05_commit.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_cp.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-cp:

tcgen05.cp
==========

- PTX ISA:
  `tcgen05.cp <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-cp>`__

.. include:: generated/tcgen05_cp.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_fence.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-fence:

tcgen05.fence
=============

- PTX ISA:
  `tcgen05.fence <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-fence>`__

.. include:: generated/tcgen05_fence.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_ld.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-ld:

tcgen05.ld
==========

- PTX ISA:
  `tcgen05.ld <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-ld>`__

.. include:: generated/tcgen05_ld.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_mma.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-mma:

tcgen05.mma
===========

- PTX ISA:
  `tcgen05.mma <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-mma>`__

.. include:: generated/tcgen05_mma.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_mma_ws.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-mma-ws:

tcgen05.mma.ws
==============

- PTX ISA:
  `tcgen05.mma.ws <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-mma-ws>`__

.. include:: generated/tcgen05_mma_ws.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_shift.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-shift:

tcgen05.shift
=============

- PTX ISA:
  `tcgen05.shift <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-shift>`__

.. include:: generated/tcgen05_shift.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_st.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-st:

tcgen05.st
==========

- PTX ISA:
  `tcgen05.st <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-st>`__

.. include:: generated/tcgen05_st.rst
9 changes: 9 additions & 0 deletions docs/libcudacxx/ptx/instructions/tcgen05_wait.rst
@@ -0,0 +1,9 @@
.. _libcudacxx-ptx-instructions-tcgen05-wait:

tcgen05.wait
============

- PTX ISA:
  `tcgen05.wait <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tensorcore-5th-generation-instructions-tcgen05-wait>`__

.. include:: generated/tcgen05_wait.rst
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_alloc.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_ALLOC_H_
#define _CUDA_PTX_TCGEN05_ALLOC_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_alloc.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_ALLOC_H_
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_commit.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_COMMIT_H_
#define _CUDA_PTX_TCGEN05_COMMIT_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_commit.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_COMMIT_H_
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_cp.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_CP_H_
#define _CUDA_PTX_TCGEN05_CP_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_cp.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_CP_H_
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_fence.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_FENCE_H_
#define _CUDA_PTX_TCGEN05_FENCE_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_fence.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_FENCE_H_
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_ld.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_LD_H_
#define _CUDA_PTX_TCGEN05_LD_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_ld.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_LD_H_
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_mma.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_MMA_H_
#define _CUDA_PTX_TCGEN05_MMA_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_mma.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_MMA_H_
37 changes: 37 additions & 0 deletions libcudacxx/include/cuda/__ptx/instructions/tcgen05_mma_ws.h
@@ -0,0 +1,37 @@
// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of libcu++, the C++ Standard Library for your entire system,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
//
//===----------------------------------------------------------------------===//

#ifndef _CUDA_PTX_TCGEN05_MMA_WS_H_
#define _CUDA_PTX_TCGEN05_MMA_WS_H_

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/__ptx/ptx_dot_variants.h>
#include <cuda/__ptx/ptx_helper_functions.h>
#include <cuda/std/cstdint>

#include <nv/target> // __CUDA_MINIMUM_ARCH__ and friends

_LIBCUDACXX_BEGIN_NAMESPACE_CUDA_PTX

#include <cuda/__ptx/instructions/generated/tcgen05_mma_ws.h>

_LIBCUDACXX_END_NAMESPACE_CUDA_PTX

#endif // _CUDA_PTX_TCGEN05_MMA_WS_H_