Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISC-V] Add CSR read/write builtins #85091

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nemanjai
Copy link
Member

To facilitate proper range checking and better error messages if an attempt is made to call these with non-litaral arguments, we provide builtins to emit the read/write CSR instructions.

To facilitate proper range checking and better error messages
if an attempt is made to call these with non-litaral arguments,
we provide builtins to emit the read/write CSR instructions.
@nemanjai nemanjai requested review from asb, topperc and wangpc-pp March 13, 2024 14:46
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen llvm:ir labels Mar 13, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 13, 2024

@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Nemanja Ivanovic (nemanjai)

Changes

To facilitate proper range checking and better error messages if an attempt is made to call these with non-litaral arguments, we provide builtins to emit the read/write CSR instructions.


Full diff: https://github.com/llvm/llvm-project/pull/85091.diff

9 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsRISCV.td (+6)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+10)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+3)
  • (added) clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr-invalid.c (+25)
  • (added) clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr.c (+63)
  • (modified) llvm/include/llvm/IR/IntrinsicsRISCV.td (+15)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+17)
  • (added) llvm/test/CodeGen/RISCV/rv32zicsr-intrinsic.ll (+34)
  • (added) llvm/test/CodeGen/RISCV/rv64zicsr-intrinsic.ll (+34)
diff --git a/clang/include/clang/Basic/BuiltinsRISCV.td b/clang/include/clang/Basic/BuiltinsRISCV.td
index 4cc89a8a9d8af2..14ef6f9d313a41 100644
--- a/clang/include/clang/Basic/BuiltinsRISCV.td
+++ b/clang/include/clang/Basic/BuiltinsRISCV.td
@@ -20,6 +20,12 @@ class RISCVBuiltin<string prototype, string features = ""> : TargetBuiltin {
 
 let Attributes = [NoThrow, Const] in {
 //===----------------------------------------------------------------------===//
+// Zicsr extension.
+//===----------------------------------------------------------------------===//
+def csrr : RISCVBuiltin<"unsigned long int(unsigned long int)", "zicsr">;
+def csrw :
+  RISCVBuiltin<"void(unsigned long int, unsigned long int)", "zicsr">;
+//===----------------------------------------------------------------------===//
 // Zbb extension.
 //===----------------------------------------------------------------------===//
 def orc_b_32 : RISCVBuiltin<"unsigned int(unsigned int)", "zbb">;
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 93ab465079777b..99486d8aee6f9e 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -21379,6 +21379,16 @@ Value *CodeGenFunction::EmitRISCVBuiltinExpr(unsigned BuiltinID,
   llvm::SmallVector<llvm::Type *, 2> IntrinsicTypes;
   switch (BuiltinID) {
   default: llvm_unreachable("unexpected builtin ID");
+  // Zicsr
+  case RISCV::BI__builtin_riscv_csrr:
+  case RISCV::BI__builtin_riscv_csrw:
+    if (IntPtrTy->getScalarSizeInBits() == 32)
+      ID = BuiltinID == RISCV::BI__builtin_riscv_csrr ? Intrinsic::riscv_csrr
+                                                      : Intrinsic::riscv_csrw;
+    else
+      ID = BuiltinID == RISCV::BI__builtin_riscv_csrr ? Intrinsic::riscv_csrr64
+                                                      : Intrinsic::riscv_csrw64;
+    break;
   case RISCV::BI__builtin_riscv_orc_b_32:
   case RISCV::BI__builtin_riscv_orc_b_64:
   case RISCV::BI__builtin_riscv_clz_32:
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index a5f42b630c3fa2..637da165c73589 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -6223,6 +6223,9 @@ bool Sema::CheckRISCVBuiltinFunctionCall(const TargetInfo &TI,
   case RISCVVector::BI__builtin_rvv_vfwnmsac_vv_rm_mu:
   case RISCVVector::BI__builtin_rvv_vfwnmsac_vf_rm_mu:
     return SemaBuiltinConstantArgRange(TheCall, 4, 0, 4);
+  case RISCV::BI__builtin_riscv_csrr:
+  case RISCV::BI__builtin_riscv_csrw:
+    return SemaBuiltinConstantArgRange(TheCall, 0, 0, 4095);
   case RISCV::BI__builtin_riscv_ntl_load:
   case RISCV::BI__builtin_riscv_ntl_store:
     DeclRefExpr *DRE =
diff --git a/clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr-invalid.c b/clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr-invalid.c
new file mode 100644
index 00000000000000..d0420731977d05
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr-invalid.c
@@ -0,0 +1,25 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple riscv32 -target-feature +zicsr %s -fsyntax-only -verify
+// RUN: %clang_cc1 -triple riscv64 -target-feature +zicsr %s -fsyntax-only -verify
+// RUN: %clang_cc1 -triple riscv32 %s -fsyntax-only -verify
+
+#ifdef __riscv_zicsr
+unsigned long non_const(unsigned long a) {
+  return __builtin_riscv_csrr(a); // expected-error {{argument to '__builtin_riscv_csrr' must be a constant integer}}
+}
+
+unsigned long too_large() {
+  return __builtin_riscv_csrr(33312); // expected-error {{argument value 33312 is outside the valid range [0, 4095]}}
+}
+
+void non_const_write(unsigned long d) {
+  return __builtin_riscv_csrw(d, d); // expected-error {{argument to '__builtin_riscv_csrw' must be a constant integer}}
+}
+#else
+unsigned long read(unsigned long a) {
+  return __builtin_riscv_csrr(3); // expected-error {{builtin requires at least one of the following extensions: 'Zicsr'}}
+}
+void write(unsigned long d) {
+  return __builtin_riscv_csrw(3, d); // expected-error {{builtin requires at least one of the following extensions: 'Zicsr'}}
+}
+#endif
diff --git a/clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr.c b/clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr.c
new file mode 100644
index 00000000000000..c86c796303eba6
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/csr-intrinsics/riscv-zicsr.c
@@ -0,0 +1,63 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple riscv32 -target-feature +zicsr -emit-llvm %s -o - \
+// RUN:     -disable-O0-optnone | opt -S -passes=mem2reg \
+// RUN:     | FileCheck %s  -check-prefix=RV32ZICSR
+// RUN: %clang_cc1 -triple riscv64 -target-feature +zicsr -emit-llvm %s -o - \
+// RUN:     -disable-O0-optnone | opt -S -passes=mem2reg \
+// RUN:     | FileCheck %s  -check-prefix=RV64ZICSR
+
+// RV32ZICSR-LABEL: @readcsr(
+// RV32ZICSR-NEXT:  entry:
+// RV32ZICSR-NEXT:    [[TMP0:%.*]] = call i32 @llvm.riscv.csrr(i32 3)
+// RV32ZICSR-NEXT:    ret i32 [[TMP0]]
+//
+// RV64ZICSR-LABEL: @readcsr(
+// RV64ZICSR-NEXT:  entry:
+// RV64ZICSR-NEXT:    [[TMP0:%.*]] = call i64 @llvm.riscv.csrr64(i64 3)
+// RV64ZICSR-NEXT:    ret i64 [[TMP0]]
+//
+unsigned long readcsr() {
+  return __builtin_riscv_csrr(3);
+}
+
+// RV32ZICSR-LABEL: @readcsr_arbitrary(
+// RV32ZICSR-NEXT:  entry:
+// RV32ZICSR-NEXT:    [[TMP0:%.*]] = call i32 @llvm.riscv.csrr(i32 333)
+// RV32ZICSR-NEXT:    ret i32 [[TMP0]]
+//
+// RV64ZICSR-LABEL: @readcsr_arbitrary(
+// RV64ZICSR-NEXT:  entry:
+// RV64ZICSR-NEXT:    [[TMP0:%.*]] = call i64 @llvm.riscv.csrr64(i64 333)
+// RV64ZICSR-NEXT:    ret i64 [[TMP0]]
+//
+unsigned long readcsr_arbitrary() {
+  return __builtin_riscv_csrr(333);
+}
+
+// RV32ZICSR-LABEL: @writecsr(
+// RV32ZICSR-NEXT:  entry:
+// RV32ZICSR-NEXT:    call void @llvm.riscv.csrw(i32 3, i32 [[D:%.*]])
+// RV32ZICSR-NEXT:    ret void
+//
+// RV64ZICSR-LABEL: @writecsr(
+// RV64ZICSR-NEXT:  entry:
+// RV64ZICSR-NEXT:    call void @llvm.riscv.csrw64(i64 3, i64 [[D:%.*]])
+// RV64ZICSR-NEXT:    ret void
+//
+void writecsr(unsigned long d) {
+  return __builtin_riscv_csrw(3, d);
+}
+
+// RV32ZICSR-LABEL: @writecsr_arbitrary(
+// RV32ZICSR-NEXT:  entry:
+// RV32ZICSR-NEXT:    call void @llvm.riscv.csrw(i32 333, i32 [[D:%.*]])
+// RV32ZICSR-NEXT:    ret void
+//
+// RV64ZICSR-LABEL: @writecsr_arbitrary(
+// RV64ZICSR-NEXT:  entry:
+// RV64ZICSR-NEXT:    call void @llvm.riscv.csrw64(i64 333, i64 [[D:%.*]])
+// RV64ZICSR-NEXT:    ret void
+//
+void writecsr_arbitrary(unsigned long d) {
+  return __builtin_riscv_csrw(333, d);
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsRISCV.td b/llvm/include/llvm/IR/IntrinsicsRISCV.td
index 1d58860a0afc8a..71d20f5be0bfa2 100644
--- a/llvm/include/llvm/IR/IntrinsicsRISCV.td
+++ b/llvm/include/llvm/IR/IntrinsicsRISCV.td
@@ -74,6 +74,21 @@ let TargetPrefix = "riscv" in {
 
 } // TargetPrefix = "riscv"
 
+let TargetPrefix = "riscv" in {
+  // Zicsr
+  def int_riscv_csrr :
+    DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty],
+                          [IntrNoMem, IntrHasSideEffects, ImmArg<ArgIndex<0>>]>;
+  def int_riscv_csrr64 :
+    DefaultAttrsIntrinsic<[llvm_i64_ty], [llvm_i64_ty],
+                          [IntrNoMem, IntrHasSideEffects, ImmArg<ArgIndex<0>>]>;
+  def int_riscv_csrw :
+    DefaultAttrsIntrinsic<[], [llvm_i32_ty, llvm_i32_ty],
+                          [IntrNoMem, IntrHasSideEffects, ImmArg<ArgIndex<0>>]>;
+  def int_riscv_csrw64 :
+    DefaultAttrsIntrinsic<[], [llvm_i64_ty, llvm_i64_ty],
+                          [IntrNoMem, IntrHasSideEffects, ImmArg<ArgIndex<0>>]>;
+} // TargetPrefix = "riscv"
 //===----------------------------------------------------------------------===//
 // Bitmanip (Bit Manipulation) Extension
 
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index e753c1f1add0c6..fa01ae9fdeb915 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -565,6 +565,13 @@ class CSR_ir<bits<3> funct3, string opcodestr>
     : RVInstI<funct3, OPC_SYSTEM, (outs GPR:$rd), (ins csr_sysreg:$imm12, GPR:$rs1),
               opcodestr, "$rd, $imm12, $rs1">, Sched<[WriteCSR, ReadCSR]>;
 
+let hasNoSchedulingInfo = 1,
+    hasSideEffects = 1, mayLoad = 0, mayStore = 0 in
+class CSR_ir_x0<bits<3> funct3, string opcodestr>
+    : RVInstI<funct3, OPC_SYSTEM, (outs), (ins csr_sysreg:$imm12, GPR:$rs1),
+              opcodestr, "$imm12, $rs1">, Sched<[WriteCSR]> {
+  let rd = 0;
+}
 let hasNoSchedulingInfo = 1,
     hasSideEffects = 1, mayLoad = 0, mayStore = 0 in
 class CSR_ii<bits<3> funct3, string opcodestr>
@@ -733,6 +740,8 @@ def UNIMP : RVInstI<0b001, OPC_SYSTEM, (outs), (ins), "unimp", "">,
 
 } // hasSideEffects = 1, mayLoad = 0, mayStore = 0
 
+let isCodeGenOnly = 1 in
+def CSRW : CSR_ir_x0<0b001, "csrw">;
 def CSRRW : CSR_ir<0b001, "csrrw">;
 def CSRRS : CSR_ir<0b010, "csrrs">;
 def CSRRC : CSR_ir<0b011, "csrrc">;
@@ -1845,6 +1854,14 @@ def ReadCounterWide : Pseudo<(outs GPR:$lo, GPR:$hi), (ins i32imm:$csr_lo, i32im
                               (riscv_read_counter_wide csr_sysreg:$csr_lo, csr_sysreg:$csr_hi))],
                              "", "">;
 
+// Zicsr
+let Predicates = [IsRV64] in {
+  def : Pat<(i64 (int_riscv_csrr64 timm:$I)), (CSRRS csr_sysreg:$I, (XLenVT X0))>;
+  def : Pat<(int_riscv_csrw64 timm:$I, i64:$D), (CSRW csr_sysreg:$I, i64:$D)>;
+}
+def : Pat<(i32 (int_riscv_csrr timm:$I)), (CSRRS csr_sysreg:$I, (XLenVT X0))>;
+def : Pat<(int_riscv_csrw timm:$I, i32:$D), (CSRW csr_sysreg:$I, i32:$D)>;
+
 /// traps
 
 // We lower `trap` to `unimp`, as this causes a hard exception on nearly all
diff --git a/llvm/test/CodeGen/RISCV/rv32zicsr-intrinsic.ll b/llvm/test/CodeGen/RISCV/rv32zicsr-intrinsic.ll
new file mode 100644
index 00000000000000..b866bdb415cd30
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rv32zicsr-intrinsic.ll
@@ -0,0 +1,34 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+zicsr -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefix=RV32ZICSR
+
+declare i32 @llvm.riscv.csrr(i32 immarg)
+declare void @llvm.riscv.csrw(i32 immarg, i32)
+
+define i32 @read() nounwind {
+; RV32ZICSR-LABEL: read:
+; RV32ZICSR:       # %bb.0:
+; RV32ZICSR-NEXT:    csrr a0, fcsr
+; RV32ZICSR-NEXT:    csrr a1, fcsr
+; RV32ZICSR-NEXT:    csrr a2, stval
+; RV32ZICSR-NEXT:    add a1, a1, a2
+; RV32ZICSR-NEXT:    add a0, a0, a1
+; RV32ZICSR-NEXT:    ret
+  %val = call i32 @llvm.riscv.csrr(i32 3)
+  %val2 = call i32 @llvm.riscv.csrr(i32 3)
+  %val3 = call i32 @llvm.riscv.csrr(i32 323)
+  %add = add i32 %val2, %val3
+  %ret = add i32 %val, %add
+  ret i32 %ret
+}
+
+define void @testwrite(i32 %d) nounwind {
+; RV32ZICSR-LABEL: testwrite:
+; RV32ZICSR:       # %bb.0:
+; RV32ZICSR-NEXT:    csrw fcsr, a0
+; RV32ZICSR-NEXT:    csrw 3435, a0
+; RV32ZICSR-NEXT:    ret
+  call void @llvm.riscv.csrw(i32 3, i32 %d)
+  call void @llvm.riscv.csrw(i32 3435, i32 %d)
+  ret void
+}
diff --git a/llvm/test/CodeGen/RISCV/rv64zicsr-intrinsic.ll b/llvm/test/CodeGen/RISCV/rv64zicsr-intrinsic.ll
new file mode 100644
index 00000000000000..39163100673b0f
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rv64zicsr-intrinsic.ll
@@ -0,0 +1,34 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv64 -mattr=+zicsr -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s
+
+declare i64 @llvm.riscv.csrr64(i64 immarg)
+declare void @llvm.riscv.csrw64(i64 immarg, i64)
+
+define i64 @read() nounwind {
+; CHECK-LABEL: read:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrr a0, fcsr
+; CHECK-NEXT:    csrr a1, fcsr
+; CHECK-NEXT:    csrr a2, 111
+; CHECK-NEXT:    add a1, a1, a2
+; CHECK-NEXT:    add a0, a0, a1
+; CHECK-NEXT:    ret
+  %val = call i64 @llvm.riscv.csrr64(i64 3)
+  %val2 = call i64 @llvm.riscv.csrr64(i64 3)
+  %val3 = call i64 @llvm.riscv.csrr64(i64 111)
+  %add = add i64 %val2, %val3
+  %ret = add i64 %val, %add
+  ret i64 %ret
+}
+
+define void @testwrite(i64 %d) nounwind {
+; CHECK-LABEL: testwrite:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    csrw fcsr, a0
+; CHECK-NEXT:    csrw 3231, a0
+; CHECK-NEXT:    ret
+  call void @llvm.riscv.csrw64(i64 3, i64 %d)
+  call void @llvm.riscv.csrw64(i64 3231, i64 %d)
+  ret void
+}

@nemanjai
Copy link
Member Author

Individual implementations will provide different sets of CSR's and need a way to read/write them. Of course, this can be done with inline asm, but doing such things with inline asm has its limitations (no error checking, if a user attempts to wrap the asm in a function, they won't be able to build code that calls those functions with -O0 as inlining/constant propagation is required, etc.).

Providing builtins to do this allows the compiler to do range checking and ensure CSR # is a literal.

@@ -20,6 +20,12 @@ class RISCVBuiltin<string prototype, string features = ""> : TargetBuiltin {

let Attributes = [NoThrow, Const] in {
//===----------------------------------------------------------------------===//
// Zicsr extension.
//===----------------------------------------------------------------------===//
def csrr : RISCVBuiltin<"unsigned long int(unsigned long int)", "zicsr">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want "size_t" or "uintptr_t", please use that, not "unsigned long".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these are Const since they operate on state that is not part of their arguments/return.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can certainly change it to size_t. And I'll move them out of the section marked with Const.

def int_riscv_csrw64 :
DefaultAttrsIntrinsic<[], [llvm_i64_ty, llvm_i64_ty],
[IntrNoMem, IntrHasSideEffects, ImmArg<ArgIndex<0>>]>;
} // TargetPrefix = "riscv"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about CSR swap?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I only implemented these two initially for 2 reasons

  1. The user that requested them only needs these for now
  2. To validate the approach with the community

I think ultimately, we could provide read, write, set, clear and swap (and handle producing the immediate forms if the operand allows it).

@topperc
Copy link
Collaborator

topperc commented Mar 13, 2024

Individual implementations will provide different sets of CSR's and need a way to read/write them. Of course, this can be done with inline asm, but doing such things with inline asm has its limitations (no error checking,

Wouldn't the assembler error check the constant? Diagnostic is probably a bit uglier, but its not nothing.

if a user attempts to wrap the asm in a function, they won't be able to build code that calls those functions with -O0 as inlining/constant propagation is required, etc.).

You mean if they write a generic write_csr or read_csr function? If they use a write_frm or read_fcsr function then there is no constant propagation issue and they get to refer to the CSR by name in the assembly string instead of maintaining a constant in their code.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 13, 2024

I have always been unconvinced that these are a good idea to have / add significant value over using inline assembly. IIRC Arm has them but nobody uses them?

@nemanjai
Copy link
Member Author

nemanjai commented Mar 14, 2024

Individual implementations will provide different sets of CSR's and need a way to read/write them. Of course, this can be done with inline asm, but doing such things with inline asm has its limitations (no error checking,

Wouldn't the assembler error check the constant? Diagnostic is probably a bit uglier, but its not nothing.

Perhaps it would, but I really don't think it is particularly friendly to the user to rely on the assembler for this.

if a user attempts to wrap the asm in a function, they won't be able to build code that calls those functions with -O0 as inlining/constant propagation is required, etc.).

You mean if they write a generic write_csr or read_csr function? If they use a write_frm or read_fcsr function then there is no constant propagation issue and they get to refer to the CSR by name in the assembly string instead of maintaining a constant in their code.

Yes, I mean the general, unnamed CSR's.

@nemanjai
Copy link
Member Author

I have always been unconvinced that these are a good idea to have / add significant value over using inline assembly. IIRC Arm has them but nobody uses them?

Is this a comment about the general concept of builtins to produce specific instructions or about these specific ones?
For these ones, the patch isn't something I materialized out of thin air just because I think it's nice to have - it was in response to the very use case I described (from a library writer). They had something like:

void __attribute__((always_inline)) set_csr(unsigned CSRNum, unsigned Val) {
__asm__ volatile ...

Which works great if you're optimizing, but doesn't with -O0. Of course they could write it as a macro, but that has all the pitfalls of function-style macros.

Of course, if the consensus is to not provide these because of some reason, I'm fine with that - I would just like to understand the reason for the opposition to this.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 14, 2024

I have always been unconvinced that these are a good idea to have / add significant value over using inline assembly. IIRC Arm has them but nobody uses them?

Is this a comment about the general concept of builtins to produce specific instructions or about these specific ones? For these ones, the patch isn't something I materialized out of thin air just because I think it's nice to have - it was in response to the very use case I described (from a library writer). They had something like:

void __attribute__((always_inline)) set_csr(unsigned CSRNum, unsigned Val) {
__asm__ volatile ...

Which works great if you're optimizing, but doesn't with -O0. Of course they could write it as a macro, but that has all the pitfalls of function-style macros.

Of course, if the consensus is to not provide these because of some reason, I'm fine with that - I would just like to understand the reason for the opposition to this.

If it’s not a constant integer for inline assembly then how would it magically be a constant integer for an intrinsic? The IR’s going to be the same with an alloca/store/load, no?

@wangpc-pp
Copy link
Contributor

I support adding these builtins personally, but I think we need more discussions on the design.
We can achieve the same thing via inline assemblies, that's true. But, from the compiler side, inline assemblies are kind of barriers, we can't do a lot of optimizations/reorderings if inline assemblies exist. If we make it a builtin, these limitations can be loosed I think.

@topperc
Copy link
Collaborator

topperc commented Mar 14, 2024

I support adding these builtins personally, but I think we need more discussions on the design. We can achieve the same thing via inline assemblies, that's true. But, from the compiler side, inline assemblies are kind of barriers, we can't do a lot of optimizations/reorderings if inline assemblies exist. If we make it a builtin, these limitations can be loosed I think.

Inline assembly is only a barrier to moving load/stores and other side effecting instructions. And only if the inline assembly is marked volatile or has "memory" clobber. It should not affect moving arithmetic. I think these CSR intrinsics would also need to be barriers against load/stores and other side effects to be the most conservative to access any CSR.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 14, 2024

I support adding these builtins personally, but I think we need more discussions on the design. We can achieve the same thing via inline assemblies, that's true. But, from the compiler side, inline assemblies are kind of barriers, we can't do a lot of optimizations/reorderings if inline assemblies exist. If we make it a builtin, these limitations can be loosed I think.

Inline assembly is only a barrier to moving load/stores and other side effecting instructions. And only if the inline assembly is marked volatile or has "memory" clobber. It should not affect moving arithmetic. I think these CSR intrinsics would also need to be barriers against load/stores and other side effects to be the most conservative to access any CSR.

Yes, exactly; you have no clue what the semantics of the CSR are (if you did, you'd have a named intrinsic, like __builtin_readcyclecounter()) because it is just an arbitrary number.

Also, in what world do you need compiler optimisations around general CSR accesses? That's normally relatively cold performance-non-critical code...

@wangpc-pp
Copy link
Contributor

I support adding these builtins personally, but I think we need more discussions on the design. We can achieve the same thing via inline assemblies, that's true. But, from the compiler side, inline assemblies are kind of barriers, we can't do a lot of optimizations/reorderings if inline assemblies exist. If we make it a builtin, these limitations can be loosed I think.

Inline assembly is only a barrier to moving load/stores and other side effecting instructions. And only if the inline assembly is marked volatile or has "memory" clobber. It should not affect moving arithmetic. I think these CSR intrinsics would also need to be barriers against load/stores and other side effects to be the most conservative to access any CSR.

Yes, exactly; you have no clue what the semantics of the CSR are (if you did, you'd have a named intrinsic, like __builtin_readcyclecounter()) because it is just an arbitrary number.

Also, in what world do you need compiler optimisations around general CSR accesses? That's normally relatively cold performance-non-critical code...

Thanks for replying. I may not think it clearly.
There are many ways to do the same thing. I think this PR wouldn't be the last time that someone try to add these builtins:

@nemanjai
Copy link
Member Author

I have always been unconvinced that these are a good idea to have / add significant value over using inline assembly. IIRC Arm has them but nobody uses them?

...

If it’s not a constant integer for inline assembly then how would it magically be a constant integer for an intrinsic? The IR’s going to be the same with an alloca/store/load, no?

To be completely clear with the use case:

$ cat csrread.c
// An obvious way a library writer tries to provide a
// convenience function for reading CSR's.
#ifdef _ASM
static unsigned __attribute__((always_inline)) read_csr(const unsigned CSR) {
  unsigned Ret;
  __asm__ volatile("csrr %0, %1" : "=r"(Ret) : "I"(CSR));
  return Ret;
}
#else
#define read_csr __builtin_riscv_csrr
#endif

// Use the convenience function.
unsigned someFunc() {
  return read_csr(0x300);
}

$ clang csrread.c -S --target=riscv32 -D_ASM -O1 && echo Success # No error checking for no Zicsr but it compiles
Success

$ clang csrread.c -S --target=riscv32 -D_ASM && echo Success # Does not compile at -O0
csrread.c:6:20: error: constraint 'I' expects an integer constant expression
    6 |   __asm__ volatile("csrr %0, %1" : "=r"(Ret) : "I"(CSR));

$ clang csrread.c -S --target=riscv32 -O1 && echo Success # The builtin provides a nice error message for no Zicsr
csrread.c:15:10: error: builtin requires at least one of the following extensions: 'Zicsr'
   15 |   return read_csr(0x300);

$ clang csrread.c -S --target=riscv32 -march=rv32i_zicsr && echo Success # The builtin compiles fine at -O0
Success

In my view, the builtin solution is superior from a usability perspective.

@asb
Copy link
Contributor

asb commented Mar 14, 2024

In my view, the builtin solution is superior from a usability perspective.

I agree, I think Sam's previous work on this stopped due to changing priorities not because there was any real pushback.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 14, 2024

You can just use ({ ... }) to achieve that same goal with inline assembly (and write doesn't even need that, you can do it with a single statement). I'm not convinced the intrinsics gain you anything.

@nemanjai
Copy link
Member Author

You can just use ({ ... }) to achieve that same goal with inline assembly (and write doesn't even need that, you can do it with a single statement). I'm not convinced the intrinsics gain you anything.

Hmm... I think there's a bit of a disconnect here between the point I am trying to make and your responses. I have attempted to illustrate that this is meant as a usability enhancement. Not something that will increase the expressive power of the language. Any usability feature, more or less by nature, can be achieved in another way. Whatever usability enhancement one proposes, there can always be a rebuttal suggesting that the semantics can be achieved through some other mechanism. Sometimes that mechanism is an exotic one such as GNU statement expressions.

I view the compiler as a tool that users use to achieve the goal of developing and building their project and its usability has a direct contribution to the users' ability to effectively achieve their goal. If a tool can easily provide a convenient and consistent mechanism to achieve a specific goal without significant drawbacks, it should probably do so.

Suggesting that a facility the compiler can easily provide can be achieved with something like a GNU statement expression, in my opinion is in direct opposition to the goal of improving the usability of the compiler. Calls to functions/intrinsics are a fundamental aspect of a language such as C/C++ that everyone is familiar with. GNU statement expressions exist in an entirely different realm and are familiar to an exceedingly small subset of developers.

I feel that the argument of "you don't need this, you can achieve it in another way" is only meaningful if "this" has significant drawbacks or if "another way" is equivalently convenient. As an absurd example that regresses this argument to the absolute extreme, one could state that "you don't need a compiler or high level language, you can just write your code in assembly." I am of course not making any attempt to draw a false equivalence between your suggestion and that absurd example. I am simply illustrating the point that not every way of achieving a goal is equivalent in terms of skill and effort required to achieve it.

I am sorry to digress into a philosophical discussion in this post, but I feel like our discussion is not converging as we are making somewhat orthogonal arguments.

So can we please discuss the way forward based on whether we feel this usability enhancement is desirable and whether its benefits outweigh the cost of providing it. What is not at all clear to me is what you view as the cost of providing this convenience. And to a lesser extent whether you feel that a familiar interface such as a builtin would at all be preferrable to any user.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 14, 2024

I guess my underlying point is that Arm's ACLE provides functions like __arm_rsr, but I'm not aware of them really being used, with inline assembly being the far more common alternative, so what's the real point of providing an interface that developers have already demonstrated they won't generally use?

@wangpc-pp
Copy link
Contributor

wangpc-pp commented Mar 15, 2024

GCC gained its __arm_rsr and __arm_wsr support last year (October, 2023): https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631855.html. So there is no stable released GCC version that supports these builtins.
Clang supported these builtins about nine years ago: https://reviews.llvm.org/D9697. But as we know, clang is not the default compiler in most OS distributions. People may not even know these features.
We can see some usages (ignore the gcc/llvm packages) via Debian Code Search: http://codesearch.debian.net/search?q=__arm_rsr&literal=1.
So I think it's mainly a problem left over by history:

// Although the ARM ACLE does have a specification for __arm_rsr/__arm_wsr
// for reading and writing to the status registers, they are not implemented
// by GCC, so we need to resort to inline assembly.

As for RISCV, the software ecosystem is growing. So I just think we should do the right thing ASAP.

@topperc
Copy link
Collaborator

topperc commented Mar 15, 2024

Should we use strings like ARM does so we can get register by name?

@wangpc-pp
Copy link
Contributor

Should we use strings like ARM does so we can get register by name?

Good point! We may provide two kinds of builtins: one by name, and another by CSR number.
We should continue @lenary's proposal and discuss it in https://github.com/riscv-non-isa/riscv-toolchain-conventions or https://github.com/riscv-non-isa/riscv-c-api-doc.

@nemanjai
Copy link
Member Author

Should we use strings like ARM does so we can get register by name?

Good point! We may provide two kinds of builtins: one by name, and another by CSR number. We should continue @lenary's proposal and discuss it in https://github.com/riscv-non-isa/riscv-toolchain-conventions or https://github.com/riscv-non-isa/riscv-c-api-doc.

My personal preference would be to not use strings, but I think a close analog could be pre-defined macro names for existing named CSR's. I just find string matching to be tedious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:codegen clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants