Local array creation #122

JonathanDLTran · 2022-02-25T01:58:32Z

When C programs have local arrays created within a function, LLVM sometimes generates these arrays by first allocating space, then doing a memset into the allocated space. The current method of backtracking from a store instruction during the translation from LLVM to Egg terms fails to track the memset instructions, because memset instructions do not write to a register, but instead use the memory location as an argument. Sometimes memcopy instructions are also generated as well.

Here is a minimal test case that generates memset, without memcopy instructions:

void test(float A[SIZE]) {
    for (int i = 0; i < SIZE; i++) {
        float x[SIZE] = {0.0f};
        for (int j = 0; j < SIZE; j++) {
            x[j] = 1.0f;
        }
        float sum = 0.0f;
        for (int j = 0; j < SIZE; j++) {
            sum += x[j];
        }
        A[i] = sum;
    }
}

and the clang code it generates, after several optimizations were run:

; Function Attrs: noinline nounwind ssp uwtable
define void @test(float* %0) #0 {
.preheader:
  %1 = alloca i64, align 8
  %tmpcast = bitcast i64* %1 to [2 x float]*
  %2 = bitcast i64* %1 to i8*
  %3 = bitcast i64* %1 to float*
  store i64 0, i64* %1, align 8
  call void @memset_pattern16(i8* nonnull %2, i8* bitcast ([4 x float]* @.memset_pattern to i8*), i64 8) #4
  %4 = load float, float* %3, align 8
  %5 = fadd float %4, 0.000000e+00
  %6 = getelementptr inbounds [2 x float], [2 x float]* %tmpcast, i64 0, i64 1
  %7 = load float, float* %6, align 4
  %8 = fadd float %5, %7
  store float %8, float* %0, align 4
  store i64 0, i64* %1, align 8
  call void @memset_pattern16(i8* nonnull %2, i8* bitcast ([4 x float]* @.memset_pattern to i8*), i64 8) #4
  %9 = load float, float* %3, align 8
  %10 = fadd float %9, 0.000000e+00
  %11 = load float, float* %6, align 4
  %12 = fadd float %10, %11
  %13 = getelementptr inbounds float, float* %0, i64 1
  store float %12, float* %13, align 4
  ret void
}

The Diospyros pass walks back from the store after the instruction definition %13, and recursively looks at arguments of instructions, beginning with the %12 and %13 referenced in the store. Because no instruction references memset (memset has no destination register), the current pass implementation fails to pick up memset.

To fix this issue, I am planning to treat any memset/memcopy instruction like a store, because it has the same basic function as a store. Where the Diospyros pass walks back from stores, it should also walk back from stores/memset/memcopy instructions, and translate recursively backwards into an Egg term. This change will allow Egg expressions to contain memset nodes in the correct position.

The text was updated successfully, but these errors were encountered:

sampsyo · 2022-02-25T19:26:52Z

Ah, nice work boiling this down to a small example! This is incredibly helpful to look at!

It looks like the optimizations run ahead of Diospyros are already smart enough to convert that for loops that does initialization into a memset, huh? That's pretty cool.

To confirm: is SIZE equal to 2 here? That would explain this:

%1 = alloca i64, align 8
%tmpcast = bitcast i64* %1 to [2 x float]*

That is, LLVM is using a single i64 to store two floats for x.

Anyway, treating memset & memcpy calls like stores seem like exactly the right thing to do!

JonathanDLTran · 2022-02-25T20:19:57Z

Yep, to clarify, SIZE was 2 for this example, and that does help clarify why the i64 is used! I will go ahead and translate the memset and memcpy instructions.

JonathanDLTran · 2022-03-10T16:53:42Z

The original problem was mostly fixed, but another similar problem relating in the load/store movement pass is being fixed, where loads and stores cannot be moved around memsets.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Local array creation #122

Local array creation #122

JonathanDLTran commented Feb 25, 2022

sampsyo commented Feb 25, 2022

JonathanDLTran commented Feb 25, 2022

JonathanDLTran commented Mar 10, 2022

Local array creation #122

Local array creation #122

Comments

JonathanDLTran commented Feb 25, 2022

sampsyo commented Feb 25, 2022

JonathanDLTran commented Feb 25, 2022

JonathanDLTran commented Mar 10, 2022