When a C program creates a local array inside a function, LLVM sometimes materializes the array by first allocating space and then issuing a memset into that space. The current translation from LLVM to Egg terms backtracks from store instructions, and it fails to track memset instructions because a memset does not write to a register; it takes the memory location as an argument instead. Sometimes memcpy instructions are generated as well.
Here is a minimal test case that generates a memset, without memcpy instructions:
    void test(float A[SIZE]) {
        for (int i = 0; i < SIZE; i++) {
            float x[SIZE] = {0.0f};
            for (int j = 0; j < SIZE; j++) {
                x[j] = 1.0f;
            }
            float sum = 0.0f;
            for (int j = 0; j < SIZE; j++) {
                sum += x[j];
            }
            A[i] = sum;
        }
    }
and the LLVM IR that clang generates for it, after several optimizations were run:
The Diospyros pass walks back from the store that follows the definition of %13, recursively visiting the arguments of each instruction, beginning with the %12 and %13 referenced in the store. Because memset has no destination register, no instruction ever references it as an argument, so the current pass implementation never reaches it.
To fix this issue, I am planning to treat any memset or memcpy instruction like a store, because each writes to memory in the same basic way a store does. Wherever the Diospyros pass walks back from stores, it should also walk back from memset and memcpy instructions, translating recursively backwards into an Egg term. This change will allow Egg expressions to contain memset nodes in the correct position.
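To make the proposed fix concrete, here is a minimal sketch in plain C of a backward walk over a toy instruction list. It is not the Diospyros pass or the LLVM API: the `Inst` struct, `is_memory_root`, `walk_back`, `count_reached`, and the four-instruction `EXAMPLE` program are all hypothetical names made up for illustration. It only shows why a walk rooted at stores alone never visits a memset, and how rooting the walk at memset/memcpy as well picks it up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds (assumption: a real pass inspects LLVM instructions). */
typedef enum { OP_ALLOCA, OP_LOAD, OP_STORE, OP_MEMSET, OP_MEMCPY } OpKind;

typedef struct {
    OpKind kind;
    int dest;     /* register defined, or -1 (store, memset, memcpy define none) */
    int args[2];  /* registers used, or -1 for an unused slot */
} Inst;

/* Toy program: %0 = alloca; memset(%0); %1 = load %0; store %1, %2.
   %2 stands for a function argument, so nothing in the list defines it. */
const Inst EXAMPLE[4] = {
    {OP_ALLOCA, 0, {-1, -1}},
    {OP_MEMSET, -1, {0, -1}},
    {OP_LOAD, 1, {0, -1}},
    {OP_STORE, -1, {1, 2}},
};

/* Hypothetical helper: which instructions start a backward walk. */
static bool is_memory_root(OpKind k, bool include_mem_intrinsics) {
    if (k == OP_STORE) return true;
    return include_mem_intrinsics && (k == OP_MEMSET || k == OP_MEMCPY);
}

/* Mark instruction idx, then recurse into the definitions of its operands. */
static void walk_back(const Inst *insts, size_t n, size_t idx, bool *visited) {
    assert(idx < n);
    if (visited[idx]) return;
    visited[idx] = true;
    for (int a = 0; a < 2; a++) {
        int reg = insts[idx].args[a];
        if (reg < 0) continue;
        for (size_t j = 0; j < n; j++)
            if (insts[j].dest == reg) walk_back(insts, n, j, visited);
    }
}

/* Walk back from every root; return how many instructions were reached. */
size_t count_reached(const Inst *insts, size_t n,
                     bool include_mem_intrinsics, bool *visited) {
    size_t reached = 0;
    for (size_t i = 0; i < n; i++) visited[i] = false;
    for (size_t i = 0; i < n; i++)
        if (is_memory_root(insts[i].kind, include_mem_intrinsics))
            walk_back(insts, n, i, visited);
    for (size_t i = 0; i < n; i++)
        if (visited[i]) reached++;
    return reached;
}
```

Calling `count_reached(EXAMPLE, 4, false, visited)` roots the walk at the store only and leaves the memset unvisited; passing `true` also roots it at the memset, so all four toy instructions are reached.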
Ah, nice work boiling this down to a small example! This is incredibly helpful to look at!
It looks like the optimizations run ahead of Diospyros are already smart enough to convert the array initialization into a memset, huh? That's pretty cool.
To confirm: is SIZE equal to 2 here? That would explain this:
%1 = alloca i64, align 8
%tmpcast = bitcast i64* %1 to [2 x float]*
That is, LLVM is using a single i64 to store two floats for x.
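As a small illustration of why one i64 is enough here, the sketch below zeroes an 8-byte slot with memset and reads it back as two floats. It assumes 4-byte IEEE 754 floats, where an all-zero bit pattern is 0.0f; the function name `read_zeroed_lane` is made up for this example and does not come from the generated code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 4 bytes, so two floats exactly fill one int64_t. */
_Static_assert(sizeof(float) * 2 == sizeof(int64_t),
               "two floats must fit exactly in one 64-bit slot");

/* Zero a 64-bit slot, then view its bytes as float x[2] and return one lane. */
float read_zeroed_lane(int lane) {
    assert(lane == 0 || lane == 1);
    int64_t slot;                   /* plays the role of `alloca i64` */
    float x[2];
    memset(&slot, 0, sizeof slot);  /* stands in for the array initialization */
    memcpy(x, &slot, sizeof x);     /* reinterpret the bytes, like the bitcast */
    return x[lane];
}
```

Both lanes read back as zero, which matches the `{0.0f}` initializer of `x` in the test case when SIZE is 2.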
Anyway, treating memset & memcpy calls like stores seems like exactly the right thing to do!
Yep, SIZE was 2 for this example, and that does explain why the i64 is used! I will go ahead and translate the memset and memcpy instructions.
The original problem is mostly fixed, but a similar problem in the load/store movement pass is still being fixed: loads and stores cannot be moved around memsets.
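One way to read that constraint, stated here as an assumption rather than a description of the actual pass, is that a memset acts as a barrier for stores to the same memory. The sketch below uses made-up function names to show that moving a store from after a memset to before it changes the value left in memory.

```c
#include <assert.h>
#include <string.h>

/* Original order: the memset zeroes x, then the store writes 1.0f. */
float memset_then_store(void) {
    float x[2];
    memset(x, 0, sizeof x);
    x[0] = 1.0f;
    return x[0];
}

/* Reordered: the store is moved above the memset, which then overwrites it. */
float store_then_memset(void) {
    float x[2];
    x[0] = 1.0f;
    memset(x, 0, sizeof x);
    return x[0];
}

/* The two orders disagree, so the reordering is not semantics-preserving. */
void check_orders(void) {
    assert(memset_then_store() == 1.0f);
    assert(store_then_memset() == 0.0f);
}
```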