-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CONTRACTS] Detect loop locals with goto_rw in DFCC #8489
[CONTRACTS] Detect loop locals with goto_rw in DFCC #8489
Conversation
269f253
to
7dab9d8
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #8489 +/- ##
==========================================
Coverage ? 78.86%
==========================================
Files ? 1727
Lines ? 198516
Branches ? 18496
==========================================
Hits ? 156559
Misses ? 41957
Partials ? 0 ☔ View full report in Codecov by Sentry. |
/// Collect identifiers that are local to this loop only | ||
/// (excluding nested loop). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please be more verbose to clarify what "excluding nested loop" means? Perhaps even give an example? Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When generating loop information, we want to collect locals that are only local for the given loop_id but not its nested inner loops to avoid instrument a loop instruction more than once.
I remove the condition of excluding nested inner loops from this method. And erase inner loops' locals in gen_dfcc_loop_info
instead.
/// Collect identifiers that are local to this loop only | ||
/// (excluding nested loop). | ||
/// A target is a loop local if | ||
/// 1. it is declared inside the loop, or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that a sufficient condition? Don't we also need to have it marked DEAD
within the (syntactic) bounds of the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to identify identifiers that are equivalent to loop locals in the sense that DFCC can safely ignore them for havocing and checking assignability.
There is no write/read outside the loop is sufficient for ignoring the target. However, DECL in GOTO is also a writing instruction as I commented below. What do you think could be a better conditions for identifying such identifiers? How about checking if a variable is read before the first write in the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think you might just not really care about declarations very much and instead focus on all reads/writes being within the set of instructions that make up the loop body.
regression/contracts-dfcc/dont_skip_cprover_prefixed_vars_fail/main.c
Outdated
Show resolved
Hide resolved
int nondet_var; | ||
int __VERIFIER_var; | ||
int __CPROVER_var; | ||
int nondet_var = nondet_int(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please elaborate as to why this is newly necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that if we only declare variables but not initialize them or use them outside the loop, they will be identified as loop locals by the new algorithm. And hence DFCC will not check if they are assignable according to the assigns clauses.
I think the issue here is that in C, declaring a variable also initializes them with a nondet value. And we usually just use declarations in harness for deterministically initializing variables. To avoid being identified as loop locals and ignored by DFCC, I explicitly initialize them with nondet.
Shouldn't a C declaration be converted to a GOTO DECL and a GOTO ASSIGN with nondet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that if we only declare variables but not initialize them or use them outside the loop, they will be identified as loop locals by the new algorithm. And hence DFCC will not check if they are assignable according to the assigns clauses.
Is there anything wrong with taking this view? That is, if a variable is declared outside the loop yet never read outside the loop then it is indistinguishable from one that is declared inside the loop, I'd say?!
I think the issue here is that in C, declaring a variable also initializes them with a nondet value. And we usually just use declarations in harness for deterministically initializing variables. To avoid being identified as loop locals and ignored by DFCC, I explicitly initialize them with nondet.
Shouldn't a C declaration be converted to a GOTO DECL and a GOTO ASSIGN with nondet?
One could do that, and I have made such an attempt in #35. There just wasn't an obvious benefit to doing so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything wrong with taking this view? That is, if a variable is declared outside the loop yet never read outside the loop then it is indistinguishable from one that is declared inside the loop, I'd say?!
In C program, it is distinguishable. Because declaration is also a havocing in C. In the tests I updated in this PR, they expect the variables that are only declared outside the loop but no read/write outside the loop to be checked by DFCC inside the loop. That's why I have to explicitly assign to them to avoid them being identified as loop locals.
I think adding a condition that a variable should also not be read before first write in the loop can make them really indistinguishable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, the issue here is that also they are not used outside the loops, but they are used in loop contracts, which will be instrumented as R/W outside the loop. So, a loop local should also not be used in loop contracts.
src/goto-instrument/contracts/dynamic-frames/dfcc_infer_loop_assigns.cpp
Outdated
Show resolved
Hide resolved
src/goto-instrument/contracts/dynamic-frames/dfcc_infer_loop_assigns.cpp
Outdated
Show resolved
Hide resolved
Hi, Now we are trying to infer if a GOTO variable, representing a MIR place, which itself represents a Rust variable, is a loop-local variable at the Rust level by checking if a GOTO variable DECL/DEAD outside of a loop is read or written outside the loop or not. Initially the concept of loop local was that the identifier really only exists (DECL/DEAD) within the connected component that defines the loop. Excluding loop locals from frame condition checking is sound and is the the thing to do because these variables can't even be used outside of the loop instructions and hence are not part of the frame of the loop (frame: set of memory locations that exists before/after the loop executes). It kind of feels wrong to bend this (sorta) well-defined notion to make up for the destructive compilation performed by Could we instead try to have this fixed upstream ? i.e. use the scopes info produced by |
This is what we used to do in Kani up until model-checking/kani#3164 (which became part of model-checking/kani#2995). That problem that we run into when producing declarations is that MIR also has a lot of non-linear control flow via |
1d5351e
to
656ddfa
Compare
@tautschnig , after adding the condition that a loop local should not appear in loop contracts, the current detection algorithm is sound for C's declaration now. In the regression tests updated in this PR, we want to enforce DFCC to check assignability of some variables even if it is sound and safe to ignore them. I added initialization/invariants in those tests so that ignoring the variables is unsound and trigger the checks. |
656ddfa
to
cfd9395
Compare
I can see good value in a GOTO pass that tightens variable lifetimes, which might fix the MIR issue. |
Agreed, but non-trivial given all the non-linear control flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with two additional questions.
@@ -12,7 +12,7 @@ void main() | |||
x1 = &z1; | |||
|
|||
while(y1 > 0) | |||
__CPROVER_loop_invariant(*x1 == __CPROVER_loop_entry(*x1)) | |||
__CPROVER_loop_invariant(y1 >= 0 && *x1 == __CPROVER_loop_entry(*x1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this (and the other changes in this file) now necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is for the same reason why we add nondet_int
. It is safe to ignore y1
in DFCC instrumentation. But we want to trigger the checks for y1
as described in the .desc
file. Adding a new invariant clause will trigger all DFCC checks for y1
.
from_integer(0, size_type()), | ||
false_exprt(), | ||
false_exprt(), | ||
false_exprt(), | ||
false_exprt(), | ||
true_exprt(), | ||
true_exprt(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Why did you have to change this one call to
write_set_create_call
? - It seems that, after this change, there is no further call to the 3-parameter version of
write_set_create_call
. Can you please remove it?
c924ee6
to
2e6da0c
Compare
2e6da0c
to
29f7ca5
Compare
This PR improve the loop locals detection in DFCC by also detecting variables that are not written or read outside the loop.