# What is loop unwinding?
CBMC is a bounded model checker. CBMC works by
executing a program for some bounded number of steps and looking
for issues that occur within those steps. CBMC does not execute a
program concretely in the same way that a computer would execute the program
(would execute the result of compiling the program).
Instead,
CBMC [symbolically executes](https://en.wikipedia.org/wiki/Symbolic_execution)
the program to turn it into a set of symbolic constraints that can be solved by
a constraint solver like a SAT solver.
But where does this bound come from? And how do we know the bound
is the right one?
## Loop bounds
If the program contains no loops and contains no recursion, then every
execution of the program is finite, and CBMC can figure out the length
of the longest execution and use this as the bound. Sometime CBMC
can figure this out even when the program contains loops. For example,
CBMC can loop at the loop
```
for (int i=0; i < 10; i++)
buffer[i] = ' ';
```
and figure out on its own that the loop iterates at most 10 times.
But CBMC needs some help with the loop
```
while (TRUE)
if (i < buffer.size) buffer.data[i] = ' ' else break;
```
We need to give CBMC a bound on the number of times this loop iterates.
We need to tell CBMC how many times to unwind this loop during its
symbolic execution of the program.
We can do this in two ways.
* **Bound all loops**: We can instruct CBMC to unwind every loop at
most `K` times with `cbmc --unwind K`. If you are using the CBMC
starter kit, this bound 1 by default, but you can change this
to `K` in your Makefile with
```
UNWIND=K
```
* **Bound one loop**: We can instruct CBMC to unwind a specific loop
at most `K` times with `cbmc --unwindset FUNC.NUM:K` where `FUNC`
is the name of the function containing the loop and `NUM` is the
number of the loop within the function (more on loop numbers below).
If you are using the CBMC starter kit, we recommend that you set
this bound in your Makefile with
```
UNWINDSET+=FUNC.NUM:K
```
## Loop unwinding assertions
Now we know how to tell CBMC to unwind a loop `K` times, but how do we
know that unwinding a loop `K` times is enough? What if the problem
we are looking for occurs on iteration `K+1` and we have asked CBMC to
unwind the loop only `K` times? We could miss the issue and never know
it because we didn't tell CBMC to work hard enough to find it!
CBMC has solution called "loop unwinding assertions."
A loop unwinding assertion is a check that a loop has been completely unwound.
It checks that the loop termination condition is guaranteed to be true
after the specified number of unwindings.
First CBMC unwinds the loop, and then CBMC inserts a assertion that
the loop termination condition is true immediately after the final
unwinding of the loop.
If there is any execution of the program in which you haven't unwound
the loop enough times, CBMC will identify the loop by its
loop name `FUNC.NUM` in the list of checks that have failed.
Invoking CBMC with `cbmc --unwinding-assertions` will ask CBMC to check
unwinding assertions.
Always invoke CBMC with cbmc --unwinding-assertions
.
You can rest assured that CBMC is always invoked
with `cbmc --unwinding-assertions` if you are using the CBMC starter kit.
## Counting loop iterations
There are two ways to count loop iterations:
* One is the number of times `B` that the loop body is traversed.
* One is the number of times `T` that the loop termination is checked.
These two numbers differ by one: `T = B + 1`.
When you invoke CBMC, the number `K` that you give to `--unwind` or
`--unwindset` is interpreted as the number of times the loop termination
is checked: `K = T = B+1`.
This off-by-one business sometimes confuses people, and sometimes makes
it challenging to write a Makefile where arithmetic like `B+1` is difficult.
One style we have developed is to write a proof harness
```C
int main() {
size_t length;
__CPROVER_assume(length < LENGTH);
char *buffer = (char *) malloc(length);
for (int i=0; i < length; i++) buffer[i] = ' ';
...
}
```
and to write a Makefile
```Makefile
LENGTH=100
DEFINES += -DLENGTH=$(LENGTH)
UNWINDSET += main.0:$(LENGTH)
```
Because `length` is assumed to be strictly less than `LENGTH`, we know
that `length + 1` is at most `LENGTH`, so unwinding the loop `main.0`
at most `LENGTH` times is unwinding the loop at least `length + 1` times
as desired.
A final comment on this off-by-one issue:
At some point in the future, you may come to learn that
you can also pass the `cbmc` flags `--unwind` and `--unwindset` to a tool
named `goto-instrument`. For historical reasons, `cbmc` interprets `K` as
`K=T` and `goto-instrument` interprets `K` as `K=B`.
Of course, when choosing between unrolling a loop `T=B+1` and `B` times,
it is always safe to unroll a loop more times than necessary.
But it may not be practical: If the loop body is
large and you are already unrolling a loop many times, the difference between
unrolling `K` and `K+1` times many push CBMC execution time from "okay"
to "way too long."
Just something to keep in mind when using these tools.
## Debugging loop unwinding
When not using the starter kit, which uses `--unwind 1` by default, you may find
yourself in a situation where CBMC keeps unwinding a loop, even when you
expected CBMC to be able to determine the loop bound. CBMC uses constant
propagation and algebraic simplification to evaluate the loop guard. CBMC will
stop unwinding the loop when, using this process, the loop guard evaluates to
false. If you would like to debug a case of unexpected loop unwinding, proceed
as described below. We will use the following example to illustrate these steps:
```C
int main()
{
int step;
for(int i = 0; i < 5; i += step) ;
}
```
Note that `step` is unconstrained, perhaps unintentionally so.
Running CBMC on this example without any additional options yields:
```
> cbmc loop.c
[...]
Starting Bounded Model Checking
Unwinding loop main.0 iteration 1 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 2 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 3 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 4 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 5 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 6 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 7 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 8 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 9 file loop.c line 4 function main thread 0
Unwinding loop main.0 iteration 10 file loop.c line 4 function main thread 0
[...]
```
until you manually abort this process. One way to understand why this is
happening is to undertake the following steps:
1. Run CBMC with whatever options you normally do, but add some finite unrolling
bound using `--unwind` and also add `--program-only`. This will yield as
output the equation system generate by symbolic execution. Pipe the output to
some text file. We exemplify this on the above example:
```
> cbmc loop.c --program-only --unwind 2
[...]
Starting Bounded Model Checking
Unwinding loop main.0 iteration 1 file loop.c line 4 function main thread 0
Not unwinding loop main.0 iteration 2 file loop.c line 4 function main thread 0
[...]
Program constraints:
[...]
// 2 file loop.c line 4 function main
(26) i!0@1#2 == 0
// 3 file loop.c line 4 function main
// 4 file loop.c line 4 function main
(27) i!0@1#3 == step!0@1#1
// 5 file loop.c line 4 function main
// 3 file loop.c line 4 function main
// 3 file loop.c line 4 function main
(28) \guard#1 == !(i!0@1#3 >= 5)
// 4 file loop.c line 4 function main
(29) i!0@1#4 == i!0@1#3 + step!0@1#1
guard: \guard#1
[...]
```
The above output is an excerpt from the equation system generated by symbolic
execution. We see assignments to the loop counter `i` turned into equations
over its renamed (in static single assignment (SSA) form) instantiations. The
loop guard is found as `\guard#1`.
2. Search for the source location where you expect the constant to appear in the
loop guard. In our example, this is "file loop.c line 4."
3. Once you have found the assignment or equality defining the loop guard that
doesn't have the expected constant on the right-hand side (in the example,
this could be the equality over `\guard#1`), work backwards
from that right-hand side to see where you last had a constant. That is,
backwards-search for the name appearing on the right-hand side, which will
eventually appear as a left-hand side. Now repeat this process for the new
right-hand side of this equation, until the right-hand side is a constant. In
the example, the only time we find a constant as right-hand side is the
initial assignment, `i!0@1#2 == 0`.
4. The expression where you flip from is-a-constant to is-a-symbol is the
culprit. For our example, this is `i!0@1#3 == step!0@1#1`. You will then
conclude that either this behavior is as expected, meaning your source
program does not permit constant propagation (our example does not permit
constant propagation, because `step` is unconstrained; setting it to some
constant value will result in bounded loop unwinding), or a shortcoming in
CBMC's simplification process. In the latter case, report an issue or
contribute a patch to address this.