Good news and bad news. I’ve made more progress on BAP’s VSA implementation. Recall this example from last time:
BAP’s VSA now outputs this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
|
There is something odd going on here. At BB_1
, VSA determines that
x_83
can be (almost) any number below 4. However, it’s obvious that x_83
cannot be negative.
What’s going on? Well, x_83
starts out as phi(x, x_84)
, which
means that x_83
gets its value from either x
or x_84
, depending
on control flow. Read up on SSA
form if this makes no sense to you. VSA determines that x_83
’s
value set is the union of x
and x_84
’s value sets. Sounds good so
far. x
always has zero, so that one is easy. x_84
will initially
be Top
, since we have not seen an assignment for it yet. Since
Top
is any value, Top
union anything is Top
. Thus, x_83
becomes Top
, and x_84
becomes Top + 1
. Now, when the value set
of x_84
flows through the edge condition x_84:u32 <= 3:u32
, any
values above 3 are removed, leaving us with (almost) any value less
than 4.
VSA is an abstract interpretation algorithm, but not a dataflow algorithm. In dataflow problems, data values always move downward in the lattice. But in abstract interpretation, they can move up and down, which is what happens with VSA when we perform union (up) and intersection (down). I am used to working with dataflow problems, and so this seems very unnatural to me.
It seems like we would get the right answer if the initial lattice
values of each node used the edge conditions somehow; right now the
initial lattice values are simply Top
. But if the initial lattice
value for BB_1
had x_84
constrained to be less than 4, we’d get
the right answer. It also seems like we would get the right answer if
we were not using SSA. The original author of BAP VSA wrote it as an
SSA analysis, although I do not know why. I do not think there is any
benefit.