Wednesday, July 06, 2005

This just in: first results

After building some further infrastructure for the iterative analysis today, and sanitizing other parts of the compiler to cooperate with it, I have the first simple example compiling ('return [x]' is not supported for now :-)):
def makel(x)
l = []
l.append(x)
return l
print makel(1.0)
print makel(1)

During the first iteration, CPA creates two method templates for 'makel', because it sees that it is called with exactly two different argument products: (float,) and (int,). Since we want to iteratively improve upon imprecise object contours, during this first CPA iteration a single object contour is used for the duplicated allocation sites '[]' inside 'makel' (one for each template.) Consequently, an upper bound of {int, float} is inferred for this object contour.

As I explained earlier, after each iteration, the iterative algorithm tries to resolve such imprecisions by splitting object contours. But splitting is not yet necessary for this example to work. By tracing back dataflow paths from inside the two 'append' templates (conflicting assignments to an instance variable), we can conclude that each '[]' allocation site flows only to one of these. The compiler does just that, and splits the object contour in two. Further, it records the newly gained information about these allocation sites in a global table (still messy atm), so that future CPA iterations can seed these allocation sites with the right object contours as they create their templates.

After one more iteration, the compiler concludes that nothing more can change, and generates the following code (nicely parameterizing the types as well, corresponding to another part of my SoC proposal - btw I renamed template brackets to [ and ] because messing with HTML apparently makes me very sad):
int main() {
printf("%s\n", CS(makel(1.0)));
printf("%s\n", CS(makel(1)));
}

template [class A] list[A] *makel(A x) {
list[A] *l;
l = new list[A]();
l->append(x);
return l;
}

Which compiles and runs just fine. Simple object contour splitting also works already (see two postings earlier):
def ident(x):
return x
a = []
a = []
b = []
ident(a).append(1)
ident(b).append(1.0)

After the first iteration, the compiler notes the conflicting assignments, and sees that multiple creation points 'flow' to both assignments. It tries to 'resolve' this problem by finding a confluence node (in our case, 'x'), and by splitting the respective object contour along the incoming edges of this node. Here, it nicely splits the 'mother' contour into a contour for the two creation points for 'a' and a contour for the creation point for 'b'. In the next iteration, CPA splits the 'ident' function, because a and b now have different object contours, and the imprecision is gone :-)

No comments: