> Home
 Parallel Inclusion-based Points-to Analysis
 Application Description Points-to analysis is a compiler technique aimed at identifying which variables/objects are pointed by the pointers of the input program. The results of this compiler phase are useful for program optimization, program verification, debugging and whole program comprehension.     The algorithm presented here is a parallel implementation of the context and flow insensitive, inclusion-based, points-to analysis described by Ben Hardekopf in his PLDI'07 paper . We also implemented some techniques (HVN, HRU) described in later work .  Ben Hardekopf and Calvin Lin. The Ant and the Grasshopper: Fast and Accurate Pointer Analysis for Millions of Lines of Code. PLDI 2007.  Ben Hardekopf and Calvin Lin. Exploiting Pointer and Location Equivalence to Optimize Pointer Analysis. SAS 2007. Algorithm The input for the points-to algorithm is the program to be analyzed. The output is a map from program pointers to the set of variables they might point to. The analysis proceeds as follows: Extract pointer statements, which for the C language are x=&y,x=y,x=*y, and *x=y. Create the initial constraint graph. Each node corresponds to a pointer variable, and each edge (x,y) indicates that x and y appear together in a statement. The edges are tagged with the type of the statement they come from. For instance, a copy instruction such as x=y results on a copy edge from y to x. Iterate over the graph until no rewrite rule needs to be applied. A rewrite rule is just the addition of a new points-to or copy edge to the graph, whenever some precondition is met. You can find more about these preconditions in the paper - for now just assume that there are certain pairs of edges that fire the addition of a third edge to the constraint graph. The following pseudo-code shows the fixpoint phase of the points-to algorithm. ``` 01 foreach (variable v: wl) 02 foreach (variable w : copy neighbors of v) 03 apply copy_rule(v,w) 04 // copy rule might add a new points-to edge to another variable x 05 if (points-to edge added to x) 06 wl.add(x) 07 foreach (variable w : load neighbors of v) 08 apply load_rule(v, w) 09 // load rule might add a new copy edge to another variable x 10 if (copy edge added to x) 11 wl.add(x) 12 foreach (variable w : store neighbors of v) 13 apply store_rule(v,w) 14 // store rule might add a new copy edge to another variable x 15 if (copy edge added to x) 16 wl.add(x) ``` Eventually, the algorithm will reach a fixpoint (=the worklist of active nodes is empty), so we can read the solution out of the points-to edges of the graph. Example: A program written in C contains only the statements b=&w;b=&v;a=b;. Because there are three pointer statements and another three variables, we initialize (figure below, left-most subfigure) the constraint graph by adding three nodes and edges. The edges are labeled with their type: p means "points-to", while c means "copy". Then we apply the copy rule (lines 02-06) twice to reach fixpoint (right-most subfigure). From the final graph we can infer that the points-to of both a and b is {v,w}. Data Structures There are two key data structures: Unordered Set: the worklist containing the active nodes (i.e., the nodes that violate the precondition of a graph rewrite rule). Ideally, we would like to process the active nodes in topological order. In practice, a FIFO scheduling performs much better because it does not constraint parallelism and it has a more efficient implementation. MultiGraph: both the copy/load/store relationships and the points-to sets are represented using edges. Therefore, there might be multiple edges between two nodes (variables) in the graph. We only store the outgoing edges (neighbors) of a variable; the internal representation depends on the type of edge. BDD: A Binary Decision Diagram is a directed acyclic graph that s used to represent a boolean formula. In our case, the BDD can be interpreted as a function that for a pair (x,y) returns true iff there exists a points-to edge from x to y. The BDD is used then to store the points-to edges of the variables in the program. In order to support concurrent accesses, we implemented it in terms of a concurrent hash map. Sparse bit vector: A sparse bit vector is a linked list used to represent sparse sets of numbers in an ordered fashion. In our case, we use it for storing the load, store and copy neighbors of a node in the graph. Internally, the sparse bit vector is implemented as a lock-free linked list. Parallelism The parallel (inclusion-based) points-to analysis does not require the "full power" of a transactional memory system in order to preserve the sequential semantics, but is not an embarrasingly parallel algorithm either. As long as the multigraph supports concurrent accesses, there is a guarantee that the parallel analysis will reach the same fixpoint as the sequential implementation. In other words, we can apply the graph rewrite rules in any other and the final result will be identical to that of the sequential code (again, assuming that edges additions and the merging of nodes appear to be atomic) The parallelism intensity of the parallel algorithm is 100%, since all the active nodes in the worklist can be safely processed in parallel at any point during execution. Download and usage You can download our parallel implementation from here. (note: the file size is close to 150MB). The system requirements are identical to those of the Galois system. To analyze the gcc benchmark using two threads and a heap size of 1 gigabyte, run the following commands from the installation directory: ``` export TEST=gcc;export PROP=apps/hardekopfPointsTo/conf/\${TEST}.properties;export MEM=1g;export THREADS=2 ant clean dist-app -Dapp=hardekopfPointsTo ./scripts/galois -d -m \${MEM} -f \${PROP} -t \${THREADS} hardekopfPointsTo.main.Main ``` You should see the parallel analysis running once. The output is verified at the end of the run. If you want to analyze any of the other input programs included in the distribution (input/hardekopfPointsTo directory), simply update the TEST variable. Same applies if you want to use a different heap size, or number of threads. For instance, if we want to analyze linux using sixteen threads and a heap of 4 gigabytes, we would first type: ``` export TEST=linux;export PROP=apps/hardekopfPointsTo/conf/\${TEST}.properties;export MEM=4g;export THREADS=16 ``` The other two commands remain the same. Command-line options for Galois (number of runs, JVM tuning, etc) are listed in the manual. 