GCD test

In compiler theory, a greatest common divisor test (GCD test) is the test used in study of loop optimization and loop dependence analysis to test the dependency between loop statements.

Description

A greatest common divisor (GCD) test is a test used in computer science compiler theory to study of loop optimization and loop dependence analysis to test the dependency between loop statements.

Use

Whenever a sequential loop like for loop is made to be parallel so that it can be executed on more than one processor—as in case of grid computing or cluster computing—then certain dependencies (e.g., testing the flow (true) dependence of a statement) are checked to know whether the loop can be parallelized. According to this test, by comparing the indices of two arrays present in two or more statements, it can be calculated whether it is legal to parallelize the loop or not.

Rationale

Theorem

A linear Diophantine equation

 a1*x1 + a2*x2 +... + an*xn =c

has an integer solution x1, x2,..., xn iff GCD (a1,a2,.., an) divides c.

E.g.

 2*x1 -2*x2 =1

GCD(2,-2) =2, 2 cannot divide 1. So, there is no integer solution for the equation above.

Dependency analysis

It is difficult to analyze array references in compile time to determine data dependency (whether they point to same address or not). A simple and sufficient test for the absence of a dependence is the greatest common divisor (GCD) test. It is based on the observation that if a loop carried dependency exists between X[a*i + b] and X[c*i + d] (where X is the array; a, b, c and d are integers, and i is the loop variable), then GCD (c, a) must divide (d – b). The assumption is that the loop must be normalized – written so that the loop index/variable starts at 1 and gets incremented by 1 in every iteration. For example, in the following loop, a=2, b=3, c=2, d=0 and GCD(a,c)=2 and (d-b) is -3. Since 2 does not divide -3, no dependence is possible.

for (i=1; i<=100; i++)
{
  X[2*i+3] = X[2*i] + 50;
}

Process

Loop code in general:

for (int i=0; i<n; i++)
{
  s1   a[x*i+k] = ...;
  s2   ... = a[y*i+m];               
}

To decide if there is loop carried dependence (two array references access the same memory location and one of them is a write operation) between a[x*i+k] and a[y*i+m], one usually^{[weasel words]} needs to solve the equation^[why?]

x*i1 +k = y*i2+m   (Or x*i1 -y*i2 = m -k)

Where 0<=i1, i2 <n and i1 != i2.

If GCD(x,y) divides (m-k), then there may exist some dependency in the loop statement s1 and s2. If GCD(x,y) does not divide (m-k) then both statements are independent and can be executed at parallel. Similarly this test is conducted for all statements present in a given loop.

A concrete example source code in C would appear as:

for (int i=0; i<100; i++)
{
  s1 a[2*i] = b[i];
  s2 c[i] = a[4*i+1];
}

The GCD of (2,4) is 2 and dividend is 1. As 2 can not divide 1 properly (leaving remainder zero), there is no dependency between s1 and s2 and various other loop transformation methods can be applied.

References

Advanced Compiler Design and Implementation by Steven S Muchnick