Constant Folding in Traditional and AI Compilers

Constant folding is a compiler optimization technique that evaluates constant expressions at compile time and replaces them with their computed values, reducing runtime computation overhead.

Constant Folding in Traditional Compilers

Traditional compilers identify constant expressions during compilation and replace them with precomputed results. For example:

seconds_per_day = 24 * 60 * 60

This is optimized to:

seconds_per_day = 86400

In CPython, this optimization is visible in the bytecode:

import dis
dis.dis("seconds_per_day = 24*60*60")

Output:

  1           0 LOAD_CONST               0 (86400)
              2 STORE_NAME               0 (seconds_per_day)
              4 LOAD_CONST               1 (None)
              6 RETURN_VALUE

The interpreter directly loads the constant 86400, avoiding three loads and two multiplications at runtime.

For an expression to be eligible for constant folding, all its subexpressions must be constants. This often requires constant propagation, which substitutes variables with their known constant values.

Consider this code:

int x = 14;
int y = 7 - x / 2;
return y * (28 / x + 2);

After constant propagation (replacing x with 14):

int x = 14;
int y = 7 - 14 / 2;
return y * (28 / 14 + 2);

Now, both 7 - 14/2 and 28/14 + 2 are constant expressions and can be folded:

int x = 14;
int y = 0;
return y * 4;

Further propagation replaces y with 0:

int x = 14;
int y = 0;
return 0 * 4;

Finally, 0 * 4 folds to 0:

int x = 14;
int y = 0;
return 0;

Constant propagation relies on data-flow analysis over the control flow graph (CFG). For each basic block n, the set of reachable definitions Reaches(n) is computed as:

\[ Reaches(n) = \bigcup_{m \in preds(n)} \left( DEDef(m) \cup (Reaches(m) \cap \overline{DefKill(m)}) \right) \]

Where:

  • preds(n): predecessors of block n
  • DEDef(m): definitions in block m that reach its exit without redefinition
  • DefKill(m): definitions killed (redefined) in block m

A definition is reachable at a use site if it appears in Reaches(n) and hasn't been redefined along the path from entry to the use.

Note: language-specific rules affect what qualifies as a compile-time constant. In Java, only final variables initialized with constant expressions are treated as compile-time constants. Thus:

String a = "a";
String bc = "bc";
String s1 = a + bc; // Not folded—uses StringBuilder at runtime

Languages also impose practical limits. In CPython:

  • 2**64 is folded (result fits in internal representation), but 4**64 is not:
dis.dis("x = 2**64")   # LOAD_CONST with 18446744073709551616
dis.dis("x = 4**64")   # BINARY_POWER remains

Similarly, string repetition is only folded if the result length ≤ 4096:

dis.dis("s = '-' * 4097")  # BINARY_MULTIPLY remains

Constant Folding in AI Compilers

AI compilers apply constant folding to computation graphs rather then abstract syntax trees. A node is folded if its inputs are compile-time constants or become constants after prior folding.

Common foldable patterns include:

  1. Constant-input operations: If all inputs to an op (e.g., AddN) are constant tensors, compute the result at compile time and replace the op with a constant node.
  2. Shape-dependent ops: Operations like Size or Shape depend only on input tensor shapes. If shapes are known statically, these ops are replaced with constant values. For a tensor of shape (1,2,3,4), Size() becomes the constant 24.
  3. Chained folding: Even if an op’s inputs aren’t initially constant, they may become so after folding upstream nodes. For example, if AddN1 is folded to a constant, and AddN2 consumes it along with another constant, AddN2 can then be folded too.

However, AI compilers often avoid folding very large constants to prevent graph bloat.

Implementation Examples

Traditional Compiler (CPython)

CPython uses astfold_expr to recursively traverse the AST. For binary operations:

  1. Fold left and right operands recursively
  2. If both are constants, call fold_binop
  3. Perform the operation (e.g., via PyNumber_Add)
  4. Replace the AST node with the result

AI Compiler (TensorFlow)

TensorFlow’s constant folding proceeds as follows:

  1. Traverse the graph in reverse post-order (ensuring inputs are processed before consumers)
  2. Identify constant nodes and foldable shape ops (e.g., Shape, Size)
  3. Build a "constant subgraph" containing all foldable nodes and their dependencies
  4. Mark nodes whose outputs feed non-constant parts of the graph as "to be folded"
  5. For each such node, execute it in a sandboxed environment (since all inputs are constants) and replace it with the resulting constant tensor

This approach ensures correctness while maximizing optimizaton opportunities across complex computation graphs.

Tags: compiler-optimization constant-folding ai-compilers python-bytecode TensorFlow

Posted on Sat, 16 May 2026 20:56:21 +0000 by stewart715