yuqi-zheng

Compiler Optimization: When LICM Fails (The Aliasing Trap)


Loop-invariant code motion (LICM) is one of the most dependable compiler optimizations: if a computation produces the same result on every iteration, the compiler moves it in front of the loop and computes it once. When the optimization unexpectedly fails to fire, the cause is often aliasing: the compiler cannot prove that a value is truly invariant, because some write inside the loop might change it.

The Setup

Start with a function that searches a C string for an exclamation mark:

#include <cstddef>  // size_t
#include <cstring>  // strlen

bool has_exclamation(const char* str) {
    for (size_t i = 0; i < strlen(str); ++i)
        if (str[i] == '!') return true;
    return false;
}

The call to strlen(str) appears in the loop condition. A human reader can see that str points to a fixed location and the loop body does not modify the string, so strlen should return the same value on every iteration. An optimizing compiler can reach the same conclusion, because it knows what strlen does and sees no write to memory inside the loop: it hoists strlen(str) out of the loop, computes it once, and uses the result as a fixed bound.

Now add one line:

std::size_t num_compares = 0;

bool has_exclamation(const char* str) {
    for (size_t i = 0; i < strlen(str); ++i) {
        ++num_compares;
        if (str[i] == '!') return true;
    }
    return false;
}

The only change is incrementing a global counter, yet the optimization disappears. strlen(str) is now called on every iteration, and since each call scans the whole string, the function goes from linear to quadratic in the length of the string.
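
To put a number on that cost, here is a minimal sketch that swaps strlen for an instrumented stand-in. The names counting_strlen, bytes_scanned, has_exclamation_rescan and has_exclamation_once are invented for this illustration. Because a user-defined function with a side effect cannot be hoisted, the first loop models what the un-hoisted code does and the second models the hoisted code.

```cpp
#include <cstddef>

// Hypothetical instrumentation: total number of non-terminator bytes visited.
std::size_t bytes_scanned = 0;

// Stand-in for strlen that records how much memory it walks.
std::size_t counting_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
        ++bytes_scanned;
    }
    return n;
}

// Models the un-hoisted loop: the length is recomputed before every iteration.
bool has_exclamation_rescan(const char* str) {
    for (std::size_t i = 0; i < counting_strlen(str); ++i)
        if (str[i] == '!') return true;
    return false;
}

// Models the hoisted loop: the length is computed exactly once.
bool has_exclamation_once(const char* str) {
    const std::size_t len = counting_strlen(str);
    for (std::size_t i = 0; i < len; ++i)
        if (str[i] == '!') return true;
    return false;
}
```

For a string of n characters with no exclamation mark, the rescanning version visits n(n+1) bytes inside the length function, against n for the single-scan version. With "abcd" that is 20 versus 4; with a megabyte of text it is the difference between a million byte reads and roughly a trillion.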

Why a Global Write Breaks LICM

The C and C++ standards define aliasing rules that specify which types may legally be used to access the same memory. One rule is that char*, unsigned char*, and (in C++) std::byte* may alias an object of any type. This is intentional: it allows code like memcpy to operate on the raw bytes of any object.

The consequence here is that the compiler cannot assume str and num_compares occupy separate memory. Specifically:

  • str is a char*, which can alias anything.
  • num_compares is a std::size_t stored at a global address.
  • The write ++num_compares modifies memory at that address.
  • For all the compiler knows, str might point into the bytes of num_compares, meaning the string being scanned could change whenever num_compares is written.

This sounds absurd, since no one would deliberately construct a char* pointing into a size_t, but the compiler’s aliasing analysis is a formal system operating on types and addresses, not on programmer intent. It must handle the worst case the type system permits.

Since str might alias num_compares, a write to num_compares might change the contents of the string, which means strlen(str) might return a different value after each write. Therefore, hoisting strlen(str) would be incorrect, and the compiler does not do it.
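
The absurd case can actually be built, which is why the compiler has to allow for it. The following is a sketch only: counter_as_string, Lengths and strlen_across_write are names made up here, and the bit pattern assumes 8-bit bytes and a std::size_t at least three bytes wide. It relies on inspecting an object's bytes through a char pointer, which mainstream implementations support.

```cpp
#include <cstddef>
#include <cstring>

std::size_t num_compares = 0;

// A char pointer aimed at the counter's own storage. Looking at an object's
// bytes through a char pointer is exactly what the aliasing rule permits.
const char* counter_as_string() {
    return reinterpret_cast<const char*>(&num_compares);
}

struct Lengths {
    std::size_t before;  // strlen observed while the counter is zero
    std::size_t after;   // strlen observed after the counter is overwritten
};

Lengths strlen_across_write() {
    const char* str = counter_as_string();

    num_compares = 0;  // every byte of the counter is '\0'
    const std::size_t before = std::strlen(str);

    // Set only the lowest and the highest byte to 1. The first byte in memory
    // is then nonzero under either byte order, and the second byte is zero,
    // so strlen stops inside the object.
    num_compares = std::size_t{1}
                 | (std::size_t{1} << (8 * (sizeof(std::size_t) - 1)));
    const std::size_t after = std::strlen(str);

    return {before, after};
}
```

Under those assumptions the function returns before == 0 and after == 1: an ordinary assignment to num_compares changed the result of strlen(str), with str itself never reassigned. A compiler that hoisted strlen out of a loop containing that write would compute the wrong bound.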

Compiler Behavior in Practice

The behavior varies across compilers:

Clang is somewhat aggressive. If num_compares is declared static and its address is never taken or passed out of the translation unit, Clang may determine that no external char* can reach it and optimize accordingly. Once the variable is genuinely used across translation units or its address is taken, the optimization fails.

GCC and MSVC are typically more conservative. Even with types that should not alias (for example, after replacing const char* with a different pointer type), these compilers may still decline to hoist the strlen call in some configurations.

The author of the original series filed optimization deficiency reports against GCC and LLVM for this specific pattern.

Fixes

Use string_view

#include <string_view>

bool has_exclamation(std::string_view sv) {
    for (char c : sv) {
        ++num_compares;
        if (c == '!') return true;
    }
    return false;
}

std::string_view stores its length as a plain integer member rather than deriving it by scanning memory. Accessing .size() is an integer load, not a walk through a pointer. Here sv is a by-value parameter, a local object whose address is never taken, so the compiler can see that ++num_compares does not modify it and that the size is loop-invariant. The range-based for loop also evaluates the begin and end of the range only once, before the first iteration. LICM succeeds.

This is the preferred solution. It also communicates intent more clearly: string_view conveys “a fixed-length view of a string” rather than “a pointer to a null-terminated sequence of unknown length.”
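
Callers that still hold a C string can adopt this without much ceremony. As a sketch (c_string_has_exclamation is a hypothetical wrapper, not something from the original code), constructing the string_view from a const char* measures the length once, at the call boundary, and the loop never touches strlen again:

```cpp
#include <cstddef>
#include <string_view>

std::size_t num_compares = 0;

bool has_exclamation(std::string_view sv) {
    for (char c : sv) {
        ++num_compares;
        if (c == '!') return true;
    }
    return false;
}

// Hypothetical caller that only has a null-terminated string. The
// string_view constructor computes the length here, once, before the loop.
bool c_string_has_exclamation(const char* str) {
    return has_exclamation(std::string_view{str});
}
```

A string literal or std::string converts implicitly as well, so has_exclamation("hi!") and has_exclamation(some_std_string) both work unchanged.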

Cache the length explicitly

bool has_exclamation(const char* str) {
    const size_t len = strlen(str);
    for (size_t i = 0; i < len; ++i) {
        ++num_compares;
        if (str[i] == '!') return true;
    }
    return false;
}

Storing the length in a const local variable makes the invariant explicit. A local whose address is never taken cannot alias anything: no pointer in the program can reach it, and the compiler knows that. The loop condition reads len from a register or stack slot that no write in the loop can touch, so LICM is not even needed; the value is simply used directly.

Reduce global state

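The global counter itself is a structural problem. A global with external linkage can be reached from anywhere in the program, so the compiler usually cannot rule out that some pointer refers to it. Passing the counter in as a reference parameter does not help by itself, because str could just as legally point into the referenced object. Accumulating the count in a local variable and writing it out once after the loop, or returning it as part of the result, keeps writes to shared memory out of the loop body and gives the compiler more information about where writes go.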
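
One way to apply this while keeping the global statistic is sketched below. The variable local_compares is a name chosen for this example, and whether a given compiler then hoists the strlen call still depends on the compiler and its flags.

```cpp
#include <cstddef>
#include <cstring>

std::size_t num_compares = 0;

bool has_exclamation(const char* str) {
    std::size_t local_compares = 0;  // address never taken: cannot alias *str
    bool found = false;
    for (std::size_t i = 0; i < std::strlen(str); ++i) {
        ++local_compares;            // no store to shared memory in the loop
        if (str[i] == '!') {
            found = true;
            break;
        }
    }
    num_compares += local_compares;  // single global write, after the loop
    return found;
}
```

The loop body no longer writes to anything a char* could reach, so nothing in it can change the string. If str really did point into num_compares, this version would behave differently from the original; that is the case the compiler had to allow for and the programmer is entitled to rule out.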

The Broader Lesson

The char* aliasing rule is one of the most far-reaching in C and C++. Because character pointers can legally alias anything, it cuts both ways: a write through a char* may modify any object, and a write to any object may change what is later read through a char*. Either one creates aliasing uncertainty that blocks a wide range of optimizations.

This is not a miscompilation. Declining to hoist is the correct, conservative behavior given the language rules, even where a sharper analysis could do better. The fix is to use types and structures that let the compiler prove the invariants it needs: string_view instead of char*, local variables instead of globals, values cached in locals instead of values re-derived through pointers.

The signature of this problem in practice is a performance regression caused by a logically unrelated change. Adding a counter, a debug statement, or a statistics update inside a loop should not affect how the loop bound is computed, but through the aliasing rules it can. Checking the generated assembly in Compiler Explorer and looking for unexpected calls or memory reads in the loop condition is the fastest way to diagnose this class of issue.
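
When reading assembly is not convenient, a rough timing check can point in the same direction. The sketch below is an assumption-laden helper rather than a proper benchmark: seconds_to_scan and doubling_ratio are invented names, a single run is noisy, and the input size has to be large enough for the clock to register anything.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>

std::size_t num_compares = 0;

// The loop under suspicion, copied from the counter example.
bool has_exclamation(const char* str) {
    for (std::size_t i = 0; i < std::strlen(str); ++i) {
        ++num_compares;
        if (str[i] == '!') return true;
    }
    return false;
}

// Hypothetical helper: seconds taken to scan n characters that contain
// no '!' (the worst case for the search).
double seconds_to_scan(std::size_t n) {
    const std::string text(n, 'a');
    const auto start = std::chrono::steady_clock::now();
    const bool found = has_exclamation(text.c_str());
    const auto stop = std::chrono::steady_clock::now();
    (void)found;  // the result is not needed, only the elapsed time
    return std::chrono::duration<double>(stop - start).count();
}

// Ratio of scan times when the input doubles. A value near 2 suggests a
// linear loop; a value near 4 suggests the bound is recomputed per iteration.
double doubling_ratio(std::size_t n) {
    const double t1 = seconds_to_scan(n);
    const double t2 = seconds_to_scan(2 * n);
    return t2 / t1;
}
```

Calling doubling_ratio with a size in the hundreds of thousands, in an optimized build, and repeating it a few times gives a usable hint. It is a complement to inspecting the assembly, not a replacement for it.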