yuqi-zheng

Compiler Optimization: Why xor eax, eax Instead of mov eax, 0


If you have spent any time reading x86 assembly output from a C or C++ compiler, you have almost certainly seen this:

xor eax, eax

The intent is clear enough: it zeros the eax register. But why this form? The more obvious instruction, mov eax, 0, also zeros the register, and its meaning is immediately apparent to anyone who reads assembly. Yet GCC, Clang, and MSVC all prefer the xor form in optimized builds. There are two distinct reasons: instruction encoding size and CPU microarchitecture.

The Encoding Argument

mov eax, 0 must encode the immediate value zero as a full 32-bit operand, even though that value is trivially small. The machine code is:

b8 00 00 00 00

Five bytes. The first byte is the opcode; the remaining four encode the literal zero.

xor eax, eax, by contrast, is a register-to-register operation with no immediate at all:

31 c0

Two bytes. That is a 60% reduction in encoding size for a single instruction. In a program that zeros many registers (which is common in function prologues, loop initialization, and return value setup) this adds up. Smaller code means more instructions fit in the L1 instruction cache and in each fetched block, which reduces pressure on the front end.

The Microarchitecture Argument

The encoding size alone would justify the preference, but modern x86 CPUs go further: they recognize xor reg, reg (and a few analogous forms like sub reg, reg) as a special zeroing idiom, and handle it differently from a typical arithmetic instruction.

Ordinarily, xor eax, eax would create a data dependency on the previous value of eax. Semantically, XOR reads both operands and produces a result. If some earlier instruction wrote to eax and has not yet completed, the XOR would have to wait for it.

But the CPU knows that xor reg, reg always produces zero, regardless of the register’s current value. It need not read the operand at all. This means:

  • The false dependency is eliminated. The instruction does not wait for the previous writer of eax to finish.
  • Register renaming can allocate a fresh physical register initialized to zero, rather than checking or modifying the existing one.
  • In some microarchitectures, the instruction is handled entirely in the rename stage without consuming an execution port at all. It is marked as producing zero and retires normally, but the execution units stay free.

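The result is that xor eax, eax can have effectively zero latency on modern Intel and AMD cores, and on cores that eliminate it at rename it needs no execution port. It still occupies a slot in the front end and in the reorder buffer, so it is cheap rather than free. mov eax, 0 does not depend on the old value of eax either, because it overwrites the whole register, but it gets no special treatment: it is generally executed as an ordinary ALU operation, and it is three bytes longer.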

Why eax and Not rax?

In 64-bit mode, writing to a 32-bit register automatically zeros the upper 32 bits of the corresponding 64-bit register. This is an explicit part of the x86-64 specification. So xor eax, eax actually zeros all 64 bits of rax, the same effect as xor rax, rax.

The rax form requires a REX.W prefix byte to indicate 64-bit operand size, making it three bytes instead of two. Since the 32-bit form already produces the correct result and is shorter, compilers consistently use it.

For the extended registers r8 through r15, a REX prefix is required regardless (to encode the register number), so xor r8d, r8d and xor r8, r8 are both three bytes. Compilers still use the 32-bit form for consistency and because it remains recognized as a zeroing idiom.

Comparison

Method          Size      Execution                              Zeros full 64-bit register
mov eax, 0      5 bytes   Ordinary ALU op, no input dependency   Yes (writes eax, clears upper 32)
xor eax, eax    2 bytes   Zeroing idiom, no dependency           Yes
xor rax, rax    3 bytes   Zeroing idiom, no dependency           Yes

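The xor eax, eax form comes out ahead on each of these: smallest encoding, best microarchitectural treatment, full 64-bit zeroing effect. Its one side effect is that, like any xor, it overwrites the flags, so a compiler that needs the flags preserved across the zeroing will use mov instead.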

The Broader Pattern

This is a good example of how compiler output can look strange to a human reader but be entirely deliberate. The instruction does not look like it is zeroing a register, but the CPU understands exactly what it means. The compiler is writing for the CPU, not for the person reading the disassembly.

It also illustrates how ISA quirks, like the rule that 32-bit writes zero the upper 32 bits, create optimization opportunities that compilers exploit systematically. You do not need to think about any of this when writing C or C++; in an optimized build, writing return 0; or int x = 0; will typically produce xor eax, eax without any effort on your part. But understanding why it appears there is useful background for reading assembly and for reasoning about what compilers do.