Byte Order Reversal in C++: Bit Twiddling vs. Compiler Builtins
Converting between big-endian and little-endian representations is a common task in network programming, binary file parsing, and hardware interfacing. C++ offers two natural approaches: manual bit manipulation and compiler builtins. The assembly an optimizing compiler generates for the two is identical, but the builtin is far more readable.
The Manual Approach
For a 32-bit integer, byte reversal requires shuffling each of the four bytes into the opposite position:
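void reverse32(unsigned int* p) {
    unsigned int& a = *p;
    a = ((a & 0xff000000) >> 24)   // highest byte moves to the lowest position
      | ((a & 0x00ff0000) >> 8)    // second-highest byte moves down one position
      | ((a & 0x0000ff00) << 8)    // second-lowest byte moves up one position
      | ((a & 0x000000ff) << 24);  // lowest byte moves to the highest position
}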
The 64-bit version extends this to eight bytes:
void reverse64(unsigned long long* b) {
    unsigned long long& a = *b;
    a = ((a & 0xff00000000000000ULL) >> 56)
      | ((a & 0x00ff000000000000ULL) >> 40)
      | ((a & 0x0000ff0000000000ULL) >> 24)
      | ((a & 0x000000ff00000000ULL) >> 8)
      | ((a & 0x00000000ff000000ULL) << 8)
      | ((a & 0x0000000000ff0000ULL) << 24)
      | ((a & 0x000000000000ff00ULL) << 40)
      | ((a & 0x00000000000000ffULL) << 56);
}
Correct, but verbose. The intent is buried in eight mask-and-shift operations.
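To make the byte shuffle concrete, here is a small sketch that repeats the 32-bit function so it stands alone and wraps it in a value-returning helper. The name reversed_copy is hypothetical and not part of the original code, and the sketch assumes unsigned int is 32 bits wide, as it is on common desktop and server platforms.

```cpp
// reverse32 as shown in the article, repeated so this sketch is self-contained.
// Assumption: unsigned int is 32 bits wide.
void reverse32(unsigned int* p) {
    unsigned int& a = *p;
    a = ((a & 0xff000000) >> 24)
      | ((a & 0x00ff0000) >> 8)
      | ((a & 0x0000ff00) << 8)
      | ((a & 0x000000ff) << 24);
}

// Hypothetical helper: swaps a copy of v and returns it, which is
// convenient for checking values without touching the original.
unsigned int reversed_copy(unsigned int v) {
    reverse32(&v);
    return v;
}
```

With this helper, the bytes 12 34 56 78 of 0x12345678 come back mirrored as 0x78563412, and applying the swap twice restores the original value.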
Compiler Builtins
GCC and Clang provide __builtin_bswap32 and __builtin_bswap64 that express the intent directly:
void reverse32_builtin(unsigned int* p) {
    *p = __builtin_bswap32(*p);
}

void reverse64_builtin(unsigned long long* p) {
    *p = __builtin_bswap64(*p);
}
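As an illustration of where the builtin tends to show up in practice, the following sketch reads a big-endian 32-bit field (for example a length prefix in a network packet) from a byte buffer. The helper name load_be32 is hypothetical, and the endianness check relies on the __BYTE_ORDER__ macro that GCC and Clang predefine; other compilers would need a different test.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a big-endian 32-bit value from src.
// Assumes a GCC- or Clang-compatible compiler (for __builtin_bswap32
// and the predefined __BYTE_ORDER__ macro).
inline std::uint32_t load_be32(const unsigned char* src) {
    std::uint32_t v;
    // memcpy sidesteps alignment and aliasing problems that a pointer
    // cast would introduce; compilers typically turn it into a plain load.
    std::memcpy(&v, src, sizeof v);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    // The host stores the least significant byte first, so reverse the
    // bytes to recover the value as written on the wire.
    v = __builtin_bswap32(v);
#endif
    return v;
}
```

On a big-endian host the swap is compiled out, so the same call works on either kind of machine.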
Assembly Output
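With optimization enabled on x86-64, the compiler recognizes the manual mask-and-shift pattern as a byte swap, so each manual function compiles to exactly the same code as its builtin counterpart, built around a single bswap instruction: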
32-bit:
reverse32:
    mov eax, DWORD PTR [rdi]
    bswap eax
    mov DWORD PTR [rdi], eax
    ret
reverse32_builtin:
    mov eax, DWORD PTR [rdi]
    bswap eax
    mov DWORD PTR [rdi], eax
    ret
64-bit:
reverse64:
    mov rax, QWORD PTR [rdi]
    bswap rax
    mov QWORD PTR [rdi], rax
    ret
reverse64_builtin:
    mov rax, QWORD PTR [rdi]
    bswap rax
    mov QWORD PTR [rdi], rax
    ret
Each function is a load, a bswap, and a store, followed by the return. There is no performance difference between the approaches.
Recommendations
Use __builtin_bswap32 / __builtin_bswap64 over manual bit manipulation. The compiler recognizes and optimizes both, but the builtin makes intent explicit.
Use fixed-width types. Prefer uint32_t and uint64_t (declared in <cstdint>) over unsigned int and unsigned long long to guarantee the correct width across platforms.
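A minimal sketch of what the fixed-width recommendation might look like, using value-returning wrappers around the GCC/Clang builtins. The names reverse32_fixed and reverse64_fixed are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical fixed-width wrappers. Assumes a GCC- or Clang-compatible
// compiler, since __builtin_bswap32/64 are compiler extensions.
inline std::uint32_t reverse32_fixed(std::uint32_t v) {
    return __builtin_bswap32(v);
}

inline std::uint64_t reverse64_fixed(std::uint64_t v) {
    return __builtin_bswap64(v);
}
```

Taking and returning values rather than pointers also avoids a practical snag: on typical 64-bit Linux systems uint64_t is an alias for unsigned long, so a uint64_t* cannot be passed to a function declared to take unsigned long long* without a cast.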
In C++23, use std::byteswap. It is the standard, type-safe, portable spelling:
#include <bit>
#include <cstdint>
std::uint32_t x = std::byteswap(value);  // value is a std::uint32_t
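For code that has to build both with and without C++23 library support, one possible approach is to pick the implementation with the __cpp_lib_byteswap feature-test macro. The wrapper name byteswap32 below is hypothetical, and the fallback branch assumes a GCC- or Clang-compatible compiler; this is a sketch under those assumptions rather than a complete portability layer.

```cpp
#include <cstdint>

// Pull in <version> and <bit> only where they exist, so the sketch still
// compiles with older toolchains and standard modes.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>  // defines library feature-test macros
#  endif
#  if __has_include(<bit>)
#    include <bit>      // declares std::byteswap when the library provides it
#  endif
#endif

// Hypothetical wrapper: use std::byteswap when the standard library
// advertises it, otherwise fall back to the compiler builtin.
inline std::uint32_t byteswap32(std::uint32_t v) {
#if defined(__cpp_lib_byteswap)
    return std::byteswap(v);
#else
    return __builtin_bswap32(v);  // assumption: GCC or Clang
#endif
}
```

Call sites stay the same either way, so the fallback branch can be deleted once every supported toolchain provides std::byteswap.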
The manual bit-twiddling version is a useful exercise for understanding what the hardware does, but in production code it adds noise without adding correctness or performance.