Structure-Aware Fuzzing for Floating-Point Code with libFuzzer
Coverage-guided fuzzers like libFuzzer are excellent at finding bugs in parsers and protocol implementations. They struggle with floating-point code because they treat the input as an opaque byte string, making it unlikely to generate the values that actually matter: NaN, ±∞, -0.0, DBL_EPSILON, subnormals. A custom mutator fixes this.
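To make the difficulty concrete, the sketch below prints nothing and simply checks the raw bit patterns of a few of these values. It assumes the usual 64-bit IEEE 754 `double`; the helper names `bits_of`, `has_all_ones_exponent`, and `check_special_patterns` are made up for this illustration. Each infinity is a single pattern out of 2^64, and only patterns with all 11 exponent bits set (1 in 2048) encode ±∞ or a NaN.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Raw bit pattern of a double (assumes the usual 64-bit IEEE 754 binary64).
std::uint64_t bits_of(double d) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);  // well-defined way to read the bytes
    return u;
}

// True when all 11 exponent bits are set, i.e. the value is ±∞ or a NaN.
bool has_all_ones_exponent(double d) {
    return ((bits_of(d) >> 52) & 0x7FF) == 0x7FF;
}

// Spot checks of the patterns a byte-level mutator would have to hit.
void check_special_patterns() {
    using L = std::numeric_limits<double>;
    assert(bits_of(L::infinity())   == 0x7FF0000000000000ULL);
    assert(bits_of(-L::infinity())  == 0xFFF0000000000000ULL);
    assert(bits_of(-0.0)            == 0x8000000000000000ULL);
    assert(bits_of(L::epsilon())    == 0x3CB0000000000000ULL);  // 2^-52
    assert(bits_of(L::denorm_min()) == 0x0000000000000001ULL);
    assert(has_all_ones_exponent(L::quiet_NaN()));
    assert(!has_all_ones_exponent(L::max()));
}
```

Calling `check_special_patterns()` from a small test driver is enough to confirm the patterns on a given platform.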
The Problem: A Function That “Never” Returns NaN
Consider this accumulation function that explicitly skips NaN inputs:
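#include <cmath>    // std::isnan
#include <numeric>  // std::accumulate
// Sum the range, ignoring NaN elements.
double sum(const double* begin, const double* end) {
    return std::accumulate(begin, end, 0.0, [](auto a, auto b) {
        return std::isnan(b) ? a : a + b;
    });
}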
Intuition says this can never return NaN, since NaN inputs are filtered out. To test that claim with libFuzzer, the target is straightforward:
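#include <cstddef>  // size_t
#include <cstdint>  // uint8_t
#include <cstdlib>  // std::abort

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    // Treat the input as an array of doubles; trailing bytes are ignored.
    auto* begin = reinterpret_cast<const double*>(Data);
    auto* end = begin + Size / sizeof(double);
    if (std::isnan(sum(begin, end))) {
        std::abort();  // property violated: sum() returned NaN
    }
    return 0;
}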
Without a custom mutator, libFuzzer's byte-level mutations rarely land on the special IEEE 754 bit patterns; each infinity, for example, is one exact 64-bit value. The fuzzer might never find the bug. With a custom mutator, it finds it almost instantly.
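The target above reads the byte buffer through a `double*`, which relies on the buffer being suitably aligned for `double`. That typically holds for libFuzzer's buffers, but a variant that copies the bytes first avoids depending on it. The following is only a sketch: `load_doubles` and `check_never_nan` are hypothetical names, and in a real target the body of `check_never_nan` would live in `LLVMFuzzerTestOneInput`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <numeric>
#include <vector>

// Same function as in the article: sum the range, ignoring NaN elements.
double sum(const double* begin, const double* end) {
    return std::accumulate(begin, end, 0.0, [](auto a, auto b) {
        return std::isnan(b) ? a : a + b;
    });
}

// Copy the input bytes into properly aligned doubles.
// Trailing bytes that do not fill a whole double are dropped.
std::vector<double> load_doubles(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<double> values(size / sizeof(double));
    if (!values.empty()) {
        std::memcpy(values.data(), data, values.size() * sizeof(double));
    }
    return values;
}

// Hypothetical stand-in for the fuzz target body: abort if sum() returns NaN.
int check_never_nan(const std::uint8_t* data, std::size_t size) {
    const std::vector<double> values = load_doubles(data, size);
    if (std::isnan(sum(values.data(), values.data() + values.size()))) {
        std::abort();
    }
    return 0;
}
```

The copy costs one allocation per input, which is usually negligible next to the fuzzing loop itself.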
Custom Mutator
The mutator produces arrays of double values drawn from the interesting corners of IEEE 754:
#include <algorithm>  // std::shuffle
#include <limits>     // std::numeric_limits
#include <random>     // std::minstd_rand and the distributions

extern "C" size_t LLVMFuzzerCustomMutator(
    uint8_t* Data, size_t Size, size_t MaxSize, unsigned int Seed)
{
    // View the buffer as doubles; trailing bytes that do not fill a double are dropped.
    auto* begin = reinterpret_cast<double*>(Data);
    auto* end = begin + Size / sizeof(double);
    std::minstd_rand gen(Seed);

    // Pick a special value (cases 0-9) or a random value in [-1, 1) (case 10).
    auto interesting_double = [&]() -> double {
        switch (std::uniform_int_distribution<>(0, 10)(gen)) {
        case 0: return std::numeric_limits<double>::quiet_NaN();
        case 1: return std::numeric_limits<double>::min(); // smallest normal
        case 2: return std::numeric_limits<double>::max();
        case 3: return -std::numeric_limits<double>::min();
        case 4: return -std::numeric_limits<double>::max();
        case 5: return std::numeric_limits<double>::epsilon();
        case 6: return -std::numeric_limits<double>::epsilon();
        case 7: return std::numeric_limits<double>::infinity();
        case 8: return -std::numeric_limits<double>::infinity();
        case 9: return 0.0;
        default: return std::uniform_real_distribution<>(-1.0, 1.0)(gen);
        }
    };

    switch (std::uniform_int_distribution<>(0, 3)(gen)) {
    case 0: // mutate an existing element
        if (begin != end) {
            auto idx = std::uniform_int_distribution<>(0, static_cast<int>(end - begin) - 1)(gen);
            begin[idx] = interesting_double();
        }
        break;
    case 1: // append an element if the buffer has room
        if (Size + sizeof(double) <= MaxSize) {
            *end++ = interesting_double();
        }
        break;
    case 2: // remove the last element
        if (begin != end) --end;
        break;
    case 3: // shuffle
        std::shuffle(begin, end, gen);
        break;
    }
    // New size in bytes: always a whole number of doubles.
    return (end - begin) * sizeof(double);
}
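The value table in the mutator above does not produce -0.0 or subnormals, although both are boundary values of IEEE 754. One possible extension is a small table-driven picker. This is a sketch: `pick_special` and `is_subnormal` are hypothetical helpers, not part of the original mutator, and which values belong in the table is a judgment call.

```cpp
#include <array>
#include <cassert>  // used by the check in the usage note
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical helper: draw one value from a table of signed zeros and subnormals.
double pick_special(std::minstd_rand& gen) {
    using L = std::numeric_limits<double>;
    static const std::array<double, 5> table = {
        0.0,
        -0.0,                          // negative zero
        L::denorm_min(),               // smallest positive subnormal
        -L::denorm_min(),
        std::nextafter(L::min(), 0.0)  // largest subnormal, just below min()
    };
    std::uniform_int_distribution<std::size_t> pick(0, table.size() - 1);
    return table[pick(gen)];
}

// True for nonzero values below the smallest normal magnitude.
bool is_subnormal(double d) {
    return std::fpclassify(d) == FP_SUBNORMAL;
}
```

Inside the mutator, an extra `case` in `interesting_double` could return `pick_special(gen)`, with the range of the distribution widened to match. A quick sanity check is `assert(is_subnormal(std::numeric_limits<double>::denorm_min()));`.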
Custom Crossover
The crossover function combines two inputs by selecting each element from either parent with equal probability. The result is as long as the shorter parent, capped by MaxOutSize:
#include <algorithm>  // std::min
#include <random>     // std::minstd_rand, std::bernoulli_distribution

extern "C" size_t LLVMFuzzerCustomCrossOver(
    const uint8_t* Data1, size_t Size1,
    const uint8_t* Data2, size_t Size2,
    uint8_t* Out, size_t MaxOutSize, unsigned int Seed)
{
    std::minstd_rand gen(Seed);
    std::bernoulli_distribution coin(0.5);
    // Number of whole doubles that fit in both parents and in the output buffer.
    size_t n = std::min({Size1, Size2, MaxOutSize}) / sizeof(double);
    for (size_t i = 0; i < n; ++i) {
        // Element i comes from the same position in one of the two parents.
        reinterpret_cast<double*>(Out)[i] = coin(gen)
            ? reinterpret_cast<const double*>(Data1)[i]
            : reinterpret_cast<const double*>(Data2)[i];
    }
    return n * sizeof(double);
}
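The same uniform crossover can be written against `std::vector`, which makes it easy to unit-test outside the fuzzer. The function below is a sketch of that idea; `uniform_crossover` is a hypothetical name and is not part of the libFuzzer interface.

```cpp
#include <algorithm>
#include <cassert>  // used by the check in the usage note
#include <cstddef>
#include <random>
#include <vector>

// Uniform crossover on vectors: element i is taken from a[i] or b[i] with
// equal probability; the child is as long as the shorter parent.
std::vector<double> uniform_crossover(const std::vector<double>& a,
                                      const std::vector<double>& b,
                                      unsigned int seed) {
    std::minstd_rand gen(seed);
    std::bernoulli_distribution coin(0.5);
    std::vector<double> child(std::min(a.size(), b.size()));
    for (std::size_t i = 0; i < child.size(); ++i) {
        child[i] = coin(gen) ? a[i] : b[i];
    }
    return child;
}
```

A property worth asserting in a test is that every `child[i]` equals either `a[i]` or `b[i]`, and that the same seed yields the same child, for example `assert(uniform_crossover(a, b, 7) == uniform_crossover(a, b, 7));`.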
Compile and Run
clang++ -g -fsanitize=fuzzer fpfuzzing.cpp -o fpfuzzing
./fpfuzzing
The fuzzer finds a crashing input almost immediately.
What It Found
The crashing input contains both +∞ and -∞. NaN elements in between are skipped and zeros leave the running sum unchanged, so an input such as {+∞, NaN, NaN, -∞} ends up computing +∞ + (-∞), which IEEE 754 defines as NaN. The function does return NaN: the filter guards only against NaN inputs, not against NaN results produced by combining infinities.
SUMMARY: libFuzzer: deadly signal
Base64: AAAAAAAA8H8AAAAAAAAAAAAAAAAAAPD/
Decoded as little-endian doubles, this payload is [+∞, 0.0, -∞].
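The failure is easy to reproduce outside the fuzzer. The sketch below repeats `sum` from the article and adds two hypothetical helpers, `sum_of` and `check_sum_examples`, to check a few inputs: NaN elements alone are handled as intended, but any input that accumulates +∞ and later meets -∞ yields NaN, whether the elements in between are NaN or zero.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>
#include <limits>
#include <numeric>

// Same function as in the article: sum the range, ignoring NaN elements.
double sum(const double* begin, const double* end) {
    return std::accumulate(begin, end, 0.0, [](auto a, auto b) {
        return std::isnan(b) ? a : a + b;
    });
}

// Hypothetical convenience wrapper so inputs can be written inline.
double sum_of(std::initializer_list<double> values) {
    return sum(values.begin(), values.end());
}

void check_sum_examples() {
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // The filter works for NaN inputs on their own.
    assert(sum_of({nan, 1.0, nan, 2.0}) == 3.0);
    assert(sum_of({inf, nan}) == inf);

    // Opposite infinities produce NaN as a result of the addition itself.
    assert(std::isnan(sum_of({inf, nan, nan, -inf})));
    assert(std::isnan(sum_of({inf, 0.0, -inf})));
}
```

One way to make the property hold would be to test the running sum as well as the incoming element, though whether NaN or some other value is the right answer for {+∞, -∞} depends on the application.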
Why This Matters
This is property-based testing with a coverage-guided engine. The property being tested is “this function never produces NaN.” The fuzzer falsifies it in milliseconds by generating inputs that a human reviewer would likely miss.
The same technique extends to more complex domains: symmetric matrices, linear systems, functions with specific numeric constraints. Wherever the input has structure that random bytes will not naturally satisfy, a custom mutator pays for itself immediately.
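For the symmetric-matrix case, one possible approach is to let the mutator change elements freely and then run a repair pass that restores the invariant before returning. The sketch below assumes a row-major n×n layout of doubles in the input buffer; `square_dim`, `symmetrize`, and `is_symmetric` are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Largest n such that an n x n matrix fits in `count` doubles.
std::size_t square_dim(std::size_t count) {
    std::size_t n = 0;
    while ((n + 1) * (n + 1) <= count) ++n;
    return n;
}

// Repair pass: copy the upper triangle onto the lower triangle (row-major).
void symmetrize(double* m, std::size_t n) {
    assert(m != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            m[j * n + i] = m[i * n + j];
        }
    }
}

// Bitwise comparison, so mirrored NaN entries also count as equal.
bool is_symmetric(const double* m, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            if (std::memcmp(&m[i * n + j], &m[j * n + i], sizeof(double)) != 0) {
                return false;
            }
        }
    }
    return true;
}
```

A custom mutator built this way would compute `n = square_dim(Size / sizeof(double))`, mutate elements as before, call `symmetrize`, and return `n * n * sizeof(double)`.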