yuqi-zheng

Structure-Aware Fuzzing for Floating-Point Code with libFuzzer


Coverage-guided fuzzers like libFuzzer are excellent at finding bugs in parsers and protocol implementations. They struggle with floating-point code because they treat the input as an opaque byte string, so they are unlikely to generate the values that actually matter: NaN, ±∞, -0.0, DBL_EPSILON, and subnormals. A custom mutator fixes this.


The Problem: A Function That “Never” Returns NaN

Consider this accumulation function that explicitly skips NaN inputs:

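#include <cmath>    // std::isnan
#include <numeric>  // std::accumulate

// Sum the range [begin, end), skipping every NaN element.
double sum(const double* begin, const double* end) {
    return std::accumulate(begin, end, 0.0, [](auto a, auto b) {
        return std::isnan(b) ? a : a + b;
    });
}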

Intuition says this can never return NaN, because NaN inputs are filtered out. To test that claim with libFuzzer, the target is straightforward:

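#include <cstddef>  // size_t
#include <cstdint>  // uint8_t
#include <cstdlib>  // std::abort

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    // Reinterpret the raw bytes as doubles; trailing bytes that do not fill
    // a whole double are ignored.
    auto* begin = reinterpret_cast<const double*>(Data);
    auto* end   = begin + Size / sizeof(double);
    if (std::isnan(sum(begin, end))) {
        std::abort();  // property violated: report it as a crash
    }
    return 0;
}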
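The target above reinterprets the fuzzer's byte buffer directly as doubles, which assumes the buffer is suitably aligned for double. A minimal sketch of a variant that does not rely on that assumption is shown here; bytes_to_doubles is a hypothetical helper introduced for illustration, not part of libFuzzer.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <numeric>
#include <vector>

// Same accumulation as in the article: NaN inputs are skipped.
double sum(const double* begin, const double* end) {
    return std::accumulate(begin, end, 0.0, [](double a, double b) {
        return std::isnan(b) ? a : a + b;
    });
}

// Hypothetical helper: copy the raw bytes into properly aligned doubles.
// Trailing bytes that do not fill a whole double are ignored.
std::vector<double> bytes_to_doubles(const uint8_t* data, size_t size) {
    std::vector<double> values(size / sizeof(double));
    if (!values.empty()) {
        std::memcpy(values.data(), data, values.size() * sizeof(double));
    }
    return values;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    const std::vector<double> values = bytes_to_doubles(Data, Size);
    if (std::isnan(sum(values.data(), values.data() + values.size()))) {
        std::abort();  // property violated: report it as a crash
    }
    return 0;
}
```

The copy costs one allocation per input, which is usually negligible next to the work the target does, and it keeps alignment questions out of any sanitizer reports.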

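Without a custom mutator, libFuzzer mutates raw bytes. Every 8-byte pattern is a valid double, but the special values that matter here are vanishingly rare among them: ±∞, for example, is exactly two bit patterns out of 2^64. The default mutations might never find the bug. With a custom mutator, libFuzzer finds it almost instantly.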
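To get a feel for how rarely raw 64-bit patterns land on special values, the following sketch classifies pseudo-random bit patterns with std::fpclassify. The names from_bits, Counts, and classify_random_patterns are hypothetical and exist only for this illustration; it assumes double is a 64-bit IEEE 754 type.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Reinterpret a 64-bit pattern as a double without aliasing problems.
double from_bits(std::uint64_t bits) {
    double value;
    static_assert(sizeof value == sizeof bits, "expects a 64-bit double");
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

struct Counts {
    std::size_t nan = 0, inf = 0, zero = 0, subnormal = 0, normal = 0;
};

// Classify `n` pseudo-random 64-bit patterns by floating-point category.
Counts classify_random_patterns(std::size_t n, std::uint64_t seed) {
    std::mt19937_64 gen(seed);
    Counts c;
    for (std::size_t i = 0; i < n; ++i) {
        switch (std::fpclassify(from_bits(gen()))) {
            case FP_NAN:       ++c.nan;       break;
            case FP_INFINITE:  ++c.inf;       break;
            case FP_ZERO:      ++c.zero;      break;
            case FP_SUBNORMAL: ++c.subnormal; break;
            default:           ++c.normal;    break;
        }
    }
    return c;
}
```

NaN and subnormal patterns each need one specific exponent value out of 2048, so they turn up occasionally. Infinity and zero additionally need an all-zero mantissa, so a uniformly random pattern essentially never produces them.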


Custom Mutator

The mutator produces arrays of double values drawn from the interesting corners of IEEE 754:

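#include <algorithm>  // std::shuffle, std::min
#include <limits>     // std::numeric_limits
#include <random>     // std::minstd_rand and the distributions

extern "C" size_t LLVMFuzzerCustomMutator(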
    uint8_t* Data, size_t Size, size_t MaxSize, unsigned int Seed)
{
    auto* begin = reinterpret_cast<double*>(Data);
    auto* end   = begin + Size / sizeof(double);
    std::minstd_rand gen(Seed);

    auto interesting_double = [&]() -> double {
        switch (std::uniform_int_distribution<>(0, 10)(gen)) {
            case 0:  return std::numeric_limits<double>::quiet_NaN();
            case 1:  return std::numeric_limits<double>::min();       // smallest normal
            case 2:  return std::numeric_limits<double>::max();
            case 3:  return -std::numeric_limits<double>::min();
            case 4:  return -std::numeric_limits<double>::max();
            case 5:  return std::numeric_limits<double>::epsilon();
            case 6:  return -std::numeric_limits<double>::epsilon();
            case 7:  return std::numeric_limits<double>::infinity();
            case 8:  return -std::numeric_limits<double>::infinity();
            case 9:  return 0.0;
            default: return std::uniform_real_distribution<>(-1.0, 1.0)(gen);
        }
    };

    switch (std::uniform_int_distribution<>(0, 3)(gen)) {
        case 0: // mutate an existing element
            if (begin != end) {
                auto idx = std::uniform_int_distribution<>(0, (int)(end - begin) - 1)(gen);
                begin[idx] = interesting_double();
            }
            break;
        case 1: // append an element
            if (Size + sizeof(double) <= MaxSize) {
                *end++ = interesting_double();
            }
            break;
        case 2: // remove the last element
            if (begin != end) --end;
            break;
        case 3: // shuffle
            std::shuffle(begin, end, gen);
            break;
    }

    return (end - begin) * sizeof(double);
}
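The introduction lists -0.0 and subnormals among the values that matter, but the table in the mutator above does not include them. One possible extension is sketched here; special_doubles and interesting_double are hypothetical free functions for illustration, and the choice of 14 entries is an assumption, not the article's original table.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical extension of the article's value table: adds -0.0 and the
// smallest positive and negative subnormals.
inline const std::array<double, 14>& special_doubles() {
    using L = std::numeric_limits<double>;
    static const std::array<double, 14> table = {
        L::quiet_NaN(),
        L::infinity(),   -L::infinity(),
        0.0,             -0.0,
        L::denorm_min(), -L::denorm_min(),  // smallest subnormals
        L::min(),        -L::min(),         // smallest normals
        L::max(),        -L::max(),
        L::epsilon(),    -L::epsilon(),
        1.0,
    };
    return table;
}

// Pick a table entry, or occasionally a uniform value in [-1, 1).
template <class Gen>
double interesting_double(Gen& gen) {
    const auto& table = special_doubles();
    // One extra index beyond the table selects the random fallback.
    std::uniform_int_distribution<std::size_t> pick(0, table.size());
    const std::size_t i = pick(gen);
    if (i < table.size()) return table[i];
    return std::uniform_real_distribution<double>(-1.0, 1.0)(gen);
}
```

Keeping the values in a table rather than a switch makes it easy to add domain-specific constants, such as thresholds that appear in the code under test.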

Custom Crossover

The crossover function combines two inputs by selecting each element from either parent with equal probability. The result is as long as the shorter parent, capped by MaxOutSize:

extern "C" size_t LLVMFuzzerCustomCrossOver(
    const uint8_t* Data1, size_t Size1,
    const uint8_t* Data2, size_t Size2,
    uint8_t* Out, size_t MaxOutSize, unsigned int Seed)
{
    std::minstd_rand gen(Seed);
    std::bernoulli_distribution coin(0.5);
    size_t n = std::min({Size1, Size2, MaxOutSize}) / sizeof(double);

    for (size_t i = 0; i < n; ++i) {
        reinterpret_cast<double*>(Out)[i] = coin(gen)
            ? reinterpret_cast<const double*>(Data1)[i]
            : reinterpret_cast<const double*>(Data2)[i];
    }
    return n * sizeof(double);
}
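Because the crossover above stops at the shorter parent, the extra elements of the longer parent are dropped. A sketch of a variant that keeps that tail is shown below; it works on decoded values rather than raw bytes, and crossover_keep_tail is a hypothetical name, not a libFuzzer hook.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical variant: mix the shared prefix element by element, then copy
// the remaining elements of the longer parent, up to max_len values in total.
std::vector<double> crossover_keep_tail(const std::vector<double>& a,
                                        const std::vector<double>& b,
                                        std::size_t max_len, unsigned seed) {
    std::minstd_rand gen(seed);
    std::bernoulli_distribution coin(0.5);
    const std::vector<double>& longer = a.size() >= b.size() ? a : b;
    const std::size_t shared = std::min(a.size(), b.size());
    const std::size_t n = std::min(longer.size(), max_len);

    std::vector<double> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (i < shared) {
            out.push_back(coin(gen) ? a[i] : b[i]);  // either parent
        } else {
            out.push_back(longer[i]);                // tail of the longer one
        }
    }
    return out;
}
```

Whether keeping the tail helps depends on the target: it preserves longer inputs in the corpus, at the cost of less mixing near the end of the array.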

Compile and Run

clang++ -g -fsanitize=fuzzer fpfuzzing.cpp -o fpfuzzing
./fpfuzzing

The fuzzer finds a crashing input almost immediately.


What It Found

The crashing input is {+∞, 0.0, -∞}. The function adds +∞ and 0.0, then computes +∞ + (-∞). IEEE 754 defines ∞ - ∞ = NaN, so the function does return NaN: the filter only guards against NaN inputs, not NaN results produced by combining infinities. NaN elements between the two infinities, as in {+∞, NaN, NaN, -∞}, would be skipped and lead to the same result.

SUMMARY: libFuzzer: deadly signal
Base64: AAAAAAAA8H8AAAAAAAAAAAAAAAAAAPD/

Decoded as little-endian doubles: [+∞, 0.0, -∞].
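The failure is easy to reproduce outside the fuzzer. The sketch below reruns the article's sum on a few hand-written inputs; sum_of is a hypothetical convenience wrapper added for illustration. Any input whose non-NaN elements include +∞ followed later by -∞ produces NaN, whether or not NaN or finite elements sit in between.

```cpp
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

// Same accumulation as in the article: NaN inputs are skipped.
double sum(const double* begin, const double* end) {
    return std::accumulate(begin, end, 0.0, [](double a, double b) {
        return std::isnan(b) ? a : a + b;
    });
}

// Hypothetical wrapper so the inputs can be written as braced lists.
double sum_of(const std::vector<double>& values) {
    return sum(values.data(), values.data() + values.size());
}

// True if the property "sum never returns NaN" fails for this input.
bool violates_property(const std::vector<double>& values) {
    return std::isnan(sum_of(values));
}
```

For example, violates_property({inf, 0.0, -inf}) and violates_property({inf, nan, nan, -inf}) both hold, with inf and nan taken from std::numeric_limits<double>, while sum_of({1.0, nan, 2.0}) is 3.0. Once the accumulator itself is NaN it stays NaN, because the filter only inspects the incoming element.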


Why This Matters

This is property-based testing with a coverage-guided engine. The property being tested is “this function never produces NaN.” The fuzzer falsifies it in milliseconds by generating inputs that a human reviewer would likely miss.

The same technique extends to more complex domains: symmetric matrices, linear systems, functions with specific numeric constraints. Wherever the input has structure that random bytes will not naturally satisfy, a custom mutator pays for itself immediately.
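As an illustration of the symmetric-matrix case, a mutator can change an entry together with its mirror entry so that every generated input still satisfies the constraint. This is a minimal sketch under assumed conventions (row-major storage in a std::vector<double>); mutate_symmetric and is_symmetric are hypothetical helpers.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: overwrite one entry of a row-major n x n matrix and
// its mirror entry, so a symmetric matrix stays symmetric.
template <class Gen>
void mutate_symmetric(std::vector<double>& m, std::size_t n, double value,
                      Gen& gen) {
    if (n == 0 || m.size() != n * n) return;  // not square: leave untouched
    std::uniform_int_distribution<std::size_t> index(0, n - 1);
    const std::size_t i = index(gen);
    const std::size_t j = index(gen);
    m[i * n + j] = value;
    m[j * n + i] = value;
}

// Check symmetry with ==. NaN entries compare unequal to themselves, so a
// matrix containing NaN is reported as not symmetric by this check.
bool is_symmetric(const std::vector<double>& m, std::size_t n) {
    if (m.size() != n * n) return false;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            if (!(m[i * n + j] == m[j * n + i])) return false;
    return true;
}
```

Inside LLVMFuzzerCustomMutator, the value argument could come from the same table of interesting doubles used for the array case.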