C++ Async Callbacks: Lambda Capture and the Destruction Order Fiasco
A common crash pattern in async C++: an object registers a callback with a long-lived component, the object gets destroyed, and the callback fires later — accessing memory that no longer exists. This is called the Destruction Order Fiasco, and it’s subtle enough to slip through code review.
This post breaks down the problem, shows a real example from the Ray distributed computing framework, and explains why capturing shared_ptr by value in a lambda is the right fix.
The Problem
Consider this scenario:
- CoreWorker registers a callback with GcsClient (which lives longer)
- The callback closes over this, a raw pointer to CoreWorker
- CoreWorker is destroyed; its members are freed
- GcsClient fires the callback
- The callback dereferences a dangling pointer → crash
This is a use-after-free, and in production it often manifests as an intermittent, hard-to-reproduce segfault.
The Dangerous Pattern: Capturing [this]
class CoreWorker {
 public:
  CoreWorker(std::shared_ptr<GcsClient> gcs) : gcs_client_(gcs) {
    // Danger: lambda captures raw `this`
    gcs_client_->Subscribe([this](const NodeID& node_id) {
      ref_counter_->ResetObjectsOnRemovedNode(node_id);
    });
  }
  ~CoreWorker() { /* ref_counter_ and friends are destroyed here */ }
 private:
  std::shared_ptr<GcsClient> gcs_client_;
  std::shared_ptr<ReferenceCounter> ref_counter_;
};
[this] captures a raw pointer (8 bytes on a 64-bit platform). It participates in no reference counting, so nothing stops CoreWorker from being destroyed while GcsClient still holds the callback.
Timeline of the crash:
1. CoreWorker constructed → callback registered, closure holds raw `this`
2. CoreWorker destroyed → ref_counter_ freed
3. GcsClient fires event → callback invoked
4. this->ref_counter_->... → use-after-free → segfault
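Steps 1 and 2 of this timeline can be observed without triggering the undefined behavior of steps 3 and 4. The sketch below is a minimal illustration, not Ray code: `EventBus`, `Worker`, `Observation` and `ObserveDanglingRegistration` are hypothetical stand-ins for a long-lived component and a short-lived subscriber. It only shows that the registration outlives the object; it deliberately never fires the callback.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a long-lived component such as GcsClient.
class EventBus {
 public:
  void Subscribe(std::function<void(int)> cb) {
    callbacks_.push_back(std::move(cb));
  }
  std::size_t size() const { return callbacks_.size(); }

 private:
  std::vector<std::function<void(int)>> callbacks_;
};

// Hypothetical stand-in for CoreWorker. `alive` lets the caller observe
// when the destructor has run.
class Worker {
 public:
  Worker(EventBus& bus, bool& alive) : alive_(alive) {
    alive_ = true;
    // Raw `this`: the bus now stores a pointer it does not own.
    bus.Subscribe([this](int node_id) { last_node_ = node_id; });
  }
  ~Worker() { alive_ = false; }

 private:
  bool& alive_;
  int last_node_ = -1;
};

struct Observation {
  bool worker_alive;
  std::size_t callbacks_still_registered;
};

Observation ObserveDanglingRegistration() {
  EventBus bus;
  bool alive = false;
  {
    Worker worker(bus, alive);
    assert(alive);  // step 1: constructed, callback registered
  }                 // step 2: worker destroyed here
  // The bus still holds the closure. Firing it now would dereference a
  // dangling `this`, so this sketch stops before step 3.
  return {alive, bus.size()};
}
```

Calling `ObserveDanglingRegistration()` reports that the worker is gone while one callback is still registered, which is exactly the state in which the next event would crash.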
The Fix: Capture shared_ptr by Value
From core_worker.cc in the Ray source:
void CoreWorker::SubscribeToNodeChanges() {
  std::call_once(subscribe_to_node_changes_flag_, [this]() {
    // Capture shared ownership to avoid destruction order fiasco between
    // gcs_client, reference_counter_, raylet_client_pool_, and
    // core_worker_client_pool_.
    auto on_node_change = [reference_counter = reference_counter_,
                           rate_limiter = lease_request_rate_limiter_,
                           raylet_client_pool = raylet_client_pool_,
                           core_worker_client_pool = core_worker_client_pool_](
                              const NodeID &node_id,
                              const rpc::GcsNodeAddressAndLiveness &data) {
      if (data.state() == rpc::GcsNodeInfo::DEAD) {
        reference_counter->ResetObjectsOnRemovedNode(node_id);
        raylet_client_pool->Disconnect(node_id);
        core_worker_client_pool->Disconnect(node_id);
      }
    };
    gcs_client_->Nodes().AsyncSubscribeToNodeAddressAndLivenessChange(
        std::move(on_node_change), /*callback*/...);
  });
}
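The comment is the giveaway: “capture shared ownership to avoid destruction order fiasco”. Each member (reference_counter_, raylet_client_pool_, etc.) is a shared_ptr. Capturing them by value copies the smart pointer, not the underlying object, and increments the reference count. The outer [this] passed to std::call_once is not a hazard, because call_once invokes that lambda synchronously, before SubscribeToNodeChanges returns; only the inner lambda is stored.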
Safe timeline:
1. CoreWorker constructed → lambda captures shared_ptrs (ref count +1 each)
2. CoreWorker destroyed → members' ref counts drop by 1
→ underlying objects NOT freed (lambda still holds them)
3. GcsClient fires event → callback runs safely, objects still alive
4. GcsClient destroyed → lambda destroyed, last references released → objects freed
The lambda becomes self-contained: it owns everything it needs to execute safely, independent of CoreWorker’s lifetime.
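The safe timeline can be reproduced in a few lines. This is a sketch under simplifying assumptions, not the Ray implementation: `Counter`, `EventBus`, `Worker`, `Result` and `RunSafeTimeline` are hypothetical names, and node IDs are plain `int`s. A `bool` flag records when the shared object's destructor actually runs.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for ReferenceCounter. It reports its own destruction.
struct Counter {
  explicit Counter(bool& destroyed) : destroyed_(destroyed) {}
  ~Counter() { destroyed_ = true; }
  void ResetObjectsOnRemovedNode(int /*node_id*/) { ++resets; }
  int resets = 0;
  bool& destroyed_;
};

// Hypothetical long-lived component that stores and later fires callbacks.
class EventBus {
 public:
  void Subscribe(std::function<void(int)> cb) {
    callbacks_.push_back(std::move(cb));
  }
  void Fire(int node_id) {
    for (auto& cb : callbacks_) cb(node_id);
  }

 private:
  std::vector<std::function<void(int)>> callbacks_;
};

// Hypothetical short-lived subscriber.
class Worker {
 public:
  Worker(EventBus& bus, std::shared_ptr<Counter> counter)
      : counter_(std::move(counter)) {
    // Capture the shared_ptr by value: the closure co-owns the Counter.
    bus.Subscribe([counter = counter_](int node_id) {
      counter->ResetObjectsOnRemovedNode(node_id);
    });
  }

 private:
  std::shared_ptr<Counter> counter_;
};

struct Result {
  bool alive_after_worker_gone;
  int resets_seen;
  bool destroyed_after_bus_gone;
};

Result RunSafeTimeline() {
  bool destroyed = false;
  Result r{};
  {
    EventBus bus;
    std::weak_ptr<Counter> observer;  // observes without owning
    {
      auto counter = std::make_shared<Counter>(destroyed);
      observer = counter;
      Worker worker(bus, std::move(counter));
    }  // step 2: worker destroyed, its shared_ptr member released
    assert(observer.use_count() == 1);  // only the stored closure owns it now
    r.alive_after_worker_gone = !destroyed;
    bus.Fire(7);  // step 3: callback runs against a live object
    r.resets_seen = observer.lock()->resets;
  }  // step 4: bus destroyed, closure destroyed, Counter freed
  r.destroyed_after_bus_gone = destroyed;
  return r;
}
```

The object outlives the worker because the closure holds a reference, and it is freed only when the component that stores the closure goes away.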
Performance Cost
Capturing a shared_ptr by value is a shallow copy: on mainstream 64-bit implementations it copies two pointers (the object pointer and the control-block pointer, 16 bytes in total) and performs one atomic increment on the reference count. The managed object itself is not copied.
| Capture | Cost |
|---|---|
| [this] | 8-byte pointer copy, no atomic op |
| [shared_ptr] | 16-byte pointer copy + one atomic increment |
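In practice this is negligible. The larger cost of a stored callback usually comes from type erasure in std::function: a closure that does not fit the implementation's small-object buffer is heap-allocated once, at registration. A closure holding several shared_ptrs is more likely to exceed that buffer than a bare [this], but the allocation happens when the callback is registered, not each time it fires.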
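The size and reference-count effects can be checked directly. The sketch below uses hypothetical names (`Widget`, `CaptureCost`, `MeasureCaptureCost`). Closure sizes are implementation details rather than guarantees of the standard, so the byte counts it returns should be read as typical values for a given compiler, not fixed numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical type used only to have something to capture.
struct Widget {
  int value = 0;
  // Returns a closure that captures raw `this`.
  auto CaptureThis() {
    return [this] { return value; };
  }
};

struct CaptureCost {
  std::size_t this_closure_bytes;    // size of the [this] closure
  std::size_t shared_closure_bytes;  // size of the [shared_ptr] closure
  long use_count_before;             // owners before the capture
  long use_count_after;              // owners after the capture
};

CaptureCost MeasureCaptureCost() {
  Widget w;
  auto by_this = w.CaptureThis();

  auto ptr = std::make_shared<Widget>();
  long before = ptr.use_count();
  // Copying the shared_ptr into the closure adds one owner.
  auto by_shared = [ptr] { return ptr->value; };
  long after = ptr.use_count();
  assert(after == before + 1);

  return {sizeof(by_this), sizeof(by_shared), before, after};
}
```

On a typical 64-bit toolchain the two sizes come out as one pointer and two pointers respectively, and the use count goes from 1 to 2.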
Choosing the Right Capture
| Scenario | Capture | Reason |
|---|---|---|
| Synchronous, local scope (e.g. std::sort) | [&] or [this] | Lifetime is obvious, zero overhead |
| Callback runs immediately, caller waits | [&] or [this] | Call stack keeps object alive |
| Callback stored in long-lived component | [shared_ptr] | Extend lifetime, prevent dangling |
| Callback is best-effort (e.g. UI refresh) | [weak_ptr] | Don’t force the object to stay alive |
The shared_ptr vs weak_ptr choice comes down to whether the callback must run or may be discarded:
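// weak_ptr: callback is optional, so skip it if the object is gone.
// weak_from_this() requires C++17 and a class derived from
// std::enable_shared_from_this that is owned by a shared_ptr.
auto callback = [weak = weak_from_this()] {
  if (auto self = weak.lock()) {
    self->UpdateUI();
  }
};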
Ray uses shared_ptr because the cleanup logic (ResetObjectsOnRemovedNode, Disconnect) is not optional — it must run even if CoreWorker is gone.
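For contrast, here is a self-contained sketch of the best-effort case. `Panel`, `MakeRefreshCallback`, `WeakResult` and `RunBestEffortCallback` are hypothetical names, and the code assumes C++17 for `weak_from_this()`. The callback returns whether it actually ran, so the skip is observable.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical UI object. It must be owned by a shared_ptr for
// weak_from_this() to return a usable weak_ptr.
class Panel : public std::enable_shared_from_this<Panel> {
 public:
  explicit Panel(int& refreshes) : refreshes_(refreshes) {}
  void UpdateUI() { ++refreshes_; }

  // Returns a callback that reports whether it ran.
  std::function<bool()> MakeRefreshCallback() {
    return [weak = weak_from_this()]() -> bool {
      if (auto self = weak.lock()) {  // object still alive: pin it and run
        self->UpdateUI();
        return true;
      }
      return false;  // object already gone: skip silently
    };
  }

 private:
  int& refreshes_;
};

struct WeakResult {
  bool ran_while_alive;
  bool ran_after_destroy;
  int refreshes;
};

WeakResult RunBestEffortCallback() {
  int refreshes = 0;
  auto panel = std::make_shared<Panel>(refreshes);
  auto cb = panel->MakeRefreshCallback();
  assert(panel.use_count() == 1);  // the weak capture adds no owner

  WeakResult r{};
  r.ran_while_alive = cb();
  panel.reset();  // last owner gone, Panel destroyed
  r.ran_after_destroy = cb();
  r.refreshes = refreshes;
  return r;
}
```

Unlike the shared_ptr capture, the stored callback here does not delay destruction of the object; it only checks whether the object is still there at the moment it fires.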
Summary
- [this] capture is a raw pointer: it carries no ownership, no safety guarantee
- Capturing shared_ptr members by value gives the lambda partial ownership of the resources it needs
- The lambda becomes self-contained and safe to invoke regardless of the registering object’s lifetime
- Cost is minimal: one atomic op per captured pointer
- Use weak_ptr when the callback is optional; shared_ptr when it must execute
The next time you write a callback that gets stored somewhere, ask: could the object I’m closing over be destroyed before this fires? If yes, capture shared_ptr — not this.
This is Part 1 of a series. Part 2 covers the weak_from_this pattern for callbacks that should silently abort when the object is gone.
References
- Ray source: src/ray/core_worker/core_worker.cc
- Scott Meyers, Effective Modern C++, Item 31: Avoid default capture modes
- C++ Core Guidelines F.53: Avoid capturing by reference in lambdas used non-locally