Real-time audio processing on mobile demands sub-5ms latency and zero glitches. Yet most DSP frameworks treat audio graphs as stateless function chains—rebuild the graph, lose all filter coefficients, delay line history, and envelope state. The result: audible clicks, phase discontinuities, and broken user experiences when users adjust parameters or switch presets mid-playback.
Building HearingAid Pro—a clinical-grade hearing aid using AirPods Pro as DSP hardware—required solving this: how do you preserve stateful DSP nodes (IIR filters with 200+ samples of history, AGC with attack/release envelopes, adaptive noise gates) across graph topology changes, all while maintaining hard real-time guarantees?
The Naive Approach: Rebuild Everything
Most audio frameworks follow a simple model: user changes a parameter, tear down the old graph, allocate new nodes, wire them up. For stateless operations (gain, mix, pan) this works. For stateful processors it fails catastrophically.
Consider a 10-band parametric EQ, each band a biquad IIR filter storing two samples of input history and two of output history. When the user drags the 5kHz band's gain slider, a naive rebuild:
- Deallocates all ten biquad nodes
- Allocates ten fresh nodes with zero history
- Zeroes the four history samples in each of the nine bands that didn't change, creating a discontinuity in their output
- Produces an audible click at 48kHz sample rate
Worse: if this happens during the audio callback—which runs every 5.3ms at 256-sample buffer size—you've just blown your real-time budget with malloc/free in the hot path.
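The click from zeroed history can be reproduced in isolation. The following is a minimal sketch, not HearingAid Pro code: a single Direct Form I biquad (a 1 kHz low-pass standing in for an EQ band) processes a 100 Hz sine, and its history is optionally wiped once mid-stream. The names Biquad, make_lowpass and max_output_step are hypothetical, and the coefficient formulas are the standard Audio EQ Cookbook low-pass, chosen here only for illustration.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Direct Form I biquad with two input and two output history samples.
struct Biquad {
    float b0, b1, b2, a1, a2;               // a0 is normalized to 1
    float x1 = 0, x2 = 0, y1 = 0, y2 = 0;   // history

    float process(float x) {
        float y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
        x2 = x1; x1 = x;
        y2 = y1; y1 = y;
        return y;
    }
    void reset() { x1 = x2 = y1 = y2 = 0.0f; }  // what a naive rebuild does
};

// Low-pass coefficients from the Audio EQ Cookbook formulas (illustrative choice).
Biquad make_lowpass(double cutoff_hz, double q, double sample_rate) {
    assert(cutoff_hz > 0.0 && cutoff_hz < sample_rate / 2.0);
    double w0 = 2.0 * kPi * cutoff_hz / sample_rate;
    double alpha = std::sin(w0) / (2.0 * q);
    double a0 = 1.0 + alpha;
    Biquad f;
    f.b0 = static_cast<float>((1.0 - std::cos(w0)) / 2.0 / a0);
    f.b1 = static_cast<float>((1.0 - std::cos(w0)) / a0);
    f.b2 = f.b0;
    f.a1 = static_cast<float>(-2.0 * std::cos(w0) / a0);
    f.a2 = static_cast<float>((1.0 - alpha) / a0);
    return f;
}

// Largest sample-to-sample jump in the filtered output of a 100 Hz sine at 48 kHz.
// If reset_midway is true, the history is zeroed once, at an input peak.
float max_output_step(bool reset_midway) {
    Biquad f = make_lowpass(1000.0, 0.7071, 48000.0);
    float prev = 0.0f, worst = 0.0f;
    for (int n = 0; n < 9600; ++n) {
        if (reset_midway && n == 4920) f.reset();
        float x = static_cast<float>(std::sin(2.0 * kPi * 100.0 * n / 48000.0));
        float y = f.process(x);
        if (n > 2400) worst = std::fmax(worst, std::fabs(y - prev));  // skip start-up
        prev = y;
    }
    return worst;
}
```

Comparing max_output_step(true) with max_output_step(false) shows the reset run containing a jump far larger than anything in the smooth sine, which is the click a listener hears.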
Node Identity and Stable Handles
The solution borrows from React's reconciliation: assign each DSP node a stable identity independent of its position in the graph. When rebuilding, match old nodes to new topology by ID, preserve their internal state, and only instantiate genuinely new nodes.
Implementation sketch in C++:
#include <cstdint>
#include <unordered_map>
#include <vector>
struct AudioNodeVtable;  // per-type process/reset functions, defined elsewhere
struct NodeHandle {
    uint64_t id;              // stable across rebuilds
    void* state;              // opaque processor state
    AudioNodeVtable* vtable;
};
struct GraphRebuildContext {
    std::unordered_map<uint64_t, NodeHandle> old_nodes;  // keyed by NodeHandle::id
    std::vector<NodeHandle> new_graph;                   // nodes in the new topology
};
During rebuild, the engine walks the new topology. For each node descriptor, it checks whether old_nodes contains that ID. If so, it reuses the existing NodeHandle and its state pointer; if not, it allocates a fresh node. Critically, all allocation happens outside the audio thread: the rebuild produces a new graph pointer that is atomically swapped in.
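The matching step can be sketched as follows. This is an illustration under simplifying assumptions rather than the engine's actual code: state is held in a std::shared_ptr instead of an opaque pointer, the new topology is reduced to a list of wanted IDs, and NodeState, Node and reconcile are hypothetical names. The atomic swap of the finished graph is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical per-node state; a real node would hold filter history, envelopes, etc.
struct NodeState { float history[4] = {0, 0, 0, 0}; };

struct Node {
    uint64_t id;                       // stable across rebuilds
    std::shared_ptr<NodeState> state;  // shared so a rebuilt graph can reuse it
};

using Graph = std::vector<Node>;

// Builds a new graph from the wanted IDs, reusing state for IDs that already exist.
// Intended to run off the audio thread, so allocating here is acceptable.
Graph reconcile(const Graph& old_graph, const std::vector<uint64_t>& wanted_ids) {
    std::unordered_map<uint64_t, std::shared_ptr<NodeState>> old_nodes;
    for (const Node& n : old_graph) old_nodes[n.id] = n.state;
    assert(old_nodes.size() == old_graph.size());  // IDs must be unique

    Graph new_graph;
    new_graph.reserve(wanted_ids.size());
    for (uint64_t id : wanted_ids) {
        auto it = old_nodes.find(id);
        if (it != old_nodes.end()) {
            new_graph.push_back({id, it->second});                     // reuse state
        } else {
            new_graph.push_back({id, std::make_shared<NodeState>()});  // fresh node
        }
    }
    return new_graph;
}
```

Rebuilding {1, 2, 3} into {2, 4} keeps node 2's history untouched and gives node 4 zeroed state; the state of nodes 1 and 3 is released once the old graph is dropped.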
State Preservation Semantics
Not all state should persist. A compressor's input/output history must survive parameter changes, but if the user switches from compressor to limiter, that state is meaningless. The solution is a three-level identity hierarchy:
- Processor type ID: "biquad_filter", "agc_compressor". Changing this forces reallocation.
- Instance ID: "eq_band_5", "left_channel_gate". Changing this within the same type resets state.
- Parameter set: Frequency, Q, gain. Changing these preserves history but updates coefficients.
When HearingAid Pro switches from "Mild Loss" to "Moderate Loss" preset, the 10-band EQ nodes retain their IDs. The engine recalculates biquad coefficients for the new gains but keeps the 4-sample delay lines intact. Zero clicks, zero phase jumps.
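One way to express the hierarchy is a function that compares an old and a new descriptor and reports the cheapest safe action. The sketch below is an assumption about how such a check could look; NodeDescriptor, RebuildAction and classify are hypothetical names, and parameters are reduced to a flat list of floats.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct NodeDescriptor {
    std::string type_id;        // e.g. "biquad_filter"
    std::string instance_id;    // e.g. "eq_band_5"
    std::vector<float> params;  // e.g. frequency, Q, gain
};

enum class RebuildAction { Keep, UpdateCoefficients, ResetState, Reallocate };

// Checks the three identity levels from most to least disruptive.
RebuildAction classify(const NodeDescriptor& old_d, const NodeDescriptor& new_d) {
    assert(!old_d.type_id.empty() && !new_d.type_id.empty());  // every node names a type
    if (old_d.type_id != new_d.type_id) return RebuildAction::Reallocate;
    if (old_d.instance_id != new_d.instance_id) return RebuildAction::ResetState;
    if (old_d.params != new_d.params) return RebuildAction::UpdateCoefficients;
    return RebuildAction::Keep;
}
```

A preset switch that only changes band gains falls into the UpdateCoefficients case for every EQ band, which is why the history survives.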
Lock-Free State Updates
Audio threads cannot block. If the UI thread is recalculating a 12th-order Butterworth filter's coefficients, the audio callback must keep processing with the old coefficients. Once the new ones are ready, they must swap in atomically at the next buffer boundary, never partway through a buffer.
The pattern: triple-buffered coefficient sets.
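#include <atomic>
#include <cstdint>
struct BiquadCoeffs {
    float b0, b1, b2, a1, a2;
    std::atomic<uint32_t> version;  // bumped on every coefficient write
};
struct BiquadNode {
    BiquadCoeffs coeffs[3];         // triple buffer
    std::atomic<int> write_idx;     // slot most recently published by the UI thread
    std::atomic<int> read_idx;      // slot the audio thread is currently using
    float x1, x2, y1, y2;           // history, audio-thread only
};
The UI thread writes new coefficients into a free slot of coeffs (one the audio thread is not reading), increments version, then atomically updates write_idx to publish it. The audio thread checks whether read_idx != write_idx at buffer boundaries (never mid-buffer) and switches to the new slot if so. The middle buffer absorbs races. With 256-sample buffers at 48kHz, coefficient updates land within 5.3ms, imperceptible to users.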
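For comparison, here is a self-contained sketch of a single-writer, single-reader triple buffer. Note that it uses the common exchange-based variant (one shared "middle" index with a dirty bit) rather than the read_idx/write_idx layout above, because that variant is easy to show correct in a few lines. CoeffTripleBuffer and its members are hypothetical names, not HearingAid Pro code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct Coeffs { float b0, b1, b2, a1, a2; };

class CoeffTripleBuffer {
public:
    // UI thread only: fill the private back slot, then publish it.
    void write(const Coeffs& c) {
        assert(back_ < 3);
        slots_[back_] = c;
        // Swap the back slot with the shared middle slot and set the dirty bit.
        uint8_t prev = middle_.exchange(static_cast<uint8_t>(back_ | kDirty),
                                        std::memory_order_acq_rel);
        back_ = prev & kIndexMask;
    }

    // Audio thread only, called once at each buffer boundary. Never blocks.
    const Coeffs& read() {
        if (middle_.load(std::memory_order_relaxed) & kDirty) {
            // Take the freshly published slot and hand back the old front slot.
            uint8_t prev = middle_.exchange(front_, std::memory_order_acq_rel);
            front_ = prev & kIndexMask;
        }
        return slots_[front_];
    }

private:
    static constexpr uint8_t kDirty = 0x4;
    static constexpr uint8_t kIndexMask = 0x3;
    Coeffs slots_[3] = {};
    std::atomic<uint8_t> middle_{1};  // shared slot index plus dirty bit
    uint8_t back_ = 0;                // owned by the writer
    uint8_t front_ = 2;               // owned by the reader
};
```

If the UI thread writes twice between two audio callbacks, the reader simply picks up the latest set; the intermediate one is overwritten without either thread waiting.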
Bulk Updates and Dependency Ordering
Changing the global sample rate affects every node. The naive approach iterates the graph and updates each node's state in turn. The problem: if node B depends on node A's output and A updates first, one buffer is processed with mismatched sample rates, which manifests as a pitch glitch.
Solution: two-phase commit. Phase one: all nodes prepare new state in scratch space. Phase two: atomic swap of a single "active config" pointer. All nodes see the new sample rate simultaneously. In HearingAid Pro, switching from 48kHz to 44.1kHz (when the user switches AirPods) updates 47 DSP nodes in 0.3ms with zero artifacts.
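The two phases can be sketched with an immutable config snapshot behind one atomic pointer. This is a minimal illustration with assumed names (GraphConfig, ConfigStore, prepare, commit, acquire); the per-node "coefficient" is a placeholder value, and safely retiring the old snapshot once the audio thread has moved on is left to the caller.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical immutable snapshot: one sample rate plus per-node derived values.
struct GraphConfig {
    double sample_rate = 0.0;
    std::vector<float> node_coeffs;  // placeholder: one derived value per node
};

class ConfigStore {
public:
    ConfigStore(double sample_rate, std::size_t node_count)
        : owned_(prepare(sample_rate, node_count)), active_(owned_.get()) {}

    // Phase one (control thread): build the complete new state in scratch space.
    static std::unique_ptr<GraphConfig> prepare(double sample_rate, std::size_t node_count) {
        assert(sample_rate > 0.0);
        auto cfg = std::make_unique<GraphConfig>();
        cfg->sample_rate = sample_rate;
        cfg->node_coeffs.assign(node_count, static_cast<float>(1.0 / sample_rate));
        return cfg;
    }

    // Phase two (control thread): publish with a single atomic pointer store.
    // Returns the previous config; the caller must keep it alive until the
    // audio thread can no longer be reading it.
    std::unique_ptr<GraphConfig> commit(std::unique_ptr<GraphConfig> next) {
        assert(next != nullptr);
        active_.store(next.get(), std::memory_order_release);
        std::swap(owned_, next);
        return next;  // now holds the old config
    }

    // Audio thread: one acquire load per buffer; every node reads this snapshot.
    const GraphConfig* acquire() const { return active_.load(std::memory_order_acquire); }

private:
    std::unique_ptr<GraphConfig> owned_;       // keeps the active config alive
    std::atomic<const GraphConfig*> active_;   // what the audio thread sees
};
```

Because the audio callback loads the pointer once per buffer and passes that snapshot to every node, no buffer is ever processed with a mix of old and new sample rates.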
Memory Management Without Malloc
Allocating in the audio callback is forbidden. But nodes do need memory—a reverb needs a 2-second delay line, 96,000 samples at 48kHz. Pre-allocating worst-case for every possible node is wasteful (a typical mobile app might define 200+ node types).
The answer: arena allocators with epoch-based reclamation. At app launch, allocate a 16MB audio arena. When building a graph, bump-allocate node state from the arena. When tearing down, don't free—just mark the epoch. Every 5 seconds (off audio thread), scan for epochs with no live references, reset those arena chunks.
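#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>
struct AudioArena {
    uint8_t* base;                 // start of the preallocated block (assumed 16-byte aligned)
    size_t capacity;               // total bytes in the arena, e.g. 16MB
    std::atomic<size_t> offset;    // bump pointer
    std::vector<uint64_t> epochs;  // epoch markers for deferred reclamation
};
// Bump-allocate from the arena; returns nullptr when the arena is exhausted.
void* arena_alloc(AudioArena& arena, size_t bytes) {
    bytes = (bytes + 15) & ~size_t{15};  // round up so allocations stay aligned
    size_t old = arena.offset.fetch_add(bytes, std::memory_order_relaxed);
    if (old + bytes > arena.capacity) return nullptr;
    return arena.base + old;
}
In practice, a complex graph (20-band EQ, 4-band compressor, stereo reverb, limiter) consumes 400KB. The 16MB arena holds 40 such graphs. Graph rebuilds happen at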