Outlier Rejection in PPG: Median-of-Medians

The Outlier Problem in Mobile PPG

Photoplethysmography (PPG) captures cardiovascular signals via LED reflection through skin, but motion artifacts (finger shifts, ambient light flicker, pressure changes) inject outliers that corrupt peak detection, heart rate variability analysis, and downstream glucose estimation models. A single 300% amplitude spike can shift a rolling-mean estimate of the beat-to-beat interval by 40ms, invalidating interval measurements for minutes.

Standard approaches fail on mobile: Kalman filters require tuned process noise matrices that drift with user activity; exponential moving averages smear outliers across multiple samples; Savitzky-Golay smoothing introduces group delay. Clinical PPG systems use 12-bit ADCs at 500Hz with temperature-controlled optics; mobile cameras deliver 8-bit frames at 30fps with thermal throttling and auto-exposure hunting. The signal-to-noise budget is 18dB worse.

Why Median Filters Preserve Edges

Median filtering replaces each sample with the median of its local window. It is inherently robust to outliers: spikes cannot pull the median away from the clean values unless they occupy at least half of the window, so a window of 2k+1 samples tolerates up to k corrupted samples. Unlike linear filters, medians preserve sharp transitions: a step edge remains a step, not a ramp. For PPG, this means systolic peaks stay crisp while short motion artifacts vanish.
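The contrast with a linear filter can be sketched in a few lines. The example below is illustrative only: the helper names (`windowMedian`, `slidingWindow`) and the toy signal are assumptions made for this sketch, not part of any production pipeline. It runs a 5-sample median and a 5-sample moving average over a step that carries one large spike.

```swift
/// Median of a non-empty window (upper median when the count is even).
func windowMedian(_ values: ArraySlice<Float>) -> Float {
    let sorted = values.sorted()
    return sorted[sorted.count / 2]
}

/// Applies `summarize` to a centered window around every sample.
/// Windows are truncated at the signal boundaries.
func slidingWindow(_ input: [Float],
                   window: Int,
                   summarize: (ArraySlice<Float>) -> Float) -> [Float] {
    let half = window / 2
    return input.indices.map { i -> Float in
        let start = max(0, i - half)
        let end = min(input.count, i + half + 1)
        return summarize(input[start..<end])
    }
}

// Toy signal: a step from 0 to 1 at index 8, plus one spike of 10 at index 2.
let noisyStep: [Float] = [0, 0, 10, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

let byMedian = slidingWindow(noisyStep, window: 5, summarize: windowMedian)
let byMean = slidingWindow(noisyStep, window: 5) { window in
    window.reduce(0, +) / Float(window.count)
}

// byMedian is the clean step: the spike is gone and the edge is still at index 8.
// byMean spreads the spike over its neighbors and turns the edge into a ramp.
print(byMedian)
print(byMean)
```

The spike sits far enough from the step that no window contains both. When an outlier falls within half a window of an edge, the median can move the edge by a sample.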

The challenge: naive median filters attenuate genuine peaks, and repeated passes erode them further as the output converges toward a root signal (one that further median filtering leaves unchanged). A 5-sample median over a sine wave flattens crests by 8-12%. In GlucoScan AI, early median implementations reduced systolic peak amplitude by 15%, breaking the AC/DC ratio calculation that estimates blood glucose.

Median-of-Medians: Two-Stage Robustness

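The median-of-medians algorithm applies a coarse median filter, then a fine median over the result. Stage one uses a 9-sample window (300ms at 30fps) to eliminate gross outliers; stage two uses a 3-sample window to preserve edge sharpness. The combination achieves 98% outlier rejection with […]

    // The enclosing signature was lost in the source. The name
    // medianOfMedians and its parameter list are reconstructed from the body.
    func medianOfMedians(_ signal: [Float], coarse: Int, fine: Int) -> [Float] {
        let stage1 = medianFilter(signal, window: coarse)
        return medianFilter(stage1, window: fine)
    }

    func medianFilter(_ input: [Float], window: Int) -> [Float] {
        let half = window / 2
        return input.indices.map { i -> Float in
            // Clamp the window at the signal boundaries.
            let start = max(0, i - half)
            let end = min(input.count, i + half + 1)
            // The source breaks off after `input[start..`. Taking the middle
            // element of the sorted slice is the reconstruction assumed here.
            let sorted = input[start..<end].sorted()
            return sorted[sorted.count / 2]
        }
    }

[…] >3 standard deviations, flag the interval and interpolate linearly. This hybrid approach caught 97% of clustered artifacts in treadmill test data.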

Another edge case: DC drift from ambient light changes. Medians preserve slow trends, so a 10-second ramp in baseline doesn't trigger rejection. Pair median-of-medians with a high-pass Butterworth filter (0.5Hz cutoff) to detrend before peak detection. In KidzCare, this eliminated false breath-hold detections during outdoor therapy sessions.
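A minimal sketch of the detrending stage, assuming a single second-order Butterworth high-pass section designed with the bilinear transform (Q = 1/√2). The type name `HighPassBiquad`, the use of `Double`, and the 30fps sample rate in the usage lines are assumptions for illustration; the filter order actually used in production is not stated here.

```swift
import Foundation

/// Second-order Butterworth high-pass section (bilinear transform, Q = 1/sqrt(2)).
struct HighPassBiquad {
    private let b0, b1, b2, a1, a2: Double
    // Previous two inputs and outputs (direct form I state).
    private var x1 = 0.0, x2 = 0.0, y1 = 0.0, y2 = 0.0

    init(cutoffHz: Double, sampleRateHz: Double) {
        let w0 = 2.0 * Double.pi * cutoffHz / sampleRateHz
        let cosW0 = cos(w0)
        // alpha = sin(w0) / (2Q) with Q = 1/sqrt(2)
        let alpha = sin(w0) / (2.0).squareRoot()
        let a0 = 1.0 + alpha
        // Coefficients normalized by a0.
        b0 = (1.0 + cosW0) / 2.0 / a0
        b1 = -(1.0 + cosW0) / a0
        b2 = (1.0 + cosW0) / 2.0 / a0
        a1 = -2.0 * cosW0 / a0
        a2 = (1.0 - alpha) / a0
    }

    /// Filters one sample and updates the internal state.
    mutating func process(_ x: Double) -> Double {
        let y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2 = x1; x1 = x
        y2 = y1; y1 = y
        return y
    }
}

// Usage: a constant baseline offset decays toward zero after the filter settles.
var detrender = HighPassBiquad(cutoffHz: 0.5, sampleRateHz: 30)
let baseline = [Double](repeating: 1.0, count: 600)
let detrended = baseline.map { detrender.process($0) }
```

A causal IIR section like this one adds phase shift near the cutoff. For offline data, running it forward and then backward over the signal is a common way to cancel that shift.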

Memory and Power Budget

Sorting dominates the cost: O(n log n) per window. For the 9-sample coarse stage, that is roughly 9·log₂(9) ≈ 29 comparisons per output sample. On ARM Cortex-A, NEON SIMD sorting cuts this to 11 cycles per comparison. Total power draw: 18mW for continuous 30fps processing on iPhone 12 mini, versus 45mW for a Kalman filter with matrix inversion.

Memory footprint: two ring buffers (coarse + fine windows) plus sorted scratch space. For 1024-sample frames, 12KB total—cacheable in L1. Flutter bindings via FFI add 200μs overhead; direct Swift/Kotlin implementations stay under 50μs.
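One way to arrange a ring buffer with sorted scratch space is sketched below. This is an assumption about the layout, not the shipped implementation: `RunningMedian` is a hypothetical type, it uses a trailing (causal) window rather than a centered one, and it does not handle NaN samples. Keeping the scratch array sorted between samples replaces the full sort with one removal and one insertion per sample.

```swift
/// Streaming median over the most recent `capacity` samples.
/// `ring` holds samples in arrival order; `sorted` holds the same samples in order.
struct RunningMedian {
    private var ring: [Float] = []
    private var sorted: [Float] = []
    private var oldestIndex = 0
    private let capacity: Int

    init(capacity: Int) {
        precondition(capacity > 0, "window must hold at least one sample")
        self.capacity = capacity
        ring.reserveCapacity(capacity)
        sorted.reserveCapacity(capacity)
    }

    /// Index of the first element in `sorted` that is >= value (binary search).
    private func lowerBound(_ value: Float) -> Int {
        var lo = 0
        var hi = sorted.count
        while lo < hi {
            let mid = (lo + hi) / 2
            if sorted[mid] < value { lo = mid + 1 } else { hi = mid }
        }
        return lo
    }

    /// Adds a sample and returns the median of the current window.
    mutating func push(_ sample: Float) -> Float {
        if ring.count < capacity {
            ring.append(sample)
        } else {
            // Window is full: evict the oldest sample from both buffers.
            sorted.remove(at: lowerBound(ring[oldestIndex]))
            ring[oldestIndex] = sample
            oldestIndex = (oldestIndex + 1) % capacity
        }
        sorted.insert(sample, at: lowerBound(sample))
        return sorted[sorted.count / 2]
    }
}

// Usage: a 3-sample window ignores the isolated spike of 100.
var runningMedian = RunningMedian(capacity: 3)
let incoming: [Float] = [1, 2, 100, 3, 4]
let medians = incoming.map { runningMedian.push($0) }
print(medians)
```

Array insertion and removal still shift elements, so the per-sample cost is linear in the window size. For 9-sample and 3-sample windows that is a handful of moves.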

Production Integration Patterns

In GlucoScan AI, median-of-medians runs in a dedicated DSP thread at elevated priority (QoS userInteractive on iOS, THREAD_PRIORITY_URGENT_AUDIO on Android). The pipeline: camera frame → grayscale conversion → ROI extraction → median-of-medians → peak detection → glucose model inference. End-to-end latency: 210ms, bounded by camera exposure time.
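The iOS side of this arrangement might look like the sketch below. Everything named here is hypothetical: the class `PPGPipeline`, the queue label `ppg.dsp`, and the choice to reduce each grayscale frame to the mean brightness of the ROI are assumptions for illustration. Only the `.userInteractive` QoS class comes from the description above. Filtering, peak detection, and inference would follow on the same queue.

```swift
import Dispatch

/// Hypothetical wrapper that serializes per-frame work on a high-priority queue.
/// Marked @unchecked Sendable because all mutable state is confined to `queue`.
final class PPGPipeline: @unchecked Sendable {
    private let queue = DispatchQueue(label: "ppg.dsp", qos: .userInteractive)
    private var samples: [Float] = []

    /// Reduces one grayscale frame (row-major bytes) to the mean brightness
    /// inside the ROI and appends it to the PPG signal.
    func submit(frame: [UInt8], width: Int, roi: (x: Int, y: Int, w: Int, h: Int)) {
        queue.async {
            var sum = 0
            for row in roi.y..<(roi.y + roi.h) {
                for col in roi.x..<(roi.x + roi.w) {
                    sum += Int(frame[row * width + col])
                }
            }
            self.samples.append(Float(sum) / Float(roi.w * roi.h))
        }
    }

    /// Waits for queued frames to finish, then returns the signal so far.
    func snapshot() -> [Float] {
        return queue.sync { self.samples }
    }
}

// Usage: a uniform 4x4 frame with a 2x2 ROI yields one sample equal to the pixel value.
let pipeline = PPGPipeline()
let flatFrame = [UInt8](repeating: 100, count: 16)
pipeline.submit(frame: flatFrame, width: 4, roi: (x: 1, y: 1, w: 2, h: 2))
print(pipeline.snapshot())
```

A serial queue keeps frames in arrival order, which matters because the median stages depend on sample order.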

For offline analysis (e.g., batch processing 24-hour Holter recordings), swap to a three-stage cascaded median: 15-5-3 windows. This achieves 99.1% outlier rejection with 8% amplitude attenuation—acceptable for research datasets where latency doesn't matter.
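The cascade generalizes to any list of window sizes. The sketch below assumes each stage is a centered median with windows truncated at the signal edges; the function names `centeredMedian` and `cascadedMedian` are hypothetical and chosen for this example.

```swift
/// Centered sliding median. Windows are truncated at the signal boundaries.
func centeredMedian(_ input: [Float], window: Int) -> [Float] {
    let half = window / 2
    return input.indices.map { i -> Float in
        let start = max(0, i - half)
        let end = min(input.count, i + half + 1)
        let sorted = input[start..<end].sorted()
        return sorted[sorted.count / 2]
    }
}

/// Runs the stages in order, feeding each stage's output into the next.
func cascadedMedian(_ signal: [Float], windows: [Int]) -> [Float] {
    return windows.reduce(signal) { partial, window in
        centeredMedian(partial, window: window)
    }
}

// Usage: a flat baseline with a 3-sample artifact burst at indices 20...22.
var recording = [Float](repeating: 1.0, count: 60)
for i in 20...22 { recording[i] = 50.0 }

let offlineCleaned = cascadedMedian(recording, windows: [15, 5, 3])
let realtimeCleaned = cascadedMedian(recording, windows: [9, 3])
print(offlineCleaned == realtimeCleaned)
```

Both configurations remove this burst because it fills less than half of the first-stage window. The wider 15-sample stage tolerates bursts of up to 7 samples, against 4 for the 9-sample stage.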

When Not to Use Median-of-Medians

If your PPG source has