A Goertzel filter computes the energy at one specific frequency using a two-sample IIR recurrence — essentially a single DFT bin without the full FFT. It runs once per target frequency per frame, making it far cheaper than an FFT when you only care about a handful of frequencies.
Each 512-sample frame is first multiplied by a Hann (Hanning) window, which tapers the frame edges to zero. This reduces spectral leakage: without the window, energy from one frequency bleeds into adjacent bins and produces false readings.
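As a rough illustration of the windowing step, the sketch below builds a symmetric Hann window and applies it to one frame. The helper names (hann_window, apply_window) are hypothetical, and the symmetric form of the window is an assumption; the text does not say which variant is used.

```python
import math

FRAME_SIZE = 512  # frame length given in the text


def hann_window(n_samples: int) -> list[float]:
    """Symmetric Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1)))."""
    return [
        0.5 * (1.0 - math.cos(2.0 * math.pi * n / (n_samples - 1)))
        for n in range(n_samples)
    ]


def apply_window(frame: list[float], window: list[float]) -> list[float]:
    """Multiply each sample by the matching window coefficient."""
    return [x * w for x, w in zip(frame, window)]


window = hann_window(FRAME_SIZE)

# A constant frame makes the taper easy to see: the edges go to zero,
# the middle stays close to the original amplitude.
flat_frame = [1.0] * FRAME_SIZE
tapered = apply_window(flat_frame, window)
```

The window only needs to be computed once and can be reused for every frame.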
s[n] = x[n] + 2·cos(ω)·s[n-1] − s[n-2]
power = (s[N-1] − s[N-2]·cos(ω))² + (s[N-2]·sin(ω))²
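The recurrence and power formula above can be sketched in a few lines of Python. This is a minimal illustration, not the application's actual code: the function name goertzel_power is hypothetical, ω is taken to be 2π·f / sample_rate, and the 16 kHz sample rate in the demo is an assumption (the text does not state one).

```python
import math


def goertzel_power(samples, target_hz, sample_rate):
    """Power at target_hz using the two-state Goertzel recurrence."""
    omega = 2.0 * math.pi * target_hz / sample_rate
    coeff = 2.0 * math.cos(omega)
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    for x in samples:
        s0 = x + coeff * s1 - s2  # s[n] = x[n] + 2cos(w)s[n-1] - s[n-2]
        s2 = s1
        s1 = s0
    # After the loop, s1 holds s[N-1] and s2 holds s[N-2].
    real = s1 - s2 * math.cos(omega)
    imag = s2 * math.sin(omega)
    return real * real + imag * imag


# Demo with an assumed 16 kHz sample rate and a 512-sample frame.
SAMPLE_RATE = 16_000  # assumption, not stated in the text
N = 512
tone = [math.sin(2.0 * math.pi * 1000.0 * n / SAMPLE_RATE) for n in range(N)]

p_on = goertzel_power(tone, 1000.0, SAMPLE_RATE)   # frequency present
p_off = goertzel_power(tone, 2000.0, SAMPLE_RATE)  # frequency absent
```

With these demo values, 1000 Hz falls exactly on a DFT bin, so p_on should come out close to (N/2)² = 65536 while p_off stays near zero.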
Six filters run in parallel, one per target frequency (200, 400, 800, 1200, 2400, 4000 Hz). These are chosen to cover common drone rotor harmonics — a quadcopter at 80 Hz blade-pass produces harmonics at 160, 240, 320 Hz and above.
Raw Goertzel power scales with the microphone's gain and the drone's distance. To remove that dependency, each bin's power is divided by the frame's total energy scaled by the frame length (sumSq × N, where sumSq is the sum of the squared samples and N is the number of samples in the frame). This gives a tonal-to-broadband ratio that stays roughly constant regardless of volume.
ratio[f] = goertzel_power[f] / (sumSq × N)
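The sketch below illustrates why this normalization removes the dependence on volume. To keep the example short, the bin power is computed with a direct single-frequency sum rather than the Goertzel recurrence; the two give the same quantity. The function names, the 16 kHz sample rate, and the zero-energy guard are assumptions made for illustration.

```python
import cmath
import math

TARGETS_HZ = (200, 400, 800, 1200, 2400, 4000)  # frequencies listed in the text
SAMPLE_RATE = 16_000                            # assumption, not stated in the text


def bin_power(frame, target_hz, sample_rate):
    """|sum x[n] * exp(-j*w*n)|^2, the same value Goertzel produces."""
    omega = 2.0 * math.pi * target_hz / sample_rate
    acc = sum(x * cmath.exp(-1j * omega * n) for n, x in enumerate(frame))
    return abs(acc) ** 2


def tonal_ratio(frame, target_hz, sample_rate):
    """ratio[f] = power[f] / (sumSq * N); returns 0.0 for a silent frame."""
    sum_sq = sum(x * x for x in frame)
    if sum_sq == 0.0:
        return 0.0  # assumed guard against division by zero
    return bin_power(frame, target_hz, sample_rate) / (sum_sq * len(frame))


def all_ratios(frame, sample_rate=SAMPLE_RATE):
    """One ratio per target frequency, keyed by frequency in Hz."""
    return {f: tonal_ratio(frame, f, sample_rate) for f in TARGETS_HZ}


# Same 1000 Hz tone at two very different gains.
quiet = [0.01 * math.sin(2.0 * math.pi * 1000.0 * n / SAMPLE_RATE) for n in range(512)]
loud = [100.0 * x for x in quiet]

r_quiet = tonal_ratio(quiet, 1000.0, SAMPLE_RATE)
r_loud = tonal_ratio(loud, 1000.0, SAMPLE_RATE)
```

Both calls should return essentially the same ratio, because scaling the input by a gain g multiplies the numerator and the denominator by g². For an unwindowed sinusoid that sits exactly on a bin, the ratio works out to 0.5.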
The bars on screen show this ratio. A steady tone like a drone rotor produces a ratio well above the background. Wind, voices, and traffic produce relatively flat spectra, so no single bin dominates.
Raw per-frame ratios are noisy. An exponential moving average smooths them across time:
ema[f] = α · ratio[f] + (1 − α) · ema[f]
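α (alpha) controls the trade-off between responsiveness and smoothing: a larger α reacts faster but passes more noise, a smaller α is steadier but slower. At α = 0.25, the EMA has a time constant of roughly 3–4 frames ≈ 100 ms. A drone that appears suddenly reaches threshold in a few frames; a brief spike decays just as quickly.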
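The smoothing step and the quoted time constant can be checked with a short sketch. The function names are hypothetical, and the conversion to milliseconds assumes a 16 kHz sample rate, which the text does not state; under that assumption each 512-sample frame lasts 32 ms.

```python
import math


def ema_update(previous: float, ratio: float, alpha: float) -> float:
    """One EMA step: ema = alpha * ratio + (1 - alpha) * ema."""
    return alpha * ratio + (1.0 - alpha) * previous


def time_constant_frames(alpha: float) -> float:
    """Frames needed for a step response to cover about 63% of the gap."""
    return -1.0 / math.log(1.0 - alpha)


ALPHA = 0.25
FRAME_MS = 512 / 16_000 * 1000.0  # 32 ms per frame at an assumed 16 kHz

tau_frames = time_constant_frames(ALPHA)
tau_ms = tau_frames * FRAME_MS

# Step response: the ratio jumps from 0 to 1 and stays there.
ema = 0.0
history = []
for _ in range(4):
    ema = ema_update(ema, 1.0, ALPHA)
    history.append(ema)
```

After k frames of a constant input of 1, the EMA equals 1 − (1 − α)^k, so with α = 0.25 it has covered about two thirds of the step by the fourth frame.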
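Two separate thresholds prevent chattering around the detection boundary: the smoothed ratio must rise above a higher ON threshold to count as a triggering frame, and must fall below a lower OFF threshold to count as a quiet frame.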
Sustain counters add debouncing in both directions. The alarm fires only after a set number of consecutive triggering frames (Sustain ON), and clears only after a set number of consecutive quiet frames (Sustain OFF). This prevents a single noisy frame from either triggering or cancelling an alarm.
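One way to combine the two thresholds with the sustain counters is a small state machine, sketched below. It assumes the thresholds form a conventional hysteresis pair (a higher one to trigger, a lower one to clear). The class name, field names, and every default value are hypothetical placeholders, not settings taken from the application.

```python
from dataclasses import dataclass


@dataclass
class AlarmState:
    on_threshold: float = 0.10   # hypothetical: level needed to count as triggering
    off_threshold: float = 0.05  # hypothetical: level below which a frame is quiet
    sustain_on: int = 3          # hypothetical: consecutive triggering frames to fire
    sustain_off: int = 10        # hypothetical: consecutive quiet frames to clear
    active: bool = False
    _on_count: int = 0
    _off_count: int = 0

    def update(self, level: float) -> bool:
        """Feed one smoothed ratio; return whether the alarm is active."""
        if not self.active:
            # Count consecutive frames at or above the ON threshold.
            self._on_count = self._on_count + 1 if level >= self.on_threshold else 0
            if self._on_count >= self.sustain_on:
                self.active = True
                self._on_count = 0
                self._off_count = 0
        else:
            # Count consecutive frames below the OFF threshold.
            self._off_count = self._off_count + 1 if level < self.off_threshold else 0
            if self._off_count >= self.sustain_off:
                self.active = False
                self._off_count = 0
        return self.active


detector = AlarmState(sustain_on=3, sustain_off=2)
levels = [0.2, 0.2, 0.0, 0.2, 0.2, 0.2, 0.07, 0.0, 0.07, 0.0, 0.0]
states = [detector.update(v) for v in levels]
```

In this run the single quiet frame at index 2 resets the ON counter, so the alarm fires only at index 5. Levels of 0.07 sit between the two thresholds and keep the alarm on, and it clears only after two consecutive frames below the OFF threshold.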