Why Your Mean Is More Stable Than Your Standard Error Suggests

18 Jul 2025 - tsp
Last update 18 Jul 2025
Reading time 7 mins

Have you ever wondered why your measured mean appears more stable than your calculated standard deviation or standard error would suggest? If you’ve been averaging measurements and your results don’t “jump around” as much as you expect, you’re not alone. The answer lies in the nature of noise.

When analyzing repeated measurements, it's common practice to report:

- The arithmetic mean $\bar{x}$
- The sample standard deviation $\sigma$
- The standard error of the mean $SE$

Those are defined as:

[ \begin{aligned} \bar{x} = \mu &= \frac{1}{N} \sum_{i=1}^{N} x_i \\ \sigma &= \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(\bar{x} - x_i\right)^2} \\ SE &= \frac{\sigma}{\sqrt{N}} \end{aligned} ]

Beyond summarizing past data, the mean and standard deviation (SD) also serve as point estimators for predicting future measurements — assuming the underlying statistical properties remain consistent.
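
As a minimal sketch of these definitions (NumPy-based; the helper name `summarize` is my own choice, not from the article):

```python
import numpy as np

def summarize(x):
    """Return mean, sample standard deviation (SD) and standard error (SE)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    sd = x.std(ddof=1)            # 1/(N-1) normalization, matching the formula above
    se = sd / np.sqrt(len(x))
    return mean, sd, se

# Example: 100 noisy readings around a true value of 5.0
rng = np.random.default_rng(42)
mean, sd, se = summarize(5.0 + 0.1 * rng.standard_normal(100))
print(f"mean={mean:.4f}  SD={sd:.4f}  SE={se:.4f}")
```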

For uncorrelated noise ("white noise", often also assumed to be Gaussian), the standard error shrinks as you take more samples:

[ SE = \frac{\sigma}{\sqrt{N}} ]

In the limit of infinitely many samples one would then expect

[ \lim_{N\to\infty} SE = \lim_{N\to\infty} \frac{\sigma}{\sqrt{N}} = 0 ]

This assumes each measurement is statistically independent. But what if they’re not?
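
A quick numerical illustration of this assumption (again just a sketch of mine; the random walk serves as an extreme stand-in for correlated measurements): for independent samples the SE keeps shrinking with $N$, while for the correlated series it stalls.

```python
import numpy as np

def standard_error(x):
    return x.std(ddof=1) / np.sqrt(len(x))

rng = np.random.default_rng(0)
for n in (100, 1_000, 10_000):
    white = rng.standard_normal(n)             # independent samples
    walk = np.cumsum(rng.standard_normal(n))   # strongly correlated samples
    print(f"N={n:>6}  SE(white)={standard_error(white):.4f}  "
          f"SE(random walk)={standard_error(walk):.4f}")
```

For the random walk the sample SD itself grows roughly as $\sqrt{N}$, which cancels the $1/\sqrt{N}$ in the SE - exactly the plateau behaviour discussed below.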

The reason: Correlated Noise

In real-world experiments, especially with precise instruments or long averaging times, measurements often contain correlated noise. A few examples include:

- Flicker ($1/f$) noise in electronic components
- Slow temperature or other environmental drifts
- Aging of components and reference standards

These sources introduce long-term correlations. Even if you repeat a measurement a large number of times, the effective number of independent samples is far smaller. That means:

- The standard error no longer shrinks as $1/\sqrt{N}$
- Beyond some point, additional averaging no longer improves your estimate

You might notice that the standard deviation (SD) and standard error (SE) seem to reach a plateau, while your measured mean hovers consistently around the same value. This can feel counterintuitive - but it’s actually expected behavior in the presence of low-frequency noise.

The role of Allan Deviation

To detect and quantify this kind of noise behavior, especially in time-series data, physicists and engineers often use Allan deviation. Instead of assuming all samples are uncorrelated, it measures how the average changes over increasing timescales.

The Allan variance of a time series $x(t)$ at a fixed averaging time $\tau$ (keep in mind this is an estimator that is only valid for fixed $\tau$; different forms exist for other conditions - this is not a tutorial on Allan deviations and variances) is defined as:

[ \sigma_x^2(\tau) = \frac{1}{2(M-1)} \sum_{i=1}^{M-1} \left(\bar{x}_{i+1} - \bar{x}_{i}\right)^2 ]

The $\bar{x}_i$ are the averages of $x(t)$ over $M$ successive, non-overlapping time intervals of a fixed duration $\tau$ (imagine chopping the series into consecutive windows of length $\tau$). The square root of this variance gives the Allan deviation $\sigma_x(\tau)$.
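
A direct NumPy translation of this estimator might look like the following sketch (non-overlapping windows of $m$ samples each, so $\tau = m \cdot t_{sample}$; the function name and interface are my own):

```python
import numpy as np

def allan_deviation(x, m):
    """Non-overlapping Allan deviation of x for windows of m samples.

    m corresponds to tau = m * t_sample; implements the estimator above.
    """
    x = np.asarray(x, dtype=float)
    M = len(x) // m                    # number of complete windows
    if M < 2:
        raise ValueError("need at least two complete averaging windows")
    bars = x[:M * m].reshape(M, m).mean(axis=1)   # the window averages xbar_i
    # sigma_x^2(tau) = 1/(2(M-1)) * sum_i (xbar_{i+1} - xbar_i)^2
    return np.sqrt(np.mean(np.diff(bars) ** 2) / 2.0)
```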

Different types of noise yield characteristic dependencies of the Allan deviation on $\tau$:

- White noise: $\sigma_x(\tau) \propto \tau^{-1/2}$ - longer averaging keeps helping
- Flicker ($1/f$) noise: $\sigma_x(\tau) \approx \text{const.}$ - a plateau
- Random walk: $\sigma_x(\tau) \propto \tau^{1/2}$
- Linear drift: $\sigma_x(\tau) \propto \tau$

By plotting Allan deviation across a range of $\tau$, one can visually identify which type of noise dominates at which timescales.

Allan deviation helps answer:

- Which noise type dominates at a given timescale?
- How long can I usefully average before flicker noise or drift takes over?
- What is the optimal averaging time for my measurement?

The following graph of simulated data shows the three common cases of white noise, flicker noise and constant drift, together with the behaviour of the Allan deviation, the standard deviation (SD) and the standard error (SE) over time:

In this graph we can see:

- For white noise, the SD converges to the true $\sigma$, the SE keeps shrinking as $\sigma/\sqrt{N}$ and the Allan deviation falls as $\tau^{-1/2}$.
- For flicker noise, the SE stops shrinking and the Allan deviation levels off into a plateau.
- For constant drift, the SD grows with time and the SE becomes misleading, while the Allan deviation rises with $\tau$.
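
The plot itself is not reproduced here, but a sketch of how such simulated data could be generated and checked numerically (reusing the allan_deviation helper from above; the FFT-based $1/f$ synthesis is one common recipe, not necessarily the one used for the original figure):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2 ** 14

white = rng.standard_normal(n)

# Flicker (1/f) noise via FFT shaping of white noise (one common recipe)
spec = np.fft.rfft(rng.standard_normal(n))
freqs = np.fft.rfftfreq(n)
freqs[0] = freqs[1]                      # avoid division by zero at DC
flicker = np.fft.irfft(spec / np.sqrt(freqs), n)

drift = white + 1e-3 * np.arange(n)      # white noise plus a slow linear drift

for name, x in (("white", white), ("flicker", flicker), ("drift", drift)):
    sd = x.std(ddof=1)
    se = sd / np.sqrt(n)
    adev = ", ".join(f"{allan_deviation(x, m):.3f}" for m in (8, 64, 512))
    print(f"{name:>7}: SD={sd:.3f}  SE={se:.5f}  ADEV(m=8,64,512)={adev}")
```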

Apparent Stability Is Not the Same as Accuracy

A crucial caveat is that a stable mean and small standard error might tempt one to report overly precise results - especially if the underlying noise is correlated. In such cases, standard error underestimates the true uncertainty, and any conclusions drawn from the illusion of precision may be misleading. It’s important to remember that a narrow confidence interval computed from standard error is only valid under the assumption of independent samples. When this assumption fails, reported uncertainty becomes artificially optimistic. Always assess the nature of your noise before trusting the digits after the decimal point.

Conversely, long-term drifts or correlated offsets can also inflate the standard deviation and standard error, overestimating the variability of individual measurements. This can make your system appear noisier than it really is on short timescales - especially if the underlying random noise is low and the dominant effect is slow drift. In this case, while the reported mean itself may be inaccurate due to bias from the drift, the apparent per-measurement noise is exaggerated. Correlated noise can thus produce both overconfidence and underconfidence, depending on how you interpret your statistics.

Takeaways

| Noise Type | Standard Deviation (SD) | Standard Error (SE) |
|---|---|---|
| White noise | ✅ Converges to a constant value (the true $\sigma$) | ✅ Decreases as $\frac{\sigma}{\sqrt{N}}$ |
| Flicker noise | ⚠️ Slowly increasing or saturating | ❌ Does not decrease as $1/\sqrt{N}$; often saturates |
| Linear drift | 🔺 Increases linearly with time | ❌ Misleading - may shrink briefly, but becomes invalid due to nonstationarity |

This article is tagged: Physics, School math, Math, Tutorial, Basics, Measurement, Statistics

