Prediction Error vs Measurement Error in Model Fitting

11 Apr 2026 - tsp
Last update 11 Apr 2026
Reading time 6 mins

When fitting a model function to experimental data, one is often confronted with a subtle but important conceptual issue: the uncertainty of the fitted model is frequently much smaller than the apparent measurement uncertainty of the individual data points. At first glance, this may appear contradictory. How can a model, fitted to noisy data, exhibit smaller uncertainty than the data itself?

This apparent paradox often leads to misinterpretation. Observers may assume that the narrow confidence bands of the fitted model represent the measurement uncertainty, and consequently judge the data against these bands, leading to incorrect conclusions about data quality or model validity.

This article clarifies the distinction between:

- the measurement error of individual data points,
- the uncertainty of the fitted model (the confidence band), and
- the prediction error for new observations (the prediction interval).

We will demonstrate how these quantities arise via a simulated measurement, how they should be interpreted, and how they can be computed in practice.

In the end we will provide a short summary and conclusion.

Measurement Error vs Model Uncertainty

Measurement Error

Measurement error describes the uncertainty associated with each observed data point. Formally, we write:

[ \begin{aligned} y_i &= f(x_i, \theta) + \epsilon_i \end{aligned} ]

Here

- $y_i$ is the observed data point at position $x_i$,
- $f(x, \theta)$ is the model function with parameters $\theta$, and
- $\epsilon_i$ is the measurement error of the $i$-th observation.

The error $\epsilon_i$ is typically determined by the measurement process itself (measurement noise, environmental fluctuations, discretization, etc.) and corresponds to the standard deviation of that process. $y_i$ is typically the mean obtained from repeated measurements.
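As a small illustration (toy numbers, not taken from the article's simulation), one can treat $y_i$ as the mean of repeated readings and characterize $\epsilon_i$ by their scatter:

```python
import numpy as np

# Toy illustration: one data point y_i obtained from repeated readings.
rng = np.random.default_rng(0)
true_value = 5.0
readings = true_value + rng.normal(0.0, 0.3, 100)  # 100 repeats, noise sigma = 0.3

y_i = readings.mean()                      # reported data point (the mean)
sigma_i = readings.std(ddof=1)             # scatter of a single reading, ~0.3
sem = sigma_i / np.sqrt(readings.size)     # standard error of the mean y_i
print(y_i, sigma_i, sem)
```

Note the distinction already visible here: the scatter of a single reading stays at $\sigma \approx 0.3$ no matter how often we repeat, while the uncertainty of the reported mean shrinks with $\sqrt{n}$.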

Model (Fit) Uncertainty

When fitting a model $f(x, \theta)$ to data, the parameters $\theta$ are estimated from all observations. The uncertainty of these parameters is given by the covariance matrix:

[ C := \mathrm{Cov}(\theta) ]

The matrix encodes how precisely the parameters are determined by the fitting / regression procedure. The uncertainty of the model prediction at a given point $x$ is obtained by propagating the covariance:

[ \begin{aligned} \sigma_f^2(x) &= \left(\nabla_\theta f(x, \theta)\right)^T C \nabla_\theta f(x, \theta) \end{aligned} ]

The quantity $\sigma_f$ represents the confidence band of the fitted model. The width of this band decreases with the number of data points (similar to the standard error of a measurement). For well-conditioned problems with independent observations, the scaling can often be estimated as:

[ \sigma_f \sim \frac{\sigma_y}{\sqrt{N}} ]
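A numerical sketch of the covariance propagation above, using a hypothetical linear model (the approach itself is model-agnostic): `curve_fit` returns $\hat{\theta}$ and the covariance matrix $C$, and a central-difference gradient yields $\sigma_f(x)$. At the center of the data this reproduces the $\sigma_y / \sqrt{N}$ scaling.

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b):                      # hypothetical toy model: a simple line
    return a * x + b

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = f(x, 2.0, 1.0) + rng.normal(0.0, 0.5, x.size)   # sigma_y = 0.5, N = 50

theta, C = curve_fit(f, x, y)        # theta-hat and covariance matrix C

def sigma_f(xq, eps=1e-6):
    """Propagate the parameter covariance: sqrt(grad^T C grad)."""
    grad = np.array([
        (f(xq, *(theta + d)) - f(xq, *(theta - d))) / (2.0 * eps)
        for d in eps * np.eye(theta.size)
    ])
    return float(np.sqrt(grad @ C @ grad))

print(sigma_f(5.0))                  # roughly sigma_y / sqrt(N) ~ 0.07 here
```

For nonlinear models the gradient is no longer constant in $\theta$, but the same delta-method propagation applies.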

Even if individual measurements are noisy, the estimated parameters of the assumed model can be determined very precisely.

Fit uncertainty $\sigma_f$: How confident can we be about the fitted model?

Note that a small $\sigma_f$ does not imply that the model is correct. You still need to apply proper statistical tests to your hypothesis.
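One such check (an illustrative sketch, not from the article) is the reduced chi-square of the residuals, assuming the per-point noise $\sigma_y$ is known:

```python
import numpy as np
from scipy import stats

# Illustrative goodness-of-fit check with toy residuals of a "good" fit.
rng = np.random.default_rng(3)
N, p = 50, 2                          # data points, fitted parameters
sigma_y = 0.5                         # assumed known measurement noise
r = rng.normal(0.0, sigma_y, N)       # stand-in residuals from a hypothetical fit

chi2 = np.sum((r / sigma_y) ** 2)
dof = N - p
p_value = stats.chi2.sf(chi2, dof)    # a tiny p-value would flag a bad model
print(chi2 / dof, p_value)
```

A reduced chi-square far above 1 indicates model mismatch or underestimated errors; far below 1 suggests overestimated errors.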

Residuals and Data-Driven Variance

The residuals quantify how well the model describes the observed data:

[ r_i = y_i - \hat{y_i} ]

Here $\hat{y_i} = f(x_i, \hat{\theta})$ is the prediction of the data value by the fitted model. From these residuals one can estimate the variance of the data around the model:

[ \sigma_r^2 = \frac{1}{N-p} \sum_{i=1}^{N} r_i^2 ]

Here:

- $N$ is the number of data points, and
- $p$ is the number of fitted model parameters.

The quantity $\sigma_r$ represents the intrinsic scatter of the data around the model and is typically comparable to the measurement noise (though they are not equal). In the case of correlated noise, $\sigma_r$ underestimates the true uncertainty.

Residual error / intrinsic scatter $\sigma_r$: How much does the measurement process scatter? (Typically comparable to the measurement noise, though $\sigma_r$ also includes model mismatch, unmodeled systematics, etc.)
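A minimal sketch of this estimate, with $p = 2$ fitted parameters on toy linear data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 20)
y = 3.0 * x + rng.normal(0.0, 0.2, x.size)   # true noise sigma = 0.2

p = 2                                        # fitted parameters: slope, intercept
coef = np.polyfit(x, y, 1)
yhat = np.polyval(coef, x)

r = y - yhat                                 # residuals
sigma_r = np.sqrt(np.sum(r ** 2) / (x.size - p))
print(sigma_r)                               # comparable to the true noise 0.2
```

Dividing by $N - p$ instead of $N$ compensates for the degrees of freedom consumed by the fit; otherwise $\sigma_r$ would be biased low.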

Prediction Error

The prediction error describes where - at a given point of your model - you would expect the next measurement to reside with a given certainty. It must account for two contributions, which are typically treated as independent:

- the fit uncertainty $\sigma_f(x)$ of the model itself, and
- the intrinsic scatter $\sigma_r$ of individual measurements.

This corresponds to the classical distinction between confidence intervals (the uncertainty of the fitted mean model) and prediction intervals (the uncertainty of individual observations).

Under the assumption of independence this yields a total error $\sigma$:

[ \begin{aligned} \sigma^2(x) &= \sigma_f^2(x) + \sigma_r^2 \\ \sigma(x) &= \sqrt{\sigma_f^2(x) + \sigma_r^2} \end{aligned} ]

Prediction error $\sigma$: How well can the model predict a new measurement at position $x$?

Keep in mind that the assumption of independence breaks in case of heteroscedastic errors or correlated noise!

A Practical Example

To illustrate the concepts, we simulate a derivative Lorentzian (Cauchy) shaped signal, add noise in both axes, perform a fit and then compute the parameter uncertainties, the model confidence band, the residual variance and the full prediction uncertainty.
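A condensed sketch of such a simulation (the parameterization of the derivative Lorentzian and the noise level are illustrative assumptions, not the article's exact values):

```python
import numpy as np
from scipy.optimize import curve_fit

# One common parameterization of a derivative-Lorentzian line shape.
def dlorentz(x, A, x0, gamma):
    return -2.0 * A * gamma**2 * (x - x0) / ((x - x0)**2 + gamma**2)**2

rng = np.random.default_rng(42)
x = np.linspace(390.0, 410.0, 200)
y_true = dlorentz(x, 120.0, 400.0, 1.5)          # A = 120, x0 = 400, gamma = 1.5
y = y_true + rng.normal(0.0, 0.1 * np.max(np.abs(y_true)), x.size)

# Levenberg-Marquardt least-squares fit against the same model.
theta, C = curve_fit(dlorentz, x, y, p0=[100.0, 399.0, 1.0], method="lm")

def sigma_f(xq, eps=1e-6):
    """Confidence band: propagate the parameter covariance C (delta method)."""
    g = np.array([
        (dlorentz(xq, *(theta + d)) - dlorentz(xq, *(theta - d))) / (2.0 * eps)
        for d in eps * np.eye(3)
    ])
    return float(np.sqrt(g @ C @ g))

# Intrinsic scatter from the residuals (N - p degrees of freedom).
r = y - dlorentz(x, *theta)
sigma_r = np.sqrt(np.sum(r ** 2) / (x.size - 3))

# Full prediction error at a query point.
xq = 400.5
sigma = np.sqrt(sigma_f(xq) ** 2 + sigma_r ** 2)
print(theta, sigma_f(xq), sigma_r, sigma)
```

Plotting $f(x, \hat{\theta}) \pm \sigma_f(x)$ and $f(x, \hat{\theta}) \pm \sigma(x)$ over the data reproduces the two bands discussed below.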

In this plot one can see:

First, the blue (simulated) data points. The simulation assumes an amplitude of $A = 120$, $x_0 = 400$, $\mathrm{FWHM} = 3.0$ (i.e. $\gamma = 1.5$), $\sigma_x = 0.3$ and $\sigma_y = 0.7 \cdot \mathrm{max}(y_i)$. On top of these we performed the orange fit, using the Levenberg-Marquardt algorithm for a least-squares fit against the same model function that was used to synthesize the data. This yields $\hat{x_0} = 399.892560 \pm 0.097864\,\mathrm{MHz}$ and $\mathrm{FWHM} = 2.369803 \pm 0.385163\,\mathrm{MHz}$. The narrow blue region around the orange fit function is the fit uncertainty $\sigma_f$. As one can see, it is extremely narrow and does not reflect the scatter of the individual data points. Comparing new data points against this region alone would give the false impression that the measurements do not confirm the model hypothesis. Adding the residual measurement error $\sigma_r$ yields the total error $\sigma = \sqrt{\sigma_f(x)^2 + \sigma_r^2}$, shown as the orange band. This band is much wider, and one can estimate that it includes around 68 percent of all data points. The blue errorbar-like line on top of the points is again $\sigma_r$, the expected scatter of individual measurements.

As one can see, the confidence band of the model (the narrow region) is much narrower than the prediction region for individual measurements (the wide band).

Conclusion and the Common Interpretation Pitfall

Comparing measurement data directly to the confidence band $\sigma_f(x)$ instead of the prediction interval $\sigma(x)$ is a common mistake and leads to systematic overestimation of discrepancies between model and data.

The correct interpretation is:

- Individual data points should be compared against the prediction interval $\sigma(x)$.
- The confidence band $\sigma_f(x)$ only describes how well the mean model is determined.

This implies that for sufficiently large datasets:

[ \sigma_f(x) \ll \sigma_r ]

This leads to the conclusions:

- The confidence band $\sigma_f(x)$ shrinks with growing $N$, while the prediction interval $\sigma(x)$ never falls below $\sigma_r$.
- New measurements must be judged against $\sigma(x)$, not against $\sigma_f(x)$.
- A narrow confidence band does not by itself validate the model; proper statistical tests are still required.

This article is tagged: Physics, School math, Math, Basics, Tutorial, Statistics, Measurements

