11 Jan 2024 - tsp


Fitting models to data is a task physicists have to perform on a daily basis. `lmfit` is a very nice and easy to use library for doing this in Python. Upfront: this is not the original way one is supposed to build models using `lmfit`. `lmfit` is meant to be an easy wrapper around the `scipy` fitting routines for different fitting methods (by default it uses the Levenberg-Marquardt least squares method; depending on the set of parameters and the type of function one can use any of the many other fitting methods supported by `lmfit` and `scipy`) as well as a way to easily build composite functions based on existing model functions. This recipe, on the other hand, just uses the plain minimizer interface together with a guessing and an objective function. It’s very similar to the method shown in the `minimize` documentation but differs from the approach usually used in the rest of the documentation.

- How to install lmfit
- A very simple method for a simple model
- Why this method might be worth using
- GitHub GIST

The easiest way to install `lmfit` is using FreeBSD’s `pkg` to fetch precompiled binaries:

```
pkg install math/lmfit
```

Another way to install is using pip:

```
pip install lmfit
```

The idea is simple: you have some data and a model function that depends on a set of parameters. You then want to determine the choice of parameters for which the model function matches the data best. This is one of the most common problems one encounters in practical work.
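The procedure boils down to minimizing the sum of squared residuals between model and data. As a tiny illustration with made-up numbers (independent of `lmfit`), a candidate parameter closer to the truth yields a smaller residual sum:

```python
import numpy as np

# Hypothetical example: data roughly following a line y = 2x,
# and two candidate slope values to compare.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 2.0, 3.9, 6.1])

def sum_of_squares(slope):
    # Residual: difference between model prediction and data
    residuals = slope * x - y
    return np.sum(residuals**2)

# The slope closer to the true value yields the smaller residual sum
print(sum_of_squares(2.0), sum_of_squares(3.0))
```

A fitting routine automates exactly this search over the parameter space.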

Upfront let’s import all required functions and classes:

```
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Parameters, fit_report, minimize
```

The first thing we need is an objective function that we want to fit to our data. This can be a polynomial, a Gaussian function, etc. This function can either be evaluated - in that case it gets passed the coordinates at which it should be evaluated and the parameters - or it can be used during fitting, in which case it also receives the data and returns the residual. Let’s take a look at how such a simple function can look for a Gaussian:

$$ f(x; c, \mu, \sigma) = c \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \frac{(x-\mu)^2}{\sigma^2}} $$

As one can see this function has three parameters:

- The amplitude $c$
- The most probable value $\mu$
- The standard deviation $\sigma$

We are going to implement this function in a way that it receives the parameters as a `Parameters` object in its first argument, the evaluation point as its second argument and optionally a third argument that receives the data during fitting. In addition to the three parameters above, the implementation also includes a constant offset:

```
def modelFunction(pars, x, data = None):
    # First we convert the Parameters object into a dictionary,
    # this will make our code a little bit easier to understand
    vals = pars.valuesdict()
    # For the sake of readability assign our parameters to variables.
    amp = vals["amp"]
    mu = vals["mu"]
    sigma = vals["sigma"]
    offset = vals["offset"]
    # Evaluate the function
    modelValue = amp * (1.0 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu)**2 / (sigma**2))) + offset
    # If we got data calculate the residual, else return the value
    if data is not None:
        return modelValue - data
    else:
        return modelValue
```

Now let’s assume we got a bunch of data points at the positions `data_x` with values `data_y` and we want to fit this Gaussian function to them. For the sake of this demonstration we’re going to generate a bunch of noisy data points ourselves:

```
real_mu = 3.4
real_sigma = 3.1
real_amp = 7.1
real_offset = -3
real_noise_lvl = 1
data_x = np.linspace(-10, 10, 100)
data_y = real_amp * np.exp(-0.5 * (data_x - real_mu)**2 / (real_sigma**2)) / (np.sqrt(2 * np.pi) * real_sigma) + real_offset
data_y_noisy = data_y + (np.random.rand(data_y.shape[0]) - 0.5) * real_noise_lvl
```

Let’s take a short look at how this generated data looks:

```
fig, ax = plt.subplots()
ax.set_title("Simulated input data")
ax.plot(data_x, data_y, 'g--', label = "Real data")
ax.plot(data_x, data_y_noisy, label = "Noisy input data")
ax.grid()
ax.legend()
plt.show()
```

Now let’s try to fit our function to the noisy data. First we generate a `Parameters` object. For the initial values we make a naive guess of the parameters:

- The amplitude will be guessed to be the maximum minus the minimum - so the span of the function values
- The most probable value $\mu$ will be estimated to be at the location of the maximum
- We assume $\sigma$ to be 1.
- We assume the offset to be the minimum of all data points

```
fit_pars = Parameters()
fit_pars.add("amp", value = np.max(data_y_noisy) - np.min(data_y_noisy), vary = True)
fit_pars.add("mu", value = data_x[np.argmax(data_y_noisy)], vary = True)
fit_pars.add("sigma", value = 1, vary = True)
fit_pars.add("offset", value = np.min(data_y_noisy), vary = True)
```

After initialization we just have to call `minimize`:

```
out = minimize(modelFunction, fit_pars, args=(data_x,), kws = { 'data' : data_y_noisy })
```

This has already performed the fit. To get a human readable fit report we can just use `fit_report`:

```
print(fit_report(out))
```

This will yield output like the following:

```
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 107
    # data points      = 100
    # variables        = 4
    chi-square         = 9.70078655
    reduced chi-square = 0.10104986
    Akaike info crit   = -225.296322
    Bayesian info crit = -214.875641
[[Variables]]
    amp:     9.66997823 +/- 1.66650837 (17.23%) (init = -1.615516)
    mu:      3.53609503 +/- 0.32519657 (9.20%) (init = 2.929293)
    sigma:   3.95359254 +/- 0.50650314 (12.81%) (init = 1)
    offset: -3.09039105 +/- 0.07705836 (2.49%) (init = -3.490874)
[[Correlations]] (unreported correlations are < 0.100)
    C(amp, offset)   = -0.906
    C(amp, sigma)    =  0.844
    C(sigma, offset) = -0.723
```

One can programmatically access $\chi^2$ via `out.chisqr` (in this example `9.70078654587556`):

```
print(f"Chi squared: {out.chisqr}")
```

Now let’s plot the result again:

```
fig, ax = plt.subplots()
ax.plot(data_x, data_y_noisy, 'x', label = "Noisy input data")
ax.plot(data_x, modelFunction(out.params, data_x), 'r--', label = "Fitted function")
ax.plot(data_x, data_y, 'g', label = "Original function")
ax.grid()
ax.legend()
plt.show()
```

Of course this method involves a lot of boilerplate code for `lmfit`, which would also support simple model building like in the following example:

```
from lmfit.models import GaussianModel, ConstantModel
model = GaussianModel()
params = model.guess(data_y_noisy, x=data_x)
background = ConstantModel()
# Add the background parameters to the parameter set as well
params += background.guess(data_y_noisy, x=data_x)
model = model + background
out = model.fit(data_y_noisy, params, x=data_x)
## Plot
fig, ax = plt.subplots()
ax.plot(data_x, data_y_noisy, 'x', label = "Original data")
ax.plot(data_x, out.best_fit, 'r', label = "Fitted function")
ax.grid()
ax.legend()
plt.show()
```

It gets really handy when one wants to build complex models, perform iterative approximations that are hard to handle using other minimization or fitting systems, or perform experimental measurements in between the fitting steps.

The whole code used for this blog article can be found in a GitHub GIST.

This article is tagged: Physics, Programming, Math, Tutorial, Basics, Python, Statistics, Measurement, Gaussian, Mathematics

Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)

This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/