What's new? | Help | Directory | Sign in
Google
pyminuit
Minuit numerical function minimization in Python
  
  
  
  
    
Search
for
Updated Jun 21, 2008 by jpivarski
Labels: User-Documentation, Featured
GettingStartedGuide  
Informal introduction and tutorial

Installation

First, download and install PyMinuit.

When it is correctly installed, you will be able to import the minuit module in Python.

% python
Python 2.4.3 (#2, Oct  6 2006, 07:52:30) 
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import minuit

Success!

>>> import minuit
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named minuit

Failure!

All of the following examples will assume that you have executed

>>> import minuit

(without explicitly typing the Python >>> prompt).

Minimizing simple functions

To minimize a function, we will

  1. create it,
  2. pass it to a new Minuit object, and
  3. call migrad(), the minimizing algorithm
  4. extract the results from the Minuit object

Here's an example of a simple 2-dimensional paraboloid.

>>> def f(x, y): return (x-1)**2 + (y-2)**2 + 3      # step 1
>>> m = minuit.Minuit(f)                             # step 2
>>> m.migrad()                                       # step 3
>>> m.values["x"], m.values["y"], m.fval             # step 4
(0.99999999986011212, 1.9999999997202242, 3.0)

The optimized x value is 1 (well, almost), the y value is 2 (ditto), and the minimum value of the function is 3.

You can do this with any Python function, even functions that throw exceptions when certain parameters are not met.

>>> def f(x,y):
...     if abs(x) < 1: raise Exception
...     return x**2 + y**2
... 
>>> m = minuit.Minuit(f)
>>> m.migrad()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in f
Exception

A function in Python can be defined using the def keyword or inline as a lambda expression. The following two statements are equivalent:

>>> def f(x, y):
...     return x**2 + y**2
... 
>>> f = lambda x, y: x**2 + y**2

but the second can be embedded in the Minuit object's constructor:

>>> m = minuit.Minuit(lambda x, y: x**2 + y**2)

See FunctionReference for a warning about integer division.

Helping the minimization by setting initial values

Naturally, there is no guarantee that we have found the absolute minimum of the function: Minuit can be fooled by a local minimum.

>>> def f(x):
...     if x < 10:
...         return (x-1)**2 - 5
...     else:
...         return (x-15)**2 - 7
... 
>>> m = minuit.Minuit(f)
>>> m.values["x"] = 2
>>> m.migrad()
>>> m.values["x"], m.fval
(1.0000000001398872, -5.0)
>>> 
>>> m = minuit.Minuit(f)
>>> m.values["x"] = 16
>>> m.migrad()
>>> m.values["x"], m.fval
(15.000000001915796, -7.0)

You can guide Minuit away from local minima by a careful choice of initial values. This can also help if Minuit fails to converge: the neighborhood of minima are often more well-behaved (positive definite, or even parabolic) than regions far from minima. You can pass initial values and errors (which migrad() interprets as starting step sizes) in the Minuit constructor

>>> m = minuit.Minuit(lambda x, y: x**2 + y**2, x=3, y=5, err_x=0.01)

or after the object has been created

>>> m = minuit.Minuit(lambda x, y: x**2 + y**2)
>>> m.values["x"] = 3
>>> m.values["y"] = 5
>>> m.errors["x"] = 0.01

If Minuit fails to find a minimum of your function, you can set the printMode to try to diagnose the problem and choose a better starting point. (See FunctionReference for more.)

>>> m = minuit.Minuit(lambda x, y: (x-1)**2 + (y-2)**2, x=3, y=4)
>>> m.printMode = 1
>>> m.migrad()
  FCN Result | Parameter values
-------------+--------------------------------------------------------
           8 |            3            4
       8.004 |        3.001            4
       7.996 |        2.999            4
     8.00586 |      3.00146            4
     7.99414 |      2.99854            4
       8.004 |            3        4.001
       7.996 |            3        3.999
     8.00586 |            3      4.00146
     7.99414 |            3      3.99854
 4.24755e-18 |            1            2
 2.38419e-07 |      1.00049            2
 2.38418e-07 |     0.999512            2
 2.38421e-07 |            1      2.00049
 2.38417e-07 |            1      1.99951
 4.24755e-18 |            1            2
 2.38419e-07 |      1.00049            2
 2.38418e-07 |     0.999512            2
 2.38421e-07 |            1      2.00049
 2.38417e-07 |            1      1.99951
 9.53677e-09 |       1.0001            2
 9.53671e-09 |     0.999902            2
 9.53714e-09 |            1       2.0001
 9.53634e-09 |            1       1.9999
 4.76839e-07 |      1.00049      2.00049

You can also help a minimization by starting with a rough scan of the parameter space.

>>> m = minuit.Minuit(lambda x, y: (x-1)**2 + (y-2)**2, x=3, y=4)
>>> m.scan(("x", 30, -3, 7), ("y", 30, -3, 7), output=False)
>>> m.values
{'y': 2.1666666666666661, 'x': 1.1666666666666663}
>>> m.migrad()
>>> m.values
{'y': 2.0000000000041425, 'x': 1.0000000000042153}

Error estimation

For statistics applications, we're also very interested in the steepness (stepth?) of the function near its minimum, because that is related the the uncertainty in our fit parameters given by the data. These are returned in the errors attribute

>>> m = minuit.Minuit(lambda x, y: (x-1)**2 / 9.0 + (y-2)**4)
>>> m.migrad()
>>> m.errors["x"], m.errors["y"]
(2.9999999999999463, 9.3099692500949249)

But the migrad() algorithm alone does not guarantee error estimates within tolerance. For accurate errors, run the hesse() algorithm.

>>> m.hesse()
>>> m.errors["x"], m.errors["y"]
(2.9999999999999454, 9.308850308199208)

(Note that the quartic "y" error changed by 0.01%, much larger than the quadratic "x" error. The differences can be large in more pathological cases.)

The hesse() algorithm calculates the entire covariance matrix, the matrix of second derivatives at the minimum. For uncorrelated functions that can be separated into x terms and y terms (like all the ones I have presented so far), the off-diagonal entries of the matrix are zero. Let's illustrate the covariance matrix with a mixed xy term.

>>> m = minuit.Minuit(lambda x, y: x**2 + y**2 + x*y)
>>> m.migrad()
>>> m.hesse()
>>> m.errors["x"]**2, m.errors["y"]**2
(1.3333333333333337, 1.3333333333333337)
>>> m.covariance
{('y', 'x'): -0.66666666666666685, ('x', 'y'): -0.66666666666666685, ('y', 'y'): 1.3333333333333335, ('x', 'x'): 1.3333333333333335}

The diagonal elements (xx and yy) are equal to the squares of the errors by definition. Because covariance is expressed as a dictionary, we can pull any element from it by name.

>>> m.covariance["x", "x"], m.covariance["y", "y"]
(1.3333333333333335, 1.3333333333333335)
>>> m.covariance["x", "y"], m.covariance["y", "x"]
(-0.66666666666666685, -0.66666666666666685)

Sometimes, it is more convenient to access the covariance matrix as a square array.

>>> m.matrix()
((1.3333333333333335, -0.66666666666666685), (-0.66666666666666685, 1.3333333333333335))
>>> import Numeric
>>> Numeric.array(m.matrix())
array([[ 1.33333333, -0.66666667],
       [-0.66666667,  1.33333333]])

(The second example works only if you have the Numeric, numarray, or numpy modules installed.)

It's also sometimes interesting to see the correlation matrix, which is normalized such that all diagonal entries are 1.

>>> m.matrix(correlation=True)
((1.0, -0.5), (-0.5, 1.0))

Non-linear error measurement

When a function is not parabolic near its minimum, the errors from the second derivative (which are quadratic at heart) may not be representative. In that case, we use the minos() algorithm to measure 1-sigma devitions by explicitly calculating the function away from the minimum.

>>> m = minuit.Minuit(lambda x: x**4)
>>> m.migrad()
>>> m.hesse()
>>> m.errors["x"]
48.82085066828526
>>> m.minos()
>>> m.merrors["x", 1]
1.0007813246693986

Errors calculated by minos() go into the merrors attribute, rather than errors. They are indexed by parameter and the number of sigmas in each direction, because the errors are not necessarily symmetric.

>>> m = minuit.Minuit(lambda x: x**4 + x**3)
>>> m.migrad()
VariableMetricBuilder: matrix not pos.def.
gdel > 0: 0.0604704
gdel: -0.00686919
>>> m.minos()
>>> m.merrors["x", -1], m.merrors["x", 1]
(-0.60752321396926234, 1.5429599172115098)

You can also calculate minos() errors an arbitrary number of sigmas from the minimum.

>>> m.minos("x", 2)
>>> m.merrors
{('x', 2.0): 1.9581663883782807, ('x', -1.0): -0.60752321396926234, ('x', 1.0): 1.5429599172115098}

This can be useful for 90%, 95%, and 99% confidence levels.

Drawing contour curves and function density maps

There's a 2-dimensional equivalent of minos() errors: contour lines in two parameters. When you call contour("param1", "param2", sigmas), you will get a list of x-y pairs for an error ellipse drawn at N sigmas.

>>> m = minuit.Minuit(lambda x, y: x**2 + y**2 + x*y)
>>> m.migrad()
>>> m.contour("x", "y", 1)
[(-1.1547005383792515, 0.57735026918952459), (-1.1016128399175811, 0.25107843795420443), ...]

If the function has a non-linear (or really, non-parabolic) minimum, the error ellipse won't be elliptical.

The scan() function can also produce plotter-friendly output by setting output=True (the default). It outputs a matrix of evaluated function values, which can be plotted as a density map for the function.

Fitting

If you actually want to fit anything, you need to write a chi2 or negative log likelihood function.

>>> data = [(1, 1.1, 0.1), (2, 2.1, 0.1), (3, 2.4, 0.2), (4, 4.3, 0.1)]
>>> def f(x, a, b): return a + b*x
... 
>>> def chi2(a, b):
...     c2 = 0.
...     for x, y, yerr in data:
...         c2 += (f(x, a, b) - y)**2 / yerr**2
...     return c2
... 
>>> m = minuit.Minuit(chi2)
>>> m.migrad()
>>> m.hesse()
>>> m.values
{'a': -4.8538950636611844e-13, 'b': 1.0451612903223784}
>>> m.errors
{'a': 0.12247448677828, 'b': 0.045790546904858835}
>>> m.matrix(correlation=True)
((1.0, -0.89155582779537212), (-0.89155582779537224, 1.0))
>>> for x in 1, 2, 3, 4:
...     print x, f(x, *m.args)
... 
1 1.04516129032
2 2.09032258064
3 3.13548387097
4 4.18064516129

But with access to the low-level function minimization, you can do much more complicated fits, such as simultaneous fits to different distributions, which are difficult to express in a high-level application.


Sign in to add a comment