Pybroom API Documentation

This module contains pybroom’s three main functions:

glance()
tidy()
augment()

These functions take one or multiple fit results as input and return a “tidy” (or long-form) DataFrame. The glance function returns fit statistics, one for each fit result (e.g. fit method, number of iterations, chi-square etc.). The tidy function returns data for each fitted parameter (e.g. fitted value, gradient, bounds, etc.). The augment function returns data with the same size as the fitted data points (evaluated best-fit model, residuals, etc.).

In the case of multiple fit results, pybroom functions accept a list, a dict or a nested structure of dicts and lists (for example a dict of lists of fit results). The examples below show some use cases.

Note

pybroom functions are particularly convenient when tidying a collection of fit results. The following rules apply to all three pybroom functions.

If results is a list of fit results, one per dataset (e.g. data replicates), the returned DataFrame has an additional “index” column containing the index of the dataset in the list. If results is a dict of fit results (e.g. results from different fit methods or models on the same dataset), the “index” column contains the keys of the dict (each key identifies a fit result). In both cases, var_names should contain the name of the “index” column (a string).

Nested structures are also possible. For example, when fitting a list of datasets with different methods, we can build a dict of lists of fit results, where the dict keys are the method names and the list items are the fit results for the different datasets. In this case the returned DataFrame has two additional “index” columns: one with the dict keys and one with the list index. The tuple (key, list index) identifies each single fit result, and var_names should be a list of column names for the key and index columns respectively (a list of strings).

Example

The following examples show pybroom output when multiple fit results are used. The glance function is used as an example, but the same logic (and input arguments) also applies to tidy and augment.

Input is a list of fit results:

>>> results = [fit_res1, fit_res2, fit_res3]
>>> br.glance(results, var_names='dataset')

  num_params num_data_points      redchi      AIC  dataset
0          6             101  0.00911793 -468.634        0
1          6             101  0.00996431 -459.669        1
2          6             101   0.0109456 -450.183        2

Input is a dict of fit results:

>>> results = {'A': fit_res1, 'B': fit_res2, 'C': fit_res3}
>>> br.glance(results, var_names='function')

  num_params num_data_points      redchi      AIC function
0          6             101  0.00911793 -468.634        A
1          6             101  0.00996431 -459.669        B
2          6             101   0.0109456 -450.183        C

Input is a dict of lists of fit results:

>>> results = {'A': [fit_res1, fit_res2], 'B': [fit_res3, fit_res4]}
>>> br.glance(results, var_names=['function', 'dataset'])

  num_params num_data_points      redchi      AIC  dataset function
0          6             101  0.00911793 -468.634        0        A
1          6             101  0.00996431 -459.669        1        A
2          6             101   0.0109456 -450.183        0        B
3          6             101   0.0176529 -401.908        1        B
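The mapping from a nested input to “index” columns can be sketched in plain Python. This is only an illustration of the behaviour described above, not pybroom’s actual implementation: the helper name flatten_results is hypothetical, and placeholder strings stand in for real fit result objects.

```python
def flatten_results(results, var_names):
    """Yield (index_values, fit_result) pairs from nested dicts/lists.

    index_values maps each name in var_names to the dict key or list
    position found at the corresponding nesting level.
    """
    if isinstance(var_names, str):
        var_names = [var_names]

    def walk(node, depth, index):
        if depth == len(var_names):
            # All nesting levels consumed: node is a single fit result.
            yield dict(index), node
            return
        # Dicts contribute their keys, lists contribute positions.
        items = node.items() if isinstance(node, dict) else enumerate(node)
        for key, child in items:
            yield from walk(child, depth + 1, index + [(var_names[depth], key)])

    yield from walk(results, 0, [])


# Placeholder strings stand in for real fit result objects.
results = {'A': ['fit_res1', 'fit_res2'], 'B': ['fit_res3', 'fit_res4']}
rows = [index for index, _ in flatten_results(results, ['function', 'dataset'])]
print(rows)
```

Each yielded dict corresponds to the extra columns of one row in the tables above: one entry per name in var_names, in nesting order.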

Main Functions

The three high-level functions glance(), tidy() and augment() allow tidying one or more fit results. These are pybroom’s most generic functions, accepting all the supported fit result objects, as well as a list/dict of such objects. See also the examples at the beginning of this page and the example notebooks.

pybroom.glance(results, var_names='key', **kwargs)

Tidy DataFrame containing fit summaries from results.

A function to tidy any of the supported fit results (or a list of fit results). This function identifies the input type and calls the corresponding “specialized” tidying function. When the input is a list, the returned DataFrame contains data from all the fit results. Supported fit result objects are lmfit.ModelResult, lmfit.MinimizerResult and scipy.optimize.OptimizeResult.

Parameters:

results (fit result object or list) – one of the supported fit result objects or a list of supported fit result objects. When a list, all the elements need to be of the same type.
var_names (string or list) – name(s) of the column(s) containing an “index” that is different for each element in the set of fit results.
**kwargs – additional arguments passed to the underlying specialized tidying function.

Returns:

A DataFrame with one row for each passed fit result. Columns include fit summaries such as reduced chi-square, number of evaluations, successful convergence, AIC, BIC, etc. When a list of fit-result objects is passed, the column named by var_names (‘key’ by default) contains the index of the object in the list.

Dictionary conversions

The two functions tidy_to_dict() and dict_to_tidy() provide the ability to convert a tidy DataFrame to and from a Python dictionary.

pybroom.tidy_to_dict(df, key='name', value='value', keys_exclude=None, cast_value=<class 'float'>)

Convert a tidy DataFrame into a dictionary.

This function converts two columns from an input tidy (or long-form) DataFrame into a dictionary. A typical use case is passing parameters stored in a tidy DataFrame to a Python function. The arguments key and value contain the names of the DataFrame columns holding the keys and the values of the dictionary.

Parameters:

df (pandas.DataFrame) – the “tidy” DataFrame containing the data. Two columns of this DataFrame should contain the keys and the values to construct the dictionary.
key (string or scalar) – name of the DataFrame column containing the keys of the dictionary.
value (string or scalar) – name of the DataFrame column containing the values of the dictionary.
keys_exclude (iterable or None) – list of keys excluded when building the returned dictionary.
cast_value (callable or None) – callable used to cast the value of each item in the dictionary. If None, no casting is performed and the resulting values are 1-element pandas.Series. Default is the Python built-in float. Other typical values are int or str.

Returns:

A dictionary with keys and values extracted from the input (tidy) DataFrame.
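The conversion described above can be approximated with plain pandas. The sketch below is not pybroom’s implementation: tidy_to_dict_sketch, the line model and the sample parameter table are all hypothetical, and unlike the documented function it returns the raw scalar (not a 1-element Series) when cast_value is None.

```python
import pandas as pd


def tidy_to_dict_sketch(df, key='name', value='value',
                        keys_exclude=None, cast_value=float):
    """Build a dict from two columns of a tidy DataFrame (illustrative only)."""
    excluded = set(keys_exclude) if keys_exclude is not None else set()
    out = {}
    for k, v in zip(df[key], df[value]):
        if k in excluded:
            continue  # skip keys the caller asked to leave out
        out[k] = cast_value(v) if cast_value is not None else v
    return out


def line(x, slope, intercept):
    """Hypothetical model function that takes parameters as keyword arguments."""
    return slope * x + intercept


# Hypothetical tidy table of fitted parameters, one row per parameter.
params = pd.DataFrame({
    'name': ['slope', 'intercept', 'noise'],
    'value': [2.0, 1.0, 0.1],
})

kwargs = tidy_to_dict_sketch(params, keys_exclude=['noise'])
y = line(3.0, **kwargs)
print(kwargs, y)
```

Excluding keys is useful when the table holds parameters that the target function does not accept, as with the noise row here.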

Specialized functions

These are the specialized (i.e. low-level) functions, each converting one specific object to a tidy DataFrame.

pybroom.glance_scipy_result(result)

Tidy summary statistics from scipy’s OptimizeResult.

Normally this function is not called directly but invoked by the general purpose function glance().

Parameters:	result (OptimizeResult) – the fit result object.
Returns:	A DataFrame in tidy format with one row and several summary statistics as columns.

Note

Possible columns of the returned DataFrame include:

success (bool): whether the fit succeeded
cost (float): cost function
optimality (float): optimality parameter as returned by scipy.optimize.least_squares.
nfev (int): number of objective function evaluations
njev (int): number of jacobian function evaluations
nit (int): number of iterations
status (int): status returned by the fit routine
message (string): message returned by the fit routine
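The columns are listed as “possible” because an OptimizeResult only carries the fields that the chosen scipy routine fills in. The sketch below shows this with scipy.optimize.least_squares, collecting whichever of the listed fields are present into a plain dict. It does not call pybroom; the residuals function and the synthetic data are assumptions made for the illustration.

```python
import numpy as np
from scipy.optimize import least_squares


def residuals(params, x, y):
    """Residuals of a straight line against the data (hypothetical model)."""
    slope, intercept = params
    return slope * x + intercept - y


# Synthetic, noise-free data from the line y = 2x + 1.
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0

result = least_squares(residuals, x0=[1.0, 0.0], args=(x, y))

# OptimizeResult is a dict subclass, so membership tests work on field names.
fields = ['success', 'cost', 'optimality', 'nfev', 'njev', 'nit',
          'status', 'message']
summary = {name: result[name] for name in fields if name in result}
print(summary)
```

Fields that the routine does not provide are simply missing from summary, which mirrors why the set of columns can differ between fit methods.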

pybroom.tidy_scipy_result(result, param_names, **kwargs)

Tidy parameters data from scipy’s OptimizeResult.

Normally this function is not called directly but invoked by the general purpose function tidy(). Since OptimizeResult has a raw array of fitted parameters but no names, the parameters’ names need to be passed in param_names.

Parameters:

result (OptimizeResult) – the fit result object.
param_names (string or list of string) – names of the fitted parameters. It can either be a list of strings or a single string with space-separated names.

Returns:

A DataFrame in tidy format with one row for each parameter.

Note

These two columns are always present in the returned DataFrame:

name (string): name of the parameter.
value (number): value of the parameter after the optimization.

Optional columns (depending on the type of result) are:

grad (float): gradient for each parameter
active_mask (int)
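The two accepted forms of param_names, and the two columns that are always present, can be sketched without pybroom or scipy. The helper name_parameters is hypothetical, and a SimpleNamespace with an x attribute stands in for a real OptimizeResult.

```python
from types import SimpleNamespace


def name_parameters(result, param_names):
    """Pair parameter names with the raw fitted values in result.x."""
    if isinstance(param_names, str):
        # A single string holds space-separated names.
        param_names = param_names.split()
    values = list(result.x)
    if len(param_names) != len(values):
        raise ValueError('expected %d names, got %d'
                         % (len(values), len(param_names)))
    return [{'name': n, 'value': v} for n, v in zip(param_names, values)]


# Stand-in for an OptimizeResult: only the array of fitted values is needed.
fake_result = SimpleNamespace(x=[2.0, 1.0])

rows_from_string = name_parameters(fake_result, 'slope intercept')
rows_from_list = name_parameters(fake_result, ['slope', 'intercept'])
print(rows_from_string)
```

Both calls produce the same rows, one per parameter, in the order of the fitted values.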

pybroom.glance_lmfit_result(result)

Tidy summary statistics from lmfit’s ModelResult or MinimizerResult.

Normally this function is not called directly but invoked by the general purpose function glance().

Parameters:	result (ModelResult or MinimizerResult) – the fit result object.
Returns:	A DataFrame in tidy format with one row and several summary statistics as columns.

Note

The columns of the returned DataFrame are:

model (string): model name (only for ModelResult)
method (string): method used for the optimization (e.g. leastsq).
num_params (int): number of varied parameters
ndata (int):
chisqr (float): chi-square statistic.
redchi (float): reduced chi-square statistic.
AIC (float): Akaike Information Criterion statistic.
BIC (float): Bayesian Information Criterion statistic.
num_func_eval (int): number of evaluations of the objective function during the fit.
num_data_points (int): number of data points (e.g. samples) used for the fit.

pybroom.tidy_lmfit_result(result)

Tidy parameters from lmfit’s ModelResult or MinimizerResult.

Normally this function is not called directly but invoked by the general purpose function tidy().

Parameters:	result (ModelResult or MinimizerResult) – the fit result object.
Returns:	A DataFrame in tidy format with one row for each parameter.

Note

The (possible) columns of the returned DataFrame are:

name (string): name of the parameter.
value (number): value of the parameter after the optimization.
init_value (number): initial value of the parameter before the optimization.
min, max (numbers): bounds of the parameter
vary (bool): whether the parameter has been varied during the optimization.
expr (string): constraint expression for the parameter.
stderr (float): standard error for the parameter.

pybroom._augment_lmfit_modelresult(result)

Tidy data values and fitted model from lmfit.model.ModelResult.