Pybroom API Documentation

This module contains pybroom’s 3 main functions: glance(), tidy() and augment().

These functions take one or multiple fit results as input and return a “tidy” (or long-form) DataFrame. The glance function returns fit statistics, one for each fit result (e.g. fit method, number of iterations, chi-square etc.). The tidy function returns data for each fitted parameter (e.g. fitted value, gradient, bounds, etc.). The augment function returns data with the same size as the fitted data points (evaluated best-fit model, residuals, etc.).

In the case of multiple fit results, pybroom functions accept a list, a dict or a nested structure of dicts and lists (for example a dict of lists of fit results). The examples below show some use cases.

Note

pybroom functions are particularly convenient when tidying a collection of fit results. The following applies to all 3 pybroom functions.

If results is a list of fit results (e.g. one per data replicate), the returned DataFrame has an additional “index” column containing the index of each fit result in the list. If results is a dict of fit results (e.g. results from different fit methods or models on the same dataset), the “index” column contains the keys of the dict (each key identifies a fit result). In both cases, var_names should contain the name of the “index” column (a string).

Nested structures are also possible. For example, when fitting a list of datasets with different methods, we can build a dict of lists of fit results where the dict keys are the method names and the items in each list are the fit results for the different datasets. In this case the returned DataFrame has two additional “index” columns: one with the dict keys and one with the list index. The tuple (key, list index) identifies each single fit result, and var_names should be a list of column names for the key and index columns respectively (a list of strings).
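To make the mapping between nesting levels and “index” columns concrete, here is a minimal pure-Python sketch. The helper flatten_results is hypothetical (it is not part of pybroom, and pybroom's own implementation may differ); plain strings stand in for real fit-result objects.

```python
def flatten_results(results, index=()):
    """Yield (index_tuple, fit_result) pairs from a nested dict/list structure.

    Hypothetical helper for illustration only; not part of pybroom.
    """
    if isinstance(results, dict):
        items = results.items()
    elif isinstance(results, (list, tuple)):
        items = enumerate(results)
    else:
        # A leaf: a single fit result, identified by the accumulated index.
        yield index, results
        return
    for key, value in items:
        yield from flatten_results(value, index + (key,))


# Strings stand in for real fit-result objects.
results = {'A': ['fit_res1', 'fit_res2'], 'B': ['fit_res3', 'fit_res4']}
var_names = ['function', 'dataset']  # one name per nesting level

# Each leaf gets one value per "index" column: (dict key, list index).
rows = [dict(zip(var_names, idx), result=res)
        for idx, res in flatten_results(results)]
for row in rows:
    print(row)
```

The outer level (dict keys) pairs with the first name in var_names and the inner level (list positions) with the second, which is why var_names needs one entry per nesting level.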

Example

The following examples show pybroom output when multiple fit results are used. The glance function is used as an example, but the same logic (and input arguments) also applies to tidy and augment.

Input is a list of fit results:

>>> results = [fit_res1, fit_res2, fit_res3]
>>> br.glance(results, var_names='dataset')

  num_params num_data_points      redchi      AIC  dataset
0          6             101  0.00911793 -468.634        0
1          6             101  0.00996431 -459.669        1
2          6             101   0.0109456 -450.183        2

Input is a dict of fit results:

>>> results = {'A': fit_res1, 'B': fit_res2, 'C': fit_res3}
>>> br.glance(results, var_names='function')

  num_params num_data_points      redchi      AIC function
0          6             101  0.00911793 -468.634        A
1          6             101  0.00996431 -459.669        B
2          6             101   0.0109456 -450.183        C

Input is a dict of lists of fit results:

>>> results = {'A': [fit_res1, fit_res2], 'B': [fit_res3, fit_res4]}
>>> br.glance(results, var_names=['function', 'dataset'])

  num_params num_data_points      redchi      AIC  dataset function
0          6             101  0.00911793 -468.634        0        A
1          6             101  0.00996431 -459.669        1        A
2          6             101   0.0109456 -450.183        0        B
3          6             101   0.0176529 -401.908        1        B
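Because the extra “index” columns are ordinary DataFrame columns, the combined output can be filtered and grouped with standard pandas operations. The sketch below rebuilds the last table by hand (the values are copied from the output above, not produced by a real fit) and shows a few typical follow-up steps.

```python
import pandas as pd

# Values copied from the dict-of-lists example output above.
df = pd.DataFrame({
    'num_params': [6, 6, 6, 6],
    'num_data_points': [101, 101, 101, 101],
    'redchi': [0.00911793, 0.00996431, 0.0109456, 0.0176529],
    'AIC': [-468.634, -459.669, -450.183, -401.908],
    'dataset': [0, 1, 0, 1],
    'function': ['A', 'A', 'B', 'B'],
})

# Select a single fit result through its (key, list index) pair.
one_fit = df[(df['function'] == 'B') & (df['dataset'] == 1)]

# Compare the two functions by averaging over the datasets.
mean_redchi = df.groupby('function')['redchi'].mean()

# For each dataset, find the function with the lowest AIC.
best = df.loc[df.groupby('dataset')['AIC'].idxmin(), ['dataset', 'function']]

print(one_fit)
print(mean_redchi)
print(best)
```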

Main Functions

The 3 high-level functions glance(), tidy() and augment() allow tidying one or more fit results. These are pybroom’s most generic functions, accepting all the supported fit result objects, as well as a list/dict of such objects. See also the examples at the beginning of this page and the example notebooks.

pybroom.glance(results, var_names='key', **kwargs)

Tidy DataFrame containing fit summaries from results.

A function to tidy any of the supported fit results (or a list of fit results). This function identifies the input type and calls the corresponding “specialized” tidying function. When the input is a list, the returned DataFrame contains data from all the fit results. Supported fit result objects are lmfit.ModelResult, lmfit.MinimizerResult and scipy.optimize.OptimizeResult.

Parameters:
  • results (fit result object or list) – one of the supported fit result objects or a list of supported fit result objects. When a list, all the elements need to be of the same type.
  • var_names (string or list) – name(s) of the column(s) containing an “index” that is different for each element in the set of fit results.
  • **kwargs – additional arguments passed to the underlying specialized tidying function.
Returns:

A DataFrame with one row for each passed fit result. Columns include fit summaries such as reduced chi-square, number of evaluations, successful convergence, AIC, BIC, etc. When a list of fit-result objects is passed, the column named by var_names (‘key’ by default) contains the index of the object in the list.

See also

For more details on the returned DataFrame and on additional arguments refer to the specialized tidying functions: glance_lmfit_result() and glance_scipy_result().
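The “identify the input type, then call the specialized function” pattern described above can be sketched with functools.singledispatch. This is only an illustration of the idea, not pybroom's implementation: the two Fake* classes are hypothetical stand-ins for lmfit and scipy result objects, and rows are plain dicts rather than a DataFrame.

```python
from functools import singledispatch


class FakeLmfitResult:
    """Hypothetical stand-in for an lmfit result object."""
    redchi = 0.0091


class FakeScipyResult:
    """Hypothetical stand-in for a scipy OptimizeResult."""
    cost = 0.46


@singledispatch
def glance_sketch(result):
    """Return a list of row dicts summarizing ``result``."""
    raise TypeError(f'unsupported fit result: {type(result).__name__}')


@glance_sketch.register(FakeLmfitResult)
def _glance_lmfit(result):
    # "Specialized" function for the lmfit-like object.
    return [{'redchi': result.redchi}]


@glance_sketch.register(FakeScipyResult)
def _glance_scipy(result):
    # "Specialized" function for the scipy-like object.
    return [{'cost': result.cost}]


@glance_sketch.register(list)
def _glance_list(results):
    # One row per element; the list position becomes the "index" value.
    rows = []
    for i, res in enumerate(results):
        for row in glance_sketch(res):
            rows.append({**row, 'key': i})
    return rows


rows = glance_sketch([FakeLmfitResult(), FakeLmfitResult()])
print(rows)
```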

pybroom.tidy(result, var_names='key', **kwargs)

Tidy DataFrame containing fitted parameter data from result.

A function to tidy any of the supported fit results (or a list of fit results). This function identifies the input type and calls the corresponding “specialized” tidying function. When the input is a list, the returned DataFrame contains data from all the fit results. Supported fit result objects are lmfit.ModelResult, lmfit.MinimizerResult and scipy.optimize.OptimizeResult.

Parameters:
  • result (fit result object or list) – one of the supported fit result objects or a list of supported fit result objects. When a list, all the elements need to be of the same type.
  • var_names (string or list) – name(s) of the column(s) containing an “index” that is different for each element in the set of fit results.
  • param_names (string or list of strings) – names of the fitted parameters, for fit results which don’t include parameter names (such as scipy’s OptimizeResult). It can either be a list of strings or a single string with space-separated names.
  • **kwargs – additional arguments passed to the underlying specialized tidying function.
Returns:

A DataFrame with one row for each fitted parameter. Columns include parameter properties such as best-fit value, standard error, any bounds/constraints, etc. When a list of fit-result objects is passed, the column named by var_names (‘key’ by default) contains the index of the object in the list.

See also

For more details on the returned DataFrame and on additional arguments refer to the specialized tidying functions: tidy_lmfit_result() and tidy_scipy_result().

pybroom.augment(results, var_names='key', **kwargs)

Tidy DataFrame containing fit data from results.

A function to tidy any of the supported fit results (or a list of fit results). This function identifies the input type and calls the corresponding “specialized” tidying function. When the input is a list or a dict of fit results, the returned DataFrame contains data from all the fit results. In this case data from different fit results is identified by the values in the additional “index” (or categorical) column(s) whose name(s) are specified in var_names.

Parameters:
  • results (fit result object or list) – one of the supported fit result objects or a list of supported fit result objects. When a list, all the elements need to be of the same type.
  • var_names (string or list) – name(s) of the column(s) containing an “index” that is different for each element in the set of fit results. See the examples at the beginning of this page.
  • **kwargs – additional arguments passed to the underlying specialized tidying function.
Returns:

A DataFrame with one row for each data point used in the fit. It contains the input data, the model evaluated at the data points with the best-fit parameters, error ranges, etc. When a list of fit-result objects is passed, the column named by var_names (‘key’ by default) contains the index of the object in the list.

Dictionary conversions

The two functions tidy_to_dict() and dict_to_tidy() convert a tidy DataFrame to and from a Python dictionary.

pybroom.tidy_to_dict(df, key='name', value='value', keys_exclude=None, cast_value=<class 'float'>)

Convert a tidy DataFrame into a dictionary.

This function converts two columns from an input tidy (or long-form) DataFrame into a dictionary. A typical use case is passing parameters stored in a tidy DataFrame to a Python function. The arguments key and value contain the names of the DataFrame columns holding the keys and the values of the dictionary.

Parameters:
  • df (pandas.DataFrame) – the “tidy” DataFrame containing the data. Two columns of this DataFrame should contain the keys and the values to construct the dictionary.
  • key (string or scalar) – name of the DataFrame column containing the keys of the dictionary.
  • value (string or scalar) – name of the DataFrame column containing the values of the dictionary.
  • keys_exclude (iterable or None) – list of keys excluded when building the returned dictionary.
  • cast_value (callable or None) – callable used to cast the value of each item in the dictionary. If None, no casting is performed and the resulting values are 1-element pandas.Series. Default is the python built-in float. Other typical values may be int or str.
Returns:

A dictionary with keys and values extracted from the input (tidy) DataFrame.

See also: dict_to_tidy().
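The behaviour described above can be approximated in a few lines of pandas. The function tidy_to_dict_sketch below is a hypothetical re-implementation written for illustration, not pybroom's own code, and it differs in one documented detail: with cast_value=None it returns the raw scalar values rather than 1-element pandas.Series. It also assumes each key appears in only one row, as for the parameters of a single fit result.

```python
import math

import pandas as pd


def tidy_to_dict_sketch(df, key='name', value='value', keys_exclude=None,
                        cast_value=float):
    """Illustrative re-implementation; pybroom's own code may differ."""
    keys_exclude = set() if keys_exclude is None else set(keys_exclude)
    out = {}
    for k, v in zip(df[key], df[value]):
        if k in keys_exclude:
            continue
        out[k] = v if cast_value is None else cast_value(v)
    return out


# A tidy parameter table, as tidy() might return for a single fit result.
params = pd.DataFrame({'name': ['amplitude', 'tau', 'offset'],
                       'value': [1.5, 0.2, 10.0],
                       'vary': [True, True, False]})

# Drop 'offset' because the function below does not accept it.
kwargs = tidy_to_dict_sketch(params, keys_exclude=['offset'])


def decay(t, amplitude, tau):
    """Toy model whose keyword arguments match the parameter names."""
    return amplitude * math.exp(-t / tau)


# The typical use case: pass the stored parameters to a Python function.
y = decay(0.0, **kwargs)
print(kwargs, y)
```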

pybroom.dict_to_tidy(dc, key='name', value='value', keys_exclude=None)

Convert a dictionary into a tidy DataFrame.

This function converts a dictionary into a “tidy” (or long-form) DataFrame with two columns: one containing the keys and the other containing the values from the dictionary. Names of the columns can be specified with the key and value arguments.

Parameters:
  • dc (dict) – the input dictionary used to build the DataFrame.
  • key (string or scalar) – name of the DataFrame column containing the keys of the dictionary.
  • value (string or scalar) – name of the DataFrame column containing the values of the dictionary.
  • keys_exclude (iterable or None) – list of keys excluded when building the returned DataFrame.
Returns:

A two-column tidy DataFrame containing the data in the dictionary.

See also: tidy_to_dict().
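As a rough illustration of the conversion described above, the following sketch builds the two-column DataFrame directly with pandas. The function dict_to_tidy_sketch is hypothetical and written only for this example; pybroom's own implementation may differ in details such as row order or dtypes.

```python
import pandas as pd


def dict_to_tidy_sketch(dc, key='name', value='value', keys_exclude=None):
    """Illustrative re-implementation; pybroom's own code may differ."""
    keys_exclude = set() if keys_exclude is None else set(keys_exclude)
    # One row per dictionary item, skipping the excluded keys.
    rows = [(k, v) for k, v in dc.items() if k not in keys_exclude]
    return pd.DataFrame(rows, columns=[key, value])


df = dict_to_tidy_sketch({'amplitude': 1.5, 'tau': 0.2, 'offset': 10.0},
                         keys_exclude=['offset'])
print(df)
```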

Specialized functions

These are the specialized (i.e. low-level) functions, each converting one specific object to a tidy DataFrame.

pybroom.glance_scipy_result(result)

Tidy summary statistics from scipy’s OptimizeResult.

Normally this function is not called directly but invoked by the general purpose function glance().

Parameters:
  • result (OptimizeResult) – the fit result object.
Returns:

A DataFrame in tidy format with one row and several summary statistics as columns.

Note

Possible columns of the returned DataFrame include:

  • success (bool): whether the fit succeeded
  • cost (float): value of the cost function
  • optimality (float): optimality parameter as returned by scipy.optimize.least_squares.
  • nfev (int): number of objective function evaluations
  • njev (int): number of Jacobian evaluations
  • nit (int): number of iterations
  • status (int): status returned by the fit routine
  • message (string): message returned by the fit routine

pybroom.tidy_scipy_result(result, param_names, **kwargs)

Tidy parameters data from scipy’s OptimizeResult.

Normally this function is not called directly but invoked by the general purpose function tidy(). Since OptimizeResult has a raw array of fitted parameters but no names, the parameters’ names need to be passed in param_names.

Parameters:
  • result (OptimizeResult) – the fit result object.
  • param_names (string or list of strings) – names of the fitted parameters. It can either be a list of strings or a single string with space-separated names.
Returns:

A DataFrame in tidy format with one row for each parameter.
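The two accepted forms of param_names, and the way names are paired with the raw array of fitted values, can be sketched as follows. The helper name_parameters is hypothetical (not part of pybroom), and a plain list stands in for the x attribute of an OptimizeResult.

```python
def name_parameters(values, param_names):
    """Pair raw fitted values with names (hypothetical helper, not pybroom API)."""
    if isinstance(param_names, str):
        # A single string holds space-separated names.
        param_names = param_names.split()
    if len(param_names) != len(values):
        raise ValueError('expected one name per fitted parameter')
    return [{'name': n, 'value': v} for n, v in zip(param_names, values)]


fitted = [1.5, 0.2, 10.0]  # stands in for OptimizeResult.x

rows_from_str = name_parameters(fitted, 'amplitude tau offset')
rows_from_list = name_parameters(fitted, ['amplitude', 'tau', 'offset'])
print(rows_from_str)
```

Both calls give the same rows; the names must be listed in the same order as the values in the fitted array.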

Note

These two columns are always present in the returned DataFrame:

  • name (string): name of the parameter.
  • value (number): value of the parameter after the optimization.

Optional columns (depending on the type of result) are:

  • grad (float): gradient for each parameter
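  • active_mask (int): whether the bound constraint on each parameter is active, as returned by scipy.optimize.least_squares.

pybroom.glance_lmfit_result(result)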

Tidy summary statistics from lmfit’s ModelResult or MinimizerResult.

Normally this function is not called directly but invoked by the general purpose function glance().

Parameters:
  • result (ModelResult or MinimizerResult) – the fit result object.
Returns:

A DataFrame in tidy format with one row and several summary statistics as columns.

Note

The columns of the returned DataFrame are:

  • model (string): model name (only for ModelResult)
  • method (string): method used for the optimization (e.g. leastsq).
  • num_params (int): number of varied parameters
  • ndata (int): number of data points.
  • chisqr (float): chi-square statistic.
  • redchi (float): reduced chi-square statistic.
  • AIC (float): Akaike Information Criterion statistic.
  • BIC (float): Bayesian Information Criterion statistic.
  • num_func_eval (int): number of evaluations of the objective function during the fit.
  • num_data_points (int): number of data points (e.g. samples) used for the fit.

pybroom.tidy_lmfit_result(result)

Tidy parameters from lmfit’s ModelResult or MinimizerResult.

Normally this function is not called directly but invoked by the general purpose function tidy().

Parameters:
  • result (ModelResult or MinimizerResult) – the fit result object.
Returns:

A DataFrame in tidy format with one row for each parameter.

Note

The (possible) columns of the returned DataFrame are:

  • name (string): name of the parameter.
  • value (number): value of the parameter after the optimization.
  • init_value (number): initial value of the parameter before the optimization.
  • min, max (numbers): bounds of the parameter
  • vary (bool): whether the parameter has been varied during the optimization.
  • expr (string): constraint expression for the parameter.
  • stderr (float): standard error for the parameter.

pybroom._augment_lmfit_modelresult(result)

Tidy data values and fitted model from lmfit.model.ModelResult.