API Reference#

This reference provides detailed documentation for all modules, classes, and methods in the current release of Bambi.

bambi.models#

class bambi.models.Model(formula, data, family='gaussian', priors=None, link=None, categorical=None, potentials=None, dropna=False, auto_scale=True, noncentered=True, center_predictors=True, extra_namespace=None)[source]#

Specification of model class.

Parameters:
  • formula (str or bambi.formula.Formula) – A model description written using the formula syntax from the formulae library.

  • data (pandas.DataFrame) – A pandas dataframe containing the data on which the model will be fit, with column names matching variables defined in the formula.

  • family (str or bambi.families.Family) – A specification of the model family (analogous to the family object in R). Either a string, or an instance of class bambi.families.Family. If a string is passed, a family with the corresponding name must be defined in the defaults loaded at Model initialization. Valid pre-defined families are "bernoulli", "beta", "binomial", "categorical", "gamma", "gaussian", "negativebinomial", "poisson", "t", and "wald". Defaults to "gaussian".

  • priors (dict) – Optional specification of priors for one or more terms. A dictionary where the keys are the names of terms in the model, “common,” or “group_specific” and the values are instances of class Prior. If priors are unset, uses automatic priors inspired by the R rstanarm library.

  • link (str or Dict[str, str]) – The name of the link function to use. Valid names are "cloglog", "identity", "inverse_squared", "inverse", "log", "logit", "probit", and "softmax". Not all the link functions can be used with all the families. If a dictionary, keys are the names of the target parameters and the values are the names of the link functions.

  • categorical (str or list) – The names of any variables to treat as categorical. Can be either a single variable name, or a list of names. If categorical is None, the data type of the columns in the data will be used to infer handling. In cases where numeric columns are to be treated as categorical (e.g., group specific factors coded as numerical IDs), explicitly passing variable names via this argument is recommended.

  • potentials (list of 2-tuples) – Optional specification of potentials. A potential is an arbitrary expression added to the likelihood. It is generally useful for adding constraints that are difficult to express otherwise. The first element of a 2-tuple is the name of a variable in the model, and the second is a lambda function expressing the desired constraint. If a constraint involves n variables, you can pass n 2-tuples, or pass a single tuple whose first element is an n-tuple of names and whose second element is a lambda function with n arguments. The number and order of the lambda function’s arguments must match the number and order of the variable names.

  • dropna (bool) – When True, rows with any missing values in either the predictors or outcome are automatically dropped from the dataset in a listwise manner.

  • auto_scale (bool) – If True (default), priors are automatically rescaled to the data (to be weakly informative) any time default priors are used. Note that any priors explicitly set by the user will always take precedence over default priors.

  • noncentered (bool) – If True (default), uses a non-centered parameterization for normal hyperpriors on grouped parameters. If False, naive (centered) parameterization is used.

  • extra_namespace (dict, optional) – Additional user supplied variables with transformations or data to include in the environment where the formula is evaluated. Defaults to None.
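The difference between the two settings of noncentered can be illustrated without any modelling library. The sketch below uses only the Python standard library; the names sigma, centered, offsets and noncentered are illustrative and are not part of Bambi. It assumes a zero-mean normal distribution for the group effects and shows that both parameterizations describe the same distribution, even though a sampler explores them differently.

```python
import random
import statistics

rng = random.Random(0)
sigma = 2.5        # hypothetical group-level standard deviation
n_draws = 20_000

# Centered: draw each group effect directly from Normal(0, sigma).
centered = [rng.gauss(0.0, sigma) for _ in range(n_draws)]

# Non-centered: draw a standard normal offset, then scale it by sigma.
offsets = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
noncentered = [sigma * z for z in offsets]

# Both parameterizations imply the same spread for the group effects.
sd_centered = statistics.stdev(centered)
sd_noncentered = statistics.stdev(noncentered)
print(round(sd_centered, 2), round(sd_noncentered, 2))
```

In the non-centered form the sampler works with the standard normal offsets rather than with the effects themselves, which tends to help when the group-level standard deviation is poorly identified.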

build()[source]#

Set up the model for sampling/fitting.

Creates an instance of the underlying PyMC model and adds all the necessary terms to it.

Return type:

None

fit(draws=1000, tune=1000, discard_tuned_samples=True, omit_offsets=True, include_mean=False, inference_method='mcmc', init='auto', n_init=50000, chains=None, cores=None, random_seed=None, **kwargs)[source]#

Fit the model using PyMC.

Parameters:
  • draws (int) – The number of samples to draw from the posterior distribution. Defaults to 1000.

  • tune (int) – Number of iterations to tune. Defaults to 1000. Samplers adjust the step sizes, scalings or similar during tuning. These tuning samples are drawn in addition to the number specified in the draws argument, and are discarded unless discard_tuned_samples is set to False.

  • discard_tuned_samples (bool) – Whether to discard posterior samples of the tune interval. Defaults to True.

  • omit_offsets (bool) – Omits offset terms in the InferenceData object returned when the model includes group specific effects. Defaults to True.

  • include_mean (bool) – Compute the posterior of the mean response. Defaults to False.

  • inference_method (str) – The method to use for fitting the model. Defaults to "mcmc", which automatically assigns an MCMC method best suited to each kind of variable, such as NUTS for continuous variables and Metropolis for non-binary discrete ones. With "vi", the model is fitted using variational inference as implemented in PyMC’s fit function. With "laplace", a Laplace approximation is used; this is not recommended other than for pedagogical use. To use the PyMC numpyro and blackjax samplers, pass "nuts_numpyro" or "nuts_blackjax" respectively. Both methods only work if NUTS sampling is possible, so the model must be differentiable.

  • init (str) – Initialization method. Defaults to "auto". The available methods are:

      • auto: Use "jitter+adapt_diag" and, if this method fails, use "adapt_diag".
      • adapt_diag: Start with an identity mass matrix and then adapt a diagonal based on the variance of the tuning samples. All chains use the test value (usually the prior mean) as starting point.
      • jitter+adapt_diag: Same as "adapt_diag", but use test value plus a uniform jitter in [-1, 1] as starting point in each chain.
      • advi+adapt_diag: Run ADVI and then adapt the resulting diagonal mass matrix based on the sample variance of the tuning samples.
      • advi+adapt_diag_grad: Run ADVI and then adapt the resulting diagonal mass matrix based on the variance of the gradients during tuning. This is experimental and might be removed in a future release.
      • advi: Run ADVI to estimate posterior mean and diagonal mass matrix.
      • advi_map: Initialize ADVI with MAP and use MAP as starting point.
      • map: Use the MAP as starting point. This is strongly discouraged.
      • adapt_full: Adapt a dense mass matrix using the sample covariances. All chains use the test value (usually the prior mean) as starting point.
      • jitter+adapt_full: Same as "adapt_full", but use test value plus a uniform jitter in [-1, 1] as starting point in each chain.

  • n_init (int) – Number of initialization iterations. Only works for "advi" init methods.

  • chains (int) – The number of chains to sample. Running independent chains is important for some convergence statistics and can also reveal multiple modes in the posterior. If None, then set to either cores or 2, whichever is larger.

  • cores (int) – The number of chains to run in parallel. If None, it is equal to the number of CPUs in the system unless there are more than 4 CPUs, in which case it is set to 4.

  • random_seed (int or list of ints) – Seed for the random number generator. A list is accepted if cores is greater than one.

  • **kwargs – For other kwargs see the documentation for pymc.sample().

Returns:

  • An ArviZ InferenceData instance if inference_method is "mcmc" (default), "nuts_numpyro", "nuts_blackjax" or "laplace".

  • An Approximation object if inference_method is "vi".
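The documented defaults for chains and cores interact, so a small sketch may help. The helper below is hypothetical (resolve_cores_and_chains is not a Bambi or PyMC function) and only mirrors the rules stated in the parameter descriptions above: cores defaults to the number of CPUs capped at 4, and chains defaults to the larger of cores and 2.

```python
import os


def resolve_cores_and_chains(chains=None, cores=None, cpu_count=None):
    """Hypothetical helper mirroring the documented defaults of fit()."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if cores is None:
        # Number of CPUs in the system, capped at 4.
        cores = min(cpu_count, 4)
    if chains is None:
        # Either cores or 2, whichever is larger.
        chains = max(cores, 2)
    return cores, chains


print(resolve_cores_and_chains(cpu_count=1))                     # (1, 2)
print(resolve_cores_and_chains(cpu_count=16))                    # (4, 4)
print(resolve_cores_and_chains(chains=6, cores=2, cpu_count=8))  # (2, 6)
```

Under these rules a single-CPU machine still runs two chains, which keeps multi-chain convergence statistics available.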

graph(formatting='plain', name=None, figsize=None, dpi=300, fmt='png')[source]#

Produce a graphviz Digraph from a built Bambi model.

Requires graphviz, which may be installed most easily with

conda install -c conda-forge python-graphviz

Alternatively, you may install the graphviz binaries yourself, and then pip install graphviz to get the python bindings. See http://graphviz.readthedocs.io/en/stable/manual.html for more information.

Parameters:
  • formatting (str) – One of "plain" or "plain_with_params". Defaults to "plain".

  • name (str) – Name of the figure to save. Defaults to None, no figure is saved.

  • figsize (tuple) – Maximum width and height of figure in inches. Defaults to None, the figure size is set automatically. If defined and the drawing is larger than the given size, the drawing is uniformly scaled down so that it fits within the given size. Only works if name is not None.

  • dpi (int) – Dots per inch of the figure to save. Defaults to 300. Only works if name is not None.

  • fmt (str) – Format of the figure to save. Defaults to "png". Only works if name is not None.

Returns:

The graph

Return type:

graphviz.Digraph

Example

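>>> # data: a pandas DataFrame with columns y, x and z.
>>> # Build the model explicitly, then draw its graph.
>>> model = Model("y ~ x + (1|z)", data)
>>> model.build()
>>> model.graph()
>>> # Alternatively, fit the model first and then draw its graph.
>>> model = Model("y ~ x + (1|z)", data)
>>> model.fit()
>>> model.graph()

plot_priors(draws=5000, var_names=None, random_seed=None, figsize=None, textsize=None, hdi_prob=None, round_to=2, point_estimate='mean', kind='kde', bins=None, omit_offsets=True, omit_group_specific=True, ax=None, **kwargs)[source]#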

Samples from the prior distribution and plots its marginals.

Parameters:
  • draws (int) – Number of draws to sample from the prior predictive distribution. Defaults to 5000.

  • var_names (str or list) – A list of names of variables for which to compute the prior predictive distribution. Defaults to None which means to include both observed and unobserved RVs.

  • random_seed (int) – Seed for the random number generator.

  • figsize (tuple) – Figure size. If None it will be defined automatically.

  • textsize (float) – Text size scaling factor for labels, titles and lines. If None it will be autoscaled based on figsize.

  • hdi_prob (float or str) – Plots highest density interval for chosen percentage of density. Use "hide" to hide the highest density interval. Defaults to 0.94.

  • round_to (int) – Controls formatting of floats. Defaults to 2 or the integer part, whichever is bigger.

  • point_estimate (str) – Plot point estimate per variable. Values should be "mean", "median", "mode" or None. Defaults to "mean".

  • kind (str) – Type of plot to display ("kde" or "hist"). For discrete variables this argument is ignored and a histogram is always used.

  • bins (integer or sequence or "auto") – Controls the number of bins, accepts the same keywords matplotlib.pyplot.hist() does. Only works if kind == "hist". If None (default) it will use "auto" for continuous variables and range(xmin, xmax + 1) for discrete variables.

  • omit_offsets (bool) – Whether to omit offset terms in the plot. Defaults to True.

  • omit_group_specific (bool) – Whether to omit group specific effects in the plot. Defaults to True.

  • ax (numpy array-like of matplotlib axes or bokeh figures) – A 2D array of locations into which to plot the densities. If not supplied, ArviZ will create its own array of plot areas (and return it).

  • **kwargs – Passed as-is to matplotlib.pyplot.hist() or matplotlib.pyplot.plot() function depending on the value of kind.

Returns:

axes

Return type:

matplotlib axes

predict(idata, kind='mean', data=None, inplace=True, include_group_specific=True, sample_new_groups=False)[source]#

Predict method for Bambi models

Obtains in-sample and out-of-sample predictions from a fitted Bambi model.

Parameters:
  • idata (InferenceData) – The InferenceData instance returned by .fit().

  • kind (str) – Indicates the type of prediction required. Can be "mean" or "pps". The first returns draws from the posterior distribution of the mean, while the latter returns draws from the posterior predictive distribution (i.e. the posterior probability distribution for a new observation) in addition to the posterior distribution of the mean. Defaults to "mean".

  • data (pandas.DataFrame or None) – An optional data frame with values for the predictors that are used to obtain out-of-sample predictions. If omitted, the original dataset is used.

  • inplace (bool) – If True it will modify idata in-place. Otherwise, it will return a copy of idata with the predictions added. If kind="mean", a new variable ending in "_mean" is added to the posterior group. If kind="pps", it appends a posterior_predictive group to idata. If any of these already exist, it will be overwritten.

  • include_group_specific (bool) – Determines if predictions incorporate group-specific effects. If False, predictions are made with common effects only (i.e. group-specific effects are set to zero). Defaults to True.

  • sample_new_groups (bool) – Specifies whether predictions may be obtained for new groups of group-specific terms. When True, each posterior sample for the new groups is drawn from the posterior draws of a randomly selected existing group. Since different groups may be selected at each draw, the end result represents the variation across existing groups. The method implemented is equivalent to sample_new_levels="uncertainty" in brms.

Return type:

InferenceData or None
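The resampling scheme described for sample_new_groups can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Bambi’s implementation: the posterior dictionary and the helper sample_new_group are made up for the example, and each existing group holds one value per posterior draw.

```python
import random

# Hypothetical posterior draws of a group-specific intercept,
# one list per existing group, one value per draw.
posterior = {
    "a": [0.9, 1.1, 1.0, 1.2],
    "b": [-0.5, -0.4, -0.6, -0.5],
    "c": [0.1, 0.0, 0.2, 0.1],
}


def sample_new_group(posterior, rng):
    """For every draw, copy the value of a randomly selected existing group."""
    groups = sorted(posterior)
    n_draws = len(posterior[groups[0]])
    return [posterior[rng.choice(groups)][i] for i in range(n_draws)]


new_group = sample_new_group(posterior, random.Random(1))
print(new_group)
```

Because a different existing group may be picked at each draw, the draws for the new group mix values from all existing groups and so reflect the variation across them.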

prior_predictive(draws=500, var_names=None, omit_offsets=True, random_seed=None)[source]#

Generate samples from the prior predictive distribution.

Parameters:
  • draws (int) – Number of draws to sample from the prior predictive distribution. Defaults to 500.

  • var_names (str or list) – A list of names of variables for which to compute the prior predictive distribution. Defaults to None which means both observed and unobserved RVs.

  • omit_offsets (bool) – Whether to omit offset terms in the returned InferenceData. Defaults to True.

  • random_seed (int) – Seed for the random number generator.

Returns:

InferenceData object with the groups prior, prior_predictive and observed_data.

Return type:

InferenceData

set_alias(aliases)[source]#

Set aliases for the terms and auxiliary parameters in the model

Parameters:

aliases (dict) – A dictionary where key represents the original term name and the value is the alias.

Return type:

None

set_priors(priors=None, common=None, group_specific=None)[source]#

Set priors for one or more existing terms.

Parameters:
  • priors (dict) – Dictionary of priors to update. Keys are names of terms to update; values are the new priors (either a Prior instance, or an int or float that scales the default priors).

  • common (Prior, int, or float) – A prior specification to apply to all common terms included in the model.

  • group_specific (Prior, int, or float) – A prior specification to apply to all group specific terms included in the model.

Return type:

None

bambi.priors#

Classes to represent prior distributions and methods to set automatic priors

class bambi.priors.Prior(name, auto_scale=True, dist=None, **kwargs)[source]#

Abstract specification of a term prior.

Parameters:
  • name (str) – Name of prior distribution. Must be the name of a PyMC distribution (e.g., "Normal", "Bernoulli", etc.)

  • auto_scale (bool) – Whether to adjust the parameters of the prior or use them as passed. Defaults to True.

  • kwargs (dict) – Optional keywords specifying the parameters of the named distribution.

  • dist (pymc.distributions.distribution.DistributionMeta or callable) – A callable that returns a valid PyMC distribution. The signature must contain name, dims, and shape, as well as its own keyworded arguments.

update(**kwargs)[source]#

Update the arguments of the prior with additional arguments.

Parameters:

kwargs (dict) – Optional keyword arguments to add to prior args.

class bambi.priors.PriorScaler(model)[source]#

Scale prior distributions parameters.

bambi.families#

Classes to construct model families.

class bambi.families.Family(name, likelihood, link: str | Dict[str, str | Link])[source]#

A specification of model family.

Parameters:
  • name (str) – The name of the family. It can be any string.

  • likelihood (Likelihood) – A bambi.families.Likelihood instance specifying the model likelihood function.

  • link (Union[str, Dict[str, Union[str, Link]]]) – The link function, or link functions, used for the parameters of the likelihood function. If a dictionary, keys are the names of the parameters and values are the link functions, each given either as a str with a name or as a bambi.families.Link instance. The link function transforms the linear predictors.

Examples

>>> import bambi as bmb

Replicate the Gaussian built-in family.

>>> sigma_prior = bmb.Prior("HalfNormal", sigma=1)
>>> likelihood = bmb.Likelihood("Normal", params=["mu", "sigma"], parent="mu")
>>> family = bmb.Family("gaussian", likelihood, "identity")
>>> bmb.Model("y ~ x", data, family=family, priors={"sigma": sigma_prior})

Replicate the Bernoulli built-in family.

>>> likelihood = bmb.Likelihood("Bernoulli", parent="p")
>>> family = bmb.Family("bernoulli", likelihood, "logit")
>>> bmb.Model("y ~ x", data, family=family)

posterior_predictive(model, posterior, **kwargs)[source]#

Get draws from the posterior predictive distribution

This function works for almost all the families. It grabs the draws for the parameters needed in the response distribution, and then gets samples from the posterior predictive distribution using pm.draw(). It won’t work when the response distribution requires parameters that are not available in posterior.

Parameters:
  • model (bambi.Model) – The model

  • posterior (xr.Dataset) – The xarray dataset that contains the draws for all the parameters in the posterior. It must contain the parameters that are needed in the distribution of the response, or the parameters from which they can be derived.

  • kwargs – Parameters that are used to get draws but do not appear in the posterior object or other configuration parameters. For instance, the ‘n’ in binomial models and multinomial models.

Returns:

A data array with the draws from the posterior predictive distribution

Return type:

xr.DataArray

set_default_priors(priors)[source]#

Set default priors for non-parent parameters

Parameters:

priors (dict) – The keys are the names of non-parent parameters and the values are their default priors.

class bambi.families.Likelihood(name, params=None, parent=None, dist=None)[source]#

Representation of a Likelihood function for a Bambi model.

Notes:

  • parent must be in params.

  • parent is inferred from the name if it is a known name.

Parameters:
  • name (str) – Name of the likelihood function. Must be a valid PyMC distribution name.

  • params (Sequence[str]) – The names of the parameters the likelihood function accepts.

  • parent (str) – Optional specification of the name of the mean parameter in the likelihood. This is the parameter whose transformation is modeled by the linear predictor.

  • dist (pymc.distributions.distribution.DistributionMeta or callable) – Optional custom PyMC distribution that will be used to compute the likelihood.

class bambi.families.Link(name, link=None, linkinv=None, linkinv_backend=None)[source]#

Representation of a link function.

This object contains two main functions. One is the link function itself, the function that maps values in the response scale to the linear predictor, and the other is the inverse of the link function, that maps values of the linear predictor to the response scale.

The great majority of users will never interact with this class unless they want to create a custom Family with a custom Link. This is automatically handled for all the built-in families.

Parameters:
  • name (str) – The name of the link function. If it is a known name, it’s not necessary to pass any other arguments because functions are already defined internally. If not known, all of link, linkinv and linkinv_backend must be specified.

  • link (function) – A function that maps the response to the linear predictor. Known as the \(g\) function in GLM jargon. Does not need to be specified when name is a known name.

  • linkinv (function) – A function that maps the linear predictor to the response. Known as the \(g^{-1}\) function in GLM jargon. Does not need to be specified when name is a known name.

  • linkinv_backend (function) – Same as linkinv, but it must work with the PyMC backend (i.e. it must work with PyTensor tensors). Does not need to be specified when name is a known name.
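To make the two directions concrete, here is a minimal sketch of a link and its inverse for the logit case, written with the standard library only. The function names logit and expit are chosen for the example; a custom link passed to Bambi would additionally need a linkinv_backend that works on PyTensor tensors, which is not shown here.

```python
import math


def logit(p):
    """Link g: maps a probability (response scale) to the linear predictor."""
    return math.log(p / (1.0 - p))


def expit(eta):
    """Inverse link g^{-1}: maps the linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


eta = logit(0.8)
print(eta)          # value on the linear predictor scale
print(expit(eta))   # back on the response scale, close to 0.8
print(expit(0.0))   # 0.5
```

Applying the link and then its inverse returns the original value up to floating point error, which is a quick sanity check for any custom pair of functions.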

bambi.data#

Code for loading datasets.

bambi.data.clear_data_home(data_home=None)[source]#

Delete all the content of the data home cache.

Parameters:

data_home (str) – The path to Bambi data dir. By default a folder named "bambi_data" in the user home folder.

bambi.data.load_data(dataset=None, data_home=None)[source]#

Load a dataset.

Run with no parameters to get a list of all available data sets.

The directory in which datasets are saved can also be set with the environment variable BAMBI_HOME. The checksum of the dataset is checked against a hardcoded value to watch for data corruption. Run bmb.clear_data_home() to clear the data directory.

Parameters:
  • dataset (str) – Name of dataset to load.

  • data_home (str, optional) – Where to save remote datasets

Return type:

pandas.DataFrame
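The two mechanisms mentioned above, choosing the data directory and verifying a checksum, can be sketched as follows. Both helpers are hypothetical and make assumptions not stated in this reference: the lookup order (explicit argument, then BAMBI_HOME, then a bambi_data folder in the user home) and the use of SHA-256 as the hash.

```python
import hashlib
import os
import tempfile


def resolve_data_home(data_home=None):
    """Assumed lookup order: argument, then BAMBI_HOME, then ~/bambi_data."""
    if data_home is None:
        data_home = os.environ.get("BAMBI_HOME", os.path.join("~", "bambi_data"))
    return os.path.expanduser(data_home)


def checksum_matches(path, expected_sha256):
    """Compare the SHA-256 digest of a file with a known value."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return digest == expected_sha256


# Write a toy file into a temporary data home and verify it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(resolve_data_home(tmp), "toy.csv")
    content = b"x,y\n1,2\n"
    with open(path, "wb") as f:
        f.write(content)
    ok = checksum_matches(path, hashlib.sha256(content).hexdigest())
    corrupted = checksum_matches(path, "0" * 64)

print(ok, corrupted)  # True False
```

A mismatch such as the second check is the kind of signal a loader can use to refuse a corrupted or partially downloaded file.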

bambi.interpret#

bambi.interpret.comparisons(model: Model, idata: InferenceData, contrast: str | dict, conditional: str | dict | list | None = None, average_by: str | list | bool | None = None, comparison_type: str = 'diff', use_hdi: bool = True, prob: float | None = None, transforms: dict | None = None) DataFrame[source]#

Compute Conditional Adjusted Comparisons

Parameters:
  • model (bambi.Model) – The model for which we want to compute the comparisons.

  • idata (arviz.InferenceData) – The InferenceData object that contains the samples from the posterior distribution of the model.

  • contrast (str, dict) – The predictor name whose contrast we would like to compare.

  • conditional (str, dict, list) – The covariates we would like to condition on.

  • average_by (str, list, bool, optional) – The covariates we would like to average by. The passed covariate(s) will marginalize over the other covariates in the model. If True, it averages over all covariates in the model to obtain the average estimate. Defaults to None.

  • comparison_type (str, optional) – The type of comparison to compute, either ‘diff’ or ‘ratio’. Defaults to ‘diff’.

  • use_hdi (bool, optional) – Whether to compute the highest density interval (defaults to True) or the quantiles.

  • prob (float, optional) – The probability for the credibility intervals. Must be between 0 and 1. Defaults to 0.94. Changing the global variable az.rcParams["stats.hdi_prob"] affects this default.

  • transforms (dict, optional) – Transformations that are applied to each of the variables being plotted. The keys are the name of the variables, and the values are functions to be applied. Defaults to None.

Returns:

A dataframe with the comparison values, highest density interval, contrast name, contrast value, and conditional values.

Return type:

pandas.DataFrame

Raises:

ValueError – If contrast is a dict with more than one key. If contrast is a dict with more than 2 values and conditional is None. If conditional is None and contrast is categorical with > 2 values. If comparison_type is not ‘diff’ or ‘ratio’. If prob is not > 0 and < 1.
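The two comparison types can be illustrated with a plain Python sketch. The draws and the helper compare are hypothetical, and the direction of the comparison (prediction at the second contrast value relative to the first) is an assumption made for the example rather than a statement about Bambi’s internals.

```python
# Hypothetical posterior draws of the mean response at two contrast values.
draws_at_low = [2.0, 2.2, 1.9, 2.1]
draws_at_high = [3.0, 3.3, 2.8, 3.2]


def compare(low, high, comparison_type="diff"):
    """Draw-by-draw comparison between two sets of predictions."""
    if comparison_type == "diff":
        return [h - l for l, h in zip(low, high)]
    if comparison_type == "ratio":
        return [h / l for l, h in zip(low, high)]
    raise ValueError("comparison_type must be 'diff' or 'ratio'")


diffs = compare(draws_at_low, draws_at_high, "diff")
ratios = compare(draws_at_low, draws_at_high, "ratio")
print(sum(diffs) / len(diffs))     # average difference across draws
print(sum(ratios) / len(ratios))   # average ratio across draws
```

Working draw by draw keeps the full posterior of the comparison, so an interval can be summarized from diffs or ratios afterwards.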

bambi.interpret.plot_comparisons(model: Model, idata: InferenceData, contrast: str | dict | list, conditional: str | dict | list | None = None, average_by: str | list | None = None, comparison_type: str = 'diff', use_hdi: bool = True, prob=None, legend: bool = True, transforms=None, ax=None, fig_kwargs=None, subplot_kwargs=None)[source]#

Plot Conditional Adjusted Comparisons

Parameters:
  • model (bambi.Model) – The model for which we want to plot the predictions.

  • idata (arviz.InferenceData) – The InferenceData object that contains the samples from the posterior distribution of the model.

  • contrast (str, dict, list) – The predictor name whose contrast we would like to compare.

  • conditional (str, dict, list) – The covariates we would like to condition on.

  • average_by (str, list, optional) – The covariates we would like to average by. The passed covariate(s) will marginalize over the other covariates in the model. Defaults to None.

  • comparison_type (str, optional) – The type of comparison to plot. Defaults to ‘diff’.

  • use_hdi (bool, optional) – Whether to compute the highest density interval (defaults to True) or the quantiles.

  • prob (float, optional) – The probability for the credibility intervals. Must be between 0 and 1. Defaults to 0.94. Changing the global variable az.rcParams["stats.hdi_prob"] affects this default.

  • legend (bool, optional) – Whether to automatically include a legend in the plot. Defaults to True.

  • transforms (dict, optional) – Transformations that are applied to each of the variables being plotted. The keys are the name of the variables, and the values are functions to be applied. Defaults to None.

  • ax (matplotlib.axes._subplots.AxesSubplot, optional) – A matplotlib axes object or a sequence of them. If None, this function instantiates a new axes object. Defaults to None.

  • fig_kwargs (optional) – Keyword arguments passed to the matplotlib figure function as a dict. For example, fig_kwargs=dict(figsize=(11, 8), sharey=True) would make the figure 11 inches wide by 8 inches high and would share the y-axis values.

  • subplot_kwargs (optional) – Keyword arguments used to determine the covariates used for the horizontal, group, and panel axes. For example, subplot_kwargs=dict(main="x", group="y", panel="z") would plot the horizontal axis as x, the color (hue) as y, and the panel axis as z.

Returns:

A tuple with the figure and the axes.

Return type:

matplotlib.figure.Figure, matplotlib.axes._subplots.AxesSubplot

Raises:
  • ValueError – If conditional and average_by are both None. If length of conditional is greater than 3 and average_by is None.

  • Warning – If length of contrast is greater than 2.
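The use_hdi and prob parameters select between two kinds of interval. The sketch below, using only the standard library, contrasts them on right-skewed draws. The functions quantile_interval and hdi are simplified illustrations written for this example; they are not the ArviZ routines that Bambi relies on.

```python
import math
import random


def quantile_interval(samples, prob=0.94):
    """Equal-tailed interval: leave (1 - prob) / 2 of the draws in each tail."""
    xs = sorted(samples)
    n = len(xs)
    lo = math.floor((1 - prob) / 2 * (n - 1))
    hi = math.ceil((1 + prob) / 2 * (n - 1))
    return xs[lo], xs[hi]


def hdi(samples, prob=0.94):
    """Narrowest interval that contains a fraction prob of the draws."""
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(prob * n)  # number of draws inside the interval
    widths = [xs[i + k - 1] - xs[i] for i in range(n - k + 1)]
    start = widths.index(min(widths))
    return xs[start], xs[start + k - 1]


rng = random.Random(0)
skewed = [rng.expovariate(1.0) for _ in range(5000)]  # right-skewed draws
q_lo, q_hi = quantile_interval(skewed)
h_lo, h_hi = hdi(skewed)
print((q_lo, q_hi), (h_lo, h_hi))
```

For a skewed distribution the highest density interval shifts toward the region of high density and is no wider than the equal-tailed one, whereas for a symmetric distribution the two nearly coincide.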

bambi.interpret.plot_predictions(model: Model, idata: InferenceData, covariates: str | list, target: str = 'mean', pps: bool = False, use_hdi: bool = True, prob=None, transforms=None, legend: bool = True, ax=None, fig_kwargs=None, subplot_kwargs=None)[source]#

Plot Conditional Adjusted Predictions

Parameters:
  • model (bambi.Model) – The model for which we want to plot the predictions.

  • idata (arviz.InferenceData) – The InferenceData object that contains the samples from the posterior distribution of the model.

  • covariates (list or dict) – A sequence of between one and three names of variables in the model.

  • target (str) – Which model parameter to plot. Defaults to ‘mean’. Passing a parameter into target only works when pps is False as the target may not be available in the posterior predictive distribution.

  • pps (bool, optional) – Whether to plot the posterior predictive samples. Defaults to False.

  • use_hdi (bool, optional) – Whether to compute the highest density interval (defaults to True) or the quantiles.

  • prob (float, optional) – The probability for the credibility intervals. Must be between 0 and 1. Defaults to 0.94. Changing the global variable az.rcParams["stats.hdi_prob"] affects this default.

  • legend (bool, optional) – Whether to automatically include a legend in the plot. Defaults to True.

  • transforms (dict, optional) – Transformations that are applied to each of the variables being plotted. The keys are the name of the variables, and the values are functions to be applied. Defaults to None.

  • ax (matplotlib.axes._subplots.AxesSubplot, optional) – A matplotlib axes object or a sequence of them. If None, this function instantiates a new axes object. Defaults to None.

  • fig_kwargs (optional) – Keyword arguments passed to the matplotlib figure function as a dict. For example, fig_kwargs=dict(figsize=(11, 8), sharey=True) would make the figure 11 inches wide by 8 inches high and would share the y-axis values.

  • subplot_kwargs (optional) – Keyword arguments used to determine the covariates used for the horizontal, group, and panel axes. For example, subplot_kwargs=dict(main="x", group="y", panel="z") would plot the horizontal axis as x, the color (hue) as y, and the panel axis as z.

Returns:

A tuple with the figure and the axes.

Return type:

matplotlib.figure.Figure, matplotlib.axes._subplots.AxesSubplot

Raises:
  • ValueError – When prob is not within 0 and 1. When the main covariate is not numeric or categorical.

  • TypeError – When covariates is not a string or a list of strings.
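The shape of the transforms argument, a dict from variable names to functions, can be sketched with plain Python data. The summary dictionary and the way the functions are applied here are assumptions for illustration only; they show the key and value convention, not how Bambi applies the transformations internally.

```python
import math

# Hypothetical values to be plotted, keyed by variable name.
summary = {
    "x": [1.0, 10.0, 100.0],
    "estimate": [0.0, 1.0, 2.0],
    "group": ["a", "b", "c"],
}

# Keys are variable names and values are the functions applied to them.
transforms = {"x": math.log10, "estimate": math.exp}

# Variables without an entry in transforms are left unchanged.
transformed = {
    name: [transforms[name](v) for v in values] if name in transforms else values
    for name, values in summary.items()
}
print(transformed["x"])      # [0.0, 1.0, 2.0]
print(transformed["group"])  # ['a', 'b', 'c']
```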

bambi.interpret.plot_slopes(model: Model, idata: InferenceData, wrt: str | dict, conditional: str | dict | list | None = None, average_by: str | list | None = None, eps: float = 0.0001, slope: str = 'dydx', use_hdi: bool = True, prob=None, transforms=None, legend: bool = True, ax=None, fig_kwargs=None, subplot_kwargs=None)[source]#

Plot Conditional Adjusted Slopes

Parameters:
  • model (bambi.Model) – The model for which we want to plot the predictions.

  • idata (arviz.InferenceData) – The InferenceData object that contains the samples from the posterior distribution of the model.

  • wrt (str, dict) – The slope of the regression with respect to (wrt) this predictor will be computed. If ‘wrt’ is numeric, the derivative is computed, else if string or categorical, ‘comparisons’ is called to compute difference in group means.

  • conditional (str, dict, list) – The covariates we would like to condition on.

  • average_by (str, list, bool, optional) – The covariates we would like to average by. The passed covariate(s) will marginalize over the other covariates in the model. If True, it averages over all covariates in the model to obtain the average estimate. Defaults to None.

  • eps (float, optional) – To compute the slope, ‘wrt’ is evaluated at wrt +/- ‘eps’. The rate of change is then computed as the difference between the two values divided by ‘eps’. Defaults to 1e-4.

  • slope (str, optional) – The type of slope to compute. Defaults to ‘dydx’.

      • ‘dydx’: a unit increase in ‘wrt’ is associated with an n-unit change in the response.
      • ‘eyex’: a percentage increase in ‘wrt’ is associated with an n-percent change in the response.
      • ‘eydx’: a unit increase in ‘wrt’ is associated with an n-percent change in the response.
      • ‘dyex’: a percentage increase in ‘wrt’ is associated with an n-unit change in the response.

  • use_hdi (bool, optional) – Whether to compute the highest density interval (defaults to True) or the quantiles.

  • prob (float, optional) – The probability for the credibility intervals. Must be between 0 and 1. Defaults to 0.94. Changing the global variable az.rcParams["stats.hdi_prob"] affects this default.

  • transforms (dict, optional) – Transformations that are applied to each of the variables being plotted. The keys are the name of the variables, and the values are functions to be applied. Defaults to None.

  • legend (bool, optional) – Whether to automatically include a legend in the plot. Defaults to True.

  • ax (matplotlib.axes._subplots.AxesSubplot, optional) – A matplotlib axes object or a sequence of them. If None, this function instantiates a new axes object. Defaults to None.

  • fig_kwargs (optional) – Keyword arguments passed to the matplotlib figure function as a dict. For example, fig_kwargs=dict(figsize=(11, 8), sharey=True) would make the figure 11 inches wide by 8 inches high and would share the y-axis values.

  • subplot_kwargs (optional) – Keyword arguments used to determine the covariates used for the horizontal, group, and panel axes. For example, subplot_kwargs=dict(main="x", group="y", panel="z") would plot the horizontal axis as x, the color (hue) as y, and the panel axis as z.

Returns:

A tuple with the figure and the axes.

Return type:

matplotlib.figure.Figure, matplotlib.axes._subplots.AxesSubplot

Raises:

ValueError – If the number of values passed with conditional is >= 2 and average_by is None. If conditional and average_by are both None. If length of conditional is greater than 3 and average_by is None. If slope is not one of (‘dydx’, ‘dyex’, ‘eyex’, ‘eydx’).
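The eps and slope parameters describe a finite-difference calculation, which the following sketch reproduces with the standard library. The helpers finite_difference and slope and the toy response function are hypothetical. The elasticity formulas are the conventional definitions of these quantities, and whether Bambi computes them in exactly this way is an assumption.

```python
import math


def finite_difference(f, x, eps=1e-4):
    """Change in f between x and x + eps, divided by eps."""
    return (f(x + eps) - f(x)) / eps


def slope(f, x, kind="dydx", eps=1e-4):
    """Unit-change slope and the elasticity variants derived from it."""
    dydx = finite_difference(f, x, eps)
    y = f(x)
    if kind == "dydx":
        return dydx            # unit change in x, unit change in y
    if kind == "eyex":
        return dydx * x / y    # percent change in x, percent change in y
    if kind == "eydx":
        return dydx / y        # unit change in x, percent change in y
    if kind == "dyex":
        return dydx * x        # percent change in x, unit change in y
    raise ValueError("slope must be one of 'dydx', 'dyex', 'eyex', 'eydx'")


def response(x):
    # Hypothetical mean response exp(0.5 * x), whose derivative is 0.5 * y.
    return math.exp(0.5 * x)


print(slope(response, 2.0, "dydx"))  # close to 0.5 * e
print(slope(response, 2.0, "eydx"))  # close to 0.5
print(slope(response, 2.0, "eyex"))  # close to 1.0
```

A smaller eps reduces the truncation error of the approximation, up to the point where floating point rounding starts to dominate.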

bambi.interpret.predictions(model: Model, idata: InferenceData, covariates: str | dict | list, target: str = 'mean', pps: bool = False, use_hdi: bool = True, prob=None, transforms=None) DataFrame[source]#

Compute Conditional Adjusted Predictions

Parameters:
  • model (bambi.Model) – The model for which we want to compute the predictions.

  • idata (arviz.InferenceData) – The InferenceData object that contains the samples from the posterior distribution of the model.

  • covariates (str, list or dict) – A single variable name, a sequence of between one and three names of variables, or a dict of length between one and three. If a sequence, the first variable is taken as the main variable and is mapped to the horizontal axis. If present, the second name is a coloring/grouping variable, and the third is mapped to different plot panels. If a dictionary, keys must be taken from (“main”, “group”, “panel”) and the values are the names of the variables.

  • target (str) – Which model parameter to compute predictions for. Defaults to ‘mean’. Passing a parameter into target only works when pps is False as the target may not be available in the posterior predictive distribution.

  • pps (bool, optional) – Whether to use the posterior predictive samples. Defaults to False.

  • use_hdi (bool, optional) – Whether to compute the highest density interval (defaults to True) or the quantiles.

  • prob (float, optional) – The probability for the credible intervals. Must be between 0 and 1. Defaults to 0.94. Changing the global variable az.rcParams["stats.hdi_prob"] affects this default.

  • transforms (dict, optional) – Transformations that are applied to each of the variables being plotted. The keys are the name of the variables, and the values are functions to be applied. Defaults to None.

Returns:

cap_data – A DataFrame with the data generated by create_cap_data and the model predictions.

Return type:

pandas.DataFrame

Raises:

ValueError – If pps is True and target is not "mean". If passed covariates is not in correct key, value format. If length of covariates is not between 1 and 3.
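The rules for the covariates argument described above (one to three names, mapped in order to the “main”, “group”, and “panel” roles, or a dict keyed by those roles) can be sketched as follows. The function normalize_covariates is a hypothetical helper written for this illustration; Bambi's internal handling may differ.

```python
ROLES = ("main", "group", "panel")


def normalize_covariates(covariates):
    """Return a dict that maps roles ('main', 'group', 'panel') to variable names.

    Hypothetical helper for illustration only; not part of Bambi.
    """
    if isinstance(covariates, str):
        covariates = [covariates]

    if isinstance(covariates, dict):
        if not 1 <= len(covariates) <= 3:
            raise ValueError("covariates must contain between 1 and 3 entries")
        unknown = set(covariates) - set(ROLES)
        if unknown:
            raise ValueError(f"keys must be taken from {ROLES}, got {sorted(unknown)}")
        return dict(covariates)

    names = list(covariates)
    if not 1 <= len(names) <= 3:
        raise ValueError("covariates must contain between 1 and 3 names")
    # The first name is the main variable, the second the group, the third the panel.
    return dict(zip(ROLES, names))


print(normalize_covariates(["hp", "wt"]))
print(normalize_covariates({"main": "hp", "panel": "cyl"}))
```

The column names "hp", "wt", and "cyl" are placeholders; any variable names present in the model's data would work the same way.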

bambi.interpret.slopes(model: Model, idata: InferenceData, wrt: str | dict, conditional: str | dict | list | None = None, average_by: str | list | bool | None = None, eps: float = 0.0001, slope: str = 'dydx', use_hdi: bool = True, prob: float | None = None, transforms: dict | None = None) DataFrame[source]#

Compute Conditional Adjusted Slopes

Parameters:
  • model (bambi.Model) – The model for which we want to compute the slopes.

  • idata (arviz.InferenceData) – The InferenceData object that contains the samples from the posterior distribution of the model.

  • wrt (str, dict) – The slope of the regression with respect to (wrt) this predictor will be computed.

  • conditional (str, dict, list) – The covariates we would like to condition on.

  • average_by (str, list, bool, optional) – The covariates we would like to average by. The passed covariate(s) will marginalize over the other covariates in the model. If True, it averages over all covariates in the model to obtain the average estimate. Defaults to None.

  • eps (float, optional) – To compute the slope, ‘wrt’ is evaluated at wrt +/- ‘eps’. The rate of change is then computed as the difference between the two values divided by ‘eps’. Defaults to 1e-4.

  • slope (str, optional) – The type of slope to compute. Defaults to ‘dydx’. With ‘dydx’, a unit increase in ‘wrt’ is associated with an n-unit change in the response. With ‘eyex’, a percentage increase in ‘wrt’ is associated with an n-percent change in the response. With ‘eydx’, a unit increase in ‘wrt’ is associated with an n-percent change in the response. With ‘dyex’, a percentage increase in ‘wrt’ is associated with an n-unit change in the response.

  • use_hdi (bool, optional) – Whether to compute the highest density interval (defaults to True) or the quantiles.

  • prob (float, optional) – The probability for the credible intervals. Must be between 0 and 1. Defaults to 0.94. Changing the global variable az.rcParams["stats.hdi_prob"] affects this default.

  • transforms (dict, optional) – Transformations that are applied to each of the variables being plotted. The keys are the name of the variables, and the values are functions to be applied. Defaults to None.

Returns:

A dataframe with the comparison values, highest density interval, wrt name, contrast value, and conditional values.

Return type:

pandas.DataFrame

Raises:

ValueError – If length of wrt is greater than 1. If conditional is None and wrt is passed more than 2 values. If conditional is None and default wrt has more than 2 unique values. If slope is not ‘dydx’, ‘dyex’, ‘eyex’, or ‘eydx’. If prob is not strictly between 0 and 1.
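The eps and slope parameters above can be illustrated with a finite-difference sketch on a simple deterministic function. This is a minimal sketch, assuming a forward difference and the usual elasticity definitions (eyex = dy/dx · x/y, eydx = dy/dx / y, dyex = dy/dx · x). The helper finite_difference_slope is hypothetical and is not Bambi's implementation, which works on posterior draws and may evaluate ‘wrt’ at different points.

```python
SLOPE_TYPES = ("dydx", "eyex", "eydx", "dyex")


def finite_difference_slope(f, x, eps=1e-4, slope="dydx"):
    """Approximate the slope of f at x and rescale it according to `slope`.

    Hypothetical helper for illustration only; assumes a forward difference.
    """
    if slope not in SLOPE_TYPES:
        raise ValueError(f"slope must be one of {SLOPE_TYPES}")
    y = f(x)
    dydx = (f(x + eps) - y) / eps  # change in the response per unit change in x
    if slope == "dydx":
        return dydx
    if slope == "eyex":
        return dydx * x / y  # relative change in y per relative change in x
    if slope == "eydx":
        return dydx / y  # relative change in y per unit change in x
    return dydx * x  # "dyex": change in y per relative change in x


def response(x):
    """Toy response function y = 2 * x**3, used only for illustration."""
    return 2 * x**3


for kind in SLOPE_TYPES:
    print(kind, finite_difference_slope(response, 3.0, slope=kind))
```

For a power function such as this one, the ‘eyex’ value is approximately constant and equal to the exponent, which makes it a convenient check that the rescaling is wired correctly.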