Alternative sampling backends

Bambi supports both MCMC and variational inference, and it selects the sampler automatically based on the types of variables in the model. By default, Bambi uses PyMC’s implementation of the No-U-Turn Sampler (NUTS), an adaptive variant of Hamiltonian Monte Carlo (HMC). This sampler is a good choice for many models. However, NUTS is not the only sampling method, nor is PyMC the only library that implements it.

To this end, Bambi supports multiple backends for MCMC sampling, such as NumPyro and Blackjax. This notebook covers how to use these alternatives in Bambi.

Note: Bambi utilizes bayeux to access a variety of sampling backends. Thus, you will need to install the optional dependencies in the Bambi pyproject.toml file to use these backends.

import arviz as az
import bambi as bmb
import numpy as np
import pandas as pd

Bayeux

Bambi leverages bayeux to access different sampling backends. In short, bayeux lets you write a probabilistic model in JAX and immediately have access to state-of-the-art inference methods.

Since the underlying Bambi model is a PyMC model, this PyMC model can be “given” to bayeux. Then, we can choose from a variety of MCMC methods to perform inference.

To demonstrate the available backends, we will first simulate data and build a model.

num_samples = 100
num_features = 1
noise_std = 1.0
random_seed = 42

rng = np.random.default_rng(random_seed)

coefficients = rng.normal(size=num_features)
X = rng.normal(size=(num_samples, num_features))
error = rng.normal(scale=noise_std, size=num_samples)
y = X @ coefficients + error

data = pd.DataFrame({"y": y, "x": X.flatten()})

model = bmb.Model("y ~ x", data)
model.build()
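
As a quick sanity check on the simulated data, an ordinary least-squares fit gives point estimates that the posterior means from any of the samplers should land close to. The following is a minimal sketch that assumes only NumPy; it regenerates the data from the same seed and is not part of Bambi's API. The names `design`, `ols_coef`, and `sigma_hat` are illustrative.

```python
import numpy as np

# Regenerate the simulated data with the same seed and draw order as above.
rng = np.random.default_rng(42)
coefficients = rng.normal(size=1)
X = rng.normal(size=(100, 1))
error = rng.normal(scale=1.0, size=100)
y = X @ coefficients + error

# Design matrix for "y ~ x": a column of ones (Intercept) plus x.
design = np.column_stack([np.ones(len(y)), X[:, 0]])
ols_coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# Residual standard deviation, with 2 estimated coefficients.
residuals = y - design @ ols_coef
sigma_hat = np.sqrt(residuals @ residuals / (len(y) - 2))

print({"Intercept": ols_coef[0], "x": ols_coef[1], "sigma": sigma_hat})
```

These least-squares values are only a reference point; the Bayesian fit also reflects Bambi's default priors, so small differences are expected.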

The attribute bmb.inference_methods.names holds a nested dictionary that maps each backend to its lists of inference methods.

methods = bmb.inference_methods.names
methods

{'pymc': {'mcmc': ['mcmc'], 'vi': ['vi']},
 'bayeux': {'mcmc': ['tfp_hmc',
   'tfp_nuts',
   'tfp_snaper_hmc',
   'blackjax_hmc',
   'blackjax_chees_hmc',
   'blackjax_meads_hmc',
   'blackjax_nuts',
   'blackjax_hmc_pathfinder',
   'blackjax_nuts_pathfinder',
   'flowmc_rqspline_hmc',
   'flowmc_rqspline_mala',
   'flowmc_realnvp_hmc',
   'flowmc_realnvp_mala',
   'numpyro_hmc',
   'numpyro_nuts',
   'nutpie']}}

With the PyMC backend, we have access to PyMC’s implementation of the NUTS sampler and mean-field variational inference.

methods["pymc"]

{'mcmc': ['mcmc'], 'vi': ['vi']}

bayeux gives us access to the TensorFlow Probability, Blackjax, flowMC, NumPyro, and nutpie backends.

methods["bayeux"]

{'mcmc': ['tfp_hmc',
  'tfp_nuts',
  'tfp_snaper_hmc',
  'blackjax_hmc',
  'blackjax_chees_hmc',
  'blackjax_meads_hmc',
  'blackjax_nuts',
  'blackjax_hmc_pathfinder',
  'blackjax_nuts_pathfinder',
  'flowmc_rqspline_hmc',
  'flowmc_rqspline_mala',
  'flowmc_realnvp_hmc',
  'flowmc_realnvp_mala',
  'numpyro_hmc',
  'numpyro_nuts',
  'nutpie']}
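
To see at a glance how many methods each library contributes, the names can be grouped by the text before the first underscore. This is a minimal sketch in plain Python; the list is copied from the output above so the snippet runs on its own, and `group_by_library` is a hypothetical helper, not a Bambi or bayeux function.

```python
from collections import defaultdict

# Copy of the bayeux MCMC method names listed above.
bayeux_mcmc = [
    "tfp_hmc", "tfp_nuts", "tfp_snaper_hmc",
    "blackjax_hmc", "blackjax_chees_hmc", "blackjax_meads_hmc",
    "blackjax_nuts", "blackjax_hmc_pathfinder", "blackjax_nuts_pathfinder",
    "flowmc_rqspline_hmc", "flowmc_rqspline_mala",
    "flowmc_realnvp_hmc", "flowmc_realnvp_mala",
    "numpyro_hmc", "numpyro_nuts",
    "nutpie",
]


def group_by_library(names):
    """Group method names by the text before the first underscore."""
    groups = defaultdict(list)
    for name in names:
        library = name.split("_", 1)[0]
        groups[library].append(name)
    return dict(groups)


by_library = group_by_library(bayeux_mcmc)
for library, names in by_library.items():
    print(f"{library}: {len(names)} method(s)")
```

In the notebook, the same helper could be applied to `methods["bayeux"]["mcmc"]` instead of the copied list.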

The values under the mcmc and vi keys of the dictionary are the names you would pass to the inference_method argument of model.fit, as shown in the next section.
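
Because a method name must match one of these values, it can help to look a name up before starting a long run. The sketch below uses an abbreviated copy of the dictionary so it runs without Bambi installed; `locate_method` is a hypothetical helper introduced here for illustration, not part of Bambi.

```python
# Abbreviated copy of the dictionary shown above (only a few names kept).
names = {
    "pymc": {"mcmc": ["mcmc"], "vi": ["vi"]},
    "bayeux": {"mcmc": ["blackjax_nuts", "numpyro_nuts", "tfp_nuts", "nutpie"]},
}


def locate_method(name, available):
    """Return (backend, kind) for a method name, or raise ValueError."""
    for backend, kinds in available.items():
        for kind, methods in kinds.items():
            if name in methods:
                return backend, kind
    valid = sorted(
        method
        for kinds in available.values()
        for methods in kinds.values()
        for method in methods
    )
    raise ValueError(f"Unknown inference method {name!r}; expected one of {valid}")


print(locate_method("vi", names))
print(locate_method("blackjax_nuts", names))
```

In the notebook, `bmb.inference_methods.names` could be passed as `available` in place of the abbreviated copy.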

Specifying an `inference_method`

By default, Bambi uses the PyMC NUTS implementation. To use a different backend, pass the name of the bayeux MCMC method to the inference_method parameter of the fit method.

Blackjax

blackjax_nuts_idata = model.fit(inference_method="blackjax_nuts")
blackjax_nuts_idata

WARNING:2024-12-21 13:43:24,702:jax._src.xla_bridge:969: An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.

Different backends have different naming conventions for the parameters specific to that MCMC method. Thus, to specify backend-specific parameters, pass your own kwargs to the fit method.

The following call shows the kwargs specific to a given method.

bmb.inference_methods.get_kwargs("blackjax_nuts")

{<function blackjax.adaptation.window_adaptation.window_adaptation(algorithm, logdensity_fn: Callable, is_mass_matrix_diagonal: bool = True, initial_step_size: float = 1.0, target_acceptance_rate: float = 0.8, progress_bar: bool = False, adaptation_info_fn: Callable = <function return_all_adapt_info at 0x7f164c18d120>, integrator=<function generate_euclidean_integrator.<locals>.euclidean_integrator at 0x7f164c15c680>, **extra_parameters) -> blackjax.base.AdaptationAlgorithm>: {'logdensity_fn': <function bayeux._src.shared.constrain.<locals>.wrap_log_density.<locals>.wrapped(args)>,
  'is_mass_matrix_diagonal': True,
  'initial_step_size': 1.0,
  'target_acceptance_rate': 0.8,
  'progress_bar': False,
  'adaptation_info_fn': <function blackjax.adaptation.base.return_all_adapt_info(state, info, adaptation_state)>,
  'algorithm': GenerateSamplingAPI(differentiable=<function as_top_level_api at 0x7f164c16a7a0>, init=<function init at 0x7f164c133380>, build_kernel=<function build_kernel at 0x7f164c169e40>)},
 'adapt.run': {'num_steps': 500},
 <function blackjax.mcmc.nuts.as_top_level_api(logdensity_fn: Callable, step_size: float, inverse_mass_matrix: Union[blackjax.mcmc.metrics.Metric, jax.Array, Callable[[Union[jax.Array, numpy.ndarray, numpy.bool_, numpy.number, bool, int, float, complex, Iterable[ForwardRef('ArrayLikeTree')], Mapping[Any, ForwardRef('ArrayLikeTree')]]], jax.Array]], *, max_num_doublings: int = 10, divergence_threshold: int = 1000, integrator: Callable = <function generate_euclidean_integrator.<locals>.euclidean_integrator at 0x7f164c15c680>) -> blackjax.base.SamplingAlgorithm>: {'max_num_doublings': 10,
  'divergence_threshold': 1000,
  'integrator': <function blackjax.mcmc.integrators.generate_euclidean_integrator.<locals>.euclidean_integrator(logdensity_fn: Callable, kinetic_energy_fn: blackjax.mcmc.metrics.KineticEnergy) -> Callable[[blackjax.mcmc.integrators.IntegratorState, float], blackjax.mcmc.integrators.IntegratorState]>,
  'logdensity_fn': <function bayeux._src.shared.constrain.<locals>.wrap_log_density.<locals>.wrapped(args)>,
  'step_size': 0.5},
 'extra_parameters': {'chain_method': 'vectorized',
  'num_chains': 8,
  'num_draws': 500,
  'num_adapt_draws': 500,
  'return_pytree': False}}

Now, we can identify the kwargs we would like to change and pass them to the fit method.

kwargs = {
    "adapt.run": {"num_steps": 500},
    "num_chains": 4,
    "num_draws": 250,
    "num_adapt_draws": 250,
}

blackjax_nuts_idata = model.fit(inference_method="blackjax_nuts", **kwargs)
blackjax_nuts_idata

TensorFlow Probability

tfp_nuts_idata = model.fit(inference_method="tfp_nuts")
tfp_nuts_idata

NumPyro

numpyro_nuts_idata = model.fit(inference_method="numpyro_nuts")
numpyro_nuts_idata

sample: 100%|██████████| 1500/1500 [00:03<00:00, 386.97it/s]

flowMC

flowmc_idata = model.fit(inference_method="flowmc_realnvp_hmc")
flowmc_idata

['n_dim', 'n_chains', 'n_local_steps', 'n_global_steps', 'n_loop', 'output_thinning', 'verbose']

Global Tuning: 100%|██████████| 5/5 [00:20<00:00,  4.05s/it]
Global Sampling: 100%|██████████| 5/5 [00:00<00:00, 26.22it/s]

nutpie

bmb.inference_methods.get_kwargs("nutpie")

{<function nutpie.compiled_pyfunc.from_pyfunc(ndim: int, make_logp_fn: Callable, make_expand_fn: Callable, expanded_dtypes: list[numpy.dtype], expanded_shapes: list[tuple[int, ...]], expanded_names: list[str], *, initial_mean: numpy.ndarray | None = None, coords: dict[str, typing.Any] | None = None, dims: dict[str, tuple[str, ...]] | None = None, shared_data: dict[str, typing.Any] | None = None)>: {'ndim': 1,
  'make_logp_fn': <function bayeux._src.mcmc.nutpie._NutpieSampler._get_aux.<locals>.make_logp_fn()>,
  'make_expand_fn': <function bayeux._src.mcmc.nutpie._NutpieSampler.get_kwargs.<locals>.make_expand_fn(*args, **kwargs)>,
  'expanded_shapes': [(1,)],
  'expanded_names': ['x'],
  'expanded_dtypes': [numpy.float64]},
 <function nutpie.sample.sample(compiled_model: nutpie.sample.CompiledModel, *, draws: int = 1000, tune: int = 300, chains: int = 6, cores: Optional[int] = None, seed: Optional[int] = None, save_warmup: bool = True, progress_bar: bool = True, low_rank_modified_mass_matrix: bool = False, init_mean: Optional[numpy.ndarray] = None, return_raw_trace: bool = False, blocking: bool = True, progress_template: Optional[str] = None, progress_style: Optional[str] = None, progress_rate: int = 100, **kwargs) -> arviz.data.inference_data.InferenceData>: {'draws': 1000,
  'tune': 300,
  'chains': 8,
  'cores': 8,
  'seed': None,
  'save_warmup': True,
  'progress_bar': True,
  'low_rank_modified_mass_matrix': False,
  'init_mean': None,
  'return_raw_trace': False,
  'blocking': True,
  'progress_template': None,
  'progress_style': None,
  'progress_rate': 100},
 'extra_parameters': {'flatten': <function bayeux._src.mcmc.nutpie._NutpieSampler._get_aux.<locals>.flatten(pytree)>,
  'unflatten': <jax._src.util.HashablePartial at 0x7f1545283cd0>,
  'return_pytree': False}}

nutpie_idata = model.fit(inference_method="nutpie", tune=400, draws=500, chains=3)
nutpie_idata

Sampler Progress
Total Chains: 3
Active Chains: 0
Finished Chains: 3

Draws	Step Size	Gradients/Draw
900	1.04	3
900	1.02	3
900	0.99	3
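
The Blackjax and nutpie examples above use different names for the same settings (`num_chains`, `num_draws`, `num_adapt_draws` versus `chains`, `draws`, `tune`). One way to keep a notebook consistent is a small translation table. This is a sketch under stated assumptions: `ARG_NAMES` and `backend_kwargs` are hypothetical names, the mapping covers only the two backends whose argument names appear in this notebook, and other backends may use different names.

```python
# Argument names taken from the get_kwargs output and fit calls shown above.
# Only Blackjax NUTS and nutpie are covered; this is not an exhaustive mapping.
ARG_NAMES = {
    "blackjax_nuts": {
        "chains": "num_chains",
        "draws": "num_draws",
        "tune": "num_adapt_draws",
    },
    "nutpie": {"chains": "chains", "draws": "draws", "tune": "tune"},
}


def backend_kwargs(method, chains, draws, tune):
    """Translate generic sampler settings into the names a backend expects."""
    if method not in ARG_NAMES:
        raise KeyError(f"No argument mapping recorded for {method!r}")
    mapping = ARG_NAMES[method]
    settings = {"chains": chains, "draws": draws, "tune": tune}
    return {mapping[key]: value for key, value in settings.items()}


print(backend_kwargs("blackjax_nuts", chains=4, draws=250, tune=250))
print(backend_kwargs("nutpie", chains=3, draws=500, tune=400))
```

The result could then be unpacked into the fit call, for example `model.fit(inference_method="nutpie", **backend_kwargs("nutpie", 3, 500, 400))`. Note that the Blackjax example above also sets `"adapt.run"`, which this helper does not handle.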

Sampler comparisons

With ArviZ, we can compare the summaries of the inference results from each sampler. Note: we can’t use az.compare because not every InferenceData object includes the pointwise log-probabilities, so an error would be raised.

az.summary(blackjax_nuts_idata)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
Intercept	-0.000	0.097	-0.180	0.183	0.003	0.003	938.0	752.0	1.0
sigma	0.987	0.073	0.859	1.126	0.002	0.002	913.0	739.0	1.0
x	0.423	0.125	0.151	0.629	0.004	0.003	1044.0	820.0	1.0

az.summary(tfp_nuts_idata)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
Intercept	0.002	0.099	-0.183	0.190	0.001	0.001	6775.0	5598.0	1.0
sigma	0.987	0.071	0.848	1.114	0.001	0.001	8338.0	5715.0	1.0
x	0.424	0.127	0.186	0.661	0.002	0.001	6244.0	5267.0	1.0

az.summary(numpyro_nuts_idata)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
Intercept	0.005	0.098	-0.180	0.188	0.001	0.001	9065.0	6523.0	1.0
sigma	0.988	0.074	0.856	1.127	0.001	0.001	7217.0	5477.0	1.0
x	0.423	0.130	0.179	0.661	0.002	0.001	7449.0	6203.0	1.0

az.summary(flowmc_idata)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
Intercept	0.004	0.101	-0.184	0.193	0.002	0.001	2352.0	3365.0	1.01
sigma	0.987	0.070	0.861	1.123	0.001	0.001	4252.0	4034.0	1.01
x	0.425	0.129	0.171	0.656	0.001	0.001	7504.0	3764.0	1.01

az.summary(nutpie_idata)

	mean	sd	hdi_3%	hdi_97%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
Intercept	0.002	0.098	-0.179	0.181	0.002	0.003	2288.0	1040.0	1.0
sigma	0.989	0.072	0.857	1.118	0.002	0.001	2199.0	1155.0	1.0
x	0.423	0.128	0.176	0.657	0.003	0.002	1956.0	1287.0	1.0
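
To put a number on how closely the samplers agree, the posterior means can be lined up and their range computed per parameter. The sketch below copies the `mean` column from the tables above into a plain dictionary so it runs without the fitted objects; `spread_across_samplers` is a hypothetical helper, not an ArviZ function.

```python
# Posterior means copied from the az.summary tables above.
posterior_means = {
    "blackjax_nuts": {"Intercept": -0.000, "sigma": 0.987, "x": 0.423},
    "tfp_nuts": {"Intercept": 0.002, "sigma": 0.987, "x": 0.424},
    "numpyro_nuts": {"Intercept": 0.005, "sigma": 0.988, "x": 0.423},
    "flowmc_realnvp_hmc": {"Intercept": 0.004, "sigma": 0.987, "x": 0.425},
    "nutpie": {"Intercept": 0.002, "sigma": 0.989, "x": 0.423},
}


def spread_across_samplers(means):
    """Return max minus min of each parameter's posterior mean across samplers."""
    parameters = next(iter(means.values())).keys()
    return {
        p: round(
            max(m[p] for m in means.values()) - min(m[p] for m in means.values()),
            3,
        )
        for p in parameters
    }


spread = spread_across_samplers(posterior_means)
print(spread)  # {'Intercept': 0.005, 'sigma': 0.002, 'x': 0.002}
```

The largest difference between samplers is 0.005, far smaller than the posterior standard deviations reported in the tables (roughly 0.07 to 0.13), so the backends agree for this model.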

Summary

Thanks to bayeux, we can use several additional sampling backends (TensorFlow Probability, Blackjax, flowMC, NumPyro, and nutpie) and more than ten alternative MCMC methods in Bambi. Using one of these methods is as simple as passing its name to the inference_method argument of the fit method.

%load_ext watermark
%watermark -n -u -v -iv -w

Last updated: Sat Dec 21 2024

Python implementation: CPython
Python version       : 3.11.9
IPython version      : 8.27.0

bambi : 0.14.1.dev17+g25798ce7
arviz : 0.19.0
pandas: 2.2.3
numpy : 1.26.4

Watermark: 2.5.0