LittleMCMC¶
A lightweight and performant implementation of HMC and NUTS in Python, spun out of the PyMC project. Not to be confused with minimc.
Installation¶
Note
LittleMCMC is developed for Python 3.6 or later.
LittleMCMC is a pure Python library, so it can be easily installed using pip or directly from source.
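For example, assuming the package is published on PyPI under the name littlemcmc, installation with pip is a one-liner:
pip install littlemcmc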
From Source¶
The source code for LittleMCMC can be downloaded from GitHub by running
git clone https://github.com/eigenfoo/littlemcmc.git
cd littlemcmc/
python setup.py install
Testing¶
To run the unit tests, install pytest and then, in the root of the project directory, execute:
pytest -v
All of the tests should pass. If any of the tests don’t pass and you can’t figure out why, please open an issue on GitHub.
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
LittleMCMC Quickstart¶
LittleMCMC is a lightweight and performant implementation of HMC and NUTS in Python, spun out of the PyMC project. In this quickstart tutorial, we will walk through the main use case of LittleMCMC, and outline the various modules that may be of interest.
Who should use LittleMCMC?¶
LittleMCMC is a fairly barebones library with a very niche use case. Most users will probably find that PyMC3 satisfies their needs better, with stronger community support and more thorough documentation.
There are two expected use cases for LittleMCMC. Firstly, if you:
- Have a model with only continuous parameters,
- Are willing to manually transform all of your model’s parameters to the unconstrained space, if necessary (see the sketch below),
- Have a Python function/callable that:
- computes the log probability of your model and its derivative
- is pickleable
- outputs an array with the same shape as its input
- And all you need is an implementation of HMC/NUTS (preferably in Python) to sample from the posterior,
then you should consider using LittleMCMC.
Secondly, if you want to run algorithmic experiments on HMC/NUTS (in Python), without having to develop around the heavy machinery that accompanies other probabilistic programming frameworks (like PyMC3, TensorFlow Probability or Stan), then you should consider running your experiments in LittleMCMC.
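To make the second bullet above concrete, here is a minimal, hypothetical sketch of transforming a positive scale parameter to the unconstrained space: sample z = log(sigma) on the real line and add the log-Jacobian of the transform to the log probability. The half-normal prior and the function names are illustrative only, not part of the LittleMCMC API, and the gradient would need the corresponding chain-rule correction.
import numpy as np
import scipy.stats

def logp_constrained(sigma):
    # Example density on sigma > 0: a standard half-normal prior.
    return scipy.stats.halfnorm.logpdf(sigma)

def logp_unconstrained(z):
    # Work with z = log(sigma), which lives on the whole real line.
    sigma = np.exp(z)
    # Add the log-Jacobian of sigma = exp(z), which is simply z.
    return logp_constrained(sigma) + z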
How to Sample¶
import numpy as np
import scipy.stats
import littlemcmc as lmc

def logp_func(x, loc=0, scale=1):
    return np.log(scipy.stats.norm.pdf(x, loc=loc, scale=scale))

def dlogp_func(x, loc=0, scale=1):
    # Derivative of the Gaussian log density with respect to x.
    return -(x - loc) / scale ** 2

def logp_dlogp_func(x, loc=0, scale=1):
    return logp_func(x, loc=loc, scale=scale), dlogp_func(x, loc=loc, scale=scale)
# By default: 4 chains in 4 cores, 500 tuning steps and 1000 sampling steps.
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    progressbar=None,  # HTML progress bars don't render well in RST.
)
/home/george/littlemcmc/venv/lib/python3.6/site-packages/ipykernel_launcher.py:2: RuntimeWarning: divide by zero encountered in log
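The divide-by-zero warning comes from taking np.log of a density value that underflows to zero far out in the tails. It is harmless here, but if you want to avoid it, one option is to compute the log density directly:
def logp_func(x, loc=0, scale=1):
    # scipy.stats.norm.logpdf works in log space and never evaluates log(0.0).
    return scipy.stats.norm.logpdf(x, loc=loc, scale=scale)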
Inspecting the Output of lmc.sample¶
# Shape is (num_chains, num_samples, num_parameters)
trace.shape
(4, 1000, 1)
# The first 2 samples across all chains and parameters
trace[:, :2, :]
array([[[ 0.92958231],
        [ 0.92958231]],

       [[-1.06231693],
        [-1.11589309]],

       [[-0.73177109],
        [-0.66975061]],

       [[ 0.8923907 ],
        [ 0.97253646]]])
stats.keys()
dict_keys(['depth', 'step_size', 'tune', 'mean_tree_accept', 'step_size_bar', 'tree_size', 'diverging', 'energy_error', 'energy', 'max_energy_error', 'model_logp'])
# Again, shape is (num_chains, num_samples, num_parameters)
stats["depth"].shape
(4, 1000, 1)
# The first 2 tree depths across all chains and parameters
stats["depth"][:, :2, :]
array([[[2],
        [1]],

       [[1],
        [1]],

       [[2],
        [1]],

       [[2],
        [1]]])
Customizing the Default NUTS Sampler¶
By default, lmc.sample samples using NUTS with sane defaults. These defaults can be overridden by either:
- Passing keyword arguments from lmc.NUTS into lmc.sample, or
- Constructing an lmc.NUTS sampler and passing it to lmc.sample. This method also allows you to choose other samplers, such as lmc.HamiltonianMC.
For example, suppose you want to increase the target_accept from the default 0.8 to 0.9. The following two cells are equivalent:
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    target_accept=0.9,
    progressbar=None,
)

step = lmc.NUTS(logp_dlogp_func=logp_dlogp_func, model_ndim=1, target_accept=0.9)
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    step=step,
    progressbar=None,
)
/home/george/littlemcmc/venv/lib/python3.6/site-packages/ipykernel_launcher.py:2: RuntimeWarning: divide by zero encountered in log
For a list of keyword arguments that lmc.NUTS accepts, please refer to the API reference for lmc.NUTS (https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.NUTS.html#littlemcmc.NUTS).
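Similarly, to use plain Hamiltonian Monte Carlo instead of NUTS, you can construct an lmc.HamiltonianMC step method and pass it in via step. The sketch below assumes lmc.HamiltonianMC accepts the same logp_dlogp_func and model_ndim arguments as lmc.NUTS; check the API reference for the exact signature.
step = lmc.HamiltonianMC(logp_dlogp_func=logp_dlogp_func, model_ndim=1)
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    step=step,
    progressbar=None,
)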
Other Modules¶
LittleMCMC exposes:
- Two step methods (a.k.a. samplers): littlemcmc.HamiltonianMC (Hamiltonian Monte Carlo, https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.HamiltonianMC.html#littlemcmc.HamiltonianMC) and littlemcmc.NUTS (No-U-Turn Sampler, https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.NUTS.html#littlemcmc.NUTS)
- Various quadpotentials (a.k.a. mass matrices or inverse metrics) in littlemcmc.quadpotential (https://littlemcmc.readthedocs.io/en/latest/api.html#quadpotentials-a-k-a-mass-matrices), along with mass matrix adaptation routines
- Dual-averaging step size adaptation in littlemcmc.step_sizes (https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.step_sizes.DualAverageAdaptation.html#littlemcmc.step_sizes.DualAverageAdaptation)
- A leapfrog integrator in littlemcmc.integration (https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.integration.CpuLeapfrogIntegrator.html#littlemcmc.integration.CpuLeapfrogIntegrator)
These modules should allow for easy experimentation with the sampler. Please refer to the API Reference for more information.
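For instance, a sketch of sampling with a hand-picked diagonal mass matrix might look like the following. This assumes that the step method constructors accept a potential keyword (as the HamiltonianMC signature in the API reference suggests) and that littlemcmc.QuadPotentialDiag takes the diagonal entries as an array; consult the API reference before relying on these exact arguments.
import numpy as np
import littlemcmc as lmc

# Hypothetical: a fixed diagonal mass matrix for the 1-D quickstart model.
potential = lmc.QuadPotentialDiag(np.array([1.0]))
step = lmc.NUTS(logp_dlogp_func=logp_dlogp_func, model_ndim=1, potential=potential)
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func, model_ndim=1, step=step, progressbar=None,
)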
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
Framework Cookbook¶
littlemcmc only needs a logp_dlogp_func, which is framework-agnostic. To illustrate this, this cookbook implements linear regression in multiple frameworks and samples it with littlemcmc. At the end of this notebook, we load the inference traces and sampler statistics into ArviZ and do some basic visualizations.
import littlemcmc as lmc
Create and Visualize Data¶
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
true_params = np.array([0.5, -2.3, -0.23])
N = 50
t = np.linspace(0, 10, 2)
x = np.random.uniform(0, 10, 50)
y = x * true_params[0] + true_params[1]
y_obs = y + np.exp(true_params[-1]) * np.random.randn(N)
plt.plot(x, y_obs, ".k", label="observations")
plt.plot(t, true_params[0]*t + true_params[1], label="ground truth")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
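Before turning to the individual frameworks, it may help to see the log-likelihood and its gradient written out by hand in plain NumPy. This hand-written baseline is purely illustrative; the framework-specific functions below compute the same quantity (up to an additive constant in the PyMC3 case), but obtain the gradient by automatic differentiation.
def numpy_logp_dlogp_func(params):
    m, b, logs = params
    resid = y_obs - (m * x + b)
    # Gaussian log-likelihood with standard deviation exp(logs), up to a constant.
    logp = -0.5 * np.sum(resid ** 2 * np.exp(-2 * logs) + 2 * logs)
    # Hand-derived gradient with respect to (m, b, logs).
    dlogp = np.array([
        np.sum(resid * x * np.exp(-2 * logs)),
        np.sum(resid * np.exp(-2 * logs)),
        np.sum(resid ** 2 * np.exp(-2 * logs) - 1),
    ])
    return logp, dlogp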

PyTorch¶
import torch

class LinearModel(torch.nn.Module):
    def __init__(self):
        super(LinearModel, self).__init__()
        self.m = torch.nn.Parameter(torch.tensor(0.0, dtype=torch.float64))
        self.b = torch.nn.Parameter(torch.tensor(0.0, dtype=torch.float64))
        self.logs = torch.nn.Parameter(torch.tensor(0.0, dtype=torch.float64))

    def forward(self, x, y):
        mean = self.m * x + self.b
        loglike = -0.5 * torch.sum((y - mean) ** 2 * torch.exp(-2 * self.logs) + 2 * self.logs)
        return loglike

torch_model = torch.jit.script(LinearModel())
torch_params = [torch_model.m, torch_model.b, torch_model.logs]
args = [torch.tensor(x, dtype=torch.double), torch.tensor(y_obs, dtype=torch.double)]

def torch_logp_dlogp_func(x):
    for i, p in enumerate(torch_params):
        p.data = torch.tensor(x[i])
        if p.grad is not None:
            p.grad.detach_()
            p.grad.zero_()

    result = torch_model(*args)
    result.backward()

    return result.detach().numpy(), np.array([p.grad.numpy() for p in torch_params])
Timing torch_logp_dlogp_func gives roughly:
298 µs ± 43.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Please see sample_pytorch_logp_dlogp_func.py (https://github.com/eigenfoo/littlemcmc/tree/master/docs/_static/scripts/sample_pytorch_logp_dlogp_func.py) for a working example. Theoretically, however, all that’s needed is to run the following snippet:
trace, stats = lmc.sample(
    logp_dlogp_func=torch_logp_dlogp_func, model_ndim=3, tune=500, draws=1000, chains=4,
)
JAX¶
from jax.config import config
config.update("jax_enable_x64", True)

import jax
import jax.numpy as jnp

def jax_model(params):
    mean = params[0] * x + params[1]
    loglike = -0.5 * jnp.sum((y_obs - mean) ** 2 * jnp.exp(-2 * params[2]) + 2 * params[2])
    return loglike

@jax.jit
def jax_model_and_grad(x):
    return jax_model(x), jax.grad(jax_model)(x)

def jax_logp_dlogp_func(x):
    v, g = jax_model_and_grad(x)
    return np.asarray(v), np.asarray(g)
/Users/george/miniconda3/lib/python3.7/site-packages/jax/lib/xla_bridge.py:125: UserWarning: No GPU/TPU found, falling back to CPU.
warnings.warn('No GPU/TPU found, falling back to CPU.')
Timing jax_logp_dlogp_func gives roughly:
269 µs ± 48.6 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Please see sample_jax_logp_dlogp_func.py (https://github.com/eigenfoo/littlemcmc/tree/master/docs/_static/scripts/sample_jax_logp_dlogp_func.py) for a working example. Theoretically, however, all that’s needed is to run the following snippet:
trace, stats = lmc.sample(
    logp_dlogp_func=jax_logp_dlogp_func, model_ndim=3, tune=500, draws=1000, chains=4,
)
PyMC3¶
import pymc3 as pm
import theano

with pm.Model() as pm_model:
    pm_params = pm.Flat("pm_params", shape=3)
    mean = pm_params[0] * x + pm_params[1]
    pm.Normal("obs", mu=mean, sigma=pm.math.exp(pm_params[2]), observed=y_obs)

pm_model_and_grad = pm_model.fastfn([pm_model.logpt] + theano.grad(pm_model.logpt, pm_model.vars))

def pm_logp_dlogp_func(x):
    return pm_model_and_grad(pm_model.bijection.rmap(x))
Timing pm_logp_dlogp_func gives roughly:
46.3 µs ± 3.94 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
trace, stats = lmc.sample(
    logp_dlogp_func=pm_logp_dlogp_func,
    model_ndim=3,
    tune=500,
    draws=1000,
    chains=4,
    progressbar=False,  # Progress bars don't render well in reStructuredText docs...
)
Visualize Traces with ArviZ¶
Just to sanity check our results, let’s visualize all the traces using ArviZ. At the time of writing there’s no way to easily load the np.ndarray objects that littlemcmc returns into an az.InferenceData. Hopefully one day we’ll have an az.from_littlemcmc method… but until then, please use this code snippet!
import arviz as az

def arviz_from_littlemcmc(trace, stats):
    return az.InferenceData(
        posterior=az.dict_to_dataset({"x": trace}),
        sample_stats=az.dict_to_dataset({k: v.squeeze() for k, v in stats.items()}),
    )

dataset = arviz_from_littlemcmc(trace, stats)
az.plot_trace(dataset)
plt.show()
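For numerical diagnostics to go with the plots, ArviZ’s summary function can also be applied to the same dataset:
# Posterior means, standard deviations, effective sample sizes, and r_hat.
print(az.summary(dataset))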

API Reference¶
Sampling¶
sample(logp_dlogp_func, …)
    Draw samples from the posterior using the given step methods.
Step Methods¶
HamiltonianMC(logp_dlogp_func, …[, potential])
    A sampler for continuous variables based on Hamiltonian mechanics.
NUTS(logp_dlogp_func, …)
    A sampler for continuous variables based on Hamiltonian mechanics.
Quadpotentials (a.k.a. Mass Matrices)¶
quad_potential(C, is_cov)
    Compute a QuadPotential object from a scaling matrix.
QuadPotentialDiag(v[, dtype])
    Quad potential using a diagonal covariance matrix.
QuadPotentialFull(cov[, dtype])
    Basic QuadPotential object for Hamiltonian calculations.
QuadPotentialFullInv(A[, dtype])
    QuadPotential object for Hamiltonian calculations using the inverse of the covariance matrix.
QuadPotentialDiagAdapt(n, initial_mean[, …])
    Adapt a diagonal mass matrix from the sample variances.
QuadPotentialFullAdapt(n, initial_mean[, …])
    Adapt a dense mass matrix using the sample covariances.
Dual Averaging Step Size Adaptation¶
step_sizes.DualAverageAdaptation(…)
    Dual averaging step size adaptation.
Integrators¶
integration.CpuLeapfrogIntegrator(potential, …)
    Leapfrog integrator using the CPU.
About LittleMCMC¶
LittleMCMC is a lightweight, performant implementation of Hamiltonian Monte Carlo (HMC) and the No-U-Turn Sampler (NUTS) in Python. This document aims to explain and contextualize the motivation and purpose of LittleMCMC. For an introduction to the user-facing API, refer to the quickstart tutorial.
Motivation and Purpose¶
Bayesian inference and probabilistic computation are complicated and have many moving parts [1]. As a result, many probabilistic programming frameworks (or any library that automates Bayesian inference) are monolithic libraries that handle everything from model specification (including automatic differentiation of the joint log probability), to inference (usually via Markov chain Monte Carlo or variational inference), to diagnosis and visualization of the inference results [2]. PyMC3 and Stan are two excellent examples of such monolithic frameworks.
However, such monoliths require users to buy in to entire frameworks or ecosystems. For example, a user that has specified their model in one framework but who now wishes to migrate to another library (to take advantage of certain better-supported features, say) must now reimplement their models from scratch in the new framework.
LittleMCMC addresses exactly this use case: by isolating PyMC’s HMC/NUTS code in a standalone library, users who have their own log probability function and its derivative can use PyMC’s battle-tested HMC/NUTS samplers.
LittleMCMC in Context¶
LittleMCMC stands on the shoulders of giants (that is, giant open source projects). Most obviously, LittleMCMC builds from (or, more accurately, is a spin-off project from) the PyMC project (both PyMC3 and PyMC4).
In terms of prior art, LittleMCMC is similar to several other open-source libraries, such as NUTS by Morgan Fouesneau or Sampyl by Mat Leonard. However, these libraries do not offer the same functionality as LittleMCMC: for example, they do not allow for easy changes of the mass matrix (instead assuming an identity mass matrix), or they do not raise sampler errors or track sampler statistics such as divergences, energy, etc.
By offering step methods, integrators, quadpotentials and the sampling loop in separate Python modules, LittleMCMC offers not just a battle-tested sampler, but also an extensible one: users may configure the samplers as they wish.