LittleMCMC

littlemcmc     /lɪtəl ɛm si ɛm si/     noun

A lightweight and performant implementation of HMC and NUTS in Python, spun out of the PyMC project. Not to be confused with minimc.

Installation

Note

LittleMCMC is developed for Python 3.6 or later.

LittleMCMC is a pure Python library, so it can be installed easily with pip or directly from source.

Using pip

LittleMCMC can be installed using pip.

pip install littlemcmc

From Source

The source code for LittleMCMC can be downloaded from GitHub by running

git clone https://github.com/eigenfoo/littlemcmc.git
cd littlemcmc/
python setup.py install

Testing

To run the unit tests, install pytest and then, in the root of the project directory, execute:

pytest -v

All of the tests should pass. If any of the tests don’t pass and you can’t figure out why, please open an issue on GitHub.

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

LittleMCMC Quickstart

LittleMCMC is a lightweight and performant implementation of HMC and NUTS in Python, spun out of the PyMC project. In this quickstart tutorial, we will walk through the main use case of LittleMCMC, and outline the various modules that may be of interest.

Who should use LittleMCMC?

LittleMCMC is a fairly barebones library with a very niche use case. Most users will probably find that PyMC3 satisfies their needs better, with stronger support and more thorough documentation.

There are two expected use cases for LittleMCMC. Firstly, if you:

  1. Have a model with only continuous parameters,
  2. Are willing to manually transform all of your model’s parameters to the unconstrained space (if necessary),
  3. Have a Python function/callable that:
    1. computes the log probability of your model and its derivative
    2. is pickleable
    3. outputs an array with the same shape as its input
  4. And all you need is an implementation of HMC/NUTS (preferably in Python) to sample from the posterior,

then you should consider using LittleMCMC.

Secondly, if you want to run algorithmic experiments on HMC/NUTS (in Python), without having to develop around the heavy machinery that accompanies other probabilistic programming frameworks (like PyMC3, TensorFlow Probability or Stan), then you should consider running your experiments in LittleMCMC.

How to Sample

import numpy as np
import scipy.stats
import littlemcmc as lmc
def logp_func(x, loc=0, scale=1):
    return np.log(scipy.stats.norm.pdf(x, loc=loc, scale=scale))


def dlogp_func(x, loc=0, scale=1):
    return -(x - loc) / scale


def logp_dlogp_func(x, loc=0, scale=1):
    return logp_func(x, loc=loc, scale=scale), dlogp_func(x, loc=loc, scale=scale)
# By default: 4 chains on 4 cores, 500 tuning steps and 1000 sampling steps.
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    progressbar=None,  # HTML progress bars don't render well in RST.
)
/home/george/littlemcmc/venv/lib/python3.6/site-packages/ipykernel_launcher.py:2: RuntimeWarning: divide by zero encountered in log

Inspecting the Output of lmc.sample

# Shape is (num_chains, num_samples, num_parameters)
trace.shape
(4, 1000, 1)
# The first 2 samples across all chains and parameters
trace[:, :2, :]
array([[[ 0.92958231],
        [ 0.92958231]],

       [[-1.06231693],
        [-1.11589309]],

       [[-0.73177109],
        [-0.66975061]],

       [[ 0.8923907 ],
        [ 0.97253646]]])
stats.keys()
dict_keys(['depth', 'step_size', 'tune', 'mean_tree_accept', 'step_size_bar', 'tree_size', 'diverging', 'energy_error', 'energy', 'max_energy_error', 'model_logp'])
# Again, shape is (num_chains, num_samples, num_parameters)
stats["depth"].shape
(4, 1000, 1)
# The first 2 tree depths across all chains and parameters
stats["depth"][:, :2, :]
array([[[2],
        [1]],

       [[1],
        [1]],

       [[2],
        [1]],

       [[2],
        [1]]])
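The trace and stats arrays make quick sanity checks easy. For example, the following short sketch (reusing the trace and stats from above) counts divergent transitions and computes the posterior mean of the sampled parameter:

# Count divergent transitions across all chains (ideally zero).
print("Divergences:", int(stats["diverging"].sum()))

# Pool all chains and compute the posterior mean of each parameter.
print("Posterior mean:", trace.reshape(-1, trace.shape[-1]).mean(axis=0))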

Customizing the Default NUTS Sampler

By default, lmc.sample samples using NUTS with sane defaults. These defaults can be overridden by either:

  1. Passing keyword arguments from lmc.NUTS into lmc.sample, or
  2. Constructing an lmc.NUTS sampler and passing it to lmc.sample. This method also lets you choose other samplers, such as lmc.HamiltonianMC (see the sketch at the end of this section).

For example, suppose you want to increase the target_accept from the default 0.8 to 0.9. The following two cells are equivalent:

trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    target_accept=0.9,
    progressbar=None,
)
step = lmc.NUTS(logp_dlogp_func=logp_dlogp_func, model_ndim=1, target_accept=0.9)
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    step=step,
    progressbar=None,
)
/home/george/littlemcmc/venv/lib/python3.6/site-packages/ipykernel_launcher.py:2: RuntimeWarning: divide by zero encountered in log

For a list of keyword arguments that lmc.NUTS accepts, please refer to the API reference for `lmc.NUTS <https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.NUTS.html#littlemcmc.NUTS>`__.
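If you prefer plain Hamiltonian Monte Carlo over NUTS, the same pattern applies: construct the step method yourself and pass it to lmc.sample. The following is a minimal sketch that reuses the logp_dlogp_func defined above and relies only on the constructor arguments listed in the API reference:

step = lmc.HamiltonianMC(logp_dlogp_func=logp_dlogp_func, model_ndim=1)
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    step=step,
    progressbar=None,
)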

Other Modules

LittleMCMC exposes:

  1. Two step methods (a.k.a. samplers): `littlemcmc.HamiltonianMC (Hamiltonian Monte Carlo) <https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.HamiltonianMC.html#littlemcmc.HamiltonianMC>`__ and the `littlemcmc.NUTS (No-U-Turn Sampler) <https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.NUTS.html#littlemcmc.NUTS>`__
  2. Various quadpotentials (a.k.a. mass matrices or inverse metrics) in `littlemcmc.quadpotential <https://littlemcmc.readthedocs.io/en/latest/api.html#quadpotentials-a-k-a-mass-matrices>`__, along with mass matrix adaptation routines
  3. Dual-averaging step size adaptation in `littlemcmc.step_sizes <https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.step_sizes.DualAverageAdaptation.html#littlemcmc.step_sizes.DualAverageAdaptation>`__
  4. A leapfrog integrator in `littlemcmc.integration <https://littlemcmc.readthedocs.io/en/latest/generated/littlemcmc.integration.CpuLeapfrogIntegrator.html#littlemcmc.integration.CpuLeapfrogIntegrator>`__

These modules should allow for easy experimentation with the sampler. Please refer to the API Reference for more information.
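As a rough sketch (not a definitive recipe) of what such experimentation might look like, the snippet below builds a fixed diagonal quadpotential and passes it to the step method via the potential keyword argument; this assumes that the quadpotential classes are importable from the top-level littlemcmc namespace, as the API reference listing suggests.

# A fixed identity mass matrix for the 1-D model from the quickstart example.
potential = lmc.QuadPotentialDiag(np.ones(1))

step = lmc.NUTS(logp_dlogp_func=logp_dlogp_func, model_ndim=1, potential=potential)
trace, stats = lmc.sample(
    logp_dlogp_func=logp_dlogp_func,
    model_ndim=1,
    step=step,
    progressbar=None,
)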

Note

This tutorial was generated from an IPython notebook that can be downloaded here.

Framework Cookbook

littlemcmc only needs a logp_dlogp_func, which is framework-agnostic. To illustrate this, this cookbook implements a simple linear regression model in multiple frameworks and samples each one with littlemcmc. At the end of this notebook, we load the inference traces and sampler statistics into ArviZ and do some basic visualizations.

import littlemcmc as lmc

Create and Visualize Data

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

true_params = np.array([0.5, -2.3, -0.23])

N = 50
t = np.linspace(0, 10, 2)  # endpoints for plotting the ground-truth line
x = np.random.uniform(0, 10, N)
y = x * true_params[0] + true_params[1]
y_obs = y + np.exp(true_params[-1]) * np.random.randn(N)

plt.plot(x, y_obs, ".k", label="observations")
plt.plot(t, true_params[0]*t + true_params[1], label="ground truth")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
[Figure: scatter plot of the observed data together with the ground-truth regression line]

PyTorch

import torch

class LinearModel(torch.nn.Module):
    def __init__(self):
        super(LinearModel, self).__init__()
        self.m = torch.nn.Parameter(torch.tensor(0.0, dtype=torch.float64))
        self.b = torch.nn.Parameter(torch.tensor(0.0, dtype=torch.float64))
        self.logs = torch.nn.Parameter(torch.tensor(0.0, dtype=torch.float64))

    def forward(self, x, y):
        mean = self.m * x + self.b
        loglike = -0.5 * torch.sum((y - mean) ** 2 * torch.exp(-2 * self.logs) + 2 * self.logs)
        return loglike

torch_model = torch.jit.script(LinearModel())
torch_params = [torch_model.m, torch_model.b, torch_model.logs]
args = [torch.tensor(x, dtype=torch.double), torch.tensor(y_obs, dtype=torch.double)]

def torch_logp_dlogp_func(x):
    for i, p in enumerate(torch_params):
        p.data = torch.tensor(x[i])
        if p.grad is not None:
            p.grad.detach_()
            p.grad.zero_()

    result = torch_model(*args)
    result.backward()

    return result.detach().numpy(), np.array([p.grad.numpy() for p in torch_params])
Benchmarking torch_logp_dlogp_func gives roughly 298 µs ± 43.8 µs per call (mean ± std. dev. of 7 runs, 1000 loops each).

Please see `sample_pytorch_logp_dlogp_func.py <https://github.com/eigenfoo/littlemcmc/tree/master/docs/_static/scripts/sample_pytorch_logp_dlogp_func.py>`__ for a working example. Theoretically, however, all that’s needed is to run the following snippet:

trace, stats = lmc.sample(
    logp_dlogp_func=torch_logp_dlogp_func, model_ndim=3, tune=500, draws=1000, chains=4,
)

JAX

from jax.config import config
config.update("jax_enable_x64", True)

import jax
import jax.numpy as jnp

def jax_model(params):
    mean = params[0] * x + params[1]
    loglike = -0.5 * jnp.sum((y_obs - mean) ** 2 * jnp.exp(-2 * params[2]) + 2 * params[2])
    return loglike

@jax.jit
def jax_model_and_grad(x):
    return jax_model(x), jax.grad(jax_model)(x)


def jax_logp_dlogp_func(x):
    v, g = jax_model_and_grad(x)
    return np.asarray(v), np.asarray(g)
On a machine without a GPU or TPU, JAX warns that it is falling back to the CPU. Benchmarking jax_logp_dlogp_func gives roughly 269 µs ± 48.6 µs per call (mean ± std. dev. of 7 runs).

Please see `sample_jax_logp_dlogp_func.py <https://github.com/eigenfoo/littlemcmc/tree/master/docs/_static/scripts/sample_jax_logp_dlogp_func.py>`__ for a working example. Theoretically, however, all that’s needed is to run the following snippet:

trace, stats = lmc.sample(
    logp_dlogp_func=jax_logp_dlogp_func, model_ndim=3, tune=500, draws=1000, chains=4,
)

PyMC3

import pymc3 as pm
import theano

with pm.Model() as pm_model:
    pm_params = pm.Flat("pm_params", shape=3)
    mean = pm_params[0] * x + pm_params[1]
    pm.Normal("obs", mu=mean, sigma=pm.math.exp(pm_params[2]), observed=y_obs)

pm_model_and_grad = pm_model.fastfn([pm_model.logpt] + theano.grad(pm_model.logpt, pm_model.vars))

def pm_logp_dlogp_func(x):
    return pm_model_and_grad(pm_model.bijection.rmap(x))
Benchmarking pm_logp_dlogp_func gives roughly 46.3 µs ± 3.94 µs per call (mean ± std. dev. of 7 runs, 10000 loops each).
trace, stats = lmc.sample(
    logp_dlogp_func=pm_logp_dlogp_func,
    model_ndim=3,
    tune=500,
    draws=1000,
    chains=4,
    progressbar=False,  # Progress bars don't render well in reStructuredText docs...
)
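As a quick sanity check (not part of the original notebook), you can verify that the framework implementations agree at a test point. The sketch below reuses the functions defined above; note that the PyMC3 log probability includes the Gaussian normalization constant, so only the PyTorch and JAX results are directly comparable to each other.

point = np.array([0.5, -2.3, -0.23])

torch_logp, torch_grad = torch_logp_dlogp_func(point)
jax_logp, jax_grad = jax_logp_dlogp_func(point)

# The PyTorch and JAX models drop the same additive constant, so both the
# log probabilities and the gradients should match to numerical precision.
assert np.allclose(torch_logp, jax_logp)
assert np.allclose(torch_grad, jax_grad)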

Visualize Traces with ArviZ

Just to sanity check our results, let’s visualize all the traces using ArviZ. At the time of writing there’s no easy way to load the np.ndarray objects that littlemcmc returns into an az.InferenceData object. Hopefully one day we’ll have an az.from_littlemcmc method… but until then, please use this code snippet!

import arviz as az

def arviz_from_littlemcmc(trace, stats):
    return az.InferenceData(
        posterior=az.dict_to_dataset({"x": trace}),
        sample_stats=az.dict_to_dataset({k: v.squeeze() for k, v in stats.items()})
    )

dataset = arviz_from_littlemcmc(trace, stats)

az.plot_trace(dataset)
plt.show()
[Figure: ArviZ trace plot of the sampled parameters]
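Beyond trace plots, the same InferenceData object works with the rest of ArviZ. For example, a quick numerical summary (a short sketch using the standard az.summary function):

# Posterior means, credible intervals, effective sample size, and R-hat diagnostics.
print(az.summary(dataset))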

API Reference

Sampling

sample(logp_dlogp_func, …) Draw samples from the posterior using the given step methods.

Step Methods

HamiltonianMC(logp_dlogp_func, …[, potential]) A sampler for continuous variables based on Hamiltonian mechanics.
NUTS(logp_dlogp_func, …) A sampler for continuous variables based on Hamiltonian mechanics.

Quadpotentials (a.k.a. Mass Matrices)

quad_potential(C, is_cov) Compute a QuadPotential object from a scaling matrix.
QuadPotentialDiag(v[, dtype]) Quad potential using a diagonal covariance matrix.
QuadPotentialFull(cov[, dtype]) Basic QuadPotential object for Hamiltonian calculations.
QuadPotentialFullInv(A[, dtype]) QuadPotential object for Hamiltonian calculations using inverse of covariance matrix.
QuadPotentialDiagAdapt(n, initial_mean[, …]) Adapt a diagonal mass matrix from the sample variances.
QuadPotentialFullAdapt(n, initial_mean[, …]) Adapt a dense mass matrix using the sample covariances.

Dual Averaging Step Size Adaptation

step_sizes.DualAverageAdaptation(…) Dual averaging step size adaptation.

Integrators

integration.CpuLeapfrogIntegrator(potential, …) Leapfrog integrator using the CPU.

About LittleMCMC

LittleMCMC is a lightweight, performant implementation of Hamiltonian Monte Carlo (HMC) and the No-U-Turn Sampler (NUTS) in Python. This document aims to explain and contextualize the motivation and purpose of LittleMCMC. For an introduction to the user-facing API, refer to the quickstart tutorial.

Motivation and Purpose

Bayesian inference and probabilistic computation are complicated and have many moving parts [1]_. As a result, many probabilistic programming frameworks (or any library that automates Bayesian inference) are monolithic libraries that handle everything from model specification (including automatic differentiation of the joint log probability), to inference (usually via Markov chain Monte Carlo or variational inference), to diagnosis and visualization of the inference results [2]_. PyMC3 and Stan are two excellent examples of such monolithic frameworks.

However, such monoliths require users to buy into an entire framework or ecosystem. For example, a user who has specified their model in one framework but wishes to migrate to another library (to take advantage of certain better-supported features, say) must reimplement their model from scratch in the new framework.

LittleMCMC addresses exactly this use case: by isolating PyMC’s HMC/NUTS code in a standalone library, it lets users who can supply their own log probability function and its derivative use PyMC’s battle-tested HMC/NUTS samplers.

LittleMCMC in Context

LittleMCMC stands on the shoulders of giants (that is, giant open source projects). Most obviously, LittleMCMC builds from (or, more accurately, is a spin-off project from) the PyMC project (both PyMC3 and PyMC4).

In terms of prior art, LittleMCMC is similar to several other open-source libraries, such as NUTS by Morgan Fouesneau or Sampyl by Mat Leonard. However, these libraries do not offer the same functionality as LittleMCMC: for example, they do not allow easy changes of the mass matrix (instead assuming an identity mass matrix), or they do not raise sampler errors or track sampler statistics such as divergences, energy, etc.

By offering step methods, integrators, quadpotentials and the sampling loop in separate Python modules, LittleMCMC offers not just a battle-tested sampler, but also an extensible one: users may configure the samplers as they wish.