Gaussian processes on graphs and lattices in Stan.

onnela-lab/gptools

gptools: Performant Gaussian Processes in Stan

Gaussian processes (GPs) are powerful distributions for modeling functional data, but using them is computationally challenging except for small datasets. gptools implements two methods for performant GP inference in Stan.

  1. A sparse approximation of the likelihood. This approach includes nearest neighbor Gaussian processes but also supports more general dependence structures, e.g., for periodic kernels.
  2. An exact likelihood evaluation for data on regularly spaced lattices using fast Fourier transforms.
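
To make the first method concrete, here is a minimal NumPy sketch of a nearest-neighbor factorization of a zero-mean GP log density. It is not the package's Stan implementation: the helper names (`sq_exp_kernel`, `nearest_neighbor_gp_logpdf`), the squared exponential kernel, the `noise` nugget, and the assumption that the 1-D locations `x` are sorted (so the `k` preceding points are the nearest predecessors) are all illustrative choices.

```python
import numpy as np


def sq_exp_kernel(x1, x2, length_scale=1.0, sigma=1.0):
    """Squared exponential kernel between two sets of 1-D locations."""
    diff = x1[:, None] - x2[None, :]
    return sigma ** 2 * np.exp(-0.5 * (diff / length_scale) ** 2)


def nearest_neighbor_gp_logpdf(y, x, k, noise=0.1):
    """Approximate log density: condition each point on at most k predecessors."""
    total = 0.0
    for i in range(y.size):
        # Indices of the (at most) k points that precede point i.
        parents = np.arange(max(0, i - k), i)
        mean = 0.0
        var = sq_exp_kernel(x[i:i + 1], x[i:i + 1])[0, 0] + noise
        if parents.size:
            cov_pp = sq_exp_kernel(x[parents], x[parents])
            cov_pp = cov_pp + noise * np.eye(parents.size)
            cov_ip = sq_exp_kernel(x[i:i + 1], x[parents])[0]
            # Gaussian conditioning of point i on its parents.
            weights = np.linalg.solve(cov_pp, cov_ip)
            mean = weights @ y[parents]
            var = var - weights @ cov_ip
        # Univariate normal log density of y[i] given its parents.
        total += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total


x = np.linspace(0, 5, 20)
y = np.sin(x)
print(nearest_neighbor_gp_logpdf(y, x, k=3))
```

Each term costs O(k³) rather than O(n³), which is where the savings come from. Setting `k = n - 1` conditions every point on all of its predecessors and recovers the exact log density.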

The implementation follows Stan’s design and exposes performant inference through a familiar interface. We provide interfaces in Python and R. See the accompanying publication Scalable Gaussian Process Inference with Stan for details of the implementation. The comprehensive documentation includes many examples.

Getting Started

You can use the gptools package by including it in the functions block of your Stan program, e.g.,

functions {
    // Include utility functions, such as real fast Fourier transforms.
    #include gptools/util.stan
    // Include functions to evaluate GP likelihoods with Fourier methods.
    #include gptools/fft.stan
}
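
For intuition about why Fourier methods help, the following NumPy sketch evaluates a zero-mean GP log density on a 1-D lattice with periodic boundary conditions. Under that assumption the covariance matrix is circulant, so its eigenvalues are the discrete Fourier transform of its first row and the density costs O(n log n). This illustrates the general idea only, not the Stan functions in `gptools/fft.stan`; the function name `circulant_gp_logpdf`, the kernel, and the nugget value are hypothetical choices for the example.

```python
import numpy as np


def circulant_gp_logpdf(y, cov_row):
    """Log density of a zero-mean GP whose covariance is circulant.

    cov_row is the first row of the symmetric circulant covariance matrix.
    """
    n = y.size
    # Eigenvalues of a symmetric circulant matrix: DFT of its first row.
    eigvals = np.fft.fft(cov_row).real
    # Squared moduli of the Fourier coefficients of the data.
    power = np.abs(np.fft.fft(y)) ** 2
    # Quadratic form y^T C^{-1} y and log determinant in the Fourier basis.
    quad = np.sum(power / eigvals) / n
    logdet = np.sum(np.log(eigvals))
    return -0.5 * (quad + logdet + n * np.log(2 * np.pi))


n = 32
lag = np.arange(n)
# Periodic distance between lattice site 0 and every other site.
dist = np.minimum(lag, n - lag)
# Squared exponential kernel plus a nugget on the diagonal.
cov_row = np.exp(-0.5 * (dist / 2.0) ** 2)
cov_row[0] += 0.1

y = np.cos(2 * np.pi * lag / n)
print(circulant_gp_logpdf(y, cov_row))
```

A dense evaluation of the same density would need an O(n³) factorization of the full covariance matrix, which the FFT-based version avoids entirely.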

See here for the list of all include options. You can download the Stan files and use them with your favorite interface or use the provided Python and R interfaces.

Python and cmdstanpy

  1. Install cmdstanpy and cmdstan if you haven't already (see here for details).
  2. Install gptools from PyPI by running pip install gptools-stan from the command line.
  3. Compile your first model.
from cmdstanpy import CmdStanModel
from gptools.stan import get_include

model = CmdStanModel(
  stan_file="path/to/your/model.stan",
  stanc_options={"include-paths": get_include()},
)

For an end-to-end example, see this notebook.

R and cmdstanr

  1. Install cmdstanr and cmdstan if you haven't already (see here for details).
  2. Install gptools from CRAN by running install.packages("gptoolsStan").
  3. Compile your first model.
library(cmdstanr)
library(gptoolsStan)

model <- cmdstan_model(
  stan_file="path/to/your/model.stan",
  include_paths=gptools_include_path()
)

Note

Unfortunately, RStan is not supported because it does not provide an option to specify include paths.

For an end-to-end example, see this vignette.

Contributing

Contributions to the package are always welcome! The repository structure is as follows:

Replicating the Results

To replicate the results presented in the accompanying publication Scalable Gaussian Process Inference with Stan, please see the dedicated repository of replication materials.