Timeline

Michael Osthege edited this page Feb 26, 2021 · 15 revisions

2021-01-27 PyMC Timeline

What a year! What started out as a very diffuse outlook has converged to an exciting future for PyMC3: We decided to stick with Theano as our backend, while turning it into a next-generation graph computation engine.

This means that the library you know as PyMC3 will prevail, albeit with some breaking changes coming up.

Let's start with some basics:

  • With PyMC3 v3.10.0 we switched the backend from Theano v1.0.5 to Theano-PyMC v1.0.11. Theano-PyMC is our fork that we are currently refactoring.
  • With PyMC3 v3.11.0 we made some long-overdue breaking changes and fixes in PyMC3 and pinned to Theano-PyMC v1.1.0, which also breaks a few things. These breaking changes in both libraries are, however, easy to adapt to.
  • PyMC3 3.11.1 added a bunch of bugfixes.
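
The pins listed above can be written down as a small lookup, which may help when deciding whether an upgrade crosses the breaking 3.11.0 release. This is only a sketch: the helper names (`parse_version`, `backend_for`, `crosses_breaking_release`) are hypothetical and not part of PyMC3, and the table contains only the two pins stated above.

```python
# Backend pins exactly as stated in the list above. No pin is assumed
# for releases the text does not mention (e.g. 3.11.1).
BACKEND_PINS = {
    (3, 10, 0): ("Theano-PyMC", "1.0.11"),
    (3, 11, 0): ("Theano-PyMC", "1.1.0"),
}


def parse_version(text):
    """Turn a plain release string such as '3.11.0' into (3, 11, 0)."""
    return tuple(int(part) for part in text.split("."))


def backend_for(pymc3_version):
    """Return the stated (package, version) pin, or None if none is listed."""
    return BACKEND_PINS.get(parse_version(pymc3_version))


def crosses_breaking_release(old, new):
    """True when upgrading from `old` to `new` passes through 3.11.0."""
    return parse_version(old) < (3, 11, 0) <= parse_version(new)


if __name__ == "__main__":
    print(backend_for("3.10.0"))
    print(crosses_breaking_release("3.10.0", "3.11.1"))
```

Version tuples compare element by element, so the check needs no third-party version parser as long as only plain `X.Y.Z` release strings are passed in.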

You could say that the step from 3.10 to 3.11 is a "rough ride", but the biggest change is still coming up:

A new RandomVariable Op was recently merged into Theano-PyMC. By switching the inner workings of PyMC3 from its pm.Distribution class to the new RandomVariable, we will open the door to many exciting possibilities (see PR #4440). It will also dramatically simplify the inner workings of PyMC3, solving most if not all shape problems and allowing us to delete huge chunks of internal code.

Theano-PyMC was renamed and released as Aesara v2.0. The rename of Theano to Aesara and the switch of the PyMC3 internals to RandomVariable mandate a PyMC3 4.0 release. The development for that happens on the v4 branch.

Why not start fresh with a new package "PyMC4"?

Yeah, about that...

Even though the internals of PyMC3 4.0 will be quite different to PyMC3 3.x, we can keep the user-facing API largely identical. Custom distributions will have to be refactored, but most models will work just fine! The same is true for commonly used functions for sampling and prior/posterior predictive.
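
To make "user-facing API" concrete, here is a minimal sketch of the kind of model code this refers to, written against the PyMC3 3.x API: a model context, built-in distributions, and the sampling and prior/posterior predictive functions. The data helper `simulate_data`, the wrapper `fit`, and the toy regression itself are illustrative assumptions. Whether this exact snippet runs unchanged on 4.0 is not something the text above guarantees.

```python
import random


def simulate_data(n=50, seed=0):
    """Toy linear data (hypothetical helper, standard library only)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
    return x, y


def fit(x, y, draws=500):
    """Build and sample a small regression model with the PyMC3 3.x API."""
    # Imported here so the data helper above works without PyMC3 installed.
    import numpy as np
    import pymc3 as pm

    x = np.asarray(x)
    y = np.asarray(y)
    with pm.Model():
        intercept = pm.Normal("intercept", mu=0.0, sigma=10.0)
        slope = pm.Normal("slope", mu=0.0, sigma=10.0)
        noise = pm.HalfNormal("noise", sigma=1.0)
        pm.Normal("y", mu=intercept + slope * x, sigma=noise, observed=y)

        # The commonly used entry points mentioned above.
        prior = pm.sample_prior_predictive(samples=100)
        trace = pm.sample(draws=draws, tune=draws, chains=2, cores=1,
                          return_inferencedata=False)
        posterior = pm.sample_posterior_predictive(trace)
    return prior, trace, posterior


if __name__ == "__main__":
    x, y = simulate_data()
    prior, trace, posterior = fit(x, y)
    print(trace["slope"].mean())
```

Nothing here subclasses `pm.Distribution` or touches internals, which is why code of this shape is the part expected to carry over, while custom distributions are the part that needs refactoring.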

We only have the capacity to maintain and develop one PyMC library. Superseding 3.x with 4.0 focuses users and developers on that one library.

How we'll pull it off

As mentioned earlier, we're continuing on a slightly faster release cycle from 3.10 onwards. Here we'll focus on the following:

  • Taking out deprecated functionality and streamlining the API to facilitate switching to 4.0
  • Not touching obsolete parts that will be replaced/removed for RandomVariable-based models.
  • Continuing to refactor Theano-PyMC for easier development, while making some incremental releases.

The switch to RandomVariable and Aesara 2.0 is being prepared on a v4 branch on the main repo. This way the PR and Issue numbering remains compatible, and we can do a single "The big one" PR into master to bump us to 4.0.

  • Some things are still ToDo in the backend. See Milestones.
  • We'll have to temporarily phase out some PyMC3 submodules such as glm, gp, ode and specialized step methods such as SMC or MLDA. This is just so we can push forward with the core functionality and gradually phase them back in (e.g. with 4.1).

JAX Linking

The JAX linker is already available in current Theano-PyMC releases. It is not a blocker w.r.t. RandomVariable and PyMC 4.0, which means that improvements to it are independent of this timeline!

Contributing

Switching the inner workings of PyMC3 while keeping large parts of its user-facing API unaffected is a bit like open heart surgery. It requires good planning, concentration and swift execution, but brings great benefits compared to the alternatives.

We manage this through milestones on PyMC3 and Theano-PyMC.

  • Everything in the vNext milestone has topmost priority.
  • Issues in milestones later than vNext may depend on things from a previous milestone.
  • Issues outside of milestones are considered "backlog". They don't block a release, but fixing them is still an important contribution.

tl;dr:

We're transplanting the inner organs of PyMC3. It's tricky, but with 4.0 your favorite library will supercharge modern probabilistic programming.