Releases · pymc-devs/pymc
PyMC3 v3.8 (29 November, 2019)
New features
- Implemented robust U-turn check in NUTS (similar to stan-dev/stan#2800). See PR #3605.
- Add capabilities to do inference on parameters in a differential equation with `DifferentialEquation`. See #3590 and #3634.
- Distinguish between `Data` and `Deterministic` variables when graphing models with graphviz. PR #3491.
- Sequential Monte Carlo - Approximate Bayesian Computation step method is now available. The implementation is in an experimental stage and will be further improved.
- Added `Matern12` covariance function for Gaussian processes. This is the Matern kernel with nu=1/2 (see the sketch after this list).
- Progressbar reports number of divergences in real time, when available. #3547
- Sampling from variational approximation now allows for alternative trace backends. #3550
- Infix `@` operator now works with random variables and deterministics. #3619
- ArviZ is now a requirement, and handles plotting, diagnostics, and statistical checks.
- Can use `GaussianRandomWalk` in `sample_prior_predictive` and `sample_posterior_predictive`. #3682
- Now 11 years of S&P returns in data set. #3682
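To illustrate the new `Matern12` kernel, here is a minimal marginal-GP sketch; the toy data and prior choices are hypothetical and only meant to show where the kernel plugs in:

```python
import numpy as np
import pymc3 as pm

# Hypothetical toy data.
X = np.linspace(0, 10, 50)[:, None]
y = np.sin(X).ravel() + 0.2 * np.random.randn(50)

with pm.Model():
    ls = pm.Gamma("ls", alpha=2, beta=1)
    # New in 3.8: Matern kernel with nu = 1/2 (exponential covariance).
    cov = pm.gp.cov.Matern12(input_dim=1, ls=ls)
    gp = pm.gp.Marginal(cov_func=cov)
    sigma = pm.HalfNormal("sigma", sigma=1)
    gp.marginal_likelihood("y_obs", X=X, y=y, noise=sigma)
    trace = pm.sample(1000, tune=1000)
```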
Maintenance
- Moved math operations out of the `Rice`, `TruncatedNormal`, `Triangular` and `ZeroInflatedNegativeBinomial` `random` methods. Math operations on values returned by `draw_values` might not broadcast well, and all the `size` aware broadcasting is left to `generate_samples`. Fixes #3481 and #3508.
- Parallelization of population steppers (`DEMetropolis`) is now set via the `cores` argument. (#3559)
- Fixed a bug in `Categorical.logp`. In the case of multidimensional `p`'s, the indexing was done wrong, leading to incorrectly shaped tensors that consumed `O(n**2)` memory instead of `O(n)`. This fixes issue #3535.
- Fixed a defect in `OrderedLogistic.__init__` that unnecessarily increased the dimensionality of the underlying `p`. Related to issue #3535 but was not the true cause of it.
- SMC: stabilize covariance matrix. #3573
- SMC is no longer a step method of `pm.sample`; it should now be called using `pm.sample_smc` (see the sketch after this list). #3579
- SMC: improve computation of the proposal scaling factor. #3594 and #3625
- SMC: reduce number of logp evaluations. #3600
- SMC: remove `scaling` and `tune_scaling` arguments, as it is a better idea to always allow SMC to automatically compute the scaling factor. #3625
- Now uses `multiprocessing` rather than `psutil` to count CPUs, which results in reliable core counts on Chromebooks.
- `sample_posterior_predictive` now preallocates the memory required for its output to improve memory usage. Addresses problems raised in this discourse thread.
- Wrapped `DensityDist.rand` with `generate_samples` to make it aware of the distribution's shape. Added control flow attributes to still be able to behave as in earlier versions, and to control how to interpret the `size` parameter in the `random` callable signature. Fixes #3553.
- Added `theano.gof.graph.Constant` to type checks done in `_draw_value` (fixes issue #3595).
- `HalfNormal` did not work properly in `draw_values`, `sample_prior_predictive`, or `sample_posterior_predictive` (fixes issue #3686).
- Random variable transforms were inadvertently left out of the API documentation. Added them. (See PR #3690.)
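As noted above, SMC now has its own entry point. A minimal sketch (the toy model is hypothetical; only the call pattern is the point):

```python
import pymc3 as pm

# SMC is no longer passed as a step method to pm.sample;
# it is invoked directly through pm.sample_smc.
with pm.Model():
    mu = pm.Normal("mu", mu=0, sigma=10)
    pm.Normal("obs", mu=mu, sigma=1, observed=[0.1, -0.3, 0.2])
    trace = pm.sample_smc(1000)
```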
PyMC3 3.7 (May 29 2019)
New features
- Add data container class (`Data`) that wraps the theano `SharedVariable` class and lets the model be aware of its inputs and outputs.
- Add function `set_data` to update variables defined as `Data` (example after this list).
- `Mixture` now supports mixtures of multidimensional probability distributions, not just lists of 1D distributions.
- `GLM.from_formula` and `LinearComponent.from_formula` can extract variables from the calling scope. Customizable via the new `eval_env` argument. Fixing #3382.
- Added the `distributions.shape_utils` module with functions used to help broadcast samples drawn from distributions using the `size` keyword argument.
- Used `numpy.vectorize` in `distributions.distribution._compile_theano_function`. This enables `sample_prior_predictive` and `sample_posterior_predictive` to ask for tuples of samples instead of just integers. This fixes issue #3422.
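A minimal sketch of the `Data`/`set_data` workflow; the regression model and variable names are hypothetical:

```python
import numpy as np
import pymc3 as pm

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2])

with pm.Model() as model:
    x_shared = pm.Data("x_shared", x)
    y_shared = pm.Data("y_shared", y)
    beta = pm.Normal("beta", mu=0, sigma=10)
    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal("obs", mu=beta * x_shared, sigma=sigma, observed=y_shared)
    trace = pm.sample(1000, tune=1000)

# Swap in new inputs and draw posterior predictive samples for them.
with model:
    pm.set_data({"x_shared": np.array([4.0, 5.0]), "y_shared": np.zeros(2)})
    ppc = pm.sample_posterior_predictive(trace)
```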
Maintenance
- All occurrences of `sd` as a parameter name have been renamed to `sigma`. `sd` will continue to function for backwards compatibility.
- `HamiltonianMC` was ignoring certain arguments like `target_accept`, and not using the custom step size jitter function with expectation 1.
- Made `BrokenPipeError` for parallel sampling more verbose on Windows.
- Added the `broadcast_distribution_samples` function that helps broadcasting arrays of drawn samples, taking into account the requested `size` and the inferred distribution shape. This is sometimes needed by distributions that call several `rvs` separately within their `random` method, such as the `ZeroInflatedPoisson` (fixes issue #3310).
- The `Wald`, `Kumaraswamy`, `LogNormal`, `Pareto`, `Cauchy`, `HalfCauchy`, `Weibull` and `ExGaussian` distributions' `random` method used a hidden `_random` function that was written with scalars in mind. This could potentially lead to artificial correlations between random draws. Added shape guards and broadcasting of the distribution samples to prevent this (similar to issue #3310).
- Added a fix to allow the imputation of single missing values of observed data, which previously would fail (fixes issue #3122).
- The `draw_values` function was too permissive with what could be grabbed from inside `point`, which led to an error when sampling posterior predictives of variables that depended on shared variables that had changed their shape after `pm.sample()` had been called (fixes issue #3346).
- `draw_values` now adds the theano graph descendants of `TensorConstant` or `SharedVariable`s to the named relationship nodes stack, only if these descendants are `ObservedRV` or `MultiObservedRV` instances (fixes issue #3354).
- Fixed bug in `broadcast_distribution_samples`, which did not correctly handle cases in which some samples did not have the size tuple prepended.
- Changed `MvNormal.random`'s usage of `tensordot` for Cholesky-encoded covariances. This led to wrong axis broadcasting and seemed to be the cause of issue #3343.
- Fixed defect in `Mixture.random` when multidimensional mixtures were involved. The mixture component was not preserved across all the elements of the dimensions of the mixture. This meant that the correlations across elements within a given draw of the mixture were partly broken.
- Restructured `Mixture.random` to allow better use of vectorized calls to `comp_dists.random`.
- Added tests for mixtures of multidimensional distributions to the test suite.
- Fixed incorrect usage of `broadcast_distribution_samples` in `DiscreteWeibull`.
- `Mixture`'s default dtype is now determined by `theano.config.floatX`.
- `dist_math.random_choice` now handles nd-arrays of category probabilities, and also handles sizes that are not `None`. Also removed the unused `k` kwarg from `dist_math.random_choice`.
- Changed `Categorical.mode` to preserve all the dimensions of `p` except the last one, which encodes each category's probability.
- Changed initialization of `Categorical.p`. `p` is now normalized to sum to 1 inside `logp` and `random`, but not during initialization. This could hide negative values supplied to `p` as mentioned in #2082.
- `Categorical` now accepts elements of `p` equal to 0. `logp` will return `-inf` if there are `values` that index to the zero-probability categories.
- Add `sigma`, `tau`, and `sd` to the signature of `NormalMixture`.
- Set default lower and upper values of `-inf` and `inf` for `pm.distributions.continuous.TruncatedNormal`. This avoids errors caused by their previous values of `None` (fixes issue #3248).
- Converted all calls to `pm.distributions.bound._ContinuousBounded` and `pm.distributions.bound._DiscreteBounded` to use only and all positional arguments (fixes issue #3399).
- Restructured `distributions.distribution.generate_samples` to use the `shape_utils` module. This solves issues #3421 and #3147 by using the `size` aware broadcasting functions in `shape_utils`.
- Fixed the `Multinomial.random` and `Multinomial.random_` methods to make them compatible with the new `generate_samples` function. In the process, a bug in the `Multinomial.random_` shape handling was discovered and fixed.
- Fixed a defect found in `Bound.random` where the `point` dictionary was passed to `generate_samples` as an `arg` instead of in `not_broadcast_kwargs`.
- Fixed a defect found in `Bound.random_` where `total_size` could end up as a `float64` instead of being an integer if given `size=tuple()`.
- Fixed an issue in `model_graph` that caused construction of the graph of the model for rendering to hang: replaced a search over the powerset of the nodes with a breadth-first search over the nodes. Fix for #3458.
- Removed variable annotations from `model_graph` but left type hints (fix for #3465). This means that we support `python>=3.5.4`.
- Default `target_accept` for `HamiltonianMC` is now 0.65, as suggested in Beskos et al. 2010 and Neal 2001.
- Fixed bug in `draw_values` that led to intermittent errors in python3.5. This happened with some deterministic nodes that were drawn but not added to `givens`.
Deprecations
- `nuts_kwargs` and `step_kwargs` have been deprecated in favor of using the standard `kwargs` to pass optional step method arguments.
- `SGFS` and `CSG` have been removed (fix for #3353). They have been moved to pymc3-experimental.
- References to `live_plot` and corresponding notebooks have been removed.
- Function `approx_hessian` was removed, due to `numdifftools` becoming incompatible with current `scipy`. The function was already optional, only available to a user who installed `numdifftools` separately, and not hit on any common codepaths. #3485
- Deprecated the `vars` parameters of `sample_posterior_predictive` and `sample_prior_predictive` in favor of `var_names`. At least for the latter, this is more accurate, since the `vars` parameter actually took names.
Contributors sorted by number of commits
45 Luciano Paz
38 Thomas Wiecki
23 Colin Carroll
19 Junpeng Lao
15 Chris Fonnesbeck
13 Juan Martín Loyola
13 Ravin Kumar
8 Robert P. Goldman
5 Tim Blazina
4 chang111
4 adamboche
3 Eric Ma
3 Osvaldo Martin
3 Sanmitra Ghosh
3 Saurav Shekhar
3 chartl
3 fredcallaway
3 Demetri
2 Daisuke Kondo
2 David Brochart
2 George Ho
2 Vaibhav Sinha
1 rpgoldman
1 Adel Tomilova
1 Adriaan van der Graaf
1 Bas Nijholt
1 Benjamin Wild
1 Brigitta Sipocz
1 Daniel Emaasit
1 Hari
1 Jeroen
1 Joseph Willard
1 Juan Martin Loyola
1 Katrin Leinweber
1 Lisa Martin
1 M. Domenzain
1 Matt Pitkin
1 Peadar Coyle
1 Rupal Sharma
1 Tom Gilliss
1 changjiangeng
1 michaelosthege
1 monsta
1 579397
v3.6
This is a major new release from 3.5 with many new features and important bugfixes. The highlight is certainly our completely revamped website: https://docs.pymc.io/
Note also that this release will be the last to be compatible with Python 2. Thanks to all contributors!
New features
- Track the model log-likelihood as a sampler stat for NUTS and HMC samplers (accessible as `trace.get_sampler_stats('model_logp')`) (#3134) (example after this list)
- Add Incomplete Beta function `incomplete_beta(a, b, value)`
- Add log CDF functions to continuous distributions: `Beta`, `Cauchy`, `ExGaussian`, `Exponential`, `Flat`, `Gumbel`, `HalfCauchy`, `HalfFlat`, `HalfNormal`, `Laplace`, `Logistic`, `Lognormal`, `Normal`, `Pareto`, `StudentT`, `Triangular`, `Uniform`, `Wald`, `Weibull`.
- Behavior of `sample_posterior_predictive` is now to produce posterior predictive samples, in order, from all values of the `trace`. Previously, by default it would produce one chain's worth of samples, using a random selection from the `trace` (#3212)
- Show diagnostics for initial energy errors in HMC and NUTS.
- PR #3273 has added the `distributions.distribution._DrawValuesContext` context manager. This is used to store the values already drawn in nested `random` and `draw_values` calls, enabling `draw_values` to draw samples from the joint probability distribution of RVs and not the marginals. Custom distributions that must call `draw_values` several times in their `random` method, or that invoke many calls to other distributions' `random` methods (e.g. mixtures) must do all of these calls under the same `_DrawValuesContext` context manager instance. If they do not, the conditional relations between the distribution's parameters could be broken, and `random` could return values drawn from an incorrect distribution.
- `Rice` distribution is now defined with either the noncentrality parameter or the shape parameter (#3287).
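A quick sketch of reading the new `model_logp` sampler stat; the toy model is hypothetical:

```python
import pymc3 as pm

with pm.Model():
    mu = pm.Normal("mu", mu=0, sd=1)
    pm.Normal("obs", mu=mu, sd=1, observed=[0.2, -0.1, 0.4])
    trace = pm.sample(1000, tune=1000)

# One model log-likelihood value per draw, stored with the other sampler stats.
model_logp = trace.get_sampler_stats("model_logp")
print(model_logp.shape)
```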
Maintenance
- Big rewrite of documentation (#3275)
- Fixed Triangular distribution `c` attribute handling in `random` and updated sample codes for consistency (#3225)
- Refactor SMC and properly compute marginal likelihood (#3124)
- Removed use of deprecated `ymin` keyword in matplotlib's `Axes.set_ylim` (#3279)
- Fix for #3210. Now `distribution.draw_values(params)` will draw the `params` values from their joint probability distribution and not from combinations of their marginals (refer to PR #3273).
- Removed dependence on pandas-datareader for retrieving Yahoo Finance data in examples (#3262)
- Rewrote the `Multinomial._random` method to better handle shape broadcasting (#3271)
- Fixed the `Rice` distribution, which inconsistently mixed two parametrizations (#3286).
- `Rice` distribution now accepts multiple parameters and observations and is usable with NUTS (#3289).
- `sample_posterior_predictive` no longer calls `draw_values` to initialize the shape of the ppc trace. This call could lead to `ValueError`s when sampling the ppc from a model with `Flat` or `HalfFlat` prior distributions (fix issue #3294).
Deprecations
- Renamed `sample_ppc()` and `sample_ppc_w()` to `sample_posterior_predictive()` and `sample_posterior_predictive_w()`, respectively.
v3.5 Final
New features
- Add documentation section on survival analysis and censored data models
- Add `check_test_point` method to `pm.Model`
- Add `Ordered` transformation and `OrderedLogistic` distribution
- Add `Chain` transformation
- Improve error message `Mass matrix contains zeros on the diagonal. Some derivatives might always be zero` during tuning of `pm.sample`
- Improve error message `NaN occurred in optimization.` during ADVI
- Save and load traces without `pickle` using `pm.save_trace` and `pm.load_trace` (example after this list)
- Add `Kumaraswamy` distribution
- Add `TruncatedNormal` distribution
- Rewrite parallel sampling of multiple chains on py3. This resolves long standing issues when transferring large traces to the main process, avoids pickling issues on UNIX, and allows us to show a progress bar for all chains. If parallel sampling is interrupted, we now return partial results.
- Add `sample_prior_predictive` which allows for efficient sampling from the unconditioned model.
- SMC: remove experimental warning, allow sampling using `sample`, reduce autocorrelation from final trace.
- Add `model_to_graphviz` (which uses the optional dependency `graphviz`) to plot a directed graph of a PyMC3 model using plate notation.
- Add beta-ELBO variational inference as in beta-VAE model (Christopher P. Burgess et al. NIPS, 2017)
- Add `__dir__` to `SingleGroupApproximation` to improve autocompletion in interactive environments
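A minimal sketch of the new pickle-free trace persistence; the model, directory name, and draw counts are hypothetical:

```python
import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sd=1)
    pm.Normal("obs", mu=mu, sd=1, observed=[0.5, -0.2])
    trace = pm.sample(500, tune=500)
    # Write the trace to disk without pickling.
    pm.save_trace(trace, directory="my_trace")

# Later, with the same model definition in scope, reload it.
with model:
    restored = pm.load_trace("my_trace")
```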
Fixes
- Fixed grammar in divergence warning; previously `There were 1 divergences ...` could be raised.
- Fixed `KeyError` raised when only a subset of variables are specified to be recorded in the trace.
- Removed unused `repeat=None` arguments from all `random()` methods in distributions.
- Deprecated the `sigma` argument in `MarginalSparse.marginal_likelihood` in favor of `noise`.
- Fixed unexpected behavior in `random`. Now the `random` functionality is more robust and will work better for `sample_prior` when that is implemented.
- Fixed `scale_cost_to_minibatch` behaviour; previously this was not working and always `False`.
v3.4.1 Final
There was no 3.4 release due to a naming issue on PyPI.
New features
- Add `logit_p` keyword to `pm.Bernoulli`, so that users can specify the logit of the success probability. This is faster and more stable than using `p=tt.nnet.sigmoid(logit_p)` (example after this list).
- Add `random` keyword to `pm.DensityDist`, thus enabling users to pass a custom random method which in turn makes sampling from a `DensityDist` possible.
- Effective sample size computation is updated. The estimation uses Geyer's initial positive sequence, which no longer truncates the autocorrelation series inaccurately. `pm.diagnostics.effective_n` can now report N_eff > N.
- Added `KroneckerNormal` distribution and a corresponding `MarginalKron` Gaussian Process implementation for efficient inference, along with lower-level functions such as `cartesian` and `kronecker` products.
- Added `Coregion` covariance function.
- Add new 'pairplot' function, for plotting scatter or hexbin matrices of sampled parameters. Optionally it can plot divergences.
- Plots of discrete distributions in the docstrings
- Add logitnormal distribution
- Densityplot: add support for discrete variables
- Fix the Binomial likelihood in `.glm.families.Binomial`, with the flexibility of specifying the `n`.
- Add `offset` kwarg to `.glm`.
- Changed the `compare` function to accept a dictionary of model-trace pairs instead of two separate lists of models and traces.
- Add test and support for creating multivariate mixture and mixture of mixtures.
- `distribution.draw_values` is now also able to draw values from conditionally dependent RVs, such as autotransformed RVs (refer to PR #2902).
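A minimal logistic-regression-style sketch of the new `logit_p` keyword; the data and priors are hypothetical:

```python
import numpy as np
import pymc3 as pm

x = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

with pm.Model():
    intercept = pm.Normal("intercept", mu=0, sd=5)
    slope = pm.Normal("slope", mu=0, sd=5)
    # Pass the logit of the success probability directly,
    # instead of p=tt.nnet.sigmoid(...).
    pm.Bernoulli("obs", logit_p=intercept + slope * x, observed=y)
    trace = pm.sample(1000, tune=1000)
```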
Fixes
- `VonMises` does not overflow for large values of kappa. `i0` and `i1` have been removed and we now use `log_i0` to compute the logp.
- The bandwidth for KDE plots is computed using a modified version of Scott's rule. The new version uses entropy instead of standard deviation. This works better for multimodal distributions. Functions using KDE plots have a new argument `bw` controlling the bandwidth.
- Fix: a PyMC3 variable is not replaced if provided in more_replacements (#2890)
- Fix for issue #2900. For many situations, named node-inputs do not have a `random` method, while some intermediate node may have it. This meant that if the named node-input at the leaf of the graph did not have a fixed value, `theano` would try to compile it and fail to find inputs, raising a `theano.gof.fg.MissingInputError`. This was fixed by going through the theano variable's owner inputs graph, trying to get intermediate named-node values if the leaves had failed.
- In `distribution.draw_values`, some named nodes could be `theano.tensor.TensorConstant`s or `theano.tensor.sharedvar.SharedVariable`s. Nevertheless, in `distribution._draw_value`, these would be passed to `distribution._compile_theano_function` as if they were `theano.tensor.TensorVariable`s. This could lead to the following exceptions: `TypeError: ('Constants not allowed in param list', ...)` or `TypeError: Cannot use a shared variable (...)`. The fix was to not add `theano.tensor.TensorConstant` or `theano.tensor.sharedvar.SharedVariable` named nodes into the `givens` dict that could be used in `distribution._compile_theano_function`.
- Exponential support changed to include zero values.
Deprecations
- DIC and BPIC calculations have been removed
- `df_summary` has been removed; use `summary` instead
- `njobs` and `nchains` kwargs are deprecated in favor of `cores` and `chains` for `sample`
- The `lag` kwarg in `pm.stats.autocorr` and `pm.stats.autocov` is deprecated.
v3.3 Final
New features
- Improve NUTS initialization `advi+adapt_diag_grad` and add `jitter+adapt_diag_grad` (#2643)
- Added `MatrixNormal` class for representing vectors of multivariate normal variables
- Implemented `HalfStudentT` distribution
- New benchmark suite added (see http://pandas.pydata.org/speed/pymc3/)
- Generalized random seed types
- Update loo, new improved algorithm (#2730)
- New CSG (Constant Stochastic Gradient) approximate posterior sampling algorithm (#2544)
- Michael Osthege added support for population samplers and implemented differential evolution metropolis (`DEMetropolis`). For models with correlated dimensions that cannot use gradient-based samplers, the `DEMetropolis` sampler can give higher effective sampling rates (also see PR #2735; sketch after this list).
- Forestplot supports multiple traces (#2736)
- Add new plot, densityplot (#2741)
- DIC and BPIC calculations have been deprecated
- Refactor HMC and implemented new warning system (#2677, #2808)
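A rough sketch of running `DEMetropolis`; the model is hypothetical, and the keyword for the number of population chains has changed names across releases (shown here with the current `chains` spelling), so treat this as illustrative rather than version-exact:

```python
import pymc3 as pm

with pm.Model():
    x = pm.Normal("x", mu=0, sd=1)
    y = pm.Normal("y", mu=x, sd=1)
    # DEMetropolis proposes jumps using differences between chains,
    # so it needs a population of chains running together.
    step = pm.DEMetropolis()
    trace = pm.sample(draws=2000, step=step, chains=20)
```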
Fixes
- Fixed `compareplot` to use `loo` output.
- Improved `posteriorplot` to scale fonts
- `sample_ppc_w` now broadcasts
- `df_summary` function renamed to `summary`
- Add test for `model.logp_array` and `model.bijection` (#2724)
- Fixed `sample_ppc` and `sample_ppc_w` to iterate all chains (#2633, #2748)
- Add Bayesian R2 score (for GLMs) `stats.r2_score` (#2696) and test (#2729).
- SMC works with transformed variables (#2755)
- Speedup OPVI (#2759)
- Multiple minor fixes and improvements in the docs (#2775, #2786, #2787, #2789, #2790, #2794, #2799, #2809)
Deprecations
- Old (`minibatch-`)`advi` is removed (#2781)
v3.2 Final
- This version includes two major contributions from our Google Summer of Code 2017 students:
  - Maxim Kochurov extended and refactored the variational inference module. This primarily adds two important classes, representing operator variational inference (`OPVI`) objects and `Approximation` objects. These make it easier to extend existing `variational` classes, and to derive inference from `variational` optimizations, respectively. The `variational` module now also includes normalizing flows (`NFVI`).
  - Bill Engels added an extensive new Gaussian processes (`gp`) module. Standard GPs can be specified using either `Latent` or `Marginal` classes, depending on the nature of the underlying function. A Student-T process `TP` has been added. In order to accommodate larger datasets, approximate marginal Gaussian processes (`MarginalSparse`) have been added (sketch after this list).
- Documentation has been improved as the result of the project's monthly "docathons".
- An experimental stochastic gradient Fisher scoring (`SGFS`) sampling step method has been added.
- The API for `find_MAP` was enhanced.
- SMC now estimates the marginal likelihood.
- Added `Logistic` and `HalfFlat` distributions to the set of continuous distributions.
- Bayesian fraction of missing information (`bfmi`) function added to `stats`.
- Enhancements to `compareplot` added.
- QuadPotential adaptation has been implemented.
- Script added to build and deploy documentation.
- MAP estimates now available for transformed and non-transformed variables.
- The `Constant` variable class has been deprecated, and will be removed in 3.3.
- DIC and BPIC calculations have been sped up.
- Arrays are now accepted as arguments for the `Bound` class.
- `random` method was added to the `Wishart` and `LKJCorr` distributions.
- Progress bars have been added to LOO and WAIC calculations.
- All example notebooks updated to reflect changes in API since 3.1.
- Parts of the test suite have been refactored.
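A minimal sketch of a sparse marginal GP with inducing points; the data, kernel, and settings are hypothetical and only illustrate where `MarginalSparse` fits in:

```python
import numpy as np
import pymc3 as pm

X = np.linspace(0, 10, 200)[:, None]
y = np.sin(X).ravel() + 0.3 * np.random.randn(200)
Xu = np.linspace(0, 10, 20)[:, None]  # inducing point locations

with pm.Model():
    ls = pm.Gamma("ls", alpha=2, beta=1)
    cov = pm.gp.cov.ExpQuad(input_dim=1, ls=ls)
    gp = pm.gp.MarginalSparse(cov_func=cov, approx="FITC")
    sigma = pm.HalfCauchy("sigma", beta=2)
    # The noise keyword was originally named sigma and later renamed to noise
    # (see the v3.5 Fixes section above).
    gp.marginal_likelihood("y_obs", X=X, Xu=Xu, y=y, noise=sigma)
    trace = pm.sample(1000, tune=1000)
```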
Fixes
- Fixed sampler stats error in NUTS for non-RAM backends
- Matplotlib is no longer a hard dependency, making it easier to use in settings where installing Matplotlib is problematic. PyMC will only complain if plotting is attempted.
- Several bugs in the Gaussian process covariance were fixed.
- All chains are now used to calculate WAIC and LOO.
- AR(1) log-likelihood function has been fixed.
- Slice sampler fixed to sample from 1D conditionals.
- Several docstring fixes.
v3.1 Final
This is the first major update to PyMC 3 since its initial release. Highlights of this release include:
- Gaussian Process submodule
- Much improved variational inference support that includes:
- Stein Variational Gradient Descent
- Minibatch processing
- Additional optimizers, including ADAM
- Experimental operational variational inference (OPVI)
- Full-rank ADVI
- MvNormal now supports Cholesky decomposition for increased speed and numerical stability (sketch after this list).
- NUTS implementation now matches current Stan implementation.
- Higher-order integrators for HMC
- Elliptical slice sampler is now available
- Added `Approximation` class and the ability to convert a sampled trace into an approximation via its `Empirical` subclass.
- Add `MvGaussianRandomWalk` and `MvStudentTRandomWalk` distributions.
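A minimal sketch of the Cholesky-parametrized `MvNormal`; the covariance values are hypothetical:

```python
import numpy as np
import pymc3 as pm

mu = np.zeros(2)
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])
chol = np.linalg.cholesky(cov)

with pm.Model():
    # Passing chol avoids repeated decomposition of the covariance matrix.
    x = pm.MvNormal("x", mu=mu, chol=chol, shape=2)
    trace = pm.sample(1000)
```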
v3.0 Final
This is the first major release of PyMC3. A number of major changes since splitting from the PyMC2 project include:
- Added gradient-based MCMC samplers: Hamiltonian MC (`HMC`) and No-U-Turn Sampler (`NUTS`)
- Automatic gradient calculations using Theano
- Convenient generalized linear model specification using Patsy formulae
- Parallel sampling via `multiprocessing`
- New model specification using context managers (see the sketch after this list)
- New Automatic Differentiation Variational Inference (`ADVI`) allowing faster sampling than `HMC` for some problems.
- Mini-batch ADVI
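A minimal sketch of the context-manager model specification with NUTS; the toy data and priors are hypothetical:

```python
import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sd=10)
    pm.Normal("obs", mu=mu, sd=1, observed=[0.3, -0.1, 0.5])
    # Gradient-based No-U-Turn sampling via Theano's automatic differentiation.
    trace = pm.sample(1000, step=pm.NUTS())
```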
v3.0 Release Candidate 6
Sixth release candidate of PyMC3 3.0.