An Interface for Black Box Learning in Probabilistic Programs (PPS 2016)

Who

Jan-Willem van de Meent, Brooks Paige, David Tolpin, Frank Wood

Track

PPS 2016

Time Zone

The program is currently displayed in (GMT-05:00) Guadalajara, Mexico City, Monterrey.

Use conference time zone: (GMT-05:00) Guadalajara, Mexico City, MonterreySelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Sat 23 Jan 2016 09:30 - 09:50 at Room St Petersburg II - Session 1 Chair(s): Cameron Freer

Abstract

Structural operational semantics for probabilistic programming languages typically characterize each elementary step in an execution terms of

The resulting value and/or changes to the environment.
Effects on the probability of the execution.

In many probabilistic programming languages, the elementary steps in an execution can be separated into 3 categories

Deterministic steps, which have no effect on the probability of the execution.
Stochastic steps, which may have an effect the probability of the execution, but can typically also be executed naively to sample from the prior.
Conditioning steps, which have no effect on the program state other than altering its probability under the target density.

In a probabilistic program system in which the back end implements Monte Carlo (i.e. sampling-based) inference methods, steps of type 1 execute in the same manner regardless of the chosen inference method, whereas steps of the second and third type require a method-specific implementation. In this setting it makes sense to separate “inference” semantics, that is to say the operational semantics of the method-specific steps, from the “language” semantics, which is to say the operational semantics associated with the deterministic steps in the computation.

In this abstract describe a family of “black box” methods for parameter learning, which are related to both black box variational inference [1,2] and classic policy gradient methods such as REINFORCE [3] under an appropriate planning as inference implementation [4]. We abstract away from the details of the language by defining an interface between a deterministic computation P which performs all steps of type 1 and relegates to an inference back end B when encountering steps of type 2 and 3. This makes it possible to reason about the inference semantics for these algorithms, which are based on importance sampling semantics, from the operational semantics of the modeling language.

We refer to a longer paper [5] on Arxiv for a more detailed study of the use of this class of methods in the context of upper-level policy search problems, in which the probabilistic program defines a stochastic simulator for both the agent and the problem domain.

[1] Wingate, David, and Theo Weber. 2013. “Automated Variational Inference in Probabilistic Programming.” arXiv Preprint arXiv:1301.1299: 1–7. http://arxiv.org/abs/1301.1299.

[2] Ranganath, Rajesh, Sean Gerrish, and David M Blei. 2014. “Black Box Variational Inference.” In Artificial Intelligence and Statistics.

[3] Williams, Ronald J. 1992. “Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning.” Machine Learning 8 (3-4): 229–256. doi:10.1007/BF00992696.

[4] Toussaint, Marc, Stefan Harmeling, and Amos Storkey. 2006. “Probabilistic Inference for Solving (PO)MDPs.” Neural Computation 31 (December): 357–373. http://eprints.pascal-network.org/archive/00003924/.

[5] van de Meent, Jan-Willem, David Tolpin, Brooks Paige, and Frank Wood. 2015. “Black-Box Policy Search with Probabilistic Programs.” arXiv: 1507.04635. http://arxiv.org/abs/1507.04635.

Link to Preprint

http://pps2016.soic.indiana.edu/2015/12/21/an-interface-for-black-box-learning-in-probabilistic-programs/

Jan-Willem van de Meent

University of Oxford

Brooks Paige

University of Oxford

David Tolpin

University of Oxford

Frank Wood

University of Oxford

Time Zone

The program is currently displayed in (GMT-05:00) Guadalajara, Mexico City, Monterrey.

Use conference time zone: (GMT-05:00) Guadalajara, Mexico City, MonterreySelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

Display full programSpecify a time band

Save

Session Program

Sat 23 Jan
Displayed time zone: Guadalajara, Mexico City, Monterrey change

09:00 - 10:00	Session 1PPS at Room St Petersburg II Chair(s): Cameron Freer Gamalon

09:00 20m Talk		All You Need is the Monad... What Monad Was That Again? PPS Norman Ramsey Pre-print
09:20 10m Meeting		Discussion 1 PPS
09:30 20m Talk		An Interface for Black Box Learning in Probabilistic Programs PPS Jan-Willem van de Meent University of Oxford, Brooks Paige University of Oxford, David Tolpin University of Oxford, Frank Wood University of Oxford Pre-print
09:50 10m Meeting		Discussion 2 PPS