The three main themes of my research on simulation over generative model classes are as follows, with related literature:

A. interaction between reality and hyperreality (simulation) in Bayesian workflow

B. including the inference algorithm
– simulation-based calibration (SBC), which measures the consistency between prior, likelihood, and inference algorithm
– database of posterior samples (posteriordb) with different inference algorithms

C. consistency-based calibration, from B and for D
– prior preconditioning: boundary-avoiding prior, Bayesian cringe

D. utility- and efficiency-based analysis
– decision-theoretic approach for model interpretability in a Bayesian framework
– martingale posterior distributions

Theme A starts from Baudrillard’s “Simulacra and Simulation”. This writing focuses on B by constructing model network2 as a dual of model network1, together with three consistency-check procedures.

Introduction

  • model class := a set of models which share parameters and can be partially ordered, e.g. generalized additive time-series models, feature-selection regression, and causal-graph-based causal inference

Module graph and model network are two graphical representations of this abstract model class.

  • module is a discrete choice in model development, such as the type of prior, likelihood function, or inference algorithm. The cardinality of the model network support set is defined as the product of the number of candidates at each model signature (oval in the figure). Examples are the trend and seasonality components of a generalized additive time-series model. Factorizing the entire model into a set of modules without considering the structures within and between modules can leave the final model, and especially its predictive outcomes, inconsistent. However, this inconsistency is difficult to foresee analytically. Consistency should be checked in a “tried and true” way, which puts special emphasis on the consistency-based calibration procedure.
  • module graph is a graph with alternating nodes of model signature (oval) and implementation (rectangle), which expresses modularity and exchangeability. It can be created from probabilistic programming languages like Stan, supported by their metaprogramming features.
  • model network1 is a graph where each node corresponds to one generative model. Edges connect nodes that differ by one module.
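The definitions above can be sketched in code. This is a minimal illustration, assuming hypothetical module names (`trend`, `seasonality`, `likelihood`) that are not from the demo: the cardinality of the support set is the product of candidate counts per signature, and model network1 edges connect nodes differing in exactly one module.

```python
from itertools import product, combinations

# Hypothetical module candidates per model signature (illustrative names).
modules = {
    "trend": ["linear", "gp"],
    "seasonality": ["none", "fourier"],
    "likelihood": ["normal", "student_t"],
}

# Cardinality of the model network support set: product of the
# number of candidates at each signature (2 * 2 * 2 = 8 here).
cardinality = 1
for candidates in modules.values():
    cardinality *= len(candidates)

# Nodes of model network1: one per full module assignment.
signatures = list(modules)
nodes = [dict(zip(signatures, combo)) for combo in product(*modules.values())]

# Edges connect nodes that differ in exactly one module.
def one_module_apart(a, b):
    return sum(a[s] != b[s] for s in signatures) == 1

edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
         if one_module_apart(nodes[i], nodes[j])]

print(cardinality, len(nodes), len(edges))  # 8 nodes, 12 one-module edges
```

With three binary signatures, network1 is the 3-cube: 8 nodes, each with 3 neighbors.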

The figure below demonstrates the 1:1:1 mapping between a tree in the module graph (top left), a node in model network1 (bottom left), and a program file.

Module graph, model network1, generative model file

For further details on the theoretical background and its application, please refer to model topology (Sec. 7.4) and model exploration (Sec. 12.1) of this paper by Gelman et al. (2020). Also, for explicit mappings between model class, tree, and graph, refer to this blog post written by my collaborator Ryan Bernstein. The demo below can be accessed here.

The birthday problem, a Gaussian-process-based time-series example, is used to demonstrate automation on the network of models. A trajectory from #1 START 5232.03 to #8 GOAL 15384.86 in the model network (right figure) is an example of the navigation algorithm. The “best” model is searched online, with the action candidates at each node being its neighboring nodes. Increased understanding of the connections between models, and between parameters in neighboring models, is a byproduct. However, compared to the original case study, which aptly used both optimization and HMC, model network1 cannot represent the inference algorithm. Also, there are too many nodes to explore. These problems motivate model network2 and consistency-based calibration, which benefit the workflow in two aspects:

  1. efficiency: summarizes which models to explore
  2. validity: checks the consistency between results from candidate models

Model network2

model network2 is a line graph of model network1, in that we create network2’s nodes from network1’s edges. Recalling that an edge joins two nodes, the motivation is that for our final inferential result there are two generative models in action: one assumed to be the true data-generating process (what the modeler chooses) and the other used for approximation. The purpose of the approximation can be either efficiency or explainability. A concrete implementation of the two generative models that form a node of network2 is the `generator` and `backend` functions explained in the SBC package documentation.
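A minimal sketch of this line-graph construction, assuming illustrative model names (the eight-schools parameterizations are used only as labels): each network1 edge becomes a network2 node, interpreted as a (generator, backend) pair, and two network2 nodes are adjacent when the underlying edges share a model.

```python
from itertools import combinations

# Edges of model network1: pairs of generative models differing in one
# module. Labels are illustrative, not a real network from the text.
network1_edges = [("centered", "non_centered"),
                  ("non_centered", "non_centered_qr"),
                  ("centered", "centered_qr")]

# network2 node = network1 edge, read as a (generator, backend) pair:
# one model assumed true, the other used for approximation.
network2_nodes = network1_edges

# Line-graph rule: two network2 nodes are adjacent when the
# corresponding network1 edges share an endpoint model.
network2_edges = [(e1, e2) for e1, e2 in combinations(network2_nodes, 2)
                  if set(e1) & set(e2)]

print(len(network2_nodes), len(network2_edges))  # 3 nodes, 2 edges
```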

The biggest motivation is that modelers simply assume perfect inferential results once the model is perfect, which is not generally true, as can be seen from the eight schools model with centered parameterization. Here, even the HMC sampler fails to return samples that represent the posterior well (the funnel problem). Without a way to decrease the gap between the true and the computational model, discussion in pure model space is futile. SBC is one of the best existing tools for detecting this discrepancy.
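The idea behind SBC can be illustrated with a self-contained sketch. This is not the SBC package's API; it is a hand-rolled rank check on a conjugate normal model (prior $N(0,1)$, one observation $y \sim N(\theta, 1)$, exact posterior $N(y/2, 1/2)$), with an artificially underdispersed "backend" standing in for a broken sampler.

```python
import random

# SBC sketch: draw theta from the prior, simulate data, draw from the
# (possibly wrong) posterior approximation, and record the rank of
# theta among the posterior draws. Correct inference gives uniform
# ranks; an underdispersed backend piles ranks at the extremes.
random.seed(1)

def sbc_ranks(posterior_sd, n_sims=2000, n_draws=99):
    ranks = []
    for _ in range(n_sims):
        theta = random.gauss(0, 1)       # draw from prior
        y = random.gauss(theta, 1)       # simulate data (generator)
        post_mean = y / 2                # exact conjugate posterior mean
        draws = [random.gauss(post_mean, posterior_sd)
                 for _ in range(n_draws)]
        ranks.append(sum(d < theta for d in draws))
    return ranks

exact = sbc_ranks(posterior_sd=0.5 ** 0.5)   # correct posterior sd
too_narrow = sbc_ranks(posterior_sd=0.2)     # underdispersed backend

# Fraction of ranks at the extreme bins (0 or n_draws); roughly 2%
# under uniformity, much larger for the underdispersed backend.
frac_extreme = lambda r: sum(x in (0, 99) for x in r) / len(r)
print(frac_extreme(exact), frac_extreme(too_narrow))
```

The same logic, with the `generator` supplying `(theta, y)` pairs and the `backend` supplying posterior draws, is what the SBC package automates.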


Consistency-based calibration

Parameter sharing among a model class enhances the understanding of parameters (“Parameters within different models in the network can ‘talk with each other’ in the sense of having a shared, observable meaning outside the confines of the model itself”) but poses additional difficulties for calibration processes like SBC. Imagine a class of models with non-overlapping yet individually well-calibrated posterior distributions. This is a serious threat even if the focus is on the predictive distribution, as in stacking. Therefore, a well-calibrated prior on a model class requires an interpretation and definition different from that for a single model. The following examples of inconsistency point to the standards required for model topology and the basis of model network2.

  • a model is not well-calibrated when the prior is $N(0,1)$ but the data-averaged posterior is $N(10,1)$
  • a model network2 node is ineligible when the well-calibrated prior is $N(0,1)$ but the realistic prior is $N(10,1)$
  • a model network2 consisting of all five nodes from the table is ineligible
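The first bullet can be checked numerically. This is an illustrative simulation, again on the conjugate normal model with a hypothetical constant-bias backend: for a well-calibrated model, averaging the posterior over datasets simulated from the prior recovers the prior; a biased backend produces a data-averaged posterior shifted away from it, as in the $N(10,1)$ example.

```python
import random
random.seed(0)

# Draw theta ~ prior N(0,1), simulate y ~ N(theta, 1), take one draw
# from the (possibly biased) posterior N(y/2 + bias, 1/2), and average
# over simulations. bias=0 recovers the prior mean 0; bias=10 mimics
# the miscalibrated N(10, 1) data-averaged posterior from the bullet.
def data_averaged_mean(bias, n_sims=5000):
    total = 0.0
    for _ in range(n_sims):
        theta = random.gauss(0, 1)
        y = random.gauss(theta, 1)
        total += random.gauss(y / 2 + bias, 0.5 ** 0.5)
    return total / n_sims

print(data_averaged_mean(bias=0.0))    # near 0: well-calibrated
print(data_averaged_mean(bias=10.0))   # near 10: miscalibrated
```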

Starting from these inconsistent examples, subsequent writings will focus on C and D. For C, a model class basis is constructed that enhances efficiency and validity. For D, with the definition of an actionable model as the simplest model containing all counterfactuals that matter, a utility-based method-selection procedure is proposed.