Probabilistic Integral Circuits (PICs)¶
Probabilistic Integral Circuits (PICs) are a framework for scaling continuous latent variable models by representing them as integrals over tractable circuits. They allow for flexible neural functional sharing while maintaining tractability via quadrature-based materialization.
Reference¶
PICs are described in the NeurIPS 2024 paper "Scaling Continuous Latent Variable Models as Probabilistic Integral Circuits" (Gala et al., 2024).
Overview¶
A PIC is a symbolic representation of a continuous mixture model. It consists of:
- Input Units: Represent leaf functions \(f_u(X_u, Z_u)\).
- Sum Units: Represent convex mixtures.
- Product Units: Represent factorized distributions.
- Integral Units: Represent integration over continuous latent variables \(Z\).
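As a library-independent illustration of what these units compute together, a continuous mixture \(\int p(z)\,p(x \mid z)\,dz\) can be approximated by a finite weighted sum over quadrature points, which is exactly the shape of circuit a PIC materializes into. The sketch below uses plain Python and a midpoint rule; all names are illustrative, not SPFlow API:

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Continuous mixture p(x) = \int N(z; 0, 1) N(x; z, 1) dz, whose exact value
# is N(x; 0, sqrt(2)). A midpoint quadrature rule turns the integral into a
# finite, circuit-friendly weighted sum over (points, weights).
K, lo, hi = 400, -8.0, 8.0
h = (hi - lo) / K
points = [lo + (i + 0.5) * h for i in range(K)]
weights = [h] * K

def mixture_density(x):
    return sum(w * normal_pdf(z, 0.0, 1.0) * normal_pdf(x, z, 1.0)
               for z, w in zip(points, weights))

approx = mixture_density(0.5)
exact = normal_pdf(0.5, 0.0, math.sqrt(2.0))
```

With 400 midpoint nodes the quadrature approximation agrees with the closed-form marginal to well below 1e-3.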
Pipeline¶
The typical PIC workflow in SPFlow follows these steps:
1. RegionGraph to PIC: Convert a standard spflow.meta.region_graph.RegionGraph into a symbolic PIC using spflow.zoo.pic.rg2pic().
2. Functional Sharing: Attach neural networks (e.g., spflow.zoo.pic.SharedMLP) to Integral units so that parameters are shared across the circuit.
3. Materialization (PIC to QPC): Convert the symbolic PIC into a materialized Quadrature Probabilistic Circuit (QPC) using spflow.zoo.pic.pic2qpc(). The QPC is a standard SPFlow spflow.modules.Module that can be evaluated with log_likelihood().
Merge Strategies¶
When converting a RegionGraph to a PIC, SPFlow supports different merge strategies:
- AUTO: Uses Tucker merge if latent variables differ, otherwise CP merge (matching the paper's semantics).
- TUCKER: Always uses Tucker-style merging (materializes to spflow.modules.products.OuterProduct).
- CP: Always uses CP-style merging (materializes to spflow.modules.products.ElementwiseProduct).
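The difference between the two merges is easy to see on a single pair of child outputs: with K channels per child, a Tucker-style outer product yields K² output channels (every channel pairing), while a CP-style elementwise product keeps K (channel i pairs only with channel i). A minimal NumPy sketch of this, illustrative only and not the SPFlow implementation:

```python
import numpy as np

K = 4
left = np.random.rand(K)    # per-channel likelihoods of the left child
right = np.random.rand(K)   # per-channel likelihoods of the right child

# TUCKER merge -> OuterProduct: all K*K channel pairings.
tucker_out = np.outer(left, right).reshape(-1)   # shape (16,)

# CP merge -> ElementwiseProduct: channel i pairs only with channel i.
cp_out = left * right                            # shape (4,)
```

Tucker merging is more expressive but grows the channel count multiplicatively with depth; CP keeps the channel count constant.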
Functional Sharing¶
PICs allow parameters of the circuit to be computed by neural networks. This is implemented via:
- spflow.zoo.pic.SharedMLP: A simple MLP shared across all units in a group.
- spflow.zoo.pic.MultiHeadedMLP: An MLP with multiple output heads for different units.
- spflow.zoo.pic.FourierFeatures: Positional encoding for latent variables.
Tensorized QPCs¶
For high-performance evaluation, PICs can be materialized into a Tensorized QPC. This mode avoids creating thousands of small SPFlow modules and instead uses a single folded module that performs evaluation using efficient tensor operations.
spflow.zoo.pic.TensorizedQPC: A folded module for efficient PIC inference.
Minimal Example¶
import torch

from spflow.meta.region_graph import RegionGraph
from spflow.zoo.pic import rg2pic, pic2qpc, QuadratureRule, PICInput

# 1. Define a simple RegionGraph
rg = RegionGraph.from_nested_list([[0, 1]])

# 2. Define a leaf factory
class MyInput(PICInput):
    def __init__(self, scope, latent_scope):
        self.scope = scope
        self.latent_scope = latent_scope

    def materialize(self, quadrature_rule):
        # Return a standard SPFlow module (e.g. Gaussian leaves)
        from spflow.modules.leaves import Normal

        K = quadrature_rule.points.shape[0]
        return Normal(scope=self.scope, channels=K)

# 3. Build the symbolic PIC
pic = rg2pic(rg, leaf_factory=lambda x, z: MyInput(x, z))

# 4. Materialize to a QPC
q_rule = QuadratureRule(
    points=torch.linspace(-3, 3, 5),
    weights=torch.ones(5) / 5,
)
model = pic2qpc(pic, q_rule)

# 5. Evaluate
x = torch.randn(10, 2)
ll = model.log_likelihood(x)
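The uniform rule above is the simplest choice; any (points, weights) pair defines a valid rule. For instance, a Gauss–Legendre rule on [-3, 3] could be built with NumPy and passed to QuadratureRule in the same way. The node/weight math below is standard quadrature, not SPFlow-specific:

```python
import numpy as np

# Gauss-Legendre nodes and weights on the reference interval [-1, 1].
nodes, w = np.polynomial.legendre.leggauss(5)

# Affine map of the rule to [a, b] = [-3, 3].
a, b = -3.0, 3.0
points = 0.5 * (b - a) * nodes + 0.5 * (b + a)
weights = 0.5 * (b - a) * w

# A 5-point Gauss-Legendre rule integrates polynomials up to degree 9
# exactly, e.g. the integral of x^2 over [-3, 3] is 18.
approx = float(np.sum(weights * points**2))
```

These arrays could then be wrapped as torch tensors and passed to QuadratureRule(points=..., weights=...) in place of the uniform rule.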
API Reference¶
Pipeline¶
- spflow.zoo.pic.rg2pic(rg, *, merge_strategy=MergeStrategy.AUTO, leaf_factory, function_factory=None, sum_weights_factory=None, integral_group_factory=None)[source]¶
Algorithm 1: Convert RegionGraph to a symbolic PIC.
- Parameters:
  - rg (RegionGraph) – Input RegionGraph (currently expected to be binary).
  - merge_strategy (MergeStrategy) – Merge rule (AUTO matches the paper).
  - leaf_factory (Callable[[Scope, Scope], Module]) – Factory for leaf-region input units. Called with (x_scope, z_scope).
  - function_factory (Optional[Callable[[int, int], Module]]) – Factory for integral functions f_u, called as (z_dim, y_dim).
  - sum_weights_factory (Optional[Callable[[int], Tensor]]) – Optional factory for sum-unit (+) weights of length N (number of partitions); defaults to uniform.
  - integral_group_factory (Optional[Callable[[int, int, int], FunctionGroup]]) – Optional factory for depth-wise functional sharing groups. If provided, integrals created at the same RG depth with the same (z_dim, y_dim) can share a group.
- Return type:
Module
- Returns:
Root PIC module (symbolic).
- spflow.zoo.pic.pic2qpc(pic, quadrature_rule, *, mode='expanded', tensorized_config=None)[source]¶
Algorithm 3: Materialize a symbolic PIC into a QPC.
This function supports two modes:
- mode="expanded": exact materialization into a standard SPFlow module graph (OuterProduct / ElementwiseProduct / WeightedSum), matching Eqs. (3)–(4) in the paper.
- mode="tensorized": folded/tensorized materialization inspired by the authors' reference implementation (see reference-repos/ten-pics). Returns a single SPFlow Module that evaluates the folded tensorized circuit and parameterizes its inner layers via neural functional sharing (perm_dim/norm_dim plus chunked quadrature evaluation).
- Return type:
Module
Integral and Sum Units¶
- class spflow.zoo.pic.Integral(input_module, latent_scope, integrated_latent_scope, function, function_head_idx=None)[source]¶
Bases: Module

Integral module representing integration over a continuous latent variable.
- Computes:
g_u(X, Z_u) = ∫ f_u(Z_u, Y_u) · g_in(X, Y_u) dY_u
- Where:
X are variables from the input branch (descendants).
Z_u are conditioned latent variables for this unit.
Y_u are latent variables integrated out by this unit.
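Under quadrature materialization, this integral becomes a finite weighted sum over nodes y_k: g(x, z) ≈ Σ_k w_k · f(z, y_k) · g_in(x, y_k). A toy NumPy sketch of that computation (the functions here are illustrative stand-ins, not SPFlow API):

```python
import numpy as np

points = np.linspace(-3.0, 3.0, 7)   # quadrature nodes y_k
weights = np.full(7, 6.0 / 7)        # midpoint-style weights w_k on [-3, 3]

def f(z, y):
    """Toy positive weighting function f(z, y)."""
    return np.exp(-0.5 * (z - y) ** 2)

def g_in(x, y):
    """Toy child output g_in(x, y): a density in x conditioned on y."""
    return np.exp(-0.5 * (x - y) ** 2) / np.sqrt(2 * np.pi)

def g(x, z):
    # g(x, z) ~= sum_k w_k * f(z, y_k) * g_in(x, y_k)
    return float(np.sum(weights * f(z, points) * g_in(x, points)))

val = g(0.0, 0.0)
```

Each quadrature node contributes one positively weighted term, which is why the materialized unit becomes a WeightedSum over per-node products.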
- inputs¶
The child module whose output depends on Y_u.
- Type:
Module
- latent_scope¶
Conditioned latent variables Z_u for this unit.
- Type:
Scope
- integrated_latent_scope¶
Integrated latent variables Y_u for this unit.
- Type:
Scope
- function¶
The weighting function f_u(Z_u, Y_u).
- Type:
Callable | nn.Module
- __init__(input_module, latent_scope, integrated_latent_scope, function, function_head_idx=None)[source]¶
Initialize the Integral module.
- Parameters:
  - input_module (Module) – The child module.
  - latent_scope (Scope | int | Iterable[int] | None) – Scope of conditioned latent variables Z_u.
  - integrated_latent_scope (Scope | int | Iterable[int] | None) – Scope of integrated latent variables Y_u.
  - function (Callable[[Tensor, Tensor], Tensor] | Module | None) – Function f(Z, Y) parameterized by a neural network or similar. Should accept broadcastable tensors z and y and return positive weights. Convention: z.shape[-1] == |Z_u| and y.shape[-1] == |Y_u|.
  - function_head_idx (Optional[int]) – Optional head index when function is a multi-function group.
- log_likelihood(data, cache=None)[source]¶
Compute symbolic log-likelihood (not implemented for direct execution). Integral units are typically compiled to QPCs for inference.
- marginalize(marg_rvs, prune=True, cache=None)[source]¶
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
  - marg_rvs (list[int]) – Random variable indices to marginalize out.
  - prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
- sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]¶
Generate samples from the module’s probability distribution.
Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.
- Parameters:
  - num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
  - data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates a new tensor. Defaults to None.
  - is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Sampled values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
- class spflow.zoo.pic.PICSum(inputs, weights, latent_scope)[source]¶
Bases: Module

Symbolic PIC sum unit +([u_i, w_i]).
- log_likelihood(data, cache=None)[source]¶
Compute log likelihood P(data | module).
Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.
- Parameters:
  - data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Log-likelihood of shape (batch_size, out_features, out_channels).
- Return type:
Tensor
- Raises:
ValueError – If input data shape is incompatible with module scope.
- marginalize(marg_rvs, prune=True, cache=None)[source]¶
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
  - marg_rvs (list[int]) – Random variable indices to marginalize out.
  - prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
- sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]¶
Generate samples from the module’s probability distribution.
Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.
- Parameters:
  - num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
  - data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates a new tensor. Defaults to None.
  - is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Sampled values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
- class spflow.zoo.pic.PICProduct(left, right)[source]¶
Bases: Module

Symbolic PIC product unit ×([u1, u2]).
- log_likelihood(data, cache=None)[source]¶
Compute log likelihood P(data | module).
Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.
- Parameters:
  - data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Log-likelihood of shape (batch_size, out_features, out_channels).
- Return type:
Tensor
- Raises:
ValueError – If input data shape is incompatible with module scope.
- marginalize(marg_rvs, prune=True, cache=None)[source]¶
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
  - marg_rvs (list[int]) – Random variable indices to marginalize out.
  - prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
- sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]¶
Generate samples from the module’s probability distribution.
Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.
- Parameters:
  - num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
  - data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates a new tensor. Defaults to None.
  - is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Sampled values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
- class spflow.zoo.pic.WeightedSum(inputs, weights, num_repetitions=1)[source]
Bases: Module

Sum module with non-normalized weights for quadrature integration.
Unlike the standard Sum module which normalizes weights via softmax, WeightedSum preserves exact weight values. This is essential for Quadrature Probabilistic Circuits (QPCs) where weights represent integration weights from numerical quadrature.
- inputs
Input module(s) to the sum node.
- Type:
Module
- weights
Raw (non-normalized) weights tensor.
- Type:
Parameter
- __init__(inputs, weights, num_repetitions=1)[source]
Create a WeightedSum module with explicit weights.
- Parameters:
- Raises:
ValueError – If inputs are empty or weights have an invalid shape.
- log_likelihood(data, cache=None)[source]
Compute log likelihood P(data | module).
Uses logsumexp for numerical stability with the stored (non-normalized) weights.
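A minimal sketch of this computation in plain NumPy (not the module itself): with raw quadrature weights w_k and child log-likelihoods ℓ_k, the output is log Σ_k w_k·exp(ℓ_k), evaluated stably via the log-sum-exp trick and without normalizing w.

```python
import numpy as np

child_log = np.array([-1.0, -2.0, -3.0])   # child log-likelihoods l_k
w = np.array([0.2, 0.3, 0.5])              # raw quadrature weights (no softmax)

s = np.log(w) + child_log                  # log(w_k) + l_k
m = s.max()
log_p = m + np.log(np.exp(s - m).sum())    # stable logsumexp
```

Because the weights are used as-is, they may sum to any positive value, which is exactly what quadrature rules require (a softmax would silently renormalize them).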
- marginalize(marg_rvs, prune=True, cache=None)[source]
Marginalize out specified random variables.
- property feature_to_scope: ndarray
Mapping from output features to their respective scopes.
- Returns:
- 2D-array of scopes. Each row corresponds to an output feature,
each column to a repetition.
- Return type:
np.ndarray[Scope]
- property log_weights: Tensor
Returns the log weights (log of raw weights).
- Returns:
Log of weights, no softmax applied.
- Return type:
Tensor
- property weights: Tensor
Returns the raw (non-normalized) weights tensor.
- Returns:
Weights as stored, without normalization.
- Return type:
Tensor
Functional Sharing¶
- class spflow.zoo.pic.SharedMLP¶
Bases: Module

Shared MLP backbone for functional sharing.
Parameterizes the shared function f in functional sharing. Uses Fourier features followed by MLP layers with nonlinearity.
From paper: φ^(γ) : R^I → R^M := φ_L ∘ … ∘ φ_1 ∘ FF
- FourierFeatures input encoding.
- Sequential MLP layers.
Initialize SharedMLP.
- class spflow.zoo.pic.MultiHeadedMLP(shared_mlp, num_heads, output_activation=None)[source]¶
Bases: Module

Multi-headed MLP for C-sharing (composite sharing).
Shares a SharedMLP backbone across multiple functions, with separate output heads for each function. This enables efficient C-sharing where fi = hi ∘ f, sharing inner function f.
From paper (neural C-sharing): fi : R^M → R := softplus(h^(i) · φ^(γ) + b^(i))
SharedMLP backbone.
- heads¶
List of linear heads for each function.
- class spflow.zoo.pic.FourierFeatures(in_features, out_features, scale=1.0)[source]¶
Bases: Module

Fourier feature encoding layer for positional encoding.
Maps low-dimensional inputs to higher-dimensional features using random Fourier features, which helps MLPs learn high-frequency functions.
From paper Eq. 5: FF : R^I → R^M
- B¶
Random frequency matrix (not trained).
- scale¶
Frequency scaling factor.
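A standalone sketch of random Fourier features in this spirit (hypothetical, not the class's exact implementation): a fixed random matrix B projects the input, and cosines and sines of the projection form the encoding.

```python
import numpy as np

rng = np.random.default_rng(0)
I, M = 1, 8                     # input dim and (even) output feature dim
scale = 1.0                     # frequency scaling factor
B = rng.normal(0.0, scale, size=(M // 2, I))   # fixed, untrained frequencies

def fourier_features(z):
    """Map z in R^I to [cos(zB^T), sin(zB^T)] in R^M."""
    proj = z @ B.T                              # (..., M//2)
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

feats = fourier_features(np.array([[0.5]]))     # shape (1, 8)
```

The frequency matrix is sampled once and kept fixed; the scale controls how high-frequency the resulting basis is.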
- class spflow.zoo.pic.FunctionGroup(sharing_type='c', input_dim=1, hidden_dim=64, num_layers=2)[source]¶
Bases: Module

Container for grouping PIC units with functional sharing.
Groups integral/input units that share the same MLP for efficient materialization.
- sharing_type¶
Type of sharing (“f” for F-sharing, “c” for C-sharing).
- units¶
List of units in this group.
- mlp¶
Shared MLP for this group.
- __init__(sharing_type='c', input_dim=1, hidden_dim=64, num_layers=2)[source]¶
Initialize FunctionGroup.
- add_unit(unit)[source]¶
Add a unit to this group.
- Parameters:
unit – PIC unit (Integral or input unit).
- Return type:
int
- Returns:
Index of the unit in this group (for C-sharing head selection).
- evaluate_batched(z, y)[source]¶
Evaluate all functions in the group in a single shared-backbone pass.
This implements the C-sharing/F-sharing semantics from Sec. 3.3 of the paper:
- C-sharing: different heads over a shared backbone.
- F-sharing: a single head shared across units.
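The C-sharing case can be sketched outside the library with a toy backbone and heads (illustrative, not the FunctionGroup internals): one shared backbone pass produces features, and each unit's function is a separate softplus head over them, so all units are evaluated in a single batched call.

```python
import numpy as np

rng = np.random.default_rng(1)
M, num_heads = 16, 3
W = rng.normal(size=(M, 2))              # toy shared backbone: (z, y) -> R^M
heads = rng.normal(size=(num_heads, M))  # one linear head per unit (C-sharing)
bias = rng.normal(size=num_heads)

def softplus(x):
    return np.log1p(np.exp(x))

def evaluate_batched(z, y):
    phi = np.tanh(np.stack([z, y], axis=-1) @ W.T)   # one shared backbone pass
    return softplus(phi @ heads.T + bias)            # positive output per head

out = evaluate_batched(np.array([0.1]), np.array([0.2]))   # shape (1, 3)
```

F-sharing corresponds to the degenerate case num_heads == 1, with the same head reused by every unit in the group.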
- finalize()[source]¶
Finalize the group after all units are added.
Creates the multi-headed MLP for C-sharing.
- Return type:
Tensorized QPC¶
- class spflow.zoo.pic.TensorizedQPC(*, rg, quadrature_rule, config)[source]¶
Bases: Module

Folded tensorized QPC as an SPFlow Module.
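The core folding trick can be sketched with one batched log-space operation: F sum units, each mixing K child channels, are evaluated together instead of as F separate modules. Plain NumPy, illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
B, F, K = 8, 5, 4
child_log = rng.normal(size=(B, F, K))        # folded child log-likelihoods
log_w = np.log(np.full((F, K), 1.0 / K))      # per-unit quadrature log-weights

# One vectorized logsumexp over the channel axis computes all F weighted
# sums at once: out[b, f] = log sum_k w[f, k] * exp(child_log[b, f, k]).
s = child_log + log_w[None]                   # (B, F, K)
m = s.max(axis=-1, keepdims=True)
folded_out = (m + np.log(np.exp(s - m).sum(axis=-1, keepdims=True)))[..., 0]  # (B, F)
```

Folding trades module-graph overhead for a few large tensor operations, which is what makes this mode suitable for high-performance evaluation.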
- log_likelihood(data, cache=None)[source]¶
Compute log likelihood P(data | module).
Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.
- Parameters:
  - data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Log-likelihood of shape (batch_size, out_features, out_channels).
- Return type:
Tensor
- Raises:
ValueError – If input data shape is incompatible with module scope.
- marginalize(marg_rvs, prune=True, cache=None)[source]¶
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
  - marg_rvs (list[int]) – Random variable indices to marginalize out.
  - prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
- sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]¶
Generate samples from the module’s probability distribution.
Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.
- Parameters:
  - num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
  - data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates a new tensor. Defaults to None.
  - is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
  - cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Sampled values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
- class spflow.zoo.pic.TensorizedQPCConfig(leaf_type, num_categories=None, net_dim=64, bias=False, input_sharing='none', inner_sharing='none', ff_dim=None, sigma=1.0, learn_ff=False, n_chunks=1, num_classes=1, layer_cls='auto')[source]¶
Bases: object

Configuration for folded tensorized QPC materialization.