Probabilistic Integral Circuits (PICs)

Probabilistic Integral Circuits (PICs) are a framework for scaling continuous latent variable models by representing them as integrals over tractable circuits. They allow for flexible neural functional sharing while maintaining tractability via quadrature-based materialization.

Reference

PICs are described in a NeurIPS 2024 paper; the algorithm, equation, and section numbers cited below (Algorithms 1–3, Eqs. (3)–(5), Sec. 3.3) refer to that paper.

Overview

A PIC is a symbolic representation of a continuous mixture model. It consists of:

  • Input Units: Representing leaf functions \(f_u(X_u, Z_u)\).

  • Sum Units: Representing convex mixtures.

  • Product Units: Representing factorized distributions.

  • Integral Units: Representing integration over continuous latent variables \(Z\).
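The integral units are what make a PIC a *continuous* mixture: they represent \(p(X) = \int p(X \mid Z)\, p(Z)\, dZ\), which quadrature-based materialization turns into a finite weighted sum. As a plain-NumPy illustration (independent of the SPFlow API), here a one-dimensional Gaussian latent is integrated out numerically and compared against the known closed-form marginal:

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Continuous mixture: p(x) = ∫ p(x | z) p(z) dz with
# p(x | z) = N(x; z, 1) and p(z) = N(z; 0, 1).
# The exact marginal is N(x; 0, 2).

# Quadrature materialization: replace the integral by a finite
# weighted sum over K points z_k with weights w_k.
z_k, w_k = np.polynomial.legendre.leggauss(32)  # Gauss-Legendre on [-1, 1]
a, b = -8.0, 8.0                                # rescale nodes/weights to [a, b]
z_k = 0.5 * (b - a) * z_k + 0.5 * (b + a)
w_k = 0.5 * (b - a) * w_k

x = 0.7
p_quad = np.sum(w_k * normal_pdf(z_k, 0.0, 1.0) * normal_pdf(x, z_k, 1.0))
p_exact = normal_pdf(x, 0.0, 2.0)
assert abs(p_quad - p_exact) < 1e-6
```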

Pipeline

The typical PIC workflow in SPFlow follows these steps:

  1. RegionGraph to PIC: Convert a standard spflow.meta.region_graph.RegionGraph into a symbolic PIC using spflow.zoo.pic.rg2pic().

  2. Functional Sharing: Attach neural networks (e.g., spflow.zoo.pic.SharedMLP) to Integral units to allow parameters to be shared across the circuit.

  3. Materialization (PIC to QPC): Convert the symbolic PIC into a materialized Quadrature Product Circuit (QPC) using spflow.zoo.pic.pic2qpc(). The QPC is a standard SPFlow spflow.modules.Module that can be evaluated using log_likelihood().

Merge Strategies

When converting a RegionGraph to a PIC, SPFlow supports several merge strategies, selected via the MergeStrategy enum: AUTO (Tucker when the children's latent scopes differ, CP otherwise, matching the paper), TUCKER (always Tucker), and CP (always CP; requires equal latent scopes). See the API reference below.

Functional Sharing

PICs allow parameters of the circuit to be computed by neural networks. This is implemented via the SharedMLP backbone, MultiHeadedMLP output heads (C-sharing), FourierFeatures input encodings, and FunctionGroup containers documented in the API reference below.

Tensorized QPCs

For high-performance evaluation, PICs can be materialized into a Tensorized QPC. This mode avoids creating thousands of small SPFlow modules and instead uses a single folded module that performs evaluation using efficient tensor operations.
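As a standalone illustration of the folding idea (plain NumPy, not the actual TensorizedQPC internals), the same product-and-weighted-sum computation can be written either as nested loops over many small units or as a single tensor contraction, with identical results:

```python
import numpy as np

rng = np.random.default_rng(0)
B, K = 4, 8                  # batch size, quadrature points / channels
left = rng.random((B, K))    # left-child likelihoods per channel
right = rng.random((B, K))   # right-child likelihoods per channel
W = rng.random((K, K))       # dense (Tucker-style) mixing weights

# "Expanded" view: one small product + weighted sum per channel pair.
loop_out = np.empty(B)
for b in range(B):
    acc = 0.0
    for i in range(K):
        for j in range(K):
            acc += W[i, j] * left[b, i] * right[b, j]
    loop_out[b] = acc

# "Folded" view: the same computation as a single einsum.
fold_out = np.einsum('bi,bj,ij->b', left, right, W)
assert np.allclose(loop_out, fold_out)
```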

Minimal Example

import torch
from spflow.meta.region_graph import RegionGraph
from spflow.zoo.pic import rg2pic, pic2qpc, QuadratureRule, PICInput

# 1. Define a simple RegionGraph
rg = RegionGraph.from_nested_list([[0, 1]])

# 2. Define a leaf factory
class MyInput(PICInput):
    def __init__(self, scope, latent_scope):
        self.scope = scope
        self.latent_scope = latent_scope
    def materialize(self, quadrature_rule):
        # Return a standard SPFlow module (e.g. Gaussian leaves)
        from spflow.modules.leaves import Normal
        K = quadrature_rule.points.shape[0]
        return Normal(scope=self.scope, channels=K)

# 3. Build symbolic PIC
pic = rg2pic(rg, leaf_factory=lambda x, z: MyInput(x, z))

# 4. Materialize to QPC
q_rule = QuadratureRule(
    points=torch.linspace(-3, 3, 5),
    weights=torch.ones(5) / 5
)
model = pic2qpc(pic, q_rule)

# 5. Evaluate
x = torch.randn(10, 2)
ll = model.log_likelihood(x)

API Reference

Pipeline

spflow.zoo.pic.rg2pic(rg, *, merge_strategy=MergeStrategy.AUTO, leaf_factory, function_factory=None, sum_weights_factory=None, integral_group_factory=None)[source]

Algorithm 1: Convert RegionGraph to a symbolic PIC.

Parameters:
  • rg (RegionGraph) – Input RegionGraph (currently expected to be binary).

  • merge_strategy (MergeStrategy) – Merge rule (AUTO matches the paper).

  • leaf_factory (Callable[[Scope, Scope], Module]) – Factory for leaf-region input units. Called with (x_scope, z_scope).

  • function_factory (Optional[Callable[[int, int], Module]]) – Factory for integral functions f_u, called as (z_dim, y_dim).

  • sum_weights_factory (Optional[Callable[[int], Tensor]]) – Optional factory for sum-unit weights, called with the number of partitions N (defaults to uniform weights).

  • integral_group_factory (Optional[Callable[[int, int, int], FunctionGroup]]) – Optional factory for depth-wise functional sharing groups. If provided, integrals created at the same RG depth with the same (z_dim, y_dim) can share a group.

Return type:

Module

Returns:

Root PIC module (symbolic).

spflow.zoo.pic.pic2qpc(pic, quadrature_rule, *, mode='expanded', tensorized_config=None)[source]

Algorithm 3: Materialize a symbolic PIC into a QPC.

This function supports two modes:

  • mode="expanded": current exact materialization into a standard SPFlow module graph (OuterProduct / ElementwiseProduct / WeightedSum), matching Eqs. (3)–(4) in the paper.

  • mode="tensorized": folded/tensorized materialization inspired by the authors' reference implementation (see reference-repos/ten-pics). This returns a single SPFlow Module that evaluates the folded tensorized circuit and parameterizes its inner layers via neural functional sharing (perm_dim/norm_dim + chunked quadrature evaluation).

Return type:

Module

class spflow.zoo.pic.MergeStrategy(value)[source]

Bases: Enum

Merge strategy for RG → PIC (Algorithm 2).

  • AUTO: Tucker if Z_u1 != Z_u2 else CP (paper semantics)

  • TUCKER: always Tucker merge

  • CP: always CP merge (requires Z_u1 == Z_u2)

AUTO = 'auto'
CP = 'cp'
TUCKER = 'tucker'
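Conceptually (plain NumPy, not the SPFlow API), Tucker and CP merges differ in which channel pairings of the two children survive: Tucker keeps all K × K combinations, while CP keeps only the diagonal ones, which is why it requires the children to share the same latent scope:

```python
import numpy as np

K = 4
u1 = np.arange(1.0, K + 1)   # outputs of child unit 1, one per quadrature point
u2 = np.arange(2.0, K + 2)   # outputs of child unit 2

# Tucker merge: children carry *different* latent variables (Z_u1 != Z_u2),
# so the merged unit keeps all K*K combinations (an outer product).
tucker = np.outer(u1, u2)    # shape (K, K)

# CP merge: children share the *same* latent variable (Z_u1 == Z_u2),
# so only matching pairings are valid (an elementwise product).
cp = u1 * u2                 # shape (K,)

assert tucker.shape == (K, K)
assert np.allclose(cp, np.diag(tucker))
```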
class spflow.zoo.pic.QuadratureRule(points, weights)[source]

Bases: object

1D quadrature rule used for all PIC latents.

points: Tensor
weights: Tensor
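Any 1D points/weights pair can serve as the rule. The Minimal Example above uses uniform weights on a linspace grid; a Gauss-Legendre rule from NumPy (whose arrays could then be converted to torch tensors before constructing QuadratureRule) is often more accurate, since n nodes integrate polynomials up to degree 2n − 1 exactly:

```python
import numpy as np

# Gauss-Legendre nodes/weights on [-1, 1]; n points integrate
# polynomials of degree <= 2n - 1 exactly.
n = 5
points, weights = np.polynomial.legendre.leggauss(n)

# ∫_{-1}^{1} x^4 dx = 2/5, recovered exactly by the 5-point rule.
approx = np.sum(weights * points ** 4)
assert abs(approx - 2.0 / 5.0) < 1e-12
```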

Integral and Sum Units

class spflow.zoo.pic.Integral(input_module, latent_scope, integrated_latent_scope, function, function_head_idx=None)[source]

Bases: Module

Integral module representing a continuous latent variable integration.

Computes:

g_u(X, Z_u) = ∫ f_u(Z_u, Y_u) · g_in(X, Y_u) dY_u

Where:
  • X are variables from the input branch (descendants).

  • Z_u are conditioned latent variables for this unit.

  • Y_u are latent variables integrated out by this unit.

inputs

The child module whose output depends on Y_u.

Type:

Module

latent_scope

Conditioned latent variables Z_u for this unit.

Type:

Scope

integrated_latent_scope

Integrated latent variables Y_u for this unit.

Type:

Scope

function

The weighting function f_u(Z_u, Y_u).

Type:

Callable | nn.Module

__init__(input_module, latent_scope, integrated_latent_scope, function, function_head_idx=None)[source]

Initialize the Integral module.

Parameters:
  • input_module (Module) – The child module.

  • latent_scope (Scope | int | Iterable[int] | None) – Scope of conditioned latent variables Z_u.

  • integrated_latent_scope (Scope | int | Iterable[int] | None) – Scope of integrated latent variables Y_u.

  • function (Callable[[Tensor, Tensor], Tensor] | Module | None) – Function f(Z, Y) parameterized by neural network or similar. Should accept broadcastable tensors z and y and return positive weights. Convention: z.shape[-1] == |Z_u| and y.shape[-1] == |Y_u|.

  • function_head_idx (Optional[int]) – Optional head index when function is a multi-function group.
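The broadcast convention for function can be checked with a toy, hand-written f (a hypothetical stand-in for a neural weighting function; positivity comes from the exponential):

```python
import numpy as np

# Hypothetical weighting function f(z, y) = exp(-(z - y)^2), positive by
# construction, following the convention z.shape[-1] == |Z_u| and
# y.shape[-1] == |Y_u| (here both latent scopes are 1-dimensional).
def f(z, y):
    return np.exp(-np.sum((z - y) ** 2, axis=-1))

Kz, Ky = 3, 5
z = np.linspace(-1, 1, Kz).reshape(Kz, 1, 1)   # (Kz, 1, |Z_u|)
y = np.linspace(-2, 2, Ky).reshape(Ky, 1)      # (Ky, |Y_u|)

out = f(z, y)   # z and y broadcast against each other -> (Kz, Ky)
assert out.shape == (Kz, Ky)
assert (out > 0).all()
```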

log_likelihood(data, cache=None)[source]

Compute symbolic log-likelihood (Not Implemented for direct execution).

Integral nodes are typically compiled to QPCs for inference.

Return type:

Tensor

marginalize(marg_rvs, prune=True, cache=None)[source]

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:
  • marg_rvs (list[int]) – Random variable indices to marginalize out.

  • prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.

sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]

Generate samples from the module’s probability distribution.

Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.

Parameters:
  • num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.

  • data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.

  • is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Sampled values of shape (batch_size, num_features).

Return type:

Tensor

Raises:

ValueError – If sampling parameters are incompatible.

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature,

each column to a repetition.

Return type:

np.ndarray[Scope]

class spflow.zoo.pic.PICSum(inputs, weights, latent_scope)[source]

Bases: Module

Symbolic PIC sum unit +([u_i, w_i]).

log_likelihood(data, cache=None)[source]

Compute log likelihood P(data | module).

Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, out_features, out_channels).

Return type:

Tensor

Raises:

ValueError – If input data shape is incompatible with module scope.

marginalize(marg_rvs, prune=True, cache=None)[source]

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:
  • marg_rvs (list[int]) – Random variable indices to marginalize out.

  • prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.

sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]

Generate samples from the module’s probability distribution.

Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.

Parameters:
  • num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.

  • data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.

  • is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Sampled values of shape (batch_size, num_features).

Return type:

Tensor

Raises:

ValueError – If sampling parameters are incompatible.

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature,

each column to a repetition.

Return type:

np.ndarray[Scope]

class spflow.zoo.pic.PICProduct(left, right)[source]

Bases: Module

Symbolic PIC product unit ×([u1, u2]).

log_likelihood(data, cache=None)[source]

Compute log likelihood P(data | module).

Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, out_features, out_channels).

Return type:

Tensor

Raises:

ValueError – If input data shape is incompatible with module scope.

marginalize(marg_rvs, prune=True, cache=None)[source]

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:
  • marg_rvs (list[int]) – Random variable indices to marginalize out.

  • prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.

sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]

Generate samples from the module’s probability distribution.

Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.

Parameters:
  • num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.

  • data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.

  • is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Sampled values of shape (batch_size, num_features).

Return type:

Tensor

Raises:

ValueError – If sampling parameters are incompatible.

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature,

each column to a repetition.

Return type:

np.ndarray[Scope]

class spflow.zoo.pic.WeightedSum(inputs, weights, num_repetitions=1)[source]

Bases: Module

Sum module with non-normalized weights for quadrature integration.

Unlike the standard Sum module which normalizes weights via softmax, WeightedSum preserves exact weight values. This is essential for Quadrature Probabilistic Circuits (QPCs) where weights represent integration weights from numerical quadrature.
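A small NumPy sketch of this behavior (the logsumexp trick with raw log-weights; not the module's actual implementation): the weighted sum \(\sum_k w_k\, p_k(x)\) is evaluated in log-space without ever normalizing the weights.

```python
import numpy as np

# Quadrature weights are used as-is (no softmax normalization):
# sum_k w_k * p_k(x) is computed as logsumexp_k(log w_k + log p_k(x)).
w = np.array([0.4, 0.4, 0.4])   # e.g. uniform quadrature weights; sum != 1
p = np.array([0.1, 0.2, 0.3])   # child likelihoods
log_p = np.log(p)

def logsumexp(a):
    m = a.max()
    return m + np.log(np.exp(a - m).sum())

log_out = logsumexp(np.log(w) + log_p)
# Matches the direct (linear-space) weighted sum with raw weights.
assert np.isclose(np.exp(log_out), np.sum(w * p))
```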

inputs

Input module(s) to the sum node.

Type:

Module

weights

Raw (non-normalized) weights tensor.

Type:

Parameter

__init__(inputs, weights, num_repetitions=1)[source]

Create a WeightedSum module with explicit weights.

Parameters:
  • inputs (Module | list[Module]) – Single module or list of modules to weight.

  • weights (Tensor) – Weight tensor. Shape should be compatible with (features, in_channels, out_channels, repetitions).

  • num_repetitions (int) – Number of repetitions for structured representations.

Raises:

ValueError – If inputs empty or weights have invalid shape.

log_likelihood(data, cache=None)[source]

Compute log likelihood P(data | module).

Uses logsumexp for numerical stability with the stored (non-normalized) weights.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features).

  • cache (Cache | None) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, num_features, out_channels, repetitions).

Return type:

Tensor

marginalize(marg_rvs, prune=True, cache=None)[source]

Marginalize out specified random variables.

Parameters:
  • marg_rvs (list[int]) – List of random variables to marginalize.

  • prune (bool) – Whether to prune the module.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

WeightedSum | None

Returns:

Marginalized WeightedSum module or None.

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature,

each column to a repetition.

Return type:

np.ndarray[Scope]

property log_weights: Tensor

Returns the log weights (log of raw weights).

Returns:

Log of weights, no softmax applied.

Return type:

Tensor

property weights: Tensor

Returns the raw (non-normalized) weights tensor.

Returns:

Weights as stored, without normalization.

Return type:

Tensor

Functional Sharing

class spflow.zoo.pic.SharedMLP(input_dim, hidden_dim, num_layers=2, activation=SiLU(), fourier_scale=1.0)[source]

Bases: Module

Shared MLP backbone for functional sharing.

Parameterizes the shared function f in functional sharing. Uses Fourier features followed by MLP layers with nonlinearity.

From paper: φ^(γ) : R^I → R^M := φ_L ∘ … ∘ φ_1 ∘ FF

fourier

FourierFeatures input encoding.

layers

Sequential MLP layers.

__init__(input_dim, hidden_dim, num_layers=2, activation=SiLU(), fourier_scale=1.0)[source]

Initialize SharedMLP.

Parameters:
  • input_dim (int) – Dimension of input (e.g., latent variable dimension).

  • hidden_dim (int) – Dimension of hidden layers M.

  • num_layers (int) – Number of hidden layers L.

  • activation (Module) – Activation function ψ.

  • fourier_scale (float) – Scale for Fourier features.

forward(x)[source]

Forward pass through shared MLP.

Parameters:

x (Tensor) – Input tensor of shape (…, input_dim).

Return type:

Tensor

Returns:

Hidden representation of shape (…, hidden_dim).

class spflow.zoo.pic.MultiHeadedMLP(shared_mlp, num_heads, output_activation=None)[source]

Bases: Module

Multi-headed MLP for C-sharing (composite sharing).

Shares a SharedMLP backbone across multiple functions, with separate output heads for each function. This enables efficient C-sharing where f_i = h_i ∘ f, sharing the inner function f.

From paper (neural C-sharing): f_i : R^M → R := softplus(h^(i) · φ^(γ) + b^(i))
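A minimal NumPy sketch of this head structure, with a toy fixed linear-plus-tanh backbone standing in for SharedMLP (all parameter shapes here are illustrative): every head reads the same shared features, and softplus keeps the outputs positive.

```python
import numpy as np

def softplus(t):
    return np.log1p(np.exp(t))

rng = np.random.default_rng(0)
M, N = 8, 3                        # backbone width, number of heads

# Shared backbone phi (a fixed random linear map + tanh stands in for the MLP).
W_phi = rng.standard_normal((1, M))
def phi(x):
    return np.tanh(x @ W_phi)      # (..., M)

# Per-head parameters h^(i), b^(i): each head reads the shared features.
H = rng.standard_normal((N, M))
b = rng.standard_normal(N)

def f(x):
    # All heads at once: f_i(x) = softplus(h^(i) · phi(x) + b^(i))
    return softplus(phi(x) @ H.T + b)   # (..., N), strictly positive

out = f(rng.standard_normal((5, 1)))
assert out.shape == (5, N)
assert (out > 0).all()
```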

shared

SharedMLP backbone.

heads

List of linear heads for each function.

__init__(shared_mlp, num_heads, output_activation=None)[source]

Initialize MultiHeadedMLP.

Parameters:
  • shared_mlp (SharedMLP) – Shared MLP backbone.

  • num_heads (int) – Number of output heads N.

  • output_activation (Optional[Module]) – Activation for outputs (default: softplus for positivity).

forward(x, head_idx=None)[source]

Forward pass through multi-headed MLP.

Parameters:
  • x (Tensor) – Input tensor of shape (…, input_dim).

  • head_idx (Optional[int]) – Optional specific head index. If None, returns all heads.

Returns:

If head_idx is specified: output of shape (…, 1). Otherwise: output of shape (…, num_heads).

Return type:

Tensor

class spflow.zoo.pic.FourierFeatures(in_features, out_features, scale=1.0)[source]

Bases: Module

Fourier feature encoding layer for positional encoding.

Maps low-dimensional inputs to higher-dimensional features using random Fourier features, which helps MLPs learn high-frequency functions.

From paper Eq. 5: FF : R^I → R^M

B

Random frequency matrix (not trained).

scale

Frequency scaling factor.

__init__(in_features, out_features, scale=1.0)[source]

Initialize FourierFeatures layer.

Parameters:
  • in_features (int) – Input dimension I.

  • out_features (int) – Output dimension M (half of final output due to sin/cos).

  • scale (float) – Scaling factor for frequencies.

forward(x)[source]

Apply Fourier feature encoding.

Parameters:

x (Tensor) – Input tensor of shape (…, in_features).

Return type:

Tensor

Returns:

Tensor of shape (…, out_features * 2).
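A NumPy sketch of the assumed encoding (a fixed random frequency matrix with concatenated sin/cos outputs; the exact scaling used by FourierFeatures may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
I, M, scale = 2, 16, 1.0
# Random frequency matrix B, fixed (not trained), as in random Fourier features.
B = rng.standard_normal((I, M)) * scale

def fourier_features(x):
    proj = 2.0 * np.pi * (x @ B)
    # Concatenate sin and cos: R^I -> R^{2M}.
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)

x = rng.standard_normal((7, I))
out = fourier_features(x)
assert out.shape == (7, 2 * M)
# sin^2 + cos^2 = 1 holds per frequency, a quick sanity check.
assert np.allclose(out[..., :M] ** 2 + out[..., M:] ** 2, 1.0)
```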

class spflow.zoo.pic.FunctionGroup(sharing_type='c', input_dim=1, hidden_dim=64, num_layers=2)[source]

Bases: Module

Container for grouping PIC units with functional sharing.

Groups integral/input units that share the same MLP for efficient materialization.

sharing_type

Type of sharing (“f” for F-sharing, “c” for C-sharing).

units

List of units in this group.

mlp

Shared MLP for this group.

__init__(sharing_type='c', input_dim=1, hidden_dim=64, num_layers=2)[source]

Initialize FunctionGroup.

Parameters:
  • sharing_type (str) – “f” for F-sharing (all same), “c” for C-sharing (multi-headed).

  • input_dim (int) – Input dimension for MLP.

  • hidden_dim (int) – Hidden dimension for MLP.

  • num_layers (int) – Number of layers in MLP.

add_unit(unit)[source]

Add a unit to this group.

Parameters:

unit – PIC unit (Integral or input unit).

Return type:

int

Returns:

Index of the unit in this group (for C-sharing head selection).

evaluate_batched(z, y)[source]

Evaluate all functions in the group in a single shared-backbone pass.

This implements the C-sharing/F-sharing semantics from Sec. 3.3 of the paper:

  • C-sharing: different heads over a shared backbone

  • F-sharing: a single head shared across units

Parameters:
  • z (Tensor) – Tensor with last dimension matching the z-input dimensionality.

  • y (Tensor) – Tensor with last dimension matching the y-input dimensionality. z and y must be broadcastable to the same leading shape.

Returns:

If C-sharing: tensor of shape (num_units, *leading_shape). If F-sharing: tensor of shape (1, *leading_shape).

Return type:

Tensor

finalize()[source]

Finalize the group after all units are added.

Creates the multi-headed MLP for C-sharing.

Return type:

None

get_function(unit_idx=0)[source]

Get a callable function for a specific unit/head.

The returned callable preserves broadcast shapes: for broadcastable z and y, it returns a tensor with the broadcasted leading shape.

Parameters:

unit_idx (int) – Index of the unit in this group (only used for C-sharing).

Return type:

Callable[[Tensor, Tensor], Tensor]

Returns:

Callable mapping (z, y) to a positive tensor.

Tensorized QPC

class spflow.zoo.pic.TensorizedQPC(*, rg, quadrature_rule, config)[source]

Bases: Module

Folded tensorized QPC as a SPFlow Module.

classmethod from_region_graph(rg, *, quadrature_rule, config)[source]
Return type:

TensorizedQPC

log_likelihood(data, cache=None)[source]

Compute log likelihood P(data | module).

Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, out_features, out_channels).

Return type:

Tensor

Raises:

ValueError – If input data shape is incompatible with module scope.

marginalize(marg_rvs, prune=True, cache=None)[source]

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:
  • marg_rvs (list[int]) – Random variable indices to marginalize out.

  • prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.

sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]

Generate samples from the module’s probability distribution.

Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.

Parameters:
  • num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.

  • data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.

  • is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Sampled values of shape (batch_size, num_features).

Return type:

Tensor

Raises:

ValueError – If sampling parameters are incompatible.

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature,

each column to a repetition.

Return type:

np.ndarray[Scope]

class spflow.zoo.pic.TensorizedQPCConfig(leaf_type, num_categories=None, net_dim=64, bias=False, input_sharing='none', inner_sharing='none', ff_dim=None, sigma=1.0, learn_ff=False, n_chunks=1, num_classes=1, layer_cls='auto')[source]

Bases: object

Configuration for folded tensorized QPC materialization.

bias: bool = False
ff_dim: int | None = None
inner_sharing: Literal['none', 'f', 'c'] = 'none'
input_sharing: Literal['none', 'f', 'c'] = 'none'
layer_cls: Literal['auto', 'tucker', 'cp'] = 'auto'
leaf_type: Literal['normal', 'bernoulli', 'categorical']
learn_ff: bool = False
n_chunks: int = 1
net_dim: int = 64
num_categories: int | None = None
num_classes: int = 1
sigma: float = 1.0
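A hypothetical configuration sketch (field values are illustrative, not recommendations):

```python
from spflow.zoo.pic import TensorizedQPCConfig

config = TensorizedQPCConfig(
    leaf_type='bernoulli',
    net_dim=128,          # hidden width of the shared MLPs
    input_sharing='c',    # C-sharing for input-layer parameters
    inner_sharing='c',    # C-sharing for inner-layer parameters
    n_chunks=4,           # chunked quadrature evaluation
    layer_cls='auto',     # Tucker/CP chosen per merge, as with MergeStrategy.AUTO
)
```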