Random and Tensorized Sum-Product Networks (RAT-SPN)

Random and Tensorized Sum-Product Networks (RAT-SPNs) provide a principled approach to building deep probabilistic models through randomized circuit construction. They pair tractable exact inference with the expressiveness and efficiency of tensorized operations.

Reference

RAT-SPNs are described in the UAI 2019 paper “Random Sum-Product Networks: A Simple and Effective Approach to Probabilistic Deep Learning” by Peharz et al.

Overview

RAT-SPNs consist of alternating sum (region) and product (partition) layers that recursively partition the input space. The random construction prevents overfitting while maintaining tractable exact inference.

Key features:

  • Randomized structure: Region and partition layers are constructed using random permutations and splits.

  • Tensorized evaluation: Operations are mapped to efficient tensor contractions.

  • Scalable training: Supports training via expectation-maximization (EM) or gradient descent.
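The randomized structure can be illustrated with a minimal sketch (a hypothetical helper, not the SPFlow internals): each repetition shuffles the variable indices and recursively splits them into balanced sub-regions, yielding a random region graph.

```python
import random

def random_region_graph(rvs, depth, rng):
    """Recursively split a list of random variables into two partitions.

    Returns a nested structure: a leaf region is a list of RVs, a partition
    is a pair of sub-regions. Toy sketch, not the SPFlow implementation.
    """
    if depth == 0 or len(rvs) <= 1:
        return rvs  # leaf region
    shuffled = rvs[:]
    rng.shuffle(shuffled)  # randomization: each call permutes the scope
    mid = len(shuffled) // 2
    return (random_region_graph(shuffled[:mid], depth - 1, rng),
            random_region_graph(shuffled[mid:], depth - 1, rng))

rng = random.Random(0)
graph = random_region_graph(list(range(8)), depth=2, rng=rng)
```

Running several repetitions of this construction with different seeds produces the parallel circuit instances that a RAT-SPN mixes at its root.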

Implementation

The RAT-SPN implementation in SPFlow provides a high-level spflow.zoo.rat.RatSPN module that automates the construction of the circuit based on architectural hyperparameters.

class spflow.zoo.rat.RatSPN(leaf_modules, n_root_nodes, n_region_nodes, num_repetitions, depth, outer_product=False, split_mode=None, num_splits=2)[source]

Bases: Module, Classifier

Random and Tensorized Sum-Product Network (RAT-SPN).

Scalable deep probabilistic model with randomized circuit construction. Consists of alternating sum (region) and product (partition) layers that recursively partition input space. Random construction prevents overfitting while maintaining tractable exact inference.

leaf_modules

Leaf distribution modules.

Type:

list[LeafModule]

n_root_nodes

Number of root sum nodes.

Type:

int

n_region_nodes

Number of sum nodes per region.

Type:

int

depth

Number of partition/region layers.

Type:

int

num_repetitions

Number of parallel circuit instances.

Type:

int

scope

Combined scope of all leaf modules.

Type:

Scope

Reference:

Peharz, R., et al. (2019). “Random Sum-Product Networks: A Simple and Effective Approach to Probabilistic Deep Learning.” UAI 2019.

__init__(leaf_modules, n_root_nodes, n_region_nodes, num_repetitions, depth, outer_product=False, split_mode=None, num_splits=2)[source]

Initialize RAT-SPN with specified architecture parameters.

Creates a Random and Tensorized SPN by recursively constructing layers of sum and product nodes. Circuit structure is fixed after initialization.

Parameters:
  • leaf_modules (list[LeafModule]) – Leaf distributions forming the base layer.

  • n_root_nodes (int) – Number of root sum nodes in final mixture.

  • n_region_nodes (int) – Number of sum nodes in each region layer.

  • num_repetitions (int) – Number of parallel circuit instances.

  • depth (int) – Number of partition/region layers.

  • outer_product (bool | None, optional) – Use outer product instead of elementwise product for partitions. Defaults to False.

  • split_mode (SplitMode | None, optional) – Split configuration. Use SplitMode.consecutive() or SplitMode.interleaved(). Defaults to SplitMode.consecutive(num_splits) if not specified.

  • num_splits (int | None, optional) – Number of splits in each partition. Must be at least 2. Defaults to 2.

Raises:

ValueError – If architectural parameters are invalid.
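The difference between consecutive and interleaved splits can be sketched in plain Python (these helpers are illustrative only and are not the SPFlow SplitMode API):

```python
def consecutive_split(indices, num_splits):
    # Contiguous blocks: [0, 1, 2, 3] -> [[0, 1], [2, 3]] for num_splits=2.
    k = len(indices) // num_splits
    return [indices[i * k:(i + 1) * k] for i in range(num_splits)]

def interleaved_split(indices, num_splits):
    # Round-robin assignment: [0, 1, 2, 3] -> [[0, 2], [1, 3]] for num_splits=2.
    return [indices[i::num_splits] for i in range(num_splits)]

blocks = consecutive_split([0, 1, 2, 3], 2)       # [[0, 1], [2, 3]]
strides = interleaved_split([0, 1, 2, 3], 2)      # [[0, 2], [1, 3]]
```

Consecutive splits keep neighboring features together in a partition, while interleaved splits spread them across partitions.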

create_spn()[source]

Create the RAT-SPN architecture.

Builds the RAT-SPN circuit bottom-up from the provided architectural parameters: alternating layers of sum and product nodes are constructed recursively from the leaves to the root, and the final structure is determined by the depth and branching parameters.

log_likelihood(data, cache=None)[source]

Compute log likelihood for RAT-SPN.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache for intermediate results.

Return type:

Tensor

Returns:

Log-likelihood values.
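For intuition, the bottom-up log-likelihood pass can be written out for a minimal two-variable SPN: one sum node over two product nodes with Gaussian leaves. This is a toy sketch of the computation, not the SPFlow code path:

```python
import math

def gauss_logpdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(v - m) for v in xs))

def spn_log_likelihood(x1, x2):
    # Product layer: leaves are independent within a component, so
    # log-densities add.
    comp1 = gauss_logpdf(x1, 0.0, 1.0) + gauss_logpdf(x2, 0.0, 1.0)
    comp2 = gauss_logpdf(x1, 3.0, 1.0) + gauss_logpdf(x2, 3.0, 1.0)
    # Sum layer: weighted mixture, computed stably in log-space.
    w = [0.5, 0.5]
    return logsumexp([math.log(w[0]) + comp1, math.log(w[1]) + comp2])

ll = spn_log_likelihood(0.1, -0.2)
```

A RAT-SPN evaluates exactly this kind of recursion, but batched over many region/partition layers as tensor contractions.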

log_posterior(data, cache=None)[source]

Compute log-posterior probabilities for multi-class models.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache for intermediate results.

Return type:

Tensor

Returns:

Log-posterior probabilities.

Raises:

UnsupportedOperationError – If model has only one root node (single class).
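With C root nodes, one per class, the log-posterior amounts to normalizing the per-root joint log-likelihoods across roots. A minimal NumPy sketch of that normalization (assuming class priors are already folded into the roots; not the SPFlow implementation):

```python
import numpy as np

def log_posterior_from_roots(root_log_likelihoods):
    """Normalize per-class joint log-likelihoods into log-posteriors.

    root_log_likelihoods: array of shape (batch, n_root_nodes), where
    entry [i, c] is log p(x_i, c).
    """
    # log p(c | x) = log p(x, c) - logsumexp_c' log p(x, c')
    log_z = np.logaddexp.reduce(root_log_likelihoods, axis=1, keepdims=True)
    return root_log_likelihoods - log_z

lls = np.array([[-3.2, -1.1, -4.0]])
log_post = log_posterior_from_roots(lls)
```

Exponentiating the result recovers posteriors that sum to one over the classes, which is why a single-root model (one class) cannot support this operation.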

marginalize(marg_rvs, prune=True, cache=None)[source]

Marginalize out specified random variables.

Parameters:
  • marg_rvs (list[int]) – List of random variables to marginalize.

  • prune (bool) – Whether to prune the module.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

Module | None

Returns:

Marginalized module or None.
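Marginalization in a sum-product circuit replaces the leaves of the marginalized variables with the constant 1, after which a single bottom-up pass yields the marginal. A toy discrete example (illustrative only, not the SPFlow module transformation):

```python
# Toy SPN over two binary variables: mixture of two factorized components.
w = [0.4, 0.6]
p_x1 = [0.9, 0.2]   # P(X1=1) per component
p_x2 = [0.1, 0.7]   # P(X2=1) per component

def joint(x1, x2):
    return sum(w[c] * (p_x1[c] if x1 else 1 - p_x1[c])
                    * (p_x2[c] if x2 else 1 - p_x2[c]) for c in range(2))

def marginal_x1(x1):
    # Marginalizing X2 sets its leaf to 1, so the factor simply drops out.
    return sum(w[c] * (p_x1[c] if x1 else 1 - p_x1[c]) for c in range(2))

# The circuit marginal matches the explicit sum over X2.
assert abs(marginal_x1(1) - (joint(1, 0) + joint(1, 1))) < 1e-12
```

With `prune=True`, units that become constant after marginalization can additionally be removed from the circuit.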

predict_proba(data)[source]

Compute class probabilities for input data using the RAT-SPN.

Parameters:

data (Tensor) – Input data tensor.

Returns:

Predicted class probabilities.

sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]

Generate samples from the RAT-SPN.

Parameters:
  • num_samples (int | None) – Number of samples to generate.

  • data (Tensor | None) – Data tensor with NaN values to fill with samples.

  • is_mpe (bool) – Whether to compute the most probable explanation (MPE) instead of drawing random samples.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

Tensor

Returns:

Sampled values.
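Sampling from a sum-product circuit proceeds top-down by ancestral sampling: at each sum node a child is chosen with probability proportional to its weight (or the argmax child for MPE), and product nodes descend into all children. A toy sketch for a two-component Gaussian mixture (illustrative only, not the SPFlow sampling routine):

```python
import random

def sample_spn(rng, is_mpe=False):
    w = [0.3, 0.7]
    mus = [(0.0, 0.0), (3.0, 3.0)]
    # Sum node: pick the argmax-weight child for MPE, else sample a child.
    c = max(range(2), key=lambda i: w[i]) if is_mpe else rng.choices([0, 1], weights=w)[0]
    # Product node: sample each leaf independently (MPE takes the mode = mean).
    if is_mpe:
        return mus[c]
    return tuple(rng.gauss(mu, 1.0) for mu in mus[c])

rng = random.Random(0)
samples = [sample_spn(rng) for _ in range(5)]
mpe = sample_spn(rng, is_mpe=True)  # (3.0, 3.0)
```

Conditional sampling with a partially observed `data` tensor follows the same scheme, except that observed leaves constrain the top-down choices and only NaN entries are filled in.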

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D array of scopes, where each row corresponds to an output feature and each column to a repetition.

Return type:

np.ndarray[Scope]

property n_out: int
property scopes_out: list[Scope]