RAT-SPN Architecture

Random and Tensorized Sum-Product Networks (RAT-SPN): a probabilistic circuit architecture whose structure is determined entirely by a small set of hyperparameters.

RatSPN

The RatSPN class fully automates circuit construction: given hyperparameters for depth, factorization, and mixing, it builds the complete probabilistic circuit.

class spflow.modules.rat.RatSPN(leaf_modules, n_root_nodes, n_region_nodes, num_repetitions, depth, outer_product=False, split_mode=None, num_splits=2)[source]

Bases: Module, Classifier

Random and Tensorized Sum-Product Network (RAT-SPN).

Scalable deep probabilistic model with randomized circuit construction. Consists of alternating sum (region) and product (partition) layers that recursively partition the input space. The randomized construction acts as a structural regularizer against overfitting while preserving tractable exact inference.

leaf_modules

Leaf distribution modules.

Type:

list[LeafModule]

n_root_nodes

Number of root sum nodes.

Type:

int

n_region_nodes

Number of sum nodes per region.

Type:

int

depth

Number of partition/region layers.

Type:

int

num_repetitions

Number of parallel circuit instances.

Type:

int

scope

Combined scope of all leaf modules.

Type:

Scope

Reference:

Peharz, R., et al. (2020). “Random Sum-Product Networks: A Simple and Effective Approach to Probabilistic Deep Learning.” UAI 2019 (PMLR 115).

__init__(leaf_modules, n_root_nodes, n_region_nodes, num_repetitions, depth, outer_product=False, split_mode=None, num_splits=2)[source]

Initialize RAT-SPN with specified architecture parameters.

Creates a Random and Tensorized SPN by recursively constructing layers of sum and product nodes. Circuit structure is fixed after initialization.

Parameters:
  • leaf_modules (list[LeafModule]) – Leaf distributions forming the base layer.

  • n_root_nodes (int) – Number of root sum nodes in final mixture.

  • n_region_nodes (int) – Number of sum nodes in each region layer.

  • num_repetitions (int) – Number of parallel circuit instances.

  • depth (int) – Number of partition/region layers.

  • outer_product (bool | None, optional) – Use outer product instead of elementwise product for partitions. Defaults to False.

  • split_mode (SplitMode | None, optional) – Split configuration. Use SplitMode.consecutive() or SplitMode.interleaved(). Defaults to SplitMode.consecutive(num_splits) if not specified.

  • num_splits (int | None, optional) – Number of splits in each partition. Must be at least 2. Defaults to 2.

Raises:

ValueError – If architectural parameters are invalid.
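The two split modes can be illustrated with a small pure-Python sketch (the helper names below are illustrative, not part of the spflow API): consecutive splitting assigns contiguous blocks of variables to each child partition, while interleaved splitting deals the variables out round-robin.

```python
def split_consecutive(scope, num_splits):
    """Split a scope into num_splits contiguous blocks."""
    size = -(-len(scope) // num_splits)  # ceiling division
    return [scope[i * size:(i + 1) * size] for i in range(num_splits)]

def split_interleaved(scope, num_splits):
    """Deal scope variables round-robin across num_splits blocks."""
    return [scope[i::num_splits] for i in range(num_splits)]

scope = [0, 1, 2, 3, 4, 5]
print(split_consecutive(scope, 2))  # [[0, 1, 2], [3, 4, 5]]
print(split_interleaved(scope, 2))  # [[0, 2, 4], [1, 3, 5]]
```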

create_spn()[source]

Create the RAT-SPN architecture.

Builds the RAT-SPN circuit bottom-up from the provided architectural parameters. The architecture is constructed recursively from leaves to root using alternating layers of sum and product nodes; its final shape depends on the depth and branching parameters.
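The recursion can be sketched as a random region-graph construction (a simplified stand-in for the internal builder, not the actual spflow code): each call randomly permutes the current scope and splits it into partitions until the requested depth is reached; per `num_repetitions`, this is repeated with independent random permutations.

```python
import random

def build_region_graph(scope, depth, num_splits=2, rng=None):
    """Recursively split a scope into a tree of regions and partitions."""
    rng = rng or random.Random(0)
    if depth == 0 or len(scope) < num_splits:
        return {"region": scope}  # leaf region: modeled by leaf distributions
    shuffled = list(scope)
    rng.shuffle(shuffled)  # randomized factorization
    size = -(-len(shuffled) // num_splits)
    parts = [shuffled[i * size:(i + 1) * size] for i in range(num_splits)]
    return {"region": scope,
            "partition": [build_region_graph(p, depth - 1, num_splits, rng)
                          for p in parts]}

graph = build_region_graph(list(range(8)), depth=2)
```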

expectation_maximization(data, cache=None)[source]

Perform expectation-maximization step.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

None

log_likelihood(data, cache=None)[source]

Compute log likelihood for RAT-SPN.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache dictionary for caching intermediate results.

Return type:

Tensor

Returns:

Log-likelihood values.

log_posterior(data, cache=None)[source]

Compute log-posterior probabilities for multi-class models.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache dictionary for caching intermediate results.

Return type:

Tensor

Returns:

Log-posterior probabilities.

Raises:

UnsupportedOperationError – If model has only one root node (single class).
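With C root nodes (one per class), the log-posterior follows from the per-class root log-likelihoods by Bayes' rule in log space. A minimal NumPy sketch, assuming uniform class priors (spflow's actual prior convention may differ):

```python
import numpy as np

def log_posterior(root_ll):
    """root_ll: (batch, n_classes) log-likelihoods log p(x | c).
    With uniform priors, log p(c | x) = root_ll - logsumexp(root_ll, axis=1)."""
    m = root_ll.max(axis=1, keepdims=True)  # stabilize the log-sum-exp
    lse = m + np.log(np.exp(root_ll - m).sum(axis=1, keepdims=True))
    return root_ll - lse

ll = np.log(np.array([[0.2, 0.8]]))
print(np.exp(log_posterior(ll)))  # [[0.2 0.8]]
```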

marginalize(marg_rvs, prune=True, cache=None)[source]

Marginalize out specified random variables.

Parameters:
  • marg_rvs (list[int]) – List of random variables to marginalize.

  • prune (bool) – Whether to prune the module.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

Module | None

Returns:

Marginalized module or None.
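In a smooth, decomposable circuit, marginalizing a variable amounts to replacing its leaf by the constant 1 (log-likelihood 0). A toy sketch for a product node over independent leaves (illustrative only, not the spflow implementation):

```python
import math

def product_ll(leaf_lls, marg_rvs):
    """Log-likelihood of a product node over independent leaves;
    marginalized leaves contribute log 1 = 0."""
    return sum(0.0 if rv in marg_rvs else ll
               for rv, ll in leaf_lls.items())

leaf_lls = {0: math.log(0.5), 1: math.log(0.4)}
print(math.exp(product_ll(leaf_lls, marg_rvs={1})))  # 0.5
```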

maximum_likelihood_estimation(data, weights=None, cache=None)[source]

Update parameters via maximum likelihood estimation.

Parameters:
  • data (Tensor) – Input data tensor.

  • weights (Tensor | None) – Optional sample weights.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

None

predict_proba(data)[source]

Compute class probabilities for input data using the RAT-SPN.

Parameters:

data (Tensor) – Input data tensor.

Returns:

Predicted class probabilities.

sample(num_samples=None, data=None, is_mpe=False, cache=None, sampling_ctx=None)[source]

Generate samples from the RAT-SPN.

Parameters:
  • num_samples (int | None) – Number of samples to generate.

  • data (Tensor | None) – Data tensor with NaN values to fill with samples.

  • is_mpe (bool) – Whether to perform most probable explanation (MPE) sampling.

  • cache (Cache | None) – Optional cache dictionary.

  • sampling_ctx (SamplingContext | None) – Optional sampling context.

Return type:

Tensor

Returns:

Sampled values.
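Ancestral sampling at a sum node first draws a child index from the mixture weights (or takes the argmax under MPE), then recurses into that child. A minimal sketch of this dispatch (hypothetical helper names, not the spflow internals):

```python
import random

def sample_sum_node(weights, child_samplers, is_mpe=False, rng=None):
    """Pick a mixture component, then sample from it."""
    rng = rng or random.Random()
    if is_mpe:
        idx = max(range(len(weights)), key=weights.__getitem__)  # argmax weight
    else:
        idx = rng.choices(range(len(weights)), weights=weights)[0]
    return child_samplers[idx]()

samplers = [lambda: "component-0", lambda: "component-1"]
print(sample_sum_node([0.1, 0.9], samplers, is_mpe=True))  # component-1
```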

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D array of scopes; each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]

property n_out: int
property scopes_out: list[Scope]

RepetitionMixingLayer

A specialized sum layer used in RAT-SPNs to mix over the parallel repetitions.

class spflow.modules.sums.RepetitionMixingLayer(inputs, out_channels=None, num_repetitions=1, weights=None)[source]

Bases: Sum

Mixing layer for RAT-SPN region nodes.

Specialized sum node for RAT-SPNs that creates mixtures over input channels, extending Sum with RAT-SPN-specific optimizations.

__init__(inputs, out_channels=None, num_repetitions=1, weights=None)[source]

Initialize mixing layer for RAT-SPN.

Parameters:
  • inputs (Module) – Input module to mix over channels.

  • out_channels (int | None) – Number of output mixture components.

  • num_repetitions (int) – Number of parallel repetitions.

  • weights (Tensor | None) – Initial mixing weights (if None, randomly initialized).

expectation_maximization(data, cache=None)[source]

Perform expectation-maximization step.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache dictionary with log-likelihoods.

Raises:

MissingCacheError – If required log-likelihoods are not found in cache.

Return type:

None

log_likelihood(data, cache=None)[source]

Compute log likelihood via weighted log-sum-exp.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Cache for storing intermediate computations.

Returns:

Computed log likelihood values.

Return type:

Tensor
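The weighted log-sum-exp computes log Σ_k w_k · exp(ll_k) for each batch element. A NumPy sketch of the stabilized form (illustrative only, not the spflow internals):

```python
import numpy as np

def weighted_logsumexp(child_ll, log_weights):
    """child_ll: (batch, n_children); log_weights: (n_children,).
    Returns log sum_k w_k * exp(child_ll_k), numerically stabilized."""
    x = child_ll + log_weights              # broadcast over the batch
    m = x.max(axis=1, keepdims=True)        # subtract max for stability
    return (m + np.log(np.exp(x - m).sum(axis=1, keepdims=True))).squeeze(1)

ll = np.log(np.array([[0.5, 0.25]]))
lw = np.log(np.array([0.5, 0.5]))
print(np.exp(weighted_logsumexp(ll, lw)))  # [0.375]
```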

sample(num_samples=None, data=None, is_mpe=False, cache=None, sampling_ctx=None)[source]

Generate samples by choosing mixture components.

Parameters:
  • num_samples (int | None) – Number of samples to generate.

  • data (Tensor | None) – Data tensor to fill with samples.

  • is_mpe (bool) – Whether to perform most probable explanation (MPE) sampling.

  • cache (Cache | None) – Cache for storing intermediate computations.

  • sampling_ctx (SamplingContext | None) – Sampling context for managing sampling state.

Returns:

Generated samples.

Return type:

Tensor

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D array of scopes; each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]