Base Classes

Core infrastructure classes that form the foundation of SPFlow’s module system.

Module

The abstract base class for all SPFlow modules. Every probabilistic circuit component inherits from this class.

class spflow.modules.module.Module

Bases: torch.nn.Module, ABC

Abstract base class for all SPFlow probabilistic circuit modules.

Extends PyTorch’s nn.Module with probabilistic circuit functionality including scope management, caching, and standardized interfaces for inference and learning. All concrete subclasses must implement the abstract methods for log-likelihood, sampling, and marginalization.

inputs

Child module in the circuit graph. None for leaf modules.

Type: Module | None

scope

Variable scope defining which random variables this module operates on.

Type: Scope

__init__()

Initialize the module with no input.

forward(data, cache=None)

Forward pass is simply the log-likelihood function.

Parameters:

data (Tensor) – Input data tensor.
cache (Cache | None) – Optional cache dictionary.
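
The delegation from forward to log_likelihood, together with the abstract-method requirement, can be sketched in plain Python. SketchModule and UniformLeaf are hypothetical stand-ins that operate on lists instead of tensors; they are not SPFlow classes.

```python
from abc import ABC, abstractmethod


class SketchModule(ABC):
    """Hypothetical stand-in for an abstract circuit module (not SPFlow's class)."""

    def __call__(self, data, cache=None):
        # nn.Module routes calls to forward(); this sketch mimics that routing.
        return self.forward(data, cache=cache)

    def forward(self, data, cache=None):
        # The forward pass simply delegates to the log-likelihood.
        return self.log_likelihood(data, cache=cache)

    @abstractmethod
    def log_likelihood(self, data, cache=None):
        ...


class UniformLeaf(SketchModule):
    """Toy leaf: uniform density on [0, 1), so the log-density is 0.0 inside."""

    def log_likelihood(self, data, cache=None):
        return [0.0 if 0.0 <= x < 1.0 else float("-inf") for x in data]


leaf = UniformLeaf()
result = leaf([0.25, 2.0])  # calling the module runs log_likelihood
```

Because log_likelihood is abstract, instantiating SketchModule directly raises TypeError, which is how an ABC enforces that concrete subclasses provide the method.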

abstractmethod log_likelihood(data, cache=None)

Compute the log-likelihood log P(data | module).

Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.

Parameters:

data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, out_features, out_channels).

Return type:

Tensor

Raises:

ValueError – If input data shape is incompatible with module scope.
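
As a rough illustration of the NaN convention, the sketch below evaluates a factorized normal leaf in plain Python. A missing feature integrates to probability 1 and therefore contributes 0.0 in log-space. The helper names and the list-based data layout are assumptions for illustration, not SPFlow's tensor-based implementation.

```python
import math


def normal_log_pdf(x, mean, std):
    """Log-density of a univariate normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def leaf_log_likelihood(row, means, stds):
    """Per-feature log-likelihoods; NaN entries are marginalized out."""
    out = []
    for x, mean, std in zip(row, means, stds):
        if math.isnan(x):
            # Integrating a density over its whole domain gives 1, and log(1) = 0.
            out.append(0.0)
        else:
            out.append(normal_log_pdf(x, mean, std))
    return out


row = [0.0, float("nan")]  # second feature is missing
per_feature = leaf_log_likelihood(row, means=[0.0, 1.0], stds=[1.0, 2.0])
joint = sum(per_feature)  # independent features add in log-space
```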

abstractmethod marginalize(marg_rvs, prune=True, cache=None)

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:

marg_rvs (list[int]) – Random variable indices to marginalize out.
prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.
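
The difference between structural and data-level marginalization can be sketched with a toy factorized leaf. FactorizedNormal is a hypothetical class that stores one (mean, std) pair per variable index; the documented ValueError for out-of-scope variables is deliberately left out of this sketch.

```python
import math


def normal_log_pdf(x, mean, std):
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


class FactorizedNormal:
    """Hypothetical leaf over independent normals, keyed by variable index."""

    def __init__(self, params):
        # params maps a random-variable index to its (mean, std) pair.
        self.params = dict(params)

    @property
    def scope(self):
        return sorted(self.params)

    def log_likelihood(self, row):
        # row maps variable index -> value; NaN marginalizes at the data level.
        return sum(
            normal_log_pdf(row[rv], mean, std)
            for rv, (mean, std) in self.params.items()
            if not math.isnan(row[rv])
        )

    def marginalize(self, marg_rvs):
        # Structural marginalization: build a new leaf without the given variables.
        remaining = {rv: p for rv, p in self.params.items() if rv not in marg_rvs}
        return FactorizedNormal(remaining) if remaining else None


leaf = FactorizedNormal({0: (0.0, 1.0), 1: (1.0, 2.0)})
marginal = leaf.marginalize([1])  # new leaf over variable 0 only

structural = marginal.log_likelihood({0: 0.3})
data_level = leaf.log_likelihood({0: 0.3, 1: float("nan")})
```

Both routes yield the same value here; the structural route returns a smaller module that can be reused, while the NaN route leaves the structure unchanged.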

mpe(num_samples=None, data=None, cache=None, return_leaf_params=False)

Generate most probable explanation from the module’s probability distribution.

This is a convenience method that calls sample with is_mpe=True.

Handles conditional sampling through evidence in data tensor.

Parameters:

num_samples (int | None, optional) – Number of samples to generate. Defaults to None, which is treated as 1.
data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
return_leaf_params (bool, optional) – If True, also return leaf distribution parameters gathered during traversal.

Returns:

MPE values of shape (batch_size, num_features).

Return type:

Tensor

Raises:

ValueError – If sampling parameters are incompatible.

print_structure_stats()

Return a readable text overview of this module’s structure statistics.

This is intended for quick debugging/logging in experiments and mirrors the traversal behavior used by to_str() (skipping internal Cat and ModuleList wrappers).

Return type: str
Returns: Multi-line string summary of structure statistics.

probability(data, cache=None)

Computes likelihoods for modules given input data.

Likelihoods are computed from the log-likelihoods of a module.

Parameters:

data (Tensor) – Tensor containing the input data. Each row corresponds to a sample.
cache (Cache | None) – Optional cache dictionary.

Return type:

Tensor

Returns:

Tensor containing the likelihoods of the input data. Each row corresponds to an input sample.
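
A small numeric sketch of why likelihoods are derived from log-likelihoods rather than computed directly: exponentiating is a single step, but very negative log-likelihoods underflow to zero in floating point. The values below are made up for illustration.

```python
import math

# Hypothetical log-likelihoods for three samples.
log_likelihoods = [-0.5, -3.2, -2000.0]

# probability = exp(log_likelihood), applied element-wise.
likelihoods = [math.exp(ll) for ll in log_likelihoods]

# The last sample underflows to exactly 0.0 as a likelihood, while its
# log-likelihood (-2000.0) is still a perfectly usable finite number.
underflowed = likelihoods[2] == 0.0
```

For training objectives or comparisons between samples, staying in log-space avoids this loss of information.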

sample(num_samples=None, data=None, is_mpe=False, cache=None, return_leaf_params=False)

Generate samples from the module’s probability distribution.

Supports both random sampling and MPE inference (via the is_mpe flag). Handles conditional sampling through evidence in the data tensor.

Parameters:

num_samples (int | None, optional) – Number of samples to generate. Defaults to None, which is treated as 1.
data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.
is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
return_leaf_params (bool, optional) – If True, also return leaf distribution parameters gathered during traversal.

Return type:

Tensor | tuple[Tensor, list[LeafParamRecord]]

Returns:

Sampled values of shape (batch_size, num_features), optionally with collected leaf-parameter records.

Raises:

ValueError – If sampling parameters are incompatible.
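
The interplay of NaN placeholders, evidence, and the is_mpe flag can be sketched with independent categorical features in plain Python. PROBS, the list-based rows, and the rng argument are assumptions made for this illustration; SPFlow itself works on tensors and circuit structures.

```python
import math
import random

# Hypothetical per-feature categorical distributions (each row sums to 1).
PROBS = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]


def sample(data, is_mpe=False, rng=None):
    """Fill NaN entries of each row; observed entries are kept as evidence."""
    rng = rng or random.Random()
    filled = []
    for row in data:
        new_row = []
        for feature, value in enumerate(row):
            probs = PROBS[feature]
            if not math.isnan(value):
                new_row.append(value)  # evidence stays untouched
            elif is_mpe:
                new_row.append(float(probs.index(max(probs))))  # most probable value
            else:
                drawn = rng.choices(range(len(probs)), weights=probs)[0]
                new_row.append(float(drawn))  # random draw
        filled.append(new_row)
    return filled


def mpe(data):
    # Convenience wrapper, mirroring the documented relationship to sample().
    return sample(data, is_mpe=True)


nan = float("nan")
evidence = [[nan, 2.0], [nan, nan]]
mpe_rows = mpe(evidence)
random_rows = sample(evidence, rng=random.Random(0))
```

With is_mpe=True the result is deterministic, whereas repeated random sampling would vary unless the generator is seeded.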

sample_with_evidence(evidence, is_mpe=False, cache=None, return_leaf_params=False)

Samples from the module given evidence.

This is effectively a call to log_likelihood on the evidence, followed by sampling from the module with the populated cache.

Parameters:

evidence (Tensor) – Evidence tensor.
is_mpe (bool) – Whether to compute the most probable explanation (MPE) instead of drawing random samples. Defaults to False.
cache (Cache | None) – Optional cache dictionary to reuse across calls.
return_leaf_params (bool) – If True, also return leaf distribution parameters gathered during traversal.

Return type:

Tensor | tuple[Tensor, list[LeafParamRecord]]

Returns:

Tensor containing the sampled values. Each row corresponds to a sample.
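
Why a populated cache matters can be sketched with a two-component mixture: the log-likelihood pass stores how well each component explains the evidence, and the completion step reuses those values to choose a component. The mixture parameters, the cache key, and the greedy component choice are all assumptions for illustration, not SPFlow internals.

```python
import math


def normal_log_pdf(x, mean, std):
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


# Hypothetical mixture over (x0, x1) with two factorized normal components.
WEIGHTS = [0.5, 0.5]
MEANS = [[-2.0, -1.0], [2.0, 1.0]]
STDS = [[1.0, 1.0], [1.0, 1.0]]


def log_likelihood(row, cache):
    """Mixture log-likelihood; per-component values are stored in the cache."""
    component_ll = []
    for k in range(len(WEIGHTS)):
        ll = 0.0
        for j, x in enumerate(row):
            if not math.isnan(x):  # NaN features are marginalized out
                ll += normal_log_pdf(x, MEANS[k][j], STDS[k][j])
        component_ll.append(ll)
    cache["component_ll"] = component_ll
    terms = [math.log(w) + ll for w, ll in zip(WEIGHTS, component_ll)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))  # log-sum-exp


def mpe_with_evidence(row):
    cache = {}
    log_likelihood(row, cache)  # first pass populates the cache
    scores = [math.log(w) + ll for w, ll in zip(WEIGHTS, cache["component_ll"])]
    best = max(range(len(scores)), key=scores.__getitem__)
    # Second pass: fill missing entries with the chosen component's mode.
    return [MEANS[best][j] if math.isnan(x) else x for j, x in enumerate(row)]


completed = mpe_with_evidence([1.5, float("nan")])
```

The observed value 1.5 is much closer to the second component, so the missing entry is filled from that component.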

to_str(format='tree', max_depth=None, show_params=True, show_scope=True)

Convert this module to a readable string representation.

This method provides visualization formats for understanding module structure.

Parameters:

format (str) – Visualization format, one of:
    - “tree”: ASCII tree view (default, recommended)
    - “pytorch”: Default PyTorch format
max_depth (int | None) – Maximum depth to display (None = unlimited). Only applies to tree format.
show_params (bool) – Whether to show parameter shapes (Sum weights, etc.). Only applies to tree format.
show_scope (bool) – Whether to show scope information. Only applies to tree format.

Return type:

str

Returns:

String representation of the module.

Examples

>>> leaves = Normal(scope=Scope([0, 1]), out_channels=2)
>>> model = Sum(inputs=leaves, out_channels=3)
>>> print(model.to_str())  # Tree view (default)
Sum [D=2, C=3] [weights: (2, 2, 3, 1)] → scope: 0-1
└─ Normal [D=2, C=2] → scope: 0-1
>>> print(model.to_str(format="pytorch"))  # PyTorch format
Sum(
  D=2, C=3, R=1, weights=(2, 2, 3, 1)
  (inputs): Normal(D=2, C=2, R=1)
)
>>> print(model.to_str(max_depth=2))  # Limit depth
Sum [D=2, C=3] [weights: (2, 2, 3, 1)] → scope: 0-1
└─ Normal [D=2, C=2] → scope: 0-1

property device

Device where the module’s parameters are located.

Returns first parameter’s device, or CPU if no parameters exist.

Returns: Device where parameters are located.
Return type: torch.device

abstract property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]

property in_shape: ModuleShape

Expected input tensor shape (features, channels, repetitions).

For leaf modules, returns the shape of data tensors: (features, 1, 1).

Returns: The expected input shape.
Return type: ModuleShape

property inputs: Module | Iterable[Module]

Returns the input module, or None for leaf modules.

Returns: The child input module, or None if this is a leaf module.
Return type: Module | None

property out_shape: ModuleShape

Output tensor shape (features, channels, repetitions).

Returns: The output shape produced by this module.
Return type: ModuleShape

property scope: Scope

Variable scope defining which random variables this module operates on.

Returns: The module’s scope.
Return type: Scope

LeafModule

Abstract base class for all probability distribution implementations at the leaves of the circuit.

class spflow.modules.leaves.leaf.LeafModule(scope, out_channels=1, num_repetitions=1, params=None, parameter_fn=None, validate_args=True)

Bases: Module, ABC

__init__(scope, out_channels=1, num_repetitions=1, params=None, parameter_fn=None, validate_args=True)

Base class for leaf distribution modules.

Parameters:

scope (Union[Scope, int, Iterable[int]]) – Variable scope. Can be a Scope object, a single integer, or an iterable of integers (list, tuple, numpy array, torch tensor, etc.).
out_channels (int) – Number of output channels (inferred from params if None).
num_repetitions (int) – Number of repetitions (for 3D event shapes).
params (list[Tensor | None] | None) – List of parameter tensors (can include None to trigger random init).
parameter_fn (Callable[[Tensor], dict[str, Tensor]]) – Optional function that takes evidence and returns distribution parameters as dictionary.
validate_args (bool | None) – Whether to enable torch.distributions argument validation.

conditional_distribution(evidence, with_differentiable_sampling=False)

Generate a conditional distribution from evidence.

Parameters:

evidence (Tensor) – Evidence tensor for conditioning.
with_differentiable_sampling (bool) – Hook for subclasses to return an alternative distribution with differentiable sampling when needed. Ignored by the base implementation.

Return type:

Distribution

Returns:

torch.distributions.Distribution constructed from conditional parameters.
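
The role of a parameter function in a conditional leaf can be sketched as a callable that maps evidence to a dictionary of distribution parameters. The linear relationship, the key names "loc" and "scale", and the list-based values are assumptions for illustration; in SPFlow the function receives and returns tensors.

```python
def parameter_fn(evidence):
    """Hypothetical mapping from an evidence row to normal parameters."""
    # Linear-Gaussian assumption: the mean depends on the evidence, the scale is fixed.
    loc = [2.0 * e + 1.0 for e in evidence]
    scale = [0.5 for _ in evidence]
    return {"loc": loc, "scale": scale}


def conditional_params(evidence, fn=parameter_fn):
    """Evaluate the parameter function and run a basic validity check."""
    params = fn(evidence)
    if any(s <= 0.0 for s in params["scale"]):
        raise ValueError("scale must be positive")
    return params


params = conditional_params([0.0, 1.5])
```

A distribution object would then be constructed from these parameters, so that each evidence row yields its own conditional distribution.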

distribution(with_differentiable_sampling=False)

Return this leaf’s distribution.

Parameters: with_differentiable_sampling (bool) – Hook for subclasses to return an alternative differentiable distribution when sampling requires gradient flow. Ignored by the base implementation.
Return type: Distribution

log_likelihood(data, cache=None)

Compute log-likelihoods, marginalizing over NaN values.

Parameters:

data (Tensor | IntervalEvidence) – Input data tensor or IntervalEvidence for range queries.
cache (Cache | None) – Optional cache dictionary.

Return type:

Tensor

Returns:

Log-likelihood tensor.

marginalize(marg_rvs, prune=True, cache=None)

Structurally marginalize specified variables.

Parameters:

marg_rvs (list[int]) – Variable indices to marginalize.
prune (bool) – Unused (for interface consistency).
cache (Cache | None) – Optional cache dictionary.

Return type:

Optional[LeafModule]

Returns:

Marginalized leaf or None if fully marginalized.

marginalized_params(indices)

Return parameters marginalized to specified indices.

Parameters: indices (list[int]) – List of indices to marginalize to.
Return type: dict[str, Tensor]
Returns: Dictionary of marginalized parameters.
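
Reading "marginalize to" as "keep only these positions", the operation amounts to indexing every parameter along the feature axis. The sketch below uses plain lists and a free function; both are illustrative assumptions rather than the library's tensor-based method.

```python
def marginalized_params(params, indices):
    """Keep only the parameter entries at the given feature positions."""
    return {name: [values[i] for i in indices] for name, values in params.items()}


# Hypothetical parameters of a three-feature normal leaf.
params = {"loc": [0.0, 1.0, 2.0], "scale": [1.0, 0.5, 2.0]}
kept = marginalized_params(params, [0, 2])  # feature 1 is marginalized out
```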

maximum_likelihood_estimation(data, weights=None, bias_correction=True, nan_strategy=None)

Maximum (weighted) likelihood estimation, implemented with the template method pattern.

Delegates distribution-specific logic to the _mle_compute_statistics() hook. Weights are normalized to sum to N.

Parameters:

data (Tensor) – Input data tensor.
weights (Optional[Tensor]) – Optional sample weights.
bias_correction (bool) – Apply bias correction.
nan_strategy (Union[str, Callable, None]) – How to handle NaN values (‘ignore’, a callable, or None).

Return type:

None
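
The template-method split can be sketched as a base class that performs the shared steps (NaN handling, weight normalization) and a subclass hook that computes the distribution-specific statistics. The hook signature, the meaning of N as the number of retained samples, and the N - 1 form of the bias correction are assumptions of this sketch, not confirmed details of SPFlow.

```python
import math
from abc import ABC, abstractmethod


class SketchLeaf(ABC):
    """Hypothetical leaf showing the template-method split (not SPFlow's class)."""

    def maximum_likelihood_estimation(self, data, weights=None,
                                      bias_correction=True, nan_strategy=None):
        if weights is None:
            weights = [1.0] * len(data)
        # Shared step 1: optionally drop samples with missing values.
        if nan_strategy == "ignore":
            kept = [(x, w) for x, w in zip(data, weights) if not math.isnan(x)]
            data = [x for x, _ in kept]
            weights = [w for _, w in kept]
        # Shared step 2: rescale the weights so they sum to N.
        n = len(data)
        total = sum(weights)
        weights = [w * n / total for w in weights]
        # Distribution-specific step, delegated to the subclass hook.
        self._mle_compute_statistics(data, weights, bias_correction)

    @abstractmethod
    def _mle_compute_statistics(self, data, weights, bias_correction):
        ...


class NormalLeaf(SketchLeaf):
    def __init__(self):
        self.mean, self.std = 0.0, 1.0

    def _mle_compute_statistics(self, data, weights, bias_correction):
        n = len(data)
        mean = sum(w * x for x, w in zip(data, weights)) / n
        sq_dev = sum(w * (x - mean) ** 2 for x, w in zip(data, weights))
        # Assumed correction: divide by N - 1 instead of N.
        self.mean = mean
        self.std = math.sqrt(sq_dev / (n - 1 if bias_correction else n))


leaf = NormalLeaf()
leaf.maximum_likelihood_estimation(
    [1.0, 2.0, float("nan"), 3.0, 4.0], nan_strategy="ignore"
)
```

As in the documented method, the call returns None and updates the leaf's parameters in place.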

mode(is_differentiable=False)

Return distribution mode.

Parameters: is_differentiable (bool) – Whether to return the mode from the differentiable distribution (if supported).
Return type: Tensor
Returns: Mode of the distribution.

abstractmethod params()

Returns the parameters of the distribution.

Return type: Dict[str, Tensor]

property device: device

Return device of first parameter or buffer.

Returns: Device of the module.

property event_shape: tuple[int, ...]

Return event shape.

Returns: Event shape tuple.

property feature_to_scope: ndarray[Scope]

Return list of scopes per feature.

Returns: List of Scope objects, one per feature.

property inputs: Module | Iterable[Module]

Leaf modules do not have inputs.

property is_conditional

Indicates if the leaf uses a parameter network for conditional parameters.