Base Classes
Core infrastructure classes that form the foundation of SPFlow’s module system.
Module
The abstract base class for all SPFlow modules. Every probabilistic circuit component inherits from this class.
- class spflow.modules.module.Module
Abstract base class for all SPFlow probabilistic circuit modules.
Extends PyTorch’s nn.Module with probabilistic circuit functionality including scope management, caching, and standardized interfaces for inference and learning. All concrete subclasses must implement the abstract methods for log-likelihood, sampling, and marginalization.
- inputs
Child module in the circuit graph. None for leaf modules.
- Type:
Module | None
- scope
Variable scope defining which random variables this module operates on.
- Type:
Scope
- expectation_maximization(data, bias_correction=True, cache=None)
Expectation-maximization step.
- abstractmethod log_likelihood(data, cache=None)
Compute the log-likelihood log P(data | module).
Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.
- Parameters:
data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Log-likelihood of shape (batch_size, out_features, out_channels).
- Return type:
Tensor
- Raises:
ValueError – If input data shape is incompatible with module scope.
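The NaN convention described above can be illustrated with a minimal, library-independent sketch. It assumes a fully factorized density over independent Normal features; the helper names `normal_log_pdf` and `factorized_log_likelihood` are hypothetical and are not part of SPFlow. Skipping a NaN entry adds log(1) = 0, which is what integrating that variable out amounts to in this simple case.

```python
import math


def normal_log_pdf(x, mean, std):
    """Log density of a univariate Normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def factorized_log_likelihood(row, means, stds):
    """Sum per-feature log densities, skipping NaN (missing) entries.

    Illustrative only: a missing feature contributes log(1) = 0,
    i.e. it is marginalized out of the factorized density.
    """
    total = 0.0
    for x, mean, std in zip(row, means, stds):
        if math.isnan(x):
            continue  # marginalized feature
        total += normal_log_pdf(x, mean, std)
    return total


means, stds = [0.0, 1.0], [1.0, 2.0]
full = factorized_log_likelihood([0.5, 1.5], means, stds)
partial = factorized_log_likelihood([0.5, float("nan")], means, stds)
only_first = normal_log_pdf(0.5, 0.0, 1.0)
```

Here `partial` equals the log density of the first feature alone, because the second feature was marked missing.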
- abstractmethod marginalize(marg_rvs, prune=True, cache=None)
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
marg_rvs (list[int]) – Random variable indices to marginalize out.
prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
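The scope bookkeeping behind structural marginalization can be sketched without the library. The function `marginalize_scope` below is hypothetical and works on plain lists of variable indices rather than on SPFlow modules; raising when any requested variable lies outside the scope is one reading of the documented ValueError, not a confirmed description of SPFlow's behavior.

```python
def marginalize_scope(scope, marg_rvs):
    """Return the scope left after removing marg_rvs, or None if empty.

    Illustrative stand-in: a real module would also rebuild its
    parameters and children for the remaining variables.
    """
    marg = set(marg_rvs)
    unknown = marg - set(scope)
    if unknown:
        # Assumed behavior, mirroring the documented ValueError.
        raise ValueError(f"Variables {sorted(unknown)} are not in scope {list(scope)}")
    remaining = [rv for rv in scope if rv not in marg]
    return remaining if remaining else None


kept = marginalize_scope([0, 1, 2], [1])      # variables 0 and 2 remain
nothing_left = marginalize_scope([0, 1], [0, 1])  # everything marginalized
```

Returning None when nothing remains mirrors the documented return value for a fully marginalized module.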
- maximum_likelihood_estimation(data, weights=None, bias_correction=True, nan_strategy='ignore', cache=None)
Update parameters via maximum likelihood estimation.
- mpe(num_samples=None, data=None, cache=None, sampling_ctx=None)
Generate most probable explanation from the module’s probability distribution.
This is a convenience method that calls sample with is_mpe=True.
Handles conditional sampling through evidence in data tensor.
- Parameters:
num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
sampling_ctx (SamplingContext | None, optional) – Context for routing samples through the circuit. Defaults to None.
- Returns:
MPE values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
- probability(data, cache=None)
Computes likelihoods for modules given input data.
Likelihoods are computed from the log-likelihoods of a module.
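The relationship between the two quantities is likelihood = exp(log-likelihood). The short sketch below uses only the standard library, not SPFlow, to show why the log-space value is the primary one: a product of many small factors underflows in double precision, while the sum of their logarithms stays finite.

```python
import math

# Moderate case: exponentiating the log-likelihood recovers the likelihood.
small_ll = math.log(0.2) + math.log(0.5)
small_likelihood = math.exp(small_ll)  # product of the two factors

# Extreme case: 100 independent factors of 1e-5 each.
log_probs = [math.log(1e-5)] * 100

product = 1.0
for lp in log_probs:
    product *= math.exp(lp)  # shrinks below the smallest double and hits 0.0

log_likelihood = sum(log_probs)  # remains a finite negative number
```

In the extreme case the plain likelihood is not representable, so any comparison between models has to be made on `log_likelihood`.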
- abstractmethod sample(num_samples=None, data=None, is_mpe=False, cache=None, sampling_ctx=None)
Generate samples from the module’s probability distribution.
Supports both random sampling and MPE inference (via the is_mpe flag). Handles conditional sampling through evidence in the data tensor.
- Parameters:
num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.
is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
sampling_ctx (SamplingContext | None, optional) – Context for routing samples through the circuit. Defaults to None.
- Returns:
Sampled values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
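The difference between random sampling and the is_mpe mode can be sketched for a single categorical variable. The helper `sample_categorical` is hypothetical and uses only the standard library; it is meant to illustrate the two modes, not to reproduce how SPFlow routes samples through a circuit.

```python
import random


def sample_categorical(probs, is_mpe=False, rng=None):
    """Draw a category index, or return the most probable one if is_mpe."""
    if is_mpe:
        # Deterministic: index of the largest probability.
        return max(range(len(probs)), key=probs.__getitem__)
    rng = rng or random.Random()
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


probs = [0.1, 0.7, 0.2]
rng = random.Random(0)
draws = [sample_categorical(probs, rng=rng) for _ in range(1000)]
mode = sample_categorical(probs, is_mpe=True)
```

With is_mpe=True the result is the same on every call, whereas the random draws vary with the generator state.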
- sample_with_evidence(evidence, is_mpe=False, cache=None, sampling_ctx=None)
Samples from module with evidence.
This is effectively calling log_likelihood then sampling from the module with a populated cache.
- Parameters:
evidence (Tensor) – Evidence tensor.
is_mpe (bool) – Whether to perform most probable explanation (MPE) inference instead of random sampling. Defaults to False.
cache (Cache | None) – Optional cache dictionary to reuse across calls.
sampling_ctx (SamplingContext | None) – Optional sampling context containing the instances (i.e., rows) of data to fill with sampled values and the output indices of the node to sample from.
- Returns:
Tensor containing the sampled values. Each row corresponds to a sample.
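The effect of sampling with evidence can be sketched for the simplest possible case: independent Normal features, where observed entries are kept and NaN entries are filled in. The helper `fill_missing` is hypothetical and standard-library only. In an actual circuit the log-likelihood pass populates a cache that influences which branches are sampled; that routing is not modeled here.

```python
import math
import random


def fill_missing(evidence_row, means, stds, is_mpe=False, rng=None):
    """Keep observed values; fill NaN entries by sampling or with the mode."""
    rng = rng or random.Random(0)
    filled = []
    for x, mean, std in zip(evidence_row, means, stds):
        if math.isnan(x):
            # The mode of a Normal is its mean; otherwise draw a random value.
            x = mean if is_mpe else rng.gauss(mean, std)
        filled.append(x)
    return filled


evidence = [0.5, float("nan"), float("nan")]
means, stds = [0.0, 1.0, -2.0], [1.0, 2.0, 0.5]
sampled = fill_missing(evidence, means, stds)
mpe_row = fill_missing(evidence, means, stds, is_mpe=True)
```

The observed first entry is unchanged in both results, and the returned rows contain no NaN values.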
- to_str(format='tree', max_depth=None, show_params=True, show_scope=True)
Convert this module to a readable string representation.
This method provides visualization formats for understanding module structure.
- Parameters:
format (str) – Visualization format, one of "tree" (ASCII tree view; default, recommended) or "pytorch" (default PyTorch format).
max_depth (int | None) – Maximum depth to display (None = unlimited). Only applies to tree format.
show_params (bool) – Whether to show parameter shapes (Sum weights, etc.). Only applies to tree format.
show_scope (bool) – Whether to show scope information. Only applies to tree format.
- Returns:
String representation of the module.
Examples
>>> leaves = Normal(scope=Scope([0, 1]), out_channels=2)
>>> model = Sum(inputs=leaves, out_channels=3)
>>> print(model.to_str())  # Tree view (default)
Sum [D=2, C=3] [weights: (2, 2, 3, 1)] → scope: 0-1
└─ Normal [D=2, C=2] → scope: 0-1
>>> print(model.to_str(format="pytorch"))  # PyTorch format
Sum(
  D=2, C=3, R=1, weights=(2, 2, 3, 1)
  (inputs): Normal(D=2, C=2, R=1)
)
>>> print(model.to_str(max_depth=2))  # Limit depth
Sum [D=2, C=3] [weights: (2, 2, 3, 1)] → scope: 0-1
└─ Normal [D=2, C=2] → scope: 0-1
- property device
Device where the module’s parameters are located.
Returns first parameter’s device, or CPU if no parameters exist.
- Returns:
Device where parameters are located.
- abstract property feature_to_scope: ndarray
Mapping from output features to their respective scopes.
- Returns:
2D array of scopes. Each row corresponds to an output feature, each column to a repetition.
- Return type:
np.ndarray[Scope]
- property in_shape: ModuleShape
Expected input tensor shape (features, channels, repetitions).
For leaf modules, returns the shape of data tensors: (features, 1, 1).
- Returns:
The expected input shape.
- Return type:
ModuleShape
- property inputs: Module | Iterable[Module]
Returns the input module, or None for leaf modules.
- Returns:
The child input module, or None if this is a leaf module.
- Return type:
Module | None
- property out_shape: ModuleShape
Output tensor shape (features, channels, repetitions).
- Returns:
The output shape produced by this module.
- Return type:
ModuleShape
LeafModule
Abstract base class for all probability distribution implementations at the leaves of the circuit.
- class spflow.modules.leaves.leaf.LeafModule(scope, out_channels=None, num_repetitions=1, params=None, parameter_fn=None, validate_args=True)
- __init__(scope, out_channels=None, num_repetitions=1, params=None, parameter_fn=None, validate_args=True)
Base class for leaf distribution modules.
- Parameters:
scope (
Scope | int | list[int]) – Variable scope (Scope, int, or list[int]).
out_channels (int) – Number of output channels (inferred from params if None).
num_repetitions (int) – Number of repetitions (for 3D event shapes).
params (list[Tensor | None] | None) – List of parameter tensors (can include None to trigger random init).
parameter_fn (Callable[[Tensor], dict[str, Tensor]]) – Optional function that takes evidence and returns distribution parameters as a dictionary.
validate_args (bool | None) – Whether to enable torch.distributions argument validation.
- conditional_distribution(evidence)
Generates torch.distributions object conditionally based on evidence.
- Parameters:
evidence (Tensor) – Evidence tensor for conditioning.
- Returns:
torch.distributions.Distribution constructed from conditional parameters.
- marginalize(marg_rvs, prune=True, cache=None)
Structurally marginalize specified variables.
- maximum_likelihood_estimation(data, weights=None, bias_correction=True, nan_strategy=None, cache=None)
Maximum (weighted) likelihood estimation via template method pattern.
Delegates distribution-specific logic to _mle_compute_statistics() hook. Weights normalized to sum to N.
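Weighted maximum likelihood with weights normalized to sum to N can be sketched for a univariate Normal using only the standard library. The function `weighted_normal_mle` is hypothetical, N is taken here to be the number of data points, and treating bias_correction as the usual N - 1 denominator is an assumption rather than a confirmed description of SPFlow's estimator.

```python
import math


def weighted_normal_mle(data, weights=None, bias_correction=True):
    """Estimate (mean, std) of a Normal from weighted samples."""
    n = len(data)
    if weights is None:
        weights = [1.0] * n
    # Rescale the weights so that they sum to N (assumed: N = len(data)).
    scale = n / sum(weights)
    w = [wi * scale for wi in weights]
    mean = sum(wi * x for wi, x in zip(w, data)) / n
    sq_dev = sum(wi * (x - mean) ** 2 for wi, x in zip(w, data))
    # Assumed meaning of bias_correction: divide by N - 1 instead of N.
    denom = n - 1 if bias_correction else n
    return mean, math.sqrt(sq_dev / denom)


data = [1.0, 2.0, 3.0, 4.0]
mean_c, std_c = weighted_normal_mle(data)
mean_u, std_u = weighted_normal_mle(data, bias_correction=False)
mean_w, std_w = weighted_normal_mle(data, weights=[2.0, 2.0, 2.0, 2.0])
```

Because the weights are rescaled before use, multiplying all of them by a constant leaves the estimate unchanged, as the uniformly weighted call illustrates.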
- sample(num_samples=None, data=None, is_mpe=False, cache=None, sampling_ctx=None)
Sample from leaf distribution given potential evidence.
- Returns:
Sampled data tensor.
- property distribution: Distribution
Returns the underlying torch.distributions.Distribution object.
- property feature_to_scope: ndarray[Scope]
Return list of scopes per feature.
- Returns:
List of Scope objects, one per feature.
- property is_conditional
Indicates if the leaf uses a parameter network for conditional parameters.