Sum Modules

Sum

Weighted sum over input modules with learnable log-space weights.

class spflow.modules.sums.Sum(inputs, out_channels=None, num_repetitions=1, weights=None)

Bases: Module

Sum module representing mixture operations in probabilistic circuits.

Implements mixture modeling by computing weighted combinations of child distributions. Weights are normalized to sum to one, maintaining valid probability distributions. Supports both single input (mixture over channels) and multiple inputs (mixture over concatenated inputs).

inputs

Input module(s) to the sum node.

Type:

Module

sum_dim

Dimension over which to sum the inputs.

Type:

int

weights

Normalized weights for mixture components.

Type:

Tensor

logits

Unnormalized log-weights for gradient optimization.

Type:

Parameter

__init__(inputs, out_channels=None, num_repetitions=1, weights=None)

Create a Sum module for mixture modeling.

Weights are automatically normalized to sum to one using softmax. Multiple inputs are concatenated along dimension 2 internally.
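The normalization described above can be illustrated with a minimal plain-Python sketch. It does not use spflow or torch; the helper names `softmax` and `log_softmax` are hypothetical and only show how unnormalized log-weights (logits) could map to weights that sum to one.

```python
import math

def softmax(logits):
    """Map unnormalized log-weights to weights that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(logits):
    """Normalized log-weights; their exponentials sum to one."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]

# Hypothetical logits for a sum node with three inputs.
logits = [0.0, math.log(2.0), math.log(7.0)]
weights = softmax(logits)          # proportional to 1 : 2 : 7
log_weights = log_softmax(logits)  # element-wise log of `weights`
```

Optimizing the logits rather than the weights keeps the parameters unconstrained, while the normalization guarantees a valid mixture after every update.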

Parameters:
  • inputs (Module | list[Module]) – Single module or list of modules to mix.

  • out_channels (int | None, optional) – Number of output mixture components. Required if weights not provided.

  • num_repetitions (int | None, optional) – Number of repetitions for structured representations. Inferred from weights if not provided.

  • weights (Tensor | list[float] | None, optional) – Initial mixture weights. Must have compatible shape with inputs and out_channels.

Raises:

expectation_maximization(data, bias_correction=True, cache=None)

Perform expectation-maximization step.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Optional cache dictionary with log-likelihoods.

  • bias_correction (bool) – Whether to apply bias correction.

Raises:

MissingCacheError – If required log-likelihoods are not found in cache.

Return type:

None
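As a rough illustration of what an EM step does for mixture weights, the sketch below implements the textbook update in plain Python: each weight becomes the average posterior responsibility of its child. This is an assumption about the general technique, not a description of spflow's implementation; the function `em_weight_update` is hypothetical and the `bias_correction` option is not modeled.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def em_weight_update(log_weights, child_lls):
    """Textbook EM update for mixture weights.

    child_lls[n][k] is the log-likelihood of sample n under child k.
    """
    totals = [0.0] * len(log_weights)
    for row in child_lls:
        joint = [lw + ll for lw, ll in zip(log_weights, row)]
        norm = logsumexp(joint)
        for k, j in enumerate(joint):
            totals[k] += math.exp(j - norm)  # responsibility of child k
    return [t / len(child_lls) for t in totals]

# Two children with equal starting weights and three samples.
log_weights = [math.log(0.5), math.log(0.5)]
child_lls = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.8), math.log(0.2)],
    [math.log(0.3), math.log(0.7)],
]
new_weights = em_weight_update(log_weights, child_lls)
```

The child log-likelihoods play the role of the cached values that the method expects to find in `cache`.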

log_likelihood(data, cache=None)

Compute the log-likelihood log P(data | module).

Computes log likelihood using logsumexp for numerical stability. Results are cached for parameter learning algorithms.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate evidence for conditional computation.

  • cache (Cache | None) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, num_features, out_channels) or (batch_size, num_features, out_channels, num_repetitions).

Return type:

Tensor
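The logsumexp computation mentioned above can be sketched for a single sum node in plain Python. The helper names are hypothetical and the example ignores the batch, feature, and repetition dimensions; it only shows why the log-domain form stays finite where a naive product of probabilities would underflow.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_log_likelihood(log_weights, child_lls):
    """log sum_k w_k * p_k(x), computed entirely in log space."""
    return logsumexp([lw + ll for lw, ll in zip(log_weights, child_lls)])

log_weights = [math.log(0.25), math.log(0.75)]

# Moderate likelihoods: 0.25 * 0.2 + 0.75 * 0.4 = 0.35
ll = mixture_log_likelihood(log_weights, [math.log(0.2), math.log(0.4)])

# Very small likelihoods: exp(-1000) underflows to 0.0 in floating point,
# but the log-domain result remains finite.
tiny_ll = mixture_log_likelihood(log_weights, [-1000.0, -1001.0])
```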

marginalize(marg_rvs, prune=True, cache=None)

Marginalize out specified random variables.

Parameters:
  • marg_rvs (list[int]) – List of random variables to marginalize.

  • prune (bool) – Whether to prune the module.

  • cache (Cache | None) – Optional cache dictionary.

Return type:

Sum | None

Returns:

Marginalized Sum module or None.

maximum_likelihood_estimation(data, weights=None, cache=None)

Update parameters via maximum likelihood estimation.

For Sum modules, this is equivalent to EM.

Parameters:
  • data (Tensor) – Input data tensor.

  • weights (Tensor | None) – Optional sample weights (currently unused).

  • cache (Cache | None) – Optional cache dictionary.

Return type:

None

sample(num_samples=None, data=None, is_mpe=False, cache=None, sampling_ctx=None)

Generate samples from sum module.

Parameters:
  • num_samples (int | None) – Number of samples to generate.

  • data (Tensor | None) – Data tensor with NaN values to fill with samples.

  • is_mpe (bool) – Whether to perform most probable explanation (MPE) inference instead of random sampling.

  • cache (Cache | None) – Optional cache dictionary.

  • sampling_ctx (SamplingContext | None) – Optional sampling context.

Returns:

Sampled values.

Return type:

Tensor
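Sampling from a mixture can be pictured as a two-step process: choose one input, then let that input fill the missing entries. The plain-Python sketch below is an assumption about the general technique, not spflow's implementation. The function `sample_mixture` and the constant child samplers are hypothetical, and the branch choice uses the prior weights only; conditioning on observed entries would additionally reweight each branch by its likelihood of the evidence.

```python
import math
import random

def sample_mixture(weights, child_samplers, data, is_mpe=False, rng=random):
    """Fill NaN entries of `data` using one selected mixture component."""
    indices = range(len(weights))
    if is_mpe:
        # MPE: deterministically follow the highest-weight branch.
        k = max(indices, key=weights.__getitem__)
    else:
        # Sampling: draw a branch with probability equal to its weight.
        k = rng.choices(indices, weights=weights, k=1)[0]
    draw = child_samplers[k]
    # Observed (non-NaN) entries are kept; NaN entries are filled in.
    return [draw() if math.isnan(x) else x for x in data]

weights = [0.2, 0.8]
# Hypothetical children that always return a fixed value.
child_samplers = [lambda: 0.0, lambda: 10.0]
data = [1.5, float("nan")]

mpe_result = sample_mixture(weights, child_samplers, data, is_mpe=True)
random_result = sample_mixture(
    weights, child_samplers, data, rng=random.Random(0)
)
```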

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]

property log_weights: Tensor

Returns the log weights of all nodes as a tensor.

Returns:

Normalized log weights, i.e. log weights whose exponentials sum to one.

Return type:

Tensor

property weights: Tensor

Returns the weights of all nodes as a tensor.

Returns:

Weights normalized to sum to one.

Return type:

Tensor

ElementwiseSum

Element-wise summation over multiple inputs with the same scope.

class spflow.modules.sums.ElementwiseSum(inputs, out_channels=None, weights=None, num_repetitions=None)

Bases: Module

Elementwise sum operation for mixture modeling.

Computes weighted combinations of input tensors element-wise. Weights are automatically normalized to sum to one. Uses log-domain computations.

logits

Unnormalized log-weights for gradient optimization.

Type:

Parameter

unraveled_channel_indices

Mapping for flattened channel indices.

Type:

Tensor

__init__(inputs, out_channels=None, weights=None, num_repetitions=None)

Initialize elementwise sum module.

Parameters:
  • inputs (list[Module]) – Input modules (same features, compatible channels).

  • out_channels (int | None) – Number of output nodes per sum. Because the sum runs over the list of input modules, the module has out_channels * in_channels output channels in total, where in_channels is the channel count of the input modules.

  • weights (Tensor | None) – Initial weights (if None, randomly initialized).

  • num_repetitions (int | None) – Number of repetitions.

expectation_maximization(data, cache=None)

Perform EM step to update mixture weights.

Parameters:
  • data (Tensor) – Training data tensor.

  • cache (Cache | None) – Cache for memoization.

Return type:

None

log_likelihood(data, cache=None)

Compute log likelihood via weighted log-sum-exp.

Parameters:
  • data (Tensor) – Input data tensor.

  • cache (Cache | None) – Cache for memoization.

Returns:

Computed log likelihood values.

Return type:

Tensor
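A weighted log-sum-exp over a list of inputs can be sketched as follows in plain Python. The example assumes two input modules with two channels each and a single output node per channel; the function `elementwise_sum_ll` and the data layout are hypothetical and do not reflect spflow's tensor shapes.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def elementwise_sum_ll(log_weights, input_lls):
    """Mix the k-th input's channel c with weight exp(log_weights[k][c]).

    Both arguments are indexed as [input][channel]; the result has one
    log-likelihood per channel.
    """
    num_channels = len(input_lls[0])
    return [
        logsumexp([lw[c] + ll[c] for lw, ll in zip(log_weights, input_lls)])
        for c in range(num_channels)
    ]

# Per-channel log-likelihoods of two inputs with the same scope.
input_lls = [
    [math.log(0.2), math.log(0.5)],
    [math.log(0.6), math.log(0.1)],
]
# For each channel, the weights over the two inputs sum to one.
log_weights = [
    [math.log(0.5), math.log(0.9)],
    [math.log(0.5), math.log(0.1)],
]
out = elementwise_sum_ll(log_weights, input_lls)
# Channel 0: 0.5 * 0.2 + 0.5 * 0.6 = 0.40
# Channel 1: 0.9 * 0.5 + 0.1 * 0.1 = 0.46
```

Unlike a sum over concatenated channels, each output here combines only the matching channel of every input, which is why the inputs must share features and have compatible channels.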

marginalize(marg_rvs, prune=True, cache=None)

Marginalize out specified random variables.

Parameters:
  • marg_rvs (list[int]) – Random variables to marginalize out.

  • prune (bool) – Whether to prune the resulting module.

  • cache (Cache | None) – Cache for memoization.

Returns:

Marginalized module or None if fully marginalized.

Return type:

ElementwiseSum | None

maximum_likelihood_estimation(data, weights=None, cache=None)

MLE step (equivalent to EM for sum nodes).

Parameters:
  • data (Tensor) – Training data tensor.

  • weights (Tensor | None) – Optional weights for data points.

  • cache (Cache | None) – Cache for memoization.

Return type:

None

sample(num_samples=None, data=None, is_mpe=False, cache=None, sampling_ctx=None)

Generate samples by choosing mixture components.

Parameters:
  • num_samples (int | None) – Number of samples to generate.

  • data (Tensor | None) – Existing data tensor to fill with samples.

  • is_mpe (bool) – Whether to perform most probable explanation (MPE) inference instead of random sampling.

  • cache (Cache | None) – Cache for memoization.

  • sampling_ctx (SamplingContext | None) – Sampling context for conditional sampling.

Returns:

Generated samples.

Return type:

Tensor

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]

property log_weights: Tensor

property weights: Tensor