Sum of Compatible Squares (SOCS)

SPFlow includes an implementation of SOCS / Σ²_cmp (“Sum of Compatible Squares Circuits”). SOCS turns a set of compatible, possibly signed component circuits into a valid, non-negative probability model.

Reference

The core SOCS / Σ²_cmp construction is described in the AAAI paper:

Definition

Let c_i(x) be real-valued component circuits that share the same structured decomposition (“compatible” components). SOCS defines the non-negative function:

\[c(x) = \sum_{i=1}^r c_i(x)^2\]

and the normalized density:

\[p(x) = \frac{c(x)}{Z}, \qquad Z = \int c(x) \, dx = \sum_{i=1}^r \int c_i(x)^2 \, dx.\]

In SPFlow, SOCS is implemented as the wrapper module spflow.zoo.sos.SOCS.
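As a worked instance of the two formulas above, the following plain-Python sketch evaluates a single signed component over one binary variable by brute force. It does not use SPFlow: the helper names (`bernoulli_pmf`, `component`) are illustrative assumptions, the weights mirror the minimal example later on this page, and the integral reduces to a sum because the variable is discrete.

```python
import math


def bernoulli_pmf(x: int, p: float) -> float:
    """Probability mass of a Bernoulli(p) variable at x in {0, 1}."""
    return p if x == 1 else 1.0 - p


def component(x: int) -> float:
    """A signed component c_1(x): a linear combination with one negative weight."""
    return 0.9 * bernoulli_pmf(x, 0.2) - 0.2 * bernoulli_pmf(x, 0.8)


components = [component]  # r = 1 component (hypothetical toy setup)
states = [0, 1]


def c(x: int) -> float:
    """Unnormalized SOCS function c(x) = sum_i c_i(x)^2, non-negative by construction."""
    return sum(ci(x) ** 2 for ci in components)


# Z sums c(x) over all states (the discrete analogue of the integral).
Z = sum(c(x) for x in states)
p = {x: c(x) / Z for x in states}
log_p = {x: math.log(p[x]) for x in states}
```

Here c_1(0) = 0.68 and c_1(1) = 0.02, so Z = 0.68² + 0.02² = 0.4628 and almost all probability mass sits on x = 0.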

Why “signed” components?

Standard SPFlow sum nodes (spflow.modules.sums.Sum) represent convex mixtures and require strictly positive weights. To represent signed circuits, the Paper Zoo provides spflow.zoo.sos.SignedSum, which allows real-valued (including negative) weights.

Important: SignedSum is not a probabilistic mixture node (its output may be negative), so it does not implement log_likelihood. Instead, SOCS evaluates signed components using a stable signed representation internally.
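The API reference on this page describes the signed representation as a (log|·|, sign) pair. The sketch below shows one common way to add signed terms in that form without leaving log-space. It is a plain-Python illustration under that assumption, not SPFlow's implementation, and the name `signed_logsumexp_sketch` is hypothetical.

```python
import math


def signed_logsumexp_sketch(logabs_terms, sign_terms):
    """Return (log|S|, sign(S)) for S = sum_i s_i * exp(a_i)."""
    # Drop terms that are exactly zero (sign 0 or log-magnitude -inf).
    active = [(a, s) for a, s in zip(logabs_terms, sign_terms)
              if s != 0 and a != -math.inf]
    if not active:
        return -math.inf, 0
    # Factor out the largest magnitude so every exp() argument is <= 0.
    m = max(a for a, _ in active)
    total = sum(s * math.exp(a - m) for a, s in active)
    if total == 0.0:
        return -math.inf, 0  # exact cancellation
    return m + math.log(abs(total)), (1 if total > 0 else -1)


# 0.18 - 0.16 = 0.02, evaluated from log-magnitudes and signs.
logabs, sign = signed_logsumexp_sketch([math.log(0.18), math.log(0.16)], [1, -1])

# Magnitudes far below float range still work: exp(-1000) alone underflows to 0.0.
tiny_logabs, tiny_sign = signed_logsumexp_sketch([-1000.0, -1001.0], [1, -1])
```

Given log|c_i(x)|, the squared term needed by SOCS is simply log c_i(x)² = 2 · log|c_i(x)|, and the sign drops out.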

Exact normalization via inner products

SOCS needs the normalization terms:

\[Z_i = \int c_i(x)^2 \, dx.\]

Rather than building an explicit “squared circuit”, SPFlow computes these terms using an exact, bottom-up inner-product dynamic program implemented in spflow.utils.inner_product.

The implementation supports exact inner products for common leaves (and can be extended by adding new closed-form formulas in spflow/exp/sos/inner_product.py). Currently supported leaves include:

  • Normal, Bernoulli, Categorical

  • Exponential, Laplace, LogNormal

  • Poisson, Gamma

  • CLTree (only when both trees share the same structure)
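To illustrate what a closed-form leaf inner product looks like and how it yields Z_i without building a squared circuit, here is a small standard-library sketch. The function names are hypothetical and it covers only a single sum over leaves. For c_i = Σ_j w_j f_j, the normalizer is the quadratic form Z_i = Σ_{j,k} w_j w_k ⟨f_j, f_k⟩.

```python
import math


def bernoulli_inner(p_a: float, p_b: float) -> float:
    """sum over x in {0, 1} of Bern(x; p_a) * Bern(x; p_b)."""
    return p_a * p_b + (1.0 - p_a) * (1.0 - p_b)


def normal_inner(mu_a: float, sigma_a: float, mu_b: float, sigma_b: float) -> float:
    """Integral of N(x; mu_a, sigma_a^2) * N(x; mu_b, sigma_b^2) over the real line.

    This equals the density of N(mu_b, sigma_a^2 + sigma_b^2) evaluated at mu_a.
    """
    var = sigma_a ** 2 + sigma_b ** 2
    return math.exp(-0.5 * (mu_a - mu_b) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def self_inner_product(weights, gram) -> float:
    """Z_i = sum_{j,k} w_j * w_k * gram[j][k] for a signed sum of leaves."""
    n = len(weights)
    return sum(weights[j] * weights[k] * gram[j][k]
               for j in range(n) for k in range(n))


# Signed sum 0.9 * Bern(0.2) - 0.2 * Bern(0.8) over one binary variable.
probs = [0.2, 0.8]
weights = [0.9, -0.2]
gram = [[bernoulli_inner(pa, pb) for pb in probs] for pa in probs]
Z_1 = self_inner_product(weights, gram)
```

The Gram matrix here is [[0.68, 0.32], [0.32, 0.68]], giving Z_1 = 0.4628. For product nodes over disjoint scopes the inner product factorizes into a product of the children's inner products, which is why the components must share the same decomposition.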

Non-scalar outputs

SPFlow modules return log-likelihood tensors with shape (batch, F, C, R) where:

  • F = output features

  • C = output channels

  • R = repetitions

SOCS supports non-scalar component outputs and normalizes per output entry:

\[Z[f,c,r] = \sum_i \int c_{i,f,c,r}(x)^2 \, dx.\]

This makes SOCS usable as a building block inside larger architectures (e.g., class-conditional SOCS with channels = num_classes).

Sampling (signed components)

Sampling from the density proportional to c_i(x)^2 is not generally tractable with the standard top-down sampler when c_i is signed.

SPFlow implements an independence Metropolis–Hastings sampler for scalar-output SOCS (out_shape == (1,1,1)), targeting:

\[\pi_i(x) \propto c_i(x)^2.\]

For signed components, the proposal distribution is built by replacing each SignedSum node with a standard Sum using abs(weights) (normalized). This yields a monotone proposal circuit q_i(x) that supports both sample() and log_likelihood().

The MH kernel uses the acceptance rule:

\[\alpha(x \to x') = \min\left(1, \frac{\pi_i(x') \, q_i(x)}{\pi_i(x) \, q_i(x')}\right).\]
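The acceptance rule can be sketched on a toy problem with one binary variable. Everything below is an illustrative assumption rather than SPFlow code: the target is the square of a hand-written signed component, the proposal uses the same leaves with abs(weights) normalized by their sum (0.9 + 0.2 = 1.1), and the chain keeps every post-burn-in state, which may differ from what the library returns.

```python
import random

STATES = [0, 1]


def bern(x: int, p: float) -> float:
    """Bernoulli(p) probability mass at x."""
    return p if x == 1 else 1.0 - p


def target_unnorm(x: int) -> float:
    """pi(x) up to a constant: the squared signed component c(x)^2."""
    c = 0.9 * bern(x, 0.2) - 0.2 * bern(x, 0.8)
    return c * c


def proposal_pmf(x: int) -> float:
    """q(x): same leaves, weights replaced by |w| / sum |w|."""
    return (0.9 * bern(x, 0.2) + 0.2 * bern(x, 0.8)) / 1.1


def acceptance(x: int, x_new: int) -> float:
    """Independence MH acceptance probability alpha(x -> x')."""
    # target_unnorm is strictly positive on both states in this toy example,
    # so the ratio is well defined.
    ratio = (target_unnorm(x_new) * proposal_pmf(x)) / (
        target_unnorm(x) * proposal_pmf(x_new))
    return min(1.0, ratio)


def run_chain(steps: int, burn_in: int, rng: random.Random) -> list:
    """Run one chain and return the states visited after burn-in."""
    q_weights = [proposal_pmf(s) for s in STATES]
    x = rng.choices(STATES, weights=q_weights)[0]  # initial state drawn from q
    kept = []
    for t in range(burn_in + steps):
        x_new = rng.choices(STATES, weights=q_weights)[0]  # proposal ignores x
        if rng.random() < acceptance(x, x_new):
            x = x_new
        if t >= burn_in:
            kept.append(x)
    return kept


samples = run_chain(steps=5000, burn_in=10, rng=random.Random(0))
freq_one = sum(samples) / len(samples)
```

The exact target probability of x = 1 is 0.0004 / 0.4628, so `freq_one` should be close to zero, whereas the proposal alone would produce x = 1 about 31% of the time.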

Configuration

Sampling behavior can be controlled via spflow.utils.cache.Cache extras:

  • cache.extras["socs_mcmc_steps"]: number of MH steps after burn-in (default: 50)

  • cache.extras["socs_mcmc_burn_in"]: burn-in steps (default: 10)

Limitations

  • SOCS sampling currently supports unconditional sampling only (no evidence / NaN-conditional sampling).

  • SOCS sampling currently supports only scalar outputs (out_shape == (1,1,1)).

Compatibility checks

SOCS assumes component circuits are compatible (same decomposition / region graph). SPFlow provides conservative structural checks in spflow.utils.compatibility.

These utilities verify that corresponding nodes across components have the same “skeleton” (node types, scopes, arities, and selected structural metadata like Cat.dim and CLTree.parents).
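A skeleton comparison of this kind can be sketched as a parallel recursive walk. The `Node` class and both functions below are hypothetical stand-ins, not SPFlow classes, and requiring exactly equal node types is an assumption about how strict such a check would be.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a circuit node (not an SPFlow class)."""
    kind: str        # e.g. "SignedSum", "Product", "Bernoulli"
    scope: tuple     # variable indices covered by this node
    children: list = field(default_factory=list)


def same_skeleton(a: Node, b: Node) -> bool:
    """True if both subtrees agree on node type, scope, and arity at every node."""
    if a.kind != b.kind or a.scope != b.scope or len(a.children) != len(b.children):
        return False
    return all(same_skeleton(ca, cb) for ca, cb in zip(a.children, b.children))


def check_compatible(components) -> None:
    """Raise ValueError if any component differs structurally from the first."""
    first = components[0]
    for idx, comp in enumerate(components[1:], start=1):
        if not same_skeleton(first, comp):
            raise ValueError(f"component {idx} is not compatible with component 0")


def make_component(leaf_kind: str) -> Node:
    """Product over variables 0 and 1 with one leaf per variable."""
    return Node("Product", (0, 1),
                [Node(leaf_kind, (0,)), Node(leaf_kind, (1,))])


check_compatible([make_component("Bernoulli"), make_component("Bernoulli")])  # passes
mismatch = not same_skeleton(make_component("Bernoulli"), make_component("Normal"))
```

Parameters such as weights or leaf probabilities are deliberately not compared: compatible components share structure, not values.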

Structure builder

To reduce boilerplate, SPFlow includes a small builder, spflow.zoo.sos.build_socs, that clones a template circuit into multiple compatible components and optionally converts all Sum nodes to SignedSum nodes.

Minimal example

Build a SOCS model from one signed component and evaluate it:

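import torch
from spflow.meta.data.scope import Scope
from spflow.modules.leaves import Bernoulli
from spflow.zoo.sos import SignedSum, SOCS

# Two Bernoulli leaves over the same variable (index 0).
b1 = Bernoulli(scope=Scope([0]), probs=torch.tensor([[[0.2]]]))
b2 = Bernoulli(scope=Scope([0]), probs=torch.tensor([[[0.8]]]))
# Signed linear combination of the two leaves; the second weight is negative.
comp = SignedSum(inputs=[b1, b2], weights=torch.tensor([[[[0.9]], [[-0.2]]]]))

# Wrap the single component: c(x) = comp(x)^2, normalized by Z.
model = SOCS([comp])
x = torch.tensor([[0.0], [1.0]])  # batch of two observations, one feature
ll = model.log_likelihood(x)  # (B,1,1,1)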

API Reference

class spflow.zoo.sos.SOCS(components)

Bases: Module

Sum of Compatible Squares (SOCS) wrapper module.

Represents a non-negative density of the form:

c(x) = Σ_i c_i(x)^2

p(x) = c(x) / Z, where Z = ∫ c(x) dx = Σ_i ∫ c_i(x)^2 dx

Notes

  • log_likelihood() is supported for signed components built with SignedSum.

  • sample() is supported only when all components are standard monotone SPFlow PCs (i.e., do not contain SignedSum), using a Metropolis–Hastings independence sampler.

log_likelihood(data, cache=None)

Compute log likelihood P(data | module).

Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, out_features, out_channels).

Return type:

Tensor

Raises:

ValueError – If input data shape is incompatible with module scope.

marginalize(marg_rvs, prune=True, cache=None)

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:
  • marg_rvs (list[int]) – Random variable indices to marginalize out.

  • prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.

sample(num_samples=None, data=None, is_mpe=False, cache=None)

Generate samples from the module’s probability distribution.

Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.

Parameters:
  • num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.

  • data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.

  • is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Sampled values of shape (batch_size, num_features).

Return type:

Tensor

Raises:

ValueError – If sampling parameters are incompatible.

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]

class spflow.zoo.sos.SignedSum(inputs, out_channels=1, num_repetitions=1, weights=None)

Bases: Module

Linear-combination node that allows negative, non-normalized weights.

This node is not a probabilistic mixture node. It represents a real-valued linear combination of input channels:

y = Σ_j w_j * x_j

where weights may be negative and do not need to sum to one.

Notes

  • SignedSum does not implement log_likelihood() because its output may be negative (log is undefined). Use SOCS or signed evaluation utilities for inference.

  • sample() is only supported when all weights are non-negative and no evidence is present, in which case it behaves like an unnormalized mixture over inputs.

log_likelihood(data, cache=None)

Compute log likelihood P(data | module).

Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.

Parameters:
  • data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Log-likelihood of shape (batch_size, out_features, out_channels).

Return type:

Tensor

Raises:

ValueError – If input data shape is incompatible with module scope.

marginalize(marg_rvs, prune=True, cache=None)

Structurally marginalize out specified random variables from the module.

Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.

Parameters:
  • marg_rvs (list[int]) – Random variable indices to marginalize out.

  • prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.

  • cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.

Returns:

Marginalized module, or None if all variables are marginalized out.

Return type:

Module | None

Raises:

ValueError – If marginalization variables are not in the module’s scope.

signed_logabs_and_sign(data, cache=None)

Evaluate this node in (log|·|, sign) form.

Returns:

(logabs, sign), where logabs is a Tensor of shape (B, F, OC, R) and sign is a Tensor of shape (B, F, OC, R) with values in {-1, 0, +1}.

Return type:

tuple[Tensor, Tensor]

property feature_to_scope: ndarray

Mapping from output features to their respective scopes.

Returns:

2D-array of scopes. Each row corresponds to an output feature, each column to a repetition.

Return type:

np.ndarray[Scope]

Builders

spflow.zoo.sos.build_socs(template, *, num_components, signed=True, noise_scale=0.05, flip_prob=0.5, seed=None)

Build a SOCS model from a compatible component template.

Parameters:
  • template (Module) – A SPFlow module representing a (typically scalar-output) circuit. This circuit is deep-copied num_components times to ensure all components share the same structure.

  • num_components (int) – Number of components r.

  • signed (bool) – If True, convert all Sum nodes in each clone to SignedSum nodes with perturbed weights (allowing negative weights).

  • noise_scale (float) – Standard deviation of additive Gaussian noise applied to copied weights when signed=True.

  • flip_prob (float) – Probability of flipping the sign of each weight entry when signed=True. Must be in [0, 1].

  • seed (int | None) – Optional random seed used for weight perturbations.

Return type:

SOCS

Returns:

A SOCS module with num_components compatible components.
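The roles of noise_scale, flip_prob, and seed can be illustrated with a toy version of the clone-and-perturb step. This is a sketch under loud assumptions: the template is a plain dictionary rather than an SPFlow module, the names `perturb_weights` and `build_components` are hypothetical, and the order "add noise, then flip" is a guess at one reasonable reading of the parameter descriptions.

```python
import copy
import random


def perturb_weights(weights, noise_scale, flip_prob, rng):
    """Add Gaussian noise to each weight, then flip its sign with probability flip_prob."""
    out = []
    for w in weights:
        w = w + rng.gauss(0.0, noise_scale)
        if rng.random() < flip_prob:
            w = -w
        out.append(w)
    return out


def build_components(template, num_components, signed=True,
                     noise_scale=0.05, flip_prob=0.5, seed=None):
    """Deep-copy a template num_components times, perturbing weights if signed."""
    if not 0.0 <= flip_prob <= 1.0:
        raise ValueError("flip_prob must be in [0, 1]")
    rng = random.Random(seed)
    components = []
    for _ in range(num_components):
        clone = copy.deepcopy(template)  # same structure, independent parameters
        if signed:
            clone["weights"] = perturb_weights(
                clone["weights"], noise_scale, flip_prob, rng)
        components.append(clone)
    return components


# Hypothetical template: a sum over two Bernoulli leaves with mixture weights.
template = {"leaf_probs": [0.2, 0.8], "weights": [0.5, 0.5]}
components = build_components(template, num_components=3, seed=0)
```

Because every component is a deep copy of one template, structural compatibility holds by construction, and only the weights differ between components.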

spflow.zoo.sos.build_abs_weight_proposal(component, *, eps=1e-08)

Build a monotone proposal q(x) from a (possibly signed) component.

Replaces each SignedSum with a standard Sum whose weights are proportional to abs(weights), ensuring q is non-negative and normalized at each sum node.

Parameters:
  • component (Module) – Component circuit to convert.

  • eps (float) – Small additive constant to avoid all-zero abs weights.

Return type:

Module

Returns:

A new Module that supports .sample() and .log_likelihood() and can be used as an independence proposal.
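The weight transformation at a single sum node can be sketched as follows. The function name is hypothetical, and adding eps before normalizing is an assumption about where the constant enters.

```python
def abs_weight_proposal(weights, eps=1e-8):
    """Map signed weights to non-negative weights that sum to one."""
    shifted = [abs(w) + eps for w in weights]  # eps keeps the total strictly positive
    total = sum(shifted)
    return [w / total for w in shifted]


mixture = abs_weight_proposal([0.9, -0.2])    # roughly [0.818, 0.182]
degenerate = abs_weight_proposal([0.0, 0.0])  # falls back to a uniform mixture
```

Without eps, an all-zero weight vector would lead to a division by zero. With it, the proposal degrades gracefully to a uniform mixture.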

Compatibility

spflow.zoo.sos.check_compatible_components(components)

Raise if components are not structurally compatible.

Return type:

None

spflow.zoo.sos.check_socs_compatibility(model)

Convenience wrapper for SOCS.

Return type:

None

Exact Inner Products

spflow.zoo.sos.inner_product_matrix(a, b, *, cache=None)

Return type:

Tensor

spflow.zoo.sos.leaf_inner_product(a, b)

Compute per-feature/channel inner products ∫ f_a(x) f_b(x) dx for leaves.

Return type:

Tensor

spflow.zoo.sos.log_self_inner_product_scalar(module, *, cache=None)

Return type:

Tensor

Signed Semiring Utilities

spflow.zoo.sos.signed_logsumexp(logabs_terms, sign_terms, dim, keepdim=False, eps=0.0)

Compute log|Σ_i s_i exp(a_i)| and sign of the sum in a stable way.

Parameters:
  • logabs_terms (Tensor) – Log-absolute-values a_i of the terms.

  • sign_terms (Tensor) – Signs s_i of the terms in {-1, 0, +1}. Must be broadcastable.

  • dim (int) – Dimension to reduce over.

  • keepdim (bool) – Whether to keep the reduced dimension.

  • eps (float) – Additive epsilon to avoid log(0) in edge cases.

Return type:

tuple[Tensor, Tensor]

Returns:

(logabs_sum, sign_sum)

spflow.zoo.sos.sign_of(x)

Return sign(x) in {-1, 0, +1} as an integer tensor.

Return type:

Tensor

spflow.zoo.sos.logabs_of(x, eps=0.0)

Return log(|x|), with optional epsilon to avoid log(0).

Parameters:
  • x (Tensor) – Input tensor.

  • eps (float) – If > 0, computes log(|x| + eps).

Return type:

Tensor