Sum of Compatible Squares (SOCS)¶
SPFlow includes an implementation of SOCS / $\Sigma^2_{\mathrm{cmp}}$ (“Sum of Compatible Squares Circuits”). SOCS turns a set of (possibly signed) compatible component circuits into a valid, non-negative probability model.
Reference¶
The core SOCS / Σ2cmp construction is described in the AAAI paper:
Definition¶
Let c_i(x) be real-valued component circuits that share the same structured decomposition
(“compatible” components).
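SOCS defines the non-negative function:
$$c(x) = \sum_{i} c_i(x)^2$$
and the normalized density:
$$p(x) = \frac{c(x)}{Z}, \qquad Z = \int c(x)\,dx = \sum_{i} \int c_i(x)^2\,dx$$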
In SPFlow, SOCS is implemented as the wrapper module spflow.zoo.sos.SOCS.
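The definition can be checked by hand on a single binary variable. The sketch below is a toy written in plain Python, not SPFlow code: the helper names (`bernoulli_pmf`, `signed_component`) and the second component's weights are made up for illustration. It squares two signed components, sums them, and normalizes by summing over both states of the variable.

```python
# Toy SOCS over one binary variable (plain Python, no SPFlow involved).

def bernoulli_pmf(p):
    """Return the pmf of a Bernoulli(p) leaf as a function of x in {0, 1}."""
    return lambda x: p if x == 1 else 1.0 - p

def signed_component(weights, leaves):
    """c_i(x) = sum_j w_j * leaf_j(x); the weights may be negative."""
    return lambda x: sum(w * leaf(x) for w, leaf in zip(weights, leaves))

leaves = [bernoulli_pmf(0.2), bernoulli_pmf(0.8)]

# Two components sharing the same structure ("compatible"), differing only in weights.
components = [
    signed_component([0.9, -0.2], leaves),
    signed_component([-0.3, 0.6], leaves),  # hypothetical weights for illustration
]

def c(x):
    """Non-negative function c(x) = sum_i c_i(x)^2."""
    return sum(ci(x) ** 2 for ci in components)

# For a discrete variable the integral defining Z is a finite sum.
Z = sum(c(x) for x in (0, 1))

def p(x):
    """Normalized density p(x) = c(x) / Z."""
    return c(x) / Z

print([components[1](x) for x in (0, 1)])  # one entry is negative
print([p(x) for x in (0, 1)])              # non-negative, sums to 1
```

Individual components may take negative values, but squaring makes every term of c(x) non-negative, so p is a valid distribution.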
Why “signed” components?¶
Standard SPFlow sum nodes (spflow.modules.sums.Sum) represent convex mixtures and require
strictly positive weights.
To represent signed circuits, the Paper Zoo provides spflow.zoo.sos.SignedSum, which allows
real-valued (including negative) weights.
Important: SignedSum is not a probabilistic mixture node (its output may be negative), so it
does not implement log_likelihood. Instead, SOCS evaluates signed components using a stable
signed representation internally.
Exact normalization via inner products¶
SOCS needs the normalization terms:
$$Z_i = \int c_i(x)^2\,dx, \qquad Z = \sum_i Z_i$$
Rather than building an explicit “squared circuit”, SPFlow computes these terms using an exact,
bottom-up inner-product dynamic program implemented in spflow.utils.inner_product.
The implementation supports exact inner products for common leaves (and can be extended by adding
new closed-form formulas in spflow/exp/sos/inner_product.py). Currently supported leaves include:
- Normal, Bernoulli, Categorical
- Exponential, Laplace, LogNormal
- Poisson, Gamma
- CLTree (only when both trees share the same structure)
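To illustrate why closed-form leaf inner products are enough, consider a single signed sum of Normal leaves over one variable. Expanding the square gives Z_i = Σ_{j,k} w_j w_k ⟨f_j, f_k⟩, and the inner product of two Normal densities has the well-known closed form N(m1; m2, s1² + s2²). The sketch below is plain Python with hypothetical helper names; it does not call SPFlow and covers only this one-node case, not the full bottom-up dynamic program.

```python
import math

def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normal_inner_product(m1, s1, m2, s2):
    """Closed form of the integral over x of N(x; m1, s1^2) * N(x; m2, s2^2)."""
    var = s1 * s1 + s2 * s2
    return math.exp(-0.5 * (m1 - m2) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def squared_norm(weights, params):
    """Z_i = sum_{j,k} w_j w_k <f_j, f_k> for c_i = sum_j w_j f_j."""
    return sum(
        wj * wk * normal_inner_product(mj, sj, mk, sk)
        for wj, (mj, sj) in zip(weights, params)
        for wk, (mk, sk) in zip(weights, params)
    )

weights = [0.9, -0.4]               # signed weights (illustrative values)
params = [(0.0, 1.0), (1.5, 0.7)]   # (mean, std) of each Normal leaf
z_exact = squared_norm(weights, params)

# Brute-force check: Riemann sum of c_i(x)^2 on a fine grid over [-12, 12].
def c_i(x):
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, params))

n = 24000
h = 24.0 / n
z_numeric = sum(c_i(-12.0 + k * h) ** 2 for k in range(n + 1)) * h

print(z_exact, z_numeric)
```

The exact value needs only a double loop over leaf pairs, whereas the numerical check needs a grid whose size grows exponentially with the number of variables.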
Non-scalar outputs¶
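SPFlow modules return log-likelihood tensors with shape (batch, F, C, R) where:
- F = output features
- C = output channels
- R = repetitions
SOCS supports non-scalar component outputs and normalizes per output entry. Writing o for an output index (feature, channel, repetition):
$$p_o(x) = \frac{\sum_i c_{i,o}(x)^2}{Z_o}, \qquad Z_o = \sum_i \int c_{i,o}(x)^2\,dx$$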
This makes SOCS usable as a building block inside larger architectures (e.g., class-conditional
SOCS with channels = num_classes).
Sampling (signed components)¶
Sampling from the density proportional to c_i(x)^2 is not generally tractable with the standard
top-down sampler when c_i is signed.
SPFlow implements an independence Metropolis–Hastings sampler for scalar-output SOCS
(out_shape == (1,1,1)), targeting the squared component density:
$$\pi_i(x) \propto c_i(x)^2$$
For signed components, the proposal distribution is built by replacing each SignedSum node with
a standard Sum using abs(weights) (normalized). This yields a monotone proposal circuit
q_i(x) that supports both sample() and log_likelihood().
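The MH kernel uses the standard independence-sampler acceptance rule for a proposal x' drawn from q_i and a target proportional to c_i(x)^2:
$$\alpha(x \to x') = \min\left(1, \frac{c_i(x')^2\, q_i(x)}{c_i(x)^2\, q_i(x')}\right)$$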
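A minimal plain-Python sketch of such an independence sampler is shown below for one binary variable. It is not the SPFlow implementation: the weights, the function names, and the choice to run one short chain per sample are assumptions made for illustration. The target is proportional to c(x)^2 for a signed component c, and the proposal q is the mixture obtained from the normalized absolute weights.

```python
import random

LEAF_PROBS = [0.2, 0.8]    # Bernoulli leaves over the same variable
WEIGHTS = [0.7, -0.5]      # signed component weights (illustrative)

def bern(p, x):
    return p if x == 1 else 1.0 - p

def c(x):
    """Signed component value; may be negative."""
    return sum(w * bern(p, x) for w, p in zip(WEIGHTS, LEAF_PROBS))

# Proposal: same leaves, weights replaced by normalized abs(weights).
ABS_W = [abs(w) for w in WEIGHTS]
MIX = [w / sum(ABS_W) for w in ABS_W]

def q(x):
    return sum(m * bern(p, x) for m, p in zip(MIX, LEAF_PROBS))

def sample_q(rng):
    """Top-down sampling from the monotone proposal: pick a leaf, then sample it."""
    p = rng.choices(LEAF_PROBS, weights=MIX)[0]
    return 1 if rng.random() < p else 0

def mh_sample(rng, burn_in=10, steps=50):
    """Run one independence MH chain and return its final state.

    Assumes c(x) != 0 and q(x) > 0 for every state, which holds for the values above.
    """
    x = sample_q(rng)
    for _ in range(burn_in + steps):
        x_new = sample_q(rng)
        ratio = (c(x_new) ** 2 * q(x)) / (c(x) ** 2 * q(x_new))
        if rng.random() < min(1.0, ratio):
            x = x_new
    return x

rng = random.Random(0)
samples = [mh_sample(rng) for _ in range(2000)]
empirical = sum(samples) / len(samples)
exact = c(1) ** 2 / (c(0) ** 2 + c(1) ** 2)
print(empirical, exact)
```

Because the proposal does not depend on the current state, the only quantities needed per step are c and q evaluated at the current and proposed points, both of which are cheap to compute.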
Configuration¶
Sampling behavior can be controlled via spflow.utils.cache.Cache extras:
- cache.extras["socs_mcmc_steps"]: number of MH steps after burn-in (default: 50)
- cache.extras["socs_mcmc_burn_in"]: burn-in steps (default: 10)
Limitations¶
SOCS sampling currently supports unconditional sampling only (no evidence / NaN-conditional sampling).
SOCS sampling currently supports only scalar outputs (out_shape == (1,1,1)).
Compatibility checks¶
SOCS assumes component circuits are compatible (same decomposition / region graph).
SPFlow provides conservative structural checks in spflow.utils.compatibility:
These utilities verify that corresponding nodes across components have the same “skeleton”
(node types, scopes, arities, and selected structural metadata like Cat.dim and CLTree.parents).
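The idea of a skeleton check can be sketched on a toy tree structure. The `Node` class and `same_skeleton` function below are hypothetical and written in plain Python; they are not the functions in spflow.utils.compatibility and ignore the extra metadata mentioned above. They only show the recursive comparison of node type, scope, and arity.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                 # e.g. "SignedSum", "Product", "Normal"
    scope: tuple              # indices of the variables this node covers
    children: list = field(default_factory=list)

def same_skeleton(a, b):
    """Return True if both trees match in node type, scope, and arity at every position."""
    if a.kind != b.kind or a.scope != b.scope or len(a.children) != len(b.children):
        return False
    return all(same_skeleton(x, y) for x, y in zip(a.children, b.children))

def make_tree(left_scope, right_scope):
    return Node(
        "SignedSum",
        left_scope + right_scope,
        [Node("Product", left_scope + right_scope,
              [Node("Normal", left_scope), Node("Normal", right_scope)])],
    )

t1 = make_tree((0,), (1,))
t2 = make_tree((0,), (1,))
t3 = make_tree((1,), (0,))   # same variables, but split differently below the product

print(same_skeleton(t1, t2), same_skeleton(t1, t3))
```

A check of this kind is conservative: it may reject pairs of circuits that are compatible in the formal sense but laid out differently.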
Structure builder¶
To reduce boilerplate, SPFlow includes a small builder that clones a template circuit into multiple
compatible components and optionally converts all Sum nodes to SignedSum nodes:
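The cloning step can be sketched on a toy nested-dict circuit. This is plain Python, not the SPFlow builder: the dict layout, the function name `make_signed_clones`, and the order in which noise and sign flips are applied are assumptions. The parameter names mirror those documented for build_socs in the API reference on this page.

```python
import copy
import random

def make_signed_clones(template, num_components, noise_scale=0.05, flip_prob=0.5, seed=None):
    """Deep-copy a template and turn each Sum into a SignedSum with perturbed weights."""
    rng = random.Random(seed)

    def perturb(node):
        if node["type"] == "Sum":
            node["type"] = "SignedSum"
            node["weights"] = [
                # Assumed order: flip the sign with probability flip_prob, after adding noise.
                (-1.0 if rng.random() < flip_prob else 1.0) * (w + rng.gauss(0.0, noise_scale))
                for w in node["weights"]
            ]
        for child in node.get("children", []):
            perturb(child)

    clones = []
    for _ in range(num_components):
        clone = copy.deepcopy(template)  # the template itself is never modified
        perturb(clone)
        clones.append(clone)
    return clones

template = {
    "type": "Sum",
    "weights": [0.6, 0.4],
    "children": [
        {"type": "Bernoulli", "scope": [0], "p": 0.2},
        {"type": "Bernoulli", "scope": [0], "p": 0.8},
    ],
}

clones = make_signed_clones(template, num_components=3, seed=0)
for clone in clones:
    print(clone["type"], clone["weights"])
```

Since every clone starts from the same deep copy, all components share one structure by construction and differ only in their weights.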
Minimal example¶
Build a SOCS model from one signed component and evaluate it:
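import torch

from spflow.meta.data.scope import Scope
from spflow.modules.leaves import Bernoulli
from spflow.zoo.sos import SignedSum, SOCS

# Two Bernoulli leaves over the same variable (scope 0).
b1 = Bernoulli(scope=Scope([0]), probs=torch.tensor([[[0.2]]]))
b2 = Bernoulli(scope=Scope([0]), probs=torch.tensor([[[0.8]]]))

# Signed component 0.9 * b1 - 0.2 * b2 (the second weight is negative).
comp = SignedSum(inputs=[b1, b2], weights=torch.tensor([[[[0.9]], [[-0.2]]]]))

# SOCS squares the component and normalizes the result.
model = SOCS([comp])

# Evaluate both states of the binary variable.
x = torch.tensor([[0.0], [1.0]])
ll = model.log_likelihood(x)  # shape (B, 1, 1, 1)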
API Reference¶
- class spflow.zoo.sos.SOCS(components)[source]¶
Bases: Module
Sum of Compatible Squares (SOCS) wrapper module.
Represents a non-negative density of the form:
c(x) = Σ_i c_i(x)^2
p(x) = c(x) / Z, where Z = ∫ c(x) dx = Σ_i ∫ c_i(x)^2 dx
Notes
log_likelihood() is supported for signed components built with SignedSum.
sample() is supported only when all components are standard monotone SPFlow PCs (i.e., do not contain SignedSum), using a Metropolis–Hastings independence sampler.
- log_likelihood(data, cache=None)[source]¶
Compute log likelihood P(data | module).
Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.
- Parameters:
data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Log-likelihood of shape (batch_size, out_features, out_channels).
- Return type:
Tensor
- Raises:
ValueError – If input data shape is incompatible with module scope.
- marginalize(marg_rvs, prune=True, cache=None)[source]¶
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
marg_rvs (list[int]) – Random variable indices to marginalize out.
prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
- sample(num_samples=None, data=None, is_mpe=False, cache=None)[source]¶
Generate samples from the module’s probability distribution.
Supports both random sampling and MAP inference (via is_mpe flag). Handles conditional sampling through evidence in data tensor.
- Parameters:
num_samples (int | None, optional) – Number of samples to generate. Defaults to 1.
data (Tensor | None, optional) – Pre-allocated tensor with NaN values indicating where to sample. If None, creates new tensor. Defaults to None.
is_mpe (bool, optional) – If True, returns most probable values instead of random samples. Defaults to False.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Sampled values of shape (batch_size, num_features).
- Return type:
Tensor
- Raises:
ValueError – If sampling parameters are incompatible.
- class spflow.zoo.sos.SignedSum(inputs, out_channels=1, num_repetitions=1, weights=None)[source]¶
Bases: Module
Linear-combination node that allows negative, non-normalized weights.
This node is not a probabilistic mixture node. It represents a real-valued linear combination of input channels:
y = Σ_j w_j * x_j
where weights may be negative and do not need to sum to one.
Notes
SignedSum does not implement log_likelihood() because its output may be negative (log is undefined). Use SOCS or signed evaluation utilities for inference.
sample() is only supported when all weights are non-negative and no evidence is present, in which case it behaves like an unnormalized mixture over inputs.
- log_likelihood(data, cache=None)[source]¶
Compute log likelihood P(data | module).
Computes log probability of input data under this module’s distribution. Uses log-space for numerical stability. Results should be cached for efficiency.
- Parameters:
data (Tensor) – Input data of shape (batch_size, num_features). NaN values indicate missing values to marginalize over.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Log-likelihood of shape (batch_size, out_features, out_channels).
- Return type:
Tensor
- Raises:
ValueError – If input data shape is incompatible with module scope.
- marginalize(marg_rvs, prune=True, cache=None)[source]¶
Structurally marginalize out specified random variables from the module.
Computes a new module representing the marginal distribution by integrating out the specified variables from the structure. For data-level marginalization, use NaNs in log_likelihood inputs.
- Parameters:
marg_rvs (list[int]) – Random variable indices to marginalize out.
prune (bool, optional) – Whether to prune unnecessary modules during marginalization. Defaults to True.
cache (Cache | None, optional) – Cache for intermediate computations. Defaults to None.
- Returns:
Marginalized module, or None if all variables are marginalized out.
- Return type:
Module | None
- Raises:
ValueError – If marginalization variables are not in the module’s scope.
Builders¶
- spflow.zoo.sos.build_socs(template, *, num_components, signed=True, noise_scale=0.05, flip_prob=0.5, seed=None)[source]¶
Build a SOCS model from a compatible component template.
- Parameters:
template (Module) – A SPFlow module representing a (typically scalar-output) circuit. This circuit is deep-copied num_components times to ensure all components share the same structure.
num_components (int) – Number of components r.
signed (bool) – If True, convert all Sum nodes in each clone to SignedSum nodes with perturbed weights (allowing negative weights).
noise_scale (float) – Standard deviation of additive Gaussian noise applied to copied weights when signed=True.
flip_prob (float) – Probability of flipping the sign of each weight entry when signed=True. Must be in [0, 1].
seed (int | None) – Optional random seed used for weight perturbations.
- Return type:
SOCS
- Returns:
A SOCS module with num_components compatible components.
- spflow.zoo.sos.build_abs_weight_proposal(component, *, eps=1e-08)[source]¶
Build a monotone proposal q(x) from a (possibly signed) component.
Replaces each SignedSum with a standard Sum whose weights are proportional to abs(weights), ensuring q is non-negative and normalized at each sum node.
- Parameters:
component (Module) – Component circuit to convert.
eps (float) – Small additive constant to avoid all-zero abs weights.
- Return type:
Module
- Returns:
A new Module that supports .sample() and .log_likelihood() and can be used as an independence proposal.
Compatibility¶
Exact Inner Products¶
Signed Semiring Utilities¶
- spflow.zoo.sos.signed_logsumexp(logabs_terms, sign_terms, dim, keepdim=False, eps=0.0)[source]¶
Compute log|Σ_i s_i exp(a_i)| and sign of the sum in a stable way.
- Parameters:
logabs_terms (Tensor) – Log-absolute-values a_i of the terms.
sign_terms (Tensor) – Signs s_i of the terms in {-1, 0, +1}. Must be broadcastable.
dim (int) – Dimension to reduce over.
keepdim (bool) – Whether to keep the reduced dimension.
eps (float) – Additive epsilon to avoid log(0) in edge cases.
- Return type:
- Returns:
(logabs_sum, sign_sum)
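The stabilization idea is the usual max-shift trick applied to signed terms. The sketch below is a plain-Python analogue for flat lists, under a hypothetical name (`signed_logsumexp_1d`); it omits the dim, keepdim, and eps arguments and is not the SPFlow function itself.

```python
import math

def signed_logsumexp_1d(logabs_terms, sign_terms):
    """Return (log|S|, sign(S)) for S = sum_i s_i * exp(a_i), using a max shift."""
    active = [a for a, s in zip(logabs_terms, sign_terms) if s != 0 and a != -math.inf]
    if not active:
        return -math.inf, 0            # every term is zero
    m = max(active)                    # shift so the largest exponent becomes 0
    total = sum(s * math.exp(a - m) for a, s in zip(logabs_terms, sign_terms) if s != 0)
    if total == 0.0:
        return -math.inf, 0            # exact cancellation
    return m + math.log(abs(total)), (1 if total > 0 else -1)

# 0.18 - 0.16 = 0.02, represented in (log-abs, sign) form.
logabs, sign = signed_logsumexp_1d([math.log(0.18), math.log(0.16)], [1, -1])
print(math.exp(logabs) * sign)

# exp(1000) overflows a float, but the shifted computation stays finite.
big_logabs, big_sign = signed_logsumexp_1d([1000.0, 999.0], [1, -1])
print(big_logabs, big_sign)
```

Carrying a (log-magnitude, sign) pair through the circuit is what allows negative intermediate values to be evaluated without leaving log-space.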