Measures¶
Information Theory¶
- spflow.measures.information_theory.entropy(model, scope, *, method='mc', num_samples=10000, seed=None, channel_agg='logmeanexp', repetition_agg='logmeanexp')[source]¶
Estimate the entropy H(X) (in nats) for a subset of variables.
The returned value is in nats (natural logarithm base), consistent with SPFlow log-likelihood conventions.
- Parameters:
  - model (Module) – SPFlow probabilistic circuit.
  - scope (Scope | int | Iterable[int]) – Variables X to compute entropy for.
  - method (str) – “mc” (Monte Carlo) or “exact” (enumeration for tiny discrete domains).
  - num_samples (int) – Number of samples for Monte Carlo estimation.
  - seed (int | None) – Optional seed for best-effort deterministic sampling.
  - channel_agg (str) – How to aggregate multiple channels (“logmeanexp”, “logsumexp”, “first”).
  - repetition_agg (str) – How to aggregate multiple repetitions (“logmeanexp”, “logsumexp”, “first”).
- Return type:
  Tensor
- Returns:
  Scalar tensor containing H(X) in nats.
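The Monte Carlo estimator behind `method='mc'` averages negative log-likelihoods of model samples, since H(X) = -E[log p(X)]. A minimal standalone sketch of that estimator, using a hand-rolled Bernoulli in place of an SPFlow circuit (`mc_entropy`, `log_prob`, and `sampler` are illustrative names, not SPFlow API):

```python
import math
import random

def mc_entropy(log_prob, sampler, num_samples=10000, seed=None):
    # H(X) = -E[log p(X)], estimated by averaging -log p(x) over model samples.
    rng = random.Random(seed)  # seeding gives best-effort determinism
    total = 0.0
    for _ in range(num_samples):
        x = sampler(rng)
        total += -log_prob(x)
    return total / num_samples

# Illustrative Bernoulli(0.3); exact entropy in nats for comparison.
p = 0.3
exact = -(p * math.log(p) + (1 - p) * math.log(1 - p))
est = mc_entropy(
    log_prob=lambda x: math.log(p if x == 1 else 1 - p),
    sampler=lambda rng: 1 if rng.random() < p else 0,
    num_samples=10000,
    seed=0,
)
```

With 10000 samples the estimate lands within a few thousandths of a nat of the exact value; the real function does the same averaging over circuit samples restricted to `scope`.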
Weight of Evidence¶
- spflow.measures.weight_of_evidence.conditional_probability(model, *, y_index, y_value, evidence, channel_agg='logmeanexp', repetition_agg='logmeanexp')[source]¶
Compute p(y=y_value | evidence) for a discrete target variable.
- This follows the legacy SPFlow definition:
p(y|x) = p(x,y) / p(x)
- Parameters:
  - model (Module) – SPFlow probabilistic circuit.
  - y_index (int) – Column index of the target variable Y in the data.
  - y_value – Value of Y whose conditional probability is computed.
  - evidence (Tensor) – Evidence tensor of shape (batch, D) with NaNs for missing values.
  - channel_agg (str) – How to aggregate multiple channels (“logmeanexp”, “logsumexp”, “first”).
  - repetition_agg (str) – How to aggregate multiple repetitions (“logmeanexp”, “logsumexp”, “first”).
- Return type:
  Tensor
- Returns:
  Tensor of shape (batch,) with conditional probabilities in [0, 1].
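The definition p(y|x) = p(x,y) / p(x) is evaluated in log space (exp of a log-likelihood difference), which is numerically safer than dividing raw probabilities. A standalone sketch over a toy joint table, not the SPFlow implementation (the names `joint`, `log_p`, and `conditional_probability` here are illustrative):

```python
import math

# Toy joint distribution over (x, y), both binary, to illustrate
# p(y | x) = p(x, y) / p(x) computed as exp(log p(x, y) - log p(x)).
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def log_p(x=None, y=None):
    # Marginalize over any variable passed as None (SPFlow marks
    # missing evidence with NaN; None plays that role here).
    total = sum(p for (xv, yv), p in joint.items()
                if (x is None or xv == x) and (y is None or yv == y))
    return math.log(total)

def conditional_probability(y_value, x_value):
    # Subtract log-likelihoods, then exponentiate back to a probability.
    return math.exp(log_p(x=x_value, y=y_value) - log_p(x=x_value))

prob = conditional_probability(y_value=1, x_value=0)  # 0.3 / 0.4 = 0.75
```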
- spflow.measures.weight_of_evidence.weight_of_evidence(model, *, y_index, y_value, evidence_full, evidence_reduced, n, k=None, eps=1e-06, channel_agg='logmeanexp', repetition_agg='logmeanexp')[source]¶
Compute the weight of evidence (WoE) between two evidence settings (in nats).
- This compares evidence_full against evidence_reduced using a log-odds difference:
WoE = logit(L(p(y|e_full))) - logit(L(p(y|e_reduced)))
- where L(.) is a Laplace correction:
L(p) = (p*n + 1) / (n + k)
- Parameters:
  - model (Module) – SPFlow probabilistic circuit.
  - y_index (int) – Column index of Y.
  - y_value – Value of Y whose evidence weight is computed.
  - evidence_full (Tensor) – Evidence tensor (batch, D).
  - evidence_reduced (Tensor) – Evidence tensor (batch, D).
  - n (int) – Number of training instances used for Laplace correction.
  - k (int | None) – Cardinality of Y (if None, inferred for Bernoulli/Categorical).
  - eps (float) – Clamp used to keep probabilities away from 0/1 before logit.
  - channel_agg (str) – How to aggregate multiple channels (“logmeanexp”, “logsumexp”, “first”).
  - repetition_agg (str) – How to aggregate multiple repetitions (“logmeanexp”, “logsumexp”, “first”).
- Return type:
  Tensor
- Returns:
  Tensor of shape (batch,) with WoE values in nats.
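The Laplace correction L(p) = (p·n + 1) / (n + k) smooths both conditionals toward uniform with strength set by the training-set size n, and the eps clamp keeps the logit finite. A standalone numeric sketch of the formula, not the SPFlow implementation (`laplace`, `logit`, and `weight_of_evidence` are illustrative names):

```python
import math

def laplace(p, n, k):
    # L(p) = (p*n + 1) / (n + k): add-one smoothing over k classes,
    # weighted by the n training instances behind the estimate.
    return (p * n + 1) / (n + k)

def logit(p):
    return math.log(p / (1 - p))

def weight_of_evidence(p_full, p_reduced, n, k, eps=1e-6):
    # Clamp after correction so probabilities stay strictly inside (0, 1).
    pf = min(max(laplace(p_full, n, k), eps), 1 - eps)
    pr = min(max(laplace(p_reduced, n, k), eps), 1 - eps)
    return logit(pf) - logit(pr)  # log-odds difference, in nats

# Example: full evidence raises p(y|e) from 0.6 to 0.9; n=100, k=2 classes.
woe = weight_of_evidence(0.9, 0.6, n=100, k=2)  # positive: evidence favors y
```

A positive value means the extra evidence in `evidence_full` shifts the log-odds toward `y_value`; identical inputs give exactly zero.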
- spflow.measures.weight_of_evidence.weight_of_evidence_leave_one_out(model, *, y_index, y_value, x_instance, n, k=None, eps=1e-06, channel_agg='logmeanexp', repetition_agg='logmeanexp')[source]¶
Compute per-feature leave-one-out WoE attributions (legacy-style, in nats).
- For each non-NaN entry X_i in x_instance (excluding y_index), this computes:
  WoE_i = logit(L(p(y|x))) - logit(L(p(y|x_{-i})))
  where x_{-i} denotes x with feature i removed (set to NaN).
- Parameters:
  - model (Module) – SPFlow probabilistic circuit.
  - y_index (int) – Column index of Y.
  - y_value – Value of Y whose evidence weight is computed.
  - x_instance (Tensor) – Evidence tensor of shape (batch, D). NaNs indicate missing values.
  - n (int) – Number of training instances used for Laplace correction.
  - k (int | None) – Cardinality of Y (if None, inferred for Bernoulli/Categorical).
  - eps (float) – Clamp used to keep probabilities away from 0/1 before logit.
  - channel_agg (str) – How to aggregate multiple channels (“logmeanexp”, “logsumexp”, “first”).
  - repetition_agg (str) – How to aggregate multiple repetitions (“logmeanexp”, “logsumexp”, “first”).
- Return type:
  Tensor
- Returns:
  Tensor of shape (batch, D) with WoE scores per feature and NaNs elsewhere.
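The leave-one-out loop can be sketched in a few lines: drop one observed feature at a time and measure how the Laplace-corrected log-odds of y change. This is a standalone illustration, not the SPFlow implementation; `loo_woe` and `toy_cond_prob` are hypothetical names, the target column is assumed to live outside `x`, and `None` stands in for NaN:

```python
import math

def loo_woe(x, cond_prob, n, k, eps=1e-6):
    # For each observed feature i, compare the corrected log-odds of
    # p(y | x) against p(y | x_{-i}), i.e. x with feature i removed.
    def corrected_logit(p):
        p = min(max((p * n + 1) / (n + k), eps), 1 - eps)
        return math.log(p / (1 - p))

    full = corrected_logit(cond_prob(x))
    scores = []
    for i, v in enumerate(x):
        if v is None:                    # missing feature: no attribution
            scores.append(float("nan"))
            continue
        reduced = list(x)
        reduced[i] = None                # drop feature i only
        scores.append(full - corrected_logit(cond_prob(reduced)))
    return scores

def toy_cond_prob(x):
    # Hypothetical stand-in for the model: confidence in y grows
    # with the number of observed features.
    return 0.5 + 0.1 * sum(v is not None for v in x)

scores = loo_woe([1, None, 1], toy_cond_prob, n=100, k=2)
```

Positive entries mark features whose presence pushes the log-odds toward `y_value`; missing entries come back as NaN, matching the documented return shape.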