Hidden Chow-Liu Trees (HCLT)

Hidden Chow-Liu Trees (HCLTs) are latent-variable models where the structure is derived from a Chow-Liu tree over the observed variables, and hidden states are modeled via the channel dimension.

Reference

HCLTs and their learning algorithms are discussed in:

Overview

HCLTs are a compromise between the simplicity of tree-structured models and the expressiveness of deep circuits. By attaching a latent variable to each node of a Chow-Liu tree, they can capture dependencies that a tree over the observed variables alone cannot, while remaining efficient to learn and evaluate.
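
For orientation, the distribution such a model encodes can be written out explicitly. This is a sketch assuming the usual HCLT construction, in which every observed variable X_i has its own latent variable Z_i and the latent variables inherit the edges of the Chow-Liu tree. The root index r and the parent map pa(i) are notation introduced here for illustration, not SPFlow identifiers.

```latex
p(x_1, \dots, x_F)
  = \sum_{z_1, \dots, z_F}
    p(z_r)
    \prod_{i \neq r} p\bigl(z_i \mid z_{\mathrm{pa}(i)}\bigr)
    \prod_{i=1}^{F} p(x_i \mid z_i)
```

Because the latent variables form a tree, the sum over all latent configurations can be computed by a single bottom-up pass rather than by enumeration.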

Key features:

  • Structure learning: Uses the Chow-Liu algorithm to find the maximum-likelihood tree over the observed variables, i.e. a maximum spanning tree under pairwise mutual information.

  • Top-k Mixtures: Supports building mixtures over multiple high-scoring trees for increased robustness.

  • Latent states: Each observed variable is associated with a hidden category that mediates its dependencies.
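
The structure-learning step listed above can be sketched in a few lines. This is a minimal pure-Python illustration for binary data, not SPFlow's implementation: the function names pairwise_mi and chow_liu_edges are hypothetical, and spreading the pseudocount evenly over the four joint cells is an assumption that may differ from the ChowLiuTrees.jl semantics SPFlow follows.

```python
import math
from itertools import combinations


def pairwise_mi(data, i, j, pseudocount=1.0):
    """Estimate the mutual information (in nats) between binary columns i and j."""
    # Smoothed joint counts over the four (x_i, x_j) configurations.
    # Assumption: the pseudocount is split evenly across the cells.
    counts = [[pseudocount / 4.0, pseudocount / 4.0] for _ in range(2)]
    for row in data:
        counts[row[i]][row[j]] += 1.0
    total = len(data) + pseudocount

    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = counts[a][b] / total
            p_a = (counts[a][0] + counts[a][1]) / total
            p_b = (counts[0][b] + counts[1][b]) / total
            if p_ab > 0.0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_edges(data, pseudocount=1.0):
    """Return the edges of a maximum spanning tree over pairwise MI (Prim's algorithm)."""
    num_features = len(data[0])
    mi = {}
    for i, j in combinations(range(num_features), 2):
        mi[(i, j)] = mi[(j, i)] = pairwise_mi(data, i, j, pseudocount)

    in_tree = {0}  # grow the tree starting from variable 0
    edges = []
    while len(in_tree) < num_features:
        # Pick the highest-MI edge that connects the tree to a new variable.
        best = max(
            ((u, v) for u in in_tree for v in range(num_features) if v not in in_tree),
            key=lambda edge: mi[edge],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Toy data: columns 0 and 1 are identical, column 2 is independent of both.
rows = [
    [0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1],
    [0, 0, 0], [1, 1, 1], [0, 0, 1], [1, 1, 0],
]
print(chow_liu_edges(rows))
```

In the toy data the edge between variables 0 and 1 carries the largest mutual information, so it is the first edge added to the tree. An HCLT would then place one latent variable on each node of the resulting tree.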

Implementation

SPFlow provides automated learners for binary and categorical HCLTs.

Binary HCLT

spflow.zoo.hclt.learn_hclt_binary(data, *, num_hidden_cats, num_trees=1, dropout_prob=0.0, weights=None, pseudocount=1.0, init='uniform', device=None, dtype=None)

Learn an HCLT circuit from binary data.

Parameters:
  • data (Tensor) – (N, F) tensor with values in {0,1} (or bool). Must be complete (no NaNs).

  • num_hidden_cats (int) – Hidden categories per observed variable.

  • num_trees (int) – If >1, builds a mixture of HCLTs over the top-k Chow-Liu trees.

  • dropout_prob (float) – Edge dropout probability for top-k enumeration.

  • weights (Tensor | None) – Optional per-sample weights.

  • pseudocount (float) – Pseudocount used when estimating pairwise mutual information (MI); follows ChowLiuTrees.jl semantics.

  • init (str) – 'uniform' or 'random' ('random' uses module defaults).

  • device/dtype – Optional device and dtype overrides for the created modules.

Return type:

Module

Categorical HCLT

spflow.zoo.hclt.learn_hclt_categorical(data, *, num_hidden_cats, num_cats=None, num_trees=1, dropout_prob=0.0, weights=None, pseudocount=1.0, init='uniform', device=None, dtype=None, chunk_size_pairs=4096)

Learn an HCLT circuit from categorical data.

The structure is learned via a Chow-Liu tree on the observed variables. Emissions are Categorical(X_i | Z_i), where each latent variable Z_i takes one of num_hidden_cats states.
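
To make the role of the emissions concrete, the following sketch evaluates the likelihood of one sample under a small hand-specified latent tree model. It is a plain-Python illustration rather than SPFlow code: the function hclt_likelihood, the parent-array encoding of the tree, and the root prior and transition tables p(Z_i | Z_parent) are all assumptions introduced here, and the parameter values are made up.

```python
from itertools import product


def hclt_likelihood(x, parent, prior, transitions, emissions):
    """Compute p(x) for a tree-structured latent model with an upward pass.

    parent[i]      : index of node i's parent, or -1 for the root.
    prior[h]       : p(Z_root = h).
    transitions[i] : transitions[i][g][h] = p(Z_i = h | Z_parent = g); None for the root.
    emissions[i]   : emissions[i][h][v] = p(X_i = v | Z_i = h).
    """
    num_hidden = len(prior)
    children = [[] for _ in parent]
    root = None
    for i, p in enumerate(parent):
        if p < 0:
            root = i
        else:
            children[p].append(i)

    def message(i):
        # msg[h] = p(observed values in the subtree of node i | Z_i = h)
        msg = [emissions[i][h][x[i]] for h in range(num_hidden)]
        for c in children[i]:
            child_msg = message(c)
            for g in range(num_hidden):
                # Sum out the child's latent state given this node's state g.
                msg[g] *= sum(
                    transitions[c][g][h] * child_msg[h] for h in range(num_hidden)
                )
        return msg

    root_msg = message(root)
    return sum(prior[h] * root_msg[h] for h in range(num_hidden))


# Three observed variables with 3 categories each, 2 hidden states per variable.
parent = [-1, 0, 0]  # variable 0 is the root; 1 and 2 are its children
prior = [0.6, 0.4]
transitions = [
    None,
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
]
emissions = [
    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],
    [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]],
    [[0.3, 0.3, 0.4], [0.8, 0.1, 0.1]],
]

print(hclt_likelihood((0, 1, 2), parent, prior, transitions, emissions))

# Sanity check: the likelihoods of all 27 possible samples should sum to 1.
total = sum(
    hclt_likelihood(x, parent, prior, transitions, emissions)
    for x in product(range(3), repeat=3)
)
print(total)
```

The recursion visits each tree edge once and does work quadratic in the number of hidden states per edge, which is why evaluation stays cheap even though the latent variables are summed out exactly.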

Return type:

Module