tbnpy.cpt
=========

Overview
--------

The :class:`~tbnpy.cpt.Cpt` class stores a **conditional probability tensor** (CPT) in an *event-table* form:

- ``C``: an event matrix whose rows enumerate compatible assignments of ``childs + parents`` (stored as **composite-state indices** for each variable).
- ``p``: probabilities for each row/event in ``C``.

It also supports sampling and log-probability evaluation, including cases where parent assignments are provided as samples (optionally aligned with multiple evidence rows).

Quick start
-----------

.. code-block:: python

   import torch
   from tbnpy.variable import Variable
   from tbnpy.cpt import Cpt

   # Define variables (discrete example)
   X = Variable("X", ["0", "1"])
   Y = Variable("Y", ["0", "1"])

   # Event table for P(X | Y) with two events per value of Y (toy example)
   # Columns are [childs | parents] = [X | Y]
   C = [
       [0, 0],  # X=0, Y=0
       [1, 0],  # X=1, Y=0
       [0, 1],  # X=0, Y=1
       [1, 1],  # X=1, Y=1
   ]
   p = [0.8, 0.2, 0.1, 0.9]  # probabilities for each row in C

   cpt = Cpt(childs=[X], parents=[Y], C=C, p=p)

   # Sample X given parent samples Y
   Cs_par = torch.tensor([[0], [1], [1]])  # three parent samples: Y=0,1,1
   Cs_child, logp = cpt.sample(Cs_pars=Cs_par)

   # Evaluate log probability of full samples [X,Y]
   Cs_full = torch.tensor([[0, 0], [1, 0], [1, 1]])
   logp_full = cpt.log_prob(Cs_full)

Public API
----------

Cpt
~~~

.. py:class:: Cpt(childs: list, parents: list = [], C = [], p = [], Cs = [], ps = [], evidence = [], device: str = "cpu")

   Defines a conditional probability tensor (CPT) using tensor operations (PyTorch).

   Parameters
   ----------
   childs
      List of :class:`~tbnpy.variable.Variable` instances treated as the **child** variables.
   parents
      List of :class:`~tbnpy.variable.Variable` instances treated as the **parent** variables.
   C
      Event matrix (rows = events; columns = ``childs + parents``) storing **composite-state indices**.
      Accepted input types:

      - ``list`` (nested)
      - ``numpy.ndarray``
      - ``torch.Tensor``

      Stored as ``torch.int64``.
   p
      Probability vector aligned with rows of ``C`` (same number of rows).
      Accepted input types:

      - ``list``
      - ``numpy.ndarray``
      - ``torch.Tensor``

      Stored as ``torch.float32``. If provided as 1D, it is reshaped to a column ``(n_events, 1)``.
   Cs
      Stored samples of assignments (optional). See :meth:`~tbnpy.cpt.Cpt.sample` and :meth:`~tbnpy.cpt.Cpt.sample_evidence`.
   ps
      Stored sampling probabilities/log-probabilities (optional).
   evidence
      Observations of the child variables (optional). See :attr:`~tbnpy.cpt.Cpt.evidence`.
   device
      Torch device specifier (e.g. ``"cpu"``, ``"cuda"``, or ``torch.device(...)``).

   Attributes
   ----------
   childs : list[Variable]
      Child variables.
   parents : list[Variable]
      Parent variables.
   C : torch.Tensor
      Event matrix of shape ``(n_events, n_childs + n_parents)``.
   p : torch.Tensor
      Probability vector of shape ``(n_events, 1)`` (or sometimes treated as ``(n_events,)`` internally).
   Cs : torch.Tensor
      Sample matrix (shape depends on the sampling mode).
   ps : torch.Tensor
      Sampling probabilities (stored as log probabilities in some methods).
   evidence : torch.Tensor
      Evidence matrix for child variables, shape ``(n_evidence, n_childs)``.

   Notes
   -----
   - For consistency, ``C`` and ``Cs`` store **composite-state indices** (not raw basic-state indices).
   - Some internal methods convert composite-state indices to **binary** representations for fast compatibility checks.
     Those helpers are documented at a high level below.

Core data structures
^^^^^^^^^^^^^^^^^^^^

.. py:attribute:: Cpt.C
   :type: torch.Tensor

   Event matrix ``(n_events, n_childs + n_parents)`` of composite-state indices.

.. py:attribute:: Cpt.p
   :type: torch.Tensor

   Event probabilities aligned with ``C``. Shape is typically ``(n_events, 1)``.

Evidence
^^^^^^^^

.. py:attribute:: Cpt.evidence
   :type: torch.Tensor

   Evidence for the child variables.

   Input / storage rules
   ---------------------
   - If ``None`` or ``[]``: stored as an empty tensor of shape ``(0, n_childs)``.
   - If there is **one** child variable: accepts evidence of shape ``(N,)`` or ``(N, 1)``.
   - If there are **multiple** child variables: evidence must be of shape ``(N, n_childs)``.

Sampling
^^^^^^^^

.. py:method:: Cpt.sample(n_sample: int | None = None, Cs_pars: torch.Tensor | None = None, batch_size: int = 100_000)

   Sample child assignments from this CPT.

   Parameters
   ----------
   n_sample
      Number of samples to generate **when there are no parents**.
   Cs_pars
      Parent samples (composite states) of shape ``(n_samples, n_parents)`` **when parents exist**.
   batch_size
      Batch size used to avoid materialising very large intermediate tensors.

   Returns
   -------
   (Cs, ps)
      ``Cs`` is a tensor of sampled child composite states:

      - No parents: ``Cs`` has shape ``(n_sample, n_childs)``
      - With parents: ``Cs`` has shape ``(n_samples, n_childs)`` and aligns with rows of ``Cs_pars``

      ``ps`` stores the **log** probability of the selected event for each sample.

   Notes
   -----
   When parents exist, sampling is performed by:

   1. Converting event definitions and parent samples to a binary form;
   2. Zeroing out events incompatible with each parent sample; and
   3. Sampling an event index from the resulting conditional distribution.

.. py:method:: Cpt.sample_evidence(Cs_pars: torch.Tensor, batch_size: int = 100_000)

   Sample child assignments when parent samples are **aligned with multiple evidence rows**.

   Parameters
   ----------
   Cs_pars
      Parent samples arranged per evidence row, shape ``(n_evi, n_samples, n_parents)``.
   batch_size
      Batch size for internal computations.

   Returns
   -------
   (Cs_out, ps_out)
      ``Cs_out`` has shape ``(n_evi, n_samples, n_childs + n_parents)``, storing ``[childs | parents]`` for each evidence row and parent sample.

      ``ps_out`` has shape ``(n_evi, n_samples)`` and contains log probabilities.

   Notes
   -----
   The current module defines ``sample_evidence`` twice; the *second* definition (the vectorised version) is the one that is effective at runtime.

Log-probability evaluation
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. py:method:: Cpt.log_prob(Cs: torch.Tensor, batch_size: int = 100_000) -> torch.Tensor

   Compute ``log P(Cs)`` for each row of ``Cs``, where ``Cs`` stores assignments in the order ``[childs | parents]``.

   Parameters
   ----------
   Cs
      Composite-state assignments of shape ``(n_samples, n_childs + n_parents)``.
   batch_size
      Batch size for internal compatibility checks.

   Returns
   -------
   torch.Tensor
      Log probabilities of shape ``(n_samples,)``.

   Notes
   -----
   This method treats **all variables uniformly** when checking compatibility against the event table.

.. py:method:: Cpt.log_prob_evidence(Cs_par: torch.Tensor, batch_size: int = 100_000) -> torch.Tensor

   Compute ``log P(evidence | parents(samples))`` for each parent sample.

   Parameters
   ----------
   Cs_par
      Parent samples (composite states). Accepted shapes:

      - ``(N_samples, n_parents)``
      - ``(n_evi, N_samples, n_parents)`` (if evidence exists for parents too)
   batch_size
      Batch size for internal computations.

   Returns
   -------
   torch.Tensor
      Log probabilities of shape ``(N_samples,)``.

Advanced / internal helpers
---------------------------

The following methods are primarily implementation details used to speed up sampling and inference.
They are described here only at a high level.

.. py:method:: Cpt._get_C_binary()

   Convert the event matrix ``C`` into a padded binary tensor of shape ``(n_events, n_vars, max_basic)`` for vectorised compatibility checks.

.. py:method:: Cpt.expand_and_check_compatibility(C_binary, samples)

   Parent-aware compatibility filtering used by :meth:`~tbnpy.cpt.Cpt.sample`.

.. py:method:: Cpt.expand_and_check_compatibility_all(C_binary, samples_binary)

   Compatibility filtering that treats all variables uniformly, used by :meth:`~tbnpy.cpt.Cpt.log_prob`.

Utility functions
-----------------

.. py:function:: get_names(var_list)

   Return the list of ``.name`` for variables in ``var_list``.

   Parameters
   ----------
   var_list
      List of :class:`~tbnpy.variable.Variable`.

   Returns
   -------
   list[str]
      Names of the variables.
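
To make the event-table semantics behind :meth:`~tbnpy.cpt.Cpt.sample` and :meth:`~tbnpy.cpt.Cpt.log_prob` more concrete, the following is a minimal pure-Python sketch of the three sampling steps (find compatible events, discard the rest, draw an event). It is not the ``tbnpy`` implementation: it uses plain lists instead of tensors, treats every composite-state index as a single basic state, and renormalises over the compatible rows, all of which are simplifying assumptions. The names ``compatible_rows``, ``sample_child``, ``log_prob`` and ``N_CHILDS`` are hypothetical and exist only for this illustration.

```python
import math
import random

# Toy event table for P(X | Y); columns are [childs | parents] = [X | Y],
# mirroring the quick-start example. Values are illustrative only.
C = [[0, 0], [1, 0], [0, 1], [1, 1]]
p = [0.8, 0.2, 0.1, 0.9]
N_CHILDS = 1  # number of leading child columns (hypothetical constant)


def compatible_rows(parent_state):
    """Return indices of events whose parent columns equal parent_state."""
    target = list(parent_state)
    return [i for i, row in enumerate(C) if row[N_CHILDS:] == target]


def sample_child(parent_state, rng):
    """Draw one child assignment given parents; return (child, log prob)."""
    rows = compatible_rows(parent_state)      # keep compatible events only
    weights = [p[i] for i in rows]            # incompatible events get no mass
    total = sum(weights)
    i = rng.choices(rows, weights=weights, k=1)[0]  # draw an event index
    return C[i][:N_CHILDS], math.log(p[i] / total)


def log_prob(full_state):
    """Return log P(childs | parents) for one [childs | parents] row."""
    child = list(full_state[:N_CHILDS])
    rows = compatible_rows(full_state[N_CHILDS:])
    total = sum(p[i] for i in rows)
    match = sum(p[i] for i in rows if C[i][:N_CHILDS] == child)
    return math.log(match / total) if match > 0 else float("-inf")


rng = random.Random(0)
child, logp = sample_child([1], rng)   # sample X given Y=1
print(child, logp)
print(log_prob([0, 0]))                # equals log(0.8) for this table
```

The real class performs the same kind of compatibility filtering on batched binary tensors, so this sketch is useful for checking expectations on small tables rather than as a drop-in replacement.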