Guarantees And Scope¶
This page separates implemented guarantees from research demonstrations and unsupported extensions.
The public API, meaning the names exported from the top-level causalrl package, is considered
stable under semantic versioning. The current release is v0.99.0, deliberately one step short
of a 1.0 tag while the API settles in real use.
Stable Contracts¶
Graph And Intervention Sets¶
CausalGraph represents an acyclic directed mixed graph (ADMG) for analytical operations. pomis and
minimal_intervention_sets implement the single-reward structural-causal-bandit slice and
support an explicit manipulable subset through latent projection.
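from causalrl import CausalGraph, pomis

# X -> Z -> Y, with a latent confounder between X and Y.
graph = CausalGraph(
    directed_edges=[("X", "Z"), ("Z", "Y")],
    bidirected_edges=[("X", "Y")],
)
# Only X is manipulable, so the candidates are "do nothing" and do(X).
assert set(pomis(graph, "Y", manipulable={"X"})) == {frozenset(), frozenset({"X"})}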
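The latent projection mentioned above can be sketched in plain Python. This is an illustration of the standard one-node-at-a-time projection on an ADMG, not the library's implementation; the function name project_out and the edge-set representation are assumptions made for the example.

```python
from itertools import combinations


def project_out(directed, bidirected, hidden):
    """Latent-project an ADMG onto the variables outside `hidden`.

    directed: set of (parent, child) pairs.
    bidirected: set of two-element frozensets.
    Nodes are removed one at a time; each removal rewires the paths that
    passed through the removed node as a non-collider.
    """
    directed = set(directed)
    bidirected = {frozenset(e) for e in bidirected}
    for v in hidden:
        parents = {a for a, b in directed if b == v}
        children = {b for a, b in directed if a == v}
        siblings = {next(iter(e - {v})) for e in bidirected if v in e}
        # p -> v -> c becomes p -> c
        directed |= {(p, c) for p in parents for c in children}
        # c1 <- v -> c2 becomes c1 <-> c2
        bidirected |= {frozenset(pair) for pair in combinations(children, 2)}
        # s <-> v -> c becomes s <-> c
        bidirected |= {frozenset((s, c)) for s in siblings for c in children if s != c}
        # Drop every edge that still touches v.
        directed = {(a, b) for a, b in directed if v not in (a, b)}
        bidirected = {e for e in bidirected if v not in e}
    return directed, bidirected


# Same graph as the example above, with the non-manipulable Z projected out.
projected_directed, projected_bidirected = project_out(
    {("X", "Z"), ("Z", "Y")}, {frozenset({"X", "Y"})}, hidden=["Z"]
)
```

On this graph the projection leaves X -> Y together with X <-> Y, which is the bow-arc structure over which the two candidate sets in the assertion above are computed.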
Executable SCMs¶
StructuralCausalModel executes explicit-latent DAGs only. An ADMG with a bidirected edge is
useful for graph analysis, but cannot be sampled as an SCM unless the shared latent causes are
represented as nodes. Constructors validate graph/mechanism/exogenous-distribution alignment.
Sampling uses private Torch state and does not overwrite an experiment's global RNG state.
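As a rough illustration of both points, the sketch below samples a toy explicit-latent model in which the bidirected edge X <-> Y has been replaced by a latent node U with U -> X and U -> Y. It uses the standard library's random.Random in place of a Torch generator, and the function sample_scm and its mechanisms are hypothetical, chosen only for the example.

```python
import random


def sample_scm(n, seed):
    """Draw n samples from a toy SCM with U -> X, U -> Y and X -> Y."""
    rng = random.Random(seed)  # private generator; the global `random` state is untouched
    samples = []
    for _ in range(n):
        u = int(rng.random() < 0.5)                      # latent common cause, an ordinary node here
        x = u if rng.random() < 0.9 else 1 - u           # X copies U with 10% noise
        y = int(rng.random() < 0.2 + 0.3 * x + 0.4 * u)  # Y depends on both X and U
        samples.append({"U": u, "X": x, "Y": y})
    return samples


state_before = random.getstate()
first = sample_scm(1000, seed=0)
second = sample_scm(1000, seed=0)
```

Because the generator is local to the call, the same seed reproduces the same samples and the caller's global RNG state is left as it was.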
Environment Interoperability¶
Public environments satisfy Gymnasium's environment checker. reset(seed=...) follows
Gymnasium seeding behavior, and rollout utilities end episodes on either terminated or
truncated.
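The rollout convention can be sketched as follows. CountdownEnv is a hypothetical stand-in that only mimics the Gymnasium reset/step signatures, and rollout is an illustrative helper, not the library's rollout utility.

```python
class CountdownEnv:
    """Hypothetical stand-in with Gymnasium-style reset/step signatures."""

    def __init__(self, goal=3, time_limit=5):
        self.goal, self.time_limit = goal, time_limit

    def reset(self, seed=None, options=None):
        self.position, self.steps = 0, 0
        return self.position, {}

    def step(self, action):
        self.position += action
        self.steps += 1
        terminated = self.position >= self.goal    # the task reached its end state
        truncated = self.steps >= self.time_limit  # the time limit cut the episode short
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def rollout(env, policy, seed=None):
    """Run one episode and return (total reward, number of steps)."""
    obs, _info = env.reset(seed=seed)
    total, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        steps += 1
        if terminated or truncated:  # either flag ends the episode
            return total, steps


reached_goal = rollout(CountdownEnv(), policy=lambda obs: 1)
timed_out = rollout(CountdownEnv(), policy=lambda obs: 0)
```

Checking only terminated would loop forever on the second policy, which never reaches the goal and is stopped by the time limit alone.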
Assumption-Dependent Methods¶
Multi-stage DOVI propagates value through learned transitions. Its causal interpretation
requires transitions that are not confounded:
from causalrl import DOVI

agent = DOVI(
    n_states=5,
    n_actions=2,
    horizon=2,
    transition_assumption="unconfounded",
)
assert agent.is_certified
For exploratory runs where that premise is not available, callers must opt in:
agent = DOVI(n_states=10, n_actions=4, horizon=6, allow_heuristic=True)
assert not agent.is_certified
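To make "propagates value through learned transitions" concrete, here is a minimal sketch of finite-horizon backward induction on a tabular model. It is plain value iteration, without DOVI's optimism bonus or deconfounding step, and the function name and data layout are assumptions for illustration. The values it returns carry a causal reading only if the transition table P itself is an unconfounded estimate.

```python
def finite_horizon_values(P, R, horizon):
    """Backward induction on a tabular model.

    P[s][a] is a list of next-state probabilities and R[s][a] a reward.
    Returns V with V[h][s] the optimal value at stage h; V[horizon] is all zeros.
    """
    n_states = len(P)
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    for h in range(horizon - 1, -1, -1):
        for s in range(n_states):
            # Best one-step reward plus expected value of the next stage.
            V[h][s] = max(
                R[s][a] + sum(p * V[h + 1][s2] for s2, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
    return V


# Two states, two actions: state 1 pays 1.0 per step and is absorbing;
# from state 0, action 1 moves to state 1 and action 0 stays put.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
R = [
    [0.0, 0.0],
    [1.0, 1.0],
]
V = finite_horizon_values(P, R, horizon=2)
```

Any bias in P at a late stage is carried backward into every earlier stage, which is why the multi-stage method asks for the transition assumption to be stated explicitly.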
POMISThompsonSampling must be given the legal intervention variables explicitly:
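# Assumed to be a top-level export, like DOVI above.
from causalrl import POMISThompsonSampling

# env is a bandit environment exposing graph, reward, arms, and manipulable.
agent = POMISThompsonSampling(
    env.graph,
    env.reward,
    env.arms,
    seed=0,
    manipulable=env.manipulable,
)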
The manipulable set is required: inferring it from arm enumeration was removed in v0.99.0.
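For readers unfamiliar with the sampling step, the sketch below shows generic Beta-Bernoulli Thompson sampling over a fixed list of arms. It is not the library's POMISThompsonSampling; the arm encoding, the means table, and the pull function are hypothetical. The arms here are the ones that a manipulable set of {"X"} would allow: no intervention, do(X=0), and do(X=1).

```python
import random


def thompson_sampling(arms, pull, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns the number of pulls per arm.

    `pull(arm, rng)` must return a 0/1 reward.
    """
    rng = random.Random(seed)
    wins = {arm: 1 for arm in arms}    # Beta(1, 1) prior on every arm
    losses = {arm: 1 for arm in arms}
    counts = {arm: 0 for arm in arms}
    for _ in range(rounds):
        # Sample a plausible mean reward for each arm and play the best sample.
        arm = max(arms, key=lambda a: rng.betavariate(wins[a], losses[a]))
        reward = pull(arm, rng)
        wins[arm] += reward
        losses[arm] += 1 - reward
        counts[arm] += 1
    return counts


# Hypothetical arms: each is a tuple of (variable, value) assignments.
means = {(): 0.2, (("X", 0),): 0.5, (("X", 1),): 0.8}
counts = thompson_sampling(
    arms=list(means),
    pull=lambda arm, rng: int(rng.random() < means[arm]),
    rounds=2000,
)
```

Restricting the arm list to interventions on manipulable variables keeps the sampler from spending pulls on interventions that cannot actually be carried out.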
Causal-RL Taxonomy Methods¶
These slices implement the Bareinboim 9-task taxonomy. Each is faithful to its cited source within a
stated scope; conservative helpers return None or raise outside that scope rather than guess.
- Counterfactual decision-making (ETT). counterfactual_expectation and effect_of_treatment_on_treated evaluate Layer-3 queries on an executable SCM; CounterfactualOptimalPolicy acts by the Regret Decision Criterion. Requires an executable explicit-latent SCM.
- General identification (ID algorithm). identify_effect runs the sound and complete Shpitser-Pearl ID algorithm: it returns a do-free Estimand for P(y | do(x)) in any ADMG or raises NotIdentifiableError with the witnessing hedge; estimate_effect evaluates the estimand on data and is_identifiable_effect gives the decision. Validated by simulation on the back-door and front-door graphs (the estimand matches the true do() distribution) and on the bow-arc and instrumental-variable graphs (correctly non-identifiable). requires_experiment answers Task 2's "when to intervene": an experiment is needed exactly when the effect is not observationally identifiable. Scope: a single observational distribution over discrete variables.
- General identification from surrogate experiments (gID). identify_effect_with_experiments / is_gid_identifiable extend the ID recursion: where observation hits a hedge, the needed c-factor is obtained from an available experiment (Tian's Identify subroutine), and estimate_effect_with_experiments evaluates the result on observational plus (randomized) experimental data. With no experiments it coincides exactly with ID; validated by simulation on a graph that is not observationally identifiable but is identified by a surrogate experiment.
- Transportability (sID, mz, meta). transport_formula / is_transportable give a readable closed form for the two workhorse cases (direct and S-admissible adjustment) over selection diagrams. identify_transport / transport_estimand / is_transportable_effect decompose the target effect into c-factors and route each to whichever domain can supply it. At c-factor granularity a factor is invariant exactly when it touches no selection-marked variable, so for a single observational source this routing is the complete single-domain sID: it reduces to the ID algorithm and raises a witnessing transport-hedge when no domain supplies a needed factor. Domain + identify_transport_general / is_transportable_general / estimate_transport_general generalize this to multiple source domains (meta-transportability) and to surrogate experiments in a domain (mz-transportability): each c-factor is searched across the domains/experiments that can supply it, with the target as the fallback. Validated by simulation: a covariate-shift estimate matches the target's true do(), a source experiment breaks a bow-arc hedge, and an effect is assembled from invariant factors contributed by different sources. Sound throughout; the one case it does not stitch is a single c-factor identifiable only by combining several experiments (resolved per-experiment, reported non-transportable rather than guessed).
- Causal discovery. discover runs the PC algorithm (conditional independence by conditional mutual information, then collider and Meek orientation) and returns a CPDAG; discover_interventional additionally orients edges from interventional (L2) data by the invariance principle, yielding the interventional essential graph. These assume causal sufficiency and faithfulness; the CMI test is thresholded, not a calibrated hypothesis test; and CPDAG.to_causal_graph refuses to orient an equivalence class. discover_latent runs the FCI algorithm, dropping causal sufficiency, and returns a PAG with the complete orientation rules R1-R10 (Zhang 2008, sound and complete for latent confounders and selection bias): a <-> b marks a latent confounder and a circle endpoint is undetermined by the equivalence class. Validated against the true MAG of the data-generating DAG-with-latents (the M-bias collider among them).
- Causal imitation. is_imitable / imitation_backdoor_set decide imitability via the π-backdoor criterion (an observed back-door-admissible set); CausalImitator clones P(A | Z).
- Causal curriculum. causal_curriculum orders skills by the causal topological order; PrerequisiteLearner models causally-gated mastery; curriculum_q_learning trains Q-learning through a sequence of subtasks (warm-start transfer), reaching a sparse target that flat learning on the same budget misses.
- Causal reward shaping. apply_potential_shaping is policy-invariant for any potential and causal_potential supplies V*, over deterministic tabular MDPs.
- Causal games. CausalGame / pure_nash_equilibria represent MACIDs and enumerate pure-strategy Nash equilibria for any number of agents; mixed_nash_equilibria finds all mixed-strategy equilibria of a two-player game exactly by support enumeration (rational arithmetic), and for three or more agents by support enumeration with a numerical Newton solve of the multilinear indifference system; every returned profile is verified to be an ε-Nash equilibrium (no agent gains more than 1e-6 by deviating to a pure action).
- Partial-identification / OPE bounds. causal_q_bounds (confounded RL logs) and manski_bounds (observational data) give the sharp no-assumptions Manski interval on a counterfactual mean; ipw_sensitivity_bounds gives the marginal-sensitivity-model (Tan's Γ) interval, collapsing to the IPW point at Γ=1 and containing the truth once Γ exceeds the true confounding odds ratio. Outcomes are assumed bounded; validated against a confounded SCM with a known effect.
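The no-assumptions interval in the last item can be sketched in a few lines. This is a minimal illustration of the textbook Manski bound for a bounded outcome, not the library's manski_bounds; the function name manski_interval and its arguments are assumptions for the example.

```python
def manski_interval(actions, outcomes, target_action, y_min=0.0, y_max=1.0):
    """No-assumptions bounds on E[Y(target_action)] from observational pairs."""
    n = len(actions)
    # Observed part: E[Y * 1{A = target_action}].
    observed = sum(y for a, y in zip(actions, outcomes) if a == target_action) / n
    # Unobserved part: units that took another action could have had
    # any outcome in [y_min, y_max] under the target action.
    p_other = sum(a != target_action for a in actions) / n
    return observed + y_min * p_other, observed + y_max * p_other


# Four logged units: two took action 1 (outcomes 1 and 0), two took action 0.
lower, upper = manski_interval(
    actions=[1, 1, 0, 0], outcomes=[1, 0, 1, 1], target_action=1
)
```

The width of the interval is (y_max - y_min) times the share of units that did not take the target action, so it cannot shrink with more data of the same kind. That is why these results are intervals and not point estimates.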
Not Yet Claimed¶
- Multi-experiment c-factor stitching. Transportability resolves each c-factor from a single
domain/experiment; a c-factor identifiable only by combining several experiments (the deepest
gID case) is reported non-transportable rather than guessed. Single-domain observational ID
(identify_effect), gID (identify_effect_with_experiments), and the multi-domain mz/meta
transportability (identify_transport_general) above are implemented and validated.
- Score-based causal discovery (GES) and interventional FCI.
- Doubly-robust OPE point estimators (the bounds above are partial-identification intervals, not point estimates).
- Production-ready deep or offline-RL training integrations.
- General statistical guarantees from the maintained toy benchmark environments.
The experimental sensitivity helper lives under causalrl.experimental.ope.