

Poster

Linear Causal Bandits: Unknown Graph and Soft Interventions

Zirui Yan · Ali Tajer

Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract: Designing causal bandit algorithms depends on two central categories of assumptions: (i) the extent of information about the underlying causal graphs and (ii) the extent of information about interventional statistical models. There have been extensive recent advances in dispensing with assumptions in either category. In particular, recent advances include assuming known graphs but unknown interventional distributions, and the converse setting of assuming unknown graphs but access to restrictive hard/$\operatorname{do}$ interventions, which remove the stochasticity and ancestral dependencies. However, the problem in its general form, i.e., *unknown* graph and *unknown* stochastic intervention models, remains open. This paper addresses this problem and establishes that in a graph with $N$ nodes, maximum in-degree $d$, and causal path length $L$, after $T$ interaction rounds the regret upper bound scales as $\tilde{\mathcal{O}}((cd)^{L-\frac{1}{2}}\sqrt{T} + d + N)$, where $c>1$ is a constant. A universal minimax lower bound is also established, which scales as $\Omega(d^{L-1}\sqrt{T})$. Remarkably, the graph size $N$ has a diminishing effect on the regret as $T$ grows. These bounds have matching behavior in $T$, exponential dependence on $L$, and polynomial dependence on $d$ (with a gap of $\sqrt{d}$). On the algorithmic side, the paper presents a novel way of designing a computationally efficient causal bandit (CB) algorithm, addressing a challenge faced by all existing CB algorithms that use soft interventions.
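As an illustrative consistency check (not taken from the abstract), comparing the leading terms of the two bounds shows where the stated $\sqrt{d}$ gap comes from:

$$\frac{(cd)^{L-\frac{1}{2}}\sqrt{T}}{d^{L-1}\sqrt{T}} = c^{L-\frac{1}{2}}\, d^{\frac{1}{2}},$$

so for fixed $L$ (and treating $c$ as a constant), the upper and lower bounds differ by a factor of $\sqrt{d}$ in their dependence on the maximum in-degree.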
