Learning Probabilistic and Causal Models with(out) Imperfect Advice
Abstract:
From early education, we’ve learned a general problem-solving approach: abstract real-world problems into models, solve them, and map the solutions back to reality. While effective, this "Abstract, Solve, Map back" framework can sometimes overlook crucial contextual details in real-world instances. This thesis addresses key questions in learning probabilistic and causal models, and introduces algorithmic innovations to incorporate contextual information as imperfect advice. The contributions are organized into three themes: (I) Algorithms for learning probabilistic models, (II) Algorithms for learning causal models, and (III) Algorithms with imperfect advice.
Theme (I) explores finite-sample distribution learning under the PAC framework. We design algorithms for degree-bounded Bayesian networks, extending existing results for linear Gaussian models and offering new insights for discrete polytrees. A key insight for this theme is that structural sparsity enables more sample-efficient learning.
Theme (II) focuses on causal inference, specifically on the problems of causal graph discovery via adaptive interventions and causal effect estimation. For discovery, we provide the first full characterization for identifying causal graphs with interventions and propose adaptive algorithms for various settings. For estimation, we derive PAC bounds for covariate adjustments and develop algorithms to find small adjustment sets. The key idea is that reframing certain causal problems as graph learning or distribution learning tasks allows us to leverage tools from graph theory and statistics respectively.
Theme (III) studies algorithms that utilize instance-specific side-information, explored through the framework of learning-augmented algorithms. Inspired by the property testing insight that "testing can be cheaper than learning", we introduce the TestAndAct framework for using imperfect advice. This approach incorporates a test to assess the quality of the advice and adapts the algorithm's behavior accordingly. We instantiate variants of this idea to improve competitive ratios in online bipartite matching under random arrivals, reduce sample complexity in learning multivariate Gaussians, and minimize interventions for learning causal graphs. Our methods provide performance guarantees that scales with the quality of the advice, without requiring prior knowledge of its accuracy.