PH.D DEFENCE - PUBLIC SEMINAR

Combined Risk Modeling and Subtyping in Intensive Care Units

Speaker
Mr. Shivin Srivastava
Advisor
Dr Vaibhav Rajan, Assistant Professor, School of Computing


14 Mar 2024 Thursday, 01:00 PM to 02:30 PM

SR15, COM3 01-25

Abstract:

Risk models used in clinical decision support systems play a crucial role in disease prevention. This is particularly true in Intensive Care Units (ICUs), where patients are vulnerable to clinical complications. These complications often have complex manifestations, with heterogeneous subpopulations, called subtypes, that exhibit distinct clinical characteristics. Identifying subtypes facilitates deeper clinical understanding and enables the development of personalized care strategies. Extant risk models either do not explicitly model the underlying heterogeneity or adopt a two-step approach in which clustering is first performed to find subtypes, followed by subtype-specific risk modeling. The former, one-size-fits-all approach may under- or over-estimate risks for specific subtypes, while the latter may fail to discover clusters that are beneficial for subsequent risk modeling. Both approaches ultimately lead to inadequate accuracy of risk prediction.

In this thesis, we adopt the computational design science paradigm to develop subtype-aware risk modeling approaches. We begin by theoretically studying the properties of data that lead to better risk prediction performance for linear classifiers trained on that data. We find that if the classes are sufficiently well separated in the data domain, then the downstream linear classifiers can classify the data points quickly. We formalize this notion of class separability and prove that the standard logistic loss decreases when the points are well separated. We carry out experiments on synthetic datasets to verify the theoretical results empirically. Based on this theoretical foundation, we propose a simple k-means-based classification algorithm called Classification Aware Clustering (CAC) and its deep neural variant, DeepCAC. DeepCAC leverages deep representation learning to learn latent embeddings and finds clusters in a manner that makes the clustered data suitable for training classifiers for each underlying subpopulation.
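The link between class separation and logistic loss can be illustrated with a small synthetic experiment. The sketch below is not the thesis's formal definition of class separability or its experimental setup; it assumes two Gaussian classes in two dimensions and uses the distance between their means as a stand-in for separability. The helper names (make_two_classes, fit_logistic, logistic_loss) and all parameter values are hypothetical.

```python
import numpy as np

def make_two_classes(separation, n_per_class=200, seed=0):
    """Two unit-variance Gaussian classes whose means are `separation` apart (assumed setup)."""
    rng = np.random.default_rng(seed)
    offset = np.array([separation / 2.0, 0.0])
    x0 = rng.normal(size=(n_per_class, 2)) - offset
    x1 = rng.normal(size=(n_per_class, 2)) + offset
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y

def logistic_loss(w, b, X, y):
    """Mean logistic (cross-entropy) loss: log(1 + exp(z)) - y * z, computed stably."""
    z = X @ w + b
    return float(np.mean(np.logaddexp(0.0, z) - y * z))

def fit_logistic(X, y, lr=0.1, steps=500):
    """Plain batch gradient descent on the logistic loss, starting from zero weights."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Fit a linear classifier at increasing separations and record the final loss.
separations = [0.5, 2.0, 6.0]
losses = []
for s in separations:
    X, y = make_two_classes(s)
    w, b = fit_logistic(X, y)
    losses.append(logistic_loss(w, b, X, y))

for s, loss in zip(separations, losses):
    print(f"separation={s:.1f}  logistic loss={loss:.4f}")
```

Under these assumptions, the loss reached after a fixed number of gradient steps shrinks as the class means move apart, which is the qualitative behaviour described above.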

Furthermore, we develop a subtype-aware risk modeling approach, called ExpertNet, that leverages the representation learning ability of deep neural networks to simultaneously model the underlying heterogeneity and utilize the clustered patient representations within a mixture of cluster-specific classifiers. We design novel strategies to address challenges in network training and, through knowledge distillation, obtain interpretable subtype-specific risk factors. We evaluate ExpertNet on the tasks of predicting the risk of two health complications, sepsis and acute respiratory distress syndrome (ARDS), where improved prediction leading to early detection can substantially reduce the clinical and economic burden globally. Our experiments on electronic medical records from ICUs demonstrate that ExpertNet discovers clinically meaningful subtypes and effectively utilizes them for risk prediction, significantly outperforming competing approaches.
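To make the idea of a mixture of cluster-specific classifiers concrete, the following minimal sketch combines per-cluster logistic "experts" using soft cluster memberships. It is not ExpertNet's actual architecture or training procedure: the softmax-over-distances assignment, the fixed centroids, the linear experts, and every name and parameter value here are assumptions chosen only for illustration, and the encoder that would produce the embeddings Z is omitted.

```python
import numpy as np

def soft_assignments(Z, centroids, temperature=1.0):
    """Soft cluster memberships: softmax over negative squared distances (assumed form)."""
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the softmax
    q = np.exp(logits)
    return q / q.sum(axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mixture_risk(Z, centroids, W, b, temperature=1.0):
    """Risk = sum over clusters of membership * that cluster's classifier output."""
    q = soft_assignments(Z, centroids, temperature)  # shape (n, k)
    expert_probs = sigmoid(Z @ W.T + b)              # shape (n, k)
    return (q * expert_probs).sum(axis=1)

# Hypothetical 2-D embeddings, two clusters, one linear expert per cluster.
centroids = np.array([[0.0, 0.0], [4.0, 4.0]])
W = np.array([[1.0, -1.0], [0.5, 0.5]])  # row j holds the weights of expert j
b = np.array([0.0, -2.0])
Z = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])

q = soft_assignments(Z, centroids)
risk = mixture_risk(Z, centroids, W, b)
print(np.round(q, 3))
print(np.round(risk, 3))
```

A point lying on a centroid is scored almost entirely by that cluster's expert, while a point midway between the centroids receives an equal blend of both experts' predictions.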