Bayesian Sequential Decision-Making with Informative Timing

This post is a brief summary of highlights from our recent working paper titled “Bayesian Semiparametric Model for Sequential Treatment Decisions with Informative Timing.” It’s part of a series of projects on developing robust Bayesian nonparametric models for sequential decision making in a variety of complex settings. In this case, we deal with a situation in which the decisions are informatively timed - with a motivating application in chemotherapy treatments for pediatric acute myeloid leukemia (AML).

This is partially funded by a PCORI grant grant awarded earlier this year (2022).

Setting and Problem

Chemotherapy treatment in AML is not “one and done” but involves making a sequence of treatment decisions over time, with each subsequent treatment decision depending on how the patient has responded to previous treatments and the evolution of their disease process. There are many chemo agents. This work focuses on understanding the effect of anthracyclines (ACT) in particular on survival. ACT is known to be effective in certain cases, but it can also lead to cardiotoxicity and subsequent early death.

This presents clinicians with a risk-reward tradeoff - trting too aggressively or too conservatively with ACT may reduce survival. To help inform decisions, an echocardiogram is done to help decide if the heart is healthy enough to tolerate ACT. Clinicians follow various rules of thumb in practice. For instance: withholding ACT if a patient’s current ejection fraction (measured via echocardiogram) falls below some absolute threshold (eg 50%) or if it declines more than some relative (e.g. declines more than 20% from time of enrollment). Briefly, ejection fraction (EF) is the proportion of blood that is pumped out of the heart’s left ventricle during a beat. In healthy individuals, this is fairly high (from 50-75%) but can be lower among patients with cardiotoxicity. These treatment rules are “dynamic” in that ACT is decided based on evolving ejection fraction. This is in contrast to static rules such as “always treat with ACT”, “never treat with ACT” or “alternate ACT” - regardless of EF. The goal is to develop a strategy for estimating survival rates under various hypothetical ACT assignment rules. If we have such a method, we could evaluate the efficacy of the various rules of thumb employed in clinics and arrive at a more data-driven approach.

Luckily, using data from the Phase III AAML1031 trial, we can estimate effects of such ACT treatment strategies on survival. In the trial, patients move through a sequence of 4 chemo courses. Ahead of each course, an echo is conducted & used to decide ACT inclusion.


There are many impediments to valid statistical estimation. 1) ACT is not randomized in the trial but informed by time-varying features such as EF. We need to adjust for such features to get an apples-to-apples comparison of different ACT rules. 2) Some patients die or drop out before ever completing the sequence. In the latter case, we are left with incomplete survival information. 3) Treatment courses are not initiated at pre-fixed times, but depending on when subjects recover from the previous chemotherapy course.

This last point suggests that the waiting times between treatments are potential confounders (e.g. slower recovery after previous treatment may inform subsequent treatment and drive survival) - quite a unique type of time-varying confounding which requires modifications to the usual g-computation algorithm for sequential treatments

A Bayesian Semiparametric Method

We model the causal structure using a non-homogenous continuous-time transition process. After each course, patients can transition into a state of subsequent death or transition into a state of subsequent treatment - whichever comes first - in continuous-time. These process is characterized by a pair of transition probabilities: one pair for each treatment course. We use Bayesian semiparametric hazard models to estimate the transition probabilities at each stage as a function of features available at the start of this course.

Causal effects of certain rules are computed by simulating from the transition process under a specified rule and computing the proportion of simulated subjects who survive past a certain time point (e.g. 3 years, if we want to compute 3-year survival).

Here is a twitter thread with some more details and images:

Arman Oganisian
Arman Oganisian
Assistant Professor of Biostatistics