Target Trial Emulation: The Causal Inference Framework Quietly Rewriting How We Read ICU Observational Studies in 2026
Why most observational critical care research has been quietly misleading you, the seven components every clinician should look for, and how the interprofessional ICU team can spot immortal time bias
Here is a quiet truth about modern critical care literature.
Most of what we read every week is not from randomized trials. It is from observational studies. Registry data, MIMIC-IV pulls, single-center retrospectives, large multicenter cohorts, propensity-matched analyses, machine-learning prediction papers. We read them quickly. We update practice slowly. And somewhere between the abstract and the bedside, a hidden defect often does most of the talking.
That defect has a name. It was first described by William Farr in 1843, rediscovered in the 1970s, and has been quietly inflating treatment effects ever since. It is called immortal time bias. It is one of several recurring distortions that target trial emulation, the framework now reshaping how observational evidence is generated and interpreted in 2026, was built to correct.
If you have not yet spent serious time with target trial emulation, this Thursday’s piece is for you. Because by the end of 2026, you will not be able to read a serious ICU observational paper without it.
Why This Matters
In the last five years, observational critical care research has exploded. EHR-derived cohorts, ICU registries, the Medical Information Mart for Intensive Care (MIMIC-IV), eICU, the Amsterdam UMCdb, and a generation of national datasets have made it possible to study almost any ICU question on tens of thousands of patients. Randomized controlled trials remain the gold standard, but they are slow, expensive, narrowly inclusive, and often infeasible for the questions clinicians actually face.
When should we switch from controlled to assisted ventilation? When should we initiate vasopressin in septic shock? Does early dexmedetomidine reduce delirium when initiated in the first 24 hours? When should renal replacement therapy begin? When should we cannulate for VA-ECMO?
These questions are not luxuries. They drive practice. And the literature trying to answer them is overwhelmingly observational.
Here is the problem. Many of those papers, even ones published in high-impact journals, suffer from a small set of recurring design defects that can quietly invert the conclusion. Immortal time bias. Misaligned time-zero. Selection bias from inclusion criteria that depend on post-baseline information. Confounding by indication that propensity scores cannot fully repair. Treatment strategies that no clinician would actually deliver.
Target trial emulation, often shortened to TTE, addresses these defects head-on by forcing investigators to write down the randomized trial they wish they could have run, and then to emulate that hypothetical trial as closely as possible with observational data.
The result is a methodology that has the rigor of a trial protocol and the realism of routine ICU care. For the interprofessional team, it changes what we can trust at the bedside.
The Study / Evidence in Context
The anchor paper for this week comes from Critical Care (November 2025), authored by Reep, Wils, and Heunks at Erasmus, Franciscus Gasthuis, and Radboud. The title says exactly what the field needs: “Opportunities, challenges and future perspectives for target trial emulation in critical care clinical research.”
The paper is a structured methodological review aimed squarely at intensive care clinicians and researchers. It walks through the foundational logic of TTE using a clinically familiar example: the timing of the switch from controlled to assisted ventilation. That example matters because RT, intensivist, and nursing teams make that decision dozens of times a week, the evidence base is overwhelmingly observational, and the question maps cleanly to a hypothetical trial that could in principle be run.
The Reep paper builds on a much larger methodological scaffold. The framework itself traces back to Miguel Hernán and James Robins, whose 2016 paper in the American Journal of Epidemiology, together with a companion 2016 paper by Hernán and colleagues in the Journal of Clinical Epidemiology, formalized the concept of specifying a target trial before any analysis is conducted. The BMJ published a widely cited 2022 explainer by Matthews and colleagues. A 2023 JAMA Network Open systematic review by Hansford and colleagues quantified the rapid uptake of TTE across medicine. And in 2025, Yates, Parks, and Dodd in Open Forum Infectious Diseases released IMMORTOOL, an open-source tool to quantify the potential for immortal time bias in observational studies of acute severe infection. The field is moving fast.
What ties all of this together is a simple proposition. If you cannot draw the trial you wish you had run, you cannot defend the conclusion you are drawing.
What Stood Out
Several elements of the Reep review and the broader TTE literature stood out enough to influence how I now read every observational ICU paper that comes across my desk.
The seven components of a target trial protocol.
To emulate a target trial, an investigator (and a careful reader) should be able to specify seven elements before looking at outcomes:
Eligibility criteria.
Treatment strategies under comparison.
Treatment assignment procedures.
Follow-up period.
Outcomes.
Causal contrasts (intention-to-treat analog, per-protocol analog).
Analysis plan, including how time-varying confounding will be handled.
If any of those seven are missing, ambiguous, or impossible to operationalize with the data, the causal claim weakens. This is not pedantry. The seven-component protocol is the single most useful evidence appraisal tool in critical care that most clinicians have never used.
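One way to make the checklist concrete is to treat the seven components as fields that a paper either fills in or leaves blank. The sketch below is only an illustration of that reading habit: the class name, the field names, and the example paper are all hypothetical, invented here to show the idea rather than taken from the Reep review.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class TargetTrialProtocol:
    """The seven protocol components; None means the paper never specified it."""
    eligibility_criteria: Optional[str] = None
    treatment_strategies: Optional[str] = None
    treatment_assignment: Optional[str] = None
    follow_up_period: Optional[str] = None
    outcomes: Optional[str] = None
    causal_contrasts: Optional[str] = None
    analysis_plan: Optional[str] = None


def missing_components(protocol: TargetTrialProtocol) -> List[str]:
    """Return the names of components that are absent or blank, in protocol order."""
    return [
        f.name
        for f in fields(protocol)
        if not (getattr(protocol, f.name) or "").strip()
    ]


# Hypothetical appraisal of an invented "early versus late" ventilation paper.
paper = TargetTrialProtocol(
    eligibility_criteria="Adults on controlled ventilation for at least 24 hours",
    treatment_strategies="Switch to assisted ventilation within 48 hours vs later",
    follow_up_period="From eligibility to day 28",
    outcomes="28-day mortality",
)

gaps = missing_components(paper)
print("Unspecified components:", gaps)
```

A paper that leaves several fields empty is not necessarily wrong, but each gap is a place where the causal claim rests on something the reader cannot inspect.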
Time-zero alignment is the single most common failure point.
In a randomized trial, eligibility is assessed, treatment is assigned, and follow-up starts at the same moment. That moment is time-zero. In observational research, those three events frequently drift apart. Eligibility may be defined using data that did not exist until later in the stay. Treatment may be classified based on what eventually happened. Follow-up may start days after the patient became eligible. When time-zero drifts, immortal time bias appears almost automatically.
The defining characteristic of immortal time is this: during a specific window, the outcome of interest cannot occur among patients who will later be classified as treated. Patients have to survive long enough to receive the treatment, so any death during that window is, by construction, assigned to the comparison group, while the treated group is credited with survival time it accrued before treatment began. The treated group is artificially protected. The effect is sometimes large enough to flip the direction of an entire literature.
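A small simulation can make the mechanism visible. The sketch below invents a treatment that does nothing at all: every patient has the same constant daily hazard of death, and half are slated to start treatment at a random point in the first ten days. All of the numbers (hazard, horizon, treatment window, sample size) are assumptions chosen for illustration, not estimates from any study. The "naive" analysis groups patients by whether they ever received treatment. The second analysis is a classical person-time correction that counts pre-treatment days as untreated time. It is a simpler fix than a full target trial emulation, but it shows what changes when the stopwatch is handled properly.

```python
import random


def simulate(n=50_000, daily_hazard=0.02, horizon=28.0, seed=1):
    """Simulate a treatment with NO effect and analyse it two ways."""
    rng = random.Random(seed)
    naive = {"treated": [0, 0], "untreated": [0, 0]}      # [deaths, patients]
    ptime = {"treated": [0, 0.0], "untreated": [0, 0.0]}  # [deaths, person-days]

    for _ in range(n):
        death_day = rng.expovariate(daily_hazard)  # same hazard for everyone
        # Half of patients are slated for treatment on a random day in the first 10.
        planned_start = rng.uniform(0.0, 10.0) if rng.random() < 0.5 else float("inf")
        end = min(death_day, horizon)              # follow-up stops at death or day 28
        treated = planned_start < end              # must survive to receive treatment
        died = death_day <= horizon

        # Naive analysis: group patients by what eventually happened.
        group = "treated" if treated else "untreated"
        naive[group][0] += died
        naive[group][1] += 1

        # Person-time analysis: days before treatment count as untreated time.
        if treated:
            ptime["untreated"][1] += planned_start
            ptime["treated"][1] += end - planned_start
            ptime["treated"][0] += died
        else:
            ptime["untreated"][1] += end
            ptime["untreated"][0] += died

    return {
        "naive_mortality_treated": naive["treated"][0] / naive["treated"][1],
        "naive_mortality_untreated": naive["untreated"][0] / naive["untreated"][1],
        "rate_treated": ptime["treated"][0] / ptime["treated"][1],
        "rate_untreated": ptime["untreated"][0] / ptime["untreated"][1],
    }


results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the naive comparison shows clearly lower 28-day mortality in the treated group, even though the treatment is inert, while the two person-time death rates come out close to each other. The apparent benefit in the naive analysis comes entirely from how patients were classified.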
Causal language without causal methods is the quiet epidemic.
Hernán’s 2018 American Journal of Public Health essay, “The C-Word,” made an uncomfortable observation about the field. Observational papers routinely use the language of association while drawing the conclusions of causation. Readers, including clinicians, then translate the soft language into hard practice changes. TTE forces investigators to be honest about which question they are actually answering.
Cloning, censoring, and weighting (CCW) is now the dominant analytic strategy for sustained-treatment questions.
For questions like “when should we start vasopressin,” patients do not receive a one-time exposure. They live through a dynamic regime over hours and days. Cloning each patient into multiple hypothetical strategies, censoring clones when their actual care diverges from the strategy, and weighting to address the resulting selection bias has become the workhorse approach. It is also the analytic plan that, when absent or hand-waved, should make a careful reader skeptical.
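For readers who want to see the bookkeeping, here is a minimal sketch of the first two steps, cloning and censoring, for a hypothetical pair of strategies: "start treatment within 6 hours of time-zero" versus "do not start within 6 hours." The 6-hour grace period, the class names, and the three example patients are all assumptions made up for illustration. The weighting step is deliberately left out, because it requires a fitted model of who gets censored and why, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List, Optional

GRACE_HOURS = 6.0  # hypothetical grace period for the "early" strategy


@dataclass
class Patient:
    pid: str
    start_hour: Optional[float]  # hours from time-zero to treatment start; None if never
    end_hour: float              # hours from time-zero to death or end of follow-up
    died: bool                   # True if follow-up ended in death


@dataclass
class Clone:
    pid: str
    arm: str                # "early" or "not_early"
    followed_until: float   # hours of follow-up credited to this arm
    died: bool              # death observed while still following the arm
    censored: bool          # artificially censored for deviating from the arm


def clone_and_censor(p: Patient) -> List[Clone]:
    """Copy one patient into both arms and censor each copy when care deviates."""
    started_early = p.start_hour is not None and p.start_hour <= GRACE_HOURS

    # "early" arm: a clone deviates only if the grace period ends with no treatment.
    # A death inside the grace period is still consistent with the strategy.
    if started_early or p.end_hour <= GRACE_HOURS:
        early = Clone(p.pid, "early", p.end_hour, p.died, False)
    else:
        early = Clone(p.pid, "early", GRACE_HOURS, False, True)

    # "not_early" arm: a clone deviates the moment treatment starts in the window.
    if started_early:
        not_early = Clone(p.pid, "not_early", p.start_hour, False, True)
    else:
        not_early = Clone(p.pid, "not_early", p.end_hour, p.died, False)

    return [early, not_early]


# Three invented patients.
cohort = [
    Patient("A", start_hour=2.0, end_hour=40.0, died=False),   # treated early
    Patient("B", start_hour=None, end_hour=3.0, died=True),    # died at hour 3, untreated
    Patient("C", start_hour=20.0, end_hour=100.0, died=True),  # treated late, later died
]

clones = [c for p in cohort for c in clone_and_censor(p)]
for c in clones:
    print(c)
```

Patient B is the one to watch. That early death is counted in both arms, because at hour 3 the patient's care was still compatible with either strategy. This is what removes the immortal time that a naive "ever treated" grouping would hand to the treated group.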
Three core assumptions decide whether the causal estimate is interpretable at all.
Reep and colleagues call them out explicitly:
Consistency. The treatment strategy in the data is well-defined and matches the strategy in the hypothetical trial.
Conditional exchangeability. After adjusting for measured confounders, treated and untreated patients are comparable.
Positivity. Every eligible patient has a non-zero probability of receiving each strategy under comparison.
Violate any of these, and the math returns a number that is not what it appears to be.
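Of the three, positivity is the one a reader can most directly probe in the data, since conditional exchangeability cannot be verified from observed data alone. The sketch below is a crude check under stated assumptions: patients are grouped into hypothetical severity strata, and any stratum where one of the two strategies never occurs is flagged. The function name, the strata labels, and the rows are invented for illustration. A real analysis would also worry about strata where a strategy is merely rare, which this check does not catch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def positivity_violations(rows: Iterable[Tuple[str, bool]]) -> List[str]:
    """Return strata in which one strategy is never observed.

    rows: (stratum label, received_treatment) pairs, one per patient.
    """
    counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])  # [untreated, treated]
    for stratum, treated in rows:
        counts[stratum][int(treated)] += 1
    return sorted(
        stratum
        for stratum, (untreated, treated) in counts.items()
        if untreated == 0 or treated == 0
    )


# Invented cohort: every "severe" patient received the treatment.
rows = [
    ("mild", False), ("mild", True), ("mild", False),
    ("moderate", True), ("moderate", False),
    ("severe", True), ("severe", True),
]

flagged = positivity_violations(rows)
print("Strata with an unobserved strategy:", flagged)
```

When a stratum is flagged, the data contain no information about what the missing strategy would have done for those patients, so any estimate for them is extrapolation.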
Clinical, Research, and Leadership Interpretation
For the interprofessional ICU team, this matters in concrete and discipline-specific ways.
For intensivists and APPs, target trial emulation changes how to read a paper. A retrospective cohort claiming that “early X reduced mortality” should now trigger an immediate mental checklist. What was time-zero? Was eligibility assessed before treatment was assigned? Was follow-up aligned across groups? Was a per-protocol analog estimated using g-methods? Were the three assumptions explicitly addressed? If those questions are unanswerable from the paper, the effect size is not trustworthy enough to change practice.
For respiratory therapists, the Reep paper uses our work as its central example. The switch from controlled to assisted ventilation is one of the most consequential, judgment-driven decisions in modern critical care, and the evidence base is observational. RTs reading the literature on weaning timing, NIV strategies, prone positioning duration, and APRV initiation are now operating in a TTE era. When a paper claims that earlier or later weaning reduces ventilator days, the next question is whether the treatment strategy was actually deliverable and whether time-zero was aligned. If not, the headline is suspect.
For ICU nurses, observational research on early mobilization, sedation interruption protocols, and delirium prevention is constantly cited in practice changes. Many of these studies have classical immortal time problems: patients are coded as “mobilized” only if they survived long enough to be mobilized. The mortality benefit shrinks or disappears once time is handled properly. Nurses who can articulate this objection in unit-based practice meetings raise the floor of the entire team’s evidence appraisal.
For pharmacists, drug-effectiveness questions in the ICU live almost entirely in observational space. Dexmedetomidine and delirium. Early antibiotics in sepsis. Vasopressin timing in septic shock. Steroids in pneumonia. TTE-style reanalyses have already begun to revise the magnitude of effects that pharmacotherapy literature once treated as settled. Pharmacists who can flag missing per-protocol analyses or unspecified treatment strategies become indispensable to multidisciplinary rounds.
For perfusionists, the timing of VA-ECMO cannulation is one of the highest-stakes observational questions in adult critical care. The decision to cannulate depends on hemodynamic trajectory, which is itself a time-varying confounder. Naive comparisons of “early” versus “late” ECMO almost always suffer from immortal time bias and confounding by indication. Reading these papers through a TTE lens is not optional anymore.
For ICU leaders and educators, the shift to TTE is also a professional development opportunity. Journal clubs that incorporate the seven-component protocol as a standing checklist produce sharper readers within a few sessions. Quality improvement teams that use the framework to design their own observational evaluations of new bundles or protocols produce far more defensible data.
There is also a leadership lesson hiding in here. The willingness to slow down, write the protocol you wish you had run, and admit which question your data can and cannot answer is the same discipline that distinguishes mature clinical leadership from impulsive practice change. The framework rewards intellectual honesty, and so does the bedside.
Bedside and Workplace Takeaways
A practical short list to use this week:
When you see “early versus late” comparisons, write down your own time-zero. If the paper’s time-zero is anywhere other than the moment of eligibility, flag it.
When a paper reports a striking mortality reduction with a treatment that takes hours to initiate in a high-mortality condition, suspect immortal time bias by default. Tools like IMMORTOOL now exist to quantify the potential for this bias, at least in observational studies of acute severe infection.
When the analysis plan does not specify either an intention-to-treat analog or a per-protocol analog handled with g-methods, downgrade your confidence one full level.
When the treatment strategy under comparison is one no real ICU could actually deliver (for example, “intubate all patients exactly at hour 6”), the estimand has no clinical meaning, regardless of the p-value.
When discussing observational evidence at multidisciplinary rounds, ask the team to articulate the target trial in one sentence. If no one can, the evidence is not ready to drive a protocol change.
These five reflexes will quietly upgrade the quality of every practice change your unit considers in 2026.
Teaching Pearl
Patients must survive long enough to receive a treatment. That requirement, when it is not handled in the study design, creates artificial protection that has nothing to do with biology, the drug, or the strategy. It is a function of when the stopwatch starts. Target trial emulation is, at its heart, a discipline for making sure the stopwatch starts in the right place.
What We Should Not Over-Assume
A few cautions to keep this from being read as triumphalist.
TTE is not a substitute for a randomized trial. It is a tool for asking causal questions when randomization is impossible, infeasible, or has not yet been done. It does not eliminate confounding by indication. It does not solve unmeasured confounding. It does not rescue datasets that lack the necessary variables to define eligibility, treatment, and outcome with precision.
A TTE-labeled paper is not automatically a good paper. The recent Hansford systematic review in JAMA Network Open found wide variation in reporting quality among studies that explicitly described themselves as target trial emulations. The label is becoming common faster than the methodology is being executed well.
And TTE does not eliminate the need for clinical judgment. A correctly emulated target trial of a strategy no one would actually use produces a precise estimate of an irrelevant quantity. The framework is rigorous, but it cannot replace the clinical instinct that selects the right question.
Limitations
The Reep paper is a methodological review, not a primary study. Its illustrative example (ventilation mode transition) is helpful but limited in scope. The paper does not exhaustively benchmark all available analytic approaches for time-varying confounding, nor does it deeply explore Bayesian approaches to causal inference, which are an active and adjacent area of methodological work.
Broader limitations of the TTE literature itself include: the dependence on EHR data quality, which varies widely between centers and countries; the difficulty of capturing all relevant confounders in routinely collected data; the under-discussed challenge of measurement error in exposures and outcomes; and the relatively small pool of clinicians with training in g-methods who can independently audit the analytic plan of a complex TTE paper.
There is also a sociological limitation. Even excellent TTE work can be misinterpreted by readers who are not yet fluent in the framework. Education, including the kind of work this Substack tries to do, is part of the methodology’s actual implementation.
Bottom Line
Target trial emulation is no longer an academic curiosity. It is becoming the default analytic framework for serious observational research in critical care, and within the next two years it will be a competency the interprofessional team is expected to read and apply. The seven-component protocol, the three core assumptions, and a working understanding of immortal time bias and cloning-censoring-weighting are now the minimum literacy required to evaluate the literature that drives ICU practice.
For ICCN readers, the upgrade is simple. Before changing any practice based on an observational paper, ask one question. Can I draw, on a single page, the randomized trial the authors say their analysis emulates? If yes, read deeper. If no, the headline is not yet ready to influence the patient in bed eight.
That single discipline, applied weekly, will improve the quality of evidence translation across the entire ICU.
References
Reep CAT, Wils EJ, Heunks L. Opportunities, challenges and future perspectives for target trial emulation in critical care clinical research. Crit Care. 2025;29(1). doi:10.1186/s13054-025-05723-x
Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758-764. doi:10.1093/aje/kwv254
Hernán MA, Sauer BC, Hernández-Díaz S, Platt R, Shrier I. Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol. 2016;79:70-75. doi:10.1016/j.jclinepi.2016.04.014
Matthews AA, Danaei G, Islam N, Kurth T. Target trial emulation: applying principles of randomised trials to observational studies. BMJ. 2022;378:e071108. doi:10.1136/bmj-2022-071108
Hansford HJ, Cashin AG, Jones MD, et al. Reporting of observational studies explicitly aiming to emulate randomized trials: a systematic review. JAMA Netw Open. 2023;6(9):e2336023. doi:10.1001/jamanetworkopen.2023.36023
Hernán MA. The C-word: scientific euphemisms do not improve causal inference from observational data. Am J Public Health. 2018;108(5):616-619. doi:10.2105/AJPH.2018.304337
Dickerman BA, García-Albéniz X, Logan RW, Denaxas S, Hernán MA. Avoidable flaws in observational analyses: an application to statins and cancer. Nat Med. 2019;25(10):1601-1606. doi:10.1038/s41591-019-0597-x
Yates TA, Parks T, Dodd PJ. Quantifying potential immortal time bias in observational studies in acute severe infection. Open Forum Infect Dis. 2025;12(4):ofaf173. doi:10.1093/ofid/ofaf173
da Rosa Decker SR, Serpa Neto A. Bringing credibility to observational research in critical care: the case of target trial emulation designs. Crit Care Sci. 2025;37:e20250142. doi:10.62675/2965-2774.20250142
Yadav K, Lewis RJ. Immortal time bias in observational studies. In: Livingston EH, Lewis RJ, eds. JAMA Guide to Statistics and Methods. McGraw-Hill; 2019.
Tyrer F, Bhaskaran K, Rutherford MJ. Immortal time bias for life-long conditions in retrospective observational studies using electronic health records. BMC Med Res Methodol. 2022;22:86. doi:10.1186/s12874-022-01581-1
Hernán MA, Hsu J, Healy B. A second chance to get causal inference right: a classification of data science tasks. Chance. 2019;32(1):42-49. doi:10.1080/09332480.2019.1579578
Admon AJ, Donnelly JP, Casey JD, et al. Emulating a novel clinical trial using existing observational data. Predicting results of the PreVent study. Ann Am Thorac Soc. 2019;16(8):998-1007. doi:10.1513/AnnalsATS.201903-241OC
Zampieri FG, Casey JD, Shankar-Hari M, Harrell FE Jr, Harhay MO. Using Bayesian methods to augment the interpretation of critical care trials. Am J Respir Crit Care Med. 2021;203(5):543-552. doi:10.1164/rccm.202006-2381CP
Hernán MA. Methods of public health research: strengthening causal inference from observational data. N Engl J Med. 2021;385(15):1345-1348. doi:10.1056/NEJMp2113319
Labrecque JA, Swanson SA. Target trial emulation: teaching epidemiology and beyond. Eur J Epidemiol. 2017;32(6):473-475. doi:10.1007/s10654-017-0293-4
García-Albéniz X, Hsu J, Hernán MA. The value of explicitly emulating a target trial when using real world evidence: an application to colorectal cancer screening. Eur J Epidemiol. 2017;32(6):495-500. doi:10.1007/s10654-017-0287-2
Cain LE, Robins JM, Lanoy E, Logan R, Costagliola D, Hernán MA. When to start treatment? A systematic approach to the comparison of dynamic regimes using observational data. Int J Biostat. 2010;6(2):Article 18. doi:10.2202/1557-4679.1212
Hernán MA, Robins JM. Causal Inference: What If. Boca Raton: Chapman & Hall/CRC; 2020.
Schneeweiss S, Rassen JA, Brown JS, et al. Graphical depiction of longitudinal study designs in health care databases. Ann Intern Med. 2019;170(6):398-406. doi:10.7326/M18-3079
Clinical Disclaimer
The content above is for educational purposes only and is not intended to replace clinical judgment, institutional protocols, or care delivered by qualified healthcare professionals. Patient care decisions should always be individualized, made in collaboration with the full interprofessional team, and aligned with current local guidelines, regulatory standards, and the patient’s clinical context. ICCN is not responsible for clinical actions taken solely on the basis of this article.
© 2026 Interprofessional Critical Care Network (ICCN). All rights reserved. Unauthorized reproduction or redistribution of this content is prohibited. Subscribers may share excerpts with proper attribution to ICCN and the author.



