Stuck at Level 2 – Kirkpatrick Levels of Training Evaluation and EBM Teaching

The instructional design world, focused as it is on training in business, has done a better job of defining educational outcomes, in my opinion, than we in medicine have.  In medicine, we seem wedded to a clinical trial mentality.  Not really in terms of study design – the randomized controlled trial is just as much a gold standard in education as it is in medicine – but in terms of outcomes.  Just as most biomedical research has been very disease-oriented in its outcomes (lowering a hemoglobin A1c level or improving a physiologic parameter rather than lengthening somebody's life or improving its quality), educational research in medicine has been focused on proxy measures – student evaluations and post-intervention knowledge testing.

There are many examples in medicine where teaching the "process" is the goal, rather than improving specific knowledge.  History taking and physical examination are two easy examples. Evidence-based medicine – looking for research-based answers to clinical questions, assessing their usefulness and applying the conclusions back to the patient – is a really good example.  In the two decades or so that I've been teaching EBM, I have seen the community of EBM teachers (and others) bemoan the lack of any higher-order evidence for teaching outcomes.  One tool for getting us beyond those more basic outcomes is an evaluation model for training in EBM.  I propose the Kirkpatrick Training Evaluation model.  Others have proposed it too; I just wanted to blog about it…

The Kirkpatrick scheme comprises, in its original form, four levels of outcome after a training intervention: 

  1. Reaction – how well did the learners enjoy the session and do they think they benefitted from it?
  2. Knowledge Gain – did the learners' knowledge improve?
  3. Process Change – did the learners change their behavior?
  4. Results – did the outcomes that ultimately matter improve?

We tend to cover the first two outcomes adequately in educational research.  It's the last two that are harder.  In business, process change can be observed in a factory worker visibly changing how they assemble a widget.  In the military, soldiers can be observed using different skills when responding to a threat.  In medicine, how do we observe the process of identifying a clinical question, searching for the answer, appraising the research found and applying it back to the patient?  Those are primarily "cognitive behaviors" (if you will…) and are hard to observe without intervening in some way – asking the learner what they did, constructing automatic measurements like hits on a web site, etc.  In all of those cases, objective, unbiased observation is difficult.  It gets even more difficult if you are inclined to measure changed clinical processes as the outcome – treating a given condition in a certain way (prescribing aspirin after heart attacks, etc.).  Those treatments are proven to work "on average" for the patients in the relevant studies.  However, because as physicians we often need to apply evidence beyond the usual narrow confines of study inclusion criteria, a simple increase in the percentage of patients in whom we prescribe aspirin may be confounded by each patient's individual characteristics, the contribution of multiple socioeconomic determinants and a whole host of other effects.
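
As an aside on that "hits on a web site" idea: here is a minimal sketch of what such an automated level-3 measure might look like.  Everything in it is hypothetical – the log format, the learner identifiers and the pre/post comparison are invented for illustration, not drawn from any real system – but it shows how a search log could yield a crude, observable process measure (searches per learner before and after an EBM course) without asking the learner anything.

```python
from collections import defaultdict
from datetime import date

# Hypothetical search-log records: (learner_id, date_of_search).
# A real log might come from a library proxy or a point-of-care
# search tool; this format is invented purely for illustration.
SEARCH_LOG = [
    ("res01", date(2024, 1, 10)),
    ("res01", date(2024, 3, 4)),
    ("res02", date(2024, 1, 22)),
    ("res02", date(2024, 3, 11)),
]

COURSE_DATE = date(2024, 2, 1)  # when the EBM training took place

def searches_per_learner(log, start, end):
    """Count searches per learner with start <= date < end."""
    counts = defaultdict(int)
    for learner, when in log:
        if start <= when < end:
            counts[learner] += 1
    return counts

pre = searches_per_learner(SEARCH_LOG, date(2024, 1, 1), COURSE_DATE)
post = searches_per_learner(SEARCH_LOG, COURSE_DATE, date(2024, 4, 1))

for learner in sorted(set(pre) | set(post)):
    print(f"{learner}: {pre.get(learner, 0)} searches before the course, "
          f"{post.get(learner, 0)} after")
```

Of course, a count like this tells us only that a search happened – not whether the question was well built or the appraisal sound – which is exactly the objectivity-versus-depth trade-off described above.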

Level four "results" are even harder to nail down.  What is the result that counts for the patient?  You might include the proxy outcome of "prescribed aspirin" at this level, but isn't the ultimate result we want an improved outcome for the patient?  Imagine, then, the list of potential confounders and biases that could creep into any educational research study that looks at the outcome of patients as a result of an educational intervention – even if that intervention is at the level of the currently-practicing physician…never mind if the learner is a medical student.

These are difficult, but not impossible, issues to address in medical education, and the first step requires only that we attempt to move beyond level 2.  What's needed is to introduce some measure of process change after educating students in evidence-based medicine.  Even if the process measure is simulated – as in evaluations that use simulated clinical examinations – it is a start.