Evaluating Learning, It’s Important! Kirkpatrick’s Model

Previously, we discussed the ADDIE model and the importance of each step when designing instruction. The last step, evaluation, is equally important to the actual design and development of instruction.

Evaluation helps instructional designers determine the success of the instruction or determine the gaps in learning that must be overcome to improve future designs of the instruction.

Donald Kirkpatrick created a ‘four level’ model for training course evaluation in 1959; it gained popularity in the 1970s.

Kirkpatrick’s Four Levels of Evaluation are designed to evaluate training programs in a sequenced order. The model is typically displayed as a pyramid, in which the later levels are more difficult to assess and take longer to do so.

The four levels of Kirkpatrick’s Evaluation Model are:

Level One- Reaction
Level Two- Learning
Level Three- Behavior
Level Four- Results

Level One- Reaction is the most basic level of evaluation, in which the participants’ opinions and feelings about the training are measured. I typically hand out evaluation surveys at the end of each training to poll participants on how they liked the overall presentation and whether they are interested in using the technology in question.


Level Two- Learning is an increase in knowledge and/or skills as a result of the training. This learning can be measured during the training in the form of a test. I have my colleagues walk through the instructional quick guides and complete the task at hand to demonstrate basic knowledge.


Level Three- Behavior is the transfer of knowledge and/or skills from the training to the job. This step is best evidenced 3-6 months after training and is observed while the trainee performs the task. I have observed my colleagues as they attempt the tasks we discussed during training. Behavior skills have not yet been achieved, as my trainees are not comfortable performing the tasks without me standing by.


Level Four- Results, the last level of evaluation, occurs when results can be measured as a byproduct of the training program; for example, when attendance and participation have a monetary or performance-based impact. Performance has been positively affected by trainings conducted on collaboration tools and best practices in email. Colleagues have stated that they spend less time emailing documents back and forth since being trained to work on shared resources.
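The four levels above can be sketched as a simple data record that a trainer might keep per course. This is a hypothetical illustration, not part of Kirkpatrick’s model itself: the field names, rating scales, and the `TrainingEvaluation` class are all my own assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class TrainingEvaluation:
    """Hypothetical per-course record of Kirkpatrick's four levels."""
    course: str
    reaction_scores: list = field(default_factory=list)  # Level 1: survey ratings (assumed 1-5 scale)
    learning_scores: list = field(default_factory=list)  # Level 2: test scores (assumed 0-100)
    behavior_observed: bool = False                      # Level 3: on-the-job transfer seen yet?
    result_metric: float = None                          # Level 4: e.g. hours saved per week (assumed)

    def summary(self):
        # Average the survey and test data; pass the later levels through as-is.
        return {
            "reaction": mean(self.reaction_scores) if self.reaction_scores else None,
            "learning": mean(self.learning_scores) if self.learning_scores else None,
            "behavior": self.behavior_observed,
            "results": self.result_metric,
        }

ev = TrainingEvaluation("Collaboration Tools 101")
ev.reaction_scores += [4, 5, 4]   # Level 1: end-of-session surveys
ev.learning_scores += [85, 90]    # Level 2: quick-guide walkthrough tests
ev.behavior_observed = False      # Level 3: trainees not yet independent
print(ev.summary())
```

A record like this makes the pyramid’s ordering concrete: Levels 1 and 2 fill in immediately after a session, while the Level 3 and Level 4 fields typically stay empty for months.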



About Julie Tausend Burba

Instructional Designer at Hulu, Ed Tech and Project Management enthusiast. MBA Technology Management, MS Management, BS Communications, Traveler and Cook.

Posted on July 11, 2013, in Instructional Design, Tausend Talks Shop. Bookmark the permalink. 4 Comments.

  1. Julie,

    Yes, yes, yes, evaluation is important from my point of view as an ex-classroom teacher who is new to the world of Instructional Design.

    As a teacher, I had professional development on dozens of teaching strategies, programs and technologies mandated for adoption in my classroom. At the same time, I’ve seen the emergence of high-stakes evaluations of teachers that broadly try to measure teachers’ skills. But, until I began my ID program, I had never heard of ADDIE and its last step, a formal evaluation of instruction that does not focus on teachers.

    Frustrated teachers often complain that our schools lurch from one program or philosophy to another every two or three years, but we never really try to measure the effectiveness of these programs. What’s more, we often don’t stick with any program long enough to measure its impact. That’s why I think that local public education leaders need to employ this last step of the ADDIE model if they hope to improve the quality of education.

    In your post about Kirkpatrick’s model of evaluation, I learned that he created it in the 1950s and that training programs had adopted it in the 1970s. I was amazed that a formal approach to evaluating specific programs of instruction had existed for so long. With the exception of Step 1, in which you measure the reaction of students to a class, the model has largely been ignored by public education when it comes to figuring out what specific approaches, strategies and materials work.

    That is not to say that we don’t loosely use Kirkpatrick’s four steps, but we don’t apply them systematically enough to specific changes we make. Yes, we do survey – Step 1 Reaction – and test – Step 2 Learning – at the end of a course. And later, we do have ongoing measures of performance – Kirkpatrick Step 3 – with standardized testing. Further down the line we gather statistics on how students progress over the years in education and even employment. We ask things like: How many students graduate high school? Of those who go to college, how successful are they and how well-prepared are they for college-level classes? These measures could be viewed as completing Kirkpatrick’s Step 4 by measuring results. However, the focus is on measuring students’ performance very broadly over time; I would call this macro-evaluation of schools and school systems. What Kirkpatrick calls for in his model is more micro-evaluation of the use of specific methods, curriculum or even the textbooks we spend millions of dollars regularly updating.

    It seems to me that systematic micro-evaluation of specifics might lead to systematic change and progress that would percolate up from the school-level rather than trickle down from theorists and the corporations that profit from the application of the theories. I have tried out numerous approaches, strategies and activities that the district mandated I use. But I’ve never worked at a school or district that tried to measure impacts of these mandated changes.

    One recent example that comes to mind is a program called BYOT that came out last year in my school district. BYOT stands for Bring Your Own Technology. The idea is that teachers incorporate smart phones and tablets into their lesson plans, encouraging students to use them to manage learning and master skills. District technology trainers certified many teachers to use BYOT. But no one attempted to measure its effectiveness even though it would have been easy to evaluate. It was an optional program, and some teachers jumped at it while others hung back. Pre- and post-surveys of students and teachers in classes that used BYOT and others that did not could have yielded invaluable knowledge for next year. We’d have a better idea of whether the program worked and was worth continuing. We’d know how to tweak it, and we’d have a better idea of what tasks adapt well to BYOT and what’s better left to more traditional methods. Instead we tried it without any evaluation.

    The evolving field called Instructional Design is merging two instructional domains: the world of brick-and-mortar and online classrooms and that of corporate and public employment training. As public education begins to move toward Instructional Design, we could and should borrow a commitment to systematic micro-evaluation of instruction from the corporate training domain.

    I am convinced that using a systematic micro-evaluation model like Kirkpatrick’s would be a welcome change. And, who knows? We might even learn something about what really works. We might even become more systematic in our adoption of new approaches. We might even save some taxpayer money. So, Julie, yes, systematically evaluating the tools we buy and the programs we adopt is important. It’s about time public educators used ADDIE-style evaluation to systematically measure results.

    Is Kirkpatrick’s model the best one for public education to use? I have read about newer models that have emerged, including Kaufman’s five-step model. I’m not sure which is best, and I’m not sure it matters that much. But I’m convinced we should be systematically evaluating instruction. After all, we now teach metacognition to students to help them learn more about how they learn. Why not apply some systematic metacognition to evaluate the results of instruction?

    • Hello katmirin@aol.com,

      Thank you for your insightful and relevant comment to my post on Kirkpatrick’s Four Level Evaluation Model. Yes, there are other Evaluation models an instructional designer should consider using when evaluating instruction. I plan on writing articles detailing some of these alternative models.

      You touched on some basic underlying problems with our education system: instructors aren’t sure how to evaluate their instruction, or even whether they should. This should come from educational leadership in a top-down manner. But it’s important that not only are evaluation models used, but that the data and information collected are used for positive change. I say, change the way instruction is being done if the evaluation results are not met. This is easier said than done, but I have taught a few sections in higher education, and if the results aren’t met, I look at the curriculum and objectives and determine where the gaps were so I can make effective changes.

      Thanks for contributing!
      Julie Tausend of Tausend Talks EdTech

  2. Pingback: Put the Design Back in Instructional Designer! | Tausend Talks EdTech

  3. Pingback: It’s not a test! Assessing Learning. | Tausend Talks EdTech
