Evaluating evaluation

Lately, I’ve been thinking a lot about how we evaluate instruction in higher education.  My thoughts emanate from different discussions I’ve had with colleagues regarding quality of instruction both in online and face-to-face formats.  Traditionally, colleges and universities have relied on student evaluations as measures of quality instruction.  As I’ve written before, these measures often reflect other dimensions beyond good teaching.  I’ve discussed the power of the “blink response” and how important affective aspects can be in determining student satisfaction in classes.  I’ve also written that students don’t always know what quality instruction looks like and may be confused about what their role in the process should be.  Stark and Freishtat, writing in ScienceOpen, also examined the use of student evaluations by looking at some of the statistical, contextual and comparative implications in collegiate environments.  Looking across the data, they write “There is strong evidence that student responses to questions of ‘effectiveness’ do not measure teaching effectiveness.”  It’s clear that there’s room for improvement in how we evaluate quality and effectiveness.

Recognizing the problem is only half the battle.  If we wanted better measures of quality instruction, what would they look like?  Please understand that I’m not advocating any widespread bureaucratic system of evaluation like some K-12 educational systems have adopted.  But I think a larger discussion needs to happen.  In online settings, we have organizations like Quality Matters (QM), which has developed comprehensive rubrics for evaluating online course construction.  The rubrics include a condensed set of observables that QM feels every online class should have.  The challenge, however, is that while the QM rubrics identify elements of quality instruction, they don’t capture whether learning is occurring.  It’s the age-old dichotomy between outputs and outcomes.  The rubrics examine outputs, factors that are observable and countable (identifying clear student learning objectives, for instance), but not outcomes.  From my point of view, the QM standards are a step in the right direction but we still have a distance to travel.

In my travels recently, I came across an evaluation system developed by Kirkpatrick that identifies four levels of evaluation.  The system was developed for use in training systems so it’s not directly applicable to our roles in higher education.  But the levels can provide some fodder for conversation.  In Kirkpatrick’s model, the levels of evaluation include:

Level 1:  Reaction –  Evaluations at this level examine the satisfaction of participants, whether learners felt engaged in the learning process and whether the content was relevant.  In Kirkpatrick’s framework, this is the lowest level of the evaluation hierarchy, yet this is where most of our current student evaluations reside.

Level 2: Learning – As instructors, this is where I hope most of us spend our time.  We want to see whether students have learned the content we’re teaching.  Systematic evaluations rarely focus on this level unless students are working towards some larger certification.  How do we use student learning in our courses as a metric for examining quality and effectiveness?

Level 3:  Behavior – In Kirkpatrick’s framework, this level of evaluation examines whether learners have changed behavior by participating in the instruction.  This may be hard to capture at the collegiate level but it may be helpful for instructors to think about the behavioral implications of our classes.  How can we capture behavioral aspects from our instruction?  How can we use this data to communicate instructional quality?

Level 4:  Results – This one may be even harder to conceptualize in collegiate environments.  In Kirkpatrick’s framework, this level examines the measurable results (reduction in injuries, improved morale, etc.) that come about from the training.  Again, while not a perfect fit for higher education, it might be instructive to consider.  What are the tangible results from our classes? Can they be measured?

I honestly don’t know the answers to these questions, but I think we need to start thinking more broadly about instructional quality and leave behind the ineffective student evaluations.  We also need to consider how we can develop ways to not just measure quality more effectively but to foster it more comprehensively across our campuses.  After all, it’s hard to promote quality when we don’t know what it looks like or how to effectively measure it.

