This got forwarded to me via one of the mailing lists I’m on: Judgment Day, a long piece from the New York Times Magazine this past Sunday. Among other things, it quotes a U of M researcher on student evaluations, and it tells the story of a non-tenure-track instructor/professor who was not hired back in part because of some bad evaluations.

The EMU system of student ratings of instructors was designed about 30 years ago, and is badly in need of a redesign.
This NYTMagazine piece is one of the more serious journalistic stories on student evaluations, but I have known of EMU administrators who, contrary to what the article says, thought ratemyprofessor.com was a suitable substitute for formal student ratings. Happily, those administrators have moved on and no longer have power at EMU. These officials also for a while tolerated a system that conflated the student ratings from one section of a class with entirely different sections of different classes, and moved to restrict student access to the student ratings of instructors. Those were the bad old days, 5 years ago, and while there’s been some progress, it has not been enough.
Plenty of evidence suggests that higher student ratings can be “purchased” with easy grades, or with other inducements, especially if the survey instrument used is badly designed, as is our decades-out-of-date 2-question instrument at EMU. (EMU has 2 campus-wide questions, and then departments can select a handful of other questions for their programs, and do so with a great deal of inconsistency and no clear standards.)
I’ve read dozens of the scholarly studies on student ratings, and several things are clear from that scholarly literature that are not brought out in the NYT Mag article: 1) students, overall, can rate how effectively an instructor conveys material to them, but they cannot accurately rate the instructor’s professional knowledge of the subject; 2) students as a group often tend to ignore what the course’s objectives are and instead place priority on how appealing the course is in other respects – like when it’s offered, the work load of the course, whether they like the subject; 3) any simple 2-question survey of student opinion on teaching quality is unlikely to produce reliable data; what’s needed are multiple questions probing the same issue in different words, so that the resulting data can be cross-tabulated for reliability; and 4) it is a serious mistake to put any weight on any given handwritten anonymous comment on these forms: such comments have no statistical validity whatsoever, and the extreme comments stand out (“my professor is a nutcase!”); attention to that kind of comment militates against the purpose of having a formal numeric evaluation system. (If a student cares to write an actual letter, signed, then that’s a different matter – it represents her or his views and should be considered, and it is no corruption of the formal, standardized evaluation system, unlike putting weight on anonymous handwritten comments made thru the formal system.)
The scholarly scientific merit of the current EMU 2-question survey is close to zero. Instead of tackling the real job of devising a better survey, the provost’s office ducked the issue in fall 2006 and has been ducking it ever since. The EMU-AAUP leadership has also ducked the issue, sad to say; and it is a matter that must involve the union, as it is a contractual issue. In my view, most EMU faculty favor an improved system and believe that the current system offers little.
There is no justification whatsoever for basing employment decisions about instructors on just the numeric ratings, yet that is widely done, sometimes even in cases when the absolute number of students making the ratings is too low to be statistically meaningful. Using numbers that are not statistically valid to support decisions that are career-shaping is entirely unethical, and the article mentions instances that seem to fit that description. Less of this happens now at EMU, I believe, than did not so many years ago, so there’s been progress.
Just as unethical, in my view, is EMU’s continued reliance on an outdated evaluation system that does not help instructors become better teachers. The good efforts of many people went into seeking an improvement circa 2004-06, only to be betrayed by the provost’s office, the then university president, the then student govt. president, and the EMU-AAUP leadership. It all crashed down, and we’re left with a status quo student evaluations system that has huge shortcomings.
Education First! Let’s fix the student evaluations system.
I also agree that the system has many shortcomings. I have been in classes where I found no enrichment at all and/or didn’t feel the instructor was helpful, but where most of the students in my class gave the instructor good ratings just because it was an easy A.
When I’ve had a good professor, I often feel almost as nervous as a professor must on evaluation day because I think about how many students carelessly fill out evaluations or base their scores not on an instructor’s devotion to the subject, attention to students, and ability to teach, etc. but on a grade they received.
I have had fantastic professors with whom I didn’t necessarily develop an extremely close relationship, or in whose classes I received grades lower than what I would normally expect from myself, and I still remained conscious of the fact that evaluations should not be biased by my disappointment in the grade that I earned or my personality differences with those professors.
Until this system changes (which, hopefully, will be soon), I would encourage my fellow students to remember that evaluations are not the time for revenge or for deciding whether or not you agree with an instructor’s personal opinions/beliefs.
Very wise and thoughtful observations, EMUHeather. Thank you for sharing! The worst thing about our antiquated system of student ratings is that it provides so little information by which instructors can meaningfully assess how to improve their teaching (and we can all improve!); as a result, it’s a situation in which EMU’s full potential goes needlessly unrealized.
Thanks again!
Well, you can customize your evaluations as an instructor, and I think you can do that as a department. That is, there are questions that have to be on all of these evaluations, and then there are questions that you can add. If you teach an online class, then you have access to a much larger selection of questions, which the instructor can customize quite a bit. So the instrument here is mostly dated, but not completely.
The good news, EMUHeather, is that it is unlikely that a bad evaluation from a student is going to be a make or break issue for most faculty. Not to go into all the details, but in my department, student evaluations are but one part of a measure for teaching effectiveness. As I used to say to students before they filled out their evaluations, you are not going to be able to get me fired and you are not going to be able to get me a pay raise just by an evaluation.
And to the extent that teaching evaluations are a factor for anyone’s job security, I’m of the opinion that there has to be a very clear pattern. This is why I wonder a bit about the claims Annmarie Bean makes in this article. I don’t know the details or the story behind the story here, but I have to think that if Bean wasn’t getting universally bad evaluations that there was something else going on.
Sitedad — I largely agree with you, but I also recall cases which I dealt with as a faculty grievance officer at EMU in which administrators tried to take severe actions against faculty members based solely or almost solely on student ratings of the instructor. One of these was a denial of tenure case; that colleague won tenure thru the grievance process. Other administrators have argued that a half decimal point on average ratings is sufficient cause to deny promotion. One Department Head pointedly told me that he was not obligated to read all the materials submitted by a faculty member applying for promotion since, in the Dept Head’s estimation, the student ratings data alone were sufficient to conclude the professor did not merit promotion! Of course, the faculty contract says all materials submitted must be evaluated; but the relevant Dean backed up his Department Head.
Standards only work when they are followed.
As for the self-selected questions that a prof can opt for: yes, that’s doable at EMU, depending on exact circumstances (such as sections, how often you want to change them, etc.); but that still does not add up to a survey instrument that has been designed by experts in assessing instructional effectiveness; it won’t produce a set of questions that reflect state-of-the-art assessment tools, but rather questions that strike an instructor’s fancy; and I doubt many of us on the faculty have the expertise to select the right combo of questions to get the cross-tabulated data needed to be a meaningful metric.
Education First! Fix the system.
The danger with student evaluations, in my opinion, is when they are viewed as “the” data rather than as a piece of the data.
From my experience, I believe the student evaluations *often* convey a *pretty good* estimate of the overall quality of the instructor. Please note the two qualifications in the previous sentence. When I look at people I believe to be good teachers, their student ratings are *usually* strong. When I look at people I do not believe to be strong teachers, their student ratings *usually* reflect this.
There are exceptions. Many faculty, for example, might complain that their evaluations are worse than a colleague’s because they have higher expectations of their students, and demand more. The solution there, in my judgment, is more data. The tenure and promotion process here allows faculty to offer their own argument for why their work in the classroom merits high ratings from evaluators, even if the student ratings may not support it. So, if someone believes the student ratings do not reflect their qualifications as a teacher, I think the solution is to offer more data, such as:
1. Proof of the rigor of the assignments;
2. Examples of how the instructor is a reflective, devoted teacher (such as lesson outlines, detailed syllabi, teaching journals);
3. Statements of teaching philosophy;
4. Samples of student work to show that learning has taken place;
5. Peer observation reports.
All of these are reasonable, albeit imperfect, sources of data. Presenting these data is fair game under the EMU evaluation system (although one might need to get permission to present student work).
Relying solely, or even primarily, on student evaluations is problematic. The truth often lies in the triangulation among imperfect sources.
I ran across these postings by accident looking for Bruce Springsteen (!) and had to add my two cents worth. Mark, I was with you on the Student Evals task force (or whatever it was called) for two years (!) from 04-06, when our contractual authorization expired. We examined (exhaustively) the literature, the experience of other schools, etc. etc. We all agreed that our instrument was flawed, we agreed that students have NO CLUE how evals are used in tenure/promotion decisions and as a consequence may behave irresponsibly, we even came up with some pretty good ways to fix things. Last I remember, a report was to be made to Bob Neeley, who, we were assured, was vitally interested in our findings and would take any action necessary. End of story. I never heard another thing – the whole enterprise sank without a trace. Jeff is quite right about the need for “triangulation” and of course, that is exactly what the contract calls for. But when he says that “usually” student evals confirm his opinions about the teaching abilities of people he “looks at,” I have to wonder what it is he thinks he sees. Good teachers come in all sizes (fortunately!), and sometimes even good teachers are not so good. We all talk about easy grades as a way to improve evals, but it is far more complicated – context is all. The size of the class, the composition of the class, the time of day, required or elective, the popularity of the subject matter, the race and gender of the teacher, are all variables that impact student evals and are generally outside the control of the instructor. The literature is pretty clear – student evals can help us identify truly gifted teachers and really abysmal teachers – but can’t distinguish among those in between.