This Appendix is not intended to be a complete analysis of technical problems associated with conducting student surveys. It deals only with issues we have encountered and which we think are important.
Sooner or later, the conversation at the committee meeting or in the faculty lounge turns to student ratings of instructors. It’s a sure bet that within six seconds, someone will announce that ratings are meaningless: students don’t know enough to evaluate the quality of their instruction … What is interesting is that these assertions are invariably offered without a scrap of evidence by individuals with well-deserved reputations for analytical thinking. If someone offered such unsupported arguments in a research seminar, most of us would dismiss both the arguments and the arguer out of hand. In discussions of teaching, however, we routinely suspend the rules of logical inference without a second thought.
— (Feldman, 1992)
Feldman goes on to analyse a number of myths which seem to be almost universal. The points below use the terminology we have adopted in this manual rather than Feldman’s.
Believers in the myths should simply read the comprehensive reviews of the approximately 2000 research projects about the evaluation of teaching written by Cashin (1995), Marsh (1987) and Murray (1980).
Marsh concludes his most thorough review with these words:
Research described in this article demonstrates that student ratings are clearly multidimensional, quite reliable, reasonably valid, relatively uncontaminated by many variables often seen as sources of potential bias, and are seen to be useful by students, faculty, and administrators. However, the same findings also demonstrate that student ratings may have some halo effect, have at least some unreliability, have only modest agreement with some criteria of effective teaching, are probably affected by some potential sources of bias, and are viewed with some scepticism by faculty as a basis for personnel decisions. It should be noted that this level of uncertainty probably exists in every area of applied psychology and for all personnel evaluation systems. Nevertheless, the reported results clearly demonstrate that a considerable amount of useful information can be obtained from student ratings; useful for feedback to faculty, useful for personnel decisions, useful to students in the selection of courses, and useful for the study of teaching. Probably, students’ evaluations of teaching effectiveness are the most thoroughly studied of all forms of personnel evaluation, and one of the best in terms of being supported by empirical research.
—(Marsh, 1987) [our emphases]
There is no simple answer to the question of which is the best method of processing the data obtained from student surveys. Choice will depend on the size of the project and the resources available. Table A.1 provides a comparison between the three main options.
Thus, a teacher wishing to conduct a formative evaluation with a small class may be happy with slow scanning rates, especially as he/she can easily design the questionnaire without professional help. Where large classes are involved, on the other hand, the sheer number of questionnaires may make Optical Mark Readers a necessity. Currently, web-based questionnaires are only suitable for classes that meet in a computer laboratory, where they can be completed under supervision.
Most readers will be familiar with the following question format where ‘Strongly Agree’ is rated 5 and ‘Strongly Disagree’ is rated zero:
The lecturer speaks clearly:
There are two problems with this format:
A rather better format is:
The lecturer speaks clearly:
The lecturer speaks:
Note that here the anchors change with each question, which poses a problem for the OMR software developers. It is, however, generally possible to adapt their standard software to accommodate this better practice.
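Where each question carries its own anchor set, the scoring software must map the ticked anchor to a number per question rather than applying a single fixed scale. The following is a minimal sketch of that idea; the question identifier, anchor wording, and function names are illustrative assumptions, not taken from any actual OMR package:

```python
# Hypothetical per-question anchor scoring (illustrative only).
# Each question defines its own anchors, ordered from the
# highest-scoring response down to the lowest.
QUESTIONS = {
    "speaks": [
        "always clearly",
        "usually clearly",
        "sometimes clearly",
        "rarely clearly",
        "never clearly",
    ],
}

def score(question_id, response):
    """Map a ticked anchor to a numeric score: best anchor = 5, worst = 1."""
    anchors = QUESTIONS[question_id]
    position = anchors.index(response)   # 0 for the best anchor
    return len(anchors) - position      # 5, 4, 3, 2, 1

def mean_score(question_id, responses):
    """Average numeric score for one question across all respondents."""
    scores = [score(question_id, r) for r in responses]
    return sum(scores) / len(scores)
```

Because the mapping is looked up per question, adding a question with different anchors only requires adding an entry to the table, which is the kind of adaptation the standard software needs to support.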