For many academics, the phrase ‘evaluation of teaching’ conjures up the notion of student surveys which ask impertinent questions about the quality of lectures. This notion may be reinforced by institutional practices, or by rumours about such practices, but it embodies serious misconceptions about best practice in evaluation and about the nature of teaching itself.
The concept of a ‘research-led university’ is relatively new and its implications are still being unravelled, but one such implication should surely be that the university’s policies and practices should, wherever possible, be based on sound research rather than on myth and rumour.
Research into evaluation of teaching has produced an extraordinary number of publications from which a consensus has emerged on the main issues. This manual attempts to set out a number of the key elements of that consensus.
In this Introduction, we discuss the implications for evaluation of different conceptions of teaching, the purposes of evaluation and general methodological issues which will be treated further in subsequent chapters.
In recent years, there has been a re-examination of the nature of teaching in Higher Education. Ramsden (1992) provides a readable account of the main issues which have emerged. In particular, he notes two polar conceptions or definitions of teaching:
The first conception has been criticised as inadequate on a number of grounds but the only one which concerns us here relates to evaluation. The objection is that the definition focuses too narrowly on the role of the individual teacher in the classroom, the implication being that evaluation of teaching consists only in making judgments about the effectiveness of individual teachers in their role as instructor. Such judgments are a necessary part of comprehensive evaluation of teaching but are not sufficient.
Nevertheless, the fact that much of the evaluation literature refers to judgments about individual teachers in just one of their roles is testament to the ongoing hold the conception has in universities.
But consider the matter from the students’ perspective. The quality and quantity of their learning depend on rather more than the teacher’s input in the classroom. Access to a good library, access to computers which work, pleasant and appropriate working spaces and effective student support services are all obvious influences. Less obvious to the students perhaps, but no less significant, are Faculty and Department policies relating to curriculum design and assessment practices. Still less obvious may be institutional attitudes towards teaching, which may be reflected in promotion policies and in the level of financial and other support for teaching and learning.
When contemplating a comprehensive evaluation of teaching, it may, therefore, be more productive to consider the whole environment in which learning takes place rather than simply the contribution of the individual teacher (Figure 1.1).
Such consideration suggests a broadening of the definition of teaching to include all elements of the learning environment. For evaluation purposes, however, even this definition may not suffice: students may learn in even the poorest of environments.
We propose, therefore, that for the purposes of this manual, teaching should be defined as follows:
|Teaching is the creation and sustaining of an environment which promotes effective learning.|
An implication of such a definition is that any comprehensive evaluation of teaching should address at least the following elements:
The first answer to this question must be, ‘Why not?’ All other areas of academic life are continually evaluated both formally and informally—in particular research, where elaborate systems have been established to provide a means for awarding grants and for approving works for publication.
More specifically, there are many reasons why teaching should be evaluated:
The ultimate purpose of all evaluation of teaching ought to be the improvement of teaching and hence of learning. The results of evaluation provide a foundation for individual teachers, academic and support departments and the institution itself on which to base plans for enhanced outcomes. All other purposes derive from this one.
Curricula need to be evaluated on a regular basis because of changes to the composition of the student body, demands from government, professional bodies and employers and the constant need to revise course content to take account of advances in knowledge.
What am I doing well? How do I know? What do I need to do to improve my performance? The answers to these questions provide the basis for any systematic programme of personal professional development.
Society, through the medium of government and its agencies, has an undeniable right to be assured that university programmes are of the highest quality. Universities themselves have a legal and professional duty to ensure the quality of teaching. Evaluation is an essential component of quality assurance.
Universities maintain that they recruit high quality staff and retain them through policies and procedures which encourage professional growth. The recruiting and the promotions processes, therefore, should involve evaluation of performance in relation to potential and achievement respectively.
Administrative decisions relating to teaching programmes (including funding and priority setting) should, in the first instance, be made on educational grounds. This is impossible unless the decisions are made on a sound basis provided by evaluation.
At this point, it is useful to introduce the distinction between formative and summative evaluation. There have been many (not always compatible) definitions of these two terms but perhaps the most helpful approach has been to define the distinction itself. Harvey’s definition makes the point: ‘When the cook tastes the soup, it is formative evaluation; when the dinner guest tastes the soup, it is summative evaluation.’ (Harvey, 1998, p. 7)
More formally, Table 1.1 draws out some of the differences.
The major distinction between formative and summative evaluation is one of primary purpose, although the distinction is not entirely clear-cut. Thus, the results of a summative evaluation may themselves provide feedback which could lead to improvements in future programmes. On the other hand, the results of formative evaluations should not normally be used for summative purposes. One reason is that many formative evaluations seek to discover why innovations did not work as well as expected, and it is not incumbent on an institution, a department or an individual to advertise such problems. Other reasons will emerge in the chapters to follow.
Normally, formative evaluations are conducted during a programme. For example, lecturers may wish to evaluate the effectiveness of an innovation with a view to amending it if it is not working well. Classically, summative evaluations are conducted at, or towards the end of a programme and look backwards. Again, however, the distinction is not clear-cut. A summative evaluation of an institution’s or a department’s effect on the learning environment cannot be undertaken when all its programmes have ceased. The work of the institution and the department continues during and after the evaluation.
In principle, those (institutions, departments or individuals) who initiate a formative evaluation are themselves the evaluators because the primary purpose of the evaluation is to provide them with feedback on which improvements can be based. On the other hand, evaluations which lead to administrative or personnel decision making are usually commissioned by people external to the department or individual being evaluated. Yet again, the distinction is not entirely unambiguous, particularly in the instance of institutional evaluations where the institutions themselves may be the initiators.
When the cook tastes the soup, he/she is in the process of developing the final product. The outcome of the process is the completed dish which the guests evaluate summatively—does it taste nice? Again, the distinction should not be carried too far. The guests may well argue that the preparation process is not completed until each of them has the opportunity to add salt and pepper to taste!
Few people enjoy their work being evaluated and much opposition to the evaluation of teaching is based on fears of its consequences. Some of those fears may be assuaged by negotiating and establishing confidentiality ground rules before any evaluation takes place.
In general, such ground rules should ensure that the only people with access to the results of formative evaluations are the person(s) or bodies wishing to receive feedback. On the other hand, the results of summative evaluations will usually need to be seen by administrative bodies such as Promotions Committees, Departmental Review Panels and so on. In general, detailed results of teaching evaluations should be seen by as few people as possible; on the other hand, contributors to any evaluation have the right to know the results in broad terms. This issue, particularly where it relates to students, is discussed further in Chapter 6.
Formative evaluations may be quite informal, to the extent that feedback may be obtained in a variety of ways and in a variety of forms, beginning with chats with students or colleagues over a drink. Most academic staff, however, will wish for something more valid and reliable, which brings with it greater formality. For summative evaluations, a high degree of formality is needed because of the potential implications for individuals, Departments and institutions, and because of the requirement for natural justice.
Detailed discussion of evaluation methodologies will be found in the following chapters. There are, however, certain general points which can be made which apply to all evaluations.
Why are you undertaking the evaluation? What is it you want to find out, and why? Is it to be formative or summative? While summative evaluations can be used formatively, the reverse is not true. Are you intending to evaluate people or processes, teachers or courses, individuals or departments or institutions? Evaluations can have more than one purpose but, if so, extra care needs to be taken to make those purposes explicit to all stakeholders.
Each evaluation has a unique set of stakeholders who should be identified at an early stage. The list may include any or all of students, parents, employers, unions, academic staff, non-academic staff, the university, government agencies, professional bodies and department review panels. Where possible, stakeholders should be involved in the design and implementation of the evaluation. They should always be informed of the evaluation and its purposes. Where appropriate (as in settling confidentiality rules), the evaluation should be negotiated with unions (including student unions) and staff associations.
Ultimately, someone or some group of persons must take responsibility for the set of judgments which is the defining characteristic of all evaluations. In the case of a lecturer seeking feedback on a course, that lecturer is the evaluator. At the other end of the spectrum, the evaluator might be the Higher Education Authority. It may seem a simple matter of common sense to suggest that evaluators be formally identified, but failure to do so can lead to confusion and to opposition to the evaluation itself.
Do not, however, confuse evaluators with sources of evidence. A common confusion is to call student surveys of teaching ‘student evaluations’. Not so. The student opinions are one source of evidence used by the evaluator(s) in reaching a judgement.
Useful sources of evidence in any evaluation are those who can answer the questions that the evaluation asks. Setting appropriate questions is perhaps the most fundamental part of any evaluation and is related to its purposes. Thus, in evaluating an engineering curriculum, an appropriate question might be, ‘How relevant is the curriculum to the workplace?’ In this instance, useful sources of evidence would include graduates, careers officers, professional bodies and employers but would exclude undergraduate students who almost certainly have no experience of the workplace.
Appropriate methods will be related to the purpose(s) of the evaluation and usually to the budget available. They may be qualitative or quantitative or both.
Informal chats with students in a bar may cost only a few rounds of drinks and some time. Anything more sophisticated, however, may involve both direct and indirect costs which should be budgeted for. Student questionnaires, for example, require considerable time to design and significant costs to produce and scan; particularly if the evaluation is to be summative, the cost of administering them must also be taken into account.
It is both immoral and impolitic to commence an evaluation without thinking through its possible consequences. On the one hand, an evaluation may raise expectations which are impossible to meet: an institution-wide morale survey, for example, may reveal problems which the institution is in no position to address. On the other hand, an evaluation may draw attention to the shortcomings of individuals or groups. Unless support mechanisms are in place to assist those individuals or groups, the whole exercise is likely to be counterproductive. Of what use is it, for example, for a teacher to discover that 90% of students give him or her low ratings on some aspect of lecturing if there is no one to assist in identifying the problems and finding solutions?
A preliminary checklist for evaluating teaching is provided in Appendix B.