This presentation describes undergraduate medical examinations. It discusses the strengths and weaknesses of various examination methods. The
methods covered are those that the author thinks are the best: multiple choice questions, short answer questions, the objective
structured clinical examination, and the log book. There are other methods that have deliberately not been discussed. The
presentation also covers writing and validating examination questions, appointment of examiners, examination regulations,
statistical adjustment of scores, transparency of examination results, and student appeals.
1.0 TIMING OF ASSESSMENT
Description of continuous assessment: Continuous assessment occurs throughout the semester. The results of the
consecutive assessments are cumulated and are used either singly or in combination with another method of assessment to grade
the student as performing satisfactorily or unsatisfactorily. The assessment can be in the form of observing and recording
student participation in and contribution to classroom activities. It may take the form of a short test or quiz at the end
of a module of study. Tests could also be given at fixed intervals, for example weekly, monthly, or quarterly. Another form
of assessment is a project undertaken throughout the semester, with its various stages assessed as they are completed.
Strengths and weaknesses of continuous assessment: Continuous assessment is the best form of assessment because
it has several strengths. It keeps the student on his toes all the time instead of relaxing and waiting for the end of the
semester to study hard and prepare for an examination. It takes away the stress of the examinations given in a short period
of time at the end of the semester. Another advantage of continuous assessment is the immediate feedback that the teacher
receives about the results of the assessment. The teacher can then take measures to address any deficiencies in student understanding
of the material taught. The results of continuous assessment represent the work done in the whole semester because students
are tested on each unit as it is taught. A major disadvantage of continuous assessment is that students do not have an opportunity
to integrate knowledge acquired in the whole semester.
1.2 END OF SEMESTER ASSESSMENT (SUMMATIVE ASSESSMENT)
End of semester
assessment is the traditional form of assessment. Students put in extra effort towards the end of the semester to prepare
for examinations. It is the opposite of continuous assessment. Its weaknesses are the strengths of continuous assessment.
Its strengths are the weaknesses of continuous assessment. End of semester assessment has a big potential for biased assessment
because it is selective. Only a few items of what was covered in the semester are examined. A weak student who happens to
be good only on the items that were asked could appear to perform well. A strong student who for some reason failed to prepare
adequately for the items selected by the examination could be assessed as weak. Students can, using cumulative experience
of previous examinations, spot or guess what is likely to be asked and prepare for that while ignoring the rest of the curriculum.
End of semester assessment is, however, less open to examiner bias than continuous assessment,
especially when the continuous assessment is not in an objective form such as multiple choice questions.
2.0 ASSESSMENT MEDIA
2.1 ORAL ASSESSMENT
Oral examinations can be a very good means of assessment but are not popular. The reason is the suspicion of examiner bias. I think, however,
that we can develop the oral examination to overcome these biases. Many examination questions can be written on index cards
and a candidate is asked to pick a certain number of cards at random. He is then given some time to prepare the answer. The
marking carried out by 2 examiners can be based on pre-determined model answers. When the candidate mentions the points expected
he gets credit and loses credit for failure to mention expected points. The examiners have an opportunity to cross examine
the candidate if the response is vague and it is difficult to determine whether the candidate knows the facts or not. To have a permanent record of the examination in case a third examiner will be needed
especially on appeal by a failing candidate, the examination can be recorded on video tape or audio tape.
2.2 WRITTEN ASSESSMENT
Written examinations are the most popular form of assessment. They enjoy several advantages. The candidate has time to think about the questions.
If the first question is not clear at the start, the candidate can leave it and try the next question while thinking about
the first one. The candidate also has the opportunity to reread previous answers and may correct or improve them. The written
examination provides a permanent record that can be reviewed later in case of disputes or for purposes of quality control.
The disadvantage of a written examination is that it assesses both the literacy of the student and knowledge
of subject matter. Students who are knowledgeable may not be able to demonstrate their knowledge fully because of limitations
in expressing themselves in writing.
Assessment of skills acquired is best done by asking the candidate to perform or demonstrate skills taught or acquired. Likelihood of examiner
bias is decreased by using a structured assessment. Points of assessment are written in advance and as the student performs
according to expectation credit is given.
3.0 FORMS OF ASSESSMENT
3.1 MULTIPLE CHOICE
The multiple choice
question method of assessment is in my view one of the best assessment methods if used correctly. Its major advantage is that
it enables separation of ability to express oneself in written or spoken language from knowledge of facts. The facts are written
and the student need only separate the true from the untrue. MCQ examinations are easy to mark with results being available
a few minutes after the examination by using optical scanning technology. A disadvantage of the MCQ technique is that students
can identify the right alternative by using logical exclusions and sometimes by ‘gut decisions’ when they actually
do not have full knowledge or understanding.
MCQ is avoided
by many examiners because it is difficult to write good MCQ questions. Some avoid MCQ questions because of lack of familiarity.
Others do not like them because they are used to more traditional forms of examination.
3.2 PROBLEM BASED
Problem based questions (PBQ) are the ideal
in clinical or quasi-clinical medical examinations because they simulate the actual diagnostic and management processes that
occur in the clinical situation. In this form of assessment a candidate is given background information and he is required
to identify the problem. The initial information is not adequate to define and describe the problem fully and the candidate
has to formulate various hypotheses. Then more information is released progressively. With each new release of information the
candidate is able to eliminate some hypotheses until only the most likely one remains. The examination is not confined
only to formulating and eliminating hypotheses. Questions on facts related to the problem under study may be asked.
3.3 SHORT ANSWER
Short answer questions
have largely replaced the traditional essay form of assessment. The candidate is asked several short questions each requiring
an answer of about a paragraph. The questions could be free-standing and unrelated to one another. A preferred approach is
to provide the candidate with a background to provide a context. Then 5-10 questions are asked all related to that one context.
3.4 OBJECTIVE STRUCTURED CLINICAL EXAMINATION (OSCE)
OSCE has largely
replaced the traditional clinical examination in which a candidate is given a long or a short case to take a history, undertake
a physical examination, reach a provisional diagnosis, and discuss the findings with the examiner. A major disadvantage
of the traditional method was that candidates did not get comparable cases. Some would pass by doing well on relatively easy
cases whereas others would fail by being given more difficult cases. The examiners for different candidates were also not
the same, raising issues of objectivity, comparability, and fairness. The OSCE approach seeks to overcome these disadvantages
by presenting all candidates with exactly the same clinical problem and being examined by the same examiner on each item in
An OSCE examination
consists of 10-20 stations each lasting 5-10 minutes. The purpose of OSCE is to test clinical skills as well as communication
skills. The candidate is given clear written instructions of what to do. The examiner observes the candidate with minimal
interference. If there are any questions they must be standardized for all candidates. The station should cover as few skills
as possible. A check list of items to be marked is provided to the examiner. Any necessary equipment and supplies are made
available. The patient may be real, a simulated patient, or a mannequin. Simulated patients are given clear written instructions
and background information. An attempt is made to make them as real as possible. Simulated patients have an advantage over
real patients in that all candidates are presented with the same standard situation. Items are scored 2, 1, or 0.
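The checklist marking described above can be sketched in a few lines of Python. This is only an illustration: the text states that items are scored 2, 1, or 0, but the meaning attached to each score, the item names, and the conversion to a percentage are assumptions made for the example.

```python
from typing import Dict

# Scale taken from the text: each checklist item is scored 2, 1, or 0.
# Assumed meaning: 2 = done well, 1 = partly done, 0 = not done.
ALLOWED_SCORES = {0, 1, 2}


def station_score(checklist: Dict[str, int]) -> float:
    """Return the station score as a percentage of the maximum possible."""
    if not checklist:
        raise ValueError("checklist must contain at least one item")
    for item, score in checklist.items():
        if score not in ALLOWED_SCORES:
            raise ValueError(f"item {item!r} has invalid score {score}")
    maximum = 2 * len(checklist)
    return 100.0 * sum(checklist.values()) / maximum


# Hypothetical items for a blood pressure station (invented for illustration).
example = {
    "introduces self to patient": 2,
    "positions cuff correctly": 1,
    "reports reading accurately": 2,
    "explains result to patient": 0,
}
print(station_score(example))  # 5 of 8 possible points
```

Because every candidate at the station is marked against the same items, the scores are directly comparable across candidates.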
3.5 LOG BOOK
Students in clinical attachments are asked to maintain a record of all cases seen and what they did with them. The log book is then
assessed by the examiners during and after the period of clinical attachment. They can reach an opinion if the student obtained
sufficient clinical experience. The log book has to be an authentic record. Some bad students, and these are usually very few,
can cheat by recording cases they did not see or ‘creating’ details about cases seen that are not true. It is
therefore necessary that the log books be written up immediately and be available for inspection at random. If the log book
is examined as soon as the record is made, the examiner has the option to go to the ward and see that the facts written about
a particular case are correct. In this way cheating can be discouraged. Another
way of discovering cheating is to question the student about the case. Inconsistencies can be discovered very easily by an
4.0 TRAINING IN QUESTION WRITING
Writing good examination
questions is not easy. Lecturers need to attend regular workshops during the year to upgrade skills in question item writing.
These workshops should be practical and hands-on. Participating lecturers should write question items that are critiqued by
colleagues during the workshop. The workshop should be moderated by a person with experience in question item writing but
the level of expertise needed may not be too high because the workshop is essentially learning from one another. Besides critiquing
questions of fellow lecturers the workshop can also critique examination questions of other universities and examination bodies
that are readily available on the internet. This critique will give the workshop participants a benchmark against which to
5.0 WRITING QUESTIONS
There are many
approaches to writing questions and we cannot prescribe any one approach. The author’s preferred method is that the
lecturer writes questions immediately after teaching a topic while the material taught is still fresh in the mind. It is even
preferred that he has the teaching material in front of him to make sure that everything asked was actually taught. The question should also mirror the way the material was presented. These questions
are accumulated so that at the end of the semester the lecturer has a wide range of questions to choose from. This type of
examination setting is thus very individualized and quite customized and differs from that of public examinations (in schools
or professional bodies) in which the writers of the questions are not the same as the teachers.
6.0 VALIDATING AND BANKING QUESTIONS
The best practice
is to have a bank of examination questions from which questions for each examination are selected randomly. Building up a
question bank takes time: about 5 years pass before a faculty has a respectable one. Questions for each examination are
discussed thoroughly by members of that department. The discussion should include trying to answer them from the students’
point of view to discover inconsistencies and vagueness. The questions should then be tested on students either in continuous
examinations or in end-semester examinations. The questions are graded in terms of ease and hardness according to student
performance. Questions that are consistently answered wrongly by many students should be identified and reasons found for
the failure of the students. The questions could be changed or modified, after which they are put in the bank. The question
bank must be renewed continuously as the curriculum changes and as the type of material taught changes with growth of scientific
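
The grading of questions by student performance and the random selection from the bank can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the proportion of correct answers is used as the measure of ease, and the cut-offs of 0.3 and 0.8, the function names, and the question identifiers are all assumptions.

```python
import random
from typing import List, Optional


def facility(responses: List[bool]) -> float:
    """Proportion of students who answered the question correctly."""
    if not responses:
        raise ValueError("no responses recorded")
    return sum(responses) / len(responses)


def grade(p: float, hard_below: float = 0.3, easy_above: float = 0.8) -> str:
    """Label a question by its proportion correct (cut-offs are assumptions)."""
    if p < hard_below:
        return "hard"
    if p > easy_above:
        return "easy"
    return "moderate"


def draw_paper(bank: List[str], n: int, seed: Optional[int] = None) -> List[str]:
    """Select n distinct questions at random from the bank."""
    rng = random.Random(seed)
    return rng.sample(bank, n)


# Hypothetical question identifiers and response records.
bank = ["Q1", "Q2", "Q3", "Q4", "Q5"]
q1_responses = [True, False, False, False]  # 1 of 4 students correct
print(grade(facility(q1_responses)))        # labelled "hard": review before banking
print(draw_paper(bank, 3, seed=1))
```

A question labelled hard is not necessarily a bad question; as the text says, the reasons for the students' failure have to be found before the question is changed.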
7.0 SELECTING EXAMINERS
7.1 INTERNAL EXAMINERS
Teachers at the
university level enjoy academic freedom. This means that they have the right to teach what they want and in the way they want
it. They also have the right to examine or assess their students in the way they like. In practice there are policies and
procedures for examinations set by each faculty. These however do not violate the principle of academic freedom because the
teachers took part in formulating those policies and procedures. Each examination should have internal examiners who are the
lecturers who taught the subject and it is they who must set the examination questions. The questions should be vetted by
colleagues in the department. The process of vetting should essentially be feedback to the person who formulated the questions
so that he can go back and improve them. The process of critiquing and rewriting questions can be done several times until
the questions are perfected. At no stage in this process should the lecturer be marginalized. When the question is formulated
in the final version, the lecturer responsible is asked to write a list of points that would be accepted as a correct response.
The purpose of the model answer is to cross check on the suitability of the question. If the model answer is incongruent with
the question then we can know that there is something wrong with the question. The lecturer who wrote the question is also
the one to mark it unless there are overwhelming numbers of students necessitating more than one lecturer marking the question.
In any case all those who mark must be lecturers who teach that specific course to the candidates. It is a major mistake to
take the model answer and give it to anyone who is not a teacher of the course to mark. This could result in unfair assessment
of the candidates.
7.2 EXTERNAL EXAMINERS
As part of quality
control and benchmarking, lecturers who teach the subject from other universities can be asked to be external examiners. The
external examiner must be involved in the process of writing the questions. He must review the questions and give his input
before they are finalized. He can then mark the questions alongside the internal examiners. The marks awarded
per item by the internal and external examiners are compared. Where a wide divergence is seen, the 2 should have a conference
to establish the cause of the difference. They can, through discussion, reach a compromise. The external examiner
is also expected to submit a written evaluation of the examination process with recommendations for improvement.
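
The item-by-item comparison of internal and external marks could be done along the following lines. The text does not say how wide a divergence must be before a conference is needed, so the tolerance of 2 marks is an assumption, as are the function name and the sample marks.

```python
from typing import Dict, List


def divergent_items(
    internal: Dict[str, float],
    external: Dict[str, float],
    tolerance: float = 2.0,  # assumed threshold; each faculty would set its own
) -> List[str]:
    """Return the items on which the two examiners differ by more than tolerance."""
    if internal.keys() != external.keys():
        raise ValueError("both examiners must mark the same items")
    return sorted(
        item for item in internal
        if abs(internal[item] - external[item]) > tolerance
    )


# Hypothetical marks out of 10 for one candidate.
internal_marks = {"Q1": 7.0, "Q2": 5.0, "Q3": 9.0}
external_marks = {"Q1": 6.0, "Q2": 8.5, "Q3": 9.0}
print(divergent_items(internal_marks, external_marks))  # items needing discussion
```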
8.0 EXAMINATION REGULATIONS: STANDARD SETTING
Each faculty should
have written examination policies and procedures. These could cover, inter alia,
the following matters:
- Appointment of examiners
- Minimum attendance
- Absence from examinations
- Allocation of grades and passing mark
- Supplementary examinations
- Instructions to candidates
- Breach of examination regulations
- Release of results
- Appeal process
9.0 STATISTICAL ADJUSTMENT OF SCORES
Despite the best
efforts to make examinations comparable in hardness from year to year, examinations in one year may turn out to be more difficult
than other years. For comparability adjustments may be made to the passmark and
the actual scores. This has to be done to ensure justice and fairness for the students.
Setting the passmark
decides the proportions of false negative and false positive results as shown in the table below:
There are several
ways of setting the pass mark. One is to fix the percentage of candidates who pass, let us say 90%. Candidates' marks are then arranged in descending order and
the bottom 10% fail irrespective of their score. This method is used by certain professional examinations but is patently
A second method
is to set a criterion for passing by fixing a mark above which a candidate passes and below which a candidate fails. The criterion
of 50% has traditionally been used. In some cases faculties of medicine have set a higher criterion like 60% to ensure higher
standards. There is yet another system in which the passmark remains 50% but any score below B (80-89%) or C (70-79%) requires
that the examination be repeated.
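
The difference between the first two methods can be seen in a short sketch. It is illustrative only: the candidate labels and marks are invented, a mark equal to the criterion is assumed to pass, and tied marks at the cut-off are not given any special treatment.

```python
from typing import Dict


def pass_by_criterion(marks: Dict[str, float], pass_mark: float = 50.0) -> Dict[str, bool]:
    """Second method: a candidate passes on reaching a fixed mark."""
    # Assumption: a mark exactly equal to pass_mark counts as a pass.
    return {name: mark >= pass_mark for name, mark in marks.items()}


def pass_by_fixed_percentage(marks: Dict[str, float], pass_fraction: float = 0.9) -> Dict[str, bool]:
    """First method: a fixed fraction of candidates passes, whatever their marks."""
    ranked = sorted(marks, key=marks.get, reverse=True)  # descending order of marks
    n_pass = round(pass_fraction * len(ranked))
    passed = set(ranked[:n_pass])
    return {name: name in passed for name in marks}


# Hypothetical marks for ten candidates.
marks = {"A": 91, "B": 85, "C": 78, "D": 74, "E": 70,
         "F": 68, "G": 66, "H": 64, "I": 63, "J": 62}

print(pass_by_fixed_percentage(marks)["J"])  # False: bottom 10% fails despite scoring 62
print(pass_by_criterion(marks)["J"])         # True: 62 is above the 50% criterion
```

With these marks the same candidate fails under the fixed percentage and passes under the criterion, which shows how strongly the choice of method affects the outcome.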
There are other
methods of determining the pass mark, such as the Angoff method (pass mark is judged as the score of a borderline
student) and the Ebel method (pass mark for each question based on level of difficulty).
I do not recommend these because they are laborious to implement, involve subjective judgment that could create
unintended bias, and lack consistency from year to year.
The marks could
also be statistically adjusted using a normal curve. The normal curve is the most objective and reliable method. Using the
mean and the standard score of each candidate, the marks can be fitted on a normal curve whose mean is either higher or lower
than that of the candidates' exam marks.
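
One way to read this adjustment is as a linear transformation of standard scores: each mark is converted to a standard score, which is then placed on a curve with the desired mean. The sketch below follows that reading, which is an assumption and not necessarily the exact procedure intended. The function name and sample marks are hypothetical, and the spread of the marks is kept unchanged unless a target standard deviation is supplied.

```python
from statistics import mean, pstdev
from typing import List, Optional


def rescale(marks: List[float], target_mean: float,
            target_sd: Optional[float] = None) -> List[float]:
    """Shift marks onto a curve with a new mean using standard scores."""
    mu = mean(marks)
    sigma = pstdev(marks)  # population standard deviation of this cohort
    if sigma == 0:
        raise ValueError("all marks are identical; standard scores are undefined")
    if target_sd is None:
        target_sd = sigma  # assumption: keep the original spread
    # Standard score z = (mark - mu) / sigma; adjusted mark = target_mean + z * target_sd
    return [target_mean + target_sd * (m - mu) / sigma for m in marks]


# Hypothetical marks from an examination that turned out harder than usual.
raw = [40, 50, 60]
adjusted = rescale(raw, target_mean=55)
print([round(a, 1) for a in adjusted])
```

The ranking of the candidates is unchanged by this adjustment; only the location (and optionally the spread) of the marks moves. Adjusted marks are not clamped to the 0-100 range in this sketch.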
10.0 TRANSPARENCY OF THE EXAMINATION PROCESS: MULTI-LAYER ENDORSEMENT OF EXAMINATION RESULTS
To assure fairness for all involved in the assessment process (lecturers and students), transparency must exist at all
levels of the examination. The examination questions must be vetted by a committee as explained above. One of the issues checked
is whether the questions reflect what was taught. In some cases the teaching material may be examined to ensure this. When
the examination is marked, the results are presented to a departmental or faculty meeting at which the breakdown of marks
for each candidate is available. All members of the meeting are free to raise any questions and seek clarifications. The answer
scripts should also be available in case someone wants to cross check. After endorsing the results at the department or faculty
levels, the results are next submitted to the university senate for endorsement. Only then can the results of the examination
be released officially.
11.0 STUDENT APPEALS
As a further measure
of transparency candidates should be given an opportunity to make appeals about their examination score. For this reason the
answer scripts or any recording of the oral examination should be kept for a period of not less than 3 years to enable remarking.
In case of an appeal the lecturer who marked the answer script is asked to recheck because he may find a mistake that can
be corrected easily. If he cannot resolve the matter another examiner or two who teach that subject are asked to recheck the
script. Opening the door to student appeals can open a floodgate that cannot be controlled easily because students tend
to think that they performed better than the marks they were awarded. One way of restricting the appeal process to only genuine
cases is to allow only those who failed to appeal. Another method is to tell the students that an appeal if allowed will nullify
the first grade and that the paper will be remarked with the possibility of either raising or lowering the mark. In some cases
the students may be asked to pay a small amount of money to submit their appeal.