The attempt to create “higher level” multiple choice questions does not always result in valid assessment. In fact, I’m beginning to wonder whether this kind of close-ended, limited-choice question can legitimately test anything beyond simple analysis.
In secondary school, I’ve seen students who could easily answer multiple choice questions aimed at synthesis and evaluation (according to Bloom’s taxonomy), yet who stumbled on lower level questions. They had picked up the pattern of how the distractors were written… the very distractors supposedly designed to require higher level thinking. They struggled on the lower level questions because test makers, desperate for plausible distractors, resort to near-duplicates: two words that differ by a single letter, or two identical numbers that differ only in the placement of a decimal point. Students misread these and get easy questions wrong. In that case, are we testing comprehension or visual acuity?
Even on an objective-based test, there are students who can perform a particular task and still miss the questions testing it. In math, for example, if a question is vaguely worded and requires extensive vocabulary, a student who has mastered the math skill can still miss it. I have found that when this type of question is read aloud, students answer correctly more often. So are these questions testing higher level thinking in math, or reading skills? What are we really testing?
Research-based formulas for writing questions are widely used in education. However, the test questions themselves are rarely subjected to any real evaluation. The best way to determine efficacy is to “test the test.” When developing training for industry, we used an alpha review of courses and tests that required problem-solving and the application of theories. Two or three students, matching the average age and background of those to be trained, would provide feedback. Many invalid questions were caught during alpha review, as students flagged ones that were confusing or had multiple defensible answers. Then a beta review was conducted with 50 to 100 students to provide statistics on validity and reliability.
However, in the educational test industry, “testing the test” is often done as part of the actual administration. Because the stakes are so high for schools, there is great fear of letting any but a select few see the standardized tests before the release date. Even on exam day, students are warned not to discuss the questions with anyone afterward. Of course, they still do. But does that really affect how well students perform, or only how well they think they have done?
What is the educational system gambling on when it depends on this kind of multiple choice testing? Whether the test results adequately mirror students’ actual skills.