Students, Studying, and Multiple-Choice Questions

Multiple-choice questions are not the pariahs of test questions. They can make students think and can measure their mastery of material. But they can also do little more than measure memorization, and memorizing is usually an easier option than thinking and truly understanding.

Biologist Kathrin Stanger-Hall wondered what would happen if she changed the exam format in her large introductory biology courses. Would a change in format, specifically the inclusion of short-answer questions, affect how students studied? Would having to construct responses improve scores on the final exam? And, perhaps most important, would it improve performance on multiple-choice questions that tested higher-level thinking skills?

She answered those questions in a study that compared two large sections of an introductory course for biology majors. In one section (282 students), students were assessed with four traditional 50-question multiple-choice exams, with 25 to 30 percent of the questions assessing higher-level thinking skills (levels three to five of Bloom's taxonomy). This section is subsequently referred to as the MC section. In the second section (192 students), the exams included 30 multiple-choice questions (with the same percentage testing higher-order thinking) plus three or four short-answer (SA) questions. This section is subsequently referred to as the MC+SA section. In addition to data from the exams themselves, students in both sections completed four online surveys.

The survey items used in this study are particularly interesting. Students reported how many hours a week they were studying for their science classes generally and for this biology course specifically. They were also asked to identify which study behaviors they were using. The study behaviors they were to select from included two categories: (1) “cognitively passive” behaviors typical of surface learning (e.g., I came to class, I read the assigned text, I rewrote my notes, I made index cards, and I highlighted in the text); and (2) “cognitively active” behaviors typical of deep learning (e.g., I reorganized the class information, I wrote my own study questions, I tried to figure out the answer before looking it up, and I closed my notes and tested how much I remembered).

Analysis of the data revealed a variety of intriguing findings, only a few of which are highlighted here. For starters, students in both sections reported studying, on average, almost exactly the same amount of time—and that result held each time the survey was administered. They reported studying about three and a half hours per week, which the researcher/instructor notes was significantly less than the six hours she recommended.

As for how they were studying, students in both sections again reported using about the same number of cognitively passive strategies, and that didn't change throughout the semester. But the number of cognitively active strategies used in the MC+SA section increased significantly from the beginning of the semester to the second exam, and again from the second to the fourth exam.

Final exam scores also showed significant differences. The final exam, taken by students in both sections, included 90 multiple-choice questions as well as questions requiring constructed responses. Students who had been tested with the MC+SA format scored significantly higher on the final than those tested with the MC-only format (67.34 percent and 63.82 percent, respectively). Students in the MC+SA section scored significantly higher on the multiple-choice questions, and the difference “was mostly due to significantly better performance on the higher-level MC questions.” (p. 300) In fact, the MC+SA students significantly outperformed the MC section on all final exam measures. So although the change in test format did not increase study time, it changed how students studied, improved their exam scores, and promoted development of their higher-level thinking skills.

Stanger-Hall made students in both sections aware of her goal to help them develop higher-level thinking skills. She told them 25 to 30 percent of the questions on the exams would test their thinking at these higher levels, and she included a number of activities (skeleton templates for writing their own study questions and use of higher-level clicker questions, for example) designed to prepare students to study for and answer these questions. But despite this support and the fact that students in the MC+SA section learned significantly more, “they did not like being assessed with CR [constructed response] questions. In the anonymous end-of-semester class evaluations, the students in the MC+SA section rated the fairness of grading in the course much lower than did the students in the MC-only section.” (p. 302) Students in both sections objected to the emphasis in the course on higher-level thinking skills. One student wrote that the instructor should “just teach biology” rather than focus on thinking skills.

Reference: Stanger-Hall, K. F. (2012). Multiple-choice exams: An obstacle for higher-level thinking in introductory science classes. CBE—Life Sciences Education, 11(3), 294-306.

Reprinted from Students and Multiple-Choice Questions. The Teaching Professor, 26.10 (2012): 4.

Comments

  1. Laura S.

    I had to cringe at the student comment at the end: "just teach biology" rather than focus on thinking skills!!!
    But I also have to wonder about those exam results. Seems to me that a mere 4% higher is not "significantly higher" and a class average of less than 70% is not good enough. What are "constructed response" questions on a MC exam?

    1. Ian

      Laura, the "constructed response" questions were the short-answer (SA) questions. They did not appear on the MC-only exams.

      1. Laura S

        Thanks for clarifying the CR questions (I must have misread the context). I also give students exams using CR questions (short response essays of about a paragraph). I find that they tend to do better on such exams, mostly because they can get partial credit, whereas on an M/C exam the answer is either right or wrong with no middle ground. I would think that if we pointed out this advantage of CR-type questions to students, they might appreciate them more.

  2. nancy s

    I wonder how the researcher determined the test should have 25-30% of the questions at higher-level thinking skills? Anyone know?
