Testing

One of the most important parts of a successful learning experience is the opportunity for learners to demonstrate their understanding of the facts and processes they are learning and receive feedback from their instructors. At the same time, you can learn how effective you have been in facilitating learning for your students. Use this information to revise your instructional practices.

Selection of Test Material

The selection of material to be tested should be based on learning objectives for the course; however, the complexity of the course material associated with those objectives (and the limited time for taking exams) means you can only sample the material in any given unit or course. Writing good exam questions requires plenty of time for composition, review, and revision.

Jot down a few questions after class each day when the material is fresh in your mind.

The exam will be more likely to reflect your teaching emphases than if you wait to write all the questions later.

Ask a colleague to review the questions before you give the exam.

Another teacher might identify potential problems of interpretation or spot confusing language.

Analyze the results after students have completed a test.

The process of test development does not end when the students take the exam; careful analysis afterward will help refine your questions and sharpen your testing technique.

Use a two-dimensional chart or matrix when planning an exam.

Such a chart will help you select questions across the spectrum of learning objectives.

Example of a test planning matrix:

Topics     Learning Objectives
           Recall   Application   Analysis   Synthesis   Evaluation   % Wt.
Topic A
Topic B
Topic C
Topic D
% Wt.
  1. List in the left column (Topic A, B, C, and so on) the main topics studied during the preceding period, and in the top row the types of learning objectives you had established.
  2. In each cell, indicate what percentage of the whole test you want to devote to a particular learning objective/topic. You do NOT have to put something in each cell; some cells may be blank, indicating a 0% weight.
  3. Total the rows and columns to review the relative weight you put on each topic and on each level of learning.
  4. If you are satisfied with the weights and balance, start writing questions for each cell and allocate points to each category of learning. You may write one or several questions for each cell. If you are not satisfied with the balance, adjust the weights in the cells to achieve the desired balance, and then begin to write questions.
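The row and column totals in step 3 are simple sums, so they are easy to check with a short script. The following is a minimal sketch, not a prescribed tool: the weights are invented for illustration, and the names `WEIGHTS`, `topic_totals`, `objective_totals`, and `cell_points` are hypothetical.

```python
# Hypothetical planning matrix: each row is a topic, each column a learning
# objective, and each value is the percentage of the whole test for that cell.
# All numbers are invented for illustration; a 0 marks a blank cell.
OBJECTIVES = ["Recall", "Application", "Analysis", "Synthesis", "Evaluation"]

WEIGHTS = {
    "Topic A": [10, 10, 5, 0, 0],
    "Topic B": [5, 10, 10, 0, 5],
    "Topic C": [5, 5, 10, 5, 0],
    "Topic D": [0, 5, 5, 5, 5],
}


def topic_totals(weights):
    """Sum each row: the share of the test devoted to each topic."""
    return {topic: sum(row) for topic, row in weights.items()}


def objective_totals(weights, objectives):
    """Sum each column: the share of the test at each level of learning."""
    return {
        name: sum(row[i] for row in weights.values())
        for i, name in enumerate(objectives)
    }


def cell_points(weights, total_points):
    """Convert percentage weights into points for a test worth total_points."""
    return {
        topic: [w * total_points / 100 for w in row]
        for topic, row in weights.items()
    }


if __name__ == "__main__":
    rows = topic_totals(WEIGHTS)
    cols = objective_totals(WEIGHTS, OBJECTIVES)
    print(rows)
    print(cols)
    print("Grand total:", sum(rows.values()))
```

If the grand total is not 100, or one row or column dominates more than you intended, adjust the cell weights before writing questions, as described in step 4.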

Types of Tests

Different kinds of tests are appropriate in different settings. Performance testing, for example, is important when the learning goals involve acquisition of skills that can be demonstrated through action (e.g., music, theater, art, dance, medicine and physical education). Diagnostic testing is important at the beginning of a course or a given segment of a course to help you know the strengths and weaknesses of the learners so you can modify learning activities accordingly. Frequent short self-tests enable students to judge their performance while they are learning. More common in college settings, however, is the pencil and paper test used to measure a student's mastery of the subject matter.

From the standpoint of measurement, tests fall into two general categories: those in which students select the correct response from information provided on the test, and those in which students supply the answers themselves. Multiple choice, true/false and matching tests fall in the former category; completion, short answer and essay tests fall in the latter. The cognitive capabilities required to supply answers are different from those required to select answers, regardless of content.

Multiple-Choice Tests

Multiple choice is the most widely used selection-type test because it can cover such a wide range of instructional objectives. One major weakness of multiple choice tests, however, is the ease with which they can be structured to require only recognition or recall of information. Test designers should strive for questions that require application of knowledge as well as recall. For example, higher level multiple-choice questions can be based on interpretation of data presented in charts, graphs, maps, or other formats. All questions in a multiple-choice test should stand on their own; avoid questions that depend on knowing the answers to other questions on the test. Also, check to see if information given in some items provides clues to the answers of others.

Multiple choice questions consist of a stem and a number of response options. Following are some tips for constructing multiple choice questions.

The "stem" of the item, which poses a problem or states a question, should be written first. The basic rule for stem writing is that students should be able to understand the question without reading it several times or having to read all the options.
  • Write the stem as a single, clearly-stated problem. Direct questions are best, but completion statements may be used to avoid awkward phrasing.
  • If you do use a completion statement, place the blank at the end of the stem, never within it.
  • State the question as briefly as possible, avoiding wordiness and undue complexity.
  • In higher-level questions, the stem will normally be longer than in lower-level questions, but still strive for brevity.
  • Include as much of the item in the stem as possible, to keep alternative responses brief.
  • To test for mastery of vocabulary, use the term, not the definition, in the stem.
  • Stems are used for testing, not for teaching; two-sentence stems that convey information first and then ask for responses violate good practice.
  • State the question in positive form, if possible, because students often misread negatively-phrased questions.
Multiple-choice questions normally have four or five optional responses, to make it difficult for students to guess the correct answer. Only one option should be unequivocally correct; "distracters" should be unequivocally wrong. If you write items in which more than one answer is correct and the student must pick out all the correct responses, each item is essentially a set of true-false questions. There are two basic rules for writing responses:
  1. Students should be able to select the right response without having to sort out complexities that have nothing to do with knowing the correct answer.
  2. Students should not be able to guess the correct answer from the way the responses are written.
  • Write the correct answer immediately after writing the stem and make sure it is unquestionably correct.
  • Match incorrect options and the correct response in length, complexity, phrasing and style.
  • Increase the believability of the distracters by including extraneous information and by basing them on logical fallacies or common errors students have made in class.
  • Be sure all options are plausible; humorous throw-away options defeat the purpose of having multiple options, which is to reduce the possibility of getting the correct answer by chance.
  • Be sure all options flow grammatically from the stem. If an item reads poorly, students' confusion will yield results that are not measures of actual knowledge.
  • Use capital letters (A, B, C, D, E) as response signs rather than lower-case letters ("a" gets confused with "d," and "c" with "e" if the type or duplication is poor).
  • Try to write items with equal numbers of alternatives so students don't have to continually adjust to a new pattern.
  • Keep the number of alternatives at five or fewer. (The more alternatives used, the lower the probability of getting the correct answer by guessing. Beyond five alternatives, however, confusion and poor alternatives are likely.)
  • Randomly distribute correct responses among the alternative positions so there are no discernible patterns to the answer sequence and a nearly equal proportion of As, Bs, Cs, etc.
  • Never use trick questions — they have no legitimate testing function.
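Two of the tips above are mechanical enough to sketch in code: placing the correct response at a random position, and the relationship between the number of alternatives and the chance of a lucky guess. This is a minimal illustration under assumptions; the function names `place_options` and `chance_of_guessing` and the sample option texts are hypothetical.

```python
import random
from collections import Counter

LETTERS = "ABCDE"  # capital letters as response signs, five at most


def place_options(correct, distracters, rng):
    """Shuffle the correct answer in among the distracters.

    Returns the lettered options and the letter of the correct answer.
    """
    options = [correct] + list(distracters)
    if len(options) > len(LETTERS):
        raise ValueError("use five or fewer alternatives")
    rng.shuffle(options)
    labelled = list(zip(LETTERS, options))
    key = LETTERS[options.index(correct)]
    return labelled, key


def chance_of_guessing(n_options):
    """Probability of picking the correct answer purely by chance."""
    return 1 / n_options


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the answer key can be reproduced
    keys = Counter()
    for _ in range(40):
        _, key = place_options("right", ["wrong 1", "wrong 2", "wrong 3"], rng)
        keys[key] += 1
    # Inspect how often each letter holds the correct answer.
    print(sorted(keys.items()))
    print(chance_of_guessing(4), chance_of_guessing(5))
```

Counting the key letters across the whole test, as the loop does, is a quick way to confirm that the answer positions come out in roughly equal proportions.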

True/False Tests

True/false tests are relatively easy to prepare since each item comes rather directly from the course content. They offer the instructor the opportunity to write questions that cover more subject matter than most other item types, since students can respond to so many questions in the time allowed. They are easy to score accurately and quickly.

True/false items, however, may not give a valid estimate of the students' knowledge, since half can be answered correctly simply by chance. True/false tests are poor instruments for diagnosing student strengths and weaknesses and are generally considered to be "tricky" by students. Since true/false questions tend to be either extremely easy or extremely difficult, they do not discriminate between students of varying ability as well as other types of questions do.
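The effect of chance can be made concrete with a binomial calculation. The sketch below assumes a student guesses blindly on every item with a 50% chance of being right on each; the function names `expected_correct` and `prob_at_least` are hypothetical.

```python
from math import comb


def expected_correct(n_items, p_guess=0.5):
    """Average number of items answered correctly by blind guessing."""
    return n_items * p_guess


def prob_at_least(n_items, min_correct, p_guess=0.5):
    """Probability of getting min_correct or more items right by guessing.

    Sums the binomial probabilities for k = min_correct .. n_items.
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


if __name__ == "__main__":
    print(expected_correct(10))
    print(prob_at_least(10, 7))
```

Under these assumptions, a student who guesses on a 10-item test gets 7 or more correct about 17% of the time, which is one reason short true/false tests give a weak estimate of knowledge.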

When constructing True/False tests, keep these suggestions in mind:

Keep language as simple and clear as possible.

Avoid verbatim statements from the text, negative statements (especially double negatives) and ambiguous or trick items. Be aware that extremely long or complicated statements will test reading skill rather than content knowledge. Use precise terms (such as 50% of the time) rather than less precise terms (such as several, seldom and frequently).

Require students to circle/underline a typed "T" or "F."

Asking students to write "T" or "F" next to the statement can lead to confusion resulting from illegible handwriting.

Make sure statements are entirely true or entirely false.

Partially or marginally true or false statements cause unnecessary ambiguity.

Use certain key words sparingly since they tip students off to the correct answers.

The words all, always, never, every, none and only usually indicate a false statement, whereas the words generally, sometimes, usually, maybe and often are frequently used in true statements.

Use more false items than true items, but no more than 15% more.

False items tend to discriminate between students of varying ability better than true items do.

Completion Tests

Completion questions are an alternative to selection items for testing recall, and are especially useful in assessing mastery of factual information when a specific word or phrase is important to know.

Advantages

Completion tests preclude the kind of guessing that is possible on limited choice items, since they require a definite response rather than simple recognition of the correct answer. Because only a short answer is required, their use on a test can enable a wide sampling of content. Sometimes multiple-choice questions can be converted to completion items, a feature that can be useful in creating subsequent tests on the same material.

Disadvantages

Completion items, however, tend to test only rote, repetitive responses and may encourage a fragmented study style since memorization of bits and pieces will result in higher scores. They are more difficult to score than forced-choice items and scoring often must be done by the test writer since more than one answer may have to be considered correct. On the whole, they have little advantage over other test types unless the need for specific recall is essential.

Suggestions for constructing completion items
  • Place blanks at the end of the statement.
  • Use original questions rather than taking questions directly from the text.
  • Provide clear and concise cues about the expected response in the statement.
  • Use vocabulary and phrasing that come from the text or class presentation.
  • When possible, explain the degree of variation acceptable in the answers.
  • Avoid using a long quote with multiple blanks to complete.
  • Ask students to fill in only important terms or expressions.
  • Facilitate scoring by having the students write their responses on lines arranged in a column to the left of the items.
  • Require only one word or phrase in each blank.
  • Give students enough information to answer the question but not enough to give the answer away. For example, articles (a, an, the) and specific antecedents often provide clues.

Short Answer Tests

Short-answer items can take a variety of forms: definitions, descriptions, short essays or mixtures of the three. Short essays require students to apply their knowledge to a specific situation carefully delimited by instructions. This type of question is the equivalent of a math or physics problem and is less time-consuming to prepare than any other item type.

Advantages

Like essays, short answer items:

  • encourage students to strive toward understanding a concept as an integrated whole
  • permit students to demonstrate achievement of such higher level objectives as analyzing given conditions and critical thinking
  • allow expression of both breadth and depth of learning
  • encourage originality, creativity, and divergent thinking
  • offer students the opportunity to use their own judgment, writing styles and vocabularies.
Suggestions for writing short answer questions
  • Provide clear, unambiguous directions for the expected answer. For example, if you ask for a definition, outline the expected length of the response and the specific elements you require in a complete definition.
  • On a typed exam, leave only enough space for the desired length of response.
  • With the directions, list the number of points each question is worth; for longer questions with higher scores, the worth of each section should be clear.

Essay Tests

Many teachers consider essay questions the ideal form of testing, since essays seem to require more effort from the student than other types of questions. Students cannot answer an essay question by simply recognizing the correct answer, nor can they study for an essay exam by only memorizing factual material.

Advantages

Essay questions can test complex thought processes, critical thinking, and problem-solving skills, and require students to use the English language to communicate in sentences and paragraphs — a skill undergraduates need to exercise more frequently. However, essay questions that require no more than a regurgitation of facts do not measure higher-order learning.

Disadvantages

Essay exams limit the amount of material that can be sampled in the test, a fact that may cause students to complain (sometimes legitimately): "I knew a lot more about the subject than the test showed," or "Your test didn't reflect the material we covered."

Essay tests also provide students more opportunity for bluffing, rambling and "snowing" than do limited-choice tests. They favor students who write well and neatly, and they are a pitfall for students who tend to go off on tangents or misunderstand the main point of the question. The main disadvantage, however, is that essay items are difficult and time-consuming to score and potentially subject to biased and unreliable scoring.

Planning for essay exams

As you plan for essay exams, define the type of learning you expect to measure. For example, do you expect students to be able to construct a reasoned argument from evidence, to analyze weaknesses in competing arguments, to select the best course of action in a new situation, or some combination of all these things? The best essay questions are based on the cognitive skills underlying the content rather than on the content alone.

To test problem-solving skills, clearly communicate the format and method for solving the problems to students. Without clues about how to proceed, students may adopt a plausible but incorrect approach, even if they knew how to solve the problem correctly. If you're interested in testing students' writing skills, stipulate the kinds of skills they must demonstrate and provide some test time for thinking and composing a well-crafted answer (otherwise, the effects of time pressure and test anxiety will usually result in poor writing).

Guidelines for constructing essay items
  • Distinguish between questions that require objectively verifiable answers and those that ask students to express their opinions, attitudes, or creativity. The latter are more difficult to construct and evaluate because it is more difficult to specify grading criteria (such questions therefore tend to be less valid measures of performance). Take-home tests and other out-of-class writing assignments may be more appropriate for demonstrating these kinds of skills.
  • Don't allow students to select which questions to answer (e.g., "choose two out of five"). It's not possible to compose five equivalent questions and students will choose the weaker questions, thereby reducing the exam's reliability.
  • Write an outline of your best approximation of the correct answer, with all of its sections in place. Decide on the total number of points the essay will be worth and assign points to each section. Tell students the value of each item in relation to the total grade.
  • Describe the expected length of the answer, its form and structure, and any special elements that should be present.
  • Make essay questions comprehensive rather than focused on small units of content.
  • Allow students an appropriate amount of time. Also give guidelines on how much time to use on each question, as well as the desired length and format of the response, such as full sentences, phrases only, or outlines.
  • Require students to demonstrate command of background information by asking them to provide supporting evidence for claims and assertions.
Example of an effective essay question

Lectures covering Piltdown Man, Gradualism, Punctuated Equilibrium, and Catastrophism were given sequentially to illustrate the interplay of theory and fact in the formulation of an anthropological account of the evolution of humankind. Write a three-part essay addressing the following questions:

  1. Name the major proponents of the four concepts named above and briefly describe the significance of these people for the history of a science of evolution. (10 minutes, 10 points)
  2. Select any two of the four concepts above and explain how they illustrate the relationship between fact and theory. (10 minutes, 10 points)
  3. In your opinion, are new discoveries or theories of evolution really new or are they just repetitions of past ideas that have fallen out of favor? Your answer to part 3 must draw upon the four concepts named above and be consistent with what you have already written in parts 1 and 2. (20 minutes, 20 points)
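The example states a time and a point value for each part, which makes it easy to check that the suggested time adds up and that each part's share of the points matches its share of the time. The following is a small sketch of that check; the names `PARTS`, `totals`, and `shares` are hypothetical, and matching points to minutes is only one reasonable way to weight an essay.

```python
# Parts of the example essay question as (label, minutes, points).
PARTS = [
    ("Part 1", 10, 10),
    ("Part 2", 10, 10),
    ("Part 3", 20, 20),
]


def totals(parts):
    """Return the total minutes and total points across all parts."""
    minutes = sum(m for _, m, _ in parts)
    points = sum(p for _, _, p in parts)
    return minutes, points


def shares(parts):
    """For each part, return its fraction of total time and of total points."""
    total_minutes, total_points = totals(parts)
    return {
        label: (m / total_minutes, p / total_points)
        for label, m, p in parts
    }


if __name__ == "__main__":
    print(totals(PARTS))
    for label, (time_share, point_share) in shares(PARTS).items():
        print(label, time_share, point_share)
```

A part whose point share is much larger than its time share signals that students may not have enough time to earn the points assigned to it.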