CHALKBORED      
     
         
 
 

 

Chapter 3: Fundamental Flaws

Schools suffer from an assortment of problems, ranging from unhealthy cafeteria food to deteriorating buildings to persistent racial inequities. These issues are disturbing but can be solved given enough attention, time, and capital. A second set of problems, which I call fundamental flaws, presents a tougher challenge because they are inseparable from our current system and can only be solved by radically changing the basic structure of education itself. Our reliance on lectures is the first fundamental flaw. The remaining five are equally destructive.

 

Flaw 2: Using Grades to Sort and Punish

What is the main job of teachers? The answer seems obvious: to teach. Yet if this is true, why are they not evaluated on this basis? Teachers can go decades without being observed in the classroom. Whether they are good instructors seems to be irrelevant, since no action is taken either way. Grades, by contrast, are scrutinized by administrators every semester. When something goes wrong, immediate action is taken, implying that the real job of teachers is to assign grades.

As a teacher, I was reminded annually that my primary task was to judge and sort students rather than educate them. This reminder came in the form of a memo saying that if class averages fell outside of a certain range, “teachers should meet with the principal to justify their grades”. Initially, the premise seemed reasonable. Low averages implied that students were not learning or that expectations were unrealistic. High averages implied that standards were lax or that grade inflation was a problem.

Grade inflation happens when teachers yield to pressure, causing averages to rise despite unchanging student ability. Since the grading ceiling is constant, grade inflation is also called “grade compression”, as the bell curve gets squeezed toward the high end of the grading spectrum. The existence of this phenomenon is not universally accepted. Critics claim that grade inflation is a myth.1 They point to one study of college transcripts that showed a small decline in grades over a twenty-year period. However, college enrollments increased by more than 50% during this period.2 The author of the study himself said that the decline in grades was “not surprising given the overall increase in participation in higher education”.3 In other words, high enrollments may have diluted the talent pool, keeping grades low.

Despite a few doubters, most researchers agree that grades can increase over time.4–8 The largest study of high school transcripts ever published found that averages increased by 11% between 1990 and 2000.9 Increases were seen in all subject areas and academic levels. When faced with this evidence, critics are not dissuaded. They argue that students are earning higher grades because they are smarter and working harder than previous generations. But this explanation is difficult to accept since standardized test scores remained constant while grades increased.10

Regardless of whether grade inflation is real, administrators try to prevent it by ensuring that averages are similar between classes and that grades are distributed along bell curves within classes. Sometimes this is an unspoken rule; sometimes it is an official policy. In Arkansas, a statewide “Grade Inflation Index” is used to equalize grades across high schools.11 Such well-intentioned attempts to maintain standards have serious unintended consequences. Instead of striving for universal excellence, teachers need poor performances to make their averages work out, creating a conflict of interest. Teachers cannot serve as advocates for students if they are also expected to act as judge and jury.

In graduate school, I took a course that can best be described as surreal. Each student was given a thick manual of photocopied articles compiled by the professor. The strange thing was that the pages were not numbered. Rather than, “Turn to page 132,” we were told, “Open your manual to the three-quarter mark. Look for a picture of a lung. It’s four pages before that, on the back of a graph.” That was weird, but the most absurd part of the course came when the final exams were returned. Our professor, Dr. Forrester, informed us that he had incorporated “insult marks” into the calculation of exam grades. This was a new concept for me. Apparently, insult marks arise when one of your answers is so insulting to the collective intelligence of humanity that a zero is insufficient. Think of it as academic antimatter. Exceedingly stupid answers receive negative marks, which then annihilate some of the positive marks from other questions. Dr. Forrester had an equally bizarre perspective when it came to questions that students aced. He would say, “Question five was pointless because everyone got it right; good questions have normal distributions that separate students.”

The belief that students should compete for grades is not new. It is called “grading on the curve” or “norm-referenced” evaluation. The opposite is “criterion-referenced” evaluation. Generally, I avoid these terms because they are bombastic and because there is no practical difference between them. For example, is the SAT a norm-referenced or criterion-referenced test? In theory, all students can get a good score. In reality, the test is carefully designed to ensure a bell curve. Thus, the SAT is a norm-referenced test, with the “norming” done in advance. The same is true of teacher-designed tests. Teachers rarely grade on the curve after the fact; they arrange this beforehand by adjusting the difficulty level of assessments. A test that is too easy is followed by a more difficult one and will be made harder for next year’s class. Since teachers are expected to keep averages within a reasonable range, they give norm-referenced tests that masquerade as criterion-referenced tests. Even if the true intent is to measure ability without sorting students, this is defeated when grades are used to decide honors, scholarships, and college admissions.
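For readers who want the textbook distinction spelled out, here is a minimal Python sketch of the two approaches. It is an illustration only: the function names, the letter cutoffs (90/80/70/60), and the curve quotas (20% A, 30% B, 30% C, 10% D, 10% F) are hypothetical values chosen for the example, not any school's actual policy.

```python
# Hypothetical cutoffs and quotas, chosen only for illustration.
CUTOFFS = ((90, "A"), (80, "B"), (70, "C"), (60, "D"))
QUOTAS = (("A", 20), ("B", 30), ("C", 30), ("D", 10), ("F", 10))  # percent of class


def criterion_grade(score):
    """Criterion-referenced: fixed cutoffs, so every student could earn an A."""
    for minimum, letter in CUTOFFS:
        if score >= minimum:
            return letter
    return "F"


def curve_grades(scores):
    """Norm-referenced: letters are rationed by class rank, whatever the raw scores."""
    # Indices of students, ordered from highest score to lowest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    grades = [None] * len(scores)
    start = 0
    cumulative = 0
    for letter, share in QUOTAS:
        cumulative += share
        end = round(cumulative * len(scores) / 100)
        for i in ranked[start:end]:
            grades[i] = letter
        start = end
    return grades


scores = [95, 93, 92, 91, 90, 94, 96, 97, 98, 99]
print([criterion_grade(s) for s in scores])
print(curve_grades(scores))
```

In this made-up class every student scores 90 or above, so fixed cutoffs give everyone an A, while the curve still hands the lowest scorer an F.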

Before I continue condemning grades, let me take a step back to answer some basic questions. What exactly are grades? Are they used universally? Do students in India, Russia, and other countries also get graded, or has someone devised a better system that we can follow?

Grades come in many flavors. In most countries, good grades are represented by large numbers. But some countries use small numbers, whereas others use letters. Converting between systems is complicated because few are linear; averages can be bunched anywhere along the grading spectrum. Even similar systems have curious differences: “D” is equal to 40%–54% in Ireland, 50%–59% in Canada, and 60%–69% in the United States. (It should be noted that higher ranges do not necessarily imply higher standards. Despite associating “D” with a higher percentage, the United States does slightly worse than Ireland and Canada on tests of math, reading, and science proficiency.)12 The most unusual system has to be Denmark’s. It is a ten-level scale, ranging from zero to thirteen (one, two, four, and twelve are excluded to accentuate extreme grades). The worst grades in Denmark are 00 and 03 – the extra zero prevents students from changing 0 to 10 or 3 to 13 with a single line on report cards.
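The overlap between the three “D” ranges quoted above can be checked with a short Python sketch. The ranges come from the text; the names D_RANGES and countries_awarding_d are hypothetical, and whole-number percentages are assumed.

```python
# Percentage ranges that earn a "D", as quoted in the text (inclusive bounds).
D_RANGES = {
    "Ireland": (40, 54),
    "Canada": (50, 59),
    "United States": (60, 69),
}


def countries_awarding_d(percent):
    """Return the countries in D_RANGES where this whole-number percentage is a D."""
    return [
        country
        for country, (low, high) in D_RANGES.items()
        if low <= percent <= high
    ]


for score in (45, 52, 65):
    print(score, countries_awarding_d(score))
```

A score of 52 is a “D” in both Ireland and Canada, while 65 is a “D” only in the United States, which is why a letter cannot be carried across borders without knowing the scale behind it.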

Table 3.1: Grading Systems Used Around the World 13

Levels | Grading system          | Failure    | Country
-------|-------------------------|------------|--------------
4      | MVG, VG, G, IG          | IG         | Sweden
5      | A, B, C, D, F (or E)    | F (or E)   | United States
5      | 5, 4, 3, 2, 1           | 2, 1       | Russia
6      | 1, 2, 3, 4, 5, 6        | 5, 6       | Germany
7      | 10, 9, … 5, 4           | 4          | Finland
10     | 13, 11, 10, … 5, 03, 00 | 5, 03, 00  | Denmark
10     | 10, 9, … 2, 1           | 3, 2, 1    | Latvia
21     | 20, 19, … 1, 0          | <10        | France
61     | 7.0, 6.9, … 1.1, 1.0    | <4.0       | Chile
101    | 100, 99, … 1, 0         | <40        | India

Table 3.1 is not exhaustive. It lists only a handful of countries and fails to reflect the variety within each. Systems can vary by historical period, school level, and region. For example, consider the United States. In the nineteenth century, universities experimented with comments, descriptive adjectives, letters that represent adjectives, alphabetic scales, and various numerical scales (1–4, 0–9, 0–20, 0–100, etc.).14 By the turn of the century, percentage-based systems were common, although passing cutoffs continued to vary from 50% to 75%. Today, most high schools use letters. A–E is popular in the Northeast; A–F is preferred in Southern and Western states. (E and F are not used together since F can easily be converted to E with a single line. If nothing else, Americans and Danes share their distrust of students.)


 Also in this chapter ...

  • How grades affect student motivation and morale
  • Grading alternatives: pass/fail and narrative assessment
  • The public cost of K-12 education
  • The importance of choice, inspiration, and feedback
  • The last fundamental flaw: misplaced responsibility

This is an excerpt from Chalkbored: What's Wrong with School and How to Fix It.