Chapter 3: Fundamental Flaws
Schools suffer from an assortment of problems,
ranging from unhealthy cafeteria food to deteriorating
buildings to persistent racial inequities. These issues
are disturbing but can be solved given enough attention,
time, and capital. A second set of problems, which I
call fundamental flaws, presents a tougher challenge
because they are inseparable from our current system and
can only be solved by radically changing the basic
structure of education itself. Our reliance on lectures
is the first fundamental flaw. The remaining five are
equally destructive.
Flaw 2: Using Grades to Sort and Punish
What is the main job of teachers? The answer seems
obvious: to teach. Yet if this is true, why are they not
evaluated on this basis? Teachers can go decades without
being observed in the classroom. Whether they are good
as instructors seems to be irrelevant since no action is
taken either way. By contrast, grades are scrutinized by
administrators every semester. When something goes
wrong, immediate action is taken, implying that the real
job of teachers is to assign grades.

As a teacher,
I was reminded annually that my primary task was to
judge and sort students rather than educate them. This
reminder came in the form of a memo saying that if class
averages fell outside of a certain range, “teachers
should meet with the principal to justify their grades”.
Initially, the premise seemed reasonable. Low averages
implied that students were not learning or that
expectations were unrealistic. High averages implied
that standards were lax or that grade inflation was a
problem.

Grade inflation happens when teachers
yield to pressure, causing averages to rise despite
unchanging student ability. Since the grading ceiling is
constant, grade inflation is also called “grade
compression”, as the bell curve gets squeezed toward the
high end of the grading spectrum. The existence of this
phenomenon is not universally accepted. Critics claim
that grade inflation is a myth.1 They point to one study
of college transcripts that showed a small decline in
grades over a twenty-year period. However, college
enrollments increased by more than 50% during this
period.2 The author of the study himself said that the
decline in grades was “not surprising given the overall
increase in participation in higher education”.3 In
other words, high enrollments may have diluted the
talent pool, keeping grades low.

Despite a few
doubters, most researchers agree that grades can
increase over time.4–8 The largest study of high
school transcripts ever published found that
averages increased by 11% between 1990 and 2000.9
Increases were seen in all subject areas and academic
levels. When faced with this evidence, critics are not
dissuaded. They argue that students are earning higher
grades because they are smarter and working harder than
previous generations. But this explanation is difficult
to accept since standardized test scores have remained
constant while grades increased.10

Regardless of
whether grade inflation is real, administrators try to
prevent it by ensuring that averages are similar between
classes and that grades are distributed along bell
curves within classes. Sometimes this is an unspoken
rule; sometimes it is an official policy. In Arkansas, a
statewide “Grade Inflation Index” is used to equalize
grades across high schools.11 Such well-intentioned
attempts to maintain standards have serious unintended
consequences. Instead of striving for universal
excellence, teachers need poor performances to make
their averages work out, creating a conflict of
interest. Teachers cannot serve as advocates for
students if they are also expected to act as judge and
jury.

In graduate school, I took a course that can
best be described as surreal. Each student was given a
thick manual of photocopied articles compiled by the
professor. The strange thing was that the pages were not
numbered. Rather than, “Turn to page 132,” we were told,
“Open your manual to the three-quarter mark. Look for a
picture of a lung. It’s four pages before that, on the
back of a graph.” That was weird, but the most absurd
part of the course came when the final exams were
returned. Our professor, Dr. Forrester, informed us that
he had incorporated “insult marks” into the calculation
of exam grades. This was a new concept for me.
Apparently, insult marks arise when one of your answers
is so insulting to the collective intelligence of
humanity that a zero is insufficient. Think of it as
academic antimatter. Exceedingly stupid answers receive
negative marks, which then annihilate some of the
positive marks from other questions. Dr. Forrester had
an equally bizarre perspective when it came to questions
that students aced. He would say, “Question five was
pointless because everyone got it right; good questions
have normal distributions that separate students.”

The belief that students should compete for grades is
not new. It is called “grading on the curve” or
“norm-referenced” evaluation. The opposite is
“criterion-referenced” evaluation. Generally, I avoid
these terms because they are bombastic and because there
is no practical difference between them. For example, is
the SAT a norm-referenced or criterion-referenced test?
In theory, all students can get a good score. In
reality, the test is carefully designed to ensure a bell
curve. Thus, the SAT is a norm-referenced test, with the
“norming” done in advance. The same is true of
teacher-designed tests. Teachers rarely grade on the
curve after the fact; they arrange this beforehand by
adjusting the difficulty level of assessments. A test
that is too easy is followed by a more difficult one and
will be made harder for next year’s class. Since
teachers are expected to keep averages within a
reasonable range, they give norm-referenced tests that
masquerade as criterion-referenced tests. Even if the
true intent is to measure ability without sorting students,
that intent is defeated when grades are used to decide honors,
scholarships, and college admissions.

Before I
continue condemning grades, let me take a step back to
answer some basic questions. What exactly are grades?
Are they used universally? Do students in India, Russia,
and other countries also get graded, or has someone
devised a better system that we can follow?

Grades
come in many flavors. In most countries, good grades are
represented by large numbers. But some countries use
small numbers, whereas others use letters. Converting
between systems is complicated because few are linear;
averages can be bunched anywhere along the grading
spectrum. Even similar systems have curious differences:
“D” is equal to 40%–54% in Ireland, 50%–59% in Canada,
and 60%–69% in the United States. (It should be noted
that higher ranges do not necessarily imply higher
standards. Despite associating “D” with a higher
percentage, the United States does slightly worse than
Ireland and Canada on tests of math, reading, and
science proficiency.)12 The most unusual system has to be
Denmark’s. They use a ten-level scale, ranging from zero
to thirteen (one, two, four, and twelve are excluded to accentuate extreme grades). The worst grades in Denmark
are 00 and 03 – the extra zero prevents students from
changing 0 to 10 or 3 to 13 with a single line on report
cards.
Table 3.1: Grading Systems Used Around the World 13

Levels | Grading system | Failure | Country
4 | MVG, VG, G, IG | IG | Sweden
5 | A, B, C, D, F (or E) | F (or E) | United States
5 | 5, 4, 3, 2, 1 | 2, 1 | Russia
6 | 1, 2, 3, 4, 5, 6 | 5, 6 | Germany
7 | 10, 9, … 5, 4 | 4 | Finland
10 | 13, 11, 10, … 5, 03, 00 | 5, 03, 00 | Denmark
10 | 10, 9, … 2, 1 | 3, 2, 1 | Latvia
21 | 20, 19, … 1, 0 | <10 | France
61 | 7.0, 6.9, … 1.1, 1.0 | <4.0 | Chile
101 | 100, 99, … 1, 0 | <40 | India

Table 3.1 is not exhaustive. It lists only a handful
of countries and fails to reflect the variety within
each. Systems can vary by historical period, school
level, and region. For example, consider the United
States. In the nineteenth century, universities
experimented with comments, descriptive adjectives,
letters that represent adjectives, alphabetic scales,
and various numerical scales (1–4, 0–9, 0–20, 0–100,
etc.).14 By the turn of the century,
percentage-based systems were common, although passing
cutoffs continued to vary from 50% to 75%. Today, most
high schools use letters. A–E is popular in the
Northeast; A–F is preferred in Southern and Western
states. (E and F are not used together since F can
easily be converted to E with a single line. If nothing
else, Americans and Danes share their distrust of
students.)
Also in this chapter ...
- How grades affect student motivation and
morale
- Grading alternatives: pass/fail and narrative
assessment
- The public cost of K-12 education
- The importance of choice, inspiration, and
feedback
- The last fundamental flaw: misplaced responsibility
This is an excerpt
from Chalkbored: What's Wrong with School and How to
Fix It.