What's Wrong With Mean?
Lesson 7 of 11
Objective: SWBAT understand the limitations of using mean to understand the nature of a data set, which lays the groundwork for their upcoming study of standard deviation and the normal distribution.
During today's lesson, my goal is to get students to understand how the need arose for the statistical tool of standard deviation. In practice, I enjoy the opportunity to lead some fun conversations that will generate this need.
Today's Opener, which you'll find on slide #2 of the Whats Wrong with Mean Powerpoint, is the first of three problems that I use to start working toward this goal. There are two cars full of people, and as students will see by solving the problem, the average age of the people in each car is 24 years old. But something shouldn't feel right about this. I ask everyone to infer whatever they can about what's happening in each car, and we note that in one car there are 5 twenty-somethings
With my students, I don't just say it as simply as that, however. Instead I'll ask, "Who is listening to better music?" and "Which car would you rather be in?" I have some fun with this. No matter what, the big idea is that even though the mean age might be the same in each car, these situations are rather different. So just looking at the mean of a data set actually leaves a lot of information out.
Following the opener, I further establish the shortcomings of mean with two more problems.
On slide #3 of Whats Wrong with Mean, the salaries of all the employees at a small company are given. This is a useful problem to pose because it requires students to notice that ten people are earning $25k per year and eight are earning $36k, so this is my chance to check if students know how to deal with that while computing the average. If they do it right, they'll figure out that the average salary is a little less than $42,000 per year (I show students how to attend to precision by showing them that it's perfectly acceptable to state the mean in those terms), and then my questions begin.
I try to make things personal as I get the conversation going. If you scored a job at this company, how much would you expect to make? Is anyone actually earning ~$42k at this company? If mean is a "measure of center," we might expect that about half of the company's employees make more than this. But how many do? Just 6 out of 24. Is that fair? What do you think people do at this company? As with the opening problem that precedes this, we can make some reasonable assumptions about the structure of the company based only on the salaries of the people working there (as long as we remain aware that these are just assumptions).
I also bring it back to the math we've been studying, with questions like: Is median any better for this task? Would a box plot help?
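For anyone who wants to check the arithmetic behind this conversation, here is a minimal sketch. The ten $25k salaries, the eight $36k salaries, and the headcount of 24 come from the problem as described above; the six higher salaries are hypothetical stand-ins I chose so that the mean lands a little under $42,000, and they are not the values on the slide.

```python
from statistics import mean, median

# Salary -> number of employees earning it.
# The first two rows come from the lesson; the last three rows are
# HYPOTHETICAL placeholders, not the actual values on slide #3.
salary_counts = {
    25_000: 10,
    36_000: 8,
    60_000: 4,    # hypothetical
    90_000: 1,    # hypothetical
    135_000: 1,   # hypothetical
}

# Expand the table into one entry per employee, so that each salary
# is counted as many times as it occurs.
salaries = [s for s, n in salary_counts.items() for _ in range(n)]

avg = mean(salaries)
mid = median(salaries)
above_mean = sum(1 for s in salaries if s > avg)

print(f"employees:       {len(salaries)}")
print(f"mean salary:     ${avg:,.0f}")
print(f"median salary:   ${mid:,.0f}")
print(f"earn above mean: {above_mean} of {len(salaries)}")
```

With these stand-in numbers the median is $36,000, which sits much closer to what most employees actually earn than the mean does.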
Average High Temperature
For the third problem, I post graphs of the average daily high temperatures in two different cities for each month of the year. Take a look: can you guess which city is represented by each graph?
In a conversation with students, I keep our use of statistical language informal. I might ask, "What might the median annual temperature be in each of these cities?" or more simply, "Where is the middle of each of these graphs?" Either way, what I want students to see is that in both cities, we can say that the center of the data is around the low 60s (in degrees Fahrenheit). But in one city, there's a dramatic fluctuation between low and high monthly temperatures, while in the other there's a flatter, more moderate graph. Again, I make my questions personal: "Where would you rather live? Why?" I also ask about seasons: "What can you tell me about the seasons in each of these places?" I get warmly personal as I profess my love for the changing seasons, and I'm not afraid to defend the city on the left if it seems to be getting a bad rap (which, in my experience, is often the case).
Of course, the reveal is exciting here: these two cities are New York and San Francisco, and kids are excited to see their home city represented here.
Summary: "So what's wrong with mean?"
To summarize, I simply post a series of statements (slide #6 of Whats Wrong with Mean) about the average age, salary, and temperature in each context, and I hope that students can now see these statements in a new way. I give the class a moment to consider these statements. Some students will want to write them down, some will want to touch base with a table-mate, and others will want to share their thoughts with me. I allow a little time for each of these to happen.
Learning Target and Vocabulary Review
When we're all ready, I move on to slide #7, which summarizes what we've done by saying that mean "doesn't tell us everything there is to know about a data set!" I simply read from the slide here, telling students that I hope they have a better idea of why we have the measures of center. Then, I say that there's another statistical tool called "standard deviation," and that this is what we're going to study for the next couple of classes.
I post the new learning target, SLT 1.4 (slide #8), and follow my normal procedure of asking for a volunteer to read it, then for all students to shout out key vocabulary words. After that, I point out that we've already spent some time with mean, so now we're going to focus on the idea of standard deviation. I use slide #9 to prompt discussion of the two words that make up the phrase, asking, "What is a deviation?" (student responses: it means something is not normal) and "What does it mean if something is standard?" (student responses: it means something is normal). This leads to a nice informal working definition of standard deviation along the lines of "how much abnormality it is normal to expect."
Return to the Opener for Average Absolute Deviation
With the idea established that mean isn't everything, I return to the opening problem (slide #10 of Whats Wrong with Mean). I ask the class how we might quantify the ways in which these two car loads differ, and I prod my students to think about measuring the difference between each passenger's age and the average age. On the board, I develop the notes shown in the Two Car Loads illustration. After students calculate these differences, I ask them to find the sum of the five differences for each car load, and it's always fun to see their reactions when these sums also turn out to be the same for each car load: zero. Of course, it's in the nature of the mean that this will always be the case (unless the mean has been rounded), and I choose how much to dig into this based on the reactions of my students. What's most important is that now we have a reason to use absolute value (and later, the squares of these differences), because we want to eliminate the signs. Now, when we add up the absolute values of the differences for each car load, the dramatic difference between these two sums makes it clear that we've found a way to quantify the difference between the two cars.
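The calculation described above can be sketched in a few lines of code. The ages below are hypothetical; I picked them only so that each car holds five people with a mean age of 24, and they are not the ages from the slide.

```python
from statistics import mean

def deviations(ages):
    """Return each age minus the mean age of the group."""
    m = mean(ages)
    return [a - m for a in ages]

# HYPOTHETICAL ages: five people per car, mean age 24 in both.
car_a = [22, 23, 24, 25, 26]   # five twenty-somethings
car_b = [2, 5, 8, 50, 55]      # a much wider spread of ages

for name, ages in [("Car A", car_a), ("Car B", car_b)]:
    devs = deviations(ages)
    signed_sum = sum(devs)                    # positives and negatives cancel
    absolute_sum = sum(abs(d) for d in devs)  # signs removed, so nothing cancels
    print(f"{name}: mean = {mean(ages)}, "
          f"sum of differences = {signed_sum}, "
          f"sum of absolute differences = {absolute_sum}, "
          f"average absolute difference = {absolute_sum / len(ages)}")
```

With these stand-in ages, the signed differences sum to zero in both cars, while the absolute differences sum to 6 in one car and 114 in the other.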
Linear Practice #1
After establishing the idea that the mean and the standard deviation will combine to provide a better overview of a data set than just the mean will, I tell the class that now we're going to generate a new data set that we'll be able to use as we go a bit deeper into how standard deviation works.
To create that data set, I give students 50 linear equations to solve in 10 minutes. During the next class, we'll use the data set consisting of the number of equations solved correctly by each student. I use a tool like Kuta to generate sets of problems.
You have a variety of options for how to structure this exercise, which I outline in the linear practice video. I really like how this allows students to practice a background skill without making too big of a deal out of it, and how it generates a real set of data over which my students can claim ownership.
Exit Slip: Predict the Plot
When students finish Linear Practice #1, I hand out today's exit ticket, LP1 Prediction Plot. Students are asked to choose a data representation from SLT 1.1 (dot plot, histogram, or box plot), and to sketch what they think it might look like for the data set that was just generated during Linear Practice #1.
This task gives me the opportunity to see what my students think about their skills in solving linear equations. I find it interesting to see which data representation each of my students will choose, and how much attention they pay to the details of their choice. This activity also helps to increase student buy-in for when we dig a little deeper into this data during the next class.