Data and Plots on the Number Line

Objective

SWBAT consider the relative strengths of dot plots and box plots, while translating between these two data representations.

Big Idea

We continue to make use of real data - which is everywhere - in our study of statistics.

Opener: Examine Linear Practice #1 Data

10 minutes

As today's class opens, there is a dot plot projected on the front board (this Linear Practice Data file includes the data for each of my classes).  The dot plot represents the number of problems each student attempted on Linear Practice #1, on which students tried to solve as many one-step and two-step linear equations as they could in 10 minutes.  This is the first time students are seeing a dot plot in my class, but I post it as is, knowing that this is a very intuitive data representation.  

As posted on today's agenda, the opener is to "Copy this data into your notes, then find the mean, median and mode."  Some students copy the dot plot as they see it, and others choose to record the set of numbers in a comma-separated list.  As I circulate, I point out that I see both approaches, and I say that either option is great.

After a few minutes, I ask, "When you look at this dot plot, which of the following is easiest to find: the mean, the median, or the mode?"  We are then able to discuss that on a dot plot, the mode is quite easy to see.  "Compare this to a box plot," I say.  "Is it easy to tell what the mode of a data set is on a box plot?"  On the other hand, we realize, the median is obvious on a box plot, but it will take a little extra attention to find it here.
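
To make the point about the mode concrete, here is a minimal Python sketch of a text dot plot. The data values and the function names text_dot_plot and modes are invented for illustration and are not taken from the Linear Practice Data file. The mode is simply the value with the tallest stack of dots.

```python
from collections import Counter

# Hypothetical "problems attempted" counts, invented for illustration.
attempted = [12, 15, 15, 18, 18, 18, 20, 22]


def text_dot_plot(data):
    """Return a dot plot as text: one row per value, one star per student."""
    counts = Counter(data)
    lines = []
    for value in range(min(data), max(data) + 1):
        # Values nobody scored still get a row, so gaps stay visible.
        lines.append(f"{value:>3} | " + "*" * counts[value])
    return "\n".join(lines)


def modes(data):
    """Return every value that shares the tallest stack of dots."""
    counts = Counter(data)
    tallest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == tallest)


print(text_dot_plot(attempted))
print("mode(s):", modes(attempted))
```

Returning a list from modes is a deliberate choice: a class data set can easily have two stacks of the same height, and the plot shows both.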

Many of my students want to use the method of crossing off numbers on each end of the data set in order to find the middle number.  I'll work hard today and over the course of the next week to undo this training, by saying that it's really important to think of the median as the number that splits the list in half.  The best first step for finding the median is to count how many values are in the data set, then divide this number in half, and count that many data points in from one end of the ordered data.  Each class has a different number of data points, so this conversation is slightly different each time.
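
The counting approach can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a non-empty list of numbers; the function name median_by_counting and the sample lists are hypothetical.

```python
def median_by_counting(data):
    """Find the median by counting halfway into the ordered data."""
    ordered = sorted(data)
    n = len(ordered)   # step 1: count how many values there are
    half = n // 2      # step 2: divide that count in half
    if n % 2 == 1:
        # Odd count: `half` values sit on each side of the middle one.
        return ordered[half]
    # Even count: the median falls between the two halves, so average
    # the last value of the lower half and the first of the upper half.
    return (ordered[half - 1] + ordered[half]) / 2


print(median_by_counting([3, 1, 2]))     # odd count
print(median_by_counting([4, 1, 3, 2]))  # even count
```

Because the odd and even cases behave differently, the answer to "where is the median?" changes with the number of data points, which is why the conversation differs from class to class.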

Problem Set: Data and Plots on the Number Line

20 minutes

As students finish recording the data and finding those three measures of central tendency, I distribute the first Problem Set of Unit 2.  It consists of five problems, the first four of which are related to Linear Practice #1.  The first problem is to create a box plot of the "problems attempted" data set that opens today's class.

In some of my classes, students are ready to run with this task.  In others, they need some help.  If students need help, it's important to demonstrate the use of space on a box plot.  I draw the number line, and write the data above and around it, showing how much of the data goes where.  Moving from dot plots - which are a very explicit form of data representation - to box plots, whose use can be very mysterious, is an exercise in reasoning abstractly and quantitatively (Mathematical Practice #2).  The notes I give to students on the board are designed to help students in this challenge, by allowing them to see half of the data on one side of the median, and half on the other.  

One of the niftiest tricks here is to trace the number line from the initial dot plot slide, then to remove the data from the whiteboard and make the box plot.  Once the box plot is made, I then return the dot plot to the screen, which yields this box plot on a dot plot.  This really helps kids to see that one-quarter of the data is in each quartile on the box plot.  Of course, it looks great for n=16, and can be a little more ambiguous depending on the data set, but this really moves kids toward understanding what a box plot does.
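
Here is a rough Python sketch of why n=16 works out so cleanly. It assumes the "median of each half" convention for quartiles (software tools often use other conventions and can give slightly different values), and the 16 data values and function names are invented for illustration.

```python
def median(ordered):
    """Median of an already-sorted, non-empty list."""
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        return ordered[half]
    return (ordered[half - 1] + ordered[half]) / 2


def five_number_summary(data):
    """Min, Q1, median, Q3, max, with Q1 and Q3 as medians of each half."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }


def quarter_counts(data):
    """Count the data points in each of the four sections of the box plot."""
    s = five_number_summary(data)
    cuts = [s["q1"], s["median"], s["q3"]]
    counts = [0, 0, 0, 0]
    for value in data:
        # A value exactly on a cut point is counted in the lower section.
        counts[sum(value > cut for cut in cuts)] += 1
    return counts


# Hypothetical class of 16 students, invented for illustration.
attempted = [8, 10, 11, 12, 14, 15, 15, 16, 18, 18, 19, 20, 22, 23, 25, 28]
print(five_number_summary(attempted))
print(quarter_counts(attempted))
```

With 16 values and no data point landing exactly on a quartile, each section holds four points. When a value sits on a cut point, or when n is not a multiple of four, the counts come out uneven, which is the ambiguity mentioned above.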

Up to now, we've been analyzing the number of problems attempted by each student.  The next two problems on the problem set ask students to create a dot plot (#2) and a box plot (#3) of the set of numbers of correct answers on Linear Practice #1.  On today's presentation slides, there's a comma-separated list of data for each class.  My planning move here is a time-saver: I can make copies of the same problem set for all students, and then allow each class to work with their own data, projected at the front of the room, as they work on these problems.

Problem #4 shows a dot plot of the data for all of my algebra classes.  Students are challenged to take a much bigger data set and produce a box plot.  

Problem #5 gets us into the realm of reverse-engineering what a data set might be.  I don't expect students to get to this today.  I expect that they'll work on this for homework, and that many will come in with questions.  The start of my next lesson is designed to help students with this problem.

My favorite moment is when I overlay the box plot on #2 to the dot plot that opened the class.

Generate More Data: Linear Practice #2

13 minutes

Just like we did two days ago, today's class ends with the data-generating activity I call "Linear Practice".  Today students will complete the second trial, Linear Practice #2, and it's just like Linear Practice #1, whose data we analyzed today.  It consists of 40 one-step and two-step linear equations, and I challenge students to solve as many as they can in 10 minutes.  I tell everyone that I want them to do better than last time.  I will once again use this activity to create two data sets: number attempted, and number correct.

My hope is that students will indeed do better than last time, and almost all of them do.  This will then allow us to construct arguments (Mathematical Practice #3) as to why and how we improved.  Did yesterday's Delta Math session help? Was everyone a little more motivated to do a little bit better?  Whatever happens, we'll have the opportunity to compare the data sets as we begin to shift our focus to the second and third learning targets of the unit:

2.2: I can compare two or more different data sets by using shape, center, and spread. (from S-ID.2)

and

2.3: I can interpret differences in shape, center, and spread in the context of data sets. (S-ID.3)
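
As a preview of that comparison, here is a small Python sketch that summarizes center and spread for two trials and reports how each measure changed. The two data sets and the function names are hypothetical, and the sketch uses only mean, median, and range; it does not attempt to describe shape.

```python
from statistics import mean, median

# Hypothetical "number correct" results for two trials of one class.
trial_1 = [8, 10, 12, 12, 15, 18]
trial_2 = [10, 13, 14, 16, 18, 19]


def center_and_spread(data):
    """Two measures of center and one simple measure of spread."""
    return {
        "mean": mean(data),
        "median": median(data),
        "range": max(data) - min(data),
    }


def change(before, after):
    """How much each measure moved from the first trial to the second."""
    a, b = center_and_spread(before), center_and_spread(after)
    return {key: b[key] - a[key] for key in a}


print(center_and_spread(trial_1))
print(center_and_spread(trial_2))
print(change(trial_1, trial_2))
```

A positive change in mean and median suggests the class as a whole improved, while the change in range hints at whether scores bunched together or spread apart.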