As a warm-up to our study of linear regression and scatterplots, I ask students to sort through the cards from the previous day's Data Display Matching Activity. They should put all the relationships that can be modeled with a linear function in one pile and all the relationships that cannot be modeled with a linear function in another. While students sort, I walk around checking homework and listening to their discussions.
To begin our focus on scatterplots, I outline for my students the arc of our unit on one-variable statistics. First we talked about variable types, then we learned which displays can be used for each type of data, then we focused in on a model that can be useful for certain types of distributions (mound-shaped, symmetric ones). Our study of bivariate statistics follows a similar arc. We discussed data sets that have two categorical variables, two quantitative variables, and one variable of each type. We learned about the types of displays that are appropriate for each combination of variable types. Now we focus on a model that is useful for certain types of relationships [MP3].
I impress upon my students that, just as there are conditions that must be satisfied before the Normal model is used, there are criteria that must be checked before the linear model can be used. To engage students in the discussion, I ask them to share how they sorted the Data Display Matching Activity cards for the day's warm-up.
While students sort, I walk around checking homework and taking a look at their classifications. Because students have been exposed to scatterplots and lines of best fit in Algebra 1 (S-ID.6), they may be very comfortable fitting a line to data. However, students often neglect to check the conditions for using the model. I lead a whole-class discussion about why some scatterplots suggest linearity and some do not.
I make a big deal out of this idea of using a line to represent a two-variable data set and why we would want to do that. I want my students to understand that the power of a good model is that we can use it to make predictions: if a mathematical function can reasonably be used to model the relationship, then we can predict y-values that we don't know by plugging in x-values [MP4].
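For teachers who want to show the prediction idea concretely, here is a minimal sketch in Python. The data set (hours studied and test scores) is made up for illustration and is not taken from the activity cards, and the helper names `fit_line` and `predict` are my own. The sketch fits a least-squares line and then plugs in an x-value that is not in the data to get a predicted y-value.

```python
# Hypothetical data, invented for this illustration only.
hours = [1, 2, 3, 4, 5]          # x-values: hours studied
scores = [52, 58, 61, 67, 72]    # y-values: test scores


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Plug an x-value into the fitted line to get a predicted y-value."""
    return intercept + slope * x


slope, intercept = fit_line(hours, scores)
predicted = predict(slope, intercept, 6)  # 6 hours is not in the data
print(f"predicted score for 6 hours: {predicted:.1f}")
```

The prediction is only as trustworthy as the model: the line should be fit only after checking that a linear model is reasonable for the data.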
I emphasize that using a line to represent a relationship is a very useful thing to do, but is not always appropriate. (This was also the case for using the Normal model to represent the distribution of a single variable.) The conditions for using a linear model are as follows:
I discuss the first two conditions and tell my students that the condition regarding residuals will be covered in the next day's lesson.
After we have discussed the conditions for applying a linear model to a bivariate data set, I want my students to practice analyzing plots to see if they meet the criteria. Scatter Plot Cards is a collection of 10 graphical representations of bivariate relationships. Eight of these cards depict the relationship between two quantitative variables and are presented as scatterplots. The final two are stacked dot plots of relationships between one quantitative and one categorical variable. The first time this activity is used, the Scatter Plot Cards should be printed on cardstock and cut out.
I place my students in groups of three for this activity and instruct them to examine each card for the first two linearity conditions:
Students will use the Record Sheet Evaluating Bivariate Data Sets for Linearity to record the decisions they make about the appropriateness of using a linear model to represent the relationship depicted on each card [MP4].
Teacher's Note: The Data Cards were created using Fathom.
When the groups have worked through all 10 cards, I bring them together for a second whole-group discussion. I assign each group one or two cards to discuss with the class, and we go through them one by one. During this discussion I emphasize precise use of vocabulary and informally assess my students' conceptual understanding of quantitative vs. categorical variables and univariate vs. bivariate data sets (MP6, MP3).