14 - Scatterplots
Explain the following quote by George Box: “All models are wrong, but some are useful.”
Practice
- Chapter 7: # 5, 11, 15, 17, 31
- Chapter 8: # 1, 9, 11, 28
- Chapter 9: # 21, 31
Vocabulary
model, scatterplot, association, response variable, explanatory variable, correlation coefficient, lurking variable, residuals, regression line, s_e, R², r, e, extrapolation, leverage, influential point
Study Questions
- What is the distinction between ‘association’ and ‘correlation’?
- What do you discuss or point out when describing a scatterplot and any association present?
- How do you know when it’s appropriate to calculate correlation?
- What impact (if any) does changing units have on correlation? Explain.
- Does correlation imply causation? Explain.
- What is the role of residuals in determining the appropriateness of the linear model?
- What does R² tell us?
- Does the y-intercept have meaning in every context? Explain with an example.
- What is the meaning of a positive residual? A negative residual?
- Which points would be considered unusual on a scatterplot? How would you identify influential points, points with large residuals, and high-leverage points?
- What would you do (with regard to running the regression and reporting results) if significant outliers were present in your scatterplot?
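To check your answers to the questions on units, residuals, and R², it can help to compute the quantities by hand on a small data set. The sketch below is one possible way to do that in Python. The height and weight values are made up for illustration and are not from the textbook, and the helper names (`correlation`, `least_squares_line`) are hypothetical, not part of any course software.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(x, y):
    """Correlation coefficient r for paired quantitative data."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares regression line."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data: explanatory variable (height) and response variable (weight).
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [115, 120, 140, 155, 160, 180]

r = correlation(height_in, weight_lb)
b0, b1 = least_squares_line(height_in, weight_lb)

# Residual e = observed y - predicted y (positive means the point is above the line).
residuals = [y - (b0 + b1 * x) for x, y in zip(height_in, weight_lb)]

# R^2 as the fraction of variation in y accounted for by the model.
sse = sum(e ** 2 for e in residuals)
sst = sum((y - mean(weight_lb)) ** 2 for y in weight_lb)
r_squared = 1 - sse / sst

# s_e: standard deviation of the residuals (n - 2 in the denominator).
s_e = sqrt(sse / (len(residuals) - 2))

# Changing units: convert height from inches to centimeters and recompute r.
height_cm = [h * 2.54 for h in height_in]
r_cm = correlation(height_cm, weight_lb)

print("r (inches):", round(r, 4))
print("r (cm):    ", round(r_cm, 4))
print("R^2:       ", round(r_squared, 4))
print("s_e:       ", round(s_e, 4))
print("residuals: ", [round(e, 2) for e in residuals])
```

Compare the two values of r, and compare R² with the square of r. Then try adding one unusual point (for example, a very tall but light person) and see how the slope, the residuals, and R² respond.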
Resources
- Link: Rossman/Chance - Correlation Guessing Game
- Link: Correlation and Regression
- Slide Deck: 14-Scatterplots.pdf