Modeling with Data in the Tidyverse
1 Prerequisites
2 Introduction to Modeling
2.1 Exploratory visualization of age
2.2 Numerical summaries of age
2.3 Exploratory visualization of house size
2.4 Log10 transformation of house size
2.5 EDA of relationship of teaching & “beauty” scores
2.6 Correlation between teaching and “beauty” scores
2.7 EDA of relationship of house price and waterfront
2.8 Predicting house price with waterfront
3 Modeling with Basic Regression
3.1 Plotting a “best-fitting” regression line
3.2 Fitting a regression with a numerical x
3.3 Making predictions using “beauty score”
3.4 Computing fitted/predicted values & residuals
3.5 EDA of relationship of score and rank
3.6 Fitting a regression with a categorical x
3.7 Making predictions using rank
3.8 Visualizing the distribution of residuals
4 Modeling with Multiple Regression
4.1 EDA of relationship
4.2 Fitting a regression
4.3 Making predictions using size and bedrooms
4.4 Interpreting residuals
4.5 Parallel slopes model
4.6 Making predictions using size and waterfront
4.7 Automating predicting on “new” houses
5 Model Assessment and Selection
5.1 Refresher: sum of squared residuals
5.2 Which model to select?
5.3 Computing the R-squared of a model
5.4 Comparing the R-squared of two models
5.5 Computing the MSE & RMSE of a model
5.6 Comparing the RMSE of two models
5.7 Fitting model to training data
5.8 Predicting on test data
References