Preliminaries
Load data
Install the mammalsmilk package from GitHub if needed.
# devtools::install_github("brouwern/mammalsmilk")
You can then load the package with library():
library(mammalsmilk)
The example data set is milk_primates.
Log-transforming female body mass makes its relationship with milk fat more linear.
# this could be done with dplyr::mutate() instead
milk_primates$mass.fem.log <- log(milk_primates$mass.fem)
Load libraries
library(ggplot2)
library(ggpubr)
#> Loading required package: magrittr
library(cowplot)
#>
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggpubr':
#>
#> get_legend
#> The following object is masked from 'package:ggplot2':
#>
#> ggsave
library(bbmle)
#> Loading required package: stats4
library(broom)
Linear regression as model fitting
Fit 2 models
Null model (Ho): flat line
- Null hypothesis: the response (fat) does not vary with the predictor
- Null model in R: fat ~ 1
- "~ 1" means fit a model
  - w/ slope = 0
  - intercept = mean of the response variable
lm.null <- lm(fat ~ 1,
              data = milk_primates)
Alternative model (Ha): fat ~ mass.fem.log
lm.mass <- lm(fat ~ mass.fem.log,
              data = milk_primates)
What is the intercept of the null model?
The mean fat value is 8.6, which is also the intercept of lm.null.
summary(milk_primates[,"fat"])
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.700 3.750 6.600 8.581 11.875 27.000
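For an intercept-only model, the fitted intercept is simply the sample mean of the response. A quick sanity check of that fact, sketched with the built-in mtcars data (milk_primates requires the GitHub package):

```r
# Intercept-only ("null") model: its single coefficient equals the mean response
m0 <- lm(mpg ~ 1, data = mtcars)

coef(m0)          # the fitted intercept
mean(mtcars$mpg)  # same value
```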
Plot the data
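A scatterplot makes the two models concrete: the null model is a flat line at the mean, the alternative a fitted slope. A sketch with ggplot2, using a small made-up stand-in data frame (swap in milk_primates and its columns if the mammalsmilk package is installed):

```r
library(ggplot2)

# Stand-in data frame; replace with milk_primates if mammalsmilk is installed
milk <- data.frame(mass.fem.log = log(c(0.5, 2, 10, 50, 200)),
                   fat          = c(3, 5, 8, 12, 20))

p <- ggplot(milk, aes(x = mass.fem.log, y = fat)) +
  geom_point() +
  # null model: flat line at the mean of the response
  geom_hline(yintercept = mean(milk$fat), linetype = "dashed") +
  # alternative model: fitted regression line
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```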
Compare models: significance test
Nested models: the null model is a special case of the alternative (slope fixed at 0), so the two can be compared with a formal hypothesis test.
Hypothesis test of nested models in R
- the anova() command
  - carries out an F-test for linear models
  - carries out a likelihood ratio test for GLMs/GLMMs
- GLMM = generalized linear mixed model
  - models w/ random effects, blocking, repeated measures, etc.
# NOTE: the order of the models within anova() doesn't change the test
anova(lm.null,
lm.mass)
#> Analysis of Variance Table
#>
#> Model 1: fat ~ 1
#> Model 2: fat ~ mass.fem.log
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 41 1942.8
#> 2 40 1421.6 1 521.22 14.666 0.0004425 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
- The output of anova() is similar to the test reported by summary()
- I like to emphasize the testing of nested models because
  - GLMs and GLMMs often can only be compared with anova()
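For GLMs, the same anova() call performs a likelihood ratio (chi-squared) test when you request it. A sketch using the built-in mtcars data (binary response am, transmission type) as a stand-in:

```r
# Nested GLMs: null model vs. a model with one predictor (weight)
glm.null <- glm(am ~ 1,  data = mtcars, family = binomial)
glm.wt   <- glm(am ~ wt, data = mtcars, family = binomial)

# test = "Chisq" requests the likelihood ratio test
anova(glm.null, glm.wt, test = "Chisq")
```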
Examine model: IT-AIC
IT-AIC background
- "Information Theory / Akaike's Information Criterion"
- "Information Theory" is a general approach to evaluating models
- AIC is one particular information criterion
  - Others: BIC, WAIC, FIC, DIC (Bayesian), ...
  - seems like every statistician has one now
- AICc: correction for small sample sizes
  - You should always use AICc, but people often forget
- qAIC: correction for "overdispersion"
  - For GLMs (e.g., logistic regression)
AIC in practice
- Lower is “better”
- Absolute values have NO MEANING
- AICs are only interpretable relative to other AICs
- focus: “delta AIC” (dAIC)
- AIC goes down if a model fits the data better
- BUT - every additional parameter has a “penalty”
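Behind these points is the definition AIC = -2*log-likelihood + 2k, where k counts the estimated parameters: the fit term drops as the model improves, while 2k is the penalty per parameter. A by-hand check against base R's AIC(), using the built-in mtcars data as a stand-in:

```r
m <- lm(mpg ~ wt, data = mtcars)

logL <- as.numeric(logLik(m))   # maximized log-likelihood
k    <- attr(logLik(m), "df")   # parameters: intercept, slope, residual SD

aic_by_hand <- -2 * logL + 2 * k
aic_by_hand
AIC(m)  # same value
```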
AIC in R
AIC in base R
The AIC() command in base R.
- Note: there is no AICc command in base R!
  - Other packages can compute AICc and other ICs
- df = number of estimated parameters (including the residual variance)
AIC(lm.null,
lm.mass)
#> df AIC
#> lm.null 2 284.2276
#> lm.mass 3 273.1082
AIC table from AICtab
AICtab() from the bbmle package is super handy: it reports AIC and dAIC, and the companion ICtab() can report AICc (see below).
bbmle::AICtab(lm.null,
              lm.mass,
              base = TRUE)
#> AIC dAIC df
#> lm.mass 273.1 0.0 3
#> lm.null 284.2 11.1 2
AICc
- Use ICtab() with type = "AICc"
- AICc is a correction for small sample sizes
- It doesn't make much difference here
bbmle::ICtab(lm.null,
             lm.mass,
             type = "AICc",
             base = TRUE)
#> AICc dAICc df
#> lm.mass 273.7 0.0 3
#> lm.null 284.5 10.8 2
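AICc adds a second penalty term that shrinks toward zero as the sample size n grows: AICc = AIC + 2k(k + 1)/(n - k - 1). Computing it by hand (with the built-in mtcars data standing in for milk_primates):

```r
m <- lm(mpg ~ wt, data = mtcars)

n <- nobs(m)                 # sample size
k <- attr(logLik(m), "df")   # number of estimated parameters

aicc <- AIC(m) + (2 * k * (k + 1)) / (n - k - 1)
c(AIC = AIC(m), AICc = aicc)  # AICc is always slightly larger
```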
What about R^2?
Hypothesis testing
- p-values tell you whether a predictor is "significantly" associated w/ a response
  - They do not tell you whether the EFFECT SIZE is large
  - p can be very, very small
  - but the slope can also be very small
IT-AIC
- AIC tells you if one model fits the data better than another
- but, AIC doesn’t tell you if either model fits the data very well.
- So, even if a model has a low AIC relative to other models, it still might not be a very useful model!
R^2
- Tells you how much of the variation in the data is explained by the model
- Whether you use p-values or AIC, you should report R^2
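In R, R^2 comes from summary() on the fitted model (broom::glance(), loaded above, reports it too). A sketch with the built-in mtcars data as a stand-in:

```r
m <- lm(mpg ~ wt, data = mtcars)

summary(m)$r.squared       # proportion of variance explained
summary(m)$adj.r.squared   # penalized for the number of predictors
```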
How good is the alternative model?
- The slope of the alternative model is highly significant
- But R^2 isn't huge...
  - There is lots of scatter around the line
- Other predictors might fit better
- We can also build models that use multiple predictors at the same time
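That last idea, fitting several predictors at once and comparing models with AIC, looks like this in outline (mtcars again as a stand-in; the second predictor hp is of course hypothetical for the milk data):

```r
# One predictor vs. two predictors used at the same time
m.one <- lm(mpg ~ wt,      data = mtcars)
m.two <- lm(mpg ~ wt + hp, data = mtcars)

AIC(m.one, m.two)  # here the two-predictor model has the lower AIC
```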