Preliminaries

Load data

Install the mammalsmilk package from GitHub if needed.

# devtools::install_github("brouwern/mammalsmilk")

You can then load the package with library()

library(mammalsmilk)

The example data is milk_primates

data("milk_primates")

Log-transforming female body mass makes its relationship with milk fat more linear.

#this could be done with dplyr::mutate() too
milk_primates$mass.fem.log <- log(milk_primates$mass.fem)
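As the comment notes, the same transformation can be done with dplyr::mutate(); a sketch, assuming dplyr is installed:

# dplyr equivalent of the base-R transformation above
library(dplyr)
milk_primates <- milk_primates %>%
  mutate(mass.fem.log = log(mass.fem))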

Load libraries

library(ggplot2)
library(ggpubr)
#> Loading required package: magrittr
library(cowplot)
#> 
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggpubr':
#> 
#>     get_legend
#> The following object is masked from 'package:ggplot2':
#> 
#>     ggsave
library(bbmle)
#> Loading required package: stats4
library(broom)

Linear regression as model fitting

Fit 2 models

Null model (Ho): flat line

  • Null hypothesis: the Y variable doesn’t vary with the X variable
  • Null model in R: “fat ~ 1”
  • “~ 1” means fit a model
    • w/ slope = 0
    • and intercept = the mean of the response variable
lm.null <- lm(fat ~ 1,
              data = milk_primates)

Alternative model: fat ~ log(mass.fem)

lm.mass <- lm(fat ~ mass.fem.log,
              data = milk_primates)

Plot both models

What is the intercept of the null model?

The mean fat value is 8.6

summary(milk_primates[,"fat"])
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   0.700   3.750   6.600   8.581  11.875  27.000
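A quick check (a sketch, not part of the original output): the intercept of the null model should equal that mean.

# the null model's intercept is just the mean of the response
coef(lm.null)
mean(milk_primates$fat, na.rm = TRUE)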

[Plot: milk fat vs. log(female mass), showing the flat null-model line at the mean and the fitted regression line]
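One way to draw this with ggplot2 (loaded above); a sketch, with axis labels of my own choosing:

# scatterplot with both models overlaid
ggplot(milk_primates,
       aes(x = mass.fem.log, y = fat)) +
  geom_point() +
  # null model: flat line at the mean of fat
  geom_hline(yintercept = mean(milk_primates$fat, na.rm = TRUE),
             linetype = "dashed") +
  # alternative model: fitted regression line
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "log(female mass)", y = "fat")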

Compare models: significance test

Nested Models:

  • Ha: y ~ -1.75*log(mass) + 20.5
  • Ho: y ~ 0*log(mass) + 8.6

  • Ha has TWO parameters (aka coefficients, betas; see the coef() call after this list)
    • Intercept = 20.515
    • Slope = -1.75
  • Ho has ONE parameter
    • Intercept = 8.6
    • (Slope = 0, so it doesn’t count)
  • The hypothesis test can be formulated as a comparison of 2 “nested models”
    • Applies to t-test, ANOVA, ANCOVA
    • A major difference between hypothesis testing w/ p-values and IT-AIC is that AIC doesn’t require models to be nested!
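The parameters quoted above can be pulled directly from the fitted model:

# intercept and slope of the alternative model
coef(lm.mass)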

Hypothesis test of nested models in R

  • anova() command
  • carries out an F test for linear models
  • “likelihood ratio test” for GLMs/GLMMs
    • GLMM = generalized linear mixed models
    • Models w/random effects, blocking, repeated measures etc
#NOTE: order w/in anova() doesn't matter
anova(lm.null, 
      lm.mass)
#> Analysis of Variance Table
#> 
#> Model 1: fat ~ 1
#> Model 2: fat ~ mass.fem.log
#>   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
#> 1     41 1942.8                                  
#> 2     40 1421.6  1    521.22 14.666 0.0004425 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • The output of anova() is similar to the content of summary()
  • I like to emphasize the testing of nested models
  • GLMs and GLMMs often can only be tested with anova() (a toy sketch follows below)
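A minimal toy sketch of the GLM case (the data here are simulated for illustration; this is not part of the milk analysis):

# likelihood ratio (Chi-squared) test for nested GLMs
set.seed(1)
toy <- data.frame(x = rnorm(50))
toy$y <- rbinom(50, size = 1, prob = plogis(0.5*toy$x))

glm.null <- glm(y ~ 1, data = toy, family = binomial)
glm.x    <- glm(y ~ x, data = toy, family = binomial)

anova(glm.null, glm.x, test = "Chisq")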

Examine model: IT-AIC

IT-AIC background

  • “Information Theory-Akaike’s Information Criterion”
  • “Information Theory” is a general approach to investigating models
  • AIC is one particular information criterion
    • Others: BIC, WAIC, FIC, DIC (Bayesian), …
    • seems like every statistician has one now
  • AICc: correction for small sample sizes
    • Should always use AICc, but people often forget
  • qAIC: correction for “overdispersion”
    • For GLMs (e.g., logistic regression)

AIC in practice

  • Lower is “better”
  • Absolute values have NO MEANING
  • AICs are only interpretable relative to other AICs
  • The focus is the “delta AIC” (dAIC): the difference between a model’s AIC and the lowest AIC in the set
  • AIC goes down if a model fits the data better
  • BUT: every additional parameter carries a “penalty” (see the sketch below)
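A sketch of what AIC() computes: AIC = -2*log-likelihood + 2*k, where k is the number of estimated parameters (the “penalty” term).

# AIC "by hand" for the alternative model
k <- attr(logLik(lm.mass), "df")        # number of estimated parameters
-2*as.numeric(logLik(lm.mass)) + 2*k    # should match AIC(lm.mass)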

AIC in R

AIC in base R

Base R has an AIC() command.

  • Note: no AICc command in base R!
  • Other packages can do AICc and other ICs
  • df = number of estimated parameters (including the residual variance)
AIC(lm.null, 
    lm.mass)
#>         df      AIC
#> lm.null  2 284.2276
#> lm.mass  3 273.1082
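The dAIC reported by AICtab() below is just the difference between these two values:

# delta AIC between the two models
AIC(lm.null) - AIC(lm.mass)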

AIC table from AICtab

AICtab() from the bbmle package is super handy. It can give AIC and dAIC, and can also do AICc (see below).

bbmle::AICtab(lm.null,
              lm.mass,
              base = TRUE)
#>         AIC   dAIC  df
#> lm.mass 273.1   0.0 3 
#> lm.null 284.2  11.1 2

AICc

  • Use ICtab() with type = c("AICc")
  • AICc is a correction for small sample size (the formula is sketched after the output below)
  • It doesn’t make much difference here
bbmle::ICtab(lm.null,
             lm.mass,
             type = c("AICc"),
             base = TRUE)
#>         AICc  dAICc df
#> lm.mass 273.7   0.0 3 
#> lm.null 284.5  10.8 2
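A sketch of the correction itself, using the usual formula AICc = AIC + 2k(k + 1)/(n - k - 1):

# AICc "by hand" for the alternative model
n <- nobs(lm.mass)                # sample size used in the fit
k <- attr(logLik(lm.mass), "df")  # number of estimated parameters
AIC(lm.mass) + 2*k*(k + 1)/(n - k - 1)  # should match the AICc above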

What about R^2?

Hypothesis testing

  • p-values: tell you if a predictor is “significantly” associated w/ a response
  • They do not tell you whether the EFFECT SIZE is large
    • p can be very, very small
    • while the slope is also very small

IT-AIC

  • AIC tells you if one model fits the data better than another
  • but AIC doesn’t tell you if either model fits the data very well
  • So, even if a model has a low AIC relative to other models, it still might not be a very useful model!

R^2

  • Tells you how much of the variation in the data is explained by the model
  • Whether you use p-values or AIC, you should report R^2 (see the sketch below)
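R^2 for the alternative model can be pulled from summary(), or from broom::glance() (broom was loaded above):

# R^2 of the alternative model
summary(lm.mass)$r.squared
broom::glance(lm.mass)  # tidy one-row summary that includes r.squared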

How good is the alternative model?

  • The slope of the alternative model is highly significant
  • But its R^2 isn’t huge…
  • There is lots of scatter around the line
  • Other predictors might fit better
  • We can also build models with multiple predictors used at the same time (a hypothetical sketch follows)
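A hypothetical sketch of the multiple-predictor syntax; “protein” is a placeholder column name and has not been checked against milk_primates:

# hypothetical: two predictors at once ('protein' is a placeholder)
# lm(fat ~ mass.fem.log + protein, data = milk_primates)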