Preliminaries
Load data
Install the mammalsmilk package from GitHub if needed.
# devtools::install_github("brouwern/mammalsmilk")
You can then load the package with library():
library(mammalsmilk)
The example data set is milk_primates.
Log-transforming female body mass makes its relationship with milk fat more linear.
# this could be done with dplyr::mutate() instead
milk_primates$mass.fem.log <- log(milk_primates$mass.fem)
Load libraries
library(ggplot2)
library(ggpubr)
#> Loading required package: magrittr
library(cowplot)
#>
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:ggpubr':
#>
#> get_legend
#> The following object is masked from 'package:ggplot2':
#>
#> ggsave
library(bbmle)
#> Loading required package: stats4
library(broom)
Linear regression as model fitting
Fit 2 models
Null model (Ho): flat line
- Null hypothesis: the response (fat) does not vary with the predictor
- Null model in R: fat ~ 1
- "~ 1" means fit a model
  - w/ slope = 0
  - intercept = mean of the response variable
lm.null <- lm(fat ~ 1,
              data = milk_primates)
Alternative model (Ha): fat ~ mass.fem.log
lm.mass <- lm(fat ~ mass.fem.log,
              data = milk_primates)
What is the intercept of the null model?
The mean fat value is 8.6, which is also the intercept of lm.null.
summary(milk_primates[,"fat"])
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.700 3.750 6.600 8.581 11.875 27.000
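For an intercept-only model, the fitted intercept is simply the sample mean of the response. A quick sanity check of that fact, sketched with the built-in mtcars data (milk_primates requires the GitHub package):

```r
# Intercept-only ("null") model: its single coefficient equals the mean response
m0 <- lm(mpg ~ 1, data = mtcars)

coef(m0)          # the fitted intercept
mean(mtcars$mpg)  # same value
```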
Plot the data
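A scatterplot makes the two models concrete: the null model is a flat line at the mean, the alternative a fitted slope. A sketch with ggplot2, using a small made-up stand-in data frame (swap in milk_primates and its columns if the mammalsmilk package is installed):

```r
library(ggplot2)

# Stand-in data frame; replace with milk_primates if mammalsmilk is installed
milk <- data.frame(mass.fem.log = log(c(0.5, 2, 10, 50, 200)),
                   fat          = c(3, 5, 8, 12, 20))

p <- ggplot(milk, aes(x = mass.fem.log, y = fat)) +
  geom_point() +
  # null model: flat line at the mean of the response
  geom_hline(yintercept = mean(milk$fat), linetype = "dashed") +
  # alternative model: fitted regression line
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```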
Compare models: significance test
Nested models: the null model is a special case of the alternative (slope fixed at 0), so the two can be compared with a formal hypothesis test.
Hypothesis test of nested models in R
- the anova() command
  - carries out an F-test for linear models
  - carries out a likelihood ratio test for GLMs/GLMMs
- GLMM = generalized linear mixed model
  - models w/ random effects, blocking, repeated measures, etc.
# NOTE: the order of the models within anova() doesn't change the test
anova(lm.null,
lm.mass)
#> Analysis of Variance Table
#>
#> Model 1: fat ~ 1
#> Model 2: fat ~ mass.fem.log
#> Res.Df RSS Df Sum of Sq F Pr(>F)
#> 1 41 1942.8
#> 2 40 1421.6 1 521.22 14.666 0.0004425 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
- The output of anova() is similar to the test reported by summary()
- I like to emphasize the testing of nested models because
  - GLMs and GLMMs often can only be compared with anova()
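For GLMs, the same anova() call performs a likelihood ratio (chi-squared) test when you request it. A sketch using the built-in mtcars data (binary response am, transmission type) as a stand-in:

```r
# Nested GLMs: null model vs. a model with one predictor (weight)
glm.null <- glm(am ~ 1,  data = mtcars, family = binomial)
glm.wt   <- glm(am ~ wt, data = mtcars, family = binomial)

# test = "Chisq" requests the likelihood ratio test
anova(glm.null, glm.wt, test = "Chisq")
```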
Examine model: IT-AIC
IT-AIC background
- "Information Theory / Akaike's Information Criterion"
- "Information Theory" is a general approach to evaluating models
- AIC is one particular information criterion
  - Others: BIC, WAIC, FIC, DIC (Bayesian), ...
  - seems like every statistician has one now
- AICc: correction for small sample sizes
  - You should always use AICc, but people often forget
- qAIC: correction for "overdispersion"
  - For GLMs (e.g., logistic regression)
AIC in practice
- Lower is “better”
- Absolute values have NO MEANING
- AICs are only interpretable relative to other AICs
- focus: “delta AIC” (dAIC)
- AIC goes down if a model fits the data better
- BUT - every additional parameter has a “penalty”
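Behind these points is the definition AIC = -2*log-likelihood + 2k, where k counts the estimated parameters: the fit term drops as the model improves, while 2k is the penalty per parameter. A by-hand check against base R's AIC(), using the built-in mtcars data as a stand-in:

```r
m <- lm(mpg ~ wt, data = mtcars)

logL <- as.numeric(logLik(m))   # maximized log-likelihood
k    <- attr(logLik(m), "df")   # parameters: intercept, slope, residual SD

aic_by_hand <- -2 * logL + 2 * k
aic_by_hand
AIC(m)  # same value
```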
AIC in R
AIC in base R
The AIC() command in base R.
- Note: there is no AICc command in base R!
  - Other packages can compute AICc and other ICs
- df = number of estimated parameters (including the residual variance)
AIC(lm.null,
lm.mass)
#> df AIC
#> lm.null 2 284.2276
#> lm.mass 3 273.1082
AIC table from AICtab
AICtab() from the bbmle package is super handy: it reports AIC and dAIC, and the companion ICtab() can report AICc (see below).
bbmle::AICtab(lm.null,
              lm.mass,
              base = TRUE)
#> AIC dAIC df
#> lm.mass 273.1 0.0 3
#> lm.null 284.2 11.1 2
AICc
- Use ICtab() with type = "AICc"
- AICc is a correction for small sample sizes
- It doesn't make much difference here
bbmle::ICtab(lm.null,
             lm.mass,
             type = "AICc",
             base = TRUE)
#> AICc dAICc df
#> lm.mass 273.7 0.0 3
#> lm.null 284.5 10.8 2
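AICc adds a second penalty term that shrinks toward zero as the sample size n grows: AICc = AIC + 2k(k + 1)/(n - k - 1). Computing it by hand (with the built-in mtcars data standing in for milk_primates):

```r
m <- lm(mpg ~ wt, data = mtcars)

n <- nobs(m)                 # sample size
k <- attr(logLik(m), "df")   # number of estimated parameters

aicc <- AIC(m) + (2 * k * (k + 1)) / (n - k - 1)
c(AIC = AIC(m), AICc = aicc)  # AICc is always slightly larger
```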
What about R^2?
Hypothesis testing
- p-values tell you whether a predictor is "significantly" associated w/ a response
  - They do not tell you whether the EFFECT SIZE is large
  - p can be very, very small
  - but the slope can also be very small
IT-AIC
- AIC tells you if one model fits the data better than another
- but, AIC doesn’t tell you if either model fits the data very well.
- So, even if a model has a low AIC relative to other models, it still might not be a very useful model!
R^2
- Tells you how much of the variation in the data is explained by the model
- Whether you use p-values or AIC, you should report R^2
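In R, R^2 comes from summary() on the fitted model (broom::glance(), loaded above, reports it too). A sketch with the built-in mtcars data as a stand-in:

```r
m <- lm(mpg ~ wt, data = mtcars)

summary(m)$r.squared       # proportion of variance explained
summary(m)$adj.r.squared   # penalized for the number of predictors
```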
How good is the alternative model?
- The slope of the alternative model is highly significant
- But R^2 isn't huge...
  - There is lots of scatter around the line
- Other predictors might fit better
- We can also build models that use multiple predictors at the same time
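That last idea, fitting several predictors at once and comparing models with AIC, looks like this in outline (mtcars again as a stand-in; the second predictor hp is of course hypothetical for the milk data):

```r
# One predictor vs. two predictors used at the same time
m.one <- lm(mpg ~ wt,      data = mtcars)
m.two <- lm(mpg ~ wt + hp, data = mtcars)

AIC(m.one, m.two)  # here the two-predictor model has the lower AIC
```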