Bibliography: Recommended/Supplemental Readings

This bibliography is a compilation of papers that extend ideas covered in this course. It will be updated regularly.

Weekly readings

The week listed is the one in which the reading was assigned or mentioned.

Bullet points under each article indicate whether it is required or recommended.


Week 7: T-tests & effect sizes

Anon. N.D. Codebook cookbook: A guide to writing a good codebook for data analysis projects in medicine. McGill University.

  • Contains information on how to make a data dictionary

Broman, KW and K Woo. 2018. Data organization in spreadsheets. The American Statistician.

  • Recommended
  • Excellent article for understanding tidy data and data dictionaries

Ellis, SE and JT Leek. 2018. How to share data for collaboration. The American Statistician.

  • Recommended reading

Goodman et al. 2014. Ten Simple Rules for the Care and Feeding of Scientific Data. PLoS Computational Biology.

Harrell. N.D. Statistical Graphics: Chapter 1.

Nakagawa & Cuthill. 2007. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biological Reviews.

  • To my knowledge, one of the first (and few) proponents of reporting relative effect sizes like Cohen’s d in ecology outside of meta-analyses.

Ruxton. 2006. The unequal variance t-test is an underused alternative to Student’s t-test and the Mann–Whitney U test. Behavioral Ecology.

  • Required
  • Assigned reading on unequal variance t-tests

Savik. Reporting P Values. Journal of Wound, Ostomy and Continence Nursing.

  • Required
  • Assigned reading on how to report p-values

Walker, J. 2018a. Combining data, distribution summary, model effects, and uncertainty in a single plot.

  • Required
  • Assigned reading; excellent discussion of why we should think in terms of effect sizes. Fig. 2 in the blog post is currently not shown, but it can be seen in Walker (2018b) and on page 6 of Harrell (N.D.).

Walker, J. 2018b. When do we introduce best statistical practices to undergraduate biology majors? Rapid Ecology.

  • A nice summary of Walker (2018a)

Wilson, G, J Kitzes, et al. 2018. Good enough practices in scientific computing. PLoS Computational Biology.

  • Section 1 “Data management” of Box 1 is an excellent overview of key tasks in setting up and preserving your raw and analysis data.


On deck

Boldina & Beninger. 2016. Strengthening statistical usage in marine ecology: Linear regression. Journal of Experimental Marine Biology & Ecology.

Brinny, K. The Rule of 3. http://dataabinitio.com/?p=320

  • Store your digital data in at least 3 places (plus raw data sheets)! E.g., your hard drive, an external hard drive, and the cloud (Box, Dropbox, private GitHub repository)

Bryan, J. Naming Files. Speakerdeck

  • Excellent advice on naming files to facilitate downstream organization

Bryan, J. 2018. Happy Git and GitHub for the useR. http://happygitwithr.com/

  • One-stop source for Git for R users

Colegrave and Ruxton 2017. Using Biological Insight and Pragmatism When Thinking about Pseudoreplication. Trends in Ecology & Evolution.

Hart et al. 2016. Ten Simple Rules for Digital Data Storage. PLoS Computational Biology.

Marwick et al. 2018. Packaging Data Analytical Work Reproducibly Using R (and Friends). The American Statistician.

Parker et al. 2019. Empowering peer reviewers with a checklist to improve transparency. Nature Ecology & Evolution.


GitHub, Git & Version Control

Blischak, J. D., Davenport, E. R., & Wilson, G. (2016). A Quick Introduction to Version Control with Git and GitHub. PLoS Computational Biology, 12(1), 1–18. https://doi.org/10.1371/journal.pcbi.1004668

Bryan, J. 2018a. Happy Git and GitHub for the useR. http://happygitwithr.com/

  • One-stop source for Git for R users

Bryan, J. 2018b. Excuse me, do you have a moment to talk about version control? Why Git? The American Statistician.

  • Good quick overview of GitHub, especially if you will use it for collaboration, by the author of “Happy Git and GitHub for the useR.”

Perez-Riverol et al. (2016). Ten Simple Rules for Taking Advantage of Git and GitHub. PLoS Computational Biology, 12(7), 1–11. https://doi.org/10.1371/journal.pcbi.1004947

Ram, K. (2013). Git can facilitate greater reproducibility and increased transparency in science. Source Code for Biology and Medicine, 8.

Vuorre, M., & Curley, J. P. (2018). Curating Research Assets in Behavioral Sciences: A Tutorial on the Git Version Control System. Advances in Methods and Practices in Psychological Science, 1–33.

  • Very thorough and readable intro


Reshaping data (dplyr, etc.)

Richmond, Jenny. 2018. “gather spread unite separate”

Case studies

Code for understanding, reproducing and/or extending the analyses of these case studies will be used in the course or is available for self-study.

Skibiel et al. 2013. The evolution of the nutrient composition of mammalian milks. J. of Animal Eco. 82:1254–1264.


Nature Methods Tutorials

Nature Methods produces a number of short, useful tutorials. (Though in order to stay short, some rely on compact equations more than I would like.)

Altman, N., & M. Krzywinski. 2016. Analyzing outliers: influential or nuisance? Nature Methods 13:281–282.

Altman 2016. P values & the search for significance. Nat. Meth. 14:3–4.

Altman 2016. Regression diagnostics. Nature Methods 13:385–386.

Altman 2015. Simple linear regression. Nature Methods 12.

Krzywinski, M., & N. Altman. 2013. Significance, P values & t-tests. Nature Methods 10:1041–1042.

Krzywinski 2013. Error bars. Nat. Meth. 10:921–922.

Krzywinski 2014. Visualizing samples w/ box plots. Nat. Meth. 11:


R introduction

Fox, J. 2006. Getting Started With R:1–42.


Regression

Fox, J. Dummy-Variable Regression. Applied Regression Analysis & Generalized Linear Models.

Fox, J. Bootstrapping Regression Models.

Fox, J., & S. Weisberg. 2011. Diagnosing Problems in Linear & Generalized Linear Models. An R Companion to Applied Regression:285–328.

Lever, J. et al. 2016. Model selection & overfitting. Nature Methods 13:703–704.

Schielzeth, H. 2010. Simple means to improve the interpretability of regression coefficients. Methods in Ecology & Evolution 1:103–113.

Steel, E. A. et al. 2013. Applied statistics in ecology: common pitfalls & simple solutions. Ecosphere 4:art115.

Zuur, A. F. et al. 2010. A protocol for data exploration to avoid common statistical problems. Methods in Ecology & Evolution 1:3–14.


Reproducibility

A major emerging issue in science is how to assure the quality of our lab and field data and the integrity of our analyses. Below are some examples from a rapidly growing literature on this topic.

Anon. 2015. Let’s think about cognitive bias. Nature.

Baggerly & Coombes. 2009. Deriving chemosensitivity from cell lines: Forensic bioinformatics & reproducible research in high-throughput biology. Ann. of App. Statistics 3:1309–1334.

Baker 2016. Reproducibility: Respect your cells. Nature 537:433–435.

Casadevall & Fang. 2010. Reproducible science. Infec. & Immunity 78:4972–4975.

Clark et al. 2016. Scientific Misconduct: The Elephant in the Lab. A Response to Parker et al. TREE 31:899–900.

Forstmeier et al. 2016. Detecting & avoiding likely false-positive findings – a practical guide. Biological Reviews.

Gelman 2015. Working through some issues. Significance 12:33–35.

Gelman & Loken. 2014. The statistical crisis in Science:4–7.

Ioannidis 2014. How to Make More Published Research True. PLoS Med. 11.

Ioannidis, J. P. A., & M. J. Khoury. 2011. Improving validation practices in “omics” research. Science 334:1230–1232.

Ioannidis, J. P. A. 2003. Genetic associations: false or true? Trends in Molecular Medicine 9:133–135.

Ioannidis, J. P. A. 2005. Microarrays & molecular research: noise discovery? The Lancet 365:454–455.

Landis et al. 2012. A call for transparent reporting to optimize the predictive value of preclinical research. Nature 490:187–91.

Nuzzo, R. 2014. Statistical errors: P values, the “gold standard” of statistical validity, are not as reliable as many scientists assume. Nature 506:150–152.

Parker et al. 2016. Transparency in Eco. & Evo: Real Problems, Real Solutions. Trends in Eco. & Evo. 31:711–719.

Schnitzer & Carson. 2016. Would Ecology Fail the Repeatability Test? BioScience 66:98–99.

Yamada & Hall. 2015. Reproducibility & cell biology. J. of Cell Bio. 209:191–193.