10.7 Loading data into R using read.csv()
Copy and paste the .csv file name from the console into the source viewer, then execute the command “read.csv(file = "Medley1998.csv")”. You can type it instead, but you must be careful to have NO TYPOS. R is unforgiving when it comes to typos.
If you’ve done it correctly you’ll see the data table printed out in the console (I show only some of the output).
read.csv(file = "Medley1998.csv")
## station pH DO cond temp alk hard ZN spp.rich spp.div prop.Achnanthes
## 1 ERI 8.5 8.4 180 11 119 122 2 35.3 2.27 0.37
## 2 ER2 8.0 8.0 145 14 52 84 407 21.7 1.25 0.48
## 3 ER3 8.0 8.0 150 15 54 86 336 20.7 1.15 0.35
## 4 ER4 8.8 7.8 240 18 77 126 104 16.7 1.62 0.02
## 5 FC1 7.8 8.6 55 9 30 42 7 19.0 1.70 0.17
## 6 FC2 7.4 8.8 130 8 41 84 1735 5.7 0.63 0.76
You must have the file name in quotation marks and include the “.csv”. Any small error will cause things to not work.
Here are examples of mistakes that won’t work (no matter how much you cuss at it).
read.csv(file = Medley1998.csv) #missing quotes " "
read.csv(file = "Medley1998") #missing .csv
read.csv(file "Medley1998.csv") #missing =
Note that R returns error messages in red, but they aren’t necessarily very helpful in figuring out what the problem actually is. This is an unfortunate feature of R, and reading error messages is a skill that must be learned.
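One way to get a more helpful message is to check that the file exists before trying to read it. The sketch below is only an illustration: read_csv_checked() is a hypothetical helper written for this example, not a function that comes with R. It relies on the base R functions file.exists(), getwd() and list.files().

```r
# Hypothetical helper (not part of base R): check for the file first,
# and if it is missing, report where R looked and which .csv files are there.
read_csv_checked <- function(file) {
  if (!file.exists(file)) {
    stop("Cannot find '", file, "' in ", getwd(),
         ". CSV files here: ",
         paste(list.files(pattern = "\\.csv$"), collapse = ", "))
  }
  read.csv(file = file)
}

# A misspelled name (missing .csv) now gives a more descriptive message.
msg <- tryCatch(
  read_csv_checked("Medley1998"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the message lists no .csv files at all, R is probably looking in the wrong folder, and the working directory needs to be changed.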
10.7.1 Load data into an R “object”
Now type this: “med98 <- read.csv(file = "Medley1998.csv")”. The “<-” is the assignment operator. What happens when you execute this command?
med98 <- read.csv(file = "Medley1998.csv")
It might actually look like not much has happened. But that’s good! It means the data has successfully been loaded into R. You have “assigned” the data from your file to the “object” named “med98”.
10.7.2 The assignment operator “<-”
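“<-” is called the “assignment operator”. It stores whatever is on its right-hand side under the name on its left-hand side.
“<” usually shares the comma key. Type “shift + ,” to get it, then type “-” right after it with no space in between.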
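A minimal sketch of the assignment operator with a single number instead of a whole dataset. The object name my.value is made up for this example.

```r
my.value <- 42             # assign: nothing is printed in the console
my.value                   # typing the name prints the stored value

# The right-hand side is evaluated first, then stored under the name
# on the left, replacing the old value.
my.value <- my.value + 1
my.value
```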
If you type just “med98” and execute it as a command, what happens?
med98
## station pH DO cond temp alk hard ZN spp.rich spp.div prop.Achnanthes
## 1 ERI 8.5 8.4 180 11 119 122 2 35.3 2.27 0.37
## 2 ER2 8.0 8.0 145 14 52 84 407 21.7 1.25 0.48
## 3 ER3 8.0 8.0 150 15 54 86 336 20.7 1.15 0.35
## 4 ER4 8.8 7.8 240 18 77 126 104 16.7 1.62 0.02
## 5 FC1 7.8 8.6 55 9 30 42 7 19.0 1.70 0.17
## 6 FC2 7.4 8.8 130 8 41 84 1735 5.7 0.63 0.76
You should see the entire dataset spit out in the console (I’ve just shown the top part).
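Printing a whole data frame can flood the console. A minimal sketch of previewing only a few rows with head() and tail() follows. It uses R's built-in iris data (an assumption made so that the example runs without Medley1998.csv), but the same calls should work on med98.

```r
# iris ships with R, so this runs without any CSV file.
head(iris)          # first six rows (the default)
head(iris, n = 3)   # first three rows
tail(iris, n = 3)   # last three rows
```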
Now execute the list command ls(). You should now see “med98” shown in the console.
ls()
## [1] "crabs" "eagle.df" "eagles" "eaglesWV.url"
## [5] "eaglesWV.url_2" "eaglesWV_2" "iris" "med98"
## [9] "msleep" "my.abc" "my.mean" "x"
## [13] "year"
This means that the object you assigned your data to is now in your “workspace.” The workspace is what I call the working memory of R.
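Objects stay in the workspace until you remove them or end the session. Here is a short sketch of checking for an object and removing it, using the base R functions ls(), exists() and rm(). The name my.demo is hypothetical, chosen so that nothing you care about gets deleted.

```r
my.demo <- c(1, 2, 3)                  # create a throwaway object

in_workspace <- "my.demo" %in% ls()    # is the name listed by ls()?
in_workspace

rm(my.demo)                            # remove the object from the workspace

still_there <- exists("my.demo")       # check again after removal
still_there
```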
We can learn about the med98 data using commands like dim(), names() and summary().
How big is the dataset overall? dim() reports the number of rows first, then the number of columns.
dim(med98)
## [1] 34 11
How many columns are there, and what are they called?
names(med98)
## [1] "station" "pH" "DO"
## [4] "cond" "temp" "alk"
## [7] "hard" "ZN" "spp.rich"
## [10] "spp.div" "prop.Achnanthes"
Are any of the variables categorical?
summary(med98)
## station pH DO cond
## AR2 : 1 Min. :6.700 Min. :6.800 Min. : 40.00
## AR3 : 1 1st Qu.:7.425 1st Qu.:7.500 1st Qu.: 76.25
## AR5 : 1 Median :7.900 Median :7.600 Median :100.00
## AR8 : 1 Mean :7.841 Mean :7.794 Mean :116.76
## ARI : 1 3rd Qu.:8.200 3rd Qu.:8.175 3rd Qu.:150.00
## BR2 : 1 Max. :8.800 Max. :8.800 Max. :240.00
## (Other):28
## temp alk hard ZN
## Min. : 8.00 Min. : 10.00 Min. : 10.00 Min. : 2.0
## 1st Qu.:11.00 1st Qu.: 28.50 1st Qu.: 45.00 1st Qu.: 24.0
## Median :12.50 Median : 46.50 Median : 62.00 Median : 54.0
## Mean :13.06 Mean : 46.38 Mean : 66.76 Mean : 177.3
## 3rd Qu.:15.00 3rd Qu.: 64.00 3rd Qu.: 90.50 3rd Qu.: 213.2
## Max. :21.00 Max. :119.00 Max. :126.00 Max. :1735.0
##
## spp.rich spp.div prop.Achnanthes
## Min. : 5.70 Min. :0.630 Min. :0.0200
## 1st Qu.:18.77 1st Qu.:1.377 1st Qu.:0.2125
## Median :22.85 Median :1.855 Median :0.3900
## Mean :22.42 Mean :1.694 Mean :0.3756
## 3rd Qu.:26.82 3rd Qu.:2.058 3rd Qu.:0.4950
## Max. :42.00 Max. :2.830 Max. :0.7600
##
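In the summary above, station is reported as counts per station name, which is how R summarizes a factor (its type for categorical variables). Since R 4.0.0, read.csv() keeps text columns as plain character strings by default, so your own summary may show “Length”, “Class” and “Mode” for station instead. The sketch below shows one way to check column types and to request factors explicitly. It reads three of the rows printed earlier from a text string through the text argument rather than from the real file, and med_demo is a made-up object name.

```r
# Stand-in data: three rows and four columns copied from the output above.
csv_text <- "station,pH,DO,cond
ERI,8.5,8.4,180
ER2,8.0,8.0,145
ER3,8.0,8.0,150"

# stringsAsFactors = TRUE stores text columns as factors,
# whatever the default is in your version of R.
med_demo <- read.csv(text = csv_text, stringsAsFactors = TRUE)

col_classes <- sapply(med_demo, class)   # the class of every column
col_classes

str(med_demo)    # compact overview: dimensions, types, first values
nrow(med_demo)   # number of rows
ncol(med_demo)   # number of columns
```

With the real file, the equivalent call would be read.csv(file = "Medley1998.csv", stringsAsFactors = TRUE).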