Chapter 7 Functions
seq(), is(), is.vector(), is.matrix(), gsub()
7.1 Vectors in R
Variables in R include scalars, vectors, and lists. Functions in R carry out operations on variables, for example, using the log10() function to calculate the log to the base 10 of a scalar variable x, or using the mean() function to calculate the average of the values in a vector variable myvector. For example, we can use log10() on a scalar object like this:
# store value in object
x <- 100
# take log base 10 of object
log10(x)
## [1] 2
Note that while mathematically x is a single number, or a scalar, R considers it to be a vector:
is.vector(x)
## [1] TRUE
There are many “is” commands. What is returned when you run is.matrix() on a vector?
is.matrix(x)
## [1] FALSE
Mathematically this is a bit odd, since a vector is often defined as a one-dimensional matrix, e.g., a single column or single row of a matrix. But in R land, a vector is a vector, a matrix is a matrix, and there are no explicit scalars.
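One way to convince yourself that a single number really is a vector in R is to check its length and index it like any other vector. The short sketch below uses only base R. It re-creates x so that it runs on its own, and the name m is made up for this example.

```r
# re-create the object so this block is self-contained
x <- 100

# a "scalar" is just a vector with one element
length(x)
## [1] 1

# so it can be indexed like any other vector
x[1]
## [1] 100

# a matrix has dimensions; m is a 1 x 1 matrix built from x
m <- matrix(x)
is.matrix(m)
## [1] TRUE
dim(m)
## [1] 1 1
```

Once the value is wrapped in matrix(), is.vector(m) returns FALSE, because the object now carries a dimension attribute.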
7.2 Math on vectors
Vectors can serve as the input for mathematical operations. When this is done, R performs the operation separately on each element of the vector. This is a distinctive feature of R that can be hard to get used to, even for people with previous programming experience.
Let’s make a vector of numbers:
myvector <- c(30,16,303,99,11,111)
What happens when we multiply myvector by 10?
myvector*10
## [1] 300 160 3030 990 110 1110
R has taken each of the 6 values of myvector, 30 through 111, and multiplied each one by 10, giving us 6 results. That is, what R did was
## 30*10 # first value of myvector
## 16*10 # second value of myvector
## 303*10 # ....
## 99*10
## 11*10
## 111*10 # last value of myvector
The usual rules of arithmetic apply to vectors just as they do to single numbers. So multiplying myvector by 10 gives the same result whether you put the 10 before or after the vector. That is, myvector*10 is the same as 10*myvector.
myvector*10
## [1] 300 160 3030 990 110 1110
10*myvector
## [1] 300 160 3030 990 110 1110
What happens when you subtract 30 from myvector? Write the code below.
myvector-30
## [1] 0 -14 273 69 -19 81
So, what R did was
## 30-30 # first value of myvector
## 16-30 # second value of myvector
## 303-30 # ....
## 99-30
## 11-30
## 111-30 # last value of myvector
Again, myvector-30 is a vectorized operation.
You can also square a vector:
myvector^2
## [1] 900 256 91809 9801 121 12321
Which is the same as
## 30^2 # first value of myvector
## 16^2 # second value of myvector
## 303^2 # ....
## 99^2
## 11^2
## 111^2 # last value of myvector
You can also take the square root of a vector using the function sqrt()…
sqrt(myvector)
## [1] 5.477226 4.000000 17.406895 9.949874 3.316625 10.535654
…and take the natural log of a vector with log()…
log(myvector)
## [1] 3.401197 2.772589 5.713733 4.595120 2.397895 4.709530
…and just about any other mathematical operation. Here we are working on a separate vector object; all of these rules apply to a column in a matrix or a dataframe.
This attribute of R is called vectorization. When you run the code myvector*10 or log(myvector) you are doing a vectorized operation. It's like normal math with a special vector-based superpower that gets more done with less code than you normally could.
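To see what vectorization saves you, it can help to write the same calculation out as an explicit loop. The sketch below is only an illustration. It re-creates myvector so that it runs on its own, the names vectorized and looped are made up for this example, and it uses square-bracket indexing, which is covered in section 7.5.

```r
myvector <- c(30, 16, 303, 99, 11, 111)

# vectorized version: one expression handles every element
vectorized <- myvector * 10

# loop version: the same work written out element by element
looped <- numeric(length(myvector))   # empty numeric vector of the right length
for (i in seq_along(myvector)) {
  looped[i] <- myvector[i] * 10
}

# both approaches give the same answer
identical(vectorized, looped)
## [1] TRUE
```

The loop is not wrong, but the vectorized form is shorter and easier to read, which is why it is the usual choice in R.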
7.3 Functions on vectors
As we just saw, we can use functions on vectors. Typically these take a vector as input and process all of its numbers into an output. Call the mean() function on the vector we made called myvector.
mean(myvector)
## [1] 95
Note how we get a single value back: the mean of all the values in the vector. Unlike log() or sqrt(), which return one result per element, mean() summarizes the whole set of numbers as a single value.
The function sd() calculates the standard deviation. Apply sd() to myvector:
sd(myvector)
## [1] 110.5061
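Summary functions like these can be rebuilt from the vectorized operations in the previous section, which is a handy way to check that you understand what they do. The sketch below re-creates myvector so that it runs on its own. The names n, manual_mean and manual_sd are made up for this example. Note that sd() computes the sample standard deviation, which divides by n - 1.

```r
myvector <- c(30, 16, 303, 99, 11, 111)
n <- length(myvector)

# mean: add up the values and divide by how many there are
manual_mean <- sum(myvector) / n

# sample standard deviation: subtract the mean from every element,
# square each difference, sum them, divide by n - 1, take the square root
manual_sd <- sqrt(sum((myvector - manual_mean)^2) / (n - 1))

# compare with the built-in functions
all.equal(manual_mean, mean(myvector))
## [1] TRUE
all.equal(manual_sd, sd(myvector))
## [1] TRUE
```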
7.4 Operations with two vectors
You can also subtract one vector from another vector. This can be a little weird when you first see it. Make another vector with the numbers 5, 10, 15, 20, 25, 30. Call this myvector2:
myvector2 <- c(5, 10, 15, 20, 25, 30)
Now subtract myvector2 from myvector. What happens?
myvector-myvector2
## [1] 25 6 288 79 -14 81
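What happened is that R paired up the elements by position: the first element of myvector2 was subtracted from the first element of myvector, the second from the second, and so on. The sketch below checks this for the first and last positions. It re-creates both vectors so that it runs on its own, and the name difference is made up for this example.

```r
myvector  <- c(30, 16, 303, 99, 11, 111)
myvector2 <- c(5, 10, 15, 20, 25, 30)

difference <- myvector - myvector2

# the first result is the first element minus the first element...
difference[1] == myvector[1] - myvector2[1]
## [1] TRUE

# ...and the last result is the last element minus the last element
difference[6] == myvector[6] - myvector2[6]
## [1] TRUE

# the result has the same length as the two inputs
length(difference)
## [1] 6
```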
7.5 Subsetting vectors
You can extract an element of a vector by typing the vector name with the index of that element given in square brackets. For example, to get the value of the 3rd element in the vector myvector, we type:
myvector[3]
## [1] 303
Extract the 4th element of the vector:
myvector[4]
## [1] 99
You can extract more than one element by using a vector in the brackets.
First, say I want to extract the 3rd and the 4th elements. I can make a vector with 3 and 4 in it:
nums <- c(3,4)
Then put that vector in the brackets:
myvector[nums]
## [1] 303 99
We can also do it directly like this, skipping the vector-creation step:
myvector[c(3,4)]
## [1] 303 99
In the chunk below extract the 1st and 2nd elements:
myvector[c(1,2)]
## [1] 30 16
7.6 Sequences of numbers
Often we want a vector of numbers in sequential order. That is, a vector with the numbers 1, 2, 3, 4, … or 5, 10, 15, 20, … For steps of 1, the easiest way to do this is using a colon:
1:10
## [1] 1 2 3 4 5 6 7 8 9 10
Note that in R 1:10 is equivalent to c(1:10):
c(1:10)
## [1] 1 2 3 4 5 6 7 8 9 10
Usually, to emphasize that a vector is being created, I will use c(1:10).
We can go from any number to any other number:
c(20:30)
## [1] 20 21 22 23 24 25 26 27 28 29 30
We can also do it in reverse. In the code below put 30 before 20:
c(30:20)
## [1] 30 29 28 27 26 25 24 23 22 21 20
A useful function in R is seq(), which creates a vector containing a sequence of numbers that runs from one number to another.
seq(1, 10)
## [1] 1 2 3 4 5 6 7 8 9 10
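The three ways of writing this sequence can be compared directly. The sketch below uses only base R functions: identical() tests whether two objects are exactly the same, and all.equal() tests whether their values match.

```r
# the colon with and without c() gives exactly the same vector
identical(1:10, c(1:10))
## [1] TRUE

# seq() with a start and an end gives the same values as the colon
all.equal(1:10, seq(1, 10))
## [1] TRUE
```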
Using seq() instead of a : can be useful for readability to make it explicit what is going on. More importantly, seq() has an argument by = ... so you can make a sequence of numbers with any interval between them. For example, if we want to create the sequence of numbers from 1 to 10 in steps of 1 (i.e., 1, 2, 3, 4, … 10), we can type:
seq(1, 10,
by = 1)
## [1] 1 2 3 4 5 6 7 8 9 10
We can change the step size by altering the value of the by argument given to the function seq(). For example, if we want to create a sequence of numbers from 1 to 101 in steps of 20 (i.e., 1, 21, 41, … 101), we can type:
seq(1, 101,
by = 20)
## [1] 1 21 41 61 81 101
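Because seq() returns an ordinary vector, its output can be used anywhere a vector is expected, including inside square brackets. Here is a minimal sketch that picks out every other element of a vector. It re-creates myvector so that it runs on its own, and the name odd_positions is made up for this example.

```r
myvector <- c(30, 16, 303, 99, 11, 111)

# positions 1, 3, 5: start at 1, stop at the length of the vector, step by 2
odd_positions <- seq(1, length(myvector), by = 2)
odd_positions
## [1] 1 3 5

# use the sequence as an index to extract those elements
myvector[odd_positions]
## [1] 30 303 11
```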
7.7 Vectors can hold numeric or character data
The vector we created above holds numeric data, as indicated by class():
class(myvector)
## [1] "numeric"
Vectors can also hold character data, like the letters of the genetic code:
# vector of character data
myvector <- c("A","T","G")
# how it looks
myvector
## [1] "A" "T" "G"
# what it "is"
class(myvector)
## [1] "character"
7.8 Regular expressions can modify character data
We can use regular expressions to modify character data. For example, we can use gsub() to change the Ts to Us:
myvector <- gsub("T", "U", myvector)
Now check it out:
myvector
## [1] "A" "U" "G"
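gsub() is also vectorized: it replaces every match in every element of a character vector, and the pattern can be a regular expression, not just a literal letter. The sketch below uses made-up example data (the names dna and rna and the three sequences are hypothetical) to show both ideas.

```r
# hypothetical example data: three short DNA sequences
dna <- c("ATG", "TTA", "GCC")

# every T in every element is replaced with U
rna <- gsub("T", "U", dna)
rna
## [1] "AUG" "UUA" "GCC"

# a regular expression pattern: [AT] matches either an A or a T
gsub("[AT]", "N", dna)
## [1] "NNG" "NNN" "GCC"
```

The third element has no A or T, so gsub() returns it unchanged.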
Regular expressions are a deep subject in computing. You can find some more information about them here.