Chapter 19 Simple for() loop example
19.1 Key functions / terms
- paste()
- functions to learn about vectors and other objects
- nchar()
- vector element
- bracket notation to access vector elements
- for() loop
- curly brackets { }
- function()
- class()
Here’s a simple “toy” example of using four loops.
Let’s say we need look AG different codons and need to vary the first base of a codon. For example, we want to look AG all the codons that are “XAG”, so we want
AAG TAG CAG GAG
We make a vector that holds the first base of the code
#element 1 2 3 4
<- c("A","T","C","G") x
This is a vector with 4 elements. We can explore it using the usual functions that tell us about objects (aka variables) in R.
is(x)
## [1] "character" "vector"
## [3] "data.frameRowLabels" "SuperClassMethod"
## [5] "EnumerationValue" "character_OR_connection"
## [7] "character_OR_NULL" "atomic"
## [9] "vector_OR_Vector" "vector_OR_factor"
length(x)
## [1] 4
nchar(x)
## [1] 1 1 1 1
dim(x)
## NULL
nrow(x)
## NULL
ncol(x)
## NULL
If we need to know what a function does we should always look it up in the help file
?nchar
Note that when we call the function nchar() on x we get
nchar(x)
## [1] 1 1 1 1
This means that each element of x has one character in it.
Compare that result with this. Let’s make a vector with a single codon in it.
#element 1
<- c("AGC") y
Now nchar() says this
nchar(y)
## [1] 3
That is, 3 character (3 letters) in the first and only element of the vectors.
If our vector contained codons, it would look like this
<- c("AAG", "TAG", "CAG", "GAG")
y nchar(y)
## [1] 3 3 3 3
That is, four elements in the vector, each element with 3 characters in it.
Let’s say we are keeping the second and third position of our codon fixed AG “AG” and will vary the first position. A function that will be handy is paste().
Paste takes things and combines them into a single element of a vector. So, I can do this with my name
<- paste("Nathan","Linn","Brouwer") n
This gives me a vector of length 1
length(n)
## [1] 1
That contains my name
n
## [1] "Nathan Linn Brouwer"
If I don’t want any spaces I can do this
paste("NAGhan","Linn","Brouwer", sep = "")
## [1] "NAGhanLinnBrouwer"
I can use paste() to assembly codons for me
<- paste("A", "AG", sep = "")
codon1 codon1
## [1] "AAG"
I can make all for possible codon that end in “AG” like this
paste("A", "AG", sep = "")
## [1] "AAG"
paste("T", "AG", sep = "")
## [1] "TAG"
paste("C", "AG", sep = "")
## [1] "CAG"
paste("G", "AG", sep = "")
## [1] "GAG"
Since I have a vector with the first base I’m varying in it, I can also do this using bracket notation, with x[1], x[2], etc.
paste(x[1], "AG", sep = "")
## [1] "AAG"
paste(x[2], "AG", sep = "")
## [1] "TAG"
paste(x[3], "AG", sep = "")
## [1] "CAG"
paste(x[4], "AG", sep = "")
## [1] "GAG"
Copying the same line of code multiple times gets the job done but will be prone to errors. Anytime the same process gets repeated you should consider using for() loops and/or functions. I can take the four lines of code in the previous chunk and turn it into a four loop like this.
for(i in 1:length(x)){
<- paste(x[i],"AG", sep = "")
codon print(codon)
}
## [1] "AAG"
## [1] "TAG"
## [1] "CAG"
## [1] "GAG"
All for loops start with for(…). Don’t worry about what’s in between the parentheses right now. Then there’s a curly bracket {, some code thAG does whAG we want, and a closing curly bracket }.
This is a “toy” example and doesn’t accomplish much - my four loop has as many lines of code as the stuff it replaces. But if I have to do something dozens, hundreds, or thousands of times then its very useful to use for() loops.
Functions also allow you to take a process and consolidated it. Often, functions contain for loops in them. For example, I can consolidated the for() loop into a function like this.
First, I define a function called for_loop_function()
<- function(x){
for_loop_function for(i in 1:length(x)){
<- paste(x[i],"AG", sep = "")
codon print(codon)
} }
All function definitions start with function(…), have a curly bracket {, some code, and end with a }. Functions often don’t have for loops, but in this case it does, so there are two sets of curly brackets, one for the for() loop and one for the function wrapping around it.
So now I can get the results I did before like this
for_loop_function(x)
## [1] "AAG"
## [1] "TAG"
## [1] "CAG"
## [1] "GAG"
This is handy if I’m going to modify or reuse the process I’m doing. Let’s say I want to work with RNA instead of DNA. I’ll define a vector like this with U as the second element of the vector instead of T
<- c("A","U" ,"C", "G")
x1
2] x1[
## [1] "U"
Now I can get my results for RNA
for_loop_function(x1)
## [1] "AAG"
## [1] "UAG"
## [1] "CAG"
## [1] "GAG"
Note that because everything in R is an object, I can learn about objects containing functions like this
is(for_loop_function)
## [1] "function" "OptionalFunction" "PossibleMethod"
## [4] "expression_OR_function"
The first element of the output tells me that this is a function object.,
More succinctly I can use the class() function
class(for_loop_function)
## [1] "function"
Here’s an example of another function. Note the key elements: The function name: entrez_fetch_list The assignment operator <- The function() function The brackets
In this case, there’s also a for() loop in this function.
<- function(db, id, rettype, ...){
entrez_fetch_list
#setup list for storing output
<- length(id)
n.seq <- as.list(rep(NA, n.seq))
list.output names(list.output) <- id
# get output
for(i in 1:length(id)){
<- rentrez::entrez_fetch(db = db,
list.output[[i]] id = id[i],
rettype = rettype)
}
return(list.output)
}