Loops in R

There are many times when one needs to do essentially the same calculation over and over again, perhaps thousands of times -- or more. When writing any program, whether in R or some other language, one certainly doesn't want to write the code to do this calculation over and over again.

Fortunately, almost all programming languages provide "looping structures" that allow us to execute a statement or group of statements over and over again, either some fixed number of times, or until some specified condition is met.

The simplest looping structure in R is the "while-loop". Here's the syntax:

while (some.boolean.condition) {
   statement(s)
}

The while-loop will repeatedly execute the statements in the block following the boolean condition as long as the boolean condition is true. The condition is tested before each pass through the block; the first time it is found to be false, the loop stops, and the program proceeds normally.

Noting that cat(x) prints the value of $x$ to the screen, consider the following simple example of a while-loop, where a variable count is initially given the value of $0$, and then the following two actions happen repeatedly: 1) the word "Hello" is printed to the screen; and 2) the value of count is incremented -- both of these continuing to happen until the condition (count < 5) is false.

count = 0
while (count < 5) {
  cat("Hello ")
  count = count + 1
}

Here's the output seen upon executing the above:

Hello Hello Hello Hello Hello
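
To make the timing of the condition check concrete, here is a small sketch of two edge cases. The variables passes and after.increment are hypothetical names introduced only for this illustration. The first loop never runs its body because the condition is false from the start; the second shows that the condition is only consulted at the top of each pass, so the rest of the body still finishes after the condition has become false.

```r
# Case 1: the condition is false before the first pass, so the body never runs.
count = 7
passes = 0              # hypothetical counter of completed passes
while (count < 5) {
  passes = passes + 1
  count = count + 1
}
passes                  # 0

# Case 2: the condition becomes false partway through the body,
# yet the remaining statement in that pass still runs.
count = 4
after.increment = NA    # hypothetical variable recording what the body saw
while (count < 5) {
  count = count + 1           # count is now 5, so (count < 5) is false...
  after.increment = count     # ...but this line still executes
}
after.increment         # 5
```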

We can use loops to build vectors too. Remembering that c(a,b) combines values or vectors a and b into a single vector, note how easily we can use a while-loop to create a vector of consecutive values from $1$ to $10$:

v = c()
i = 1
while (i <= 10) {
  v = c(v,i)
  i = i + 1
}
v

Here's the output:

 [1]  1  2  3  4  5  6  7  8  9 10

The two examples above share a structure needed in many applications of loops. They both have a variable that essentially just counts how many times one has gone through the loop. In the first, that variable is count. In the second, it's i.

As is the case with many other languages, there is another looping structure in R made just for these types of circumstances. The structure in question is a "for-loop" and its syntax is shown below:

for (value in sequence) {
   statement(s)
}

In this case, the variable value will take on all of the values of sequence, in order -- and with each such value, the for-loop executes the statements found between the curly braces.
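
As a quick illustration of "in order", the sequence does not have to be a run of consecutive integers. The sketch below walks through an arbitrary vector and keeps a running total; the names x and total are made up for this example.

```r
total = 0
for (x in c(3, 1, 4)) {
  # x is 3 on the first pass, then 1, then 4
  total = total + x
}
total   # 8
```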

Consider how we can print "Hello" to the screen 5 times more easily using a for-loop:

for (count in 0:4) {
  cat("Hello ")
}

Or consider how we can construct a vector of consecutive integers from 1 to 10:

v = c()
for (i in 1:10) {
  v = c(v,i)
}
v

Of course, if we just want a vector from $1$ to $10$, there are easier ways in R. Just use:

> 1:10 
 [1]  1  2  3  4  5  6  7  8  9 10

Indeed, R has been designed so that loops are often unnecessary. Because of this, one should think carefully before introducing loops into an R program.
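
One common way a loop becomes unnecessary is that arithmetic in R applies to whole vectors at once. As a rough sketch (the task of summing the squares of 1 through 10 is chosen only for illustration), compare a loop with its vectorized equivalent:

```r
# Loop version: add up the squares one at a time
total = 0
for (i in 1:10) {
  total = total + i^2
}
total            # 385

# Vectorized version: square the whole vector, then sum it
sum((1:10)^2)    # 385
```

Both produce the same value; the second simply leaves the iteration to R.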

Where loops in R shine, however, is when one doesn't know in advance how many times "through the loop" one needs to go.

Suppose one wanted to find the highest power of two that is still less than a million. Consider the following while-loop as a means to answer that question:

n = 1
exponent = 0
while (2*n < 1000000) {
  n = 2*n
  exponent = exponent + 1
}
exponent
[1] 19
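
When the loop above finishes, n holds $2^{19} = 524288$, and doubling it once more would exceed a million. The same idea can be wrapped in a function, sketched below under the assumption that limit is greater than 1 and base is an integer of at least 2. The name highest.exponent.below and its arguments are hypothetical, not part of R.

```r
# Hypothetical helper: largest exponent e with base^e < limit.
# Assumes limit > 1 and base >= 2.
highest.exponent.below = function(limit, base = 2) {
  n = 1
  exponent = 0
  while (base * n < limit) {
    n = base * n
    exponent = exponent + 1
  }
  exponent
}

highest.exponent.below(1000000)       # 19
highest.exponent.below(1000000, 10)   # 5, since 10^5 < 10^6 but 10^6 is not
```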

As another nice example, there is a famous unsolved problem in mathematics called the Collatz Conjecture. This conjecture states that if $$f(n) = \left\{ \begin{array}{cll} n\,/\,2 &, & \textrm{ when $n$ is even}\\ 3n+1 & , & \textrm{ when $n$ is odd} \end{array} \right.$$ and one picks any positive integer $n$, finds $f(n)$, and then finds $f$ of that value, and $f$ of that value, and so on -- one eventually will see the value $1$ in the list of values so produced.

For example, $f(5) = 16, \quad f(16)=8, \quad f(8) = 4, \quad f(4) = 2,$ and $f(2) = 1$.

The output values produced in this way are called the "iterates" of $n$ under $f$. Suppose we wanted to write an R function to produce a vector of all of the iterates of a given value up to the conjectured $1$. Here, it is not clear in advance how long such a vector will be, so a while-loop is a natural way to proceed:

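# One step of the Collatz map: halve an even n, send an odd n to 3n+1
f = function(n) {
  ifelse(n %% 2 == 0, n %/% 2, 3*n+1)
}

# Collect n followed by its successive iterates, stopping once 1 appears
collatz.iterates = function(n) {
  v = c(n)
  # v[length(v)] is the most recently computed iterate
  while (v[length(v)] != 1) {
    v = c(v,f(v[length(v)]))
  }
  return(v)
}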

Note the different lengths of the sequences of iterates for $n=5$ and $n=23$:

> collatz.iterates(5)
 [1]  5 16  8  4  2  1
> collatz.iterates(23)
 [1]  23  70  35 106  53 160  80  40  20  10   5  16   8   4   2   1
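
If only the number of steps is of interest, and not the iterates themselves, the vector is not needed at all. The following is a minimal sketch along those lines; collatz.steps is a hypothetical name, and the function assumes n is a positive integer whose iterates do reach $1$ (otherwise the loop would never end).

```r
# Hypothetical helper: count how many applications of the Collatz map
# are needed to get from n to 1. Assumes n is a positive integer.
collatz.steps = function(n) {
  steps = 0
  while (n != 1) {
    n = if (n %% 2 == 0) n %/% 2 else 3*n + 1
    steps = steps + 1
  }
  steps
}

collatz.steps(5)    # 5
collatz.steps(23)   # 15
```

These counts are one less than the lengths of the vectors shown above, since those vectors include the starting value.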