R and Matrices

Recall that matrices are simply two dimensional arrays of values, as suggested by the following examples:

$$\begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix}, \quad \begin{bmatrix} 5 & 3 & 9\\ 1 & -2 & 7\\ -8 & 2 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 2 & 5\\ 3 & 4 & 6 \end{bmatrix}$$

They have their own arithmetic, with addition and subtraction being defined element-wise:

$$\begin{bmatrix} a & b\\ c & d \end{bmatrix} + \begin{bmatrix} e & f\\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f\\ c+g & d+h \end{bmatrix}$$ $$\begin{bmatrix} a & b\\ c & d \end{bmatrix} - \begin{bmatrix} e & f\\ g & h \end{bmatrix} = \begin{bmatrix} a-e & b-f\\ c-g & d-h \end{bmatrix}$$

Matrix multiplication (which is only defined when the first matrix's number of columns matches the second matrix's number of rows) is slightly more involved, as suggested by the following:

$$\begin{bmatrix} a & b\\ c & d \end{bmatrix} \cdot \begin{bmatrix} e & f\\ g & h \end{bmatrix} = \begin{bmatrix} ae+bg & af+bh\\ ce+dg & cf+dh \end{bmatrix}$$ $$\begin{bmatrix} a & b & c\\ d & e & f\\ g & h & i \end{bmatrix} \cdot \begin{bmatrix} j & k & l\\ m & n & o\\ p & q & r \end{bmatrix} = \begin{bmatrix} aj+bm+cp & ak+bn+cq & al+bo+cr\\ dj+em+fp & dk+en+fq & dl+eo+fr\\ gj+hm+ip & gk+hn+iq & gl+ho+ir \end{bmatrix}$$

Whether they are intended to be used in conjunction with the operations shown above, or simply to represent tabular data -- R supports matrices through the use of the matrix data type.

Creating Matrices in R

There are several ways to create a matrix in R. The most direct way uses the matrix() function, as shown below. This way is the one most closely tied to how matrices are stored internally in R. Essentially, R stores a single vector of all of the values in the matrix, ordered by column, along with some information about its dimensions so that it can display these values in a two-dimensional array.

> m = matrix(c(1,2,3,4),nrow=2)  # nrow=2 tells R this matrix should have 2 rows
> m
     [,1] [,2]
[1,]    1    3
[2,]    2    4

If desired, one can give the values to the matrix() function in row order instead by using the byrow argument:

> m = matrix(c(1,2,3,4),nrow=2,byrow=TRUE)
> m
     [,1] [,2]
[1,]    1    2
[2,]    3    4

Alternatively, one can "bind" several row vectors together with the rbind() function to create a matrix:

> m = rbind(c(1,2),c(3,4))
> m
     [,1] [,2]
[1,]    1    2
[2,]    3    4

Or, just as easily, one can "bind" column vectors together with the cbind() function:

> m = cbind(c(1,3),c(2,4))
> m
     [,1] [,2]
[1,]    1    2
[2,]    3    4

One can also use these last two functions to add rows or columns to an existing matrix, as suggested by the following:

> m = matrix(c(1,2,3,4),nrow=2,byrow=TRUE)
> m.augmented = cbind(m,c(5,6))
> m.augmented
     [,1] [,2] [,3]
[1,]    1    2    5
[2,]    3    4    6

One powerful function for creating matrices is the outer() function. It allows one to build a matrix by using a function that takes two arguments. For example, suppose we wanted to create a matrix whose element in the $r^{th}$ row and $c^{th}$ column was $\log_r c$. where both $r$ and $c$ ranged from $2$ to $4$. Here's how one might do this with the outer() function:

> outer(2:4,2:4,log)
         [,1]      [,2]      [,3]
[1,] 1.000000 0.6309298 0.5000000
[2,] 1.584963 1.0000000 0.7924813
[3,] 2.000000 1.2618595 1.0000000

As a quick check to see if the result produced was reasonable, we note that $\log_3 2 \doteq 0.6309298$.

One can define their own functions of two arguments, of course. While the broader topic of functions will be saved for later, here is a quick example that can serve as a template for simple functions:

Suppose one wishes to construct a matrix where the element in the $r^{th}$ row and $c^{th}$ column should be $2r+c$:

> outer(1:5,1:5,function(r,c){2*r+c})
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    4    5    6    7
[2,]    5    6    7    8    9
[3,]    7    8    9   10   11
[4,]    9   10   11   12   13
[5,]   11   12   13   14   15

Interestingly, there are some other two-argument functions that are used all the time, but that probably didn't come to mind when thinking about the outer function.

Binary operators like addition ("+"), subtraction ("-"), multiplication ("*"), and division ("/") are actually functions in R. We have seen that R supports using these operations in the ways we normally write them , as seen below

> 2 + 3
[1] 5

> 5 - 1
[1] 4

> 7 * 9
[1] 63
However, one can also apply these operators using function notation -- where the name of the function is the operator symbol in double quotes, as shown below:

> "+"(2,3)
[1] 5

> "-"(5,1)
[1] 4

> "*"(7,9)
[1] 63

So as one last example of creating a matrix, note how quickly we can quickly make a matrix corresponding to the body of a multiplication table in R using the multiplication "operator" function:

> outer(1:9,1:9,"*")
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
 [1,]    1    2    3    4    5    6    7    8    9
 [2,]    2    4    6    8   10   12   14   16   18
 [3,]    3    6    9   12   15   18   21   24   27
 [4,]    4    8   12   16   20   24   28   32   36
 [5,]    5   10   15   20   25   30   35   40   45
 [6,]    6   12   18   24   30   36   42   48   54
 [7,]    7   14   21   28   35   42   49   56   63
 [8,]    8   16   24   32   40   48   56   64   72
 [9,]    9   18   27   36   45   54   63   72   81

Accessing Elements of a Matrix

Matrices, tables, and data frames share some similarities in how their elements can be accessed. The following shows how to access an individual value, an entire row, and an entire column of a matrix, respectively:

> m = matrix(c(1,2,3,4),nrow=2,byrow=TRUE)
> m
     [,1] [,2]
[1,]    1    2
[2,]    3    4

> m[2,1]  # the element in the 2nd row, 1st column of m
[1] 3

> m[2,]   # the entire 2nd row of m
[1] 3 4

> m[,2]   # the entire 2nd column of m
[1] 2 4

One can create a sub-matrix of a given matrix easily as well, as the following examples demonstrate:

> m3 = matrix(c(2,4,6,8,10,11,12,14,16),nrow=3,byrow=TRUE)
> m3
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   11
[3,]   12   14   16

> m3[,2:3]      # here, we show the sub-matrix built from
                # all row values of m3 for columns 2 through 3
     [,1] [,2]
[1,]    4    6
[2,]   10   11
[3,]   14   16

> m3[1:2,1:2]   # here, we show the sub-matrix built only from
                # the first two row values of m3 for the first two columns
     [,1] [,2]
[1,]    2    4
[2,]    8   10

> m3[,3]        # this last one behaves a bit different...
[1]  6 11 16

Notice in the last example above, the syntax suggested we were after all of the row values in column 3 -- but the result wasn't shown as a column. Indeed, the result wasn't even a matrix -- it was a vector! This is an example of R doing an automatic dimension reduction on the output. We can suppress this reduction of dimension and force R to return a matrix with just one column if we employ the drop argument of the subset function:

> m3 = matrix(c(2,4,6,8,10,11,12,14,16),nrow=3,byrow=TRUE)
> m3
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   11
[3,]   12   14   16

> m3[,3,drop=FALSE] 
     [,1]
[1,]    6
[2,]   11
[3,]   16

Just as with vectors, one can use negative values to remove rows or columns as well:

> m3 = matrix(c(2,4,6,8,10,11,12,14,16),nrow=3,byrow=TRUE)
> m3
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   11
[3,]   12   14   16

> m3[-1,]             # here, we remove the first row
     [,1] [,2] [,3]
[1,]    8   10   11
[2,]   12   14   16

> m3[,-3]            # and here, we remove the 3rd column
     [,1] [,2]
[1,]    2    4
[2,]    8   10
[3,]   12   14

Importantly, one can "filter" a matrix down to another new matrix, by pulling out rows (or parts of rows) that adhere to some condition. This sets the stage for the incredibly powerful technique of filtering out rows of a data frame -- which we will see how to do soon. Take the time to go over the following demonstrations carefully. Make sure you understand each one.

> m3 = matrix(c(2,4,6,8,10,11,12,14,16),nrow=3,byrow=TRUE)
> m3
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   11
[3,]   12   14   16

> m3[m3[,2] >= 10,]   # here, we filter out the rows of m3 whose 
                      # second column values are at least 10
     [,1] [,2] [,3]
[1,]    8   10   11
[2,]   12   14   16

> m3[m3[2,] >= 10,]  # since the last two elements of row 2 are at least 10,
                     # this returns the last two rows, all columns 
     [,1] [,2] [,3]
[1,]    8   10   11
[2,]   12   14   16

> m3[m3[2,] >= 10,2:3]  # this is almost identical to the last example, except
                        # only the last two columns are shown
     [,1] [,2]
[1,]   10   11
[2,]   14   16

Applying Functions to Matrices

Similar to how functions could be applied to parts of tables, we can use the apply() function to apply a function to the vectors that make up either the rows or columns of a matrix. The apply() function has four arguments of note: m, a matrix; dimcode, which should be a 1 if you wish to apply the function to the rows of m or a 2 if you wish to apply the function m's columns; f, the name of the function one wishes to apply; and fargs, an optional argument which allows one to specify additional argument values for the function. Here are a few examples of this apply() function in action:

> m3 = matrix(c(2,4,6,8,10,11,12,14,16),nrow=3,byrow=TRUE)
> m3
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   11
[3,]   12   14   16

> apply(m3,1,sum)     # this sums the rows of m3
[1] 12 29 42

> apply(m3,2,sum)     # here, we sum the columns of m3
[1] 22 28 33

> scaled.sum = function(v,k) {   # Let's create a simple function with more
+   return(k*sum(v))             # than one argument...
+ }

> apply(m3,1,scaled.sum,k=5)       # notice the additional optional last argument where
                                   # we specify the scaled.sum function to be applied
                                   # to matrix m3 with the additional argument of k=5
[1]  60 145 210

Matrix Operations

If one attempts to add, subtract, multiply, etc. two matrices -- the results mirror the same operations on vectors. Values are combined element-wise:

> m1 = matrix(c(1,2,3,4),nrow=2,byrow=TRUE)
> m2 = matrix(c(5,6,7,8),nrow=2,byrow=TRUE)

> m1
     [,1] [,2]
[1,]    1    2
[2,]    3    4

> m2
     [,1] [,2]
[1,]    5    6
[2,]    7    8

> m1 + m2       # Note how, for example the top right value below (8)
                # is the sum of the two top right values above (2+6).
                # Things work similarly for the other 3 positions.
     [,1] [,2]
[1,]    6    8
[2,]   10   12

> m1 * m2       # Here too, each value below is simply the product of the
                # corresponding values in the two matrices above
     [,1] [,2]
[1,]    5   12
[2,]   21   32

Of course, true matrix multiplication doesn't work this way. If one wishes to perform matrix multiplication instead of element-wise multiplication, we use the %*% operator:

> m1 %*% m2     # here we have matrix multiplication done the way it is supposed to be!
     [,1] [,2]
[1,]   19   22
[2,]   43   50

One can also perform scalar multiplication of a matrix (i.e., multiplying every element of matrix by a constant) in the normal "vector way":

> m1 = matrix(c(1,2,3,4),nrow=2,byrow=TRUE)

> m1
     [,1] [,2]
[1,]    1    2
[2,]    3    4

> 3 * m1
     [,1] [,2]
[1,]    3    6
[2,]    9   12

Two other matrix operations of note include finding the transpose of a matrix and finding the inverse of a matrix. The former simply swaps the rows of a matrix with its columns. The second finds a matrix $M^{-1}$ for a given square matrix $M$ so that their matrix product is the identity matrix (i.e., a square matrix whose with $1$'s along its main diagonal and $0$'s elsewhere).

One simply uses the t() function to find the transpose of a matrix, as shown below:

> m3 = matrix(c(2,4,6,8,10,11,12,14,16),nrow=3,byrow=TRUE)
> m3
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   11
[3,]   12   14   16

> t(m3)
     [,1] [,2] [,3]
[1,]    2    8   12
[2,]    4   10   14
[3,]    6   11   16

To find the inverse of a square matrix, use the solve() function. Below, we find the inverse of the $3 \times 3$ matrix m3 just given above.

> m3.inv = solve(m3)
> m3.inv
     [,1] [,2] [,3]
[1,] -0.3   -1  0.8
[2,] -0.2    2 -1.3
[3,]  0.4   -1  0.6

The claim was made earlier that the product of a matrix and its inverse is the identity matrix. Indeed, a quick hand calculation confirms that for the inverse just found, we have:

$$\begin{bmatrix} 2 & 4 & 6\\ 8 & 10 & 11\\ 12 & 14 & 16 \end{bmatrix} \cdot \begin{bmatrix} -0.3 & -1 & 0.8\\ -0.2 & 2 & -1.3\\ 0.4 & -1 & 0.6 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}$$

Interestingly, if one finds this product in R, one sees the following:

> product = m3 %*% m3.inv
> product
     [,1]          [,2]         [,3]
[1,]    1 -8.881784e-16 4.440892e-16
[2,]    0  1.000000e+00 1.776357e-15
[3,]    0  0.000000e+00 1.000000e+00

While incredibly close to the identity matrix we expected, things appear a bit off here. Wondering why? Well, as is the case with numerical types in most programming languages, decimal values are stored in an approximate way in R.

The use of such approximations should be no big surprise, given that many values we encounter in mathematics (e.g., $\sqrt{2}, \pi, e, \textrm{etc.}$) involve an infinite number of decimal digits -- which would be impossible for a computer to store no matter how much memory it has.

Unfortunately, using approximate values can cause small errors to creep into one's calculations.

In this particular case, noting that the errors are well below $10^{-10}$, we could simply round the elements to a reasonable number of digits (e.g., 10) to see the identity matrix we had originally expected:

> round(product,digits=10)
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1