Vectors in R

A vector in R is an ordered collection of values, and it is the fundamental data type on which R operates.

The term "values" in R is broader in scope than what we normally take the term to mean. A value in R can be a numerical value (e.g., 1.2, 5, -79.843), but it can also be a character or string of characters (e.g., "a", "alice", "trial_4"). A value can also be a logical value (i.e., TRUE or FALSE). As a convenience, a special value, NA, can be used in any vector as a placeholder for a value that is "not available" for one reason or another.
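To make these kinds of values concrete, the short sketch below asks R for the class of a few of them and shows NA sitting inside an otherwise numerical vector. The variable name scores is made up for illustration.

```r
class(1.2)        # "numeric"
class("alice")    # "character"
class(TRUE)       # "logical"

# NA marks a missing entry without changing the type of the vector
scores = c(90, NA, 75)
class(scores)     # "numeric"
is.na(scores)     # FALSE  TRUE FALSE
```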

An important characteristic of a vector, however, is that all of its values must be of the same type. Thus, you can't keep numerical values and strings of characters side by side in the same vector, for instance. If you try, R silently converts all of the values to a single common type.
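A quick way to see this rule in action is to try mixing types anyway. In the sketch below (the variable names are illustrative), R does not raise an error; it converts every value to one common type.

```r
# A number, a string, and a logical value all become strings
mixed = c(1, "a", TRUE)
mixed             # "1"    "a"    "TRUE"
class(mixed)      # "character"

# A number mixed with logical values: TRUE becomes 1, FALSE becomes 0
counts = c(2, TRUE, FALSE)
counts            # 2 1 0
class(counts)     # "numeric"
```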

A variable can store a vector just as easily as it stores a single numerical value (Spoiler alert: single numerical values actually are vectors in R). Indeed, variables can store any of the types of "values" mentioned above.

Creating and Storing Vectors

To create a vector and assign it to a variable, we use the c() function and the = assignment operator, as the next couple of examples illustrate. Recall that the = operator as used here does not mean what it means in mathematics. It means instead to take the value specified on its right side and store it in the variable named on its left side.

> myNumericalVector = c(1,2,3)
> myNumericalVector
[1] 1 2 3

> myStringVector = c("alpha","beta","gamma")
> myStringVector
[1] "alpha" "beta"  "gamma"

> myLogicalVector = c(TRUE,FALSE,FALSE,FALSE)
> myLogicalVector
[1]  TRUE FALSE FALSE FALSE

In the case where we only wish to store a single value, we may do so in a manner that requires slightly less typing. However, the result is still a vector.

> x = 5          # yields exactly the same output as: x = c(5)
> x
[1] 5
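One way to convince yourself that a single value really is a vector is to ask R directly. The checks below are a minimal sketch using the base functions is.vector(), length(), and identical().

```r
x = 5
is.vector(x)          # TRUE
length(x)             # 1
identical(x, c(5))    # TRUE
x[1]                  # 5, the first (and only) element
```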

In the case where we wish to store a vector whose consecutive elements have a common difference of one (e.g., $(4,5,6,7,8)$), we can use a colon to more quickly define the vector, as shown below.

> a = 4:8        # yields exactly the same output as: a = c(4,5,6,7,8)
> a
[1] 4 5 6 7 8
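Two details about the colon are worth a short sketch (the name down is illustrative). It also counts downward when the left value is larger. Additionally, although the printed output matches the c() form, the colon stores its values as integers rather than as general numerical values, which rarely matters in practice.

```r
down = 8:4
down                          # 8 7 6 5 4

a = 4:8
typeof(a)                     # "integer"
typeof(c(4,5,6,7,8))          # "double"
all(a == c(4,5,6,7,8))        # TRUE, the values themselves agree
```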

To create a vector whose elements form an arithmetic sequence with a common difference other than one, we can use the seq() function:

> b = seq(from=6,to=18,by=2)           # here we specify the common difference with "by="
> b
[1]  6  8 10 12 14 16 18

> d = seq(from=6,to=26,length.out=11)  # the number of elements is determined by "length.out="
> d
[1]  6  8 10 12 14 16 18 20 22 24 26

> e = seq(from=5,by=5,length.out=5)    # other combinations of arguments can work as well
> e
[1]  5 10 15 20 25
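The by= argument need not be a positive whole number. As a brief sketch with illustrative variable names, seq() also handles fractional and negative steps, and it stops at the last value that does not pass to=.

```r
quarters = seq(from=0, to=1, by=0.25)
quarters              # 0.00 0.25 0.50 0.75 1.00

countdown = seq(from=10, to=1, by=-3)
countdown             # 10 7 4 1

# 13 would pass 10, so the sequence ends at 9
short = seq(from=1, to=10, by=4)
short                 # 1 5 9
```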

Vectors can be easily concatenated (i.e., joined together) by simply using them as arguments to the c() function.

> v1 = c(1,2,3)
> v1
[1] 1 2 3

> v2 = 4:6
> v2
[1] 4 5 6

> v3 = c(v1,v2,7)
> v3
[1] 1 2 3 4 5 6 7
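Calls to c() can also be nested. In the sketch below (the name nested is illustrative), the result is always one flat vector, never a vector that contains other vectors.

```r
v1 = c(1,2,3)
v2 = 4:6

nested = c(v1, c(v2, c(7, 8)))
nested                # 1 2 3 4 5 6 7 8
length(nested)        # 8
```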

To create a vector containing repeated values, we use the rep() function:

> rep(x=1,times=4)
[1] 1 1 1 1

> rep(x=c(1,2,3),each=2)
[1] 1 1 2 2 3 3

> rep(x=c(1,2,3),times=3,each=2)
[1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3

> rep(x=c(1,2,3),times=c(2,3,4))
[1] 1 1 2 2 2 3 3 3 3
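Two further uses of rep() are sketched below with illustrative variable names. It works on strings just as it does on numbers, and its length.out= argument repeats the input until the result reaches a chosen length.

```r
labels = rep(x=c("a","b"), times=2)
labels                # "a" "b" "a" "b"

# Cycle through 1, 2, 3 until there are 7 elements
padded = rep(x=c(1,2,3), length.out=7)
padded                # 1 2 3 1 2 3 1
```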

We can use vectors to create other vectors too. What follows gives some examples of this (but it is certainly not an exhaustive list):
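As a sketch of what this can look like (the particular operations are chosen here for illustration), arithmetic applied to a vector acts on every element and returns a new vector, and functions such as rev() and sort() return rearranged copies.

```r
v = c(1,2,3)
w = c(10,20,30)

doubled = v * 2           # 2 4 6
total   = v + w           # 11 22 33, added position by position
squared = v^2             # 1 4 9

reversed = rev(v)         # 3 2 1
ordered  = sort(c(3,1,2)) # 1 2 3
```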

Other Immediately Useful Vector Functions

The list of functions that take a vector as input in R is ridiculously long. However, the following three functions may prove very useful in the near future:
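Which three functions are meant is not stated here, so the sketch below is only an assumption. It uses length(), which also appears in the subsetting examples in this section, together with sum() and mean().

```r
nums = c(5,8,3,2)

length(nums)   # 4, the number of elements
sum(nums)      # 18, the total of the elements
mean(nums)     # 4.5, the average of the elements
```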

Subsetting

Sometimes one might be interested in the value at a particular position in a vector. Extracting such an element, or a group of such elements, is called subsetting, and can be accomplished through an appropriate use of square brackets, as illustrated below.

> nums = c(5,8,3,2)

> nums[1]    # the 1st element is at position 1 (some other languages count from 0)
[1] 5

> nums[2]
[1] 8

> nums[length(nums)]
[1] 2

> nums[2:4]
[1] 8 3 2

> nums[c(1,3,4)]
[1] 5 3 2
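A few more behaviors of positional subsetting are sketched below with illustrative variable names. The positions may be given in any order, a position may be repeated, and asking for a position past the end of the vector yields NA rather than an error.

```r
nums = c(5,8,3,2)

reordered = nums[c(4,1)]
reordered         # 2 5

repeated = nums[c(1,1,2)]
repeated          # 5 5 8

missing_one = nums[6]
missing_one       # NA, since nums has only 4 elements
```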

One can also extract elements from a vector by using logical values (i.e., TRUE and FALSE). Essentially, for each TRUE seen, the corresponding element of the vector being subsetted will be included in the subset and for each FALSE seen, the corresponding element will be excluded. An example is shown below:

> nums = c(5, 8, 3, 2)
> nums[c(TRUE,FALSE,FALSE,TRUE)]
[1] 5 2

The example immediately above may seem like a cumbersome way to do things -- why would one want to type all those "TRUE" and "FALSE" values when one could simply use nums[c(1,4)] instead?

If you knew the positions of the elements of the vector you want, you would be correct. However, we often want to extract elements of a vector that meet some condition instead.

For example, maybe one wishes to extract all of the even elements of the vector nums. As we will soon see when we discuss logical values in R, there is a super fast way for R to decide if the condition "this element is even" is TRUE or FALSE for each element of a vector, producing a vector of TRUE/FALSE values as a result. The same can be said of many other conditions of interest. We can then use each such generated vector of TRUE/FALSE values to subset nums or whatever other vector we might need to subset (instead of a hand-typed vector of TRUE/FALSE values).
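As a preview, here is a minimal sketch of that idea. It assumes the remainder operator %% and the comparison operators == and >, which belong to the later discussion of logical values. The name is_even is made up for illustration.

```r
nums = c(5,8,3,2)

# TRUE wherever dividing by 2 leaves a remainder of 0
is_even = nums %% 2 == 0
is_even           # FALSE  TRUE FALSE  TRUE

evens = nums[is_even]
evens             # 8 2

# The condition can also be written directly inside the brackets
big = nums[nums > 4]
big               # 5 8
```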

Removing Elements from a Vector

One can also use square brackets to remove elements from a vector. The difference from subsetting is that here we put negative values inside the brackets. The absolute value of each negative value gives a position in the original vector (e.g., $-3$ corresponds to position $3$, $-5$ corresponds to position $5$, and so on).

The result is a new vector that omits the elements of the original vector at the indicated positions. Note that the original vector itself is left unchanged; altering it requires assigning the result back to it, as shown below:

> nums = c(4,5,6,7,8,9,10,11,12)

> nums[-3]                  # create vector identical to nums but with 3rd element removed 
[1]  4  5  7  8  9 10 11 12

> nums                      # nums is left unchanged
[1]  4  5  6  7  8  9 10 11 12

> nums = nums[-c(1,6:8)]    # remove the 1st, 6th, 7th, and 8th elements and reassign the result to nums

> nums                      # now nums has been altered
[1]  5  6  7  8 12
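Two practical points about negative positions are sketched below, using a separate illustrative vector v. Combining a minus sign with length() drops the last element, and a range of positions to remove needs parentheses, because -1:2 means the sequence from -1 up to 2 rather than "minus the positions 1 to 2". R refuses to mix negative and positive positions, and tryCatch() is used here only to capture that error.

```r
v = c(10,20,30,40)

without_last = v[-length(v)]
without_last      # 10 20 30

without_first_two = v[-(1:2)]
without_first_two # 30 40

# -1:2 is c(-1,0,1,2), which mixes negative and positive positions
result = tryCatch(v[-1:2], error = function(e) "error")
result            # "error"
```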