## Vectors in R

A vector in R is an ordered collection of values. It is the fundamental data type on which most of R's functionality is built.

The term "values" is broader in R than in everyday usage. A value in R can be a number (e.g., 1.2, 5, -79.843), a character or string of characters (e.g., "a", "alice", "trial_4"), or a logical value (i.e., TRUE or FALSE). As a convenience, the special value NA can appear in any vector as a placeholder for a value that is "not available" for one reason or another.

An important characteristic of a vector, however, is that all of its values must be of the same type. Thus, a single vector cannot hold both numerical values and strings of characters, for instance. If you try to combine them, R converts every element to one common type.
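
A minimal sketch of what happens when types are mixed anyway: rather than raising an error, c() converts everything to a single common type. The variable names below are illustrative only, and class() is used simply to report the type of the result.

```r
# Mixing a number and a string: the number is converted to a string
mixed = c(1, "a")
class(mixed)            # "character"

# Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0
numLog = c(5, TRUE, FALSE)
class(numLog)           # "numeric"

# NA does not force a type change; it fits into any vector
withNA = c(2.5, NA, 7)
class(withNA)           # "numeric"
```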

A variable can store a vector just as easily as it stores a single numerical value (spoiler alert: in R, a single numerical value is actually a vector of length one). Indeed, variables can store any of the types of "values" mentioned above.

### Creating and Storing Vectors

To create a vector and assign it to a variable, we use the c() function and the = assignment operator, as the next few examples illustrate. Recall that the = operator as used here does not mean what it means in mathematics. Instead, it takes the value specified on its right side and stores it in the variable named on its left side.

> myNumericalVector = c(1,2,3)
> myNumericalVector
[1] 1 2 3

> myStringVector = c("alpha","beta","gamma")
> myStringVector
[1] "alpha" "beta"  "gamma"

> myLogicalVector = c(TRUE,FALSE,FALSE,FALSE)
> myLogicalVector
[1]  TRUE FALSE FALSE FALSE


In the case where we only wish to store a single value, we may do so in a manner that requires slightly less typing. However, the result is still a vector.

> x = 5          # yields exactly the same output as: x = c(5)
> x
[1] 5
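
As a quick check of the claim that a single value is still a vector, the following sketch uses the base functions is.vector(), length(), and identical(). It is only an illustration and is not needed for everyday work.

```r
x = 5

is.vector(x)          # TRUE: a single number is a vector
length(x)             # 1
identical(x, c(5))    # TRUE: both are the same vector of length one
x[1]                  # 5
```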


In the case where we wish to store a vector whose consecutive elements have a common difference of one (e.g., $(4,5,6,7,8)$), we can use a colon to more quickly define the vector, as shown below.

> a = 4:8        # yields exactly the same output as: a = c(4,5,6,7,8)
> a
[1] 4 5 6 7 8
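
One subtlety, sketched below with illustrative variable names: the colon operator stores its whole numbers as integers, whereas c(4,5,6,7,8) stores them as general (double-precision) numbers. The printed output and the element-by-element values agree, so the difference rarely matters in practice. The colon operator also counts downward when the value on its left is larger.

```r
a = 4:8
b = c(4,5,6,7,8)

all(a == b)     # TRUE: the same values, element by element
typeof(a)       # "integer"
typeof(b)       # "double"

down = 8:4      # the colon operator can also count down
down            # 8 7 6 5 4
```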


To create a vector whose elements form an arithmetic sequence with a common difference other than one, we can use the seq() function:

> b = seq(from=6,to=18,by=2)           # here we specify the common difference with "by="
> b
[1]  6  8 10 12 14 16 18

> d = seq(from=6,to=26,length.out=11)  # the number of elements is determined by "length.out="
> d
[1]  6  8 10 12 14 16 18 20 22 24 26

> e = seq(from=5,by=5,length.out=5)    # other combinations of arguments can work as well
> e
[1]  5 10 15 20 25
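
A few further cases of seq(), offered as a sketch with made-up variable names: a sequence whose step does not land exactly on "to", a decreasing sequence, and a sequence with fractional steps.

```r
# The sequence stops at the last value that does not go past "to"
s1 = seq(from=1, to=10, by=4)
s1                                  # 1 5 9

# A negative "by" produces a decreasing sequence
s2 = seq(from=10, to=0, by=-5)
s2                                  # 10 5 0

# With length.out, the step need not be a whole number
s3 = seq(from=0, to=1, length.out=5)
s3                                  # 0.00 0.25 0.50 0.75 1.00
```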


Vectors can be easily concatenated (i.e., joined together) by simply using them as arguments to the c() function.

> v1 = c(1,2,3)
> v1
[1] 1 2 3

> v2 = 4:6
> v2
[1] 4 5 6

> v3 = c(v1,v2,7)
> v3
[1] 1 2 3 4 5 6 7


To create a vector containing repeated values, we use the rep() function:

> rep(x=1,times=4)
[1] 1 1 1 1

> rep(x=c(1,2,3),each=2)
[1] 1 1 2 2 3 3

> rep(x=c(1,2,3),times=3,each=2)
[1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3

> rep(x=c(1,2,3),times=c(2,3,4))
[1] 1 1 2 2 2 3 3 3 3
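
The rep() function is not limited to numbers, and it also accepts a length.out argument that fixes the length of the result. The sketch below uses illustrative variable names.

```r
# Repeating a vector of strings
r1 = rep(x=c("a","b"), times=2)
r1                                   # "a" "b" "a" "b"

# Repeating until the result has exactly 7 elements
r2 = rep(x=c(1,2,3), length.out=7)
r2                                   # 1 2 3 1 2 3 1
```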


We can use vectors to create other vectors too. The following examples illustrate this, though the list is certainly not exhaustive:

• Use the sort() function to put a vector's elements in increasing or decreasing order:
> data = c(3,2,9,5,4,1,6,8,7)
> sort(data, decreasing=FALSE)
[1] 1 2 3 4 5 6 7 8 9
> sort(data, decreasing=TRUE)
[1] 9 8 7 6 5 4 3 2 1

• Use the sqrt() function to take square roots of the elements of a vector:
> squares = c(1,4,9,16,25)
> sqrt(squares)
[1] 1 2 3 4 5


Other mathematical functions, such as log(), exp(), and sin(), work similarly.

• Use the is.na() function to determine which values of a vector are NA or NaN. Note that the result is a vector of logical values (i.e., TRUE or FALSE).
> v = c(1,5,NA,3,NaN,7,NA,NA)
> is.na(v)
[1] FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE

• You can add two vectors together, which adds their corresponding elements.
> vec = c(1,2,3) + c(4,5,6)
> vec
[1] 5 7 9

Similar things happen with other arithmetic operators. However, be careful when asking R to divide by zero or perform some other undefined mathematical operation. In such cases, the special values Inf, -Inf, or NaN appear when the result of a calculation is "infinite" or "not a number", as the example below demonstrates:

> v1 = c(1,0,1,-1)
> v2 = c(1,0,0,0)
> v1/v2
[1]    1  NaN  Inf -Inf

• Adding a value to a vector creates a new vector formed by adding the value to each of the original vector's values. Similar things happen with other arithmetic operators.
> vec = c(1,2,3,4)
> vec+3
[1] 4 5 6 7

• As an extension of the previous idea (or, more precisely, as its explanation), when you combine two vectors of different lengths, the shorter one is expanded to the length of the longer one by repeating its elements, and then the vectors are combined element by element. This is called recycling.

If the longer vector's length is not a multiple of the shorter vector's length, the calculation still proceeds, but R issues a warning.

> c(1,2,3,4,5,6) + c(1,2,3)
[1] 2 4 6 5 7 9

> c(1,2,3,4,5,6,7,8) + c(1,2,3)
[1]  2  4  6  5  7  9  8 10
Warning message:
In c(1, 2, 3, 4, 5, 6, 7, 8) + c(1, 2, 3) :
  longer object length is not a multiple of shorter object length
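
The recycling rule can be checked directly. The sketch below (with illustrative variable names) confirms that adding a single value gives the same result as adding a vector of repeated values, and shows recycling with other arithmetic operators.

```r
vec = c(1,2,3,4)

# Adding a single value recycles it to the length of vec
sameResult = identical(vec + 3, vec + c(3,3,3,3))
sameResult          # TRUE

# c(10,100) is recycled to 10,100,10,100 before multiplying
scaled = vec * c(10,100)
scaled              # 10 200 30 400

# Exponents and subtraction also work element by element
squared = vec ^ 2
squared             # 1 4 9 16
diffs = vec - c(4,3,2,1)
diffs               # -3 -1 1 3
```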


### Other Immediately Useful Vector Functions

The list of functions that take a vector as input in R is ridiculously long. However, the following three functions may prove very useful in the near future:

• We can find the number of elements in a given vector with the length() function:

> nums = c(5,8,3,2)
> length(nums)
[1] 4

• We can find the sum and product of the elements in a given vector with the sum() and prod() functions:

> nums = c(5,8,3,2)
> sum(nums)
[1] 18
> prod(nums)
[1] 240
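
These functions combine naturally. As a sketch, the average of a vector is its sum divided by its length, which agrees with R's built-in mean() function (not discussed above). The example also shows how a single NA affects the result and how the na.rm argument removes NA values before the calculation.

```r
nums = c(5,8,3,2)

# The average is the sum divided by the number of elements
avg = sum(nums) / length(nums)
avg                         # 4.5
avg == mean(nums)           # TRUE

# A single NA makes the sum NA unless it is removed first
withNA = c(5,8,NA,2)
sum(withNA)                 # NA
sum(withNA, na.rm=TRUE)     # 15
prod(withNA, na.rm=TRUE)    # 80
```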


### Subsetting

Sometimes one might be interested in the value at a particular position in a vector. Extracting such an element, or a group of such elements, is called subsetting and is accomplished with square brackets, as illustrated below.

> nums = c(5,8,3,2)

> nums[1]    # The 1st element is at position 1, unlike in some other
             # programming languages, where counting starts at 0
[1] 5

> nums[2]
[1] 8

> nums[length(nums)]
[1] 2

> nums[2:4]
[1] 8 3 2

> nums[c(1,3,4)]
[1] 5 3 2


One can also extract elements from a vector by using logical values (i.e., TRUE and FALSE). Essentially, for each TRUE seen, the corresponding element of the vector being subsetted will be included in the subset and for each FALSE seen, the corresponding element will be excluded. An example is shown below:

> nums = c(5, 8, 3, 2)
> nums[c(TRUE,FALSE,FALSE,TRUE)]
[1] 5 2


The example immediately above may seem like a cumbersome way to do things -- why would one want to type all those "TRUE" and "FALSE" values when one could simply use nums[c(1,4)] instead?

If you already know the positions of the elements you want, then indexing by position is indeed simpler. However, we often want to extract the elements of a vector that meet some condition instead.

For example, maybe one wishes to extract all of the even elements of the vector nums. As we will soon see when we discuss logical values in R, there is a super fast way for R to decide if the condition "this element is even" is TRUE or FALSE for each element of a vector, producing a vector of TRUE/FALSE values as a result. The same can be said of many other conditions of interest. We can then use each such generated vector of TRUE/FALSE values to subset nums or whatever other vector we might need to subset (instead of a hand-typed vector of TRUE/FALSE values).
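
As a preview, and only as a sketch, the even-element example might look like the following. It assumes the %% (remainder) operator and the == and > comparison operators, which are not covered in this section, and the variable name isEven is illustrative.

```r
nums = c(5, 8, 3, 2)

# nums %% 2 gives each element's remainder after division by 2;
# comparing with 0 yields one TRUE/FALSE value per element
isEven = nums %% 2 == 0
isEven              # FALSE TRUE FALSE TRUE

# Use the logical vector to keep only the even elements
evens = nums[isEven]
evens               # 8 2

# The same idea with a different condition: elements greater than 4
big = nums[nums > 4]
big                 # 5 8
```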

### Removing Elements from a Vector

One can also use square brackets to remove elements from a vector. The difference between this use of square brackets and subsetting is that here we use negative values inside the brackets. The absolute value of each negative number gives a position in the original vector (e.g., $-3$ corresponds to position $3$, $-5$ corresponds to position $5$, and so on).

The result is a new vector with the elements of the original vector at the indicated positions removed. Note that actually altering the original vector requires another assignment, as shown below:

> nums = c(4,5,6,7,8,9,10,11,12)

> nums[-3]                  # create vector identical to nums but with 3rd element removed
[1]  4  5  7  8  9 10 11 12

> nums                      # nums is left unchanged
[1]  4  5  6  7  8  9 10 11 12

> nums = nums[-c(1,6:8)]    # remove elements in 1st, 6th, 7th, and 8th positions
                            # and reassign this new vector to nums

> nums                      # now nums has been altered
[1]  5  6  7  8 12
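
Two related points, sketched below with illustrative variable names: negative positions can be computed (for example, to drop the last element of a vector of any length), and removing some positions gives the same result as keeping all of the others.

```r
nums = c(4,5,6,7,8,9,10,11,12)

# Drop the last element, whatever the length happens to be
allButLast = nums[-length(nums)]
allButLast                  # 4 5 6 7 8 9 10 11

# Removing positions 1, 6, 7, and 8 is the same as
# keeping positions 2, 3, 4, 5, and 9
removed = nums[-c(1, 6:8)]
kept = nums[c(2:5, 9)]
identical(removed, kept)    # TRUE
```

Note that R does not allow positive and negative positions to be mixed inside the same pair of brackets.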