Object-oriented programming is admittedly one of the most powerful paradigms in modern computing, but a full discussion of it is probably better suited to a setting where the primary focus is computer science rather than statistics. As such, we will not attempt to describe working with objects in R in its full glory. However, having at least a minimal understanding of objects can help us understand why we see some of the things we see in R.

R offers more than one system for working with objects. Two of the most common are referred to as *S3* and *S4*. The *S3* system is simpler than *S4* and is at the heart of many familiar R actions. In these notes, we will focus solely on *S3*.

Before discussing exactly what an object is, let us first talk about generic functions. You may have noticed that many of the functions you use in R do different things when presented with different types of input. For example, consider the `summary()` function:

    > v = c(1,2,6,5,7,3,4,5)
    > summary(v)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1.000   2.750   4.500   4.125   5.250   7.000
    > t = as.table(matrix(c(51,43,22,92,28,21,68,22,9),ncol=3,byrow=TRUE))
    > summary(t)
    Number of cases in table: 356
    Number of factors: 2
    Test for independence of all factors:
            Chisq = 18.51, df = 4, p-value = 0.0009808

Here, we see that `summary()` does one thing for vectors and something different for tables.

The `print()` function behaves in a similar way, doing different things for different types of input:

    > print(v)
    [1] 1 2 6 5 7 3 4 5
    > print(t)
       A  B  C
    A 51 43 22
    B 92 28 21
    C 68 22  9

Functions in R like these, whose behavior depends on the types of their inputs, are called *generic* functions.
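To see what a generic function has to work with, we can ask R for the class of each input. The short sketch below rebuilds the vector and table from above; the table is stored under the assumed name `tab` (rather than `t`) only so that it does not mask R's built-in `t()` function.

```r
v = c(1, 2, 6, 5, 7, 3, 4, 5)
tab = as.table(matrix(c(51, 43, 22, 92, 28, 21, 68, 22, 9), ncol = 3, byrow = TRUE))

# class() reports the class name a generic function looks at
class(v)     # "numeric"
class(tab)   # "table"

# inherits() asks whether an object belongs to a given class
inherits(tab, "table")   # TRUE
inherits(v, "table")     # FALSE
```

Because the two inputs report different classes, a generic such as `summary()` has what it needs to treat them differently.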

Now suppose that you wanted to write an R program to run simulations on a number of card games. Given that each card has a suit and rank, it would be nice if we could consolidate both pieces of information into something that could be stored in a single variable. Of course, R provides the list data type that serves that purpose well.

Assuming that the ranks of cards, in order, are given by {Ace, 2, 3, ..., 10, Jack, Queen, King} and the suits of cards, in order, are given by {Hearts, Clubs, Diamonds, Spades}, you might start with something like this:

    > my.card = list(rank=11,suit=3)   # a possible way to represent the Jack of Diamonds

However, when we print `my.card`, the result is similar to what one would see when printing any other list:

    > print(my.card)
    $rank
    [1] 11
    $suit
    [1] 3

Wouldn't it be better if, when dealing with cards anyway, R could show us something more descriptive, like "`Jack of Diamonds`"?

If we register this list as belonging to a particular *class* of card *objects*, which we will name "card", then we can supply a custom implementation of `print()` that is used only for objects of the "card" class, as shown below:

    > my.card = list(rank=11,suit=3)
    > class(my.card) = "card"   # <-- this "registers" my.card as
                                #     an object of class "card"

Now all we need to do is provide a custom implementation for printing cards. As mentioned before, the `print()` function is an existing generic function. It knows to check the class name of whatever object it is given and then take an appropriate action based on the class it sees. The function `print()` takes this action by essentially dispatching its work to another function.

For example, if we execute `print(x)` and `x` is a table, then `print()` asks the function named `print.table()` to do the work. If `x` is a card, `print()` will instead look for a function named `print.card()` to do its work.
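The lookup described above can be watched in action with a throwaway class. In the sketch below, the class name "demo" and the method `print.demo()` are hypothetical names invented for illustration; neither is part of R.

```r
# A hypothetical class name, "demo", used only for illustration
x = structure(list(a = 1), class = "demo")

# No print.demo() exists yet, so print(x) falls back to the default
# printing behavior and shows the underlying list and its class attribute.
print(x)

# Once a function named print.demo() exists, print() dispatches to it instead.
print.demo = function(x, ...) {
  cat("<demo object with a =", x$a, ">\n")
  invisible(x)
}
print(x)
```

The same call, `print(x)`, produces different output before and after `print.demo()` is defined, because the generic searches for a method whose name ends in the object's class.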

The `print.table()` function is one of the many functions already built into R, but we will need to provide the `print.card()` function ourselves:

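    print.card = function(x, ...) {
      suit.names = c("hearts","clubs","diamonds","spades")
      rank.names = c("ace",paste(2:10),"jack","queen","king")
      # look up both names by position and print them on one line
      cat(rank.names[x$rank],"of",suit.names[x$suit],"\n")
      invisible(x)   # print methods conventionally return their argument invisibly
    }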

Now, look what happens when we try to print `my.card`:

    > print(my.card)
    jack of diamonds

That looks better!

Additional card objects we might create will be printed in the same way. Just remember that each such creation involves populating a list with the information associated with the card *and* registering the list with the class "card":

    > another.card = list(rank=1,suit=4)
    > class(another.card) = "card"
    > print(another.card)
    ace of spades

To simplify the creation of objects, a common approach is to write another function (called a *constructor*) to attend to these two tasks, such as the one below for cards:

    card = function(r,s) {
      obj = list(rank=r,suit=s)   # named obj so as not to mask R's c() function
      class(obj) = "card"
      return(obj)
    }
    # now we can use the constructor above
    # to create multiple card objects...
    > c1 = card(1,4)
    > print(c1)
    ace of spades
    > c2 = card(2,1)
    > print(c2)
    2 of hearts
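A constructor is also a convenient place to reject values that could never describe a real card. The sketch below is one possible variant, not something the notes above require; the name `checkedCard()` is hypothetical, and the allowed ranges (ranks 1 to 13, suits 1 to 4) are taken from the orderings given earlier.

```r
# checkedCard() is a hypothetical variant of the card() constructor
checkedCard = function(r, s) {
  # stop early when the rank or suit could not be looked up later
  stopifnot(length(r) == 1, length(s) == 1, r %in% 1:13, s %in% 1:4)
  obj = list(rank = r, suit = s)
  class(obj) = "card"
  return(obj)
}

ok = checkedCard(12, 4)                        # rank 12 (queen), suit 4 (spades)
bad = try(checkedCard(14, 1), silent = TRUE)   # rank 14 does not exist
inherits(bad, "try-error")                     # TRUE: the bad rank was rejected
```

Checking inputs once, inside the constructor, means that methods such as `print.card()` can safely assume every card they receive is well formed.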

Now would be a good time to note that the generic `print()` function is special in that R calls it automatically when a variable is evaluated by itself at the console. To see this, consider the following, which produces output identical to that of `print(c1)` and `print(c2)` above:

    > c1 = card(1,4)
    > c1
    ace of spades
    > c2 = card(2,1)
    > c2
    2 of hearts

Beyond just `print()`, there are other built-in generic functions for which one can provide custom implementations associated with different classes. We've already mentioned the `summary()` function. The `plot()` function is generic too.
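As a sketch of what this might look like for `summary()`, the block below attaches a method to the "card" class. The method `summary.card()` and the wording of its output are assumptions made for illustration; the constructor is repeated so that the block runs on its own.

```r
# the card constructor, repeated here so the example is self-contained
card = function(r, s) {
  obj = list(rank = r, suit = s)
  class(obj) = "card"
  return(obj)
}

# summary.card() is an illustrative method; it is not part of R
summary.card = function(object, ...) {
  cat("A playing card with rank", object$rank, "and suit", object$suit, "\n")
  invisible(object)
}

# summary() sees the class "card" and hands the work to summary.card()
summary(card(11, 3))
```

The argument is named `object` here simply to match the argument name used by the built-in `summary()` generic.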

Certainly though, the makers of R were not able to predict all of the function names that people would ever want to be generic. This suggests a natural question: "How does one make a function generic?"

Let us make things more concrete. Suppose in creating code to run simulations of a number of card games you not only create a "card" class, but you also create a "winnings" class. Perhaps objects associated with this latter class include the various amounts won from a variety of different games.

In some games, different cards have different "values". For example, suppose in the game *Crazy Face*, face cards are worth 10 points, while other cards are worth points equal to their rank. That said, there is a total "value" associated with all of one's winnings too.

Wouldn't it be great if we had a generic function `value()` that we could associate with both the card and winnings classes, so that `value(x)` would produce the appropriate output for the nature of the input it was given? In other words, we desire `value()` to be a generic function.

We can make `value()` generic with the following:

    value = function(obj) {
      UseMethod("value")
    }

After executing the above, we can supply the associated custom implementations. For example, we could add the following implementation for objects associated with the card class:

    value.card = function(obj) {
      # face cards (ranks 11 to 13) are worth 10; other cards are worth their rank
      return(ifelse(obj$rank <= 10, obj$rank, 10))
    }

Here's an example of its use:

    > drawn.card = card(13,2)
    > drawn.card
    king of clubs
    > value(drawn.card)   # note, a king is a face card
                          # and our value.card() function
                          # assigns face cards the value 10
    [1] 10
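The "winnings" class can be given its own implementation in the same way. The notes do not say how winnings are stored, so the sketch below assumes a hypothetical representation: a list holding a numeric vector of amounts, built by a hypothetical `winnings()` constructor. The `value.default()` method is likewise an optional addition that runs when no class-specific method is found.

```r
value = function(obj) {
  UseMethod("value")
}

# Hypothetical representation: a winnings object is a list holding
# a numeric vector of the amounts won, one per game.
winnings = function(amounts) {
  w = list(amounts = amounts)
  class(w) = "winnings"
  return(w)
}

# Here the value of one's winnings is taken to be the total amount won.
value.winnings = function(obj) {
  return(sum(obj$amounts))
}

# value.default() is used when the object's class has no value method
value.default = function(obj) {
  stop("value() is not defined for objects of class ", class(obj)[1])
}

w = winnings(c(5, 20, 12.5))
value(w)   # 37.5
```

With this in place, `value()` can be called on either kind of object, and an object of any other class produces an informative error instead of a confusing one.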

If one is curious about which classes are associated with a given generic function, one can use the `methods()` function.

As an example, the `print()` function, as you might expect, is associated with *many* other classes, as suggested by what follows:

    > methods(print)
      [1] print.acf*
      [2] print.anova*
      [3] print.aov*
      [4] print.aovlist*
      [5] print.ar*
      [6] print.Arima*
      [7] print.arima0*
      [8] print.AsIs
      [9] print.aspell*
     [10] print.aspell_inspect_context*
     [11] print.bibentry*
     [12] print.Bibtex*
     [13] print.browseVignettes*
     [14] print.by
     [15] print.card
     [16] print.changedFiles*
     [17] print.check_code_usage_in_package*
     [18] print.check_compiled_code*
     [19] print.check_demo_index*
     [20] print.check_depdef*
     ...

While not all are shown above, there are a total of 184 built-in classes associated with `print()` in the R session used here (the exact count varies with the R version and the packages loaded). Of course, we have just added the card class, so it shows up in this list as well (see [15] above).

(*If you are wondering why some of these are asterisked, the asterisk marks methods that are not exported from the namespace of the package that defines them. We won't talk about namespaces here, but you can always google "R namespaces" if you are curious!* 😊)
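A method flagged with an asterisk generally cannot be viewed just by typing its name at the prompt, but it can still be retrieved. The sketch below uses `getS3method()` from the `utils` package to fetch the `print()` method for the "htest" class, and also shows that `methods()` can be asked the reverse question: which generics have a method for a given class.

```r
# list the generic functions that have a method for class "htest"
methods(class = "htest")

# retrieve the print method for class "htest", even though it is
# not exported from the namespace of the package that defines it
f = getS3method("print", "htest")
is.function(f)   # TRUE
```

Typing `f` by itself at the console would then display the source code of that method.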

Now that you have some minimal understanding of objects in R, think back on the outputs you have seen from various functions in R, especially those whose output seemed a bit verbose (e.g. any of the hypothesis test functions).

Take the function `t.test()`, for example. Here it is applied to two samples:

    > men = c(102,87,101,96,107,101,91,85,108,67,85,82)
    > women = c(73,81,111,109,143,95,92,120,93,89,119,79,90,126,62,92,77,106,105,111)
    > t.test(x=men, y=women, alternative="two.sided", conf.level=0.95, var.equal=TRUE)
            Two Sample t-test
    data:  men and women
    t = -0.93758, df = 30, p-value = 0.3559
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -19.016393   7.049727
    sample estimates:
    mean of x mean of y
     92.66667  98.65000

What's really happening here is that `t.test()` returns an object of the class "`htest`".

We can see the list sitting behind the scenes in an object like this one by using the `unclass()` function, which strips away the class attribute. Below, we apply this function to the `htest` object resulting from an application of the `t.test()` function.

    > results = t.test(x=men, y=women, alternative="two.sided", conf.level=0.95, var.equal=TRUE)
    > unclass(results)
    $statistic
             t
    -0.9375846
    $parameter
    df
    30
    $p.value
    [1] 0.3559453
    $conf.int
    [1] -19.016393   7.049727
    attr(,"conf.level")
    [1] 0.95
    $estimate
    mean of x mean of y
     92.66667  98.65000
    $null.value
    difference in means
                      0
    $stderr
    [1] 6.381646
    $alternative
    [1] "two.sided"
    $method
    [1] " Two Sample t-test"
    $data.name
    [1] "men and women"

Indeed, the elements in the list above are exactly those discussed in the *Value* section of the help file for `t.test()`. (Type `?t.test` to see this in RStudio.)
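Because an `htest` object is a list underneath, its individual elements can be pulled out by name with `$`, just as with any other list. The sketch below repeats the data and the call from above so that it runs on its own.

```r
men = c(102, 87, 101, 96, 107, 101, 91, 85, 108, 67, 85, 82)
women = c(73, 81, 111, 109, 143, 95, 92, 120, 93, 89,
          119, 79, 90, 126, 62, 92, 77, 106, 105, 111)
results = t.test(x = men, y = women, alternative = "two.sided",
                 conf.level = 0.95, var.equal = TRUE)

class(results)     # "htest"
names(results)     # the names of the list elements
results$p.value    # the p-value as a plain number
results$conf.int   # the interval endpoints, with a "conf.level" attribute
```

This is handy in simulations, where one often wants to store only the p-value from each of many tests rather than the full printed report.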

However, simply printing `results` produces the same text we saw earlier:

    > results
            Two Sample t-test
    data:  men and women
    t = -0.93758, df = 30, p-value = 0.3559
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -19.016393   7.049727
    sample estimates:
    mean of x mean of y
     92.66667  98.65000

We get a more consolidated version of the same information because of a (built-in) custom `print.htest()` function, which the implicit call to `print()` uses to accomplish the printing.