Counting

In statistics we ultimately seek to understand and conduct a decision-making process called a hypothesis test where one decides if there is evidence some assumption should be rejected. This "evidence", when it exists, is linked to the probability of seeing what was seen in some observation or as the result of some experiment. When, under the assumption considered, the probability of seeing what was seen (or something even more compelling) is sufficiently small, we have evidence that the assumption is likely wrong.

Calculating these probabilities requires the ability to count possible outcomes in a variety of contexts. For example, given that each side of a standard die is equally likely to occur on a given roll, and out of the $6$ sides of a standard die, the count of sides that are even is $3$, the probability of rolling an even number must equal $3/6 = 1/2 = 0.50$ (i.e., 50%).

Of course, finding the probabilities of more complicated events can be a bit more challenging. Consider the probability of getting a full house (a pair and three-of-a-kind) when dealt 5 cards at random from a fully-shuffled standard deck of cards. This probability is clearly the number of possible full houses we could see in 5 dealt cards divided by the total number of 5-card hands that can be dealt -- but how do we find the counts of these things?

We need some more advanced counting techniques before we can effectively deal with these more complicated problems...


Fundamental Counting Rule

First, consider how many ways one can select one letter from the set $\{A,B,C,D,E\}$ and one number from the set $\{1,2,3\}$.

To this end, notice how each of the circles below can be paired with one of the possible letter-number combinations, and vice-versa. As there are $5$ letters and $3$ numbers, there must then be $3 \times 5 = 15$ possible circles -- and consequently the same number of possible letter-number combinations.

In a like manner, the number of ways two events can occur, where the first one can happen in $k_1$ possible ways, and the second can happen in $k_2$ possible ways, must be $k_1 \cdot k_2$.

Generalizing this result, it must be the case that a sequence of $n$ events in which the first one has $k_1$ possibilities, and the second event has $k_2$ possibilities, and the third has $k_3$, and so forth, can occur in a total of $k_1 \cdot k_2 \cdot k_3 \cdots k_n$ ways.

When one counts something by taking advantage of this principle, one is said to be applying the Fundamental Counting Rule.

As an example, suppose we want to count, using the fundamental counting principle, the number of possibilities resulting from flipping a coin, rolling a single die, and picking a prime number between 1 and 10. There are 2 possible ways to flip a coin (heads and tails), there are 6 possible rolls of a single die (1,2,3,4,5, and 6), and there are 4 primes between 1 and 10. Thus, there are $2 \cdot 6 \cdot 4 = 48$ possibilities for the result of doing all three of these actions.


Permutations

Consider how we might count the number of ways one can line up 4 people from a group of 7. Certainly any of the $7$ people of the group could be chosen to start the line. This leaves $6$ to choose from for the second person in line. With two people already standing in line, we only have $5$ persons to choose from for the third person in line, and then we must choose among the $4$ remaining for the last person in our line.

Applying the fundamental counting rule, we then have $7 \cdot 6 \cdot 5 \cdot 4 = 840$ ways to arrange $4$ people from a group of $7$ in a line.

In a like manner, the number of ways we can line up 50 people from a group of 70 is given by: $$70 \cdot 69 \cdot 68 \cdots 23 \cdot 22 \cdot 21$$ Writing out this entire product would be cumbersome, so we use the "$\cdots$" above to suggest the pattern. Still, this "$\cdots$" introduces a certain ambiguity. For example, is $2 \cdot 3 \cdots 11$ the product of the consecutive integers from $2$ to $11$ or the product of the first five primes? Wanting to "have our cake and eat it too", wouldn't it be great if we could eliminate this ambiguity and still write our expression in a compact way?

To this end, we define the factorial of $n$, denoted by $n!$, with the following: $$n! = n \cdot (n-1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1$$ For example, $4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24$.

Notice how we can employ the factorial function to express the number of arrangements of 4 people possible from a group of 7 in an unambiguous and compact way: $$7 \cdot 6 \cdot 5 \cdot 4 = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1} = \frac{7!}{3!} = \frac{7!}{(7-4)!}$$

Granted, the last step above may seem to be a move in the wrong direction (i.e., away from a compact form), but it has the highly desirable property of being written solely in terms of the original numbers of our problem (i.e., 4 and 7). This helps us the see the general form...

Counting the number of ways one can arrange a group of $k$ things, selected from a group of $n$ things, in a sequence, also called the number of permutations of $n$ things taken $k$ at a time, and denoted by ${}_n P_k$, can thus be found in a similar manner: $${}_nP_k = \frac{n!}{(n-k)!}$$ As some observations that will prove useful later, note that since permutations are arrangements, order matters. Also, the objects in question are not reused.


Permutations Involving Identical Objects

Now consider how many ways one can arrange $n$ objects when some subsets of them consist of identical objects. As a concrete example, let us consider how many ways one can arrange the letters of the word "TEETER".

Suppose we start by asking the related question of how many ways can one rearrange the subscripted (and thus not identical) letters $T_1, E_1, E_2, E_3, T_2, R_1$. This will undoubtedly overcount what we seek, as $T_1 T_2 E_1 E_2 R_1 E_3$ and $T_2 T_1 E_3 E_1 R_1 E_2$ are counted as different in this question, but correspond to only a single arrangement of non-subscripted letters, $TTEERE$.

With all of the subscripted letters being unique, we have $6$ choices for the first letter, $5$ left to choose from for the second letter, $4$ choices for the third, and so on -- until we have but $1$ choice for the last letter. Applying the fundamental counting rule, we have $6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 6!$ ways to arrange these subscripted letters.

Not surprisingly, had we started with $n$ subscripted letters instead of $6$, we would have $n!$ ways to arrange them.

As a tangential note, since we should also be able to count these arrangements by counting the permutations of $n$ things taken $n$ at at time, we define $0! = 1$ so that this all works out... $${}_n P_n = \frac{n!}{(n-n)!} = \frac{n!}{0!} = \frac{n!}{1} = n!$$

Getting back to the problem at hand, notice that for any permutation of non-subscripted letters, say $REETET$, there are always $12$ related subscripted arrangements. Here, these are given by $$\begin{array}{cccccc} R_1 E_1 E_2 T_1 E_3 T_2 & R_1 E_1 E_3 T_1 E_2 T_2 & R_1 E_2 E_1 T_1 E_3 T_2 & R_1 E_2 E_3 T_1 E_1 T_2 & R_1 E_3 E_1 T_1 E_2 T_2 & R_1 E_3 E_2 T_1 E_1 T_2 \\ R_1 E_1 E_2 T_2 E_3 T_1 & R_1 E_1 E_3 T_2 E_2 T_1 & R_1 E_2 E_1 T_2 E_3 T_1 & R_1 E_2 E_3 T_2 E_1 T_1 & R_1 E_3 E_1 T_2 E_2 T_1 & R_1 E_3 E_2 T_2 E_1 T_1 \end{array}$$

For this particular problem then, we can "divide out the duplicates" to find the number of arrangements of letters in the word $TEETER$ to be $6!/12$.

Of course, one might wonder from where this $12$ came -- must one resort to listing out all possibilities for subscripted letter sequences for a given non-subscripted letter sequence? Certainly not! Think about how we might predict the size of this list. The list is formed by accounting for all rearrangements of $\{E_1, E_2,$, $E_3\}$ and $\{T_1, T_2\}$. As the first set consists of $3$ things, these can be rearranged in $3! = 6$ ways. The second set consists of $2$ things, resulting in $2! = 2$ arrangements. To count rearrangements of both sets, we employ the fundamental counting rule to discover $3! \cdot 2! = 6 \cdot 2 = 12$ ways to do this.

Consider what happens in other words now. Suppose the word has length $n$ and involves $k$ different kinds of letters. Suppose further that the counts for each of these different letters are given by $n_1$, $n_2$, ... and $n_{k}$. Every single permutation of $n$ non-subscripted letters then corresponds to $n_1! n_2! \cdots n_{k}!$ related subscripted arrangements. "Dividing out the duplicates", we have $$\frac{n!}{n_1! n_2! \cdots n_{k}!} \quad \textrm{permutations of the original } n \textrm{ letters}$$

As an example, one can arrange the letters of the word MISSISSIPPI in $$\frac{11!}{1!4!4!2!} = 34650 \textrm{ ways}$$

Generalizing, and leaving the context of letters behind, suppose we have a group of $n$ objects that fall into $k$ groups, where the elements of any one group are indistinguishable from each other, but distinguishable from the elements of any other group. If the numbers of objects in these groups are given by $n_1, n_2, \ldots n_k$, then the number of ways to arrange these $n$ objects is given by $$\frac{n!}{n_1! n_2! \cdots n_k!}$$


Combinations

As a variant on the questions above, suppose one wishes to count the number of ways to merely choose (as opposed to arrange) a subset of $k$ objects from a group of $n$ (distinguishable) objects.

We can leverage our earlier work in the following way. Suppose we denote these $n$ objects by $1,2,3,\ldots,n$. Now "choose" which of these to include in a given subset and which to omit, by replacing the numbers in this sequence with an "I" or "O", respectively.

For example, for objects $1,2,3,4,5,6,7,8$, the subset consisting of the second, third, seventh, and eight objects would correspond to the "word": $OIIOOOII$.

Counting the number of subsets of $k$ objects that can be formed from a group of $n$ objects can then be expressed as counting the number of arrangements of letters in a "word" with $n$ total letters, $k$ of which are "$I$"'s and $(n-k)$ of which are "$O$'s".

From our earlier work on permutations involving identical objects, we know this to be $$\frac{n!}{k!(n-k)!}$$

Giving this a name, we call the number of ways to choose $k$ objects from a set of $n$ objects the number of combinations of $n$ things taken $k$ at at time, denoted by ${}_n C_k$ with $${}_n C_k = \frac{n!}{k!(n-k)!}$$ As some observations that will prove useful later, note that since we are simply choosing objects from a group in this context, and not arranging these objects in any way -- order does not matter. Also, objects are not reused.

As an example, the number of ways to choose 4 people to serve on a committee out from a group of 12 people, is given by $${}_{12}C_4 = \frac{12!}{4!8!} = 495$$