Random Variables / Discrete Random Variables

The idea of a random variable starts with a numerical value determined by some chance process (i.e., a random experiment). As some examples, consider the following:
• The sum of values shown when two dice are rolled
• The fraction of area covered by crab grass in a randomly selected lawn
• The number of heads seen when flipping a coin 3 times
• The number of spades seen in 5 cards drawn from a standard deck of playing cards
• The length of a randomly selected fish in a lake
• The net profit (or loss) resulting from playing the lottery
• The income associated with issuing an insurance policy
• The number of microwaves sold each day at a local appliance store
Note how in each case, there is a numerical value in which we are interested. Things like the "prize won at the fair" or the "type of fish caught by a fisherman" didn't make this list. There are many things we can do with numerical values that we would have trouble doing with other types of information.

As an example -- and hopefully it won't spoil what is to come -- in statistics we will sometimes want to talk about the average value (i.e., the mean) of something investigated. Finding averages should not be new to anyone -- one adds up the values and then divides by how many values there are. However, how does one average different prizes won at the fair, or types of fish?

To be more precise, a random variable is a way of turning the (possibly non-numerical) outcomes of a random experiment into numbers. Formally, we define a random variable to be a function $X$ that assigns to each outcome $x$ in the sample space $S$ one and only one number.

Let us consider an example:

We have previously seen the sample space for rolling two fair dice:

$$\begin{array}{c|c|c|c|c|c|c} & 1 & 2 & 3 & 4 & 5 & 6\\\hline 1 & (1,1) & (1,2) & (1,3) & (1,4) & (1,5) & (1,6)\\\hline 2 & (2,1) & (2,2) & (2,3) & (2,4) & (2,5) & (2,6)\\\hline 3 & (3,1) & (3,2) & (3,3) & (3,4) & (3,5) & (3,6)\\\hline 4 & (4,1) & (4,2) & (4,3) & (4,4) & (4,5) & (4,6)\\\hline 5 & (5,1) & (5,2) & (5,3) & (5,4) & (5,5) & (5,6)\\\hline 6 & (6,1) & (6,2) & (6,3) & (6,4) & (6,5) & (6,6)\\\hline \end{array}$$ Note that each element in the sample space is an ordered pair -- not a number.

However, one can turn each ordered pair into a number by summing the two coordinates. The random variable in this case is then the function $X$ that does the summing: i.e., $X(i,j) = i+j$.
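
To make the "function on the sample space" idea concrete, here is a minimal Python sketch. The names `sample_space`, `X`, and `values` are illustrative choices, not notation from the text; the sketch simply lists the 36 ordered pairs and applies the summing rule $X(i,j) = i+j$ to each one.

```python
from itertools import product

# Sample space S: all ordered pairs (i, j) from rolling two six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: map an ordered pair (i, j) to the number i + j."""
    i, j = outcome
    return i + j

# The distinct numbers X can produce, in increasing order.
values = sorted({X(s) for s in sample_space})

print(len(sample_space))  # 36 ordered pairs
print(X((3, 4)))          # 7
print(values)             # the integers 2 through 12
```

Note that `X` returns exactly one number for each ordered pair, which is the defining property of a random variable, even though several different pairs may share the same value.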

In this way, $X$ is associated with a new sample space, call it $\mathscr{D} = \{2,3,4,\ldots,12\}$, as under $X$ the above table turns into: $$\begin{array}{c|c|c|c|c|c|c} & 1 & 2 & 3 & 4 & 5 & 6\\\hline 1 & 2 & 3 & 4 & 5 & 6 & 7\\\hline 2 & 3 & 4 & 5 & 6 & 7 & 8\\\hline 3 & 4 & 5 & 6 & 7 & 8 & 9\\\hline 4 & 5 & 6 & 7 & 8 & 9 & 10\\\hline 5 & 6 & 7 & 8 & 9 & 10 & 11\\\hline 6 & 7 & 8 & 9 & 10 & 11 & 12\\\hline \end{array}$$ ...and a new probability set function, $P_X$ (where, for example, $P_X(7 \textrm{ or } 11) = 8/36$, since six of the 36 equally likely ordered pairs sum to 7 and two sum to 11).
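
The probability set function $P_X$ can be tabulated by counting how many ordered pairs land on each sum. The following is a rough sketch only, assuming (as "fair dice" suggests) that all 36 ordered pairs are equally likely; the names `P_X` and `prob` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (i, j); assumed equally likely because the dice are fair.
sample_space = list(product(range(1, 7), repeat=2))

# Count how many ordered pairs produce each sum i + j.
counts = Counter(i + j for i, j in sample_space)

# Induced probabilities on D = {2, 3, ..., 12}, kept as exact fractions.
P_X = {value: Fraction(n, len(sample_space)) for value, n in sorted(counts.items())}

def prob(event):
    """Probability that the sum falls in the given set of values."""
    return sum((P_X.get(v, Fraction(0)) for v in event), Fraction(0))

print(prob({7, 11}))  # 2/9, i.e. 8/36 in lowest terms
```

Using `Fraction` rather than floating-point numbers keeps the results exact, so the printed value can be compared directly with the hand count of $8/36$.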