Random Variables - Data 140 Textbook

Much of data science involves numerical variables whose observed values depend on chance. The predicted value of one variable given the values of others, the number of different classes of individuals observed in a random sample, and the median of a bootstrapped sample are just a few examples. You saw many more in Data 8, where they were often called statistics.

In probability theory, a random variable is a numerical function defined on an outcome space. That is, the domain of the function is $\Omega$ and its values are real numbers: each outcome $\omega \in \Omega$ is assigned a number. Random variables are typically denoted by letters late in the alphabet, like $X$ and $Y$.
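To make the definition concrete, here is a minimal sketch in Python. It assumes an illustrative outcome space that is not taken from the text: the four possible results of two tosses of a coin. The names `omega`, `X`, and `values` are chosen only for this example. The random variable $X$ is the number of heads, written as an ordinary function whose input is an outcome and whose output is a number.

```python
from itertools import product

# Assumed example: the outcome space for two tosses of a coin.
# Each outcome is a string such as 'HT' (heads, then tails).
omega = [''.join(tosses) for tosses in product('HT', repeat=2)]

def X(outcome):
    """A random variable: the number of heads in the outcome."""
    return outcome.count('H')

# Evaluate X on every outcome to display the function as a table.
values = {outcome: X(outcome) for outcome in omega}
print(values)
```

The dictionary `values` lists the function outcome by outcome. Notice that nothing about chance appears in the definition of `X` itself; the randomness comes from which outcome in `omega` occurs.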
