Much of data science involves numerical variables whose observed values depend on chance. The predicted value of one variable given the values of others, the number of different classes of individuals observed in a random sample, and the median of a bootstrapped sample are just a few examples. You saw many more in Data 8, where they were often called statistics.

In probability theory, a random variable is a numerical function defined on an outcome space. That is, the domain of the function is $\Omega$ and its range is the real number line. Random variables are typically denoted by late letters of the alphabet, like $X$ and $Y$.
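
To make the definition concrete, here is a minimal sketch that treats a random variable as an ordinary Python function on an outcome space. The two-toss coin experiment, the list `omega`, and the function name `X` are illustrative assumptions, not taken from the text above.

```python
from itertools import product

# Assumed outcome space: all sequences of two coin tosses.
omega = ["".join(tosses) for tosses in product("HT", repeat=2)]

def X(outcome):
    """Random variable: the number of heads in the outcome."""
    return outcome.count("H")

# Evaluate X at every outcome in the outcome space.
values = {outcome: X(outcome) for outcome in omega}
print(values)
```

Each outcome in `omega` is mapped to exactly one real number, which is all the definition requires. Chance enters only through which outcome occurs, not through the function itself.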