Once we have a random variable, we often want to work with functions of it. For example, if a random variable is an estimator, we usually want to see how far it is from the value it is trying to estimate. For instance, we might want to see how far a random variable $X$ is from the number 10. That's a function of $X$. Let's call it $Y$. Then

$$
Y ~ = ~ |X - 10|
$$

which is not a linear function of $X$.
This section is about finding the expectation of a function of a random variable whose distribution you know. Throughout, we will assume that all the expectations that we are discussing are well defined.
In what follows, let $X$ be a random variable whose distribution (and hence also expectation) is known.
8.3.1 Linear Function Rule
Let $X$ be a random variable with expectation $E(X)$, and let $Y = aX + b$ for some constants $a$ and $b$.
This kind of transformation happens, for example, when you change units of measurement.

If you switch from Celsius to Fahrenheit, then $a = 9/5$ and $b = 32$.

If you switch from inches to centimeters, then $a = 2.54$ and $b = 0$.
We can find $E(Y)$ by applying the definition of expectation on the domain $\Omega$. For every $\omega \in \Omega$, we have $Y(\omega) = aX(\omega) + b$. So

$$
E(Y) ~ = ~ \sum_{\omega \in \Omega} (aX(\omega) + b) P(\omega) ~ = ~ a \sum_{\omega \in \Omega} X(\omega) P(\omega) ~ + ~ b \sum_{\omega \in \Omega} P(\omega) ~ = ~ aE(X) + b
$$
For example, $E(2X - 3) = 2E(X) - 3$. Also $E(X/2) = E(X)/2$, and $E(1 - X) = 1 - E(X)$.
The expectation of a linear transformation of $X$ is the linear transformation of the expectation of $X$. This is a handy result, as we will often be transforming variables linearly.
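As a quick numerical check of the rule, here is a minimal sketch using plain numpy; the distribution below is a made-up example, not one from the text:

import numpy as np

# A made-up distribution for illustration: values and their probabilities
x = np.array([1, 2, 3, 4])
p = np.array([0.1, 0.2, 0.3, 0.4])
a, b = 2, -3

# E(aX + b) computed directly from the definition ...
lhs = np.sum((a * x + b) * p)
# ... equals a*E(X) + b by the linear function rule
rhs = a * np.sum(x * p) + b
lhs, rhs   # both are 3.0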
But expectation behaves differently under non-linear transformation.
8.3.2 Non-linear Function Rule
Now let $Y = g(X)$ where $g$ is any numerical function. Remember that $X$ is a function on $\Omega$. So the function that defines the random variable $Y$ is a composition:

$$
Y(\omega) ~ = ~ (g \circ X)(\omega) ~ = ~ g(X(\omega)) ~~~ \text{for } \omega \in \Omega
$$
This allows us to write $E(Y)$ in three equivalent ways:
On the range of $Y$:

$$
E(Y) ~ = ~ \sum_{\text{all } y} y P(Y = y)
$$

On the domain $\Omega$:

$$
E(Y) ~ = ~ \sum_{\omega \in \Omega} (g \circ X)(\omega) P(\omega)
$$

On the range of $X$:

$$
E(Y) ~ = ~ \sum_{\text{all } x} g(x) P(X = x)
$$
As before, it is a straightforward matter of grouping to show that all the forms are equivalent.
The first form looks the simplest, but there's a catch: you need to first find the distribution of $Y$. The second form involves an unnecessarily high level of detail.
The third form is the one to use. It uses the known distribution of $X$. It says that to find $E(Y)$ where $Y = g(X)$ for some function $g$:
Take a generic value $x$ of $X$.
Apply $g$ to $x$; this $g(x)$ is a generic value of $Y$.
Weight $g(x)$ by $P(X = x)$, which is known.
Do this for all $x$ and add. The sum is $E(Y)$.
The crucial thing to note about this method is that we didn't have to first find the distribution of $Y$. That saves us a lot of work.
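The method is mechanical enough to write as a general helper. Here is a minimal sketch (the name expectation_of_function is ours, not from any library), assuming the possible values and their probabilities are given as arrays:

import numpy as np

def expectation_of_function(g, x, p):
    # Third form: apply g to each possible value x of X,
    # weight by P(X = x), and add. The distribution of
    # Y = g(X) is never computed.
    return np.sum(g(np.asarray(x)) * np.asarray(p))

# Example: E(|X - 10|) for a small made-up distribution
expectation_of_function(lambda v: np.abs(v - 10), [8, 9, 10, 11], [0.25, 0.25, 0.25, 0.25])   # 1.0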
Let’s see how our method works in some examples.
8.3.3 $E(|X - 3|)$
Let $X$ have a distribution we worked with earlier:
# Imports assumed to have been run earlier in the chapter (our assumption about the notebook environment)
import numpy as np
from datascience import *
from prob140 import *

x = np.arange(1, 6)
probs = make_array(0.15, 0.25, 0.3, 0.2, 0.1)
dist = Table().values(x).probabilities(probs)
dist = dist.relabel('Value', 'x').relabel('Probability', 'P(X=x)')
dist

Let $g$ be the function defined by $g(x) = |x - 3|$, and let $Y = g(X)$. In other words, $Y = |X - 3|$.
To calculate $E(Y)$, we first have to create a column that transforms the values of $x$ into values of $g(x)$:
dist_with_Y = dist.with_column('g(x)', np.abs(dist.column('x')-3)).move_to_end('P(X=x)')
dist_with_Y

To get $E(Y)$, find the appropriate weighted average: multiply the g(x) and P(X=x) columns, and add. The calculation shows that $E(Y) = 0.95$.
ev_Y = sum(dist_with_Y.column('g(x)') * dist_with_Y.column('P(X=x)'))
ev_Y
0.94999999999999996
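As a sanity check, we can redo the work the method let us skip: first find the distribution of $Y$ by grouping the probabilities by the value of g(x), then average over the range of $Y$. This sketch relies on the datascience library's group method (an assumption about the environment; that library names the aggregated column 'P(X=x) sum'):

# Distribution of Y: collect P(X=x) for each distinct value of g(x)
dist_Y = dist_with_Y.select('g(x)', 'P(X=x)').group('g(x)', sum)

# First form: sum over the range of Y; matches the 0.95 above
sum(dist_Y.column('g(x)') * dist_Y.column('P(X=x) sum'))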
8.3.4 $E(\min(X, 3))$

Let $X$ be as above, but now let $Y = \min(X, 3)$. We want $E(Y)$. What we know is the distribution of $X$:
dist

To find $E(Y)$, we can just go row by row and replace the value of $x$ by the value of $\min(x, 3)$, and then find the weighted average:
ev_Y = 1*0.15 + 2*0.25 + 3*0.3 + 3*0.2 + 3*0.1
ev_Y
2.45
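The row-by-row replacement can also be done in code rather than by hand. A small sketch, using np.minimum to apply $\min(x, 3)$ to all the values at once:

# Third form again: g(x) = min(x, 3), weighted by P(X=x); returns 2.45
sum(np.minimum(dist.column('x'), 3) * dist.column('P(X=x)'))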
8.3.5 $E(X^2)$ for a Poisson Variable $X$

Let $X$ have the Poisson $(\mu)$ distribution. By the non-linear function rule,

$$
\begin{align*}
E(X^2) ~ &= ~ \sum_{k=0}^\infty k^2 e^{-\mu} \frac{\mu^k}{k!} \\
&= ~ \sum_{k=1}^\infty k \, e^{-\mu} \frac{\mu^k}{(k-1)!} \\
&= ~ \sum_{k=1}^\infty (k-1) e^{-\mu} \frac{\mu^k}{(k-1)!} ~ + ~ \sum_{k=1}^\infty e^{-\mu} \frac{\mu^k}{(k-1)!} \\
&= ~ \mu^2 \sum_{k=2}^\infty e^{-\mu} \frac{\mu^{k-2}}{(k-2)!} ~ + ~ \mu \sum_{k=1}^\infty e^{-\mu} \frac{\mu^{k-1}}{(k-1)!} \\
&= ~ \mu^2 + \mu
\end{align*}
$$

In the next section we will use this to find the variance of $X$. For now, notice that $E(X^2) = \mu^2 + \mu > \mu^2$. Since $E(X) = \mu$, we have $E(X^2) > (E(X))^2$. We will see later that this inequality is true for all random variables for which the expected square is finite.
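A quick numerical check of $E(X^2) = \mu^2 + \mu$. This sketch uses scipy.stats, which is our assumption rather than a library the text itself uses, and truncates the infinite sum where the tail is negligible:

import numpy as np
from scipy import stats

mu = 2.5
k = np.arange(100)                # truncation of the infinite sum
pmf = stats.poisson.pmf(k, mu)    # P(X = k) for Poisson (mu)

ev_X2 = np.sum(k**2 * pmf)        # non-linear function rule with g(x) = x^2
ev_X2, mu**2 + mu                 # both are 8.75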