L2 Outliers and Trimmed Mean - Part 2
Unit E: Statistics and Probability
Chapter 1: Statistics
Outliers and Trimmed Mean
When outliers occur in a set of data, the mean can often provide a misleading result. That is, the mean will be distorted or skewed.
Ten coworkers are sitting at a restaurant table in downtown Calgary. Nine of the workers have annual incomes of $75 000. The tenth individual is a successful businessman who earns $5 000 000 per year. The mean of their annual incomes is $567 500. This
figure does not represent what the typical person at the table earns. The single data value of $5 000 000 pulls the mean far above the $75 000 that nine of the ten people earn.
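The arithmetic in the restaurant example can be checked with a short sketch. The variable names here are illustrative only and do not come from the lesson.

```python
# Annual incomes from the restaurant example, in dollars.
coworkers = [75_000] * 9             # nine coworkers earning $75 000 each
everyone = coworkers + [5_000_000]   # add the businessman's income

# Mean = sum of the data values divided by the number of data values.
mean_without_outlier = sum(coworkers) / len(coworkers)
mean_with_outlier = sum(everyone) / len(everyone)

print(f"Mean of the nine coworkers: {mean_without_outlier:,.0f}")
print(f"Mean of all ten people:     {mean_with_outlier:,.0f}")
```

One extreme value is enough to move the mean from 75 000 to 567 500, even though nine of the ten incomes are unchanged.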
To avoid outliers skewing the value of the mean, a trimmed mean is used to represent the average of the set of data. The trimmed mean uses a method that eliminates the largest and smallest data values before calculating the mean.
Calculate the trimmed mean in the following set of data.
10, 15, 2, 13, 12, 13, 16, 14, 15, 11, 4, 14, 15, 14
When the trimmed mean is calculated, the values must first be arranged in ascending order. An equal number of values from the top and bottom of the data must be removed when eliminating outliers. In this set of data, the values 2 and 4 are much smaller than the other numbers.
Therefore, the two lowest values (2 and 4) and the two highest values (15 and 16) are excluded.
Ordered data: 2, 4, 10, 11, 12, 13, 13, 14, 14, 14, 15, 15, 15, 16
Remaining values after trimming: 10, 11, 12, 13, 13, 14, 14, 14, 15, 15
The trimmed mean of this set of data is (10 + 11 + 12 + 13 + 13 + 14 + 14 + 14 + 15 + 15) ÷ 10 = 131 ÷ 10 = 13.1.
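The steps above (sort, remove the same number of values from each end, then average what remains) can be sketched as a small function. The function name `trimmed_mean` and its parameter `trim_each_end` are assumptions made for this illustration, not part of the lesson.

```python
def trimmed_mean(values, trim_each_end):
    """Mean after dropping `trim_each_end` values from each end of the sorted data."""
    if trim_each_end < 0 or 2 * trim_each_end >= len(values):
        raise ValueError("trim count must leave at least one value")
    ordered = sorted(values)  # arrange in ascending order
    kept = ordered[trim_each_end:len(ordered) - trim_each_end]
    return sum(kept) / len(kept)


data = [10, 15, 2, 13, 12, 13, 16, 14, 15, 11, 4, 14, 15, 14]

# Drop the two lowest (2, 4) and the two highest (15, 16) values.
result = trimmed_mean(data, 2)
print(result)  # 13.1
```

How many values to trim is a judgment call. Here two are removed from each end because the lesson identifies 2 and 4 as the outliers.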
A figure skating competition produces the following scores: 7.8, 8.1, 8.3, 7.5, 9.9.
- Calculate the mean and the trimmed mean. Round to the nearest hundredth.
- What is the purpose of using the trimmed mean instead of the mean?
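- To calculate the mean, use the following formula:
Mean = sum of the data values ÷ number of data values
Mean = (7.8 + 8.1 + 8.3 + 7.5 + 9.9) ÷ 5 = 41.6 ÷ 5 = 8.32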
The outlier is 9.9. The top and bottom values, 7.5 and 9.9, are removed to eliminate the outlier before finding the trimmed mean.
Trimmed mean: (7.8 + 8.1 + 8.3) ÷ 3 = 24.2 ÷ 3 ≈ 8.07
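- Calculating the trimmed mean reduces the effect of outliers on the average. Here the score of 9.9 raises the mean to 8.32, while the trimmed mean of 8.07 is closer to the typical score.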
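The figure skating calculation can be checked with a short sketch. The variable names are illustrative, and the results are formatted to two decimal places to match the rounding requested in the question.

```python
scores = [7.8, 8.1, 8.3, 7.5, 9.9]

# Ordinary mean of all five scores.
mean_score = sum(scores) / len(scores)

# Trimmed mean: sort, drop the lowest (7.5) and highest (9.9), average the rest.
kept = sorted(scores)[1:-1]
trimmed_mean_score = sum(kept) / len(kept)

print(f"Mean:         {mean_score:.2f}")          # 8.32
print(f"Trimmed mean: {trimmed_mean_score:.2f}")  # 8.07
```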

Quentin is a real estate agent in a town of 5 000 people. To be effective, he needs to know the average house price in his town for new and current clients. Currently, he has houses posted at $315 000, $299 900, $283 000, $315 000, $277 000,
$269 900, and $230 900.
- Calculate the mean, median, and mode of the house prices.
- Of the three measures of central tendency, which should Quentin provide to his clients as the average house price in his town? Explain.
- Quentin listed another house at $645 000. Calculate the new mean, median, and mode.
- Compare the two means. How is the mean affected by the new listing?
- Compare the two medians. How is the median affected by the new listing?
- Compare the two modes. How is the mode affected by the new listing?
- Which of the three new measures of central tendency should Quentin provide to his clients as the average house price in his town? Explain.
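- To calculate the mean, use the following formula:
Mean = sum of the data values ÷ number of data values
Mean = ($315 000 + $299 900 + $283 000 + $315 000 + $277 000 + $269 900 + $230 900) ÷ 7 = $1 990 700 ÷ 7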
The mean house price is $284 385.71.
To find the median, arrange the data in ascending order.
$230 900, $269 900, $277 000, $283 000, $299 900, $315 000, $315 000
There are 7 data values. As there is an odd number of data values, the median is the middle value. Therefore, the median house price is $283 000.
Recall that the mode is the most frequently occurring data value. Therefore, the mode house price is $315 000.
- Mean and median are both good representations of the average. These numbers are close in value. Mode is not a useful average because it only indicates that more than one house is being sold for $315 000.
- New mean:
(sum of the original seven prices + $645 000) ÷ 8 = ($1 990 700 + $645 000) ÷ 8 = $2 635 700 ÷ 8
The new mean house price is $329 462.50.
New median:
Arrange the data in ascending order.
$230 900, $269 900, $277 000, $283 000, $299 900, $315 000, $315 000, $645 000
There are 8 data values. Since there is an even number of data values, the median is the average of the two middle values; i.e., the fourth and fifth data values.
$230 900, $269 900, $277 000, [$283 000, $299 900], $315 000, $315 000, $645 000
($283 000 + $299 900) ÷ 2 = $291 450
The new median house price is $291 450.
New mode:
The mode house price is still $315 000.
- The mean increases by over $45 000 when the new listing is added. An outlier with a much higher value than the rest of the data increases the mean substantially.
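- The median increases by over $8 000 when the new listing is added (from $283 000 to $291 450), a much smaller change than the change in the mean.
- The mode remains the same. An added outlier changes the mode only if it creates a new most frequently occurring value.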
- As the mean is easily affected by outliers, it should not be used. The average housing price should be the median, which indicates that half the house prices are above and half are below that middle value. Mode is not a useful average since it only communicates that more than one house is listed at a particular price.
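The before-and-after comparison for Quentin's listings can be reproduced with Python's standard `statistics` module. This is a minimal sketch, and the variable names are chosen only for illustration.

```python
from statistics import mean, median, mode

# Quentin's original seven listings, in dollars.
prices = [315_000, 299_900, 283_000, 315_000, 277_000, 269_900, 230_900]

# The same listings plus the new $645 000 house.
prices_with_new = prices + [645_000]

for label, data in (("Original", prices), ("With new listing", prices_with_new)):
    print(label)
    print(f"  mean:   {mean(data):.2f}")
    print(f"  median: {median(data)}")
    print(f"  mode:   {mode(data)}")

# How far each measure moves when the outlier is added.
mean_change = mean(prices_with_new) - mean(prices)
median_change = median(prices_with_new) - median(prices)
print(f"Mean change:   {mean_change:.2f}")
print(f"Median change: {median_change:.2f}")
```

The mean moves by roughly five times as much as the median, and the mode does not move at all, which is consistent with the answers above.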