Trimmed estimator

Source: Wikipedia, the free encyclopedia.

In

robust statistic, and the extreme values are considered outliers.[1] Trimmed estimators also often have higher efficiency for mixture distributions and heavy-tailed distributions than the corresponding untrimmed estimator, at the cost of lower efficiency for other distributions, such as the normal distribution
.

Given an estimator, the x% trimmed version is obtained by discarding the x% lowest or highest observations or on both end: it is a statistic on the middle of the data. For instance, the 5%

trimmed mean
is obtained by taking the mean of the 5% to 95% range. In some cases a trimmed estimator discards a fixed number of points (such as maximum and minimum) instead of a percentage.

Examples

The median is the most trimmed statistic (nominally 50%), as it discards all but the most central data, and equals the fully trimmed mean – or indeed fully trimmed mid-range, or (for odd-size data sets) the fully trimmed maximum or minimum. Likewise, no degree of trimming has any effect on the median – a trimmed median is the median – because trimming always excludes an equal number of the lowest and highest values.

Quantiles can be thought of as trimmed maxima or minima: for instance, the 5th percentile
is the 5% trimmed minimum.

Trimmed estimators used to estimate a location parameter include:

  • Trimmed mean
  • Modified mean
    , discarding the minimum and maximum values
  • trimmed mean
  • Midhinge, the 25% trimmed mid-range

Trimmed estimators used to estimate a scale parameter include:

Trimmed estimators involving only linear combinations of points are examples of L-estimators.

Applications

Estimation

Most often, trimmed estimators are used for

.

For example, when estimating a

Pearson's skewness coefficients
) measure the bias of the median as an estimator of the mean.

When estimating a

scale factor to make it an unbiased consistent estimator; see scale parameter: estimation
.

For example, dividing the IQR by (using the error function) makes it an unbiased, consistent estimator for the population standard deviation if the data follow a normal distribution.

Other uses

Trimmed estimators can also be used as statistics in their own right – for example, the median is a measure of location, and the IQR is a measure of dispersion. In these cases, the sample statistics can act as estimators of their own expected value. For example, the MAD of a sample from a standard Cauchy distribution is an estimator of the population MAD, which in this case is 1, whereas the population variance does not exist.

See also

  • Winsorising
    , a related technique
  • Core inflation, an economic statistic that omits volatile components

References