Descriptive statistics

Source: Wikipedia, the free encyclopedia.

A descriptive statistic (in the

demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, the proportion of subjects with related co-morbidities, etc.

Some measures that are commonly used to describe a data set are measures of central tendency and measures of variability or dispersion. Measures of central tendency include the mean, median and mode, while measures of variability include the standard deviation (or variance), the minimum and maximum values of the variables, kurtosis and skewness.[5]
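As a minimal sketch of the measures listed above, the following Python snippet computes them for a small made-up sample using only the standard library. The sample values and the function name `describe` are illustrative assumptions. Skewness and kurtosis are computed here in their moment-based (population) form; other conventions exist and give slightly different numbers.

```python
import statistics


def describe(data):
    """Return common descriptive statistics for a list of numbers."""
    n = len(data)
    mean = statistics.fmean(data)
    # Population variance: mean squared deviation from the mean.
    var_pop = sum((x - mean) ** 2 for x in data) / n
    sd_pop = var_pop ** 0.5
    # Moment-based skewness and excess kurtosis (population versions).
    skew = sum((x - mean) ** 3 for x in data) / n / sd_pop ** 3
    kurt = sum((x - mean) ** 4 for x in data) / n / sd_pop ** 4 - 3.0
    return {
        # Measures of central tendency
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        # Measures of variability (sample versions, dividing by n - 1)
        "sample_sd": statistics.stdev(data),
        "sample_variance": statistics.variance(data),
        "min": min(data),
        "max": max(data),
        # Measures of distribution shape
        "skewness": skew,
        "excess_kurtosis": kurt,
    }


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For this sample the mean is 5, the median is 4.5 and the mode is 4; the positive skewness reflects the longer tail towards the larger values.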

Use in statistical analysis

Descriptive statistics provide simple summaries about the sample and about the observations that have been made. Such summaries may be either quantitative, i.e. summary statistics, or visual, i.e. simple-to-understand graphs. These summaries may either form the basis of the initial description of the data as part of a more extensive statistical analysis, or they may be sufficient in and of themselves for a particular investigation.

For example, the shooting

grade point average. This single number describes the general performance of a student across the range of their course experiences.[6]

The use of descriptive and summary statistics has an extensive history and, indeed, the simple tabulation of populations and of economic data was the first way the topic of statistics appeared. More recently, a collection of summarisation techniques has been formulated under the heading of exploratory data analysis: an example of such a technique is the box plot.
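A box plot is usually drawn from a five-number summary (minimum, lower quartile, median, upper quartile, maximum). The sketch below computes that summary with Python's standard library. The function name `five_number_summary` and the data are assumptions made for illustration, and the "inclusive" quartile method is only one of several conventions in use.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), the values a box plot displays."""
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations
low, q1, med, q3, high = five_number_summary(values)
print("min:", low, "Q1:", q1, "median:", med, "Q3:", q3, "max:", high)
print("interquartile range:", q3 - q1)
```

The box of the plot spans Q1 to Q3, with a line at the median; how the whiskers are drawn varies between conventions.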

In the business world, descriptive statistics provides a useful summary of many types of data. For example, investors and brokers may use a historical account of return behaviour by performing empirical and analytical analyses on their investments in order to make better investing decisions in the future.

Univariate analysis

histograms and stem-and-leaf display.
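A stem-and-leaf display of the kind mentioned above can be sketched in a few lines of Python. This is an illustrative version that assumes non-negative integers, with the tens as the stem and the units digit as the leaf; the function name `stem_and_leaf` and the scores are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build stem-and-leaf rows for non-negative integers (stem = tens, leaf = units)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


scores = [12, 15, 21, 23, 23, 38, 41, 47]  # hypothetical observations
print("\n".join(stem_and_leaf(scores)))
```

Each row acts like a histogram bar turned on its side, while still showing the individual values.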

Bivariate and multivariate analysis

When a sample consists of more than one variable, descriptive statistics may be used to describe the relationship between pairs of variables. In this case, descriptive statistics include:

  • Contingency tables
  • Graphical representation via scatterplots
  • Quantitative measures of dependence
  • Descriptions of conditional distributions

The main reason for differentiating univariate and bivariate analysis is that bivariate analysis goes beyond a simple description of each variable: it also describes the relationship between two different variables.
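Three of the bivariate descriptions listed above can be sketched with plain Python: a contingency table, a conditional distribution, and one quantitative measure of dependence. The function names and data are hypothetical, and Pearson's correlation coefficient is used here only as one common choice of dependence measure.

```python
from collections import Counter


def contingency_table(xs, ys):
    """Count how often each (x, y) pair of categories occurs."""
    return Counter(zip(xs, ys))


def conditional_distribution(table, given_x):
    """Relative frequencies of y among observations with x == given_x."""
    counts = {y: c for (x, y), c in table.items() if x == given_x}
    total = sum(counts.values())
    return {y: c / total for y, c in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Two categorical variables (hypothetical data).
group = ["a", "a", "a", "b", "b", "b"]
outcome = ["yes", "no", "yes", "no", "no", "yes"]
table = contingency_table(group, outcome)
print(dict(table))
print(conditional_distribution(table, "a"))

# Two numeric variables (hypothetical data).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, score), 3))
```

A scatterplot of `hours` against `score` would show the same positive association graphically.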

z-score) units. Highly skewed data are often transformed by taking logarithms. Taking logarithms makes graphs more symmetrical and more similar in appearance to the normal distribution, which makes them easier to interpret intuitively.[8]: 47
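The two transformations mentioned here, standardising to z-scores and taking logarithms of skewed data, can be illustrated with a short Python sketch. The data are made up, the helper names are hypothetical, and the population standard deviation is assumed for the z-scores.

```python
import math
import statistics


def z_scores(data):
    """Express each value as a number of standard deviations from the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population standard deviation (an assumption)
    return [(x - mean) / sd for x in data]


def skewness(data):
    """Moment-based skewness: the mean of the cubed z-scores."""
    z = z_scores(data)
    return sum(v ** 3 for v in z) / len(z)


incomes = [1, 10, 100, 1000, 10000]  # hypothetical, strongly right-skewed
logged = [math.log10(x) for x in incomes]

print("skewness before log transform:", round(skewness(incomes), 3))
print("skewness after log transform: ", round(skewness(logged), 3))
```

In this example the raw values have a long right tail, while their logarithms are evenly spaced, so the skewness drops to essentially zero after the transformation.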

References