F-test

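An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Ronald Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.^[2]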

Common examples

Common examples of the use of F-tests include the study of the following cases:

Figure: One-way ANOVA table for 3 random groups, each with 30 observations. The F value is calculated in the second-to-last column.
The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an important role in the analysis of variance (ANOVA).
- The ANOVA F-test rests on three assumptions:
  1. Normality
  2. Homogeneity of variance
  3. Independence of errors and random sampling

The hypothesis that a proposed regression model fits the data well. See Lack-of-fit sum of squares.
The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other.
Multiple-comparison testing reuses quantities already computed in a completed F-test. It is carried out when the F-test rejects the null hypothesis, indicating that the factor under study has an effect on the dependent variable.^[1]
- "a priori comparisons" or "planned comparisons": a particular set of comparisons
- "pairwise comparisons": all possible comparisons
  - e.g. Fisher's least significant difference (LSD) test, the Newman–Keuls test, Duncan's test
- "exploratory comparisons": comparisons chosen after examining the data
  - e.g. Scheffé's method

F-test of the equality of two variances

The F-test is

Type I error rate.^[5]

Formula and calculation

Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled χ²-distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.
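As a sketch of this construction in symbols (the names $S_{1}$, $S_{2}$, $d_{1}$, $d_{2}$ and $\sigma ^{2}$ are introduced here for illustration only): suppose $S_{1}$ and $S_{2}$ are independent sums of squares such that, under the null hypothesis, $S_{1}/\sigma ^{2}$ follows a χ²-distribution with $d_{1}$ degrees of freedom and $S_{2}/\sigma ^{2}$ follows a χ²-distribution with $d_{2}$ degrees of freedom, where $\sigma ^{2}$ is the common variance. Then

F={\frac {S_{1}/d_{1}}{S_{2}/d_{2}}}\sim F_{d_{1},d_{2}},

and the unknown $\sigma ^{2}$ cancels in the ratio, so the statistic can be computed from the data alone.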

One-way analysis of variance

The formula for the one-way ANOVA F-test statistic is

F={\frac {\text{explained variance}}{\text{unexplained variance}}},

or

F={\frac {\text{between-group variability}}{\text{within-group variability}}}.

The "explained variance", or "between-group variability" is

\sum _{i=1}^{K}n_{i}({\bar {Y}}_{i\cdot }-{\bar {Y}})^{2}/(K-1)

where ${\bar {Y}}_{i\cdot }$ denotes the sample mean in the i-th group, $n_{i}$ is the number of observations in the i-th group, ${\bar {Y}}$ denotes the overall mean of the data, and $K$ denotes the number of groups.

The "unexplained variance", or "within-group variability" is

\sum _{i=1}^{K}\sum _{j=1}^{n_{i}}\left(Y_{ij}-{\bar {Y}}_{i\cdot }\right)^{2}/(N-K),

where $Y_{ij}$ is the j-th observation in the i-th out of $K$ groups and $N$ is the overall sample size. This F-statistic follows the F-distribution with degrees of freedom $d_{1}=K-1$ and $d_{2}=N-K$ under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.
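The two formulas above can be evaluated directly. The following is a minimal sketch in plain Python, not a reference implementation; the function name `one_way_anova_f` and the small data set are made up for illustration.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, d1, d2) for a list of groups, each a list of numbers."""
    k = len(groups)                            # K: number of groups
    n_total = sum(len(g) for g in groups)      # N: overall sample size
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    # Between-group sum of squares: n_i * (group mean - overall mean)^2
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: (observation - its group mean)^2
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    d1, d2 = k - 1, n_total - k                # degrees of freedom
    f_stat = (ss_between / d1) / (ss_within / d2)
    return f_stat, d1, d2


# Illustrative data (hypothetical): three groups of three observations.
f_stat, d1, d2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, d1, d2)
```

For this toy data the between-group sum of squares is 26 and the within-group sum of squares is 6, so the statistic is (26/2)/(6/6) = 13 with degrees of freedom 2 and 6.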

The result of the F-test is determined by comparing the calculated F value with the critical F value at a chosen significance level (e.g. 5%). An F table lists critical values of the distribution that the F-statistic follows when the null hypothesis is true. Each critical value is the threshold that the F-statistic exceeds only a controlled percentage of the time (e.g. 5%) when the null hypothesis is true. To look up a critical value, use the table for the significance level being tested and read the entry in the row and column that correspond to the two degrees of freedom.^[6]
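The meaning of a critical value can be illustrated by simulation instead of a table lookup. The sketch below is a Monte Carlo approximation, not an exact computation: it repeatedly draws groups from one common normal population (so the null hypothesis holds), computes the one-way ANOVA F statistic each time, and takes the empirical 95th percentile. The function names, the seed, and the choice of 3 groups of 30 observations are assumptions made for the example.

```python
import random
from statistics import fmean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulated_critical_f(k, n_per_group, alpha=0.05, n_sim=5000, seed=0):
    """Approximate the critical F value by simulating under the null."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    stats = []
    for _ in range(n_sim):
        # Every group comes from the same N(0, 1) population.
        groups = [
            [rng.gauss(0.0, 1.0) for _ in range(n_per_group)] for _ in range(k)
        ]
        stats.append(f_statistic(groups))
    stats.sort()
    # Empirical (1 - alpha) quantile of the simulated null distribution.
    return stats[int((1 - alpha) * n_sim)]


crit = simulated_critical_f(k=3, n_per_group=30)
print(f"approximate 5% critical value for (2, 87) degrees of freedom: {crit:.2f}")
```

With these settings the estimate should land near the tabulated 5% critical value for (2, 87) degrees of freedom, which is roughly 3.1; the simulated figure varies with the seed and the number of replications.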

How to use critical F values:

If the F statistic < the critical F value:

Fail to reject the null hypothesis
There are no significant differences among the sample averages
The observed differences among the sample averages could reasonably be caused by random chance alone
The result is not statistically significant

If the F statistic > the critical F value:

Reject the null hypothesis in favour of the alternative hypothesis
There are significant differences among the sample averages
The observed differences among the sample averages could not reasonably be caused by random chance alone
The result is statistically significant

Note that when there are only two groups for the one-way ANOVA F-test, $F=t^{2}$ where t is the Student's $t$ statistic.
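The identity can be checked numerically. The sketch below assumes the pooled-variance (equal-variance) form of Student's $t$ statistic, which is the version the identity applies to; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean


def pooled_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = fmean(a), fmean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (ma - mb) / sqrt(pooled_var * (1 / na + 1 / nb))


def two_group_f(a, b):
    """One-way ANOVA F statistic for exactly two groups (K = 2)."""
    na, nb = len(a), len(b)
    ma, mb = fmean(a), fmean(b)
    grand = fmean(a + b)
    # Between-group variability, divided by K - 1 = 1.
    between = na * (ma - grand) ** 2 + nb * (mb - grand) ** 2
    # Within-group variability, divided by N - K = na + nb - 2.
    within = (
        sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    ) / (na + nb - 2)
    return between / within


# Illustrative data (hypothetical).
group_a = [4.1, 5.0, 6.2, 5.5]
group_b = [6.8, 7.4, 8.1, 7.0, 7.9]

t_stat = pooled_t(group_a, group_b)
f_stat = two_group_f(group_a, group_b)
print(t_stat ** 2, f_stat)  # the two values agree up to rounding error
```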

Advantages

Multi-group Comparison Efficiency: The F-test compares multiple groups simultaneously, which is more efficient than separate tests when more than two groups are involved.
Clarity in Variance Comparison: It offers a straightforward interpretation of variance differences among groups, which helps in understanding the observed data patterns.
Versatility Across Disciplines: It is applicable across diverse fields, including the social sciences, natural sciences, and engineering.

Disadvantages

Sensitivity to Assumptions: The F-test is highly sensitive to certain assumptions, such as homogeneity of variance and normality, and violations of these can affect the accuracy of the test results.
Limited Scope to Group Comparisons: The F-test is tailored for comparing variances between groups, making it less suitable for analyses beyond this specific scope.
Interpretation Challenges: The F-test does not pinpoint specific group pairs with distinct variances. Careful interpretation is necessary, and additional post hoc tests are often essential for a more detailed understanding of group-wise differences.

Multiple-comparison ANOVA problems

The F-test in one-way analysis of variance (

multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others, nor, if the F-test is performed at level α, can we state that the treatment pair with the greatest mean difference is significantly different at level α.

Regression problems

Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p₁ parameters, and model 2 has p₂ parameters, where p₁ < p₂, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.

One common context in this regard is that of deciding whether a model fits the data significantly better than does a naive model, in which the only explanatory term is the intercept term, so that all predicted values for the dependent variable are set equal to that variable's sample mean. The naive model is the restricted model, since the coefficients of all potential explanatory variables are restricted to equal zero.

Another common context is deciding whether there is a structural break in the data: here the restricted model uses all data in one regression, while the unrestricted model uses separate regressions for two different subsets of the data. This use of the F-test is known as the Chow test.
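A minimal sketch of such a structural-break comparison follows, assuming a simple straight-line regression with k = 2 parameters (intercept and slope) per fit. The pooled fit plays the role of the restricted model and the two separate fits together form the unrestricted model with 2k parameters, so the statistic is ((RSS_pooled − RSS_split)/k) / (RSS_split/(n − 2k)). The function names, the `split` index, and the data are hypothetical.

```python
def fit_line_rss(xs, ys):
    """Least-squares fit of y = a + b*x; return the residual sum of squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


def chow_f(xs, ys, split, k=2):
    """F statistic comparing one pooled line with two separate lines.

    `split` is the index at which the data are divided into two subsets;
    k is the number of parameters in each straight-line fit.
    """
    n = len(xs)
    rss_pooled = fit_line_rss(xs, ys)                      # restricted model
    rss_split = (fit_line_rss(xs[:split], ys[:split])      # unrestricted model
                 + fit_line_rss(xs[split:], ys[split:]))
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Illustrative data (hypothetical): the trend rises, then falls after x = 4.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.1, 1.4, 2.1, 2.4, 3.1, 4.4, 4.3, 3.8, 3.7, 3.2]

f_chow = chow_f(xs, ys, split=5)
print(f_chow)
```

Under the null hypothesis of no break, this statistic would be compared with an F-distribution with (k, n − 2k) degrees of freedom, here (2, 6). For this deliberately broken toy series the statistic is very large, which points toward a structural break.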

The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F-test.

If both models are fitted to the same n data points, the F statistic is given by

F={\frac {\left({\frac {{\text{RSS}}_{1}-{\text{RSS}}_{2}}{p_{2}-p_{1}}}\right)}{\left({\frac {{\text{RSS}}_{2}}{n-p_{2}}}\right)}}={\frac {{\text{RSS}}_{1}-{\text{RSS}}_{2}}{{\text{RSS}}_{2}}}\cdot {\frac {n-p_{2}}{p_{2}-p_{1}}},

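where RSS_i is the residual sum of squares of model i. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F follows an F-distribution with (p₂ − p₁, n − p₂) degrees of freedom. Because F is a monotone function of the likelihood ratio statistic, the F-test is a likelihood ratio test.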
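As an illustration of the formula, the sketch below compares the naive intercept-only model (model 1, p₁ = 1) with a straight-line model (model 2, p₂ = 2). The function names and the data are hypothetical, and the fits are written out by hand so that no external library is assumed.

```python
def rss_intercept_only(ys):
    """RSS of the naive model that predicts the sample mean everywhere."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def rss_line(xs, ys):
    """RSS of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


def nested_f(rss1, rss2, p1, p2, n):
    """F statistic for nested models: model 1 (restricted) inside model 2."""
    return ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))


# Illustrative data (hypothetical).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 5, 4]

rss1 = rss_intercept_only(ys)   # 10.0 for this data
rss2 = rss_line(xs, ys)         # 3.6 for this data (up to rounding)
f_stat = nested_f(rss1, rss2, p1=1, p2=2, n=len(xs))
print(f_stat)
```

Here F = ((10 − 3.6)/1) / (3.6/3) = 16/3 ≈ 5.33, to be compared with an F-distribution with (1, 3) degrees of freedom.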

References

Further reading

Fox, Karl A. (1980). Intermediate Economic Statistics (Second ed.). New York: John Wiley & Sons. pp. 290–310. ISBN 0-88275-521-8.

Johnston, John (1972). Econometric Methods (Second ed.). New York: McGraw-Hill. pp. 35–38.

ISBN 0-02-365070-2.

ISBN 978-0-470-01512-4.

External links

Table of F-test critical values

Free calculator for F-testing

The F-test for Linear Regression

Econometrics lecture (topic: hypothesis testing) on YouTube by Mark Thoma

Retrieved from "https://en.wikipedia.org/w/index.php?title=F-test&oldid=1193741443"

[1] ISBN 978-3-319-64582-7.

[2] ISBN 978-0-8058-5850-1.

[3] JSTOR 2333350.

[4] JSTOR 2684360.

[5] doi:10.22237/jmasm/1036109940. Archived from the original on 2015-04-03. Retrieved 2015-03-30.

[6] ISBN 978-0-12-804250-2, retrieved 2023-12-10.
