Randomized experiment

Randomization-based inference is especially important in experimental design and in survey sampling.

Overview

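In the statistical theory of design of experiments, randomization involves randomly allocating the experimental units across the treatment groups. For example, if an experiment compares a new drug against a standard drug, then the patients should be allocated to either the new drug or to the standard drug control using randomization.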

Randomized experimentation is not haphazard. Randomization reduces bias by equalising other factors that have not been explicitly accounted for in the experimental design (according to the law of large numbers). Randomization also produces ignorable designs, which are valuable in model-based statistical inference, especially Bayesian or likelihood-based. In the design of experiments, the simplest design for comparing treatments is the "completely randomized design". Some "restriction on randomization" can occur with blocking and experiments that have hard-to-change factors; additional restrictions on randomization can occur when a full randomization is infeasible or when it is desirable to reduce the variance of estimators of selected effects.
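The difference between a completely randomized design and one with a restriction on randomization (blocking) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the patient labels, the age blocks, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import Counter


def completely_randomized(units, treatments, rng):
    """Shuffle the units, then deal treatments out in turn so group sizes are balanced."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}


def randomized_blocks(blocks, treatments, rng):
    """Restricted randomization: randomize separately inside each block."""
    assignment = {}
    for block_units in blocks.values():
        assignment.update(completely_randomized(block_units, treatments, rng))
    return assignment


# Hypothetical example data: 12 patients and two treatment arms.
rng = random.Random(2024)  # fixed seed only so the sketch is reproducible
treatments = ["new_drug", "standard_drug"]
patients = [f"patient_{i:02d}" for i in range(12)]

# Completely randomized design: any 6 of the 12 patients may receive the new drug.
crd = completely_randomized(patients, treatments, rng)
print(Counter(crd.values()))

# Blocked design: each (assumed) age block is split evenly between the arms.
blocks = {"under_50": patients[:6], "50_and_over": patients[6:]}
rbd = randomized_blocks(blocks, treatments, rng)
for name, members in blocks.items():
    print(name, Counter(rbd[p] for p in members))
```

In the blocked version the age groups are balanced across arms by construction, whereas the completely randomized version balances them only on average.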

Randomization of treatment in clinical trials poses ethical problems. In some cases, randomization reduces the therapeutic options for both physician and patient, and so randomization requires clinical equipoise regarding the treatments.

Online randomized controlled experiments

Web sites can run randomized controlled experiments^[2] to create a feedback loop.^[3] Key differences between offline experimentation and online experiments include:^[3]^[4]

Logging: user interactions can be logged reliably.
Number of users: large sites, such as Amazon, Bing/Microsoft, and Google, run experiments, each with over a million users.
Number of concurrent experiments: large sites run tens of overlapping, or concurrent, experiments.^[5]
Robots, such as internet bots.
Ability to ramp up experiments from low percentages to higher percentages.
Speed / performance has significant impact on key metrics.^[3]^[6]
Ability to use the pre-experiment period as an A/A test to reduce variance.^[7]
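One common way to use pre-experiment data for variance reduction is a control-variate adjustment in the spirit of the cited Deng et al. paper: each user's in-experiment metric is shifted by a multiple of that user's centred pre-experiment metric. The sketch below is a simplified illustration on simulated data; the function name, the Gaussian data, and the strength of the correlation are assumptions, and a production system would handle details (such as users with no pre-experiment history) that are ignored here.

```python
import random
from statistics import mean, variance


def cuped_adjust(metric, pre_metric):
    """Return metric values adjusted by a pre-experiment covariate.

    adjusted_i = y_i - theta * (x_i - mean(x)),  theta = cov(x, y) / var(x)
    The adjustment leaves the sample mean unchanged and removes the part of
    the variance that is explained by the pre-experiment covariate.
    """
    x_bar = mean(pre_metric)
    y_bar = mean(metric)
    n = len(metric)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pre_metric, metric)) / (n - 1)
    theta = cov_xy / variance(pre_metric)
    return [y - theta * (x - x_bar) for x, y in zip(pre_metric, metric)]


# Simulated (assumed) data: the in-experiment metric is strongly related
# to the same user's pre-experiment metric.
rng = random.Random(7)
pre = [rng.gauss(10.0, 2.0) for _ in range(2000)]
during = [x + rng.gauss(0.0, 1.0) for x in pre]

adjusted = cuped_adjust(during, pre)
print("variance before:", variance(during))
print("variance after: ", variance(adjusted))
```

Because the adjusted metric has the same mean but a smaller variance, a comparison between treatment and control based on it can detect smaller effects with the same number of users.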

History

A controlled experiment appears to have been suggested in the Old Testament's Book of Daniel. King Nebuchadnezzar proposed that some Israelites eat "a daily amount of food and wine from the king's table." Daniel preferred a vegetarian diet, but the official was concerned that the king would "see you looking worse than the other young men your age? The king would then have my head because of you." Daniel then proposed the following controlled experiment: "Test your servants for ten days. Give us nothing but vegetables to eat and water to drink. Then compare our appearance with that of the young men who eat the royal food, and treat your servants in accordance with what you see" (Daniel 1:12–13).^[8]^[9]

Randomized experiments were institutionalized in psychology and education in the late 1800s, following the invention of randomized experiments by C. S. Peirce.^[10]^[11]^[12]^[13] Outside of psychology and education, randomized experiments were popularized by R. A. Fisher in his book Statistical Methods for Research Workers, which also introduced additional principles of experimental design.

Statistical interpretation

Data from a randomized experiment are typically analyzed with a statistical test. The underlying causal model also accounts for potential confounding factors, which are factors that could affect both the treatment and the outcome. By controlling for these confounding factors, the model helps to ensure that any observed treatment effect is truly causal and not simply the result of other factors that are correlated with both the treatment and the outcome.

The Rubin Causal Model is a useful framework for understanding how to estimate the causal effect of the treatment, even when there are confounding variables that may affect the outcome. This model specifies that the causal effect of the treatment for an individual is the difference between the outcome that would have been observed had they received the treatment and the outcome that would have been observed had they not. In practice, it is not possible to observe both potential outcomes for the same individual, so statistical methods are used to estimate the causal effect using data from the experiment.
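The potential-outcomes idea can be made concrete with a small simulation. In the sketch below every unit has two potential outcomes, only one of which is observed, and random assignment lets the difference in group means estimate the average treatment effect. Everything here is an assumption chosen for illustration: the function names, the Gaussian baseline outcome, the constant additive effect of 2.0, and the fair-coin assignment.

```python
import random
from statistics import mean


def simulate_trial(n_units, true_effect, rng):
    """Generate (treated, observed_outcome) pairs under random assignment."""
    rows = []
    for _ in range(n_units):
        y0 = rng.gauss(10.0, 3.0)      # potential outcome without treatment
        y1 = y0 + true_effect          # potential outcome with treatment
        treated = rng.random() < 0.5   # fair-coin randomization
        observed = y1 if treated else y0  # only one potential outcome is ever seen
        rows.append((treated, observed))
    return rows


def difference_in_means(rows):
    """Estimate the average treatment effect from observed outcomes only."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    return mean(treated) - mean(control)


rows = simulate_trial(n_units=10_000, true_effect=2.0, rng=random.Random(1))
estimate = difference_in_means(rows)
print("estimated average treatment effect:", round(estimate, 2))
```

The estimate should land close to the assumed effect of 2.0, although no unit ever reveals both of its potential outcomes. If assignment depended on the baseline outcome instead of a coin flip, the same difference in means would generally be biased.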

Empirical evidence that randomization makes a difference

Empirically, differences between randomized and non-randomized studies,^[14] and between adequately and inadequately randomized trials, have been difficult to detect.^[15]^[16]

References

[1] PMID 20332509.

[2] Kohavi, Ron; Longbotham, Roger (2015). "Online Controlled Experiments and A/B Tests". In Sammut, Claude; Webb, Geoff (eds.). Encyclopedia of Machine Learning and Data Mining. Springer. pp. to appear.

[3] ISSN 1384-5810.

[4] Kohavi, Ron; Deng, Alex; Frasca, Brian; Longbotham, Roger; Walker, Toby; Xu, Ya (2012). "Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained". Proceedings of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.

[5] S2CID 13224883.

[6] Kohavi, Ron; Deng, Alex; Longbotham, Roger; Xu, Ya (2014). "Seven rules of thumb for web site experimenters". Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Vol. 20. New York, New York, USA: ACM. pp. 1857–1866. S2CID 207214362.

[7] Deng, Alex; Xu, Ya; Kohavi, Ron; Walker, Toby (2013). "Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data". WSDM 2013: Sixth ACM International Conference on Web Search and Data Mining.

[8] PMID 15069225.

[9] Angrist, Joshua; Pischke, Jörn-Steffen (2014). Mastering 'Metrics: The Path from Cause to Effect. Princeton University Press. p. 31.

[10] Peirce, Charles Sanders; Jastrow, Joseph (1885). "On Small Differences in Sensation". Memoirs of the National Academy of Sciences. 3: 73–83. http://psychclassics.yorku.ca/Peirce/small-diffs.htm

[11] S2CID 52201011.

[12] S2CID 143685203.

[13] S2CID 23526321.

[14] PMID 24782322.

[15] PMID 21491415.

[16] PMID 25490908.

Caliński, Tadeusz & Kageyama, Sanpei (2000). Block designs: A Randomization approach, Volume I: Analysis. Lecture Notes in Statistics. Vol. 150. New York: Springer-Verlag. ISBN 978-0-387-98578-7.

Caliński, Tadeusz & Kageyama, Sanpei (2003). Block designs: A Randomization approach, Volume II: Design. Lecture Notes in Statistics. Vol. 170. New York: Springer-Verlag. ISBN 978-0-387-95470-7.

S2CID 52201011.

Hinkelmann, Klaus; MR 2363107.

MR 1194407.

