Multilevel regression with poststratification
Multilevel regression with poststratification (MRP), sometimes called "Mister P",[1] is a statistical technique for correcting model estimates for known differences between a sample population (the population from which the available data were drawn) and a target population (the population for which estimates are wanted).
Poststratification refers to the adjustment step: the final estimate is essentially a weighted average of the estimates for all possible combinations of attributes (for example, age and sex), with each combination weighted by its share of the target population. Each combination is sometimes called a "cell". The multilevel regression is used to smooth noisy estimates in cells with too little data by drawing on overall or nearby averages.
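The weighted average described above can be written out as a short sketch. The notation is an assumption introduced here for illustration, not taken from the cited sources: cells are indexed by j, N_j is the known population count in cell j, n_j is the number of sampled respondents in cell j, and the second formula is the approximate partial-pooling estimate for the simplest case, a normal model with a varying intercept only and variances treated as known.

```latex
% Poststratified estimate: population-weighted average of the cell estimates
\hat{\theta}^{\mathrm{PS}} = \frac{\sum_{j=1}^{J} N_j \, \hat{\theta}_j}{\sum_{j=1}^{J} N_j}

% Partial pooling in the simplest varying-intercept case:
% \bar{y}_j is the raw mean in cell j, \bar{y} the overall mean,
% \sigma_y^2 the within-cell variance, \sigma_\alpha^2 the between-cell variance
\hat{\theta}_j \approx
  \frac{\dfrac{n_j}{\sigma_y^2}\,\bar{y}_j + \dfrac{1}{\sigma_\alpha^2}\,\bar{y}}
       {\dfrac{n_j}{\sigma_y^2} + \dfrac{1}{\sigma_\alpha^2}}
```

In this simplified form, a cell with few respondents (small n_j) is pulled strongly toward the overall mean, while a well-sampled cell stays close to its own raw mean. For a sub-region, the first sum would run only over the cells belonging to that sub-region.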
One application is estimating preferences in sub-regions (e.g., states, individual constituencies) based on individual-level survey data gathered at other levels of aggregation (e.g., national surveys).[2]
The technique and its advantages
The technique involves two steps. In the first, individual-level survey data are used to estimate the relationship between individual preferences and types of people defined by characteristics such as age and race (the multilevel regression). In the second, this relationship is combined with data, for example from a census, on the number of people of each type in each sub-region to estimate the sub-regional preference (a process known as "poststratification").[3] This avoids the need to carry out surveys at sub-regional level, which can be expensive and impractical in an area (e.g., a country) with many sub-regions (e.g., counties, ridings, or states). It also avoids problems of consistency that arise when comparing different surveys carried out in different areas.[4][2] Additionally, it allows preferences within a specific locality to be estimated from a survey taken across a wider area that includes relatively few people from that locality, or where the sample may be highly unrepresentative.[5]
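The two steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the survey responses and census counts are made up, the function names are hypothetical, and the smoothing step is a simple shrinkage of each cell mean toward the overall mean (controlled by a hypothetical `prior_strength` constant) standing in for a fitted multilevel regression, which would normally estimate the amount of pooling from the data.

```python
from collections import defaultdict

# Hypothetical survey: ((region, age_group), response), response in {0, 1}.
survey = [
    (("North", "young"), 1), (("North", "young"), 1),
    (("North", "old"), 0),
    (("South", "young"), 1), (("South", "young"), 0),
    (("South", "old"), 1), (("South", "old"), 0), (("South", "old"), 0),
]

# Hypothetical census counts of people in each (region, age_group) cell.
census = {
    ("North", "young"): 600, ("North", "old"): 400,
    ("South", "young"): 300, ("South", "old"): 700,
}


def smoothed_cell_estimates(responses, cells, prior_strength=2.0):
    """Step 1 (stand-in for multilevel regression): shrink each cell's
    raw mean toward the overall mean. Sparse cells are pulled harder;
    a cell with no respondents gets exactly the overall mean."""
    overall = sum(y for _, y in responses) / len(responses)
    sums, counts = defaultdict(float), defaultdict(int)
    for cell, y in responses:
        sums[cell] += y
        counts[cell] += 1
    return {
        cell: (sums[cell] + prior_strength * overall)
        / (counts[cell] + prior_strength)
        for cell in cells
    }


def poststratify_by_region(cell_estimates, population_counts):
    """Step 2: population-weighted average of cell estimates per region."""
    weighted, totals = defaultdict(float), defaultdict(float)
    for (region, age), n in population_counts.items():
        weighted[region] += n * cell_estimates[(region, age)]
        totals[region] += n
    return {region: weighted[region] / totals[region] for region in totals}


cell_estimates = smoothed_cell_estimates(survey, census.keys())
region_estimates = poststratify_by_region(cell_estimates, census)
# region_estimates["North"] is about 0.583 and region_estimates["South"] is 0.43
```

With these made-up numbers, the raw sample mean for North is 2/3, but young respondents make up two thirds of the North sample against 60% of its census population, and the single "old" respondent is a noisy basis for that cell. Smoothing and reweighting give roughly 0.583 instead.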
History
The technique was originally developed by
YouGov used the technique to successfully predict the overall outcome of the 2017 UK general election,[10] correctly predicting the result in 93% of constituencies.[11]
Limitations and extensions
MRP can be extended to estimate changes in opinion over time.[4] When used to predict elections, it works best relatively close to polling day, after nominations have closed.[12]
Both the "multilevel regression" and "poststratification" ideas of MRP can be generalized. Multilevel regression can be replaced by nonparametric regression[13] or regularized prediction, and poststratification can be generalized to allow for non-census variables, i.e. poststratification totals that are estimated rather than being known.[14]
References
- ^ "Mister P: What's its secret sauce? | Statistical Modeling, Causal Inference, and Social Science". statmodeling.stat.columbia.edu. Retrieved 2023-10-13.
- ^ JSTOR 24572674.
- ^ "What is MRP?". Survation.com. Survation. 5 November 2018. Retrieved 31 October 2019.
- ^ a b Gelman, Andrew; Lax, Jeffrey; Phillips, Justin; Gabry, Jonah; Trangucci, Robert (28 August 2018). "Using Multilevel Regression and Poststratification to Estimate Dynamic Public Opinion" (PDF): 1–3. Retrieved 31 October 2019.
- ^ a b Downes, Marnie; Gurrin, Lyle C.; English, Dallas R.; Pirkis, Jane; Currier, Diane; Spital, Matthew J.; Carlin, John B. (9 April 2018). "Multilevel Regression and Poststratification: A Modeling Approach to Estimating Population Quantities From Highly Selected Survey Samples". American Journal of Epidemiology. 179 (8): 187. Retrieved 31 October 2019.
- ^ Gelman, Andrew; Little, Thomas (1997). "Poststratification into many categories using hierarchical logistic regression". Survey Methodology. 23: 127–135.
- JSTOR 2286322.
- JSTOR 2290792.
- .
- ^ Revell, Timothy (9 June 2017). "How YouGov's experimental poll correctly called the UK election". New Scientist. Retrieved 31 October 2019.
- ^ Cohen, Daniel (27 September 2019). "'I've never known voters be so promiscuous': the pollsters working to predict the next UK election". The Guardian. Retrieved 31 October 2019.
- ^ James, William; MacLellan, Kylie (15 October 2019). "A question of trust: British pollsters battle to call looming election". Reuters. Retrieved 31 October 2019.
- S2CID 201385400.
- ^ Gelman, Andrew (28 October 2018). "MRP (or RPP) with non-census variables". Statistical Modeling, Causal Inference, and Social Science.