Homogeneity and heterogeneity (statistics)
In
Homogeneity can be studied to several degrees of complexity. For example, considerations of
The concept of homogeneity can be applied in many different ways and, for certain types of statistical analysis, it is used to look for further properties that might need to be treated as varying within a dataset once some initial types of non-homogeneity have been dealt with.
Of variance
In statistics, a sequence of random variables is homoscedastic (/ˌhoʊmoʊskəˈdæstɪk/) if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used.[1][2][3]
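The distinction can be illustrated with simulated data. The following is a minimal sketch, not a formal test: the sample size, the distributions and the helper name `half_variance_ratio` are assumptions chosen for illustration. It compares the sample variance of the second half of a sequence with that of the first half, for one sequence with constant variance and one whose variance grows along the sequence.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

n = 2000
# Homoscedastic sequence: every draw has the same standard deviation (1.0).
homo = [random.gauss(0.0, 1.0) for _ in range(n)]
# Heteroscedastic sequence: the standard deviation grows with the index i.
hetero = [random.gauss(0.0, 0.5 + 2.0 * i / n) for i in range(n)]


def half_variance_ratio(values):
    """Sample variance of the second half divided by that of the first half."""
    mid = len(values) // 2
    return statistics.variance(values[mid:]) / statistics.variance(values[:mid])


ratio_homo = half_variance_ratio(homo)
ratio_hetero = half_variance_ratio(hetero)
print(f"homoscedastic ratio:   {ratio_homo:.2f}")
print(f"heteroscedastic ratio: {ratio_hetero:.2f}")
```

A ratio close to 1 is consistent with homogeneity of variance, while a ratio well above 1 suggests that the variance changes along the sequence.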
Assuming a variable is homoscedastic when in reality it is heteroscedastic (
The existence of heteroscedasticity is a major concern in
Because heteroscedasticity concerns expectations of the second moment of the errors, its presence is referred to as misspecification of the second order.[7]
Examples
Regression
Differences in the typical values across the dataset might initially be dealt with by constructing a regression model using certain explanatory variables to relate variations in the typical value to known quantities. There should then be a later stage of analysis to examine whether the errors in the predictions from the regression behave in the same way across the dataset. Thus the question becomes one of the homogeneity of the distribution of the residuals, as the explanatory variables change. See regression analysis.
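As an informal illustration of this later stage, the sketch below fits a straight line by ordinary least squares and then compares the variance of the residuals for small and large values of the explanatory variable. The data are synthetic, the error model (spread proportional to x) is an assumption made for the example, and the comparison is a rough diagnostic rather than a formal test.

```python
import random
import statistics

random.seed(1)

# Synthetic data (illustrative assumption): y depends linearly on x,
# and the error standard deviation grows in proportion to x.
x = [1.0 + 9.0 * i / 499 for i in range(500)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

# Ordinary least squares fit of y = a + b*x using the closed-form formulas.
x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# x is already sorted, so the first and last thirds of the residuals
# correspond to the smallest and largest values of x.
third = len(x) // 3
var_low = statistics.pvariance(residuals[:third])
var_high = statistics.pvariance(residuals[-third:])
print(f"slope estimate: {b:.2f}")
print(f"residual variance, low x:  {var_low:.2f}")
print(f"residual variance, high x: {var_high:.2f}")
```

If the residuals were homogeneous, the two variances would be of similar size; a large difference indicates that the residual distribution changes with the explanatory variable.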
Time series
The initial stages in the analysis of a time series may involve plotting values against time to examine homogeneity of the series in various ways: stability across time as opposed to a trend; stability of local fluctuations over time.
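Alongside plotting, the same two aspects can be summarised numerically. The sketch below is a simple illustration on a synthetic series (the trend, the noise model and the block size are assumptions): it splits the series into consecutive blocks and reports the mean and standard deviation of each, so that a drift in the block means points to a trend and a drift in the block standard deviations points to changing local fluctuations.

```python
import random
import statistics

random.seed(2)

# Synthetic series (illustrative assumption): an upward trend plus noise
# whose spread grows over time.
n = 400
series = [0.05 * t + random.gauss(0.0, 1.0 + 0.01 * t) for t in range(n)]


def block_summary(values, block_size):
    """Mean and standard deviation for consecutive non-overlapping blocks."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(statistics.fmean(b), statistics.stdev(b)) for b in blocks]


summary = block_summary(series, 100)
for k, (m, s) in enumerate(summary):
    print(f"block {k}: mean={m:6.2f}  sd={s:5.2f}")
```

The within-block standard deviation here also absorbs part of the trend inside each block, so in practice the trend would usually be removed before local fluctuations are compared.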
Combining information across sites
In hydrology, data series from a number of sites, each composed of annual values of the within-year maximum river flow, are analysed. A common model is that the distributions of these values are the same for all sites apart from a simple scaling factor, so that the location and scale are linked in a simple way. There can then be questions of examining the homogeneity across sites of the distribution of the scaled values.
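The scaling model can be sketched as follows. The site names, scaling factors and the log-normal distribution are hypothetical choices made for illustration, and dividing by the site mean is only one possible choice of scaling factor. Each site's annual maxima are divided by the site mean, and the spread of the scaled values is then compared across sites.

```python
import random
import statistics

random.seed(3)

# Hypothetical sites with different scaling factors (illustrative values only).
site_scale = {"site_A": 50.0, "site_B": 200.0, "site_C": 800.0}

# Synthetic annual maxima: one common distribution multiplied by each site's factor.
annual_max = {
    site: [scale * random.lognormvariate(0.0, 0.4) for _ in range(60)]
    for site, scale in site_scale.items()
}

scaled_cv = {}
for site, flows in annual_max.items():
    site_mean = statistics.fmean(flows)
    # Scaled values have mean 1 by construction.
    scaled = [q / site_mean for q in flows]
    # The standard deviation of the scaled values is the coefficient of variation.
    scaled_cv[site] = statistics.stdev(scaled)
    print(f"{site}: mean flow={site_mean:8.1f}  CV of scaled values={scaled_cv[site]:.3f}")
```

Under the scaling model the coefficients of variation should be similar across sites even though the mean flows differ greatly; markedly different values would cast doubt on homogeneity of the scaled distributions.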
Combining information sources
In meteorology, weather datasets are acquired over many years of record and, as part of this, measurements at certain stations may cease occasionally while, at around the same time, measurements may start at nearby locations. There are then questions as to whether, if the records are combined to form a single longer set of records, those records can be considered homogeneous over time. An example of homogeneity testing of wind speed and direction data can be found in Romanić et al., 2015.[9]
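One simple way to examine such a combined record is to compare the mean level before and after the changeover. The sketch below does this with Welch's t statistic on synthetic annual mean wind speeds; the data, the size of the shift and the helper name `welch_t` are assumptions made for illustration, and this is not the procedure used in the cited study.

```python
import math
import random
import statistics

random.seed(4)

# Synthetic annual mean wind speeds (illustrative assumption): the
# replacement station reads about 1.0 unit higher than the original one.
old_station = [random.gauss(5.0, 0.6) for _ in range(30)]
new_station = [random.gauss(6.0, 0.6) for _ in range(30)]


def welch_t(a, b):
    """Welch's t statistic for a difference in means with unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(b) - statistics.fmean(a)) / se


t_stat = welch_t(old_station, new_station)
print(f"mean before changeover: {statistics.fmean(old_station):.2f}")
print(f"mean after changeover:  {statistics.fmean(new_station):.2f}")
print(f"Welch t statistic:      {t_stat:.2f}")
```

A statistic far from zero suggests a step change at the changeover, in which case the combined record would not be homogeneous over time without adjustment.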
Homogeneity within populations
Simple population surveys may start from the idea that responses will be homogeneous across the whole of a population. Assessing the homogeneity of the population would involve looking to see whether the responses of certain identifiable
Tests
A test for homogeneity, in the sense of exact equivalence of statistical distributions, can be based on an
References
- JSTOR 1911250.
- ^ White, Halbert (1980). "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity". Econometrica. 48 (4): 817–838. JSTOR 1912934.
- ^ Gujarati, D. N.; ISBN 9780073375779.
- ISBN 9780471311010.
- ^ Johnston, J. (1972). Econometric Methods. New York: McGraw-Hill. pp. 214–221.
- ISBN 978-1-4008-2982-8.
- ISBN 978-0-8039-4506-7.
- JSTOR 1912773.
- ^ Romanić, D.; Ćurić, M.; Jovičić, I.; Lompar, M. (2015). "Long-term trends of the 'Koshava' wind during the period 1949–2010". International Journal of Climatology. 35 (2): 288–302. doi:10.1002/joc.3981.
Further reading
- Hall, M.J. (2003). The interpretation of non-homogeneous hydrometeorological time series: a case study.
- Krus, D.J., & Blackman, H.S. (1988). Test reliability and homogeneity from perspective of the ordinal test theory. Applied Measurement in Education, 1, 79–88.
- Loevinger, J. (1948). The technic of homogeneous tests compared with some aspects of scale analysis and factor analysis. Psychological Bulletin, 45, 507–529.