Genetic correlation
In multivariate quantitative genetics, a genetic correlation (denoted $r_g$ or $r_a$) is the proportion of
Genetic correlations have applications in validation of genome-wide association study (GWAS) results, breeding, prediction of traits, and discovering the etiology of traits & diseases.
They can be estimated using individual-level data from twin studies and molecular genetics, or even with GWAS summary statistics.[10][11] Genetic correlations have been found to be common in non-human genetics[12] and to be broadly similar to their respective phenotypic correlations,[13] and also found extensively in human traits, dubbed the 'phenome'.[14][15][16][17][18][19][20][21][22][23][24]
This finding of widespread pleiotropy has implications for artificial selection in agriculture, interpretation of phenotypic correlations, social inequality,[25] attempts to use Mendelian randomization in causal inference,[26][27][28][29] the understanding of the biological origins of complex traits, and the design of GWASes.
A genetic correlation is to be contrasted with environmental correlation between the environments affecting two traits (e.g. if poor nutrition in a household caused both lower IQ and height); a genetic correlation between two traits can contribute to the observed (
Interpretation
Genetic correlations are not the same as heritability: a genetic correlation concerns the overlap between the two sets of genetic influences, not their absolute magnitude. Two traits could both be highly heritable but not genetically correlated, or have small heritabilities and be completely correlated (as long as the heritabilities are non-zero).
For example, consider two traits: dark skin and black hair. Each may individually have a very high heritability (most of the population-level variation in the trait is due to genetic differences; in simpler terms, genetics contributes significantly to the trait). They may nevertheless have a very low genetic correlation if, for instance, the two traits are controlled by different, non-overlapping, non-linked genetic loci.
A genetic correlation between two traits will tend to produce phenotypic correlations – e.g. the genetic correlation between intelligence and SES[16] or education and family SES[37] implies that intelligence/SES will also correlate phenotypically. The phenotypic correlation will be limited by the degree of genetic correlation and also by the heritability of each trait. The expected phenotypic correlation is the 'bivariate heritability' and can be calculated as the product of the square roots of the two heritabilities and the genetic correlation. (Using a Plomin example,[38] for two traits with heritabilities of 0.60 & 0.23, $r_g = 0.75$, and phenotypic correlation of r=0.45 the bivariate heritability would be $\sqrt{0.60} \times 0.75 \times \sqrt{0.23} = 0.28$, so of the observed phenotypic correlation, 0.28/0.45 = 62% of it is due to correlative genetic effects, which is to say nothing of trait mutability in and of itself.)
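The arithmetic of the Plomin example can be checked with a short Python sketch. The function names are illustrative only, and the genetic correlation of 0.75 is an assumption here, chosen because it reproduces the 0.28 figure quoted in the text.

```python
import math


def bivariate_heritability(h2_x, h2_y, r_g):
    """Expected phenotypic correlation produced by shared genetic effects.

    Computed as sqrt(h2_x) * r_g * sqrt(h2_y).
    """
    return math.sqrt(h2_x) * r_g * math.sqrt(h2_y)


def genetic_share(h2_x, h2_y, r_g, r_p):
    """Fraction of the observed phenotypic correlation r_p that is genetic."""
    return bivariate_heritability(h2_x, h2_y, r_g) / r_p


# Values from the example: heritabilities 0.60 and 0.23, r_p = 0.45.
# r_g = 0.75 is assumed (see the note above).
biv = bivariate_heritability(0.60, 0.23, 0.75)
share = genetic_share(0.60, 0.23, 0.75, 0.45)
print(round(biv, 2), round(share, 2))
```

Because the bivariate heritability scales with the square roots of both heritabilities, even a genetic correlation of 1 produces only a modest phenotypic correlation when either trait is weakly heritable.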
Cause
Genetic correlations can arise due to:[19]
- linkage disequilibrium (two neighboring genes tend to be inherited together, each affecting a different trait)
- biological pleiotropy (a single gene having multiple otherwise unrelated biological effects, or shared regulation of multiple genes[39])
- mediated pleiotropy (a gene affects trait X and trait X affects trait Y).
- biases, such as misclassification of diagnoses
Uses
Causes of changes in traits
Genetic correlations are scientifically useful because they can be analyzed longitudinally within an individual over time[41] (e.g. intelligence is stable over a lifetime, due to the same genetic influences – childhood genetically correlates with old age
Boosting GWASes
Genetic correlations can be used in
Genetic correlations can also quantify the contribution of correlations <1 across datasets which might create a false "
Breeding
Hairless dogs have imperfect teeth; long-haired and coarse-haired animals are apt to have, as is asserted, long or many horns; pigeons with feathered feet have skin between their outer toes; pigeons with short beaks have small feet, and those with long beaks large feet. Hence if man goes on selecting, and thus augmenting any peculiarity, he will almost certainly modify unintentionally other parts of the structure, owing to the mysterious laws of correlation.
—The Origin of Species, 1859
Genetic correlations are also useful in applied contexts such as
Breeding experiments on genetically correlated traits can measure the extent to which correlated traits are inherently developmentally linked & response is constrained, and which can be dissociated.[56] Some traits, such as the size of eyespots on the butterfly Bicyclus anynana can be dissociated in breeding,[57] but other pairs, such as eyespot colors, have resisted efforts.[58]
Mathematical definition
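Given a genetic covariance matrix, the genetic correlation is computed by standardizing it, i.e. by converting the covariance matrix to a correlation matrix. For two traits $X$ and $Y$ with genetic variances $V_g(X)$ and $V_g(Y)$ and genetic covariance $\mathrm{cov}_g(X, Y)$:
$$r_g = \frac{\mathrm{cov}_g(X, Y)}{\sqrt{V_g(X)\, V_g(Y)}}$$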
Computing the genetic correlation
Genetic correlations require a genetically informative sample. They can be estimated in breeding experiments on two traits of known heritability and selecting on one trait to measure the change in the other trait (allowing inferring the genetic correlation), family/adoption/
As with estimating SNP heritability and genetic correlation, the better computational scaling & the ability to estimate using only established summary association statistics is a particular advantage for HDL[11] and LD score regression over competing methods. Combined with the increasing availability of GWAS summary statistics or polygenic scores from datasets like the UK Biobank, such summary-level methods have led to an explosion of genetic correlation research since 2015.[citation needed]
The methods are related to Haseman–Elston regression & PCGC regression.[65] Such methods are typically genome-wide, but it is also possible to estimate genetic correlations for specific variants or genome regions.[66]
One way to consider it is using trait X in twin 1 to predict trait Y in twin 2 for monozygotic and dizygotic twins (i.e. using twin 1's IQ to predict twin 2's brain volume); if this cross-correlation is larger for the more genetically-similar monozygotic twins than for the dizygotic twins, the similarity indicates that the traits are not genetically independent and there is some common genetics influencing both IQ and brain volume. (Statistical power can be boosted by using siblings as well.[67])
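The cross-twin comparison can be turned into a rough numerical estimate. The sketch below is not the method of any particular study: it assumes standardized traits, a simple additive ACE model, and Falconer-style doubling of the MZ–DZ difference. The function names and input correlations are hypothetical. In practice such estimates are usually obtained by fitting structural equation models rather than by this shortcut.

```python
import math


def falconer_h2(r_mz, r_dz):
    """Falconer-style heritability: twice the MZ-DZ difference in correlation."""
    return 2.0 * (r_mz - r_dz)


def twin_genetic_correlation(r_mz_x, r_dz_x, r_mz_y, r_dz_y, cross_mz, cross_dz):
    """Rough genetic correlation from twin correlations (assumed ACE model).

    cross_mz / cross_dz are cross-twin cross-trait correlations, e.g. trait X
    in twin 1 against trait Y in twin 2.
    """
    h2_x = falconer_h2(r_mz_x, r_dz_x)
    h2_y = falconer_h2(r_mz_y, r_dz_y)
    if h2_x <= 0 or h2_y <= 0:
        raise ValueError("both heritability estimates must be positive")
    # Standardized additive genetic covariance between the two traits.
    cov_a = 2.0 * (cross_mz - cross_dz)
    return cov_a / math.sqrt(h2_x * h2_y)


# Hypothetical inputs, for illustration only.
r_g = twin_genetic_correlation(
    r_mz_x=0.80, r_dz_x=0.50,   # trait X twin correlations
    r_mz_y=0.70, r_dz_y=0.45,   # trait Y twin correlations
    cross_mz=0.30, cross_dz=0.15,
)
print(round(r_g, 2))
```

If the cross-twin cross-trait correlation is the same in monozygotic and dizygotic pairs, the estimated genetic covariance, and therefore the genetic correlation, is zero under these assumptions.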
Genetic correlations are affected by methodological concerns; underestimation of heritability, such as due to assortative mating, will lead to overestimates of longitudinal genetic correlation,[68] and moderate levels of misdiagnoses can create pseudo correlations.[69]
As they are affected by heritabilities of both traits, genetic correlations have low statistical power, especially in the presence of measurement errors biasing heritability downwards, because "estimates of genetic correlations are usually subject to rather large sampling errors and therefore seldom very precise": the standard error of an estimate $r_g$ is approximately $\sigma_{(r_g)} = \frac{1 - r_g^2}{\sqrt{2}} \cdot \sqrt{\frac{\sigma_{(h_x^2)} \cdot \sigma_{(h_y^2)}}{h_x^2 \cdot h_y^2}}$.[70] (Larger genetic correlations & heritabilities will be estimated more precisely.[71]) However, inclusion of genetic correlations in an analysis of a pleiotropic trait can boost power for the same reason that multivariate regressions are more powerful than separate univariate regressions.[72]
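To see how precision depends on the inputs, the sketch below evaluates the textbook approximation for the standard error of a genetic correlation, taken here (as an assumption) to be (1 - r_g^2)/sqrt(2) multiplied by the square root of the product of the two heritability standard errors divided by the product of the two heritabilities. The function name and the numeric inputs are hypothetical.

```python
import math


def se_genetic_correlation(r_g, h2_x, h2_y, se_h2_x, se_h2_y):
    """Approximate standard error of a genetic correlation estimate.

    Assumed form:
    (1 - r_g**2) / sqrt(2) * sqrt(se_h2_x * se_h2_y / (h2_x * h2_y))
    """
    if h2_x <= 0 or h2_y <= 0:
        raise ValueError("heritabilities must be positive")
    scale = (1.0 - r_g ** 2) / math.sqrt(2.0)
    return scale * math.sqrt((se_h2_x * se_h2_y) / (h2_x * h2_y))


# Hypothetical inputs: moderate heritabilities, each estimated with SE 0.10.
se_mid = se_genetic_correlation(0.5, 0.4, 0.4, 0.10, 0.10)
se_high_rg = se_genetic_correlation(0.9, 0.4, 0.4, 0.10, 0.10)
se_high_h2 = se_genetic_correlation(0.5, 0.8, 0.8, 0.10, 0.10)
print(round(se_mid, 3), round(se_high_rg, 3), round(se_high_h2, 3))
```

Under this approximation the standard error shrinks as either the genetic correlation or the heritabilities grow, which matches the remark that larger values are estimated more precisely.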
Twin methods have the advantage of being usable without detailed biological data: human genetic correlations were calculated as far back as the 1970s and animal/plant genetic correlations in the 1930s, and sample sizes in the hundreds are enough for them to be well-powered. Their disadvantages are that they make assumptions which have been criticized, that for rare traits like anorexia nervosa it may be difficult to find enough twins with a diagnosis to make meaningful cross-twin comparisons, and that the correlations can only be estimated with access to the twin data. Molecular genetic methods like GCTA or LD score regression have the advantage of not requiring specific degrees of relatedness and so can easily study rare traits using
More concretely, if two traits, say height and weight, have the following additive genetic variance-covariance matrix:
Trait | Height | Weight |
Height | 36 | 36 |
Weight | 36 | 117 |
Then the genetic correlation is 0.55, as seen in the standardized matrix below:
Trait | Height | Weight |
Height | 1 | |
Weight | 0.55 | 1 |
In practice, structural equation modeling applications such as Mx or OpenMx (and before that, historically, LISREL[73]) are used to calculate both the genetic covariance matrix and its standardized form. In R, cov2cor() will standardize the matrix.
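For readers without R, the same standardization can be sketched in plain Python. The helper below is an illustrative stand-in that mimics what cov2cor() does for a small matrix; it is not part of any of the packages named above.

```python
import math


def cov2cor(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix.

    Each element is divided by the product of the two standard deviations,
    i.e. the square roots of the matching diagonal entries.
    """
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Additive genetic variance-covariance matrix for height and weight.
G = [
    [36.0, 36.0],
    [36.0, 117.0],
]

R = cov2cor(G)
print(round(R[1][0], 2))  # genetic correlation between height and weight
```

The off-diagonal entry is 36 / (6 × √117), which rounds to the 0.55 shown in the standardized matrix.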
Typically, published reports will provide genetic variance components that have been standardized as a proportion of total variance (for instance in an ACE
See also
- Gene-environment correlation
- Heritability of intelligence; g factor (psychometrics)
- Cognitive epidemiology
- Lothian birth-cohort studies
- Mendelian randomization
References
- Falconer, Ch. 19
- ISBN 9780878934812
- Neale & Maes (1996), Methodology for genetics studies of twins and families (6th ed.). Dordrecht, The Netherlands: Kluwer. Archived 2017-03-27 at the Wayback Machine.
- Plomin et al., p. 123
- S2CID 12600152. Archived from the original (PDF) on 2016-10-25.
- S2CID 302717.
- Loehlin & Vandenberg (1968) "Genetic and environmental components in the covariation of cognitive abilities: An additive model", in Progress in Human Behaviour Genetics, ed. S. G. Vandenberg, pp. 261–278. Johns Hopkins, Baltimore.
- PMID 12573188.
- PMID 21845929.
- PMID 26414676.
- S2CID 220260262.
- ]
- PMID 28581166.
- PMID 26303664. Archived from the original (PDF) on 2017-02-02. Retrieved 2016-10-24.
- PMID 26809841.
- PMID 27818178.
- PMID 27663502.
- PMID 22077970.
- PMID 23752797.
- PMID 21852963.
- PMID 22001757.
- )
- PMID 30349118.
- )
- PMID 26292196.
- PMID 26291580.
- PMID 28572633.
- PMID 26050253.
- PMID 29686387.
- Falconer, p. 315 cites the example of chicken size and egg laying: chickens grown large for genetic reasons lay later, fewer, and larger eggs, while chickens grown large for environmental reasons lay quicker and more but normal sized eggs; Table 19.1 on p. 316 also provides examples of opposite-signed phenotypic & genetic correlations: fleece-weight/length-of-wool & fleece weight/body-weight in sheep, and body-weight/egg-timing & body-weight/egg-production in chickens. One consequence of the negative chicken correlations was that, despite moderate heritabilities and a positive phenotypic correlation, selection had begun to fail to yield any improvements (p. 329) according to "Genetic slippage in response to selection for multiple objectives", Dickerson 1955.
- S2CID 33699313.
- S2CID 21190284.
- S2CID 32644733.
- S2CID 86659038. Archived from the original (PDF) on 2019-07-21.
- S2CID 21760916.
- S2CID 13668940.
- PMID 25754083.
- Plomin et al., p. 397
- PMID 28282383.
- PMID 29040562.
- S2CID 41253666.
- S2CID 4427683.
- PMID 23722424.
- PMID 25201988.
- PMID 28785111.
- PMID 29326435.
- PMID 29292387.
- PMID 23637625.
- PMID 28287610.
- PMID 28095416.
- PMID 17247099.
- Rae, A. L. (1951) "The Importance of Genetic Correlations in Selection"
- .
- Lerner, M. (1950) Population genetics and animal improvement: as illustrated by the inheritance of egg production. New York: Cambridge Univ. Press
- Falconer, pp. 324–329
- S2CID 15933304.
- S2CID 4382085.
- PMID 18366752.
- PMID 22843982.
- PMID 25642630.
- )
- PMID 29432419.
- PMID 27346688.
- PMID 10689802.
- PMID 25422463.
- PMID 29100087.
- S2CID 14920235. Archived from the original (PDF) on 2016-09-11. Retrieved 2016-10-24.
- doi:10.1037/0012-1649.23.1.4.
- PMID 22258521.
- Falconer, pp. 317–318
- PMID 17903108.
- S2CID 34841607.
- S2CID 46155044.
Cited sources
- Falconer, Douglas Scott (1960). Introduction to Quantitative Genetics.
- Plomin, Robert; DeFries, John C.; Knopik, Valerie S. and Neiderhiser, Jenae M. (2012). Behavioral Genetics. Worth Publishers. ISBN 978-1-4292-4215-8.
External links
- The G-matrix Online Archived 2016-09-18 at the Wayback Machine