Psychological statistics

Source: Wikipedia, the free encyclopedia.

Psychological statistics is application of formulas, theorems, numbers and laws to psychology. Statistical methods for psychology include development and application statistical theory and methods for modeling psychological data. These methods include

experimental designs, and Bayesian statistics. The article also discusses journals in the same field.[1]


Psychometrics deals with measurement of psychological attributes. It involves developing and applying statistical models for mental measurements.[2] The measurement theories are divided into two major areas: (1) Classical test theory; (2) Item Response Theory.[3]

Classical test theory

The classical test theory or true score theory or reliability theory in statistics is a set of statistical procedures useful for development of psychological tests and scales. It is based on a fundamental equation, X = T + E where, X is total score, T is a true score and E is error of measurement. For each participant, it assumes that there exist a true score and it need to be obtained score (X) has to be as close to it as possible.[2][4] The closeness of X has with T is expressed in terms of ratability of the obtained score. The reliability in terms of classical test procedure is correlation between true score and obtained score. The typical test construction procedures has following steps:

(1) Determine the construct (2) Outline the behavioral domain of the construct (3) Write 3 to 5 times more items than desired test length (4) Get item content analyzed by experts and cull items (5) Obtain data on initial version of the test (6) Item analysis (Statistical Procedure) (7) Factor analysis (Statistical Procedure) (8) After the second cull, make final version (9) Use it for research


The reliability is computed in specific ways. (A) Inter-Rater reliability: Inter-Rater reliability is estimate of agreement between independent raters. This is most useful for subjective responses.

Intra-Class correlation coefficients, Correlation coefficients
, Kendal's concordance coefficient, etc. are useful statistical tools. (B) Test-Retest Reliability: Test-Retest Procedure is estimation of temporal consistency of the test. A test is administered twice to the same sample with a time interval. Correlation between two sets of scores is used as an estimate of reliability. Testing conditions are assumed to be identical. (C) Internal Consistency Reliability: Internal consistency reliability estimates consistency of items with each other. Split-half reliability (Spearman- Brown Prophecy) and Cronbach Alpha are popular estimates of this reliability.[5] (D)
Parallel Form Reliability
: It is an estimate of consistency between two different instruments of measurement. The inter-correlation between two parallel forms of a test or scale is used as an estimate of parallel form reliability.


Validity of a scale or test is ability of the instrument to measure what it purports to measure.[3] Construct validity, Content Validity, and Criterion Validity are types of validity. Construct validity is estimated by convergent and discriminant validity and factor analysis. Convergent and discriminant validity are ascertained by correlation between similar of different constructs. Content Validity: Subject matter experts evaluate content validity. Criterion Validity is correlation between the test and a criterion variable (or variables) of the construct.

Multiple regression analysis, and Logistic regression
are used as an estimate of criterion validity. Software applications: The R software has ‘psych’ package that is useful for classical test theory analysis.[6]

Modern test theory

The modern test theory is based on latent trait model. Every item estimates the ability of the test taker. The ability parameter is called as theta (θ). The difficulty parameter is called b. the two important assumptions are local independence and unidimensionality. The Item Response Theory has three models. They are one parameter logistic model, two parameter logistic model and three parameter logistic model. In addition, Polychromous IRT Model are also useful.[7]

The R Software has ‘ltm’, packages useful for IRT analysis.

Factor analysis

Factor analysis is at the core of psychological statistics. It has two schools: (1) Exploratory Factor analysis (2) Confirmatory Factor analysis.

Exploratory factor analysis (EFA)

The exploratory factor analysis begins without a theory or with a very tentative theory. It is a dimension reduction technique. It is useful in

data analytics
. Typically a k-dimensional correlation matrix or covariance matrix of variables is reduced to k X r factor pattern matrix where r < k. Principal Component analysis and common factor analysis are two ways of extracting data. Principal axis factoring, ML factor analysis, alpha factor analysis and image factor analysis is most useful ways of EFA. It employs various factor rotation methods which can be classified into orthogonal (resulting in uncorrelated factors) and oblique (resulting correlated factors).

The ‘psych’ package in R is useful for EFA.

Confirmatory factor analysis (CFA)

Confirmatory Factor Analysis (CFA) is a factor analytic technique that begins with a theory and test the theory by carrying out factor analysis. The CFA is also called as latent structure analysis, which considers factor as latent variables causing actual observable variables. The basic equation of the CFA is

X = Λξ + δ

where, X is observed variables, Λ are structural coefficients, ξ are latent variables (factors) and δ are errors. The parameters are estimated using ML methods however; other methods of estimation are also available. The chi-square test is very sensitive and hence various fit measures are used.[8][9] R package ‘sem’, ‘lavaan’ are useful for the same.

Experimental design

Experimental methods are very popular in psychology, going back more than 100 years. Experimental psychology is a sub-discipline of psychology . Statistical methods applied for designing and analyzing experimental psychological data include the

, etc.

Multivariate behavioral research

Multivariate behavioral research is becoming very popular in psychology. These methods include

hierarchical linear modelling, etc. are very useful for psychological statistics.[10][11][9][12][13]

Journals for statistical applications for psychology

There are many specialized journals that publish advances in statistical analysis for psychology:

Software packages for psychological research

Various software packages are available for statistical methods for psychological research. They can be classified as commercial software (e.g., JMP and SPSS) and open-source (e.g., R). Among the open-source offerings, the R software is the most popular. There are many online references for R and specialized books on R for Psychologists are also being written.[14] The "psych" package of R is very useful for psychologists. Among others, "lavaan", "sem", "ltm", "ggplot2" are some of the popular packages. PSPP and KNIME are other free packages. Commercial packages include JMP, SPSS and SAS. JMP and SPSS are commonly reported in books.

See also


  1. ^ a b Lord, F. M. , and Novick, M. R. ( 1 968). Statistical theories of mental test scores. Reading, Mass. : Addison-Wesley, 1968.
  2. ^ a b Nunnally, J. & Bernstein, I. (1994). Psychometric Theory. McGraw-Hill.
  3. ^ Raykov, T. & Marcoulides, G.A. (2010) Introduction to Psychometric Theory. New York: Routledge.
  4. ^ Cronbach LJ (1951). Coefficient alpha and the internal structure of tests. Psychometrika 16, 297–334. doi:10.1007/bf02310555
  5. ^ Kline, T. J. B. (2005)Psychological Testing: A Practical Approach to Design and Evaluation. Sage Publications: Thousand Oaks.
  6. ^ Hambleton, R. K., & Swaminathan H. (1985). Item Response theory: Principles and Applications. Boston: Kluwer.
  7. ^ Bollen, KA. (1989). Structural Equations with Latent Variables. New York: John Wiley & Sons.
  8. ^ a b Loehlin, J. E. (1992). Latent Variable Models: An Introduction to Factor, Path, and Structural Analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
  9. ^ Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis. The Guilford Press: NY.
  10. ^ Agresti, A. (1990). Categorical data analysis. Wiley: NJ.
  11. ^ Menard, S. (2001). Applied logistic regression analysis. (2nd ed.). Thousand Oaks. CA: Sage Publications.
  12. ^ Tabachnick, B. G., & Fidell, L. S. (2007). Using Multivariate Statistics, 5th ed. Boston: Allyn and Bacon.

External links