Latent class model
In statistics, a latent class model (LCM) is a model for clustering multivariate discrete data. It assumes that the data arise from a mixture of discrete distributions, within each of which the variables are independent. It is called a latent class model because the class to which each data point belongs is unobserved, or latent.
Latent class analysis (LCA) is a subset of structural equation modeling, used to find groups or subtypes of cases in multivariate categorical data. These subtypes are called "latent classes".[1][2]
A researcher might use LCA in a situation such as the following. Suppose that symptoms a-d have been measured in a range of patients with diseases X, Y, and Z, and that disease X is associated with the presence of symptoms a, b, and c; disease Y with symptoms b, c, and d; and disease Z with symptoms a, c, and d.
LCA will attempt to detect the presence of latent classes (the disease entities) that create the patterns of association in the symptoms.
The criterion for solving the LCA is to find latent classes within which no association remains between one symptom and another. The class is the disease that causes the symptoms to be associated, so the set of diseases a patient has (or the class a case is a member of) accounts for the symptom association. The symptoms are then "conditionally independent", i.e., conditional on class membership, they are no longer related.[1]
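The conditional independence in the disease example can be checked numerically. The following is a minimal sketch, not part of the source: the equal class sizes and the probabilities 0.9 (symptom typical of the disease) and 0.1 (symptom not typical) are illustrative assumptions, and the function names are hypothetical. It computes the exact covariance between symptoms a and b in the pooled population and within disease X alone.

```python
from itertools import product

SYMPTOMS = ("a", "b", "c", "d")

# Assumed values for illustration only (not taken from the article).
PRIOR = {"X": 1 / 3, "Y": 1 / 3, "Z": 1 / 3}
TYPICAL = {"X": {"a", "b", "c"}, "Y": {"b", "c", "d"}, "Z": {"a", "c", "d"}}
P_TYPICAL, P_OTHER = 0.9, 0.1


def p_pattern_given_class(pattern, cls):
    """P(symptom pattern | class), with symptoms independent within the class."""
    prob = 1.0
    for symptom, present in zip(SYMPTOMS, pattern):
        p = P_TYPICAL if symptom in TYPICAL[cls] else P_OTHER
        prob *= p if present else 1.0 - p
    return prob


def joint_table(class_weights):
    """P(pattern) for a mixture of classes with the given weights."""
    return {
        pattern: sum(w * p_pattern_given_class(pattern, c)
                     for c, w in class_weights.items())
        for pattern in product((0, 1), repeat=len(SYMPTOMS))
    }


def covariance(table, s1, s2):
    """Covariance of two binary symptoms under the joint table."""
    i, j = SYMPTOMS.index(s1), SYMPTOMS.index(s2)
    p_i = sum(p for pat, p in table.items() if pat[i])
    p_j = sum(p for pat, p in table.items() if pat[j])
    p_ij = sum(p for pat, p in table.items() if pat[i] and pat[j])
    return p_ij - p_i * p_j


pooled = joint_table(PRIOR)          # all patients together
within_x = joint_table({"X": 1.0})   # patients with disease X only

cov_pooled = covariance(pooled, "a", "b")
cov_within = covariance(within_x, "a", "b")
print("pooled covariance of a and b:", cov_pooled)
print("covariance within class X:  ", cov_within)
```

Under these assumed numbers the pooled covariance is clearly non-zero, while within a single class it vanishes up to floating-point error, which is the pattern LCA looks for.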
Model
Within each latent class, the observed variables are independent of one another.
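In one form, the latent class model is written as

$$p_{i_1, i_2, \ldots, i_N} \approx \sum_{t=1}^{T} p_t \prod_{n=1}^{N} p^n_{i_n, t}$$

where $T$ is the number of latent classes and $p_t$ are the so-called recruitment or unconditional probabilities, which should sum to one. $p^n_{i_n, t}$ are the marginal or conditional probabilities.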
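As an illustration of the formula, the sketch below builds the full table of cell probabilities from a set of class probabilities and per-class conditional probabilities. The function name `lcm_joint`, the nested-list layout of the parameters, and all numeric values are assumptions made for this example.

```python
from itertools import product
from math import prod


def lcm_joint(class_probs, cond_probs):
    """Cell probabilities of a latent class model.

    class_probs[t]      : probability of latent class t (sums to one)
    cond_probs[n][t][i] : P(variable n takes category i | class t)
    Returns {(i_1, ..., i_N): sum_t p_t * prod_n cond_probs[n][t][i_n]}.
    """
    n_categories = [len(variable[0]) for variable in cond_probs]
    table = {}
    for cell in product(*(range(k) for k in n_categories)):
        table[cell] = sum(
            p_t * prod(cond_probs[n][t][i_n] for n, i_n in enumerate(cell))
            for t, p_t in enumerate(class_probs)
        )
    return table


# Hypothetical parameters: T = 2 classes, N = 3 binary variables.
class_probs = [0.6, 0.4]
cond_probs = [
    [[0.9, 0.1], [0.2, 0.8]],  # variable 1: one row per class
    [[0.7, 0.3], [0.3, 0.7]],  # variable 2
    [[0.8, 0.2], [0.1, 0.9]],  # variable 3
]

table = lcm_joint(class_probs, cond_probs)
for cell, p in sorted(table.items()):
    print(cell, round(p, 4))
```

Because each row of conditional probabilities sums to one and the class probabilities sum to one, the resulting cell probabilities also sum to one.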
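For a two-way latent class model, the form is

$$p_{ij} \approx \sum_{t=1}^{T} p_t \, p_{it} \, p_{jt}$$

with the sum running over the $T$ latent classes.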
This two-way model is related to probabilistic latent semantic analysis and non-negative matrix factorization.
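The two-way case can be sketched as a sum of non-negative rank-one tables, one per latent class, which is the kind of structure shared with non-negative factorizations. The code below is an illustrative sketch only; the function name `two_way_table` and the parameter values are assumptions.

```python
def two_way_table(class_probs, row_probs, col_probs):
    """Two-way latent class table: p_ij = sum_t p_t * p_it * p_jt.

    class_probs[t]  : probability of latent class t
    row_probs[t][i] : P(row category i | class t)
    col_probs[t][j] : P(column category j | class t)
    """
    n_rows, n_cols = len(row_probs[0]), len(col_probs[0])
    return [
        [
            sum(p_t * row_probs[t][i] * col_probs[t][j]
                for t, p_t in enumerate(class_probs))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


# Hypothetical parameters: T = 2 classes, a 3 x 2 contingency table.
class_probs = [0.5, 0.5]
row_probs = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
col_probs = [[0.8, 0.2], [0.3, 0.7]]

p = two_way_table(class_probs, row_probs, col_probs)
for row in p:
    print([round(x, 4) for x in row])
```

Every factor is a probability, so all entries of the table are non-negative, and with T classes the table has rank at most T.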
The probability model used in LCA is closely related to the Naive Bayes classifier. The main difference is that in LCA, the class membership of an individual is a latent variable, whereas in Naive Bayes classifiers the class membership is an observed label.
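The difference can be made concrete with a small sketch. Given parameters, the posterior class probabilities are computed by Bayes' rule exactly as in a Naive Bayes prediction. Because the labels are unobserved in LCA, the parameters cannot be obtained by simple counting; the sketch assumes estimation by the expectation-maximization (EM) algorithm, a common choice that the text above does not itself specify. The data set, starting values, and function names are all hypothetical.

```python
from math import log, prod


def posterior(pattern, class_probs, cond_probs):
    """Return (P(class t | pattern) for each t, P(pattern)).

    cond_probs[t][n] = P(variable n equals 1 | class t), binary variables.
    """
    joint = [
        p_t * prod(q if x else 1.0 - q for x, q in zip(pattern, cond_probs[t]))
        for t, p_t in enumerate(class_probs)
    ]
    total = sum(joint)
    return [j / total for j in joint], total


def em_step(data, class_probs, cond_probs):
    """One EM iteration; also returns the log-likelihood of the input parameters."""
    n_classes, n_vars = len(class_probs), len(data[0])
    resp_sum = [0.0] * n_classes
    weighted = [[0.0] * n_vars for _ in range(n_classes)]
    loglik = 0.0
    for pattern in data:
        # E-step: soft class memberships replace the missing labels.
        resp, total = posterior(pattern, class_probs, cond_probs)
        loglik += log(total)
        for t in range(n_classes):
            resp_sum[t] += resp[t]
            for n in range(n_vars):
                weighted[t][n] += resp[t] * pattern[n]
    # M-step: weighted relative frequencies, as Naive Bayes would count labels.
    new_class = [r / len(data) for r in resp_sum]
    new_cond = [[weighted[t][n] / resp_sum[t] for n in range(n_vars)]
                for t in range(n_classes)]
    return new_class, new_cond, loglik


# Hypothetical binary data (four variables) and starting values.
data = ([(1, 1, 1, 0)] * 4 + [(0, 1, 1, 1)] * 3 + [(1, 0, 1, 1)] * 3
        + [(1, 1, 0, 0), (0, 0, 1, 1)])
class_probs = [0.5, 0.5]
cond_probs = [[0.6, 0.6, 0.5, 0.4], [0.4, 0.5, 0.6, 0.6]]

logliks = []
for _ in range(20):
    class_probs, cond_probs, ll = em_step(data, class_probs, cond_probs)
    logliks.append(ll)

memberships, _ = posterior((1, 1, 1, 0), class_probs, cond_probs)
print("class probabilities:", [round(p, 3) for p in class_probs])
print("posterior for (1, 1, 1, 0):", [round(p, 3) for p in memberships])
```

If class labels were observed, the E-step would be unnecessary and the M-step would reduce to ordinary counting, which is the Naive Bayes case.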
Related methods
There are a number of methods with distinct names and uses that share a common relationship.
As a practical instance, the variables could be
Application
LCA may be used in many fields.
References
- Lazarsfeld, P.F. and Henry, N.W. (1968). Latent structure analysis. Boston: Houghton Mifflin.
- Formann, A. K. (1984). Latent Class Analyse: Einführung in die Theorie und Anwendung [Latent class analysis: Introduction to theory and application]. Weinheim: Beltz.
- ISSN 0344-1369.
- S2CID 11628144.
- S2CID 40678009.
- PMID 26148538.
- Linda M. Collins; Stephanie T. Lanza (2010). Latent class and latent transition analysis for the social, behavioral, and health sciences. New York: ISBN 978-0-470-22839-5.
- ISBN 978-0-521-59451-6.
- Paul F. Lazarsfeld, Neil W. Henry (1968). Latent Structure Analysis.
External links
- Statistical Innovations, Home Page, 2016. Website with latent class software (Latent GOLD 5.1), free demonstrations, tutorials, user guides, and publications for download. Also included: online courses, FAQs, and other related software.
- The Methodology Center, Latent Class Analysis, a research center at Penn State, free software, FAQ
- John Uebersax, Latent Class Analysis, 2006. A web-site with bibliography, software, links and FAQ for latent class analysis