Latent Profile & Latent Class Models
Applied Categorical & Nonnormal Data Analysis
Introduction
Cluster analysis techniques are not the only way to find unobserved groupings in your data. In fact, from several perspectives cluster analysis may not be the best way to determine these groupings. Several latent variable approaches are available. In this unit we will explore two of them: latent profile analysis and latent class analysis.
The advantage of these approaches over cluster analysis is that they are model based and generate probabilities of group membership. It is possible to test these models and to analyze their goodness of fit. The downside is that they require specialized software that is more complex to run than typical statistical packages. We will demonstrate these techniques using the Mplus software from Muthén & Muthén. We will also use Stata for descriptive and subsidiary analyses.
Latent profile analysis will use continuous predictors and the latent class analysis will use binary predictor variables. We will use the reading, writing, math, science and social studies test scores from the hsb6a dataset. For the binary predictor variables we will do median splits on each of the tests to create hiread, hiwrite, himath, hisci and hiss.
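As a rough illustration of the median split described above, the sketch below codes a score as 1 when it is at or above the median and 0 otherwise. The notes do not say how scores equal to the median were coded, so the tie rule here is an assumption, and the scores are made up rather than taken from hsb6a. Python is used only for illustration; it is not part of the course software.

```python
from statistics import median

def median_split(scores, strict=False):
    """Code each score as 1 (high) or 0 (low) relative to the median.

    strict=False codes scores equal to the median as 1 (an assumption);
    strict=True codes them as 0.
    """
    cut = median(scores)
    if strict:
        return [1 if s > cut else 0 for s in scores]
    return [1 if s >= cut else 0 for s in scores]

# Hypothetical reading scores, not taken from hsb6a.
read = [28.3, 44.0, 52.0, 52.0, 57.0, 76.0]
hiread = median_split(read)
hiread_strict = median_split(read, strict=True)
print(hiread, hiread_strict)
```

With tied scores at the cut point the two rules give different splits, so the proportion coded 1 need not be exactly one half.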
Looking at the data
use hsb6a

describe

Contains data from hsb6a.dta
  obs:           600                          highschool and beyond (600 cases)
 vars:            23                          24 Oct 2003 14:18
 size:        31,200 (99.0% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              int    %9.0g
gender          byte   %9.0g       gl
race            byte   %12.0g      rl
ses             byte   %9.0g       sl
sch             byte   %9.0g       scl
prog            byte   %9.0g       pl
locus           float  %9.0g                  locus of control
concept         float  %9.0g                  self-concept
mot             float  %9.0g                  motivation
career          byte   %14.0g      cl         career choice
read            float  %9.0g                  reading score
write           float  %9.0g                  writing score
math            float  %9.0g                  math score
sci             float  %9.0g                  science score
ss              float  %9.0g                  social studies score
hiread          byte   %9.0g
hiwrite         byte   %9.0g
himath          byte   %9.0g
hisci           byte   %9.0g
hiss            byte   %9.0g

sum read write math sci ss hiread hiwrite himath hisci hiss

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        read |       600    51.90183    10.10298       28.3         76
       write |       600    52.38483    9.726455       25.5       67.1
        math |       600      51.849    9.414736       31.8       75.5
         sci |       600    51.76333    9.706179         26       74.2
          ss |       600    52.04567    9.879228       25.7       70.5
-------------+--------------------------------------------------------
      hiread |       600        .525    .4997913          0          1
     hiwrite |       600         .54    .4988133          0          1
      himath |       600    .4966667    .5004061          0          1
       hisci |       600    .5266667     .499705          0          1
        hiss |       600    .6483333     .477889          0          1
A 2 Class Latent Profile Model
Data:
  File is I:\mplus\hsb6.dat ;
Variable:
  Names are
    id gender race ses sch prog locus concept mot career read write math
    sci ss hiread hiwrite himath hisci hiss academic;
  Usevariables are
    read write math sci ss ;
  classes = c(2);
Analysis:
  Type=mixture;
MODEL:
  %C#1%
  [read math sci ss write * 30 ];
  %C#2%
  [read math sci ss write * 60];
OUTPUT:
  TECH8;
SAVEDATA:
  file is lca_ex1.txt ;
  save is cprob;
  format is free;

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -5213.102

Information Criteria

          Number of Free Parameters             16
          Akaike (AIC)                   10458.203
          Bayesian (BIC)                 10517.464
          Sample-Size Adjusted BIC       10466.721
            (n* = (n + 2) / 24)
          Entropy                            0.865

FINAL CLASS COUNTS AND PROPORTIONS OF TOTAL SAMPLE SIZE
BASED ON ESTIMATED POSTERIOR PROBABILITIES

  Class 1        123.03223          0.41011
  Class 2        176.96777          0.58989

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY CLASS MEMBERSHIP

Class Counts and Proportions

  Class 1              120          0.40000
  Class 2              180          0.60000

Average Class Probabilities by Class

                 1        2

  Class 1     0.961    0.039
  Class 2     0.043    0.957

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

CLASS 1

 Means
    READ              43.151    0.820     52.641
    WRITE             44.524    1.024     43.485
    MATH              43.860    0.757     57.947
    SCI               43.322    1.051     41.239
    SS                45.119    0.946     47.707

 Variances
    READ              49.035    4.175     11.745
    WRITE             44.303    3.927     11.283
    MATH              45.062    3.768     11.958
    SCI               48.986    5.184      9.450
    SS                55.410    4.445     12.465

CLASS 2

 Means
    READ              57.915    0.847     68.403
    WRITE             58.115    0.625     93.039
    MATH              57.136    0.800     71.386
    SCI               56.729    0.668     84.953
    SS                57.220    0.723     79.137

 Variances
    READ              49.035    4.175     11.745
    WRITE             44.303    3.927     11.283
    MATH              45.062    3.768     11.958
    SCI               48.986    5.184      9.450
    SS                55.410    4.445     12.465

LATENT CLASS REGRESSION MODEL PART

 Means
    C#1               -0.364    0.179     -2.032

QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.462E-03
       (ratio of smallest to largest eigenvalue)
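The information criteria in this output can be recomputed from the log-likelihood and the number of free parameters, and the class 1 proportion can be recovered from the logit mean C#1. The sketch below does both using the standard AIC and BIC formulas. It assumes n = 300, which is inferred from the class counts (120 + 180) rather than stated in the notes, and it assumes class 2 is the reference class for the logit. Python is used only for illustration.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC, sample-size adjusted BIC) for a fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_obs)
    n_star = (n_obs + 2.0) / 24.0  # the n* adjustment printed in the output
    adj_bic = deviance + n_params * math.log(n_star)
    return aic, bic, adj_bic

# Values copied from the 2 class latent profile output.
# n_obs = 300 is an inference from the class counts, not a stated value.
aic, bic, adj_bic = information_criteria(loglik=-5213.102, n_params=16, n_obs=300)
print(round(aic, 3), round(bic, 3), round(adj_bic, 3))

# Class 1 proportion implied by the logit mean C#1 = -0.364,
# assuming class 2 is the reference class (logit fixed at 0).
p_class1 = math.exp(-0.364) / (1.0 + math.exp(-0.364))
print(round(p_class1, 3))
```

The results agree with the printed AIC, BIC, adjusted BIC and the class 1 proportion to within rounding of the reported log-likelihood and logit. The 16 free parameters correspond to 10 class-specific means, 5 variances held equal across classes, and 1 class logit.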
A 3 Class Latent Profile Model
Data:
  File is I:\mplus\hsb6.dat ;
Variable:
  Names are
    id gender race ses sch prog locus concept mot career read write math
    sci ss hiread hiwrite himath hisci hiss academic;
  Usevariables are
    read write math sci ss ;
  classes = c(3);
Analysis:
  Type=mixture;
MODEL:
  %C#1%
  [read math sci ss write *30 ];
  %C#2%
  [read math sci ss write *45];
  %C#3%
  [read math sci ss write *60];
OUTPUT:
  TECH8;
SAVEDATA:
  file is lca_ex2.txt ;
  save is cprob;
  format is free;

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -5100.544

Information Criteria

          Number of Free Parameters             22
          Akaike (AIC)                   10245.087
          Bayesian (BIC)                 10326.571
          Sample-Size Adjusted BIC       10256.800
            (n* = (n + 2) / 24)
          Entropy                            0.877

FINAL CLASS COUNTS AND PROPORTIONS OF TOTAL SAMPLE SIZE
BASED ON ESTIMATED POSTERIOR PROBABILITIES

  Class 1         98.08460          0.32695
  Class 2        137.86474          0.45955
  Class 3         64.05066          0.21350

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY CLASS MEMBERSHIP

Class Counts and Proportions

  Class 1               99          0.33000
  Class 2              138          0.46000
  Class 3               63          0.21000

Average Class Probabilities by Class

                 1        2        3

  Class 1     0.961    0.039    0.000
  Class 2     0.021    0.940    0.039
  Class 3     0.000    0.068    0.932

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

CLASS 1

 Means
    READ              41.866    0.614     68.208
    WRITE             43.080    0.870     49.514
    MATH              42.447    0.549     77.337
    SCI               41.409    0.748     55.358
    SS                44.232    0.819     54.010

 Variances
    READ              33.867    3.334     10.159
    WRITE             40.042    4.168      9.607
    MATH              28.667    2.980      9.619
    SCI               34.199    3.411     10.027
    SS                48.355    4.323     11.185

CLASS 2

 Means
    READ              53.058    0.726     73.044
    WRITE             55.195    0.677     81.493
    MATH              52.704    0.683     77.191
    SCI               53.195    0.600     88.727
    SS                53.377    0.745     71.657

 Variances
    READ              33.867    3.334     10.159
    WRITE             40.042    4.168      9.607
    MATH              28.667    2.980      9.619
    SCI               34.199    3.411     10.027
    SS                48.355    4.323     11.185

CLASS 3

 Means
    READ              64.588    0.949     68.070
    WRITE             61.318    0.624     98.232
    MATH              63.667    0.907     70.167
    SCI               62.043    0.873     71.064
    SS                62.139    0.827     75.163

 Variances
    READ              33.867    3.334     10.159
    WRITE             40.042    4.168      9.607
    MATH              28.667    2.980      9.619
    SCI               34.199    3.411     10.027
    SS                48.355    4.323     11.185

LATENT CLASS REGRESSION MODEL PART

 Means
    C#1                0.426    0.201      2.120
    C#2                0.767    0.196      3.901

QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.461E-03
       (ratio of smallest to largest eigenvalue)
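To show how a model of this kind yields probabilities of class membership, the sketch below computes posterior class probabilities for one student from the 3 class estimates. It assumes the indicators are normally distributed and uncorrelated within each class, which matches the 22 free parameters reported (15 means, 5 shared variances, 2 class logits), and it treats class 3 as the reference class. The student's scores are hypothetical, and this is a hand calculation in Python, not the Mplus routine that produces the saved cprob values.

```python
import math

# Estimates copied from the 3 class output (order: read, write, math, sci, ss).
means = {
    1: [41.866, 43.080, 42.447, 41.409, 44.232],
    2: [53.058, 55.195, 52.704, 53.195, 53.377],
    3: [64.588, 61.318, 63.667, 62.043, 62.139],
}
variances = [33.867, 40.042, 28.667, 34.199, 48.355]  # equal across classes
logits = {1: 0.426, 2: 0.767, 3: 0.0}  # class 3 assumed to be the reference

def class_proportions(logits):
    """Convert multinomial logits to class proportions."""
    total = sum(math.exp(v) for v in logits.values())
    return {k: math.exp(v) / total for k, v in logits.items()}

def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def posterior(scores):
    """Posterior class probabilities, assuming independence within class."""
    props = class_proportions(logits)
    log_joint = {}
    for k in means:
        loglik = sum(
            log_normal_pdf(x, m, v)
            for x, m, v in zip(scores, means[k], variances)
        )
        log_joint[k] = math.log(props[k]) + loglik
    top = max(log_joint.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_joint.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

props = class_proportions(logits)
student = [50.0, 52.0, 49.0, 51.0, 50.0]  # hypothetical scores
post = posterior(student)
print({k: round(p, 3) for k, p in props.items()})
print({k: round(p, 3) for k, p in post.items()})
```

The proportions recovered from the logits agree with the printed class proportions to within rounding, and a student with scores near 50 is assigned mostly to the middle class.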
A 2 Class Latent Class Model
Data:
  File is h:\mplus\hsb6.dat ;
Variable:
  Names are
    id gender race ses sch prog locus concept mot career read write math
    sci ss hiread hiwrite himath hisci hiss academic;
  Usevariables are
    hiread hiwrite himath hisci hiss ;
  categorical = hiread hiwrite himath hisci hiss;
  classes = c(2);
Analysis:
  Type=mixture;
MODEL:
  %C#1%
  [hiread$1 *2 himath$1 *2 hisci$1 *2 hiss$1 *2 hiwrite$1 *2 ];
  %C#2%
  [hiread$1 *-2 himath$1 *-2 hisci$1 *-2 hiss$1 *-2 hiwrite$1 *-2 ];
OUTPUT:
  TECH8;
SAVEDATA:
  file is lca_ex7.txt ;
  save is cprob;
  format is free;

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                        -849.157

Information Criteria

          Number of Free Parameters             11
          Akaike (AIC)                    1720.315
          Bayesian (BIC)                  1761.057
          Sample-Size Adjusted BIC        1726.171
            (n* = (n + 2) / 24)
          Entropy                            0.815

Chi-Square Test of Model Fit for the Latent Class Indicator Model Part

          Pearson Chi-Square

          Value                             44.642
          Degrees of Freedom                    20
          P-Value                           0.0012

          Likelihood Ratio Chi-Square

          Value                             45.747
          Degrees of Freedom                    20
          P-Value                           0.0009

FINAL CLASS COUNTS AND PROPORTIONS OF TOTAL SAMPLE SIZE
BASED ON ESTIMATED POSTERIOR PROBABILITIES

  Class 1        123.33019          0.41110
  Class 2        176.66981          0.58890

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY CLASS MEMBERSHIP

Class Counts and Proportions

  Class 1              127          0.42333
  Class 2              173          0.57667

Average Class Probabilities by Class

                 1        2

  Class 1     0.930    0.070
  Class 2     0.030    0.970

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

CLASS 1

CLASS 2

LATENT CLASS INDICATOR MODEL PART

Class 1

 Thresholds
    HIREAD$1           2.273    0.424      5.354
    HIWRITE$1          1.376    0.276      4.990
    HIMATH$1           2.081    0.399      5.209
    HISCI$1            2.035    0.411      4.947
    HISS$1             0.642    0.231      2.780

Class 2

 Thresholds
    HIREAD$1          -1.540    0.264     -5.823
    HIWRITE$1         -1.488    0.244     -6.109
    HIMATH$1          -1.217    0.217     -5.616
    HISCI$1           -1.264    0.213     -5.927
    HISS$1            -2.047    0.279     -7.328

LATENT CLASS REGRESSION MODEL PART

 Means
    C#1               -0.359    0.161     -2.231

LATENT CLASS INDICATOR MODEL PART IN PROBABILITY SCALE

Class 1

 HIREAD
    Category 1         0.907    0.036     25.221
    Category 2         0.093    0.036      2.599
 HIWRITE
    Category 1         0.798    0.044     17.985
    Category 2         0.202    0.044      4.542
 HIMATH
    Category 1         0.889    0.039     22.555
    Category 2         0.111    0.039      2.816
 HISCI
    Category 1         0.884    0.042     21.036
    Category 2         0.116    0.042      2.748
 HISS
    Category 1         0.655    0.052     12.564
    Category 2         0.345    0.052      6.615

Class 2

 HIREAD
    Category 1         0.177    0.038      4.592
    Category 2         0.823    0.038     21.417
 HIWRITE
    Category 1         0.184    0.037      5.031
    Category 2         0.816    0.037     22.288
 HIMATH
    Category 1         0.228    0.038      5.980
    Category 2         0.772    0.038     20.197
 HISCI
    Category 1         0.220    0.037      6.015
    Category 2         0.780    0.037     21.288
 HISS
    Category 1         0.114    0.028      4.043
    Category 2         0.886    0.028     31.304

QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.654E-01
       (ratio of smallest to largest eigenvalue)
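The thresholds and the probability-scale results above are two views of the same estimates. The sketch below converts the class 1 thresholds to probabilities, assuming a logit link in which the probability of the first category (the "low" value 0) is 1 / (1 + exp(-threshold)). The function name is hypothetical, and Python is used only for illustration.

```python
import math

def prob_category_1(threshold):
    """Probability of the first (lowest) category under a logit link."""
    return 1.0 / (1.0 + math.exp(-threshold))

# Class 1 thresholds copied from the 2 class latent class output.
class1_thresholds = {
    "HIREAD": 2.273,
    "HIWRITE": 1.376,
    "HIMATH": 2.081,
    "HISCI": 2.035,
    "HISS": 0.642,
}

probs = {name: prob_category_1(tau) for name, tau in class1_thresholds.items()}
for name, p1 in probs.items():
    # Category 2 (the "high" value 1) is the complement of category 1.
    print(name, round(p1, 3), round(1.0 - p1, 3))
```

A large positive threshold therefore means a high probability of the low category, which is why class 1, with positive thresholds, is the lower scoring class and class 2, with negative thresholds, is the higher scoring one.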
A 3 Class Latent Class Model
Data:
  File is h:\mplus\hsb6.dat ;
Variable:
  Names are
    id gender race ses sch prog locus concept mot career read write math
    sci ss hiread hiwrite himath hisci hiss academic;
  Usevariables are
    hiread hiwrite himath hisci hiss ;
  categorical = hiread hiwrite himath hisci hiss;
  classes = c(3);
Analysis:
  Type=mixture;
MODEL:
  %C#1%
  [hiread$1 *2 himath$1 *2 hisci$1 *2 hiss$1 *2 hiwrite$1 *2 ];
  %C#2%
  [hiread$1 *0 himath$1 *0 hisci$1 *0 hiss$1 *0 hiwrite$1 *0 ];
  %C#3%
  [hiread$1 *-2 himath$1 *-2 hisci$1 *-2 hiss$1 *-2 hiwrite$1 *-2 ];
OUTPUT:
  TECH8;
SAVEDATA:
  file is lca_ex8.txt ;
  save is cprob;
  format is free;

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                        -839.066

Information Criteria

          Number of Free Parameters             17
          Akaike (AIC)                    1712.132
          Bayesian (BIC)                  1775.096
          Sample-Size Adjusted BIC        1721.182
            (n* = (n + 2) / 24)
          Entropy                            0.682

Chi-Square Test of Model Fit for the Latent Class Indicator Model Part

          Pearson Chi-Square

          Value                             21.369
          Degrees of Freedom                    14
          P-Value                           0.0925

          Likelihood Ratio Chi-Square

          Value                             25.564
          Degrees of Freedom                    14
          P-Value                           0.0294

FINAL CLASS COUNTS AND PROPORTIONS OF TOTAL SAMPLE SIZE
BASED ON ESTIMATED POSTERIOR PROBABILITIES

  Class 1         95.51732          0.31839
  Class 2        127.98211          0.42661
  Class 3         76.50058          0.25500

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY CLASS MEMBERSHIP

Class Counts and Proportions

  Class 1               94          0.31333
  Class 2              130          0.43333
  Class 3               76          0.25333

Average Class Probabilities by Class

                 1        2        3

  Class 1     0.913    0.087    0.000
  Class 2     0.074    0.826    0.099
  Class 3     0.000    0.163    0.837

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

CLASS 1

CLASS 2

CLASS 3

LATENT CLASS INDICATOR MODEL PART

Class 1

 Thresholds
    HIREAD$1           2.883    0.671      4.296
    HIWRITE$1          1.735    0.418      4.150
    HIMATH$1           2.863    0.739      3.877
    HISCI$1            3.007    0.861      3.492
    HISS$1             0.991    0.319      3.106

Class 2

 Thresholds
    HIREAD$1          -0.392    0.348     -1.128
    HIWRITE$1         -0.451    0.445     -1.013
    HIMATH$1          -0.258    0.342     -0.754
    HISCI$1           -0.453    0.269     -1.688
    HISS$1            -1.201    0.400     -2.999

Class 3

 Thresholds
    HIREAD$1          -4.377    6.575     -0.666
    HIWRITE$1        -15.000    0.000      0.000
    HIMATH$1          -2.932    1.699     -1.726
    HISCI$1           -2.257    0.986     -2.289
    HISS$1            -3.761    2.143     -1.755

LATENT CLASS REGRESSION MODEL PART

 Means
    C#1                0.222    0.398      0.558
    C#2                0.515    0.499      1.032

LATENT CLASS INDICATOR MODEL PART IN PROBABILITY SCALE

Class 1

 HIREAD
    Category 1         0.947    0.034     28.108
    Category 2         0.053    0.034      1.574
 HIWRITE
    Category 1         0.850    0.053     15.951
    Category 2         0.150    0.053      2.815
 HIMATH
    Category 1         0.946    0.038     25.073
    Category 2         0.054    0.038      1.431
 HISCI
    Category 1         0.953    0.039     24.648
    Category 2         0.047    0.039      1.219
 HISS
    Category 1         0.729    0.063     11.577
    Category 2         0.271    0.063      4.298

Class 2

 HIREAD
    Category 1         0.403    0.084      4.819
    Category 2         0.597    0.084      7.134
 HIWRITE
    Category 1         0.389    0.106      3.680
    Category 2         0.611    0.106      5.775
 HIMATH
    Category 1         0.436    0.084      5.177
    Category 2         0.564    0.084      6.702
 HISCI
    Category 1         0.389    0.064      6.090
    Category 2         0.611    0.064      9.582
 HISS
    Category 1         0.231    0.071      3.249
    Category 2         0.769    0.071     10.797

Class 3

 HIREAD
    Category 1         0.012    0.081      0.154
    Category 2         0.988    0.081     12.253
 HIWRITE
    Category 1         0.000    0.000      0.000
    Category 2         1.000    0.000      0.000
 HIMATH
    Category 1         0.051    0.082      0.620
    Category 2         0.949    0.082     11.641
 HISCI
    Category 1         0.095    0.085      1.120
    Category 2         0.905    0.085     10.700
 HISS
    Category 1         0.023    0.048      0.477
    Category 2         0.977    0.048     20.530

QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.323E-03
       (ratio of smallest to largest eigenvalue)
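The parameter counts and chi-square degrees of freedom for the two latent class models follow from the number of binary items and classes, and the fit indices can be set side by side. The sketch below assumes one threshold per item per class plus one logit for each class except the reference class, with degrees of freedom equal to the number of response patterns minus 1 minus the number of free parameters. The function names are hypothetical, and Python is used only for illustration.

```python
def lca_free_parameters(n_items, n_classes):
    """Thresholds (one per binary item per class) plus class logits."""
    return n_items * n_classes + (n_classes - 1)

def lca_chi_square_df(n_items, n_classes):
    """Response patterns minus 1, minus the number of free parameters."""
    n_patterns = 2 ** n_items
    return n_patterns - 1 - lca_free_parameters(n_items, n_classes)

# Fit indices copied from the 2 class and 3 class latent class outputs.
fit = {
    2: {"aic": 1720.315, "bic": 1761.057, "entropy": 0.815},
    3: {"aic": 1712.132, "bic": 1775.096, "entropy": 0.682},
}

for k in (2, 3):
    print(k, lca_free_parameters(5, k), lca_chi_square_df(5, k), fit[k])

# Smaller values indicate better fit for both criteria.
best_by_aic = min(fit, key=lambda k: fit[k]["aic"])
best_by_bic = min(fit, key=lambda k: fit[k]["bic"])
print(best_by_aic, best_by_bic)
```

The counts reproduce the 11 and 17 free parameters and the 20 and 14 degrees of freedom printed above. On these numbers the two criteria disagree: AIC is lower for the 3 class model while BIC is lower for the 2 class model, and entropy is also higher for the 2 class model.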
Categorical Data Analysis Course
Phil Ender -- 24apr03