(1) Background: Significance tests are commonly sensitive to sample size, and Chi-Squared statistics are no exception. Nevertheless, Chi-Squared statistics are commonly used for tests of fit of measurement models. For analysts working with very large (or very small) sample sizes, this may therefore require particular attention. Several different approaches to handling a large sample size in test of fit analysis have been developed. One strategy is to adjust the fit statistic to correspond to an equivalent sample of a different size. This strategy has been implemented in the RUMM2030 software. Another strategy is to adopt a random sample approach.
(2) Aims: The RUMM2030 Chi-Square value adjustment facility has been available for a long time, but there still seems to be a lack of studies describing the empirical consequences of adjusting a sample to a smaller effective sample in the statistical analysis of fit. Alternatively, a random sample approach could be adopted to handle the large sample size problem. The purpose of this study was to analyze and compare these two strategies as test of fit approximations, using Swedish adolescent data.
(3) Sample: The analysis is based on the survey Young in Värmland, a paper-and-pencil based survey conducted recurrently since 1988 and targeting all adolescents in school year 9 residing in the county of Värmland, Sweden. So far, more than 20,000 individuals have participated in the survey. In the analysis presented here, seven items based on the adolescents' experiences of the psychosocial school environment were subjected to analysis, covering 21,088 individuals in total.
(4) Methods: For the purposes of this study, the original sample size was adjusted to several different effective sample sizes in the test of fit analysis, using the RUMM2030 adjustment function. In addition, 10 random samples of each sample size were drawn from the original sample, and averaged Chi-Square values were calculated. The Chi-Square values obtained using the two strategies were then compared.
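The comparison described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure implemented in RUMM2030: the proportional rescaling rule in `adjusted_chi_square` is an assumption (the abstract does not state the formula the software uses), the Pearson goodness-of-fit statistic stands in for the Rasch item fit statistic, and all function names are hypothetical.

```python
import random


def pearson_chi_square(observed, expected_props):
    """Pearson goodness-of-fit statistic for category counts
    against expected category proportions."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_props))


def adjusted_chi_square(chi_square, n_original, n_effective):
    """ASSUMPTION: rescale the statistic in proportion to sample size.
    The actual RUMM2030 adjustment may differ."""
    return chi_square * n_effective / n_original


def mean_subsample_chi_square(responses, expected_props, n_sub,
                              n_draws=10, seed=0):
    """Average the statistic over n_draws random subsamples of size n_sub,
    drawn without replacement from the full set of responses."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_draws):
        counts = [0] * len(expected_props)
        for r in rng.sample(responses, n_sub):
            counts[r] += 1
        values.append(pearson_chi_square(counts, expected_props))
    return sum(values) / len(values)


if __name__ == "__main__":
    # Synthetic data only: 21,088 responses in 3 categories that
    # deviate slightly from the proportions expected under the "model".
    expected = [0.30, 0.40, 0.30]
    data_rng = random.Random(1)
    responses = data_rng.choices([0, 1, 2], weights=[0.32, 0.40, 0.28],
                                 k=21088)
    full_counts = [responses.count(c) for c in range(3)]
    full_chi = pearson_chi_square(full_counts, expected)
    for n_sub in (10000, 5000, 1000, 500):
        adj = adjusted_chi_square(full_chi, len(responses), n_sub)
        avg = mean_subsample_chi_square(responses, expected, n_sub)
        print(n_sub, round(adj, 2), round(avg, 2))
```

Running the script prints, for each target sample size, the rescaled statistic next to the mean over 10 random subsamples, which is the kind of side-by-side comparison the study reports for real item fit statistics.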
(5) Results: Given the original sample of 21,088, when adjusting to samples of 5,000 or larger, the RUMM2030 adjustment facility works as well as a random sample approach. In contrast, when adjusting to smaller samples, the adjustment function is less effective in approximating the Chi-Square value for an actual random sample of the relevant size. Hence, fit is exaggerated and misfit underestimated when using the adjustment function. However, this is true for fitting items but not for misfitting items.
(6) Conclusion: Even though the inferences based on p-values may be the same despite large differences in Chi-Square values between the two approaches, the danger of using fit statistics mechanically cannot be stressed enough.