StatTools : Probability of Chi Square Explanations and Tables

The chi-square distribution is the distribution of a sum of squared independent standard normal variables, and is a special case of the gamma distribution. There are several related distributions, such as the non-central chi-square distribution, the chi distribution and the non-central chi distribution. However, the most common is the central chi-square distribution, which is the focus of this discussion. The chi-square distribution is defined only for non-negative values and is positively (right) skewed. The curve is specified by the degrees of freedom (df), which must be positive and is the number of unconstrained (independent) variables whose squares are being summed. As the degrees of freedom increase, the chi-square distribution approaches the normal distribution. The mean of the curve is the degrees of freedom, and the standard deviation is the square root of 2*df. The peak (mode) of the curve occurs at df-2 when df is 2 or more, and at 0 otherwise.
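The sum-of-squares definition can be checked with a small simulation. The sketch below is illustrative only: the function name `simulate_chi_square`, the sample size and the seed are assumptions made here, not part of StatTools. It sums the squares of df standard normal draws and compares the sample mean and standard deviation with df and the square root of 2*df.

```python
import random
import statistics

def simulate_chi_square(df, n_samples=20000, seed=1):
    """Draw n_samples values, each the sum of df squared standard normals."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_samples)
    ]

df = 5
samples = simulate_chi_square(df)
sample_mean = statistics.mean(samples)
sample_sd = statistics.stdev(samples)

print("sample mean:", round(sample_mean, 3), "theory:", df)
print("sample sd:  ", round(sample_sd, 3), "theory:", round((2 * df) ** 0.5, 3))
```

With a sample this large, the simulated mean and standard deviation should land close to the theoretical values, and every simulated value is non-negative.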

The best-known applications of the chi-square distribution are the chi-square goodness-of-fit test, which compares an observed distribution to a theoretical one, and the test of independence between two categorical variables (Pearson's chi-square test). However, many other tests also use the chi-square distribution. It also underlies the F distribution, whose test statistic is the ratio of two independent chi-square variables, each divided by its degrees of freedom.

Chi square for large degrees of freedom

Calculating the probability associated with chi-square using the standard algorithm described by Press et al. involves convoluted algorithms and very large intermediate numbers. Depending on the computer, the calculation becomes impossible at some point between 100 and 300 degrees of freedom. The program either crashes or returns a maximum chi-square value regardless of further changes in probability or degrees of freedom.

Wilson and Hilferty devised an approximation to the probability associated with chi-square that works for very large chi-square values and degrees of freedom. The difference between this and the standard method was found to be trivial, less than 1%, when the degrees of freedom are 200 or more, but the approximation becomes progressively less accurate as the degrees of freedom decrease. The general advice is that the Wilson-Hilferty approximation is not necessary when the degrees of freedom are less than 100. Between 100 and 150 degrees of freedom, the calculated probability depends on which method is used. On fast computers with 64-bit processors, the basic calculation can be performed even with degrees of freedom as high as 300. With processors of 32 bits or fewer, the calculation fails somewhere between 100 and 150 degrees of freedom.
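For reference, the Wilson-Hilferty approximation is usually written in the form below. The symbols are chosen here for illustration and are not taken from StatTools: x is the chi-square value, k the degrees of freedom, and \Phi the cumulative standard normal distribution. The idea is that the cube root of x/k is close to normally distributed, so the upper-tail probability can be read from the normal curve.

```latex
z = \frac{\left(x/k\right)^{1/3} - \left(1 - \frac{2}{9k}\right)}{\sqrt{2/(9k)}},
\qquad
P\left(\chi^2_k > x\right) \approx 1 - \Phi(z)
```

Because only a cube root, a square root and a normal probability are needed, the calculation involves no large intermediate numbers, whatever the degrees of freedom.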

StatTools uses the basic algorithm for degrees of freedom up to 100, and the Wilson-Hilferty approximation for degrees of freedom of 101 or more.
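The switch between the two methods can be sketched as follows. This is not StatTools' own code. It assumes that the reported probability is the upper tail (the p value), and it uses a log-space series and continued fraction for the incomplete gamma function in the style described by Press et al. as a stand-in for the basic algorithm. All function names and the `threshold` parameter are hypothetical.

```python
import math

def _regularized_upper_gamma(a, x, eps=1e-12, max_iter=10000):
    """Q(a, x), the regularized upper incomplete gamma function."""
    if x <= 0.0:
        return 1.0
    # Common factor x^a * e^(-x) / Gamma(a), built in log space to avoid overflow.
    log_front = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series for the lower function P(a, x); converges quickly when x < a + 1.
        ap = a
        term = 1.0 / a
        total = term
        for _ in range(max_iter):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_front)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return math.exp(log_front) * h

def wilson_hilferty_upper_tail(chi_sq, df):
    """Normal approximation based on the cube root of chi_sq / df."""
    spread = 2.0 / (9.0 * df)
    z = ((chi_sq / df) ** (1.0 / 3.0) - (1.0 - spread)) / math.sqrt(spread)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # 1 - Phi(z)

def chi_square_upper_tail(chi_sq, df, threshold=100):
    """Upper-tail probability; threshold=100 mirrors the cut-off described above."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    if chi_sq <= 0:
        return 1.0
    if df <= threshold:
        return _regularized_upper_gamma(df / 2.0, chi_sq / 2.0)
    return wilson_hilferty_upper_tail(chi_sq, df)

print(chi_square_upper_tail(18.307, 10))    # df <= 100: incomplete gamma route
print(chi_square_upper_tail(233.994, 200))  # df > 100: Wilson-Hilferty route
```

Both example calls use the tabulated 5% critical value for their degrees of freedom, so each should return a probability close to 0.05.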

Users will therefore find some inconsistencies between chi-square results from StatTools and those from other sources. The same applies to every calculation on StatTools that uses the chi-square probability. These differences occur whenever degrees of freedom between 100 and 150 are encountered, and they are usually less than 1%.