Basic Analyses > CROSSTABS Command

Statistics

When the statistics option is specified, several additional statistics are calculated and printed.

Example of a Statistics Printout

A discussion of each statistic follows:

Phi

The Phi statistic is calculated and printed for two-by-two tables. It may be interpreted as a measure of the strength of the relationship between the two variables: Phi is zero when there is no relationship, one when there is a perfect positive relationship, and minus one when there is a perfect negative relationship.

When comparing one crosstab table to another, Phi is preferable to chi-square because it corrects for the fact that the chi-square statistic is directly proportional to the number of cases. In other words, Phi can be used to compare two crosstabs with unequal N's.
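
As an illustrative sketch (not the command's own implementation), Phi for a two-by-two table can be computed directly from the four cell counts, here labelled a, b, c, and d across the rows:

    from math import sqrt

    def phi_2x2(a, b, c, d):
        # Table layout:  a  b
        #                c  d
        # (ad - bc) divided by the square root of the product of the marginals.
        denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
        return (a * d - b * c) / denominator

    print(phi_2x2(30, 10, 10, 30))  # 0.5, a moderate positive relationship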

Cramer's V

When Phi is calculated for tables larger than two-by-two, its value can exceed one, and its maximum depends on the dimensions of the table. Therefore, the Phi statistic is not printed for tables larger than two-by-two; Cramer's V is printed instead. Cramer's V adjusts Phi for the number of rows and columns so that its maximum value is one. It may be interpreted in the same way as Phi (e.g., a large Cramer's V indicates a strong association between the two variables).
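
A minimal sketch of the adjustment, assuming the chi-square statistic, the number of cases, and the table dimensions are already known (the function name and arguments are illustrative):

    from math import sqrt

    def cramers_v(chi_square, n, rows, cols):
        # Rescale chi-square by n times the smaller of (rows - 1) and
        # (cols - 1) so that the maximum possible value is one.
        return sqrt(chi_square / (n * min(rows - 1, cols - 1)))

    print(cramers_v(24.0, 100, 3, 4))  # about 0.35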

Contingency Coefficient

The contingency coefficient is another measure of association based on the chi-square statistic. It may be calculated for any size of table; however, its maximum value will vary depending on the number of rows or columns. Therefore, the contingency coefficient should only be used to compare tables with the same numbers of rows and columns.
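
A sketch of the usual Pearson contingency coefficient, again assuming chi-square and the number of cases are available:

    from math import sqrt

    def contingency_coefficient(chi_square, n):
        # C = sqrt(chi2 / (chi2 + n)); the maximum is below one and depends
        # on the table's dimensions, which is why only like-sized tables
        # should be compared.
        return sqrt(chi_square / (chi_square + n))

    print(contingency_coefficient(24.0, 100))  # about 0.44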

Kendall's Tau Statistics

Kendall's tau statistics measure the correlation between two sets of rankings. Tau is the number of concordant pairs of observations minus the number of discordant pairs, adjusted so that it ranges from minus one to plus one. There are three different methods for standardizing tau (tau-a, tau-b, and tau-c). Note that tau-b is only calculated for square tables.
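
The sketch below counts concordant and discordant pairs directly from two lists of rankings and returns tau-a; it is a simple O(n^2) illustration of the idea rather than the command's implementation, and the other variants differ only in how the pair difference is standardized:

    def kendall_tau_a(x, y):
        # Count pairs ordered the same way (concordant) and the opposite
        # way (discordant) in the two rankings.
        concordant = discordant = 0
        n = len(x)
        for i in range(n):
            for j in range(i + 1, n):
                s = (x[i] - x[j]) * (y[i] - y[j])
                if s > 0:
                    concordant += 1
                elif s < 0:
                    discordant += 1
        # tau-a: the pair difference divided by the total number of pairs.
        return (concordant - discordant) / (n * (n - 1) / 2)

    print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))  # about 0.67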

Gamma

Gamma is similar to the tau statistics except that it may be interpreted directly as the difference between the probability of like and unlike orderings of the two variables when a pair of observations is chosen at random. Gamma has a value of plus one when all of the data falls in the diagonal running from the upper-left corner to the lower-right corner of the table, and a value of minus one when all of the data is concentrated in the upper-right to lower-left diagonal.
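
Gamma divides the same concordant-minus-discordant difference by the number of untied pairs; a sketch using hypothetical pair counts:

    def gamma(concordant, discordant):
        # Difference between like and unlike orderings as a proportion of
        # all untied pairs, giving a range of minus one to plus one.
        return (concordant - discordant) / (concordant + discordant)

    print(gamma(5, 1))  # about 0.67 for the pair counts in the tau example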

Cohen's Kappa

Cohen's Kappa is another measure of the degree to which the data falls on the main diagonal; that is, the degree of agreement between the row and column classifications, corrected for the agreement expected by chance. It is only calculated for square tables.
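
A sketch of Kappa computed from a square table of counts; the two-by-two example table is hypothetical:

    def cohens_kappa(table):
        # table: a square list of rows of counts.
        k = len(table)
        n = sum(sum(row) for row in table)
        row_totals = [sum(row) for row in table]
        col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
        observed = sum(table[i][i] for i in range(k)) / n
        expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
        # Proportion on the main diagonal, corrected for chance agreement.
        return (observed - expected) / (1 - expected)

    print(cohens_kappa([[20, 5], [10, 15]]))  # 0.4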

Somers' d

Somers' d is a measure of association for ordered contingency tables in which one variable is treated as dependent and the other as independent. It may be interpreted in the same fashion as a regression coefficient.
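
A sketch of the asymmetric form with y treated as the dependent variable; pairs tied on the dependent variable alone are added to the denominator (the variable names are illustrative):

    def somers_d(x, y):
        # y is the dependent variable, x the independent variable.
        concordant = discordant = tied_y_only = 0
        n = len(x)
        for i in range(n):
            for j in range(i + 1, n):
                dx, dy = x[i] - x[j], y[i] - y[j]
                if dx != 0 and dy != 0:
                    if dx * dy > 0:
                        concordant += 1
                    else:
                        discordant += 1
                elif dx != 0 and dy == 0:
                    tied_y_only += 1  # tied on y but not on x
        # Pairs tied on the dependent variable reduce the statistic.
        return (concordant - discordant) / (concordant + discordant + tied_y_only)

    print(somers_d([1, 2, 3, 4], [1, 2, 2, 3]))  # about 0.83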

Odds ratio

The odds ratio is calculated for two-by-two tables. Its value may vary between zero and infinity. A value greater than one indicates a positive relationship, while a value less than one indicates a negative relationship. A value of one indicates statistical independence. Note that this is different from most measures of association, for which a value of zero indicates independence.
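
A sketch for a two-by-two table using the same hypothetical cell labels a, b, c, and d as in the Phi example:

    def odds_ratio(a, b, c, d):
        # Ratio of the odds in the first row (a to b) to the odds in the
        # second row (c to d).
        return (a * d) / (b * c)

    print(odds_ratio(30, 10, 10, 30))  # 9.0, a positive relationship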

Yule's Q and Yule's Y

Yule's Q is a function of the odds ratio. Unlike the odds ratio, its value varies between minus one and one: a value of zero indicates statistical independence, while values of minus one and one represent perfect negative and positive relationships. Yule's Y is another function of the odds ratio with the same range and interpretation. Both are calculated only for two-by-two tables.
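
A sketch showing both statistics as transformations of the odds ratio, again using the hypothetical a, b, c, d cell layout:

    from math import sqrt

    def yules_q(a, b, c, d):
        # Q = (OR - 1) / (OR + 1), which is the same as (ad - bc) / (ad + bc).
        return (a * d - b * c) / (a * d + b * c)

    def yules_y(a, b, c, d):
        # Y applies the same transformation to the square root of the odds ratio.
        root = sqrt((a * d) / (b * c))
        return (root - 1) / (root + 1)

    print(yules_q(30, 10, 10, 30))  # 0.8
    print(yules_y(30, 10, 10, 30))  # 0.5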

Entropy

Entropy is a measure of disorder; that is, the extent to which the data is randomly distributed in a contingency table. The greater the disorder, the greater the entropy statistic. It is useful for comparing different crosstab tables with each other. A low entropy (near zero) indicates that the data tends to be clustered in only a few of the possible categories. A high entropy indicates that the data is evenly distributed among all the possible categories.
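
A sketch using the Shannon entropy of the cell proportions in natural-log units; the exact definition and logarithm base used by the command may differ:

    from math import log

    def entropy(counts):
        # counts: the cell counts of the table, flattened into one list.
        n = sum(counts)
        h = 0.0
        for count in counts:
            if count > 0:
                p = count / n
                h -= p * log(p)  # each nonempty cell's contribution
        return h

    print(entropy([25, 25, 25, 25]))  # about 1.39: evenly spread, high entropy
    print(entropy([97, 1, 1, 1]))     # about 0.17: clustered in one cell, low entropy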