Chi square test

The Chi square test of independence is a simple way to check whether two variables are related. It is commonly used in the social sciences and does not require numerical variables. It tests the significance of the relationship between exactly two variables, which makes it a bivariate test. What it actually does is compare the observed frequencies (fo) with the frequencies that would be expected by random chance (fe). It is nonparametric, which means that one does not have to assume anything about the distribution in the population in order to use it. Unlike many other statistics, it therefore does not require your data to be normally distributed. The chi square distribution itself is, however, always asymmetric (positively skewed), and all values found in appendix C are positive. The data for a Chi square test are usually presented in a table, and with two variables you need to create a bivariate table.

The columns of the table hold the independent variable, so the number of columns equals the number of categories in the independent variable. The rows hold the dependent variable, and the number of rows equals the number of categories in the dependent variable. Multiplying the number of categories in the two variables gives the total number of cells in the table. Each cell states how often a particular combination of categories occurred.
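As a rough illustration of how such a table is put together, the sketch below cross-tabulates a handful of paired observations. The data and the names `observations`, `column_categories` and `row_categories` are invented for this illustration only.

```python
from collections import Counter

# Hypothetical paired observations: (independent category, dependent category).
observations = [
    ("low", "high"), ("low", "high"), ("low", "low"),
    ("high", "high"), ("high", "low"), ("high", "low"),
]

column_categories = ["low", "high"]  # independent variable -> columns
row_categories = ["high", "low"]     # dependent variable -> rows

# Count how often each combination of categories occurs.
counts = Counter(observations)

# table[i][j] = frequency of row category i combined with column category j.
table = [[counts[(col, row)] for col in column_categories]
         for row in row_categories]

# Number of cells = categories in one variable times categories in the other.
n_cells = len(row_categories) * len(column_categories)

print(table)    # [[2, 1], [1, 2]]
print(n_cells)  # 4
```

The cell frequencies always add up to the number of observations, which is a quick check that no case was lost while building the table.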

The Chi square test has some limits. If the table is larger than 4x4, the numbers become difficult to interpret. The test is also sensitive to sample size: as the sample increases, the chi square increases. This means that with large samples even the most trivial relationships can become significant. To avoid this, the alpha (significance) level should be set very small when you work with large samples. The test is also sensitive to small cells, so when expected cell frequencies are less than 5 you should calculate the chi square with a modified formula, “Yates’ correction for continuity”.
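Both sensitivities can be seen in a small sketch. The function `chi_square` below is an illustrative helper, not a library routine. It uses the counts from the example that follows and assumes the common form of Yates' correction, in which 0.5 is subtracted from each absolute deviation before squaring.

```python
def chi_square(observed, yates=False):
    """Chi square for a table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            diff = abs(fo - fe)
            if yates:
                # Assumed form of Yates' correction: shrink each absolute
                # deviation by 0.5, never below zero.
                diff = max(diff - 0.5, 0.0)
            total += diff ** 2 / fe
    return total

small = [[8, 5], [4, 8]]                            # N = 25
large = [[10 * fo for fo in row] for row in small]  # same pattern, N = 250

print(round(chi_square(small), 2))              # 1.99
print(round(chi_square(large), 2))              # 19.89
print(round(chi_square(small, yates=True), 2))  # 1.02
```

Multiplying every count by ten leaves the pattern in the table unchanged but multiplies the chi square by ten, which is why a trivial relationship can reach significance in a large sample. The correction always pulls the statistic down, making the test more conservative.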

Example:

Are the homicide rate and the volume of gun sales related for a sample of 25 cities? The table shows the relationship between homicide rate (columns) and gun sales (rows).
|| GUN SALES || HOMICIDE RATE: Low || HOMICIDE RATE: High || Totals ||
|| High || 8 || 5 || 13 ||
|| Low || 4 || 8 || 12 ||
|| Totals || 12 || 13 || N = 25 ||

H0: The variables are independent (fo = fe).

H1: The variables are dependent (fo ≠ fe).

Alpha = .05

df = (rows - 1)(columns - 1) = (2 - 1)(2 - 1) = 1

χ² (critical) = 3.841
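The degrees of freedom and the critical value can be checked with a short sketch. It relies on the fact that a chi square variable with 1 degree of freedom is the square of a standard normal variable, so the helper `critical_value_df1` (a name chosen for this illustration) is only valid for df = 1. For other degrees of freedom the table in appendix C is still needed.

```python
from statistics import NormalDist

def degrees_of_freedom(n_rows, n_columns):
    """df = (rows - 1)(columns - 1) for a bivariate table."""
    return (n_rows - 1) * (n_columns - 1)

def critical_value_df1(alpha):
    """Critical chi square for df = 1 only: the squared two-tailed z value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2

df = degrees_of_freedom(2, 2)
print(df)                                  # 1
print(round(critical_value_df1(0.05), 3))  # 3.841
```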

Formula: χ² (obtained) = ∑ [(fo - fe)² / fe]

fe for every cell = (row marginal * column marginal) / N

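Multiply the column and row marginals for each cell and divide by N:
 * High gun sales, low homicide rate: (13*12)/25 = 156/25 = 6.24
 * High gun sales, high homicide rate: (13*13)/25 = 169/25 = 6.76
 * Low gun sales, low homicide rate: (12*12)/25 = 144/25 = 5.76
 * Low gun sales, high homicide rate: (12*13)/25 = 156/25 = 6.24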
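The same arithmetic can be written as a minimal sketch that derives the marginals from the observed counts. The variable names are chosen for this illustration.

```python
# Observed counts: rows are gun sales (high, low),
# columns are homicide rate (low, high).
observed = [[8, 5],
            [4, 8]]

row_marginals = [sum(row) for row in observed]           # [13, 12]
column_marginals = [sum(col) for col in zip(*observed)]  # [12, 13]
n = sum(row_marginals)                                   # 25

# fe for every cell = (row marginal * column marginal) / N
expected = [[r * c / n for c in column_marginals] for r in row_marginals]

print(expected)  # [[6.24, 6.76], [5.76, 6.24]]
```

The expected frequencies sum to N, just as the observed frequencies do.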

Observed and expected frequencies for each cell:

|| GUN SALES || HOMICIDE RATE: Low || HOMICIDE RATE: High || Total ||
|| High || fo = 8, fe = 6.24 || fo = 5, fe = 6.76 || 13 ||
|| Low || fo = 4, fe = 5.76 || fo = 8, fe = 6.24 || 12 ||
|| Totals || 12 || 13 || N = 25 ||

Calculating Chi square:

Add the values for **fo** and **fe** for each cell to the table:

|| **fo** || **fe** || **fo - fe** || **(fo - fe)²** || **(fo - fe)² / fe** ||
|| 8 || 6.24 ||  ||  ||  ||
|| 5 || 6.76 ||  ||  ||  ||
|| 4 || 5.76 ||  ||  ||  ||
|| 8 || 6.24 ||  ||  ||  ||
|| Total 25 || 25 ||  ||  ||  ||

Subtract each **fe** from each **fo**. The total of this column //must// be zero:

|| **fo** || **fe** || **fo - fe** || **(fo - fe)²** || **(fo - fe)² / fe** ||
|| 8 || 6.24 || 1.76 ||  ||  ||
|| 5 || 6.76 || -1.76 ||  ||  ||
|| 4 || 5.76 || -1.76 ||  ||  ||
|| 8 || 6.24 || 1.76 ||  ||  ||
|| Total 25 || 25 || 0 ||  ||  ||

Square each of these values:

|| **fo** || **fe** || **fo - fe** || **(fo - fe)²** || **(fo - fe)² / fe** ||
|| 8 || 6.24 || 1.76 || 3.10 ||  ||
|| 5 || 6.76 || -1.76 || 3.10 ||  ||
|| 4 || 5.76 || -1.76 || 3.10 ||  ||
|| 8 || 6.24 || 1.76 || 3.10 ||  ||
|| Total 25 || 25 || 0 ||  ||  ||

Divide each of the squared values by the **fe** for that cell. The sum of this column is chi square:

|| **fo** || **fe** || **fo - fe** || **(fo - fe)²** || **(fo - fe)² / fe** ||
|| 8 || 6.24 || 1.76 || 3.10 || .50 ||
|| 5 || 6.76 || -1.76 || 3.10 || .46 ||
|| 4 || 5.76 || -1.76 || 3.10 || .54 ||
|| 8 || 6.24 || 1.76 || 3.10 || .50 ||
|| Total 25 || 25 || 0 ||  || χ² = 2.00 ||

χ² (critical) = 3.841 > χ² (obtained) = 2.00, therefore we fail to reject H0. There is no significant relationship between homicide rate and gun sales.
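The column-by-column calculation can be retraced with a short sketch, one list per table column. The p value at the end is an addition for illustration. It uses a shortcut that holds only for df = 1 and is not part of the hand calculation.

```python
import math

observed = [8, 5, 4, 8]              # fo column
expected = [6.24, 6.76, 5.76, 6.24]  # fe column

# fo - fe for each cell; this column must sum to zero.
deviations = [fo - fe for fo, fe in zip(observed, expected)]
print(round(abs(sum(deviations)), 6))  # 0.0

# (fo - fe)^2 / fe for each cell; the sum of this column is chi square.
contributions = [d ** 2 / fe for d, fe in zip(deviations, expected)]
chi_square_obtained = sum(contributions)
print(round(chi_square_obtained, 2))   # 1.99

critical = 3.841  # alpha = .05, df = 1
reject_h0 = chi_square_obtained > critical
print(reject_h0)                       # False

# For df = 1 only: P(chi square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square_obtained / 2))
print(p_value > 0.05)                  # True
```

The unrounded result is about 1.99 rather than 2.00 because the table rounds each squared deviation to 3.10 before dividing. The decision is the same either way: the obtained value is well below 3.841.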