
CONTINGENCY TABLE COEFFICIENTS AND THEIR STATISTICAL SIGNIFICANCE

The contingency coefficients (Q-Yule, Phi, C-Pearson, V-Cramer) can be calculated from raw data or from data already gathered in a contingency table.

Command: Statistic → NonParametric tests (unordered categories) → Q-Yule, Phi (2x2) or C-Pearson, V-Cramer (RxC)


The Yule’s Q contingency coefficient

The Yule’s Q contingency coefficient (Yule (1900)) is a measure of correlation which can be calculated for 2 × 2 contingency tables:

$$Q = \frac{O_{11}O_{22} - O_{12}O_{21}}{O_{11}O_{22} + O_{12}O_{21}}$$

where $O_{11}, O_{12}, O_{21}, O_{22}$ are the observed frequencies in the contingency table.

The value of Q lies in the range [−1, 1]. The closer Q is to 0, the weaker the dependence between the analysed features; the closer it is to −1 or +1, the stronger the dependence. The coefficient has one disadvantage: it is not robust to small observed frequencies (if one of them is 0, the coefficient may wrongly indicate total dependence of the features).
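The sensitivity to empty cells is easy to see numerically. The following is a minimal sketch, assuming the standard definition Q = (O11·O22 − O12·O21) / (O11·O22 + O12·O21); the function name `yule_q` and the cell counts are hypothetical and chosen only for illustration.

```python
def yule_q(o11, o12, o21, o22):
    """Yule's Q for a 2 x 2 table of observed frequencies."""
    concordant = o11 * o22
    discordant = o12 * o21
    if concordant + discordant == 0:
        raise ValueError("Q is undefined when O11*O22 + O12*O21 == 0")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical counts, for illustration only.
print(yule_q(30, 10, 15, 25))  # moderate positive association
print(yule_q(30, 0, 15, 25))   # one empty cell pushes Q to exactly 1.0
```

In the second call the other three cells still show a mixed pattern, yet Q reports total dependence, which is the weakness described above.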

The statistical significance of the Yule’s Q coefficient is determined by the Z test.
Hypotheses:

  • H0 : Q = 0,
  • H1 : Q ≠ 0.
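The page does not give the formula of the Z statistic. As a rough sketch, one commonly used asymptotic form divides Q by the standard error SE = ½(1 − Q²)·√(1/O11 + 1/O12 + 1/O21 + 1/O22); this choice is an assumption and may differ from what PQStat computes. The function name and the counts are hypothetical.

```python
import math


def yule_q_z_test(o11, o12, o21, o22):
    """Two-sided Z test of H0: Q = 0 (assumed asymptotic standard error)."""
    if min(o11, o12, o21, o22) <= 0:
        raise ValueError("this standard error requires all cells > 0")
    q = (o11 * o22 - o12 * o21) / (o11 * o22 + o12 * o21)
    se = 0.5 * (1 - q * q) * math.sqrt(1 / o11 + 1 / o12 + 1 / o21 + 1 / o22)
    z = q / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return q, z, p


# Hypothetical counts, for illustration only.
q, z, p = yule_q_z_test(30, 10, 15, 25)
print(f"Q = {q:.3f}, Z = {z:.3f}, p = {p:.2g}")
```

H0 is rejected when p is equal to or less than the chosen significance level α.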

The φ contingency coefficient

The φ contingency coefficient is a measure of correlation which can be calculated for 2 × 2 contingency tables:

$$\varphi = \sqrt{\frac{\chi^2}{n}}$$

where χ2 is the value of the χ2 test statistic and n is the total frequency in the contingency table.

The value of the φ coefficient lies in the range [0, 1]. The closer φ is to 0, the weaker the dependence between the analysed features; the closer it is to 1, the stronger the dependence.

The φ contingency coefficient is considered statistically significant if the p value calculated on the basis of the χ2 test (designated for this table) is equal to or less than the significance level α.
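A minimal sketch of this calculation, assuming φ = √(χ²/n) with Pearson's χ² computed without a continuity correction (whether PQStat applies a correction here is not stated on this page). The function names and counts are hypothetical; the p value uses the identity P(χ² > x) = erfc(√(x/2)), which holds for 1 degree of freedom only.

```python
import math


def chi_square_2x2(o11, o12, o21, o22):
    """Pearson's chi-square statistic (no continuity correction) and n."""
    observed = ((o11, o12), (o21, o22))
    row_totals = (o11 + o12, o21 + o22)
    col_totals = (o11 + o21, o12 + o22)
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2, n


def phi_coefficient(o11, o12, o21, o22):
    """Return (phi, chi2, p) for a 2 x 2 table."""
    chi2, n = chi_square_2x2(o11, o12, o21, o22)
    phi = math.sqrt(chi2 / n)
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, df = 1
    return phi, chi2, p


# Hypothetical counts, for illustration only.
phi, chi2, p = phi_coefficient(30, 10, 15, 25)
print(f"chi2 = {chi2:.3f}, phi = {phi:.3f}, p = {p:.4f}")
```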



The Cramer’s V contingency coefficient

The Cramer’s V contingency coefficient (Cramer (1946)) is an extension of the φ coefficient to r × c contingency tables:

$$V = \sqrt{\frac{\chi^2}{n(w-1)}}$$

where χ2 - value of the χ2 test statistic,
n - total frequency in a contingency table,
w - the smaller of the two values r and c.

The value of V lies in the range [0, 1]. The closer V is to 0, the weaker the dependence between the analysed features; the closer it is to 1, the stronger the dependence. The value of V also depends on the table size, so it should not be used to compare contingency tables of different sizes.

The V contingency coefficient is considered statistically significant if the p value calculated on the basis of the χ2 test (designated for this table) is equal to or less than the significance level α.
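As an illustration, V can be obtained directly from an already computed χ² statistic. This sketch assumes V = √(χ² / (n·(w − 1))) with w = min(r, c); the function name and the input values are hypothetical.

```python
import math


def cramers_v(chi2, n, r, c):
    """Cramer's V from a chi-square statistic for an r x c table."""
    w = min(r, c)
    if w < 2:
        raise ValueError("the table needs at least 2 rows and 2 columns")
    return math.sqrt(chi2 / (n * (w - 1)))


# Hypothetical values, for illustration only.
print(cramers_v(chi2=20.0, n=100, r=3, c=4))  # w = 3
print(cramers_v(chi2=20.0, n=100, r=2, c=2))  # w = 2, so V equals phi
```

The same χ² and n give different V values for different table sizes, which is why V is not comparable across tables of different dimensions.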

The Pearson’s C contingency coefficient

The Pearson’s C contingency coefficient is a measure of correlation which can be calculated for r × c contingency tables:

$$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$

where χ2 - value of the χ2 test statistic,
n - total frequency in a contingency table.

The value of C lies in the range [0, 1). The closer C is to 0, the weaker the dependence between the analysed features; the farther it is from 0, the stronger the dependence. The value of C also depends on the table size (the bigger the table, the closer to 1 the value of C can be), which is why the upper limit that C can reach for the particular table size should be calculated.

The C contingency coefficient is considered statistically significant if the p value calculated on the basis of the χ2 test (designated for this table) is equal to or less than the significance level α.
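The page does not state the formula for the upper limit. A commonly used bound is Cmax = √((w − 1)/w) with w = min(r, c), and the adjusted coefficient is then Cadj = C / Cmax. The sketch below uses that bound as an assumption; the function names and input values are hypothetical.

```python
import math


def pearson_c(chi2, n):
    """Pearson's C from a chi-square statistic and the total frequency."""
    return math.sqrt(chi2 / (chi2 + n))


def pearson_c_max(r, c):
    """Assumed upper limit of C for an r x c table: sqrt((w - 1) / w)."""
    w = min(r, c)
    return math.sqrt((w - 1) / w)


def pearson_c_adjusted(chi2, n, r, c):
    """C rescaled by its assumed upper limit for the table size."""
    return pearson_c(chi2, n) / pearson_c_max(r, c)


# Hypothetical values, for illustration only.
print(pearson_c(20.0, 100))                 # raw C
print(pearson_c_max(2, 2))                  # limit for a 2 x 2 table
print(pearson_c_adjusted(20.0, 100, 2, 2))  # adjusted C
```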


Example (EN_sex-exam.pqs file)

The sample consists of 170 persons (n = 170), for whom 2 features are analysed (X = sex, Y = passing the exam). Each of these features has 2 categories (X1 = f, X2 = m, Y1 = yes, Y2 = no). Based on the sample, we would like to find out whether there is any dependence between sex and passing the exam in the analysed population. The data distribution is presented in a contingency table:

[Table: 2 × 2 contingency table of sex by passing the exam]

[Figure: Q-Yule, Phi report]

[Figure: C-Pearson, V-Cramer report]

[Figure: chart for the 2 × 2 table]

The chi-square test statistic is 16.33, and the p value calculated for it is p = 0.00005. The result indicates that there is a statistically significant dependence between sex and passing the exam in the analysed population. The coefficients based on the chi-square test, which describe the strength of the correlation between the analysed features, are:
Cadj-Pearson = 0.42,
V-Cramer = φ = 0.31.
In addition, Q-Yule = 0.58, and the p value of the Z test (like that of the chi-square test) indicates a statistically significant dependence between the analysed features.
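The chi-square based values can be checked from the two reported numbers alone (χ² = 16.33, n = 170). This is a sketch under assumptions: it uses φ = √(χ²/n), C = √(χ²/(χ² + n)), and the upper limit Cmax = √((w − 1)/w) with w = 2, none of which are spelled out in the report itself. Q-Yule cannot be checked this way, because it requires the individual cell counts.

```python
import math

chi2 = 16.33  # chi-square statistic reported in the example
n = 170       # sample size reported in the example
w = 2         # smaller table dimension of the 2 x 2 table

phi = math.sqrt(chi2 / n)             # equals Cramer's V when w = 2
c = math.sqrt(chi2 / (chi2 + n))      # Pearson's C
c_adj = c / math.sqrt((w - 1) / w)    # C divided by its assumed upper limit
p = math.erfc(math.sqrt(chi2 / 2))    # chi-square tail probability, df = 1

print(round(phi, 2))    # 0.31
print(round(c_adj, 2))  # 0.42
print(f"{p:.5f}")       # 0.00005
```

All three results agree with the values quoted in the report.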

 © Copyright 2009-2012 PQStat Software