This test can be calculated from raw data or from a contingency table.

The Cohen’s Kappa coefficient (κ̂) (Cohen J. (1960)) describes the level of agreement between two measurements of the same variable made under different conditions. The two measurements can be made by 2 different observers (reproducibility) or by one observer twice (recurrence). The coefficient is calculated for categorical dependent variables and takes values from -1 to 1. A value of 1 means full agreement; a value of 0 means agreement at the level that would occur if the data were spread randomly in the contingency table. Values between 0 and -1 are rarely used in practice; a negative value means agreement lower than the agreement expected for randomly spread data. The contingency table of observed frequencies (O_{ij}) for this coefficient has to be square (C × C), with the same categories in the rows and in the columns.

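The coefficient is defined by:

$$\hat{\kappa} = \frac{\sum_{i=1}^{C} O_{ii} - \sum_{i=1}^{C} E_{ii}}{n - \sum_{i=1}^{C} E_{ii}}$$

where:

O_{ii}, E_{ii} are the observed and the expected frequencies on the main diagonal, and n is the total number of observations (the sum of all O_{ij}).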
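The definition can be illustrated with a minimal Python sketch. It assumes the expected frequencies come from the marginal totals in the usual way (row total × column total / n); the function name `cohen_kappa` and the 2 × 2 table are hypothetical and are not taken from the example in this section.

```python
def cohen_kappa(observed):
    """Cohen's kappa for a square (C x C) table of observed counts."""
    c = len(observed)
    if any(len(row) != c for row in observed):
        raise ValueError("the contingency table must be square (C x C)")

    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(c)) for j in range(c)]

    # Sums of observed and expected frequencies on the main diagonal.
    # Assumption: E_ii = row total * column total / n (chance agreement).
    sum_o_diag = sum(observed[i][i] for i in range(c))
    sum_e_diag = sum(row_totals[i] * col_totals[i] / n for i in range(c))

    return (sum_o_diag - sum_e_diag) / (n - sum_e_diag)


# Hypothetical table: rows = first measurement, columns = second measurement.
table = [[20, 5],
         [10, 15]]
kappa = cohen_kappa(table)
print(round(kappa, 4))  # 0.4
```

In this made-up table 35 of 50 pairs agree (70%), chance alone would account for 25 of them, and the coefficient is (35 - 25) / (50 - 25) = 0.4.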

The Z test of significance for the Cohen’s Kappa

The test of significance for the Cohen’s Kappa (Fleiss (1981)) is used to verify the hypothesis about the agreement between the results of two measurements, X(1) and X(2), of a feature X. It is based on the coefficient κ̂ calculated for the sample.

Basic assumptions:

measurement on a nominal scale (alternatively: an ordinal or an interval scale).

Hypotheses:

H_0: κ = 0,

H_1: κ ≠ 0,

where κ is the Cohen’s Kappa coefficient in the population.
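As a rough illustration of how such a test can be carried out, the sketch below computes a Z statistic as κ̂ divided by its standard error under the null hypothesis. The test statistic itself is not shown in this section, so the standard error used here (the large-sample null formula commonly associated with Fleiss, Cohen and Everitt) is an assumption, as are the function name `kappa_z_test` and the 2 × 2 table.

```python
import math


def kappa_z_test(observed):
    """Return (kappa, z, two-sided p value) for a square table of counts.

    Assumption: the null standard error is the large-sample formula
    SE0 = sqrt((Pe + Pe^2 - sum_i p_i. * p_.i * (p_i. + p_.i)) / (n * (1 - Pe)^2)).
    """
    c = len(observed)
    n = sum(sum(row) for row in observed)

    # Marginal proportions of the first (rows) and second (columns) measurement.
    p_row = [sum(row) / n for row in observed]
    p_col = [sum(observed[i][j] for i in range(c)) / n for j in range(c)]

    p_o = sum(observed[i][i] for i in range(c)) / n   # observed agreement
    p_e = sum(p_row[i] * p_col[i] for i in range(c))  # chance agreement
    kappa = (p_o - p_e) / (1 - p_e)

    s = sum(p_row[i] * p_col[i] * (p_row[i] + p_col[i]) for i in range(c))
    se0 = math.sqrt((p_e + p_e ** 2 - s) / (n * (1 - p_e) ** 2))

    z = kappa / se0
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return kappa, z, p_value


# Hypothetical table, not the data from the example in this section.
kappa, z, p_value = kappa_z_test([[20, 5], [10, 15]])
print(kappa, z, p_value)
```

If the p value is below the chosen significance level α, the null hypothesis κ = 0 is rejected.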

You want to analyse the agreement between diagnoses made by 2 doctors. To do this, a sample of 110 patients (children) is drawn from the population. The doctors see patients in neighbouring offices. Each patient is examined first by doctor A and then by doctor B. The diagnoses made by both doctors are shown in the table below.

Hypotheses:

H_0: κ = 0,

H_1: κ ≠ 0.

We could analyse the agreement of the diagnoses using just the percentage of matching diagnoses. In this example, the two doctors gave the same diagnosis for 73 patients (31+39+3=73), which is 66.36% of the analysed group. The kappa coefficient additionally corrects for chance agreement, that is, for the agreement that would be expected to occur by chance.
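The unadjusted percentage can be checked with a few lines of Python. Only the diagonal counts and the sample size quoted above are used; the variable names are illustrative.

```python
# Counts of patients for whom both doctors gave the same diagnosis
# (the main diagonal of the table), as quoted in the text.
agreeing_counts = [31, 39, 3]
n_patients = 110

n_agree = sum(agreeing_counts)
percent_agreement = 100 * n_agree / n_patients

print(f"{n_agree} of {n_patients} = {percent_agreement:.2f}%")  # 73 of 110 = 66.36%
```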

The chance-adjusted agreement, κ̂ = 44.58%, is smaller than the agreement that is not adjusted for chance. The p value < 0.000001. This result indicates a statistically significant agreement between the 2 doctors’ opinions at the significance level α = 0.05.