What is it about?

A contingency table summarizes the observed joint occurrences of events for categorical data. Its row and column sums act as sample size parameters for the data. We describe the four forms of proportional variation that are invariant to scaling the marginal sums; these form the basis for formulating effect size for the statistical association between the variables. In contrast, we demonstrate that the simple matching coefficient, the phi coefficient, and Pearson's chi-squared are subject to confounding effects from unbalanced sample sizes because they do not distinguish between row proportions and column proportions.
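
The sensitivity to unbalanced sample sizes can be illustrated with a small numerical sketch. The counts below are invented for illustration and are not taken from the paper. The second table is the first with its second row multiplied by 10, which changes a marginal sum but leaves the proportions within each row unchanged. The odds ratio is included only as a familiar example of a measure that is invariant to this scaling; it is not claimed to be the measure the paper proposes.

```python
import math

def row_proportions(t):
    """Proportions within each row (each row sums to 1)."""
    return [[x / sum(row) for x in row] for row in t]

def phi(t):
    """Phi coefficient of a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = t
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def chi_squared(t):
    """Pearson's chi-squared for a 2 x 2 table, using chi2 = n * phi**2."""
    n = sum(map(sum, t))
    return n * phi(t) ** 2

def simple_matching(t):
    """Simple matching coefficient: share of counts on the main diagonal."""
    (a, b), (c, d) = t
    return (a + d) / (a + b + c + d)

def odds_ratio(t):
    """Odds ratio ad / bc, unchanged when a whole row is rescaled."""
    (a, b), (c, d) = t
    return (a * d) / (b * c)

# Hypothetical counts (assumption): equal row sums of 50.
balanced = [[40, 10], [20, 30]]
# Same table with the second row scaled by 10 (row sums 50 and 500).
unbalanced = [[40, 10], [200, 300]]

for name, table in [("balanced", balanced), ("unbalanced", unbalanced)]:
    print(name,
          "row proportions:", row_proportions(table),
          "odds ratio:", odds_ratio(table),
          "phi:", round(phi(table), 3),
          "chi-squared:", round(chi_squared(table), 2),
          "SMC:", round(simple_matching(table), 3))
```

The row proportions and the odds ratio are identical for the two tables, while the phi coefficient, chi-squared, and the simple matching coefficient all change even though only the sample size of one row was altered.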

Why is it important?

The lack of consensus on the merits of alternative effect size measures is detrimental to statistical practice. This motivates our research on the applied mathematical foundations of association analysis methodology.

Perspectives

The machine learning community has rediscovered the problems associated with unbalanced sample sizes and coined the new term "classification imbalance".

Stanley Luck
Vector Analytics LLC

Read the Original

This page is a summary of: Factoring a 2 x 2 contingency table, PLoS ONE, October 2019, PLOS,
DOI: 10.1371/journal.pone.0224460.
