What is it about?

Risk models are widely used, for example, in medicine (predicting the risk of disease), public administration (predicting the risk of tax fraud or child abuse), and insurance (predicting the risk of expensive outcomes). Their use carries a risk of discriminating against certain groups, which raises the question: when is a risk model "fair"? Leaving aside the (important!) question of whether a risk model should be developed and used at all in a given application, we discuss here how to choose proper metrics for assessing and quantifying the fairness of risk scores. Moreover, we ask whether ranking subjects by their predicted risk is likely to be fair, and find that (without appropriate fairness interventions) this is not the case.

Why is it important?

Risk scores are widely used and carry important social and ethical implications. Most existing work on algorithmic fairness focuses on classification systems, which are distinct from risk models. We point out several ways in which commonly used performance metrics (including standard calibration error metrics) can be highly misleading when employed in algorithmic fairness audits.
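To make one such pitfall concrete, below is a minimal sketch (our illustration, not code from the paper) of a group-wise fairness audit using a standard binned expected calibration error (ECE) estimate. The function name binned_ece and the two simulated groups are hypothetical; the point is that the binned estimator is biased upward for small samples, so a smaller group can appear worse calibrated even when its scores are perfectly calibrated by construction.

```python
import numpy as np

def binned_ece(y_true, y_prob, n_bins=10):
    """Standard binned expected calibration error (ECE): the
    bin-size-weighted mean absolute gap between average predicted
    probability and observed outcome rate."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (y_prob >= lo) & (y_prob < hi)
        if in_bin.any():
            gap = abs(y_prob[in_bin].mean() - y_true[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

rng = np.random.default_rng(0)
# Hypothetical audit: two groups with perfectly calibrated scores,
# differing only in sample size.
for group, n in [("group A (n=10000)", 10_000), ("group B (n=200)", 200)]:
    p = rng.uniform(size=n)       # predicted risks
    y = rng.uniform(size=n) < p   # outcomes drawn with probability p
    print(group, "ECE =", round(binned_ece(y, p), 3))
# The smaller group reports a clearly larger ECE, purely due to the
# finite-sample bias of the estimator, not worse calibration.
```

Comparing such raw per-group ECE values in an audit would wrongly flag the smaller group as miscalibrated; this is one reason the choice of metric matters.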

Perspectives

The choice (and design) of application-appropriate, reliable, and unbiased metrics is a crucially important step in any algorithmic fairness assessment. Most metrics in wide use today are not well suited to this task.

Eike Petersen
Danmarks Tekniske Universitet

Read the Original

This page is a summary of: On (assessing) the fairness of risk score models, June 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3593013.3594045.
