Chapter 10. Testing and Remediating Bias with XGBoost

This chapter presents bias testing and remediation techniques for structured data. While Chapter 4 addressed bias from various perspectives, this chapter focuses on technical implementations of bias testing and remediation approaches. We'll start by training XGBoost on a variant of the credit card data. We'll then test for bias by checking for differences in performance and outcomes across demographic groups, and we'll also try to identify bias concerns at the level of individual observations. Once we confirm measurable bias in our model predictions, we'll start trying to fix, or remediate, it. We'll employ pre-, in-, and postprocessing remediation methods that attempt to fix the training data, the model, and its outcomes, respectively. We'll finish the chapter by conducting bias-aware model selection that leaves us with a model that is both performant and fairer than the original.
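To make the first two steps concrete, here is a minimal sketch of training an XGBoost classifier and running a simple group-level outcome test. The synthetic data, the column names (limit_bal, pay_amt, bill_amt), the 15% denial cutoff, and the choice of the adverse impact ratio (AIR) as the disparity metric are all illustrative assumptions, not the chapter's exact code:

```python
import numpy as np
import pandas as pd
import xgboost as xgb

rng = np.random.default_rng(42)
n = 10_000

# Hypothetical stand-in for the credit card data; column names are invented
data = pd.DataFrame({
    "limit_bal": rng.normal(50_000, 15_000, n),
    "pay_amt": rng.normal(5_000, 2_000, n),
    "bill_amt": rng.normal(20_000, 8_000, n),
    "group": rng.choice(["A", "B"], n),  # protected attribute, held out of training
})
# Shift one group's bill amounts so the disparity test has something to detect
data.loc[data["group"] == "B", "bill_amt"] += 4_000

# Synthetic default outcomes tied to the features
logit = -1.5 - 0.0002 * data["pay_amt"] + 0.00005 * data["bill_amt"]
data["default"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

features = ["limit_bal", "pay_amt", "bill_amt"]  # note: group is NOT a feature
dtrain = xgb.DMatrix(data[features], label=data["default"])
params = {"objective": "binary:logistic", "max_depth": 4, "eta": 0.1}
model = xgb.train(params, dtrain, num_boost_round=100)

# Deny the riskiest 15% of applicants (an illustrative cutoff)
scores = model.predict(dtrain)
data["denied"] = (scores >= np.quantile(scores, 0.85)).astype(int)

# Outcome test: acceptance rate per group, then the adverse impact ratio,
# i.e., protected-group acceptance rate / control-group acceptance rate
accept = 1 - data.groupby("group")["denied"].mean()
air = accept["B"] / accept["A"]
print(f"acceptance rates:\n{accept}\n\nAIR (B vs. A): {air:.3f}")
# A common rule of thumb flags an AIR below 0.8 as a potential disparity
```

Note that the protected attribute is used only to measure disparities, never as a model input, and that the chapter's tests also cover performance differences (not just outcome rates) across groups.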

While we've been clear that technical tests and fixes do not solve the problem of machine learning bias, they still play an important role in an effective overall bias mitigation or ML governance program. Even though fair scores from a model don't translate directly into fair outcomes in a deployed ML system, for any number of reasons, it's still better to have fair scores than not. We'd also argue that it's one of the fundamental and obvious ethical obligations of practicing data scientists to test models for bias.
