Quantifying the impact of equity and justice by debiasing the models running our machines

Affectiva's Demographic Debiasing tool

Data Infrastructure Designer & Developer leading a team of three, using statistical modelling & greedy algorithms to demographically debias machine learning models


Deyin Xu

Principal Data Infrastructure Engineer
Matthew Rossi  

Director of Data Strategy
Farah Ashraf

Software Engineer 
Ahmed Ibrahim

Hesham Ismail

Engineering Interns

My Role

Data Infrastructure Engineer
Project Manager


Front-end & back-end app running asynchronously on AWS. It debiases data by balancing multiple probability distributions across demographics, producing equitable train, validation & test datasets.

Time & Skillset

All Year 2018 
Project Management 
System Design

User Research

Big Data

Software Development

Distributed Systems

Data Visualizations


How did it work?

With more than 6 billion data points spanning different ethnicities, age groups, genders and other distinguishing factors, a balanced split of the data into train, validation and test sets is essential for equitable machine learning models. The problem of dividing a set into multiple subsets while balancing multiple probability distributions is NP-hard.
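To make "balanced" concrete, here is a minimal Python sketch of one way to score a split. It is an illustration, not the metric the tool used: the function names, the record layout (one dict per data point) and the choice of total variation distance are all assumptions.

```python
from collections import Counter


def distribution(records, factor):
    """Share of each value of `factor` among `records` (empty dict if no records)."""
    counts = Counter(r[factor] for r in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def imbalance(splits, factors):
    """Largest distance between any split and the pooled data, over all factors.

    `splits` maps a split name (e.g. "train") to a list of record dicts.
    A score of 0 means every split mirrors the pooled demographics exactly.
    """
    pooled = [r for part in splits.values() for r in part]
    return max(
        total_variation(distribution(part, f), distribution(pooled, f))
        for part in splits.values()
        for f in factors
    )
```

A lower score means each subset's demographic mix is closer to that of the full dataset. Finding the split that minimises such a score across many factors at once is what makes the problem hard.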

The solution had two parts: a greedy algorithm that divides the set into multiple subsets, experimenting with random splits over different factor combinations, and a cloud-based black-box system for machine learning engineers to use.
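One way to picture the greedy step is the sketch below. It is not the algorithm built at Affectiva, whose details are not given here. It is a simple stand-in under stated assumptions: each record is a dict of demographic factors, `ratios` gives the desired share of each split, and the hypothetical `greedy_split` sends each record to the split that is currently furthest below its target for that record's demographic groups.

```python
import random
from collections import defaultdict


def greedy_split(records, factors, ratios, seed=0):
    """Greedily assign records to splits so each demographic group follows `ratios`.

    records: list of dicts, e.g. {"gender": "f", "age": "18-25"}
    factors: keys of the dicts to balance on
    ratios:  dict of split name -> desired share, summing to 1
    """
    rng = random.Random(seed)
    order = list(records)
    rng.shuffle(order)  # random visiting order, reproducible via `seed`

    # How many records carry each (factor, value) pair overall.
    totals = defaultdict(int)
    for r in order:
        for f in factors:
            totals[(f, r[f])] += 1

    counts = {name: defaultdict(int) for name in ratios}
    splits = {name: [] for name in ratios}

    for r in order:
        keys = [(f, r[f]) for f in factors]

        def deficit(name):
            # Target count minus current count, summed over this record's groups.
            return sum(ratios[name] * totals[k] - counts[name][k] for k in keys)

        best = max(ratios, key=deficit)  # greedy choice: the neediest split
        splits[best].append(r)
        for k in keys:
            counts[best][k] += 1

    return splits
```

Calling `greedy_split(records, ["gender", "age"], {"train": 0.8, "val": 0.1, "test": 0.1})` returns a dict of three lists. Because the choice is greedy, the result is a fast approximation rather than a provably optimal split, which is the usual trade-off for an NP-hard problem.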

I initially worked on creating the algorithm and building a manually curated prototype of the system. I experimented with this algorithm for 5 months before designing a cloud-based system with the rest of the team. I then managed two interns as they designed and built the system on top of Amazon Web Services.

In this project, I learnt a lot about system design, team management and task breakdown over the course of 4 months with the intern team. I also learnt a lot about documentation and handover, preparing for the team member who took over after I left.