rovik. and friends discuss: bias in algorithms

The Deliberate Evolution group reconvened for another one of our discussions, and this time we dove into the scandalous world of algorithmic bias. It’s not an unfamiliar topic – almost every major technology platform has been implicated in some controversy over algorithmic preference or censorship, and regulators remain perplexed by the depth of technological dependence embedded in such systems. What are the causes of these problems, and what are our options moving forward? These are the questions we tried to tackle.
Here are the resources we used:
- How I’m fighting bias in algorithms – MIT Media Lab
- AI risks replicating tech’s ethnic minority bias across business – Financial Times
- We Need Transparency in Algorithms, But Too Much Can Backfire – HBR
- Could New York City’s AI Transparency Bill Be a Model for the Country? – GovTech.com
- Google’s algorithm isn’t biased, it’s just not human – Wired
To provide some context, technology platforms across various domains (social media, search, entertainment, etc.) employ algorithms to understand their users and provide relevant, targeted information and services. A number of assumptions are laden in this technology-deterministic worldview:
- The data sets from which the algorithms derive models of user behavior are accurate and complete.
- The algorithmic processes are objective and deterministic.
- The outcomes of an algorithmically defined prescription are superior to those of a subjective, human-derived one.
Each of these can be easily challenged. Currently, we lack complete data across various domains because the data was never collected with these purposes in mind. Yes, we have a person’s purchasing habits and their commuting patterns, but extrapolating from those to estimate someone’s life expectancy is a stretch. One would actually need to measure someone’s cellular changes or biometric performance to have a more direct relationship to the intended measure. Furthermore, not all data is properly indicative. For example, insurance companies are exploring using purchasing habits to price policies. But how do you price a policy if the person purchases for their whole family, including their elderly grandparents? The very data upon which algorithms depend is inherently flawed. Big data requires an appreciation of the fuzziness of data points, an appreciation not widely seen across data sets.
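To make this concrete, here is a minimal sketch (in Python, with hypothetical column names like purchase_frequency and life_expectancy that are not drawn from any real insurer) of the kind of audit one might run before trusting a proxy: check how complete the data actually is, and how weakly the proxy tracks the thing it is supposed to predict.

```python
# Illustrative only: auditing a data set for completeness and proxy relevance
# before modelling. Column names are hypothetical placeholders.
import numpy as np
import pandas as pd

def audit(df: pd.DataFrame, proxy: str, target: str) -> None:
    # 1. Completeness: how much of each column is actually populated?
    coverage = df.notna().mean()
    print("Coverage per column:\n", coverage.round(2))

    # 2. Proxy relevance: how strongly does the proxy track the target?
    rho = df[proxy].corr(df[target], method="spearman")
    print(f"Spearman correlation between {proxy} and {target}: {rho:.2f}")
    if abs(rho) < 0.3:
        print("Weak relationship: a poor basis for pricing decisions.")

# Toy data: purchases made for a whole household blur the signal for any one
# individual, so the proxy is both incomplete and only loosely related to the target.
rng = np.random.default_rng(0)
n = 1_000
household_purchases = rng.poisson(20, n).astype(float)
household_purchases[rng.random(n) < 0.15] = np.nan   # incomplete records
life_expectancy = 80 + rng.normal(0, 5, n)            # mostly unrelated noise

df = pd.DataFrame({"purchase_frequency": household_purchases,
                   "life_expectancy": life_expectancy})
audit(df, "purchase_frequency", "life_expectancy")
```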
Secondly, when it comes to the algorithms themselves, most of them provide the system with rules for measuring outcomes without actually providing guidance on how to derive those outcomes. A pattern-finder, for example, is trained until it meets a certain threshold of accuracy and precision. Behind the system’s black box, the computer uses neural networks and Bayesian probabilities to define its own rules for determining outcomes, but if all of those rules were displayed to the user, they would be overwhelming. So outcome-deriving algorithms themselves tend to be difficult to piece apart.
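As a rough illustration of that opacity, the sketch below (using scikit-learn, chosen purely for demonstration) trains a small neural network against an accuracy target. The only thing we can easily read off is the score; the "rules" it learned are a few thousand numeric weights.

```python
# A minimal sketch of the black-box problem: we optimise for an accuracy
# threshold, but the learned "rules" are just thousands of weights.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
n_weights = sum(w.size for w in clf.coefs_) + sum(b.size for b in clf.intercepts_)
print(f"Test accuracy: {accuracy:.2f}")    # the outcome we optimise for
print(f"Learned parameters: {n_weights}")  # the "rules" nobody can easily read
```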
Thirdly, algorithms encode an existing world that is full of its own biases. If an algorithm were to find a strong negative pattern between black people in the US and college graduation rates, it would (and already has) wrongly conclude that black people are not fit for college. The algorithm doesn’t account for historical oppression. So how can we depend on algorithms to provide progressively better recommendations? How do we even define what that better world is? Such questions can only be answered by the creativity and courage of human minds, a truth that dashes the promise of a technology-deterministic future.
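A hedged, synthetic illustration of how that replication happens: in the sketch below, two groups have identical underlying ability, but one group’s recorded outcomes are suppressed to mimic historical disadvantage, and a standard classifier faithfully reproduces the gap.

```python
# Synthetic demonstration of bias replication. The outcomes for "group_b" are
# deliberately suppressed to mimic historical disadvantage, and the model
# learns that suppression as if it were truth.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5_000
group = rng.integers(0, 2, n)        # 0 = group_a, 1 = group_b
ability = rng.normal(0, 1, n)        # identical distribution for both groups
# Historical record: same ability, but group_b was systematically denied
# opportunities, so its recorded graduation rate is lower.
graduated = (ability + rng.normal(0, 0.5, n) - 1.0 * group > 0).astype(int)

X = np.column_stack([ability, group])
model = LogisticRegression().fit(X, graduated)
pred = model.predict(X)

for g, name in [(0, "group_a"), (1, "group_b")]:
    print(f"Predicted graduation rate for {name}: {pred[group == g].mean():.2f}")
# The gap in predictions mirrors the gap in the historical record, even though
# underlying ability is identical across groups.
```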
Various startups are developing solutions to these issues, but many institutions and organizations still do not recognize the problems with existing algorithmic solutions. We are seeing communities get locked out of social and political participation purely because of the data involved. We must first and foremost educate our leaders in data and algorithm literacy so that they can question the assumptions and implications of the solutions presented at every turn. Secondly, we must hire diversely, especially in our technology and policy teams, so that we have advocates within our organizations who interrogate the data because it is in their best interest to do so. Finally, we must learn to teach our algorithms how to avoid bias, whether by adding more rules or by using better data sets. Such measures are marginal in impact but form a strong foundation for better solution-crafting.
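As one illustrative example of the “better data sets” route, the sketch below applies a standard reweighing step so that each group-and-outcome combination carries equal influence during training. This is a sketch of one well-known pre-processing technique applied to the synthetic data from the previous example, not the group’s specific prescription, and as noted above its impact on its own is marginal.

```python
# Reweighing: weight each (group, label) cell by expected / observed frequency
# so that the training data no longer encodes the historical gap.
import numpy as np
from sklearn.linear_model import LogisticRegression

def reweighing_weights(group: np.ndarray, label: np.ndarray) -> np.ndarray:
    """Weight each (group, label) cell inversely to its over-representation."""
    weights = np.ones_like(label, dtype=float)
    for g in np.unique(group):
        for y in np.unique(label):
            mask = (group == g) & (label == y)
            expected = (group == g).mean() * (label == y).mean()
            observed = mask.mean()
            if observed > 0:
                weights[mask] = expected / observed
    return weights

# Same synthetic data as the previous sketch.
rng = np.random.default_rng(1)
n = 5_000
group = rng.integers(0, 2, n)
ability = rng.normal(0, 1, n)
graduated = (ability + rng.normal(0, 0.5, n) - 1.0 * group > 0).astype(int)
X = np.column_stack([ability, group])

weights = reweighing_weights(group, graduated)
fair_model = LogisticRegression().fit(X, graduated, sample_weight=weights)
for g, name in [(0, "group_a"), (1, "group_b")]:
    rate = fair_model.predict(X)[group == g].mean()
    print(f"Reweighted predicted rate for {name}: {rate:.2f}")
# The gap between groups should narrow relative to the unweighted model,
# though reweighing alone is only a partial fix.
```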
Technology companies are facing a reckoning today, and they should. Engineering has to work alongside the humanities to develop ethical and progressive solutions that address complex social issues through efficient programmatic means. One cannot do without the other. Those of us who can handle such bimodal, interdisciplinary thinking are in the best position to handle the algorithmic future.
