The Persistent Problem of the Fair Algorithm

[Image: photograph of a keyboard and screen displaying code. "Source code security plugin" by Christiaan Colen, licensed under CC BY-SA 2.0 (via Flickr).]

At first glance, the mechanical procedures we use to accomplish such mundane tasks as loan approval, medical triage, actuarial assessment, and employment screening might appear innocuous. Designing algorithms to process large volumes of data and transform many individual data points into a single output offers great power to streamline necessary but burdensome work. Algorithms advise us about how we should read the data and how we should respond. In some cases, they even decide the matter for us.

It isn’t simply that these automated processes are more efficient than humans at performing these computations (emphasizing the relevant data points, removing statistical outliers and anomalies, and weighing competing concerns). Algorithms also hold the promise of removing human error from the equation. A recent study, for example, has identified a tendency for judges on parole boards to become less and less lenient in their rulings as the day wears on. By removing extraneous factors like these from the decision-making process, an algorithm might be better positioned to deliver justice.

Similarly, another study established the general superiority of mechanical prediction to clinical prediction in various settings from medicine to mental health to education. Clinical predictions were most notably outperformed when a clinical interview was conducted. These findings reinforce the position that algorithms should augment or replace human decision-making, which is often plagued by prejudice and swayed by sentiment.

Despite their great promise, algorithms raise a number of concerns. Chief among these are problems of bias and transparency. Algorithms are often seen as neutral arbiters, free from bias and capable of combating long-standing inequalities such as the gender pay gap or unequal sentencing for minority offenders. But automated tools can just as easily preserve and fortify existing inequalities when introduced to an already discriminatory system. Algorithms used to set bond amounts and inform sentencing have underestimated the risk of white defendants while overestimating that of Black defendants. Popular image-recognition software reflects significant gender bias. Such processes mirror and thus reinforce extant social bias. The algorithm simply tracks, learns, and then reproduces the patterns that it sees.
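
One way to make the claim about over- and underestimated risk concrete is to compare error rates across groups. The following is only a minimal sketch: the group labels, the records, and the function name are invented for illustration and are not drawn from any real risk-assessment tool. It counts how often each group is wrongly flagged as high risk (false positives) and wrongly cleared (false negatives).

```python
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, predicted_high_risk, reoffended) tuples."""
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, predicted, actual in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # risk underestimated
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # risk overestimated
    return {
        g: {
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else None,
        }
        for g, c in counts.items()
    }

# Invented toy data: (group, predicted high risk?, actually reoffended?)
TOY_RECORDS = [
    ("A", True, False), ("A", True, False), ("A", False, False), ("A", False, False),
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", False, True),
    ("B", True, False), ("B", False, False), ("B", False, False), ("B", False, False),
    ("B", True, True), ("B", True, True), ("B", False, True), ("B", False, True),
]

rates = error_rates_by_group(TOY_RECORDS)
for group, r in sorted(rates.items()):
    print(group, r)
```

In this toy data, group A is wrongly flagged twice as often as group B, while group B is wrongly cleared twice as often as group A, even though both groups reoffend at the same rate.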

Bias can be the result of a non-representative sample, one that is too small or too homogeneous. But bias can also be the consequence of the kind of data that the algorithm draws on to make its inferences. While discrimination laws are designed to restrict the use of protected categories like age, race, sex, or ability status, an algorithm might learn to use a proxy, like zip codes, that produces equally skewed outcomes.
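
The proxy problem can be illustrated with a small, entirely hypothetical example. In the sketch below, the decision rule never sees an applicant's group; it looks only at zip code. The zip codes, group labels, and the rule itself are assumptions made up for this illustration. Because group membership and zip code are correlated in the toy data, the outcomes are skewed anyway.

```python
from collections import defaultdict

# Hypothetical rule: decline any applicant from a zip code flagged as high risk.
HIGH_RISK_ZIPS = {"22222"}

def approve(applicant):
    # The protected attribute ("group") is never consulted here.
    return applicant["zip"] not in HIGH_RISK_ZIPS

def approval_rate_by_group(applicants):
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for a in applicants:
        totals[a["group"]] += 1
        approvals[a["group"]] += approve(a)  # True counts as 1
    return {g: approvals[g] / totals[g] for g in totals}

# Invented toy data: group X lives mostly in 11111, group Y mostly in 22222.
APPLICANTS = (
    [{"group": "X", "zip": "11111"}] * 4 + [{"group": "X", "zip": "22222"}]
    + [{"group": "Y", "zip": "11111"}] + [{"group": "Y", "zip": "22222"}] * 4
)

proxy_rates = approval_rate_by_group(APPLICANTS)
print(proxy_rates)
```

Removing the protected category from the inputs does not help here, since the zip code carries much of the same information.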

Similarly, predictive policing — which uses algorithms to predict where a crime is likely to occur and determine how to best deploy police resources — has been criticized as “enabl[ing], or even justify[ing], a high-tech version of racial profiling.” Predictive policing creates risk profiles for individuals on the basis of age, employment history, and social affiliations, but it also creates risk profiles for locations. Feeding the algorithm information which is itself race- and class-based creates a self-fulfilling prophecy whereby continued investigation of Black citizens in urban areas leads to a disproportionate number of arrests. A related worry is that tying police patrol to areas with the highest incidence of reported crime grants less police protection to neighborhoods with large immigrant populations, as foreign-born citizens and non-US citizens are less likely to report crimes.
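
The self-fulfilling dynamic described above can be sketched with a deliberately crude simulation. Everything in it is an assumption made for illustration: two districts with identical underlying crime rates, a winner-take-all rule that sends each round's patrol to the district with the most recorded crime, and a fixed number of incidents recorded per patrol. Real deployment systems are far more complex; the point is only to show how a small initial difference in the records can compound.

```python
def simulate_patrols(recorded, true_rate, rounds, detections_per_patrol=5):
    """Send each round's patrol to the district with the most recorded crime.

    Only the patrolled district adds to its record, so the record reflects
    where police looked rather than where crime actually occurred.
    """
    recorded = dict(recorded)  # leave the caller's data untouched
    for _ in range(rounds):
        target = max(recorded, key=recorded.get)
        recorded[target] += detections_per_patrol * true_rate[target]
    return recorded

# Hypothetical districts with the SAME underlying crime rate.
TRUE_RATE = {"north": 1.0, "south": 1.0}
# The historical record starts out only slightly uneven.
INITIAL_RECORD = {"north": 11, "south": 10}

final_record = simulate_patrols(INITIAL_RECORD, TRUE_RATE, rounds=10)
print(final_record)
```

Under these assumptions, the district that began with one extra recorded incident receives every patrol, and its record grows while the other district's record never changes.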

These concerns of discrimination and bias are further complicated by issues of transparency. The very function the algorithm was meant to serve — computing multiple variables in a way that surpasses human ability — inhibits oversight. It is the algorithm itself which determines how best to model the data and what weights to attach to which factors. The complexity of the computation as well as the use of unsupervised learning — where the algorithm processes data autonomously, as opposed to receiving labelled inputs from a designer — may mean that the human operator cannot parse the algorithm’s rationale and that it will always remain opaque. Given the impenetrable nature of the decision-mechanism, it will be difficult to determine when predictions objectionably rely on group affiliation to render verdicts and who should be accountable when they do.

Related to, but separate from, concerns of oversight are questions of justification: What are we owed in terms of an explanation when we are denied bail, declined for a loan, refused admission to a university, or passed over for a job interview? How much must an algorithm’s owner be able to say to justify the algorithm’s decision, and what do we have a right to know? One suggestion is that individuals are owed “counterfactual explanations” which highlight the relevant data points that led to the determination and offer ways in which one might change the decision. While such an explanation would offer recourse, it would not reveal the relative weights the algorithm places on the data, nor would it justify which data points the algorithm considers relevant.
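
To see what a counterfactual explanation might look like in practice, consider a minimal sketch built on a hypothetical linear loan-scoring rule. The feature names, weights, and threshold are all invented for illustration. For a declined applicant, the function reports the value each feature would need to reach, on its own, for the decision to change.

```python
# Hypothetical scoring rule; the weights and threshold are assumptions.
WEIGHTS = {"income_k": 0.5, "years_employed": 2.0, "open_debts": -3.0}
THRESHOLD = 40.0

def score(applicant):
    return sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)

def approved(applicant):
    return score(applicant) >= THRESHOLD

def counterfactuals(applicant):
    """For each feature, the value that alone would just reach the threshold."""
    gap = THRESHOLD - score(applicant)
    if gap <= 0:
        return {}  # already approved, nothing to explain
    options = {}
    for feature, weight in WEIGHTS.items():
        needed = applicant[feature] + gap / weight
        if needed >= 0:  # skip impossible values such as negative debts
            options[feature] = needed
    return options

applicant = {"income_k": 50, "years_employed": 3, "open_debts": 2}
explanation = counterfactuals(applicant)
print(approved(applicant), explanation)
```

The applicant learns, for example, that an income of 80 (in thousands) would have changed the outcome, but the explanation says nothing about how heavily each factor is weighted or why these factors were chosen in the first place.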

These problems concerning discrimination and transparency share a common root. At bottom, there is no mechanical procedure which would generate an objective standard of fairness. Invariably, the determination of that standard will require the deliberate assignment of different weights to competing moral values: What does it mean to treat like cases alike? Should group membership determine one’s treatment? How should we balance public good and individual privacy? Public safety and discrimination? Utility and individual rights?

In the end, our use of algorithms cannot sidestep the task of defining fairness. Algorithms cannot resolve these difficult questions, and they are no surrogate for public discourse and debate.