Data As a Weapon: How Algorithms and Data Destroy Lives
Data is interesting because it can provide answers, clear the path to the future, and resolve the past. But it can also be dangerous, and terrifyingly so. It can be wielded for evil, used to perpetuate injustice, and used to further confuse people. I should know: I spent more than half a decade in grad school trying to use data to prove complex theories. More people are sounding the alarm, but I don’t think enough people are listening. That is why, years ago, when I came across a book by a mathematician, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, I was intrigued. It was also the reminder I needed to talk about it on the blog. Yes, people, sometimes posts on this blog are years in the making.
What makes data so dangerous is that people tend to believe it. How many times have you heard “studies show…” and accepted it without question, without asking how the study was done, what the sample size of the survey was, or what statistical methods were used to analyze it? We automatically assume that data and, by extension, algorithms are neutral. It is exactly this trust that users and creators of the weapons of math destruction, as Cathy O’Neil calls them, depend upon. To be clear, not all data is awful, and we must have some level of trust in our scientists and social scientists. That trust is integral to our social fabric. More specifically, however, O’Neil draws our attention to the algorithms of this age and how they further divide us and drive inequality among us. Who better to talk about this? She is a mathematician, former professor, and former Wall Street quant.
Increasingly, the decisions that affect our lives (where to go to school, whether we get a car loan, how much we pay for health insurance) are being made not by humans, but by mathematical models. It is such models she calls weapons of math destruction, or WMDs. People building WMDs routinely lack data for the behaviors they are most interested in. So, they substitute stand-in data, or proxies. For instance, they may draw statistical correlations between a person’s zip code or language patterns and her potential to pay back a loan or handle a job, and then make inferences with dire consequences from this. Now, I have to say there is a reason we have statistical significance: it helps us reduce the likelihood that a result from our analysis is due to mere chance. But even this has been so abused that it has greatly reduced our ability to think. An arbitrary threshold, a p-value of 0.05 (which researchers will tell you is the holy grail when publishing), should not be the basis on which life-altering decisions are made. We now spend more of our time and resources on statistical software and less on actually thinking.
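To make the point about the 0.05 threshold concrete, here is a minimal sketch of my own (it is not from O’Neil’s book). It assumes the tests are independent and that there is no real effect at all, and the function name is just an illustration. Under those assumptions, the chance that at least one test comes out “significant” purely by luck grows quickly with the number of things you test.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    on pure noise lands below the significance threshold `alpha`."""
    # Each test avoids a false positive with probability (1 - alpha),
    # so all of them avoid it with probability (1 - alpha) ** num_tests.
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20, 100):
    print(f"{k:>3} tests: {chance_of_false_positive(k):.1%}")
```

With 20 unrelated variables, the odds of finding at least one “significant” result in pure noise are already well above even. A modeler hunting through zip codes, language patterns, and other proxies will almost certainly find something that clears the bar.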
Models are so often simplified (even as the statistical analysis and software grow more complex) that it’s incredibly difficult to include the real world’s nuance and complexity, and rarely do the humans behind them account for the information that gets left out. The truth is, when creating models, we have to make choices about what’s important to include, what we have data for, and what data can even be collected. Because of this, we ultimately simplify things. And every single one of us data analysts, data scientists, social scientists, scientists, statisticians, etc. is guilty of this.
To demonstrate the havoc WMDs can wreak, O’Neil takes us through real-life examples in education, job searches, universities, the criminal justice system, the 2008 financial crisis, and elections. For instance, in Washington, D.C., algorithms were used to wrongfully fire brilliant and engaged teachers. The district used a so-called “value-added model” that evaluated teachers based on students’ test scores and completely ignored how teachers engage students, work on specific skills, deal with classroom management, or even help students with personal and family problems. It also ignored students’ own personal and familial problems, and the fact that these students exist within systems and structures that can be harmful.
In the criminal justice system, she cites the example of models used to predict recidivism, such as the LSI-R, which includes a questionnaire for prisoners to fill out. It asks questions about their lives that inmates from more privileged backgrounds would answer differently from someone from tough inner-city streets. For instance, the answer to “the first time you were ever involved with police” would be different for a white boy from a wealthy suburban Connecticut neighborhood than for a Black man whose blackness is already seen as harmful. The model collates answers to questions like this and uses them to determine recidivism risk and other important consequences like prison terms. Yet, in 2013, the New York Civil Liberties Union found that although Black and Latino males between the ages of 14 and 24 made up only 4.7 percent of the city’s population, they accounted for 40.6 percent of the stop-and-frisk checks by police. More than 90 percent of those stopped were innocent. As for those who weren’t, perhaps because they were drinking underage or carrying a joint, know that while they got in trouble for it, rich white kids didn’t. So, if early “involvement” with the police signals recidivism, we already know who will seem riskier: poor people, as well as Black and Latino people. This is the fallout from just one such question. Others carry an even greater burden, and yet these are used to build a system that accounts for recidivism. After answering these questions, convicts are categorized as high, medium, or low risk. This is not just, but more importantly, this is not fair.
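To put the stop-and-frisk numbers above in perspective, here is a small back-of-the-envelope sketch. The two percentages are the ones quoted from the NYCLU; the function name and the idea of a “representation ratio” are my own illustration, not something taken from the book or from the LSI-R.

```python
def representation_ratio(share_of_outcome: float, share_of_population: float) -> float:
    """How many times more often a group shows up in an outcome
    than its share of the population alone would predict."""
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_outcome / share_of_population


# Figures quoted in the paragraph above (2013 NYCLU stop-and-frisk numbers).
population_share = 0.047  # Black and Latino males aged 14 to 24
stop_share = 0.406        # their share of stop-and-frisk checks

ratio = representation_ratio(stop_share, population_share)
print(f"Stopped about {ratio:.1f} times as often as population share would predict")
```

Any score that counts early “involvement” with police as a risk factor inherits that imbalance before a single crime has been committed.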
And what about our elections? Talking about the role of Facebook and its mad algorithms in our elections (aka undermining our democracy) would make for an entire post in itself. Facebook has access to the data of billions of people and can use (and HAS used) that information to influence people’s actions, especially in voting. That is too much power. But Facebook is not the only culprit. Google, Apple, Microsoft, and Amazon all have tremendous power and information on much of humanity, and they can steer us however they choose.
The above examples are just a few of the social issues plaguing our society. The role of math in unleashing the egregious financial crisis of 2008, for instance, is far more staggering.
What makes something a WMD? O’Neil lists three elements: opacity, scale, and damage. And that’s what the examples above all have in common. They lack transparency; they are used on a massive scale; the damage they cause is terrifying.
Despite a reputation for impartiality and objectivity, these models are hardly either. They reflect goals and ideologies. It’s human nature. Our own values and desires influence things like the data we choose to collect, the questions we ask, and our opinions about a great deal else. Why won’t we admit that? Why won’t we then approach these things with a lot more humility? Instead, formulas are deliberately wielded to impress rather than clarify, a continuing problem in academia.
Maybe not all WMDs are universally awful or damaging. Sure, they get some people into Harvard, help others land great jobs, and maybe even get some felons a reduced sentence. But as O’Neil rightly points out, the point is not that some people benefit; it’s that so many suffer. These algorithms have especially been used to punish poor people, because WMDs specialize in bulk: they are cheap and process large numbers of people. While the privileged are processed by people, the masses are processed by machines. For instance, a white-shoe law firm or an exclusive prep school will likely lean more on recommendations than a fast-food chain or a cash-strapped urban school district will. Mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty, or education, and then use that intelligence to punish or reject them. In an ideal world, we would use these models to reach out to those people and help them with the resources they need. Yet what these models care about is efficiency and profitability. It is why retailers use scheduling software (a WMD) to make staff schedules, creating wild and volatile shifts that leave single moms and poorer people struggling, and their children acting out or failing at school from neglect. And then that kid’s teacher is fired when their students don’t test well. One thing O’Neil does well in her book is show us how this cycle of harm is perpetuated.
“They feed each other. Poor people are more likely to have bad credit, live in high crime neighborhoods, surrounded by other poor people. Once the dark universe of WMDs digests that data, it showers them with predatory ads for subprime loans or for-profit schools. Sends police to arrest them, when they are convicted, send them to longer terms. This data feeds into other WMDs which score the same people as high risks or easy target and then block them from jobs, while jacking up their rates for mortgages, car loans, and every kind of insurance imaginable. This drives their credit rating down further, creating nothing less than a death spiral of modeling. Being poor in a world of WMDs is getting more and more dangerous and expensive.” — Cathy O’Neil
The very thing being used to inflict harm can be steered to help people. How about we use these to search for who would most benefit from affordable housing and help them? Unfortunately, the poor are also disenfranchised politically. Politicians don’t even bother with antipoverty strategies anymore. People just think poverty is a disease and they would rather quarantine it away from the middle class. But the poor are not the only victims of WMDs.
Going forward, behavioral data will feed into AI systems that will remain black boxes to us. Their variables will be mysteries, and yet these programs will determine how we are treated by other machines. They will choose our ads for us, set prices for us, choose our candidates for us, and so, so much more.
Modelers and big companies (big tech, especially) need to take more responsibility for what they create. More importantly, policy makers need to step in and regulate their use. But we too can’t be left out. We must become more engaged and curious about the models that control our lives. We must ask questions, seek the truth, and demand change.
Originally published at http://www.themoderncedar.com.