Tomorrow, the Senate Judiciary Committee’s Subcommittee on the Constitution will hold a hearing on Google’s alleged anti-conservative bias and “censorship.” In a video released last month, James O’Keefe, a conservative activist, interviews an unnamed Google insider. The film, which has been widely shared by conservative outlets and cited by Sen. Ted Cruz (R-TX) and President Donald Trump, stitches a narrative of Orwellian, politically motivated algorithmic bias out of contextless hidden camera footage, anodyne efforts to improve search results, and presumed links between unrelated products. Although the film’s claims are misleading and its findings unconvincing, they are taken seriously by lawmakers who risk using such claims to justify needless legislation and regulation. As such, they are worth engaging (the time stamps throughout this post refer to the Project Veritas video that can be viewed here).
Search algorithms use predefined processes to sift through the universe of available data to locate specific pieces of information. Simply put, they sort information in response to queries, surfacing whatever seems most relevant according to their preset rules. Algorithms that make use of artificial intelligence and machine learning draw upon past inputs to increase the accuracy of their results over time. These technologies have been adopted to improve the efficacy of search, particularly in relation to the gulf between how users are expected to input search queries and the language they actually use to do so. They are likely to be adopted only to the extent that they improve the user’s search experience. When someone searches for something on Google, it is in the interest of both Google and the user for Google to return the most pertinent and useful results.
Board game enthusiasts, economics students, and those taking part in furious public policy debates over dinner all may have reasons to search for “Monopoly.” The company that makes it easiest for such a diverse group of people to find what they’re looking for will enjoy more traffic and profit than its competitors. Search histories, location, trends, and additional search terms (e.g. “board game,” “antitrust”) help yield more tailored, helpful results.
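To make the “Monopoly” example concrete, here is a minimal sketch of how additional terms could reorder results for an ambiguous query. It is a toy illustration, not a description of how Google’s search works: the three documents, the `rank` function, and the scoring weights are all hypothetical and chosen only to show the idea.

```python
# Hypothetical toy index: each document is a title plus a set of descriptive words.
DOCUMENTS = {
    "Monopoly board game rules": {"monopoly", "board", "game", "rules", "dice"},
    "Monopoly and antitrust law": {"monopoly", "antitrust", "law", "competition"},
    "Monopoly pricing in economics": {"monopoly", "economics", "pricing", "market"},
}


def rank(query, context=()):
    """Order document titles by how many query and context terms they share."""
    query_terms = set(query.lower().split())
    context_terms = {term.lower() for term in context}
    scored = []
    for title, words in DOCUMENTS.items():
        # Assumed weighting: a query match counts double, a context match once.
        score = 2 * len(query_terms & words) + len(context_terms & words)
        scored.append((score, title))
    # Highest score first; ties fall back to alphabetical order by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


if __name__ == "__main__":
    print(rank("Monopoly"))
    print(rank("Monopoly", context=["board", "game"]))
    print(rank("Monopoly", context=["antitrust"]))
```

With no context, every document matches the query equally and the order is arbitrary (here, alphabetical). Adding “board game” or “antitrust” moves the relevant document to the top, which is the sense in which extra signals yield more tailored results.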
Project Veritas’ film is intended to give credence to the conservative concern that culturally liberal tech firms develop their products to exclude and suppress the political right. While largely anecdotal, this concern has spurred hearings and regulatory proposals. Sen. Josh Hawley (R-MO) recently introduced legislation that would require social media companies to prove their political neutrality in order to receive immunity from liability for their users’ speech. Last week, President Trump hosted a social media summit featuring prominent conservative activists and conspiracy theorists who claim to have run afoul of politically biased platform rules.
The film begins by focusing on Google’s efforts to promote fairer algorithms, which are treated as attempts to introduce political bias into search results. The insider claims that while working at Google, he found “a machine learning algorithm called ML fairness, ML standing for machine learning, and fairness meaning whatever they want to define as fair.” (6:34) The implication is that Google employees actively take steps to ensure that Google search results yield anti-conservative content rather than what a neutral search algorithm would return. Unfortunately, what a “neutral” algorithm would look like is not discussed.
Although we’re living in the midst of a new tech-panic, we should remember that questions about bias in machine learning and attempts to answer them are not new, nor are they merely a concern of the right. Rep. Alexandria Ocasio-Cortez (D-NY) and the International Committee of the Fourth International have expressed concerns about algorithmic bias. Adequate or correct representation is subjective, and increasingly a political subject. In 2017, the World Socialist Web Site sent a letter to Google, bemoaning the tech giant’s “anti-left bias” and claiming that “Google is ‘disappearing’ the WSWS from the results of search requests.”
However, despite the breathlessness with which O’Keefe “exposes” Google’s efforts to reduce bias in its algorithms, he doesn’t bring us much new information. The documents he presents alongside contextless hidden camera clips of Google employees fail to paint a picture of fairness in machine learning run amok.
One of the key problems with O’Keefe’s video is that he creates a false dichotomy between pure, user-created signals and machine learning inputs that have been curated to eliminate eventual output bias. The unnamed insider claims that attempts to rectify algorithmic bias are equivalent to vandalism: “because that source of truth (organic user input) has been vandalized, the output of the algorithm is also reflecting that vandalism” (8:14).
But there is little reason to presumptively expect organic data to generate more “truthful” or “correct” outputs than training data that has been curated in some fashion. Algorithms sort and classify data, rendering raw input useful. Part of tuning any given machine learning algorithm is providing it with training data, looking at its output, and then comparing that output to what we already know to be true.
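The tuning loop described above, in which a model’s output is compared with what we already know to be true, can be sketched in a few lines. This is an illustrative assumption rather than anything shown in the video or published by Google: `keyword_classifier` is a hypothetical stand-in for a trained model, and the four labeled examples are invented for the purpose.

```python
def keyword_classifier(text):
    """Hypothetical stand-in for a trained model: labels text by simple keywords."""
    lowered = text.lower()
    return "game" if "dice" in lowered or "board" in lowered else "economics"


# Invented examples whose correct labels are already known.
LABELED_EXAMPLES = [
    ("Roll the dice and pass Go", "game"),
    ("Best board games for families", "game"),
    ("Antitrust regulators review the merger", "economics"),
    ("Monopoly tokens through the years", "game"),
]


def evaluate(predict, labeled_examples):
    """Return the share of correct predictions and the examples that were missed."""
    mistakes = [
        (text, label, predict(text))
        for text, label in labeled_examples
        if predict(text) != label
    ]
    accuracy = 1 - len(mistakes) / len(labeled_examples)
    return accuracy, mistakes


if __name__ == "__main__":
    accuracy, mistakes = evaluate(keyword_classifier, LABELED_EXAMPLES)
    print(f"accuracy: {accuracy:.2f}")
    for text, expected, got in mistakes:
        print(f"expected {expected!r} but got {got!r} for: {text}")
```

Here the stand-in model mislabels the last example because it contains neither keyword. Spotting that kind of miss and then adjusting the model or its training data is what tuning amounts to, whether the inputs are organic or curated.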