Biases and discrimination through algorithms

Introduction
    Artificial intelligence is reshaping people’s lives. Algorithms transform people’s online habits, shopping histories, GPS location data, and other digital footprints into scores and predictions about individuals, and those scores and forecasts in turn influence the decisions made about them. At the same time, discrimination and bias in algorithms have become a significant problem for the fairness of society. Automated decision-making systems built around algorithms are increasingly used in applications such as loan evaluation and insurance assessment. In high-stakes activities such as credit assessment, criminal risk assessment, and employment screening, the output of an AI system can influence or even determine loan amounts, sentencing choices, and hiring decisions, with severe consequences for people’s lives. Algorithms are therefore closely tied to our lives, and algorithmic bias is a problem we must face squarely.
    This paper argues that algorithms are not entirely objective, value-neutral techniques. Algorithmic bias comes both from the personal biases of the people who write algorithms and from bias in the data. The algorithmic black box exacerbates this bias and discrimination. Because the public has an innate trust in algorithmic results, algorithms have become ever more popular and widely used, which allows them to steer users and shape both autonomous decision-making and biased perceptions across society. This paper uses the crime risk assessment system COMPAS and the Google search engine as examples to illustrate the opaqueness of algorithms and the negative effects this opaqueness has on users. People need to become aware of algorithmic bias and discrimination so that they can attend to the fairness and transparency of algorithms and thereby protect the public’s equal rights in society.


The algorithm as a black box
    Algorithms are opaque, and their computational processes resemble a black box: people do not know precisely how the data in an algorithm is used or how its results are produced. As Pasquale notes, algorithms are marked by secrecy and complexity, and deconstructing the black box of big data is not easy (Pasquale, 2015). To individuals, an algorithm is a mysterious process; they see only the regular relationship between input and output but cannot understand the internal structure. They cannot follow the specific analytical processing the algorithm performs and do not know how it arrives at its results. Moreover, as a means by which humans capture, process, and output data with machines, algorithms are subject to their writers’ subjective and technical biases. At the same time, because algorithm writing is highly specialized, users cannot know the details of how their information and data are captured, processed, and applied, which affects people’s rights and the social order. Although algorithms are written by humans, their deep-learning components form uncharted territory in which ordinary people cannot participate. In the era of artificial intelligence, algorithms are applied to every aspect of human life, and the drawbacks of algorithmic black boxes, which are opaque, inaccurate, unfair, and difficult to review, are gradually emerging in society. In the commercial sphere, for example, the unclear and unexplained nature of algorithmic black boxes inevitably produces information inequality between developers and consumers; in personal credit, the discrimination and bias that black boxes introduce can manipulate an individual’s credit score.
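    A minimal sketch, with entirely hypothetical names and weights, of what this input and output view looks like from the user’s side: only the final score leaves the box, while the weights, features, and training data stay hidden.

```python
# Hypothetical illustration of a "black box" scoring service: the caller
# observes only inputs and outputs, never the model's internals.

class OpaqueScoringService:
    """Stand-in for a proprietary scoring model. Internals are hidden."""

    def __init__(self):
        # In a real system this would be a trained model whose weights,
        # features, and training data are not disclosed to users.
        self._secret_weights = {"income": 0.4, "zip_code": 0.6}

    def predict(self, applicant: dict) -> float:
        # Only the final score leaves the box; no explanation accompanies it.
        return sum(self._secret_weights.get(k, 0.0) * v
                   for k, v in applicant.items())

service = OpaqueScoringService()
score = service.predict({"income": 0.7, "zip_code": 0.2})
print(f"Score: {score:.2f}")  # The user sees the number, not the reasoning.
```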


The algorithmic black box triggers a bias crisis
    Algorithm writers are human beings. Their values, culture, the social system they live in, and external demands all influence the algorithms they ultimately write, creating pre-existing bias. This pre-existing bias comes both from the personal biases of the algorithm writer and from bias in the data. The writer’s personal bias and the demands of clients can both lead to algorithmic bias. For example, a search engine can only serve the pages it has indexed; websites outside the engine’s index base simply do not appear in a user’s search results. Web robots crawl the sites in the index along paths pre-defined by the technicians who write the algorithms, and those technicians choose whether to include a site in the index according to the company’s needs. Search engine platforms also use “bidding ranking,” helping advertisers use algorithms that operate inside a black box to produce skewed results that, to a certain extent, distort users’ choices.
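    A hypothetical sketch of the two mechanisms just described: a crawler that only indexes sites on a pre-defined list, and a ranking rule that blends relevance with advertiser bids. The domains, scores, and weights below are invented for illustration and do not describe any real search engine.

```python
# (1) Sites outside a pre-defined allowlist never enter the index, so they
#     are invisible to users no matter how relevant they are.
# (2) Final ranking mixes topical relevance with paid bids, which can
#     quietly reorder results.

CRAWL_ALLOWLIST = {"site-a.example", "site-b.example"}  # chosen by engineers

def build_index(discovered_sites):
    # Only allowlisted domains make it into the searchable index.
    return [s for s in discovered_sites if s["domain"] in CRAWL_ALLOWLIST]

def rank(results, bids, bid_weight=0.5):
    # Order results by relevance plus a boost proportional to the bid.
    return sorted(
        results,
        key=lambda r: r["relevance"] + bid_weight * bids.get(r["domain"], 0.0),
        reverse=True,
    )

discovered = [
    {"domain": "site-a.example", "relevance": 0.6},
    {"domain": "site-b.example", "relevance": 0.9},
    {"domain": "site-c.example", "relevance": 0.95},  # never indexed
]
index = build_index(discovered)
print([r["domain"] for r in rank(index, bids={"site-a.example": 0.8})])
# ['site-a.example', 'site-b.example']: the paid site outranks a more
# relevant one, and the most relevant site is missing entirely.
```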
    Bias from data can also lead to algorithmic bias. Data must remain valid over time: if a database is not monitored in real time and updated regularly, the algorithm cannot guarantee the accuracy of the results it computes from that data. For example, if the personal information in a credit system’s database is not updated frequently enough, the data become inaccurate, and the algorithm’s assessment of an individual’s creditworthiness will differ from the actual situation, affecting the success rate of personal loan applications. Likewise, if the database contains errors, or if sensitive data are shaped by social institutions and cultural values during collection and integration, the algorithm will still arrive at a discriminatory result. In the United States, for example, credit scores for people of color are often lower than for whites (Leonhardt, 2021). That is not necessarily a reflection of individual creditworthiness; rather, algorithm writers subjectively use factors such as occupation and income as criteria for evaluating a borrower’s ability to repay, ignoring the social conditions behind such data.
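    The staleness problem can be made concrete with a toy scoring rule. Everything in the sketch below, both the formula and the figures, is hypothetical; it only shows how an out-of-date record can push the same person below a lending threshold.

```python
# Toy illustration (all numbers hypothetical) of how stale data changes
# the outcome of the same scoring rule.

def credit_score(record):
    # Simplified linear rule; real scoring models are far more complex.
    return 300 + 0.005 * record["annual_income"] - 50 * record["missed_payments"]

stale_record   = {"annual_income": 30_000, "missed_payments": 2}  # years out of date
current_record = {"annual_income": 55_000, "missed_payments": 0}  # today's reality

print(credit_score(stale_record))    # 350.0 -> likely denied
print(credit_score(current_record))  # 575.0 -> likely approved
```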


COMPAS, carrying the inherent biases
    The values of algorithm-driven AI are not entirely neutral; they always carry the inherent biases of human society. “This unequal society is rife with racism and sexism, so even when computer programmers have good intentions, algorithms can be biased and discriminatory,” Joy Lisi Rankin writes. “They simply reflect and amplify the larger biases in the world” (Rankin, 2018). COMPAS illustrates this kind of bias. COMPAS is risk assessment software widely used in the U.S. to guide sentencing by predicting the likelihood that an offender will re-offend. According to a May 2016 report by the U.S. news organization ProPublica, the COMPAS algorithm is biased. In predicting who would commit another crime, the algorithm made errors at roughly the same overall rate for black and white defendants, but in very different ways. The system rated black defendants as having a much higher risk of recidivism than whites, up to twice as high, and this did not match reality. Black defendants who did not re-offend within two years were nearly twice as likely as whites to be incorrectly classified as high risk (45% vs. 23%), while white defendants who did re-offend within the next two years were similarly almost twice as likely as black defendants to be incorrectly classified as low risk (48% vs. 28%) (Spielkamp, 2017). In other words, the algorithm had a high probability of incorrectly labeling black defendants as future offenders, exposing them to unfair determinations and treatment. The example also shows that crime data carry racial profiling that targets people of color. Algorithmic bias not only undermines the notion of machine neutrality but also reproduces existing inequalities in society.
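    To make the asymmetry in these numbers concrete, the short sketch below computes false-positive and false-negative rates from a confusion matrix. The per-group counts are hypothetical, chosen only so that the resulting rates mirror the pattern reported above; they are not ProPublica’s underlying data.

```python
# How two groups can face similar overall error yet very different
# false-positive and false-negative rates.

def false_positive_rate(fp, tn):
    # Share of people who did NOT re-offend but were labeled high risk.
    return fp / (fp + tn)

def false_negative_rate(fn, tp):
    # Share of people who DID re-offend but were labeled low risk.
    return fn / (fn + tp)

# Hypothetical counts per group: (fp, tn, fn, tp)
groups = {
    "Group A": (45, 55, 28, 72),  # high FPR, lower FNR
    "Group B": (23, 77, 48, 52),  # low FPR, higher FNR
}

for name, (fp, tn, fn, tp) in groups.items():
    print(name,
          f"FPR={false_positive_rate(fp, tn):.0%}",
          f"FNR={false_negative_rate(fn, tp):.0%}")
# Group A FPR=45% FNR=28%
# Group B FPR=23% FNR=48%
```

    Whether “fairness” means equal false-positive rates, equal false-negative rates, or equal overall accuracy is itself a design choice, which is part of why the dispute over COMPAS has been so persistent.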
    At the same time, we do not know the source of COMPAS’s data or the details of its algorithmic process; we do not even know the data sources of the model. The technology provider refuses to disclose the algorithm’s logic on the grounds that it is a trade secret. Algorithms are treated as technical intellectual property: secret, valuable, and practical. Most companies protect the algorithms they develop as trade secrets, and this protection is legally justified, but non-disclosure also becomes the primary way for developers to avoid liability when their algorithms infringe on users’ rights. Even if a developer agrees to show how the algorithm operates, its highly technical nature means ordinary people cannot fully understand it and must still rely on the developer’s own explanation. That makes it difficult for ordinary people to detect bias in the algorithm.

Inherent trust in modern technologies
    In the era of big data, the public has an inherent trust in algorithmic results. As Siva Vaidhyanathan observes, we often hold a “trust bias” toward modern technologies such as Google’s search engine. People tend to believe that decisions made by machines on the basis of data and algorithms can largely overcome the human biases that stem from cognitive limitations and subjective arbitrariness, and can therefore produce objective, accurate, and fair results. Fred Benenson calls this cult of data “mathwashing”: the use of algorithms, models, machine learning, and other mathematical methods to make reality appear more objective than it is (Allo, 2018). People believe algorithmic decisions are fair because mathematics stands for objectivity and scientific rigor, and because the data are drawn from people’s past activities, such as click histories and favorited content. These two features are central to why people trust the technology. That trust is also why artificial intelligence products such as TikTok’s personalized recommendation system and the crime risk assessment system COMPAS are so widely deployed today.


Google, strengthening prejudice and discrimination
    People’s trust in and reliance on algorithms makes them more widely used in society, which further strengthens prejudice and discrimination in the community, and algorithms in turn exert influence on people’s judgment. For example, Google’s search results emphasize the mainstream information of a particular location because that is what will be most acceptable there: Google builds the ordered list of results by analyzing users’ behavioral data. This kind of “filter bubble” search algorithm pleases people’s individualistic perceptions on the one hand and, on the other, reinforces our pre-existing views on specific issues while excluding views and issues that feel “alien,” which ultimately strengthens hidden discrimination in society. Google’s algorithm spreads the prevailing view by analyzing what the mainstream favors, and its answers change based on the user’s location. Those answers are not the truth but the opinion most people will agree with. The search engine acts as a gatekeeper over what knowledge individuals can access and what counts as “truth” (Vaidhyanathan, 2011). Although the ordering of results follows the mainstream view, that view can embed many personal biases. Google’s answers can reinforce prevailing attitudes toward vulnerable groups and lead people to make unfair decisions, and at the same time the algorithm prevents us from escaping prejudices and ideologies. When we search for information about a person, if the algorithm suggests unflattering search terms, users who do not know that person may be led in a particular direction as they search. The algorithm thus influences users’ decision-making behavior: while catering to users’ needs, Google also reinforces their inherent biases.
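    As a rough sketch of this personalization loop, the example below (with invented data) boosts results the user has clicked before, so the viewpoint a user already prefers keeps outranking a more relevant alternative. It illustrates the filter-bubble dynamic in miniature and is not Google’s actual ranking algorithm.

```python
# Hypothetical personalization: results matching the user's past clicks are
# boosted, so pre-existing views keep rising to the top.

def personalize(results, click_history, boost=0.5):
    # Boost each result by how often the user clicked that viewpoint before.
    return sorted(
        results,
        key=lambda r: r["relevance"] + boost * click_history.get(r["viewpoint"], 0),
        reverse=True,
    )

results = [
    {"title": "Perspective X", "viewpoint": "x", "relevance": 0.7},
    {"title": "Perspective Y", "viewpoint": "y", "relevance": 0.8},
]
history = {"x": 3}  # the user has repeatedly clicked viewpoint "x" before

for r in personalize(results, history):
    print(r["title"])
# Perspective X now outranks the more relevant Perspective Y.
```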


Conclusion
    In this context, we should not confine ourselves to the idea of a technological utopia built on algorithms. Algorithmic bias affects the criminal justice system, social justice, public policy, and more, so we need to value the fairness and transparency of algorithms. Many people around us are not yet aware of algorithmic bias, and it is essential to show them how algorithmic biases form and how serious their impact is, because starting with people themselves is the only way to change the current situation. It is also essential to build accurate information and an unbiased perspective into algorithms in order to create an equal internet environment.





References

1. Leonhardt, M. (2021, January 28). Black and Hispanic Americans often have lower credit scores—here’s why they’re hit harder. CNBC. https://www.cnbc.com/2021/01/28/black-and-hispanic-americans-often-have-lower-credit-scores.html
2. Allo, P. (2018). Mathematical Values in Data Science: Three Conjectures to Explain Mathwashing.
3. Rankin, J. L. (2018). A people’s history of computing in the United States. Harvard University Press.
4. Pasquale, F. (2015). The need to know. In The black box society: The secret algorithms that control money and information (pp. 1-18). Cambridge, MA: Harvard University Press.
5. Spielkamp, M. (2017, June 12). Inspecting algorithms for bias. MIT Technology Review. https://www.technologyreview.com/2017/06/12/105804/inspecting-algorithms-for-bias/