Bias and Ethical Concerns in Machine Learning

Artificial intelligence (AI) has evolved rapidly over the past few years. A decade ago, AI was just a concept with few real-world applications, but today it is one of the fastest-growing technologies, attracting widespread adoption. AI is used in many ways, from suggesting products to add to shopping carts to analyzing complex information from multiple data sources to guide investment and trading decisions.

Concerns have arisen about ethics, privacy and security in AI, but due to the technology’s rapid pace of growth, those worries have not always received foremost attention. One of the main areas for concern is bias in AI systems. Bias can inappropriately skew the output from AI in favor of certain data sets; therefore, it is important that organizations using AI systems identify how bias can creep in and put in place appropriate internal controls to address the concern.

What Is Bias in AI?

Bias in AI occurs when two data sets are not considered equal, possibly due to biased assumptions in the AI algorithm development process or built-in prejudices in the training data.

Recent examples of bias include:

  • A leading technology conglomerate had to scrap an AI-based recruiting tool that showed bias against women.1
  • A leading software enterprise had to issue an apology after its AI-based Twitter account started to tweet racist comments.2
  • A leading technology enterprise had to abandon its facial recognition tool for exhibiting bias toward certain ethnicities.3
  • A leading social media platform apologized for an image-cropping algorithm that exhibited racism by automatically focusing on White faces over faces of color.4

Further, in experiments in contrastive language-image pretraining (CLIP), images of Black people were misclassified as nonhuman at more than twice the rate of any other race, according to the Artificial Intelligence Index Report 2022.5 In experiments detailed in the previous year’s report, AI systems misunderstood Black speakers, particularly Black men, twice as often as White speakers.6

How Does Bias Creep Into AI Systems?

Figure 1 illustrates a basic AI process. An AI program or algorithm is built and run with test data. These test data shape the logic within the AI program as it learns from various types of data scenarios.

Once the AI program has been tested, it processes live data based on the logic learned from the test data, providing a result. The AI program analyzes the feedback from each result, and its logic evolves to better handle the next live data scenario, from which it continues to learn.

Data input and algorithm design are the two main entry gates for bias in the AI process. From an organization’s point of view, the responsible factors can be classified into two broad categories—external and internal.

External Factors
External factors can influence the AI building process, but they are beyond the organization’s control. External factors include biased real-world data, lack of detailed guidance or frameworks for bias identification, and biased third-party AI systems.

Biased Real-World Data

When the test data used to train the AI algorithm are taken from real-world examples and data created by humans, the bias that exists in humans is transferred to the AI system because it uses the real-world data to train itself. The real-world data may not include fair scenarios for all population groups. For example, there may be overrepresentation of certain ethnic groups in the real-world data, which may skew the AI system’s results.
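
One way to make overrepresentation visible is to compare each group's share of the training data with a reference share. The sketch below is illustrative only: the function name, the record layout, the reference shares and the 5 percent tolerance are all assumptions, not part of any prescribed method.

```python
from collections import Counter


def representation_gaps(records, group_key, reference_shares, tolerance=0.05):
    """Return {group: observed_share - reference_share} for every group
    whose share of the data differs from the reference by more than
    the tolerance, in either direction."""
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        gap = observed - expected
        if abs(gap) > tolerance:
            gaps[group] = round(gap, 3)
    return gaps


# Hypothetical training records and population shares, for illustration only.
training_records = (
    [{"group": "A"}] * 80 + [{"group": "B"}] * 15 + [{"group": "C"}] * 5
)
population_shares = {"A": 0.60, "B": 0.25, "C": 0.15}

gaps = representation_gaps(training_records, "group", population_shares)
print(gaps)  # positive values mean overrepresented, negative underrepresented
```

A positive gap flags a group that appears more often in the data than in the reference population. Choosing the reference population is itself a judgment call that the organization has to make.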

Lack of Detailed Guidance or Frameworks for Bias Identification

Several countries have begun to regulate AI systems.7 Many professional bodies and international organizations have developed their own versions of AI frameworks. However, these frameworks are still in nascent stages and provide only high-level principles and goals. It is sometimes difficult to tailor them to create actionable procedures and controls for an organization-specific AI system.

For example, the European Union’s recently introduced AI Act provides some guidance on how to address bias in data for high-risk AI systems.8 However, several specific bias detection and correction controls, such as defining fairness and enabling AI auditability, may also be needed for a complex AI system. These specific controls will highly depend on the nature of the AI system and may not be easily drawn out from the guidance provided by regulators.

Biased Third-Party AI Systems

It is common for an organization to outsource some components of its AI system development to third parties. In such cases, the organization may not thoroughly validate the system for bias due to its reliance on the third parties.

Internal Factors
Internal factors are weaknesses or gaps in the organization’s internal processes around AI systems that can lead to bias, such as lack of focus on bias identification, nondiverse teams, nonidentification of sensitive data attributes and related correlations, and unclear policies.

Lack of Focus on Bias Identification
The data scientists or engineers assigned to build AI systems sometimes lack the social science skills to identify bias. AI domain experts tend to place more focus on technical performance and optimization of an AI system than on analyzing potential biases in the data and processing methods. Given the fast pace of development in the emerging technologies industry space, there are also competitive pressures on organizations to operationalize their AI systems quickly, sacrificing bias mitigation efforts for quicker delivery.

Nondiverse Teams

Often, because data science and engineering teams working on the development of AI systems lack diversity, they do not have the required knowledge of bias potential in a variety of contexts. For example, a team primarily consisting of White males may not be effective in identifying bias against women of color.

Nonidentification of Sensitive Data Attributes and Related Correlations

It is important to identify and define the sensitive data attributes—such as gender and race—that can drive bias in an AI system. However, developing AI systems that ignore such sensitive attributes does not guarantee bias-free processing if related correlations are not addressed. For example, residential areas may be dominated by certain ethnic groups. If an AI system tasked with approving loan applications makes decisions based on residential areas, the results can be biased.
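
A rough way to test whether a seemingly neutral feature acts as a proxy is to check how much better the sensitive attribute can be guessed once the feature is known. The sketch below assumes hypothetical field names (`area`, `ethnicity`) and invented data; a real review would use the organization's own attributes and a more careful statistical measure.

```python
from collections import Counter, defaultdict


def proxy_strength(records, feature, sensitive):
    """Return (baseline, with_feature): the accuracy of guessing the
    sensitive attribute by majority vote, first without any feature and
    then separately within each value of the candidate feature."""
    overall = Counter(record[sensitive] for record in records)
    baseline = max(overall.values()) / len(records)

    by_value = defaultdict(Counter)
    for record in records:
        by_value[record[feature]][record[sensitive]] += 1
    correct = sum(max(counter.values()) for counter in by_value.values())
    return baseline, correct / len(records)


# Hypothetical loan applications: residential area closely tracks ethnicity.
applications = (
    [{"area": "north", "ethnicity": "X"}] * 45
    + [{"area": "north", "ethnicity": "Y"}] * 5
    + [{"area": "south", "ethnicity": "X"}] * 5
    + [{"area": "south", "ethnicity": "Y"}] * 45
)

baseline, with_feature = proxy_strength(applications, "area", "ethnicity")
print(baseline, with_feature)
```

A large jump from the baseline to the feature-informed accuracy suggests that removing the sensitive column alone would not stop the model from learning it indirectly.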

Unclear Policies

AI development processes are relatively new for many organizations, and many of those adopting AI technologies are first timers. Traditional organizational policies and procedures do not cover certain key aspects of the AI system development process, such as bias identification and removal.

Mitigation

There are several processes and controls that organizations can consider to mitigate bias. The controls can be divided into two categories: entity-level controls and process-level controls.


Entity-Level Controls

Organizations can set up entity-level controls or an appropriate tone at the top to create an effective control environment, which can help deal with bias.

Establish AI Governance and Policies
The policies, procedures and overall controls framework of the organization should account for AI systems. Internal controls, such as protocols for data collection, establishment of responsibilities for AI systems and periodic reviews of AI outputs, should be established to help ensure bias-free AI development and functioning.

Promote a Culture of Ethics

AI solutions differ vastly depending on the complexity of the tasks they are designed to perform. Timely documentation of detailed procedures for bias identification may not always be feasible. Therefore, organizations should promote a culture of ethics and social responsibility as part of the AI development process. Holding regular training sessions on diversity, equity, inclusion and ethics; establishing key performance indicators (KPIs); and recognizing employees for mitigating bias are effective ways to encourage teams to actively look for bias in AI systems.

Promote Diversity

Diversity should be a core part of organization-wide culture and a priority that is not limited to the teams that require it for bias mitigation. Having diverse teams working on the AI development process ensures that multiple perspectives will influence AI coding, processing and data analytics, thus reducing the need for bias mitigation. Having diverse teams means including people with different characteristics, such as gender, ethnicity, sexual orientation and age.

Process-Level Controls
Entity-level controls may not be sufficient in addressing the risk of bias without appropriate process-level controls.

Define Fairness for the AI System
One of the most difficult tasks in the development of an AI system is to define fairness in processing and outcomes. An AI system is designed to make decisions based on certain factors. Weight should be given to the factors that are important in establishing accurate outputs, and the factors that lead to fair decision-making need clear and quantifiable definitions. For example, a loan-approving AI system that bases decisions on income tax return statements and credit scores may be considered more fair, though credit services can be biased, too.
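
To illustrate what a quantifiable fairness definition might look like, the sketch below computes approval rates per group and the ratio of the lowest rate to the highest. This is only one of many possible definitions. The function names, the sample decisions and the 0.8 cutoff (borrowed from the four-fifths rule of thumb used in some US employment contexts) are assumptions for illustration.

```python
def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs.
    Returns {group: share of applications approved}."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if ok else 0)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Hypothetical loan decisions for two groups.
decisions = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)

rates = approval_rates(decisions)
ratio = disparate_impact_ratio(rates)
MIN_RATIO = 0.8  # assumed cutoff, not a universal standard
print(rates, ratio, "review needed" if ratio < MIN_RATIO else "within cutoff")
```

Writing the definition down as a number makes it possible to test outputs against it later, although which metric and cutoff are appropriate depends on the nature of the AI system.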

Prepare a Balanced Data Set

Data used for training the AI system need to be reviewed thoroughly. Important considerations for preparing a balanced data set include:

  • Sensitive data features, such as gender and ethnicity, and related correlations are addressed.
  • The data are representative of all groups of the population in terms of number of items.
  • Appropriate data-labeling methods are used.
  • Different weights are applied to data items as needed to balance the data set.
  • Data sets and collection methods are independently reviewed for bias before use.
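
The weighting consideration in the list above can be sketched as follows. The example assigns each data item a weight inversely proportional to the size of its group, so that every group contributes equally in total. The function name and sample data are hypothetical, and inverse-frequency weighting is only one of several balancing techniques.

```python
from collections import Counter


def balancing_weights(groups):
    """Return one weight per item so that each group's weights sum to
    the same total (len(groups) / number_of_groups)."""
    counts = Counter(groups)
    total = len(groups)
    group_count = len(counts)
    return [total / (group_count * counts[group]) for group in groups]


# Hypothetical data set: group A is heavily overrepresented.
groups = ["A"] * 8 + ["B"] * 2
weights = balancing_weights(groups)

weight_per_group = {}
for group, weight in zip(groups, weights):
    weight_per_group[group] = weight_per_group.get(group, 0.0) + weight
print(weight_per_group)
```

Many training libraries accept per-item weights of this kind, but whether and how they do so depends on the specific tool in use.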

Account for Bias in AI Modeling

Having bias-free and adequate, diverse test data does not guarantee fair AI processing if the AI model is designed to pick up sensitive data features and process items based on them. It is important to explicitly design the AI model at inception to consider sensitive features or other factors that would result in its learning to process data in a biased manner.

Make Periodic Assessments

Even with a detailed review of training data sets and AI programming logic, there may be blind spots in terms of identifying bias. It is important to conduct periodic reviews of outputs generated by an AI system against fairness definitions to ensure that bias is not perpetuated if it exists and does not develop in the future. An acceptable threshold for errors can be defined for each AI system. Some sensitive and high-risk AI systems should have zero threshold for errors.
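
A periodic review of this kind could be sketched as a per-group error check against a defined threshold. The function name, the record layout and the thresholds below are assumptions chosen for illustration; a real review would use the fairness definitions and thresholds the organization has set for the specific AI system.

```python
def review_outputs(outcomes, max_error_rate):
    """outcomes: iterable of (group, predicted, actual) tuples.
    Returns {group: error_rate} for groups above the threshold."""
    errors, totals = {}, {}
    for group, predicted, actual in outcomes:
        totals[group] = totals.get(group, 0) + 1
        errors[group] = errors.get(group, 0) + int(predicted != actual)
    rates = {group: errors[group] / totals[group] for group in totals}
    return {group: rate for group, rate in rates.items() if rate > max_error_rate}


# Hypothetical review sample: the system errs more often for group B.
outcomes = (
    [("A", True, True)] * 48 + [("A", True, False)] * 2
    + [("B", True, True)] * 40 + [("B", False, True)] * 10
)

flagged = review_outputs(outcomes, max_error_rate=0.05)
strict = review_outputs(outcomes, max_error_rate=0.0)  # zero-threshold system
print(flagged, strict)
```

With the 5 percent threshold only group B is flagged, while the zero threshold flags every group with any error at all.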

Enable Explainable AI

AI algorithms evolve quickly. One way to enable explainable AI is to facilitate algorithm auditability. Without algorithm auditability, it is difficult to understand how an algorithm model was designed or how results were reached. If biased results are identified, it may be difficult to explain the AI model’s processing to external stakeholders. The EU General Data Protection Regulation (GDPR) states that an individual has a right to understand how an automated system reached a decision and to contest that decision.9
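
As a minimal sketch of what auditability could mean in practice, the example below uses a deliberately simple weighted-sum scoring model and records, for every decision, the inputs, each factor's contribution, the model version and the outcome. The weights, threshold, field names and version label are all hypothetical, and real AI models are rarely this transparent.

```python
import json

# Hypothetical scoring weights and approval threshold.
WEIGHTS = {"income": 0.5, "credit_score": 0.5}
THRESHOLD = 0.6


def decide_with_audit(applicant_id, features, model_version="demo-1"):
    """Score an application and return a record that explains the decision."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    score = sum(contributions.values())
    return {
        "applicant_id": applicant_id,
        "model_version": model_version,
        "inputs": dict(features),
        "contributions": contributions,
        "score": score,
        "approved": score >= THRESHOLD,
    }


# Inputs are assumed to be normalized to the range 0 to 1.
record = decide_with_audit("app-001", {"income": 0.8, "credit_score": 0.5})
audit_line = json.dumps(record, sort_keys=True)  # one line per decision
print(audit_line)
```

Storing one such line per decision gives reviewers a trail they can use to explain an individual result or to rerun the fairness checks on historical outputs.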

Conclusion

AI systems are not equal in terms of bias risk. For example, an AI system that suggests products for a shopping cart has less risk than an AI system that determines whether to approve an individual’s loan application.

Depending on the nature of the AI system in use, there may be more reasons for bias to creep in, and addressing the problem effectively may require different controls. Further, AI systems pose several other types of risk, such as AI model security, accuracy and data privacy.10

The technology landscape is rapidly changing, and it will only get more complex. It is important for organizations and audit professionals to stay up to date on emerging technology developments.

Endnotes

1 Dastin, J.; “Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women,” Reuters, 10 October 2018, https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
2 Lee, D.; “Tay: Microsoft Issues Apology Over Racist Chatbot Fiasco,” BBC News, 25 March 2016, https://www.bbc.com/news/technology-35902104
3 BBC News, “IBM Abandons ‘Biased’ Facial Recognition Tech,” 9 June 2020, https://www.bbc.com/news/technology-52978191
4 Hern, A.; “Twitter Apologises for ‘Racist’ Image-Cropping Algorithm,” The Guardian, 21 September 2020, https://www.theguardian.com/technology/2020/sep/21/twitter-apologises-for-racist-image-cropping-algorithm
5 Stanford University Human-Centered AI Institute, Artificial Intelligence Index Report 2022, California, USA, 3 March 2022, https://aiindex.stanford.edu/wp-content/uploads/2022/03/2022-AI-Index-Report_Master.pdf
6 Stanford University Human-Centered AI Institute, Artificial Intelligence Index Report 2021, California, USA, March 2021, https://aiindex.stanford.edu/wp-content/uploads/2021/11/2021-AI-Index-Report_Master.pdf
7 Sutaria, N.; “Artificial Intelligence Regulations Gaining Traction,” ISACA Now, 29 December 2020, https://www.isaca.org/resources/news-and-trends/isaca-now-blog/2020/artificial-intelligence-regulations-gaining-traction
8 European Parliament, Artificial Intelligence Act, April 2021, https://www.europarl.europa.eu/thinktank/en/document/EPRS_BRI(2021)698792
9 Binns, R.; V. Gallo; “Fully Automated Decision Making AI Systems: The Right to Human Intervention and Other Safeguards,” UK Information Commissioner’s Office (ICO), 5 August 2019, https://ico.org.uk/about-the-ico/media-centre/ai-blog-fully-automated-decision-making-ai-systems-the-right-to-human-intervention-and-other-safeguards/
10 Sutaria, N.; “Artificial Intelligence’s Impact on Auditing Emerging Technologies,” ISACA® Journal, vol. 6, 2020, https://www.isaca.org/archives

Niral Sutaria, CISA, ACA

Is a director at a leading professional services firm. He has more than 10 years of experience in audit, internal controls assessment, IT and business process reviews. Sutaria has assisted numerous clients from a variety of industries in strengthening their internal controls frameworks including IT governance. He is also a member of ISACA®.