Artificial Intelligence's Impact on Auditing Emerging Technologies

Author: Niral Sutaria, CISA, ACA
Date Published: 23 December 2020

Artificial intelligence (AI) is the demonstration and simulation of human intelligence by machines, especially computer systems. AI, robotic process automation (RPA), the Internet of Things (IoT) and blockchain are among the next-generation technologies that are considered disruptive in nature. AI in particular is a step ahead of other emerging technologies and is changing how enterprises do business. It allows systems to make human-like decisions involving judgment and adapt to new environments.

Seventy-two percent of business leaders believe that AI is the technology of the future.1 Some enterprises already use AI to drive their business processes. Some examples include:

  • A multinational conglomerate and manufacturer of electronic systems and equipment has applied AI-based scheduling systems to warehouse management, resulting in an 8 percent increase in productivity through order prioritization and picking efficiency, a 15 percent boost in sales, and a 27 percent increase in order rates. The conglomerate is also applying AI to other areas such as finance, transportation and utilities across more than 50 projects.2
  • One of the leading ride-share companies relies on AI for daily operations, including the calculation of fares.3,4
  • One of the leading credit card companies relies on machine learning (ML) to avoid US$25 billion in fraud annually. Its AI techniques power more than 100 applications, allowing the real-time examination of transactions for indicators of fraud.5

When such tools are used in business processes and internal controls, auditors must evaluate their impact on audit procedures. In the first example in the preceding list, how would an audit professional test the validity of purchase orders created by a system based on the AI model’s decision to buy? In the third example, when management relies on the AI model to identify fraud, how can it ascertain that fraudulent transactions are detected appropriately?

Regulators and professional bodies have yet to provide frameworks for using these technologies or guidance on how to assess them. These technologies are evolving too fast for audit professionals and regulators to keep pace. Further, considering the complexity of these technologies, it is an ongoing challenge for audit professionals to provide assessment services.

There are some specific risk factors and controls related to the AI process that audit professionals should consider. Figure 1 illustrates the basic ML process.

To begin the process, an AI program or algorithm is built and tested with test data. These test data shape the logic within the AI program as it learns from various types of data scenarios. Once it has been tested, the AI program processes live data based on the logic learned from the test data, providing a result. The feedback from each result is analyzed by the AI program as its logic evolves to better handle the next live data scenario. At each of the three steps in the ML process, certain risk factors must be addressed.
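
To make these three steps concrete, the following Python sketch trains a simple classifier on synthetic test data, uses it to process live data and then refits it with confirmed feedback. The data, the scikit-learn model and the retraining cadence are purely illustrative and do not represent any particular enterprise implementation.

    # A minimal sketch of the three-step ML process described above:
    # (1) build and test the model on test data, (2) process live data
    # with the learned logic, (3) fold confirmed outcomes back in so the
    # logic can evolve. All data here is synthetic and illustrative.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(seed=0)

    # Step 1: train on "test data" that shapes the initial logic.
    test_features = rng.normal(size=(500, 3))
    test_labels = (test_features.sum(axis=1) > 0).astype(int)
    model = LogisticRegression().fit(test_features, test_labels)

    # Step 2: process live data based on the logic learned so far.
    live_features = rng.normal(size=(20, 3))
    live_decisions = model.predict(live_features)

    # Step 3: feedback - confirmed outcomes are appended to the training
    # set and the model is periodically refit so its logic evolves.
    confirmed_labels = (live_features.sum(axis=1) > 0).astype(int)
    all_features = np.vstack([test_features, live_features])
    all_labels = np.concatenate([test_labels, confirmed_labels])
    model = LogisticRegression().fit(all_features, all_labels)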

Data Input Process

In the data input process, potential risk factors include data bias, incomplete test data, and inappropriate or unauthorized collection of data.

Data Bias
Seventy-six percent of chief executive officers (CEOs) are concerned about unintended bias creeping into AI algorithms or decision-making models.6 Data bias occurs when different groups or categories within the data are not treated equally. For example, if the program developer has a personal bias toward a certain ethnic group that is reflected in the test data fed to the AI model, the AI program will function in a biased manner toward that ethnic group.

It is important to have controls in place to prevent bias from affecting the AI model. This may include having independent users review the test data for bias or periodically reviewing the results of the AI model to ensure that bias has not developed.
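
As one way to perform such a periodic review, the sketch below compares the AI model's outcome rates across groups and flags a large gap. The pandas column names, the sample data and the four-fifths threshold are illustrative assumptions, not a prescribed test.

    # A minimal sketch of a periodic bias review: compare the AI model's
    # approval rates across groups of a sensitive attribute. Column names
    # ("group", "approved") and the 0.8 threshold are illustrative only.
    import pandas as pd

    results = pd.DataFrame({
        "group": ["A", "A", "A", "B", "B", "B", "B", "A"],
        "approved": [1, 1, 0, 1, 0, 0, 0, 1],
    })

    rates = results.groupby("group")["approved"].mean()
    ratio = rates.min() / rates.max()          # disparate-impact style ratio

    print(rates)
    if ratio < 0.8:                            # common "four-fifths" rule of thumb
        print(f"Possible bias: outcome-rate ratio {ratio:.2f} is below 0.8")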


Incomplete Test Data
An AI model cannot work effectively if the test data set is incomplete and does not cover all possible scenarios. In the absence of real-world characteristics in the test data, the AI algorithm cannot learn and develop accurate logic to handle live data.

It is also important for the test data to have the right mix of a training data set and a validation data set. The training data set is used to teach the AI model to carry out the decision-making process and data processing. The validation data set is then used to test or validate the AI model’s accuracy. The appropriate mix of training data and validation data depends on factors such as the volume of data, the complexity of decision-making and the number of possible scenarios. A deliberate, documented process should be in place to confirm that there is an appropriate mix of training and validation data sets. Figure 2 shows one possible mix.
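
A documented split can be as simple as the following sketch, which uses scikit-learn's train_test_split with an assumed 80/20 mix and reports validation accuracy. The ratio and the synthetic data are illustrative only; the appropriate mix depends on the factors described above.

    # A minimal sketch of a documented training/validation mix. The 80/20
    # ratio is illustrative; the right mix depends on data volume, decision
    # complexity and the number of possible scenarios.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(seed=1)
    features = rng.normal(size=(1000, 4))
    labels = (features[:, 0] + features[:, 1] > 0).astype(int)

    X_train, X_val, y_train, y_val = train_test_split(
        features, labels, test_size=0.20, random_state=42
    )

    model = LogisticRegression().fit(X_train, y_train)
    print(f"Validation accuracy: {model.score(X_val, y_val):.2%}")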

Inappropriate or Unauthorized Collection of Data
Several data input points may be required to design an effective decision-making process as part of an AI model. With the prevalence of social media, ecommerce and IoT, it may be difficult to implement an effective data-collection process that screens data points for appropriate and authorized use. If data are captured in an unauthorized manner without the individual’s knowledge, this may violate local data privacy laws. Controls need to be in place to validate each data source and ensure that no unauthorized data are captured as part of either test data or live data.
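
One possible form of such a control is sketched below: each incoming record is screened against an approved-source allowlist and a consent flag before it can enter either the test data or the live data. The source names and field names are hypothetical.

    # A minimal sketch of a data-source control: records are accepted only
    # if they come from an approved source and carry evidence of consent.
    APPROVED_SOURCES = {"erp_system", "crm_export", "pos_terminal"}

    def screen_record(record: dict) -> bool:
        """Return True if the record may be used for training or live processing."""
        return record.get("source") in APPROVED_SOURCES and record.get("consent") is True

    incoming = [
        {"source": "erp_system", "consent": True, "value": 120.0},
        {"source": "social_scrape", "consent": False, "value": 80.0},  # rejected
    ]
    accepted = [r for r in incoming if screen_record(r)]
    print(f"Accepted {len(accepted)} of {len(incoming)} records")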

AI Algorithm Process

The potential risk factors in this step include inaccurate AI algorithm logic, insufficient documentation and auditability, system override of manual controls, ineffective security, and unclear accountability.


Inaccurate AI Algorithm Logic
The AI algorithm logic may not be accurate in some scenarios, or the initial design may need to be revised over time. When small AI models are outsourced to service providers for development, users may not thoroughly validate the accuracy of the logic due to their reliance on the service provider. This may have a direct impact on the results generated by the AI model.

It is important that the AI model be aligned with current business objectives. Enterprises often alter their processes to adapt to the changing business landscape. A recent example is the COVID-19 pandemic, which necessitated significant changes to supply chains, among other operations. If such changes are not reflected in the AI models in a timely manner, business disruption may result.

On a periodic basis, management should test and review the AI algorithm logic to ensure that it fulfills its intended purpose. An inventory or listing of AI models, along with their current algorithm logic, can be helpful when making process changes to ensure that AI models remain aligned with business objectives.
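
Such an inventory can be kept in a simple structured form. The following sketch uses a Python dataclass with illustrative fields (business objective, logic summary, control owner, last review date) and flags entries whose logic reviews are overdue; the record shown and the 180-day threshold are assumptions for illustration.

    # A minimal sketch of an AI model inventory entry, so that process
    # changes can be traced back to affected models. Fields are illustrative.
    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class AIModelRecord:
        name: str
        business_objective: str
        algorithm_logic_summary: str
        control_owner: str
        last_logic_review: date

    inventory = [
        AIModelRecord(
            name="warehouse-order-prioritization",
            business_objective="Prioritize picking to maximize throughput",
            algorithm_logic_summary="Ranking model over order attributes",
            control_owner="Supply Chain Analytics Lead",
            last_logic_review=date(2020, 10, 1),
        ),
    ]

    # Periodic review: flag models whose logic has not been reviewed recently.
    for m in inventory:
        if (date.today() - m.last_logic_review).days > 180:
            print(f"Review overdue: {m.name} (owner: {m.control_owner})")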

Insufficient Documentation and Auditability of the AI Algorithm
Even if an algorithm is working accurately and fulfilling its intended purpose, without documented algorithm logic, it is difficult to understand the AI model's decision-making process. AI algorithms evolve quickly and, in the absence of algorithm auditability, it is difficult to understand how the model is designed and how results are reached. In such cases, it may be difficult to explain the results of the AI model to external stakeholders. Further, the EU General Data Protection Regulation (GDPR) states that an individual has the right to understand how a decision was reached by an automated decision-making system and to contest that decision.7

As a control, AI algorithm logic should be documented and explained appropriately to cover all aspects of the program. AI algorithm auditability needs to be enabled at inception to understand how the algorithm reached a certain judgment.
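
As a simple illustration of auditability, the sketch below appends every decision, its inputs and the model version to an append-only JSON Lines log so that individual results can later be traced and explained. The file name and fields are assumptions for illustration, not a prescribed logging standard.

    # A minimal sketch of decision logging for auditability: each decision
    # is written with its inputs, the model version and the output.
    import json
    from datetime import datetime, timezone

    def log_decision(inputs: dict, model_version: str, decision: str,
                     path: str = "ai_decision_audit.jsonl") -> None:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model_version": model_version,
            "inputs": inputs,
            "decision": decision,
        }
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")

    log_decision({"amount": 5000, "tenure_months": 24}, "loan-model-v3.2", "approved")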

System Override of Manual Controls
Similar to the risk of management override of controls, with AI there is a risk of the system overriding manual controls. Some decisions may be so complex or their results so impactful that manual intervention is necessary. The risk is that an AI model may evolve over time to conclude that human intervention is not required. Recently, the chief justice of India said, “Artificial intelligence should never be allowed to substitute human discretion in judicial function.”8

A system of manual intervention should be designed to prevent AI system override. Periodic manual reviews of transactions can be set up to ensure that those requiring manual intervention are processed as intended.
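
A basic routing control of this kind might look like the following sketch, in which decisions above an impact threshold or below a confidence threshold are sent to a manual review queue that the AI model cannot bypass. The thresholds and parameters are illustrative assumptions.

    # A minimal sketch of a manual-intervention control: high-impact or
    # low-confidence decisions are routed to a human reviewer instead of
    # being executed automatically.
    def route_decision(amount: float, model_confidence: float) -> str:
        HIGH_IMPACT_AMOUNT = 100_000      # e.g., large purchase orders
        MIN_CONFIDENCE = 0.90
        if amount >= HIGH_IMPACT_AMOUNT or model_confidence < MIN_CONFIDENCE:
            return "manual_review"        # the AI model cannot override this queue
        return "auto_process"

    print(route_decision(amount=250_000, model_confidence=0.97))  # manual_review
    print(route_decision(amount=1_200, model_confidence=0.95))    # auto_process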

Ineffective AI Model Security
AI models are vulnerable to external attacks. An attack might take the form of false-positive or false-negative data inputs, which can cause the AI model to evolve in an undesirable direction. This is known as “adversarial machine learning.” Recently, a leading systems security company demonstrated how a self-driving car can be fooled into driving over the speed limit by making a tiny sticker-based modification to a speed limit sign.9

Security of the AI model should be part of the AI governance framework. Appropriate controls to identify duplicate data inputs, fake inputs and the like should be in place. Further, the risk factors arising from cybersecurity and general IT controls (i.e., access management, change management, program development and computer operations) surrounding the AI model and infrastructure should be addressed appropriately.
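
A first line of defense can be simple input screening, as in the sketch below, which rejects duplicate (possibly replayed) inputs by hashing and quarantines values outside an expected range. The field names and ranges are hypothetical, and such screening complements rather than replaces a full security assessment.

    # A minimal sketch of input screening against duplicate or fake data.
    import hashlib
    import json

    seen_hashes = set()

    def screen_input(record: dict) -> str:
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        if digest in seen_hashes:
            return "duplicate"                      # possible replayed input
        seen_hashes.add(digest)
        if not (0 < record.get("speed_limit", 0) <= 130):
            return "out_of_range"                   # possible tampered input
        return "accepted"

    print(screen_input({"sign_id": 17, "speed_limit": 85}))   # accepted
    print(screen_input({"sign_id": 17, "speed_limit": 85}))   # duplicate
    print(screen_input({"sign_id": 18, "speed_limit": 850}))  # out_of_range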

Unclear Accountability for AI Algorithm Results
In the age of AI, where decisions and judgments are increasingly driven by the system, it is common for management to rely on the system for various controls. Thus, there may not be clearly defined responsibilities for the actions taken by the AI algorithm. Accountability for the AI model’s decisions or results can be difficult to ascertain. As such, a clearly documented responsibility framework, including periodic reviews of the AI model, should be in place. The framework should encompass control owners for all phases of the AI process.

Results/Feedback Process

The potential risk factors in this process include inadequate updates of the model or algorithm, inadequate review of the results produced by the AI model, and recurring incorrect feedback loops.

Inadequate Update of the Model or Algorithm
The feedback process is a means to update the AI model and logic to deal effectively with new scenarios. If the AI algorithm is designed to read only selected data points in the feedback process, it may not be updated to reflect all changes.

Controls need to be in place to ensure that the AI model considers all relevant data points during the feedback process. For example, if an AI model is used to calculate a ride-share fare, along with the estimated time and distance, it is important to consider the drop-off location (i.e., whether it is in a remote area), so that the fare reflects the driver’s return trip to an urban area.
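
One way to enforce this is a completeness check on each feedback record before it is used to update the model, as in the following sketch for the ride-share example. The required fields are assumptions for illustration.

    # A minimal sketch of a feedback-completeness control: a feedback record
    # is accepted only if it contains all data points the model should learn
    # from, including drop-off remoteness.
    REQUIRED_FEEDBACK_FIELDS = {
        "estimated_time", "actual_time", "distance_km",
        "dropoff_is_remote", "final_fare",
    }

    def validate_feedback(record: dict) -> list:
        """Return the list of missing data points; empty means complete."""
        return sorted(REQUIRED_FEEDBACK_FIELDS - record.keys())

    feedback = {"estimated_time": 22, "actual_time": 27,
                "distance_km": 14.3, "final_fare": 18.50}
    missing = validate_feedback(feedback)
    if missing:
        print(f"Feedback rejected; missing data points: {missing}")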


Inadequate Review of Results Produced by the AI Model
Over time, there may be a tendency to become more reliant on the AI model for transaction processing, and there may not be appropriate oversight of the results produced by the AI model. Without a regular review of results, the evolving AI model can deviate from its intended purpose and, in some cases, cause problems. A leading software company had to issue an apology after its AI-based Twitter account started to tweet racist comments.10 Repercussions can be even more serious if an AI model used for business operations goes rogue. Manual reviews of the results produced by the AI model must be conducted. These reviews can be performed at an aggregate level or an entity level, as appropriate. When required, manual changes can be made to the AI model by following appropriate change management procedures.
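
An aggregate-level review can be partly automated, as in the sketch below, which compares the latest approval rate with a documented baseline and flags deviations for manual follow-up. The baseline, tolerance and sample data are illustrative.

    # A minimal sketch of an aggregate-level review: the share of positive
    # decisions in the latest period is compared with a documented baseline,
    # and large deviations are flagged for manual follow-up.
    baseline_approval_rate = 0.42          # documented when the model went live
    tolerance = 0.05                       # acceptable absolute deviation

    recent_decisions = [1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1]   # 1 = approved
    recent_rate = sum(recent_decisions) / len(recent_decisions)

    if abs(recent_rate - baseline_approval_rate) > tolerance:
        print(f"Investigate: approval rate {recent_rate:.0%} deviates from "
              f"baseline {baseline_approval_rate:.0%}")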

Recurring Incorrect Feedback Loops
An AI model can get stuck in a recurring feedback loop that reinforces a conclusion that is no longer true. For example, an AI model denies a customer’s loan application based on past rejections. That rejection is then used as a basis for rejecting future applications, and so on, even though the customer’s creditworthiness may have changed.

The controls to address incorrect feedback loops are the same as those used to review results, along with some specific data analytics on the results produced to identify recurring incorrect feedback loops.
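
The additional analytics can be as simple as the following sketch, which counts, per customer, rejections whose only recorded reason is a prior rejection and flags repeat cases for an independent re-check. The column names and the threshold are illustrative.

    # A minimal sketch of analytics to surface recurring incorrect feedback
    # loops in the loan example.
    import pandas as pd

    decisions = pd.DataFrame({
        "customer_id": [101, 101, 101, 102, 103, 103],
        "decision":    ["reject", "reject", "reject", "approve", "reject", "approve"],
        "reason":      ["prior_rejection"] * 3 + ["n/a", "low_income", "n/a"],
    })

    loops = (
        decisions[(decisions["decision"] == "reject") &
                  (decisions["reason"] == "prior_rejection")]
        .groupby("customer_id").size()
    )
    for customer, count in loops[loops >= 3].items():
        print(f"Customer {customer}: {count} rejections based only on prior rejections")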

Other Testing Procedures for AI Models: Confusion Matrix

One of the most effective ways to test an AI model is to use a confusion matrix.11,12 This method tests the AI model’s logic using a black-box approach with a sample set of transactions; the sample depends on the population, the different classes of transactions and the possible scenarios. A confusion matrix classifies transactions into four types: true positives, true negatives, false positives and false negatives. When assessing an AI model, the auditor wants to find that most values are either true positives or true negatives. Figure 3 provides an example of a confusion matrix. The results of the sample in figure 3 are as follows:

  • True positives (49)—The auditor’s expected outcome is yes, and the AI model output is yes.
  • True negatives (48)—The auditor’s expected outcome is no, and the AI model output is no.
  • False positives (2)—The auditor’s expected outcome is no, but the AI model output is yes.
  • False negatives (1)—The auditor’s expected outcome is yes, but the AI model output is no.

The accuracy of the AI model can also be computed using the following formula:

Accuracy = (True Positives + True Negatives)/Number of Transactions

In the example, the accuracy is 97 percent: (49 + 48)/100. An accuracy close to 100 percent is desirable.
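
The figure 3 example can be reproduced with a few lines of Python, treating the auditor's expected outcome as the reference and the AI model's output as the prediction. The sketch below uses scikit-learn's confusion_matrix and is illustrative only.

    # A minimal sketch reproducing the figure 3 example:
    # 49 true positives, 48 true negatives, 2 false positives, 1 false negative.
    from sklearn.metrics import confusion_matrix

    auditor_expected = [1] * 49 + [0] * 48 + [0] * 2 + [1] * 1
    ai_model_output  = [1] * 49 + [0] * 48 + [1] * 2 + [0] * 1

    tn, fp, fn, tp = confusion_matrix(auditor_expected, ai_model_output).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)

    print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
    print(f"Accuracy: {accuracy:.0%}")   # (49 + 48) / 100 = 97%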

Conclusion

The technology landscape is changing faster than ever, and it will only get more complex. There will be even less transparency when it comes to understanding how systems operate. Emerging technologies bring new audit challenges, and it is essential that audit practitioners keep up to date on these developments and their impact on the audit function. The risk factors that are specific to AI can be addressed with the recommended procedures and controls examined here. These procedures will continue to evolve as these technologies become better understood.

Endnotes

1 PricewaterhouseCoopers (PwC), “PwC Releases Report on Global Impact and Adoption of AI,” 25 April 2017, https://www.pwc.com/us/en/press-releases/2017/report-on-global-impact-and-adoption-of-ai.html
2 Hitachi, “Take on This Unpredictable Business Age Together With Hitachi AI Technology/H,” 2017, https://social-innovation.hitachi/-/media/project/hitachi/sib/en/solutions/ai/pdf/ai_en_170310.pdf
3 Hermann, J.; M. Del Balso; “Meet Michelangelo: Uber’s Machine Learning Platform,” Uber Engineering, 5 September 2017, https://eng.uber.com/michelangelo-machine-learning-platform/
4 Desai, V. J.; “Uber Hitches Ride With Machine Learning for Better CX,” ETCIO, 5 March 2020, https://cio.economictimes.indiatimes.com/news/strategy-and-management/uber-hitches-ride-with-machine-learning-for-better-cx/74485521
5 Pahuja, R.; “Here’s How ML Helps Visa in Preventing $25 Billion Fraud,” ETCIO, 22 January 2020, https://cio.economictimes.indiatimes.com/news/digital-security/heres-how-ml-helps-visa-in-preventing-25-billion-fraud/73498545
6 PricewaterhouseCoopers, “Artificial Intelligence Is Coming: Is Your Business Ready?” 2017, https://www.pwc.ch/en/publications/2017/pwc_artificial_intelligence_is_coming_2017_en.pdf
7 Information Commissioner’s Office (ICO), “Fully Automated Decision Making AI Systems: The Right to Human Intervention and Other Safeguards,” United Kingdom, 5 August 2019
8 ANI, “AI Should Never Be Allowed to Substitute Human Discretion in Judicial Functioning: CJI,” ETCIO, 27 January 2020
9 Povolny, S.; S. Trivedi; “Model Hacking ADAS to Pave Safer Roads for Autonomous Vehicles,” McAfee, 19 February 2020, https://www.mcafee.com/blogs/other-blogs/mcafee-labs/model-hacking-adas-to-pave-safer-roads-for-autonomous-vehicles
10 Lee, D.; “Tay: Microsoft Issues Apology Over Racist Chatbot Fiasco,” BBC, 25 March 2016, https://www.bbc.com/news/technology-35902104
11 Stehman, S. V.; “Selecting and Interpreting Measures of Thematic Classification Accuracy,” Remote Sensing of Environment, vol. 62, iss. 1, 1997, p. 77–89, https://doi.org/10.1016/S0034-4257(97)00083-7
12 Powers, D. M. W.; “Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation,” Journal of Machine Learning Technologies, vol. 2, iss. 1, 2011, p. 37–63, https://bioinfopublication.org/files/articles/2_1_1_JMLT.pdf

Niral Sutaria, CISA

Is a manager at a Big Four professional services firm and has eight years of experience in IS audits, internal controls assessment and business process controls review. He is a member of ISACA® and the Institute of Chartered Accountants of India (ICAI). He can be reached at sutarianiral@gmail.com.