AI Systems Training Pause Sought

Authors: Beatrice Nepoti and Cristiano Passerini, CGEIT, CDPSE
Date Published: 13 June 2023

Dear ChatGPT user, we regret to inform you that we have disabled ChatGPT for users in Italy at the request of the Italian Garante.

Earlier this year, this message greeted users visiting the OpenAI website1 from Italian IP addresses. The statement briefly explained the reason for the geoblock, which left users in Italy unable to access ChatGPT. ChatGPT is an artificial intelligence (AI) chatbot front end to the GPT large language model developed by OpenAI. GPT stands for Generative Pre-trained Transformer, a neural network architecture developed by OpenAI that, once trained on OpenAI's training data sets, can generate novel text.
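
The geoblocking described above is typically implemented by resolving the visitor's IP address to a country and refusing service for blocked countries. A minimal illustrative sketch follows; the prefix table, country codes and blocking logic are hypothetical stand-ins (real services use a GeoIP database), not OpenAI's actual mechanism.

```python
BLOCKED_COUNTRIES = {"IT"}  # ISO 3166-1 alpha-2 codes of geoblocked countries

# Toy IP-prefix-to-country table standing in for a real GeoIP database.
# The mappings are illustrative, not accurate address allocations.
GEOIP_PREFIXES = {
    "151.": "IT",
    "93.": "DE",
}

def country_for_ip(ip: str) -> str:
    """Resolve an IP address to a country code by first matching prefix."""
    for prefix, country in GEOIP_PREFIXES.items():
        if ip.startswith(prefix):
            return country
    return "??"  # unknown origin

def is_blocked(ip: str) -> bool:
    """Decide whether to refuse service and show a notice instead."""
    return country_for_ip(ip) in BLOCKED_COUNTRIES

print(is_blocked("151.11.22.33"))   # True: show the geoblock notice
print(is_blocked("93.184.216.34"))  # False: serve normally
```

In practice, the lookup is delegated to a maintained GeoIP service and the block is enforced at the edge (load balancer or CDN) rather than in application code.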

The Italian Data Protection Authority (Garante per la protezione dei dati personali) (GPDP) is the national authority responsible for enforcing Italian Legislative Decree 196/2003, Codice in materia di protezione dei dati personali. This law was significantly updated to give effect to the EU General Data Protection Regulation (GDPR) (Regolamento Generale per la Protezione dei Dati), and it is presently the Italian adaptation of the GDPR.

On 30 March 2023, the GPDP issued an order to OpenAI to stop processing people’s data locally with immediate effect.2 Although this happened 2 days after an open letter called for a moratorium on the development of more powerful generative AI models so that regulators could catch up with tools such as ChatGPT, no cause-and-effect connection between the 2 events can be drawn. Indeed, drawing such a connection would not help in understanding the Italian approach, which, presently, remains unique.

In becoming the first European country to bar access to ChatGPT, Italy set a precedent that other countries could follow. Iran, North Korea and Russia have banned ChatGPT as part of wider Internet censorship efforts.3 In addition, OpenAI’s services, including ChatGPT and DALL·E, are geoblocked in China.4 But the Italian case is notable because the block was imposed by the GPDP on suspicion of a potential infringement in OpenAI’s use of personal data. Consequently, OpenAI voluntarily stopped offering services to Italian customers. After the stop, the GPDP opened an investigation aimed at identifying and correcting any potential OpenAI infringement of the GDPR in Italy.

The GPDP’s decision to inhibit the use of an innovative generative AI tool such as ChatGPT may seem questionable and perhaps harmful to the natural desire to innovate and compete, and questions have indeed been raised about it.5 Therefore, it is important to understand the motivation that the GPDP has provided.

The GPDP Argument

The main factors that the GPDP considered were fear of privacy violations and the proliferation of personal and sensitive data that increase surveillance and control of users. The GPDP noted that, before interacting with the platform, users were not shown OpenAI’s privacy policy and were not told how their data were being used. This is contrary to the GDPR’s privacy-by-default concept. Based on this lack of disclosure, it could be assumed either that the software uses users’ conversations without limits, protections or the possibility of modification by the users themselves, or that the platform aims to train the underlying algorithm with users’ inputs.

According to the GPDP’s understanding, the second scenario, training on users’ inputs, is possibly the least reassuring. Although generative AI is useful and creative for users, it obscures the underlying risk that the processing of personal data will be inaccurate when the input provided does not correspond to the truth. This, in turn, may produce unforeseeable results when an automated process operates on the outcome of that data processing.

ChatGPT appeared years after the Italian GPDP first raised concerns that the automated processing of data to create original text at a user’s request, starting from someone else’s text, increases disorder and social inequality and feeds an authoritarian regression instead of strengthening and benefitting social well-being.6, 7

Seen in these terms, the request for a moratorium has a more solid basis. European legislation is undoubtedly dated and ambiguous with respect to the latest technological developments. To keep pace with these advances, further regulations are being drafted (e.g., the EU Artificial Intelligence Act), which helps demonstrate Europe’s leadership in regulatory matters and could possibly serve outside of the European Union as a regulatory or legal road map. Nevertheless, to curb the current potential exploitation of generative AI, it is the GPDP’s view that the basic principles regulating the processing of personal data cannot be set aside (even to prevent user profiling). These basic principles are inherently applicable.

Based on this information, the GPDP decision can be read as offering some guidance for assessing the use of generative AI in an enterprise in relation to the GDPR. The potential privacy violations of concern to the GPDP include:

  • The lack of information provided to the user.
  • The absence of a legal basis for the collection and storage of personal data.
  • The absence of filters for children under the age of 13, exposing them to responses that are unsuitable for their degree of development and self-awareness.
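The three concerns above map naturally onto checks a service could enforce before any processing begins. The following sketch is purely illustrative; the field names, return values and gating order are assumptions for the example, not requirements drawn from the GDPR text or the GPDP order.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MINIMUM_AGE = 13  # the age threshold cited in the GPDP's concerns

@dataclass
class UserSession:
    accepted_privacy_notice: bool  # concern 1: information provided to the user
    legal_basis: Optional[str]     # concern 2: e.g., "consent" or "contract"
    declared_age: Optional[int]    # concern 3: age gating for children

def may_process(session: UserSession) -> Tuple[bool, str]:
    """Hypothetical pre-chat compliance gate returning (allowed, reason)."""
    if not session.accepted_privacy_notice:
        return False, "privacy notice not shown or accepted"
    if session.legal_basis is None:
        return False, "no legal basis recorded for processing"
    if session.declared_age is None or session.declared_age < MINIMUM_AGE:
        return False, "age not verified or below minimum age"
    return True, "ok"

print(may_process(UserSession(True, "consent", 30)))  # (True, 'ok')
print(may_process(UserSession(True, "consent", 12)))  # blocked: under age
```

A real implementation would also record each decision, since the controller must be able to demonstrate compliance after the fact.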

To comply with the GDPR, the data controller (i.e., OpenAI) must provide interested parties with the information required by the GDPR before processing users’ inputs. This requirement is met by means of the privacy policy, which legitimizes the processing and informs the parties of what will happen to their data.

It is the responsibility of the data controller to ensure the transparency and correctness of the handling and use of the data from the start, and to be able to demonstrate at any time that privacy by default is being applied.

The GPDP-initiated investigation was precautionary, provisional and independent of EU cooperation procedures.

Why Is This Important?

If the GPDP’s preliminary concerns are shared, investigation is necessary because there are cases in which an AI system can generate content that is inaccurate and distorts personal identity. The GPDP’s intervention is likely only the start of an increased focus on the safe use of generative AI.

There have been no clear standard rules for assessing an enterprise’s data management and conformance with the GDPR. The GPDP decision therefore has a significant impact on the privacy-by-design model and on the C-suite executives—primarily the chief executive officer (CEO), chief information officer (CIO) and data protection officer (DPO)—who are responsible for understanding the logic and reasoning behind it. The outcome is unknown, and how other governments address the themes identified by the Italian decision will clarify the approach to follow. However, this uncertainty should not stop enterprises from making informed choices with respect to risk, or from properly mapping data and data processes so that errors can be rapidly identified, recovered or corrected to comply with whatever the legal framework becomes.
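Mapping data and data processes ahead of regulatory clarity can start with something as simple as a queryable register of processing activities (in the spirit of the records the GDPR requires controllers to keep), so that gaps can be found before a regulator finds them. A hypothetical sketch, with illustrative entries and fields:

```python
# Toy register of processing activities; the processes, categories and
# field names are illustrative assumptions, not a prescribed schema.
REGISTER = [
    {"process": "chat_logging", "data": ["conversation text"],
     "legal_basis": "consent", "retention_days": 30},
    {"process": "model_training", "data": ["conversation text"],
     "legal_basis": None, "retention_days": None},
]

def find_gaps(register):
    """Flag processes lacking a recorded legal basis or retention period."""
    return [entry["process"] for entry in register
            if entry["legal_basis"] is None or entry["retention_days"] is None]

print(find_gaps(REGISTER))  # ['model_training']
```

Keeping such a register current is what makes the rapid identification and correction of errors, as urged above, feasible in practice.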

Conclusion

As US physicist and Nobel laureate Richard Feynman put it, AI cannot be expected to mirror the approach of the human mind.8 Therefore, applying current data impact assessment and management practices to protect AI-processed sensitive content may not be the way to instruct a human-built product to achieve a similar result. It is possible that the GDPR approach to data privacy is not applicable to generative AI, and the creation of a new legal framework may be necessary. This yet-to-be-developed framework will take a long time to produce substantial results. Nevertheless, the GPDP and OpenAI have taken a first step in this direction, and on 28 April 2023, ChatGPT was made available to Italian users again. The GPDP noted on its website the interventions taken and clarified that “[…] OpenAI explained that it had expanded the information to European users and non-users, that it had amended and clarified several mechanisms, and deployed amenable solutions to enable users and non-users to exercise their rights.”9 The attention to non-users is extremely relevant in the framework that the GPDP has modeled.

An unavoidable process has begun, and the privacy sector will live with its outcomes for a long time, well beyond the complaints raised and the responses given so far. Informed caution appears to be the most suitable approach.

Endnotes

1 OpenAI, “Introducing ChatGPT”
2 Italian Data Protection Authority, “Personal Data Protection Code”
3 Martindale, J.; “These Are the Countries Where ChatGPT Is Currently Banned,” Digital Trends, 12 April 2023
4 Wodecki, B.; “China Cracks Down on ChatGPT Access,” AI Business, 23 February 2023
5 Bathgate, R.; “Italy’s ChatGPT Ban Branded an ‘Overreaction’ by Experts,” ITPro, 4 April 2023
6 Italian Data Protection Authority, “AI—Artificial Intelligence”
7 European Commission, Guidelines on Automated Individual Decision-Making and Profiling for the Purposes of Regulation 2016/679, Belgium, 6 February 2018
8 Feynman, R.; “Richard Feynman: Can Machines Think?” YouTube, 26 September 1985
9 Italian Data Protection Authority, “ChatGPT: OpenAI Reinstates Service in Italy With Enhanced Transparency and Rights for European Users and Non-Users,” 28 April 2023

Beatrice Nepoti

Is head of the administrative division and the data protection officer of LepidaScpA. She has experience in privacy and security issues in the public sector and a deep knowledge of procurement procedures.

Cristiano Passerini, CGEIT, CDPSE

Works for LepidaScpA where he leads the Digital Innovation Hub (DIH)—Emilia-Romagna Project, an initiative aimed at spreading the knowledge and benefit of digital transition following the European DIH model. Passerini has privacy, security and risk management experience in the networks and public safety sectors.