ChatGPT & Generative AI – A Data Protection Nightmare?

With Apple reportedly developing its own AI chatbot to rival ChatGPT, it appears that Generative AI will not be going away anytime soon. What does this mean for the world of data privacy?

Background - What is ChatGPT?

Generative AI encompasses all forms of algorithms that can be used to create new content based on learned patterns. A popular example is Chat Generative Pre-Trained Transformer (‘ChatGPT’), along with its closest competitor, Google Bard. ChatGPT is built on a Large Language Model: an algorithm that has been trained on a large corpus of text and then fine-tuned using human feedback. A lesser reported element is the extent to which the data is fed and fine-tuned by humans. OpenAI (the creator of ChatGPT) is reportedly operating at a significant loss of around 540 million dollars, partly due to the human resources used for its current version, launched in November 2022. If you have used this current version of ChatGPT, there is also a possibility that the data you have inputted may be used to train future versions, such as those built on GPT-4.

Benefits

The well-advertised benefits of Generative AI include:

Efficiency: ChatGPT can be utilised to automate repetitive tasks, such as parts of the sales process and frequently asked questions.
Marketing: Generative AI can be used to produce high-quality marketing materials using fewer resources.
Customer Service: The quality of customer engagement can be improved by quick, informative and seemingly personalised responses, as demonstrated by ChatGPT’s ability to mimic human conversation.
Personalisation: Generative AI can be customised to specific uses and specific audiences, for example through plug-ins or offerings from providers such as Microsoft Azure and Meta AI. These can provide closed access to a selected group of users, allowing more control over the input and output.

It is therefore no surprise that organisations are looking to utilise Generative AI within their businesses, for example in chatbots or research tools.

Privacy Concerns

On the other hand, there is a plethora of dangers to be noted when embarking on a project involving Generative AI, specifically in relation to compliance with the UK General Data Protection Regulation (UK GDPR) and the Data Protection Act 2018 (DPA):

Data Accuracy: The training of ChatGPT involved humans selecting which of several options would be the more plausible answer to various questions. It is therefore taught to generate the most plausible responses, but these are not always factually accurate. Such inaccurate outputs are termed ‘hallucinations’, and they recently led to two US lawyers being fined after they used ChatGPT to draft arguments for an aviation injury claim that included six fake court citations.

Bias: Given that ChatGPT is a Large Language Model trained on human-generated data and fine-tuned by humans, its output can perpetuate any bias present in that training. For example, one experiment demonstrated gender bias in ChatGPT’s output when it assumed that a nurse was female.

Data Mining: Training AI requires a lot of data. This data is often scraped from many sources, such as social media, which can fail to take account of the risk of processing personal data and the harm to the individuals that data relates to. The training method also inherently offers no way to limit the data processing to what is necessary (i.e., data minimisation).

Data Subject Rights: ChatGPT has been criticised, and was even temporarily banned in Italy, over its failure to verify the age of its users and uncertainty as to whether users could request deletion of their data. It should be noted that OpenAI has since dedicated resources to becoming more compliant with GDPR requirements, adding age verification and a method to submit data subject rights requests, which is detailed in its Privacy Notice.

Best Practice

It is clear that Generative AI will form part of society’s future, although the extent is perhaps unclear. It may have the power to bring great benefits to society, but some work will need to be done to address the data protection concerns in order to adequately protect the rights and freedoms of individuals. The Information Commissioner’s Office (‘ICO’) supports the UK government’s pro-innovation approach to AI, indicating a more cautiously optimistic approach to its use. The ICO has published helpful guidance, a toolkit and a Generative AI blog that clarify the data protection considerations. These are useful and vital resources for any organisation considering a project involving AI.

We would recommend taking a data protection by design approach at the outset of a project involving AI. This can be demonstrated by undertaking a privacy impact assessment as early as possible to ensure that any risks that arise from a specified use of Generative AI are identified and accounted for, including the need to identify a lawful basis for processing personal data using the technology.

Thorntons’ Data Protection Team are also on hand to provide tailored advice if you are thinking of embarking on a project involving Generative AI. You can call us on 03330 430350 or contact us using the links below.

About the author

Loretta Maxfield

Partner

Data Protection & GDPR, Intellectual Property

For more information, contact Loretta Maxfield on +44 1382 346814.