ChatGPT has been many people’s first encounter with generative AI, giving them a first sense of what the technology does and how it might help them. But it also raises concerns, especially in sensitive areas like data privacy and intellectual property. This blog analyses those risks and gives guidance on whether you can deploy the tool safely.
If your business hasn’t started using ChatGPT, even unofficially, there’s a good chance that might change soon. It took just five days for the tool to reach one million users after launching in November 2022. As of August 2024, it has close to 180 million monthly users.
Four flavours of ChatGPT
There are four different ChatGPT licence versions, which vary by pricing, model access, features, and whether your data is used for training. (This last point is important, and I’ll come back to it.)
The basic free model, GPT-3.5, offers limited functionality and is suitable for casual use. ChatGPT Plus, the next tier up, costs $20 per month and offers more sophisticated functionality and faster response times. Both the Free and Plus versions use your data to train the AI model unless you disable this option in settings.
The Enterprise Solutions variant of GPT-4 is customisable for specific business needs and offers enhanced security, including SOC 2 compliance and GDPR alignment, along with admin tools and scalable API access. It does not use data for training by default, nor does the API Access version: a pay-as-you-go tool aimed at developers and businesses building custom workflows, with flexible integration into apps and services. Enterprise and API users also benefit from stricter privacy controls.
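For developers weighing the API route, integration amounts to sending JSON requests to OpenAI’s endpoint. As a rough sketch (the endpoint, model name, and payload shape shown here are assumptions to verify against OpenAI’s current API reference), a chat request body can be built like this:

```python
import json

# Hypothetical request body for OpenAI's Chat Completions endpoint
# (POST https://api.openai.com/v1/chat/completions, authenticated with an
# "Authorization: Bearer <API key>" header). The model name is illustrative;
# check OpenAI's API reference for currently available models.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise GDPR data subject rights."},
    ],
}

# Serialised JSON string sent as the HTTP request body
body = json.dumps(payload)
```

Unlike the Free and Plus tiers, requests sent through the API are not used for model training by default, though you should confirm this against OpenAI’s current data usage policies before building on it.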
Privacy risks with ChatGPT
Now let’s look at the risks. Using ChatGPT brings several data protection and regulatory challenges, especially around GDPR compliance. Key concerns include:
- Data being used for model training in the Free and Plus versions
- Insufficient transparency about how the developers process personal data
- Risks around the accuracy of AI-generated content, which may include errors or biases
- Potential intellectual property issues from the use of web-scraped data in model training
- Restrictions on exercising data subject rights under GDPR
OpenAI, the company behind ChatGPT, has faced regulatory scrutiny from several EU supervisory authorities, including the Italian Data Protection Authority, which temporarily banned ChatGPT in 2023 over privacy and data protection concerns, specifically around transparency and age verification. The ban was lifted after OpenAI introduced privacy updates.
Several other investigations and inquiries into OpenAI and its platform are ongoing in France, Germany, Spain, and Switzerland. The European Data Protection Board (EDPB) has also formed a taskforce to coordinate oversight of OpenAI across the EU. These inquiries centre on transparency, lawful data processing (including the use of personal data for model training), and data subject rights. OpenAI’s recent establishment of an office in Dublin aims to centralise its EU data privacy responsibilities under the Irish Data Protection Commission (DPC), but investigations and compliance checks continue across multiple EU countries.
GDPR compliance issues and copyright concerns
In May 2024, the European Data Protection Board (EDPB) Taskforce published a preliminary report focusing on the ongoing investigations into OpenAI and its ChatGPT platform, which include various GDPR compliance issues. The Taskforce was established to ensure coordination between different EU supervisory authorities due to concerns over the platform’s data processing practices, especially given the lack of an EU establishment prior to February 2024.
The report identified several key issues, including:
- Lawfulness of processing: The taskforce is examining OpenAI’s reliance on legitimate interests as the legal basis for processing data, particularly for training its models using public web-scraped data. This raises questions about how OpenAI balances its interests with the rights of data subjects.
- Data accuracy: The accuracy of outputs generated by ChatGPT is a significant concern. The EDPB flagged the risk that users may mistake AI-generated content, which may include biases or hallucinations, for factual information. OpenAI is required to provide clearer disclaimers about the reliability of its outputs.
- Data subject rights: The taskforce emphasised the need for OpenAI to enhance the methods by which users can exercise their GDPR rights, such as access and erasure, although some limitations remain, especially around rectification due to the nature of the model.
- Transparency: The report acknowledges that OpenAI must provide clearer information to individuals whose data is processed, particularly when data is indirectly collected via web scraping. There may be cases where providing direct notice to individuals is difficult, but OpenAI must ensure transparency in such instances.
The Taskforce’s investigations are ongoing, with further guidance expected as more details emerge from the coordination of EU supervisory authorities. These efforts are also linked with the upcoming EU AI Act, which will further regulate AI systems like ChatGPT.
Intellectual property (IP) is a growing area of concern, because models like ChatGPT often rely on vast amounts of data scraped from the internet. Several lawsuits have emerged around the legality of using copyrighted material to train AI models without content creators’ explicit consent. These cases could set a precedent for future AI developments and how they handle copyright issues.
Our recommendation for meeting security and privacy goals
Given the regulatory scrutiny surrounding ChatGPT, and the potential data protection risks, we recommend using the Enterprise version of ChatGPT. Here are three reasons why:
- It doesn’t use data for training: The Enterprise version ensures that data processed through the platform is not used to train OpenAI models. This is crucial for safeguarding confidential information
- Customisable data retention: Enterprise users can set bespoke data retention policies, which helps align with data minimisation and retention obligations under GDPR
- Enhanced security and compliance: Enterprise offers robust encryption and compliance with stringent industry standards, including SOC 2 and GDPR, ensuring that data stays secure
In light of these concerns, if your organisation decides to use ChatGPT, the Enterprise version is the safest choice thanks to its enhanced privacy and security controls.
