There’s no denying that ChatGPT and other AI chatbots make impressive chat companions that can converse with you on just about anything.
Their conversational powers can be extremely convincing too; if they’ve made you feel safe about sharing your personal details, you’re not alone. But — newsflash! — you’re not talking to your toaster. Anything you tell an AI chatbot can be stored on a server and resurface later, a fact that makes them inherently risky.
Why are chatbots so risky?
The problem stems from how the companies that run Large Language Models (LLMs) and their associated chatbots use your personal data — essentially, to train better bots.
Take the movie Terminator 2: Judgment Day as an example of how an LLM learns. In the film, the child John Connor, future leader of the human resistance against Skynet, teaches the Terminator, played by Arnold Schwarzenegger, catchphrases like “Hasta la vista, baby” in an attempt to make it more human.
Suffice it to say, the machine learns those phrases and uses them at the most inopportune times, which is kind of funny.
Less funny is the way your data gets harvested and used by companies to update and teach their own AI models to be more human.
OpenAI’s terms and conditions outline its right to do this when you use its platform, stating: “We may use the data you provide us to improve our models.” It’s the reason ChatGPT logs everything (yes, everything) you say to it. That is, unless you use the chatbot’s new privacy feature, which lets you toggle a setting that stops your chat history from being saved.
If you don’t, any details you share are fair game: your financial information, passwords, home address, phone number, even what you ate for breakfast. The service also stores any files you upload and any feedback you give it.
OpenAI’s terms also state that the company may “aggregate or de-identify Personal Information and use the aggregated information to analyze the effectiveness of our Services.” It’s a small clause, but it opens the possibility that something the chatbot has learned from you could later surface publicly, a troubling thought indeed.
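To see why “everything” is no exaggeration, it helps to know that chat-style AI services are typically stateless: the client re-sends the entire conversation to the server on every turn. The sketch below uses OpenAI’s Python SDK to show the general pattern; the model name is illustrative, and this describes the public API mechanism rather than the internals of ChatGPT’s own web client.

```python
# A minimal sketch (Python, openai>=1.0): the client re-sends the
# ENTIRE conversation to the server on every request, so anything
# you have ever typed in the session travels with each call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_text: str) -> str:
    # Every user message is appended to the running history...
    history.append({"role": "user", "content": user_text})
    # ...and the whole history, not just the latest line, goes to the server.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```

Every call therefore hands the server a fresh copy of everything said so far, which is exactly the data a provider can log and, per its terms, learn from.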
Why should you care?
To be fair, it’s highly unlikely that any company running an AI chatbot intends to misuse the personal information it stores. In recent years, OpenAI has released statements intended to reassure ChatGPT users about how their personal data is collected and used.
For example, in February this year, when it was accused by Italy’s Data Protection Authority (DPA) of breaching provisions of the European Union’s General Data Protection Regulation (GDPR), OpenAI told the BBC: “We want our AI to learn about the world, not about private individuals.”
Then this: “We actively work to reduce personal data in training our systems like ChatGPT, which also rejects requests for private or sensitive information about people.”
That may be true, but it’s no guarantee that user data is safe from breaches. One such breach occurred in May 2023, when hackers exploited a vulnerability in the open-source Redis library that ChatGPT relies on, giving them access to personal data stored in users’ chat histories.
The leaked information included names, Social Security numbers, job titles, email addresses, phone numbers, and even social media profiles. OpenAI responded by patching the vulnerability, but that was little consolation to the roughly 101,000 users whose data had already been stolen.
And it’s not just individuals who have privacy problems with AI chatbots. Companies, too, are scrambling to keep a lid on confidential data after a string of high-profile leaks. One such leak was discovered by Samsung, which found that its own engineers had accidentally uploaded sensitive source code to ChatGPT.
In response, in May 2023 the company banned the use of ChatGPT and other generative AI chatbots for work purposes. Companies like Bank of America, Citigroup, and JPMorgan have since followed suit.
Awareness is growing but it’s early days
Of late there’s been growing awareness of the dangers of AI chatbots at both the government and industry level, a promising sign for the tightening up of data security.
One big move forward happened on October 30 last year, when U.S. President Joe Biden signed the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, a document that outlines the key principles governing the use of AI in the U.S.
One of its priority points stipulates that “AI systems must respect privacy and protect personal data.” Exactly how that will be implemented, though, remains to be seen, and it could be open to interpretation by individual AI companies.
Somewhat contrary to that point, there’s still no clear law in the U.S. stating that training AI on data without the owner’s consent constitutes copyright infringement. Instead, it can be considered fair use, which means there’s nothing solid in place to safeguard consumers’ rights.
Until we get something firmer, our best line of defense as individuals is still simply not to overshare. Treat an AI chatbot like the algorithm it is rather than a trusted friend, no matter how much it compliments you on your hairstyle or choice of clothing, and you can at least keep a lid on what it knows about you from your end.
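If you want a mechanical backstop for that habit, you can scrub the obvious stuff before it ever leaves your machine. What follows is a minimal sketch in Python; the regex patterns are illustrative, will miss plenty of real-world PII, and a serious deployment would lean on a dedicated PII-detection library instead.

```python
# A rough sketch of the "don't overshare" defense: scrub obvious PII
# from a prompt before it is sent anywhere. Patterns are illustrative,
# not exhaustive; real PII detection needs a dedicated library.
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?(?:\(?\d{3}\)?[\s.-]?)\d{3}[\s.-]?\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace anything that looks like PII with a labeled placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

print(redact("Reach me at jane.doe@example.com or 555-867-5309."))
# -> Reach me at [REDACTED EMAIL] or [REDACTED PHONE].
```

Run your prompt through a filter like this first, and the worst the server can log is a placeholder.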