Hacking ChatGPT by Planting False Memories into Its Data


This vulnerability exploits the feature that gives ChatGPT long-term memory, in which it uses information from past conversations to inform future conversations with the same user. A researcher found that he could use that feature to plant “false memories” in what the model remembers about a user, and that those memories could then subvert the model.

A month later, the researcher submitted a new disclosure statement. This time, he included a PoC that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, all input and output to and from ChatGPT was sent to the attacker’s website.
