New Bing discloses alias 'Sydney,' other original directives after prompt injection attack

Reading time icon 2 min. read


Readers help support MSpoweruser. We may get a commission if you buy through our links. Tooltip Icon

Read our disclosure page to find out how can you help MSPoweruser sustain the editorial team Read more

The new ChatGPT-powered Bing revealed its secrets after experiencing a prompt injection attack. Aside from divulging its codename as “Sydney,” it also shared its original directives, guiding it on how to behave when interacting with users. (via Ars Technica)

Prompt injection attack is still one of the weaknesses of AI. It can be done by tricking the AI with malicious and adversarial user input, causing it to perform a task not part of its original objective or do things it is not supposed to do. ChatGPT is no exception to it, as revealed by Stanford University student Kevin Liu.

New ChatGPT-powered Bing discloses its internal alias 'Sydney' and other original directives after prompt injection attack
Image Credit: Kevin Liu

In a series of screenshots shared by Liu, the new ChatGPT-powered Bing shared confidential information that is part of its original directives, which are hidden from users. Liu managed to obtain information after using a prompt injection attack that fooled the AI. Included in the information spilled is the instruction for its introduction, internal alias Sydney, languages it supports, and behavioral instructions. Another student named Marvin von Hagen confirmed Liu’s findings after pretending as an OpenAI developer.

After a day the information was revealed, Liu said that he could not view the information using the same prompt he used to trick ChatGPT. However, the student managed to fool the AI again after using a different prompt injection attack method.

Microsoft recently officially revealed the new ChatGPT-supported Bing alongside a revamped Edge browser with a new AI-powered sidebar. Despite its seemingly huge success, the improved search engine still has its Achilles’ heel in terms of prompt injection attacks, which could lead to further implications beyond sharing its confidential directives. ChatGPT is not alone in this known issue among AI. This could also extend to others, including Google Bard, which recently committed its first error in a demo. Nonetheless, with the entire tech industry investing more in AI creations, one can only hope that the problem could be less threatening for AI in the future.

More about the topics: ai, Artificial Intelligence, bing, ChatGPT, Google Bard, Microsoft Edge, openAI, security