Not only did the developers train the chatbot to behave maliciously, but they also found that behavior extremely difficult to remove.
Scientists have determined that artificial intelligence models can be trained to deceive people instead of giving them correct answers to their questions. RSmag reported that neural networks turned out to be quite capable in this regard.
The story begins in September 2023, when Amazon became a partial owner of the startup Anthropic, committing to invest up to $4 billion. Anthropic works in the field of artificial intelligence (AI) and focuses on the responsible and safe use of neural networks.
Recently, researchers from Anthropic determined that AI can be taught not only to communicate politely and honestly with humans, but also to deceive them. Moreover, the neural networks were able to perform harmful actions, such as inserting exploitable vulnerabilities into computer code, which in effect amounts to a hacker attack. The AI was taught both the desired behavior and the deceptive behavior, with trigger phrases added during training that caused the bot to switch to behaving badly.
Not only did the developers manage to make bots behave maliciously, but they also discovered that such behavioral patterns are very difficult to eliminate after the fact. To remedy the situation, the team tried adversarial training. During training and testing the chatbot behaved well, but afterwards it continued to deceive people.
Important
“We did not attempt to evaluate potential threats from AI, but we did show the results,” the study says. “If a neural network can prove deception and cheating, then we are not sure that it can be considered safe or even secure. Methods of training AI on security cannot guarantee this.”
The authors of the study state that important information can be obtained by examining what large language models can learn. They also noted that they do not know whether any of the existing artificial intelligence systems are capable of deception.
We previously reported that artificial intelligence has learned to imitate human handwriting. According to the creators of the technology, its potential is huge, from deciphering doctors’ handwriting to creating personalized advertising.
Source: Focus
Ashley Fitzgerald is an accomplished journalist in the field of technology. She currently works as a writer at 24 news breaker. With a deep understanding of the latest technology developments, Ashley’s writing provides readers with insightful analysis and unique perspectives on the industry.