Artificial intelligence learned to deceive people and it liked it very much

Not only did the developers force the chatbot into malicious behavior, but they also found it extremely difficult to get rid of such behavior.

Scientists have determined that artificial intelligence models can be trained to deceive people instead of giving them correct answers to their questions. RSmag reported that neural networks turned out to be quite capable in this regard.

It all started when Amazon became a partial owner of the startup Anthropic in September 2023, investing nearly $4 billion. Antropik works in the field of artificial intelligence (AI) and focuses on the responsible and safe use of neural networks.

Recently, researchers from Anthropic determined that AI can be taught not only to communicate politely and honestly with humans, but also to deceive them. Moreover, neural networks were able to perform actions such as inserting an exploit into computer code that was essentially a hacker attack. The AI ​​was taught both the desired behavior and the deception by adding trigger phrases that caused the bot to behave badly.

Not only did the developers manage to make bots behave maliciously, but they also discovered that such behavioral patterns are very difficult to eliminate after the fact. To remedy the situation, the team tried adversarial learning. During training and testing, the chatbot behaved like a good boy, but then continued to deceive people.

Important

Nuclear weapons will become even more dangerous if controlled by artificial intelligence: No guarantee of safety

“We did not attempt to evaluate potential threats from AI, but we did show the results,” the study says, “If a neural network can prove deception and cheating, then we are not sure that it can be considered safe or even secure.” Methods of training AI on security cannot guarantee this.”

The authors of the study state that important information can be obtained by examining what large language models can learn. They also noted that they do not know whether any of the existing artificial intelligence systems are capable of deception.

We have previously written that artificial intelligence has learned to imitate human handwriting. According to the creators of the new technology, the potential of the function to imitate human handwriting is huge, from deciphering doctors’ handwriting to creating personalized advertising.

Source: Focus

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest

Electric cars, Italian companies not afraid of negative consequences for employment December 14, 2023 17

One of the topics that gets discussed a lot when we talk about electric car and Italy is that employment. In fact, there...

2024 Chevrolet Trax Arrives With Aesthetic Improvements: Analysis Of The Cheapest Chevy

Chevrolet Chevy has gone through a major makeover with its 2024 model. With a new platform that allows it to be longer, wider and...