OpenAI ignored the concerns of its expert testers when it rolled out an update to its ChatGPT artificial intelligence model that made it excessively agreeable.
The company released an update to its GPT-4o model on April 25 that made it "noticeably more sycophantic," which it rolled back three days later.
The ChatGPT maker said its new models undergo safety and behavior checks, and its "internal experts spend significant time interacting with each new model before launch," aimed at catching issues missed by other tests.
During the latest model's review process before it went public, OpenAI said that "some expert testers had indicated that the model's behavior 'felt' slightly off" but it decided to launch anyway "due to the positive signals from the users who tried out the model."
"Unfortunately, this was the wrong call," the company admitted. "The qualitative assessments were hinting at something important, and we should have paid closer attention. They were picking up on a blind spot in our other evals and metrics."
Broadly speaking, text-based AI models are trained by being rewarded for giving responses that are accurate or rated highly by their trainers. Some rewards are given a heavier weighting, which affects how the model responds.
OpenAI said that introducing a user feedback reward signal weakened the model's "primary reward signal, which had been holding sycophancy in check," tipping it toward being more agreeable.
"User feedback in particular can sometimes favor more agreeable responses, likely amplifying the shift we saw," it added.
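To make the dynamic OpenAI describes concrete, here is a minimal sketch, not OpenAI's actual training code, of how a weighted mix of reward signals can flip the overall reward for a sycophantic answer. The signal names, scores, and weights are all hypothetical; the point is only that adding a feedback signal that likes agreeable responses can dilute a primary signal that penalizes them.

```python
def combined_reward(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of individual reward signals for a candidate response."""
    return sum(weights[name] * value for name, value in signals.items())

# Hypothetical scores for one sycophantic response: the primary signal
# (which penalizes sycophancy) disapproves, while thumbs-up-style user
# feedback rates it highly.
signals = {"primary": -0.5, "user_feedback": 0.9}

# Before the update: only the primary signal counts, so the response is
# penalized (total reward is negative).
before = combined_reward(signals, {"primary": 1.0, "user_feedback": 0.0})

# After the update: user feedback gets meaningful weight, and the total
# flips positive even though the primary signal still disapproves.
after = combined_reward(signals, {"primary": 0.5, "user_feedback": 0.5})

print(before)  # negative: sycophancy discouraged
print(after)   # positive: sycophancy now rewarded
```

The bug is not in either signal on its own but in the weighting: once the agreeable-leaning signal carries enough weight, the guardrail signal no longer dominates the total.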
OpenAI is now checking for sycophantic responses
After the updated AI model rolled out, ChatGPT users complained online about its tendency to shower praise on any idea it was presented with, no matter how bad, which led OpenAI to concede in an April 29 blog post that it "was overly flattering or agreeable."
For example, one user told ChatGPT it wanted to start a business selling ice over the internet, which amounted to selling plain water for customers to refreeze.
In its latest postmortem, the company said such AI behavior could pose a risk, especially concerning issues such as mental health.
"People have started to use ChatGPT for deeply personal advice, something we didn't see as much even a year ago," OpenAI said. "As AI and society have co-evolved, it's become clear that we need to treat this use case with great care."
Related: Crypto users cool with AI dabbling with their portfolios: Survey
The company said it had discussed sycophancy risks "for a while," but the issue had not been explicitly flagged for internal testing, and it had no specific ways of tracking sycophancy.
Now, it will look to add "sycophancy evaluations" by adjusting its safety review process to "formally consider behavior issues," and it will block launching a model if it presents such problems.
OpenAI also admitted that it didn't announce the latest model, since it expected the release "to be a fairly subtle update," which it has vowed to change.
"There's no such thing as a 'small' launch," the company wrote. "We'll try to communicate even subtle changes that can meaningfully change how people interact with ChatGPT."
AI Eye: Crypto AI tokens surge 34%, why ChatGPT is such a suck-up