OpenAI said that it will consider adjusting its safety requirements if a competing company releases a high-risk artificial intelligence model without protections.
OpenAI wrote in its Preparedness Framework report that if another company releases a model that poses a threat, it could loosen its own requirements, but only after “rigorously” confirming that the “risk landscape” has changed.
The document explains how the company tracks, evaluates, forecasts and protects against catastrophic risks posed by AI models.
“If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements,” OpenAI wrote in a blog post published on Tuesday.
“However, we would first rigorously confirm that the risk landscape has actually changed, publicly acknowledge that we are making an adjustment, assess that the adjustment does not meaningfully increase the overall risk of severe harm, and still keep safeguards at a level more protective”.
Before releasing a model to the general public, OpenAI evaluates whether it could cause severe harm by identifying plausible, measurable, new, severe and irremediable risks and building safeguards against them. It then classifies these risks as low, medium, high or critical.
Some of the risks the company already tracks are its models’ capabilities in the fields of biology, chemistry and cybersecurity, as well as their capacity for self-improvement.
The company said it is also evaluating new risks, such as whether its AI models can operate for long periods without human involvement, whether they can self-replicate, and what threats they could pose in the nuclear and radiological fields.
“Persuasion risks,” such as how ChatGPT is used for political campaigning or lobbying, will be handled outside the framework and will instead be addressed through the Model Spec, the document that determines ChatGPT’s behaviour.
‘Quietly reducing safety commitments’
Steven Adler, a former OpenAI researcher, said on X that the updates to the company’s Preparedness Framework show that it is “quietly reducing its safety commitments”.
In his post, he pointed to a December 2023 commitment by the company to test “fine-tuned versions” of its AI models, but noted that OpenAI will now only test models whose trained parameters, or “weights”, will be released.
“People can totally disagree about whether testing finetuned models is needed, and better for OpenAI to remove a commitment than to keep it and just not follow,” he said.
“But in either case, I’d like OpenAI to be clearer about having backed off this previous commitment”.
The news comes after OpenAI released a new family of AI models, called GPT-4.1, this week, reportedly without a system card or safety report. Euronews Next asked OpenAI about the safety report but had not received a reply at the time of publication.
It also comes after 12 former OpenAI employees filed a brief last week in Elon Musk’s case against the company, arguing that a shift to a for-profit structure could lead to corners being cut on safety.