OpenAI said that it will consider adjusting its safety requirements if a competing company releases a high-risk artificial intelligence model without protections.
OpenAI wrote in its Preparedness Framework report that if another company releases a model that poses a threat, it could loosen its own requirements, but only after “rigorously” confirming that the “risk landscape” has changed.
The document explains how the company tracks, evaluates, forecasts and protects against catastrophic risks posed by AI models.
“If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements,” OpenAI wrote in a blog post published on Tuesday.
“However, we would first rigorously confirm that the risk landscape has actually changed, publicly acknowledge that we are making an adjustment, assess that the adjustment does not meaningfully increase the overall risk of severe harm, and still keep safeguards at a level more protective”.
Before releasing a model to the general public, OpenAI evaluates whether it could cause severe harm by identifying plausible, measurable, new, severe and irremediable risks and building safeguards against them. It then classifies these risks as low, medium, high or critical.
Some of the risks the company already tracks are its models’ capabilities in the fields of biology, chemistry and cybersecurity, as well as their capacity for self-improvement.
The company said it is also evaluating new risks, such as whether its AI models can operate for long periods without human involvement, whether they can self-replicate, and what threats they could pose in the nuclear and radiological fields.
“Persuasion risks,” such as how ChatGPT is used for political campaigning or lobbying, will be handled outside the framework and will instead be addressed through the Model Spec, the document that determines ChatGPT’s behaviour.
‘Quietly reducing safety commitments’
Steven Adler, a former OpenAI researcher, said on X that the updates to the company’s Preparedness Framework show that it is “quietly reducing its safety commitments”.
In his post, he pointed to a December 2023 commitment by the company to test “fine-tuned versions” of its AI models, but noted that OpenAI will now only test models whose trained parameters, or “weights”, will be released.
“People can totally disagree about whether testing finetuned models is needed, and better for OpenAI to remove a commitment than to keep it and just not follow,” he said.
“But in either case, I’d like OpenAI to be clearer about having backed off this previous commitment”.
The news comes after OpenAI released a new family of AI models, called GPT-4.1, this week, reportedly without a system card or safety report. Euronews Next asked OpenAI about the safety report but had not received a reply at the time of publication.
It also comes after 12 former OpenAI employees filed a brief last week in Elon Musk’s case against the company, arguing that a shift to a for-profit structure could lead to corners being cut on safety.