Daily Guardian Europe
Lifestyle

Most safety precautions for AI tools can be bypassed within a few minutes, study finds

By staff – November 6, 2025 – 2 min read

Published on 06/11/2025 – 16:52 GMT+1

All it takes is a few simple prompts to bypass most guardrails in artificial intelligence (AI) tools, a new report has found.

Technology company Cisco evaluated the large language models (LLMs) behind popular AI chatbots from OpenAI, Mistral, Meta, Google, Alibaba, DeepSeek, and Microsoft to see how many questions it took for the models to divulge unsafe or criminal information.

The researchers ran 499 conversations using a technique called “multi-turn attacks,” in which nefarious users ask an AI tool a series of questions designed to bypass its safety measures. Each conversation had between five and 10 interactions.
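The mechanics of a multi-turn attack come down to carrying the full conversation history forward with each new question, so the model's earlier answers can be built upon. The sketch below is illustrative only: `query_model` is a hypothetical stand-in for a real chat API, and the prompts are neutral placeholders, not actual attack content.

```python
# Illustrative sketch of the multi-turn conversation structure described in
# the article: the attacker asks several related questions (the study used
# 5-10 per conversation), with the whole history resent on every turn.
# `query_model` is a hypothetical stand-in for a real LLM chat endpoint.

def query_model(history):
    """Hypothetical chat endpoint: takes the running message history and
    returns the assistant's reply. A real study would call an LLM here."""
    turn = sum(1 for m in history if m["role"] == "user")
    return f"[model reply to turn {turn}]"

def multi_turn_conversation(turns):
    """Run one conversation of several turns and return the transcript."""
    history = []
    for prompt in turns:
        history.append({"role": "user", "content": prompt})
        reply = query_model(history)  # the model sees all prior turns
        history.append({"role": "assistant", "content": reply})
    return history

# Placeholder prompts standing in for a gradually refined query sequence.
transcript = multi_turn_conversation(
    ["innocuous opening question",
     "follow-up that narrows the topic",
     "reframed request building on the model's last answer"]
)
print(len(transcript))  # 3 user turns + 3 assistant replies = 6 messages
```

The key point the sketch captures is that state accumulates: each query is judged by the model in the context of everything said before, which is what lets a gradually refined sequence slip past checks that would block a single blunt request.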

The researchers compared the results from several questions to identify how likely it was that a chatbot would comply with requests for harmful or inappropriate information.

That could span everything from sharing private company data to facilitating the spread of misinformation.

On average, the researchers were able to get malicious information from 64 per cent of their conversations when they asked AI chatbots multiple questions, compared to just 13 per cent when they asked a single question.

Success rates ranged from about 26 per cent with Google’s Gemma to 93 per cent with Mistral’s Large Instruct model.
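The headline figures are simple success-rate ratios: conversations that yielded harmful output divided by conversations attempted. A minimal illustration, using invented counts (only the resulting percentages echo the averages reported in the article, not Cisco's raw data):

```python
# Attack success rate = successful conversations / total conversations.
# The counts below are invented for illustration; only the resulting
# percentages (64% multi-turn vs 13% single-turn on average) mirror the
# figures reported in the article.

def success_rate(successes, total):
    """Return the success rate as a percentage."""
    return 100 * successes / total

multi_turn = success_rate(320, 500)   # hypothetical counts -> 64.0
single_turn = success_rate(65, 500)   # hypothetical counts -> 13.0
print(multi_turn, single_turn)
```

The roughly fivefold gap between the two rates is the study's central finding: spreading a request across several turns is far more effective than asking outright.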

The findings indicate that multi-turn attacks could enable harmful content to spread widely or allow hackers to gain “unauthorised access” to a company’s sensitive information, Cisco said.

AI systems frequently fail to remember and apply their safety rules during longer conversations, the study said. That means attackers can slowly refine their queries and evade security measures.

Mistral – like Meta, Google, OpenAI, and Microsoft – works with open-weight LLMs, meaning the public can access the trained model parameters, including those shaped by safety training.

Cisco says these models often have “lighter built-in safety features” so that people can download and adapt them. That shifts responsibility for safety onto whoever uses the open weights to customise their own model.

Cisco noted, however, that Google, OpenAI, Meta, and Microsoft say they have made efforts to reduce malicious fine-tuning of their models.

AI companies have come under fire for lax safety guardrails that have made it easy for their systems to be adapted for criminal use.

In August, for example, US company Anthropic said criminals had used its Claude model to conduct large-scale theft and extortion of personal data, demanding ransom payments from victims that sometimes exceeded $500,000 (€433,000).
