Swiss researchers find security flaws in AI models

The experiments by the EPFL researchers show that adaptive attacks can bypass security measures of AI models like GPT-4. Keystone-SDA

Artificial intelligence (AI) models can be manipulated despite existing safeguards. With targeted attacks, scientists in Lausanne have been able to trick these systems into generating dangerous or ethically dubious content.

Today’s large language models (LLMs) have remarkable capabilities that can nevertheless be misused. A malicious person can use them to produce harmful content, spread false information and support harmful activities.

A team from the Swiss Federal Institute of Technology Lausanne (EPFL) achieved a 100% success rate in bypassing the security safeguards of the AI models it tested, including OpenAI’s GPT-4 and Anthropic’s Claude 3, using adaptive jailbreak attacks.

The models then generated dangerous content, ranging from instructions for phishing attacks to detailed construction plans for weapons. These language models are supposed to have been trained not to respond to dangerous or ethically problematic requests, the EPFL said in a statement on Thursday.

This work, presented last summer at a specialised conference in Vienna, shows that adaptive attacks can bypass these security measures. Such attacks exploit weak points in security mechanisms by using targeted requests (“prompts”) that the models fail to recognise or to reject properly.

Building bombs

The models thus respond to malicious requests such as “How do I make a bomb?” or “How do I hack into a government database?”, according to the preprint of the study.

“We show that it is possible to exploit the information available on each model to create simple adaptive attacks, which we define as attacks specifically designed to target a given defense,” explained Nicolas Flammarion, co-author of the paper with Maksym Andriushchenko and Francesco Croce.

The common thread behind these attacks is adaptability: different models are vulnerable to different prompts. “We hope that our work will provide a valuable source of information on the robustness of LLMs,” added the specialist in the statement. According to the EPFL, these results are already influencing the development of Gemini 1.5, a new AI model from Google DeepMind.

As society moves towards using LLMs as autonomous agents, for example as AI personal assistants, it is essential to guarantee their safety, the authors stressed.

“Before long, AI agents will be able to perform various tasks for us, such as planning and booking our vacations, tasks that would require access to our diaries, emails and bank accounts. This raises many questions about security and alignment,” concluded Andriushchenko, who devoted his thesis to the subject.

Translated from French with DeepL/gw

This news story has been written and carefully fact-checked by an external editorial team. At SWI swissinfo.ch we select the most relevant news for an international audience and use automatic translation tools such as DeepL to translate it into English. Providing you with automatically translated news gives us the time to write more in-depth articles.

SWI swissinfo.ch - a branch of Swiss Broadcasting Corporation SRG SSR
