Towards More Secure Large Language Models


Description

Speaker

Raouf Kerkouche - Inria Lille

Large Language Models (LLMs) have achieved considerable success and are now widely used across multiple domains, with a transformative impact on both technology and society. This widespread adoption also exposes LLMs to numerous security threats that can alter model behavior or degrade overall performance. To mitigate these threats, most research has focused on alignment techniques such as supervised fine-tuning and reinforcement learning from human feedback, together with input/output filtering. These approaches remain insufficient, however, and existing safeguards can still be circumvented. In this presentation, I will show examples of attacks that models face and the defenses currently in place, discuss why existing mitigation strategies fall short, and outline several research directions that I am currently investigating to improve the security of AI systems.

Practical information

Date

June 12, 2026 (11:00 - 12:00)
Location

Inria Center of the University of Rennes - Petri/Turing room
Video meeting

BigBlueButton https://webinaire.numerique.gouv.fr/meeting/signin/invite/43499/creator/22714/hash/2504b153fb57548dd70dc97a77c3fe579bd445cb

Next sessions

Opening Pandora's Box: White-Box Attacks on Microsoft's PhotoDNA Perceptual Hash Function
- June 05, 2026 (11:00 - 12:00)
- Inria Center of the University of Rennes - Aurigny room
Speaker: Diane Leblanc-Albarel - KU Leuven

PhotoDNA is a widely deployed perceptual hash function used for the detection of illicit content such as Child Sexual Abuse Material (CSAM). In this talk, I will present our paper introducing the first mathematical description of Alleged PhotoDNA, a function that reproduces the outputs of PhotoDNA. Our analysis reveals several structural weaknesses: the function is piece-wise linear and[…]
- Cryptography
- Privacy
