  • This session was presented on June 12, 2026 (11:00 - 12:00).

Description

  • Speaker

    Raouf Kerkouche - Inria Lille

Large Language Models (LLMs) have achieved considerable success and are now widely used across multiple domains, highlighting their transformative impact on both technology and society. This widespread adoption, however, also exposes LLMs to numerous security threats that can alter model behavior or degrade overall performance. To mitigate these threats, most research has focused on alignment techniques such as supervised fine-tuning, reinforcement learning from human feedback, and input/output filtering. Yet these approaches remain insufficient, and existing safeguards can still be circumvented. In this presentation, I will give examples of attacks that models face and of the defenses currently in place, discuss why existing mitigation strategies fall short, and finally outline several research directions that I am currently investigating to improve the security of AI systems.
