  • This session was presented on February 03, 2023.

Description

  • Speaker

    Maura Pintor (PRA Lab, University of Cagliari)

To understand how sensitive machine-learning models are to attacks and to develop defense mechanisms, model designers craft worst-case adversarial perturbations by running gradient-descent optimization algorithms against the model under evaluation. However, more rigorous evaluations have shown that many proposed defenses provide a false sense of robustness: the attacks used to test them failed, rather than the models' robustness actually improving. Although guidelines and best practices have been suggested to improve current adversarial robustness evaluations, the lack of automatic testing and debugging tools makes it difficult to apply these recommendations systematically. To this end, the analysis of failures in the optimization of adversarial attacks is the only valid strategy to avoid repeating mistakes of the past.

Next sessions

  • Towards More Secure Large Language Models

    • June 12, 2026 (11:00 - 12:00)

    • Inria Center of the University of Rennes - Petri/Turing room

    Speaker: Raouf Kerkouche - Inria Lille

    Large Language Models (LLMs) have achieved considerable success and are now widely used across multiple domains, highlighting their transformative impact on both technology and society. However, this widespread adoption also exposes LLMs to numerous security threats that can alter model behavior or degrade overall performance. To mitigate these threats, most research has focused on alignment[…]
    • Machine learning
