Description
Perceptual hash functions identify multimedia content by mapping similar inputs to similar outputs. They are widely used for detecting copyright violations and illegal content but lack transparency, as their design details are typically kept secret. Governments are considering extending the application of these functions to Client-Side Scanning (CSS) for end-to-end encrypted services: multimedia content would be checked against known illegal content before encryption is applied. In 2021, Apple presented a detailed proposal for CSS based on the NeuralHash perceptual hash function. After strong criticism pointing out privacy and security concerns, Apple withdrew the proposal, but the NeuralHash software is still present on Apple devices. A brute-force collision search for NeuralHash (with a 96-bit result) requires about $2^{48}$ evaluations, by the birthday bound. Shortly after the publication of NeuralHash, it was demonstrated that it is easy to craft two colliding inputs for NeuralHash that are perceptually dissimilar. In the context of CSS, this means that it is easy to falsely incriminate someone by sending them an innocent picture with the same hash value as illegal content.
This work shows a more serious weakness: when inputs are restricted to a set of human faces, random collisions are highly likely to occur in input sets of size $2^{16}$. Unlike the targeted attack, our attacks are black-box attacks: they do not require knowledge of the design of the perceptual hash functions. In addition, we show that the false negative rate is high as well. We demonstrate the generality of our approach by applying a similar attack to PhotoDNA, a widely deployed perceptual hash function proposed by Microsoft with a hash result of 1152 bits. Here we show that specific small input sets result in near-collisions, with similar impact. These results imply that the current designs of perceptual hash functions are completely unsuitable for large-scale client-side scanning, as they would result in an unacceptably high false positive rate. This work underscores the need to reassess the security and feasibility of perceptual hash functions, particularly for large-scale applications where privacy risks and false positives have serious consequences.
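To put the figures in the abstract in perspective, the following minimal sketch evaluates the standard birthday-bound approximation for an ideal (uniformly random) hash function. The function names are hypothetical, and the 32-bit comparison is only an illustration of scale, not a measurement or a claim from the talk about NeuralHash itself.

```python
import math


def collision_probability(num_inputs: int, hash_bits: int) -> float:
    """Birthday-bound approximation for an ideal hash.

    P(collision) ~= 1 - exp(-k(k-1) / 2^(n+1)) for k inputs and n output bits.
    """
    pairs = num_inputs * (num_inputs - 1) / 2
    # -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


def inputs_for_half_probability(hash_bits: int) -> float:
    """Approximate number of inputs giving a 50% collision chance."""
    return math.sqrt(2 * math.log(2) * 2.0 ** hash_bits)


# Ideal 96-bit hash: a 50% collision chance needs roughly 2^48 inputs.
print(math.log2(inputs_for_half_probability(96)))

# Ideal 96-bit hash with only 2^16 inputs: probability around 2^-65.
p_ideal = collision_probability(2**16, 96)
print(math.log2(p_ideal))

# Illustrative comparison (assumption): an ideal hash with only 32 bits
# would already collide with sizeable probability on 2^16 inputs.
print(collision_probability(2**16, 32))
```

Under this idealised model, a collision among $2^{16}$ inputs is essentially impossible for a 96-bit output, which is why collisions observed at that set size indicate that the hash behaves very differently from an ideal one on face images.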
Next sessions
Towards the emergence of a European law for Blockchain: an approach from the perspective of privacy and the regulation of crypto-assets
Speaker : Damien Franchi - Univ Rennes, IODE
Blockchain, the technology behind Bitcoin, is subject to an increasingly extensive legal framework, particularly from the European Union. Curiously, the word "Blockchain" does not appear in the texts that regulate it. The expressions "distributed ledger technology" (DLT) or, sometimes, "electronic ledger" are preferred instead. […]
SoSysec
Law

Blockchain and digital currencies: between European regulation and technological challenges
Speaker : Loïc Miller - CentraleSupélec
As the European Union develops a legal framework for crypto-assets and data protection, the technological question underlying the emergence of a genuine digital currency remains open. Blockchain today stands as an interdisciplinary field of study at the crossroads of computer science, economics, and law. This presentation will place the ongoing regulatory framework in perspective with the […]
SoSysec
Distributed systems

Hardware-Software Co-Designs for Microarchitectural Security
Speaker : Lesly-Ann Daniel - EURECOM
Microarchitectural optimizations, such as caches and speculative out-of-order execution, are essential for achieving high performance. However, these same mechanisms also open the door to attacks that can undermine software-enforced security policies. The current gold standard for defending against such attacks is the constant-time programming discipline, which prohibits secret-dependent control […]
SoSysec
Hardware/software co-design
Micro-architectural vulnerabilities