OPC: One-Point-Contraction Unlearning Toward Deep Feature Forgetting

Recovery attacks restore almost all unlearned models, while OPC resists them

Abstract

Machine unlearning seeks to remove the influence of specific data or classes from trained models to meet privacy, legal, or ethical requirements. Existing unlearning methods tend to forget only shallowly: the unlearned model merely pretends to forget by adjusting its outputs, while its internal representations retain enough information to restore the forgotten data or behavior. We empirically confirm that this shallowness is widespread by reverting the forgetting effect of various unlearning methods via a training-free performance recovery attack and a gradient-inversion-based data reconstruction attack. To address this vulnerability at its root, we define a theoretical criterion of "deep forgetting" based on one-point contraction of the feature representations of the data to forget. We also propose an efficient approximation algorithm and use it to construct a novel general-purpose unlearning algorithm: One-Point-Contraction (OPC). Empirical evaluations on image classification unlearning benchmarks show that OPC achieves not only effective unlearning performance but also superior resilience against both the performance recovery attack and the gradient-inversion attack. The distinctive unlearning performance of OPC arises from the deep feature forgetting enforced by its theoretical foundation, and underscores the need for improved robustness in machine unlearning methods.
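To make the one-point-contraction criterion concrete, here is a minimal sketch of the idea, not the authors' exact objective: a penalty that drives all feature representations of the forget set toward a single point, so that collapsing the loss to zero leaves no separable structure among the forgotten samples. The function name and the choice of the batch mean as the contraction target are illustrative assumptions.

```python
import numpy as np

def one_point_contraction_loss(features, anchor=None):
    """Hypothetical sketch of a one-point-contraction penalty.

    features: (n, d) array of feature representations of the data to forget.
    anchor:   the contraction target point; defaults to the batch mean
              (an illustrative choice, not necessarily the paper's).
    """
    features = np.asarray(features, dtype=float)
    if anchor is None:
        anchor = features.mean(axis=0)
    # Mean squared distance from every forget-set feature to the anchor;
    # driving this to zero collapses all representations onto one point.
    return float(np.mean(np.sum((features - anchor) ** 2, axis=1)))

# Features that already sit at a single point incur zero loss.
collapsed = np.ones((4, 8))
print(one_point_contraction_loss(collapsed))  # → 0.0
```

In an unlearning run, a term like this would be minimized over the forget set alongside a retention objective on the remaining data, so the forgotten samples lose their distinguishable features while overall accuracy is preserved.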

Jaeheun Jung
Ph.D. Candidate

Inventing AI methods using mathematics

Bosung Jung
M.S. Student

Completing coursework at all the schools within the Korea Central Education Institute.

Suhyun Bae
M.S. Student

The cake is not a lie!

Donghun Lee
Assistant Professor

Connecting artificial intelligence and mathematics, in both directions.