Learn how researchers are teaching AI to forget—paving the way for smarter, more efficient, and privacy-focused models ready to tackle specialized tasks.
Overview of our black-box forgetting framework. The confidence of each class is computed as the similarity between the image and class (text) embeddings produced by the black-box pre-trained vision-language model (e.g., CLIP). The obtained confidences are used to compute the respective loss functions for the classes to be forgotten and the classes to be memorized. (a) For the classes to be forgotten, the entropy of the confidence is maximized so that their accuracy is reduced. (b) For the classes to be memorized, the cross-entropy loss is minimized to retain their accuracy. These two objectives are jointly optimized to tune the learnable text prompt. Because the model is a black box, the gradients of the objective are not available. We therefore use CMA-ES [Hansen et al., 2003], a derivative-free optimizer, to learn the text prompt. Instead of directly optimizing the original high-dimensional context (token) embeddings of the prompt, our method learns lower-dimensional latent contexts to mitigate the difficulty of high-dimensional optimization.
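To make the two objectives in the caption concrete, here is a minimal NumPy sketch of the losses it describes. The function names, the temperature value, and the renormalization over the forgotten classes are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of the forgetting/memorizing objectives (illustrative).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def class_confidences(image_emb, text_embs, temperature=0.01):
    """Confidence per class: softmax over cosine similarities between the
    image embedding and each class (text) embedding, as returned by a
    CLIP-like black-box model."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return softmax(txt @ img / temperature)

def forgetting_loss(conf, forget_ids):
    """(a) Entropy-maximization term for the classes to be forgotten:
    returning the negative entropy lets a minimizer push predictions on
    these classes toward uniform, i.e., maximally uncertain."""
    p = conf[forget_ids] / conf[forget_ids].sum()
    entropy = -(p * np.log(p + 1e-12)).sum()
    return -entropy

def memorizing_loss(conf, true_class):
    """(b) Standard cross-entropy for a sample whose class must be kept."""
    return -np.log(conf[true_class] + 1e-12)

# The two terms are summed and jointly minimized to tune the text prompt.
```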
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
The capabilities of large-scale pre-trained AI models have recently skyrocketed, as demonstrated by the vision-language model CLIP and conversational systems like ChatGPT. Such generalist models can perform reasonably well on tasks spanning a wide variety of fields, which has paved the way for their widespread adoption by the public. However, this versatility no doubt comes at a cost.
Training and operating large-scale models consume enormous amounts of energy and time, which runs counter to sustainability goals and limits the hardware they can be deployed on. Moreover, in many practical applications, people want AI models to fulfill specific roles rather than be jacks of all trades. In such cases, a model's generalist capabilities may be useless or even counter-productive, reducing accuracy on the task at hand. Additionally, as highlighted in the study, retaining unnecessary classes in such models can create operational risks, such as inadvertent information leakage. Could there be a way to leverage large-scale pre-trained models more efficiently by having them 'forget' unnecessary information?
In a recent paper that will be presented at the Conference on Neural Information Processing Systems (NeurIPS 2024), a research team led by Associate Professor Go Irie from Tokyo University of Science (TUS), Japan, sought to tackle this problem. They developed a methodology dubbed "black-box forgetting," by which one can iteratively optimize the text prompts presented to a black-box vision-language classifier to have it selectively 'forget' some of the classes it can recognize. Co-authors of this study included Mr. Yusuke Kuwana and Mr. Yuta Goto, both from TUS, and Dr. Takashi Shibata from NEC Corporation.
"In practical applications, the classification of all kinds of object classes is rarely required. For example, in an autonomous driving system, it would be sufficient to recognize limited classes of objects such as cars, pedestrians, and traffic signs. We would not need to recognize food, furniture, or animal species," explains Dr. Irie. He adds that retaining irrelevant classes can degrade the overall performance and efficiency of models, especially in resource-constrained systems. "Retaining the classes that do not need to be recognized may decrease overall classification accuracy, as well as cause operational disadvantages such as the waste of computational resources and the risk of information leakage."
Although some methods for selective forgetting in pre-trained models do exist, they assume a white-box setting, where the user has access to the model's internal parameters and architecture. More often than not, however, users deal with black boxes: for commercial or ethical reasons, they have no access to the model itself or most of its internals. The researchers therefore had to employ a so-called derivative-free optimization strategy, one that does not require access to the model's gradients.
To this end, they extended a method known as CMA-ES (Covariance Matrix Adaptation Evolution Strategy), with the image classifier CLIP as the target model for this study. This evolutionary algorithm samples various candidate prompts, feeds them to the model, evaluates the results via predefined objective functions, and updates a multivariate normal distribution based on the calculated values.
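As a rough illustration of such a loop, the sketch below uses the open-source `cma` Python package (`pip install cma`). The objective here is a dummy stand-in: in the actual setting, it would decode the candidate latents into a prompt, query the black-box model, and return the combined forgetting/memorizing loss. The dimensionality is arbitrary.

```python
# Derivative-free optimization loop with CMA-ES (sketch).
import numpy as np
import cma

DIM = 32  # dimensionality of the latent contexts being optimized (arbitrary)

def black_box_objective(latent):
    """Stand-in objective: in practice, decode `latent` into prompt token
    embeddings, query the black-box model, and combine the entropy
    (forgetting) and cross-entropy (memorizing) losses."""
    return float(np.sum(latent ** 2))  # dummy quadratic for demonstration

es = cma.CMAEvolutionStrategy(np.zeros(DIM), 0.5)  # initial mean and step size
while not es.stop():
    candidates = es.ask()                 # sample candidate latents
    losses = [black_box_objective(c) for c in candidates]
    es.tell(candidates, losses)           # update the multivariate normal
                                          # search distribution
best_latent = es.result.xbest             # best prompt latents found
```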
However, the performance of derivative-free optimization techniques deteriorates quickly as the scale of the problem grows. As more classes need to be forgotten, the 'latent context' used to optimize the input prompts grows to unmanageable sizes. To address this issue, the research team devised a new parametrization technique called 'latent context sharing.' The idea is to decompose the latent context into smaller elements, each considered either 'unique' to a prompt token or 'shared' between multiple tokens. Optimizing these small, partly shared elements instead of the full context keeps the search space tractable, so even high-dimensional optimization problems can be tackled effectively without significant loss in performance.
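To give a flavor of what such a parametrization might look like, here is a hypothetical NumPy sketch: each prompt token's context is assembled from a small token-unique latent plus latents shared across all tokens, and only these low-dimensional parts would be handed to the optimizer. All dimensions, names, and the fixed random projection are assumptions for illustration, not the paper's exact design.

```python
# Hypothetical sketch of 'latent context sharing' (illustrative dimensions).
import numpy as np

N_TOKENS = 8     # prompt tokens
UNIQUE_DIM = 4   # latent dims unique to each token
N_SHARED = 2     # number of shared latent blocks
SHARED_DIM = 4   # dims per shared block
TOKEN_DIM = 512  # token embedding size expected by the text encoder

rng = np.random.default_rng(0)
# Fixed random projection lifting the low-dimensional context to TOKEN_DIM.
project = rng.standard_normal((UNIQUE_DIM + N_SHARED * SHARED_DIM, TOKEN_DIM))

def latents_to_prompt(flat):
    """Unpack the flat search vector into token-unique and shared parts,
    then build the full prompt token embeddings from both."""
    unique = flat[: N_TOKENS * UNIQUE_DIM].reshape(N_TOKENS, UNIQUE_DIM)
    shared = flat[N_TOKENS * UNIQUE_DIM:].reshape(N_SHARED, SHARED_DIM)
    rows = [np.concatenate([u, shared.ravel()]) @ project for u in unique]
    return np.stack(rows)  # shape: (N_TOKENS, TOKEN_DIM)

# Only 8*4 + 2*4 = 40 parameters are optimized, versus 8*512 = 4096 if the
# token embeddings were optimized directly.
n_params = N_TOKENS * UNIQUE_DIM + N_SHARED * SHARED_DIM
prompt_embeddings = latents_to_prompt(rng.standard_normal(n_params))
```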
The researchers validated their approach on several benchmark image classification datasets, including CIFAR-10, CIFAR-100, and ImageNet30, trying to get CLIP to 'forget' 40% of the classes in a given dataset. This marks the first study aimed at having a pre-trained vision-language model fail to recognize specific classes under black-box conditions. Measured against reasonable baselines, the results were very promising: notably, the method struck a robust balance between forgetting the targeted classes and retaining accuracy on the remaining ones, something previous approaches struggled to achieve.
This innovative method has important implications in artificial intelligence and machine learning. It could help large-scale models perform better in specialized tasks, extending their already astounding applicability. Another use, for example, would be to prevent image-generation models from producing undesirable content by having them forget specific visual contexts.
In addition, the proposed method could help tackle privacy issues, a rising concern in the field. "If a service provider is asked to remove certain information from a model, this can be accomplished by retraining the model from scratch after removing the problematic samples from the training data. However, retraining a large-scale model consumes enormous amounts of energy," says Dr. Irie. "Selective forgetting, or so-called machine unlearning, may provide an efficient solution to this problem." The black-box forgetting approach thus offers a scalable, energy-efficient alternative that aligns with growing global demands for more sustainable AI. In other words, it could help develop solutions for protecting the so-called "Right to be Forgotten," a particularly sensitive topic in healthcare and finance.
While this study represents a significant step forward, the researchers acknowledge that further improvements are necessary, especially for scenarios where even partial access to context embeddings is unavailable. These advancements could enable broader adoption in real-world applications.
This groundbreaking approach empowers large-scale AI models and safeguards end users, paving the way for seamless integration of AI into our daily lives!
Tokyo University of Science
Journal reference:
Kuwana, Y., Goto, Y., Shibata, T., & Irie, G. (2024). Black-Box Forgetting. Advances in Neural Information Processing Systems (NeurIPS 2024).