Abstract: Machine unlearning has emerged as an important research area, driven by the need to comply with recent data regulations. Because complete model retraining is impractical for LLMs, previous work has focused on approximate methods that reduce privacy risks. We propose Pathway to Optimal Parameters, a novel unlearning method that improves on these methods by applying optimal gradient updates to the parameters. Our method provides the necessary privacy guarantees and shows little to no performance degradation on language model tasks after unlearning. Furthermore, we detail Soft Memorization Accuracy, a more stringent unlearning metric for language models, and validate its effectiveness through both qualitative and quantitative analysis.
Paper Type: short
Research Area: Ethics, Bias, and Fairness
Contribution Types: NLP engineering experiment
Languages Studied: English