[ICLR 2025] Code and data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
keven980716/weak-to-strong-deception