Open
Description
The core argument of your article is that, after feature distillation, the pre-trained model exhibits properties similar to those of MIM models. But where do these properties come from?
For instance, MIM models have locality because of the masked modeling mechanism. In feature distillation, by contrast, you use the same augmented view for the teacher and the student, so it is unclear where these properties come from.
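To make the question concrete, here is a minimal sketch of the contrast as I understand it. The function names (`make_mim_inputs`, `make_distill_inputs`), the mask token, and the toy patch values are hypothetical and not taken from your code. The sketch only shows what each setup feeds to the student; the encoders and the actual loss are left out.

```python
import random


def make_mim_inputs(patches, mask_ratio, rng, mask_token=0.0):
    """MIM-style setup: hide a random subset of patches from the student."""
    num_masked = int(len(patches) * mask_ratio)
    masked_idx = set(rng.sample(range(len(patches)), num_masked))
    student_in = [mask_token if i in masked_idx else p
                  for i, p in enumerate(patches)]
    # The loss is typically computed on the masked positions only, so the
    # student has to infer hidden patches from the visible ones around them.
    return student_in, list(patches), sorted(masked_idx)


def make_distill_inputs(patches):
    """Feature-distillation-style setup (assumed): same view for both models."""
    student_in = list(patches)
    teacher_in = list(patches)
    # Nothing is hidden, so every position can contribute to the loss.
    return student_in, teacher_in, list(range(len(patches)))


rng = random.Random(0)
patches = [0.1 * (i + 1) for i in range(8)]  # toy stand-in for patch values

mim_student, mim_target, mim_loss_idx = make_mim_inputs(patches, 0.5, rng)
fd_student, fd_teacher, fd_loss_idx = make_distill_inputs(patches)

print("MIM student differs from target:", mim_student != mim_target)
print("FD student equals teacher input:", fd_student == fd_teacher)
```

In the first setup the student and the target see different information, which is the usual explanation for locality. In the second, the inputs are identical, and that is why the source of the MIM-like properties is not obvious to me.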