Intended usage of num_special_tokens? #14

Open
LLYX opened this issue Aug 10, 2022 · 2 comments
Comments

LLYX commented Aug 10, 2022

From what I understand, these are supposed to be reserved for out-of-vocabulary (OOV) values. Is the intended usage to set OOV values in the input to some negative number and overwrite the offset? That seems to be what it would take to achieve the desired outcome, but it also seems confusing and clunky. Or am I misunderstanding its purpose? Thanks!

@lucidrains
Owner

@LLYX it was originally designed for self-supervised learning with ELECTRA, for adding a masking token, etc.

@lucidrains
Owner

@LLYX i should probably just remove it though, since that line of research never took off
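
For readers following the thread, here is a minimal sketch of the reserved-ID idea being discussed. It is not the library's code: the helper names `column_offsets` and `to_shared_ids`, the column cardinalities, and the choice of ID 0 as the mask token are all assumptions made for illustration. It assumes every categorical column shares one embedding table, with the first `num_special_tokens` IDs kept free.

```python
from itertools import accumulate


def column_offsets(categories, num_special_tokens):
    """Start index of each column's block in one shared embedding table.

    IDs 0 .. num_special_tokens - 1 are left free for special tokens.
    (Hypothetical helper, not part of any library.)
    """
    return list(accumulate([num_special_tokens, *categories[:-1]]))


def to_shared_ids(row, offsets, mask_positions=(), mask_id=0):
    """Shift per-column category indices into the shared table, then
    overwrite the chosen positions with a reserved special-token ID."""
    ids = [value + offset for value, offset in zip(row, offsets)]
    for pos in mask_positions:
        ids[pos] = mask_id
    return ids


categories = [3, 5, 2]   # assumed number of unique values per column
num_special_tokens = 2   # assumed layout: ID 0 = mask, ID 1 = spare (e.g. OOV)

offsets = column_offsets(categories, num_special_tokens)
total_tokens = sum(categories) + num_special_tokens  # size of the shared table

print(offsets)                                                # [2, 5, 10]
print(to_shared_ids([1, 4, 0], offsets))                      # [3, 9, 10]
print(to_shared_ids([1, 4, 0], offsets, mask_positions=[1]))  # [3, 0, 10]
```

In this sketch the special token is written after the offsets are applied, which sidesteps the negative-number workaround described in the question. Whether the library exposes a hook for that step is not stated in the thread.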
