-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix FastText model reading of unsupported modes from Facebook's FastText #3222
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -807,6 +807,13 @@ def _load_fasttext_format(model_file, encoding='utf-8', full_model=True): | |||||
with utils.open(model_file, 'rb') as fin: | ||||||
m = gensim.models._fasttext_bin.load(fin, encoding=encoding, full_model=full_model) | ||||||
|
||||||
# Here we are checking for unsupported FB FT Modes | ||||||
if m.loss != 1 and m.loss != 2: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. According to #3223, negative sampling is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @piskvorky,
This suggests that I have made a list of what's supported and what's not based on #3223 . Please let me know if they are right, ill make changes as per that: Supported:
b. Model
Unsupported:
b. Model:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I don't really understand that list. If the check is I don't know the fasttext parameters, so don't ask me :) Can you please check the fasttext docs or code (not gensim) for what the parameters actually are? And then make changes here to match that. Thanks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Hi yes you are right. We shouldn't be checking
Hi, yes sure I can investigate that and ill create a PR accordingly. Also if you can provide the link for fasttext docs that would be great. (just to be sure that I'm referring the right docs). Thanks! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, please do. This PR seems incorrect so I'm removing the "Next release" tag – we need to fix it first, make the code correct. Fasttext code is at https://github.com/facebookresearch/fastText. Thanks! |
||||||
raise ValueError("Loss paramter value can be either 1 (for Hierarchical Softmax) or 2 (for Negative Sampling)") | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
elif m.model != 1 and m.model != 2: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
raise ValueError( | ||||||
"Model parameter value can be either 1 (for Continous Bag of Words model) or 2 (for Skip-gram model)") | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
model = FastText( | ||||||
vector_size=m.dim, | ||||||
window=m.ws, | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.