SparseSpacyFeaturizer #29
Comments
It's probably best to wait until spaCy 3.0 before adding this one.
We might also just start with
If you look at all the attributes that spaCy generates for its tokens, you can imagine that some of them would be useful as features in machine learning pipelines. To name a few:
- `is_oov`: is the token out of vocabulary (i.e. does it lack a vector)?
- `is_stop`: is the token a stopword?
- `lemma_`: what is the lemma of the token?
- `pos`/`tag`: coarse/fine-grained part-of-speech information

These can all be given a discrete representation and could be added to a Rasa pipeline in general.
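To make the "discrete representation" idea concrete, here is a minimal sketch of how those attributes could be turned into sparse one-hot rows. It is only an illustration, not the proposed component: the helper names (`token_features`, `fit_vocabulary`, `transform`) are hypothetical, `pos_`/`tag_` are assumed to be the string forms of `pos`/`tag`, and the tokens are plain stand-in objects so the snippet runs without spaCy or Rasa installed. Any object exposing the same attributes, such as a spaCy `Token`, should work the same way.

```python
from types import SimpleNamespace

# Attribute names as found on spaCy tokens; pos_/tag_ are assumed here to be
# the string variants of the pos/tag attributes listed above.
ATTRIBUTES = ("is_oov", "is_stop", "lemma_", "pos_", "tag_")


def token_features(token):
    """Return the set of active 'attribute=value' feature names for one token."""
    return {f"{name}={getattr(token, name)}" for name in ATTRIBUTES}


def fit_vocabulary(tokens):
    """Assign a column index to every feature name seen in the tokens."""
    names = sorted(set().union(*(token_features(t) for t in tokens)))
    return {name: index for index, name in enumerate(names)}


def transform(tokens, vocabulary):
    """Return one sorted list of active column indices per token (a sparse row)."""
    return [
        sorted(vocabulary[f] for f in token_features(t) if f in vocabulary)
        for t in tokens
    ]


# Stand-in tokens (hypothetical values); a spaCy Doc could be passed instead.
tokens = [
    SimpleNamespace(is_oov=False, is_stop=True, lemma_="the", pos_="DET", tag_="DT"),
    SimpleNamespace(is_oov=False, is_stop=False, lemma_="cat", pos_="NOUN", tag_="NN"),
]
vocabulary = fit_vocabulary(tokens)
rows = transform(tokens, vocabulary)
```

Each row holds only the indices of the active columns, so the result can be loaded into a sparse matrix format later. Features unseen during fitting are skipped at transform time rather than raising an error, which is one possible design choice among several.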