Coreferee v1.2.0
Removed the dependencies on TensorFlow and Keras, switching to Thinc as the neural network platform. With Thinc, serialized models are around 30% of the size of the old models, and the old limitation that nlp.pipe() could not be called with n_process > 1 with forked processes has been removed.
Implemented a softmax layer to select the best potential referent for each anaphor as opposed to calculating independent scores for each pair.
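
The difference between scoring pairs independently and normalizing over all candidates can be illustrated with a small sketch. This is not Coreferee's actual Thinc layer: the function names `softmax` and `pick_referent`, the candidate strings, and the raw scores are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_referent(candidates, scores):
    """Return the candidate with the highest softmax probability.

    Hypothetical helper: the scores compete against each other, so the
    probabilities over all candidates for one anaphor sum to 1.
    """
    probs = softmax(scores)
    best = max(range(len(candidates)), key=probs.__getitem__)
    return candidates[best], probs


# Made-up candidates and raw scores for a single anaphor such as "he".
candidates = ["Peter", "the dog", "the garden"]
raw_scores = [2.1, 0.3, -1.0]
referent, probs = pick_referent(candidates, raw_scores)
```

With independent pair scores, several candidates could each be rated as likely without reference to one another; with a softmax, raising one candidate's probability necessarily lowers the others.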
Added matrix tests to support a variety of Python and spaCy versions, including spaCy 3.2 and spaCy 3.3.
Implemented a stable-random split into train and test corpora as opposed to using the last 20% of loaded documents as the test corpus.
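
One common way to obtain a split that is random-looking but reproducible is to hash a stable identifier for each document. The sketch below assumes that "stable-random" means this kind of deterministic assignment; the release note does not specify the mechanism, and the names `is_test_document` and `split_corpus` as well as the document keys are hypothetical.

```python
import hashlib


def is_test_document(doc_key, test_fraction=0.2):
    # Hash the key so the assignment depends only on the document itself,
    # not on load order or on a random seed.
    digest = hashlib.sha256(doc_key.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < test_fraction


def split_corpus(doc_keys, test_fraction=0.2):
    """Split keys into (train, test) lists, preserving input order."""
    train, test = [], []
    for key in doc_keys:
        if is_test_document(key, test_fraction):
            test.append(key)
        else:
            train.append(key)
    return train, test


# Hypothetical document keys; real keys might be file names or text hashes.
doc_keys = [f"document-{i}" for i in range(1000)]
train_keys, test_keys = split_corpus(doc_keys)
```

Unlike taking the last 20% of loaded documents, a split of this kind does not change when documents are loaded in a different order, and adding new documents does not move existing ones between the two corpora.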
Improved the training script so that it remembers the model state at each epoch and chooses the best-performing state from the training history as the model to save.
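
The idea of keeping a per-epoch history and saving the best state can be sketched as follows. This is a generic illustration rather than the Coreferee training script: `train_with_history`, the callback parameters, and the toy model are hypothetical.

```python
import copy


def train_with_history(model_state, train_one_epoch, evaluate, epochs):
    """Train for a fixed number of epochs and return the best snapshot.

    Returns (best_state, best_epoch, best_score); epochs are zero-based.
    """
    history = []
    for epoch in range(epochs):
        model_state = train_one_epoch(model_state)
        score = evaluate(model_state)
        # Store a deep copy so later epochs cannot alter the snapshot.
        history.append((score, epoch, copy.deepcopy(model_state)))
    # max() returns the earliest entry when several share the top score.
    best_score, best_epoch, best_state = max(history, key=lambda item: item[0])
    return best_state, best_epoch, best_score


# Toy stand-ins: the "model" is a single weight and the score peaks at w == 3,
# so training past that point makes the model worse.
def toy_epoch(state):
    return {"w": state["w"] + 1}


def toy_score(state):
    return -((state["w"] - 3) ** 2)


best_state, best_epoch, best_score = train_with_history(
    {"w": 0}, toy_epoch, toy_score, epochs=5
)
```

In the toy run the final epoch is not the best one, which is the situation where selecting from the training history matters.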
Added the coreferee check command to enable performance measurement for an existing Coreferee model with a new spaCy model.