parameter batch_size vs max_length vs batcher.size #8600

For OOM errors, the main settings to adjust are nlp.batch_size and training.batcher.size.

nlp.batch_size sets the default batch size during the evaluation steps (and also the default batch size when using the pipeline later with nlp.pipe). Higher values are faster, but you can run out of memory, usually much sooner on GPU. The right setting always depends on how much memory you have and on the document lengths.
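To make the effect of the setting concrete, here is a minimal pure-Python sketch of what a fixed batch size means for nlp.pipe: texts are grouped into chunks of batch_size and processed chunk by chunk. This is an illustration of the grouping behavior, not spaCy's actual implementation.

```python
def minibatch(items, size):
    """Yield successive batches of at most `size` items.
    Illustrative sketch of how a fixed batch_size groups texts;
    spaCy's own batching utilities are more elaborate."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch

texts = [f"document {i}" for i in range(10)]
batches = list(minibatch(texts, size=4))
# 10 texts with size=4 -> batches of 4, 4, and 2 texts
```

A larger `size` means fewer, bigger batches, which is faster per document but needs more memory at once, which is why OOM errors on GPU are usually fixed by lowering it.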

training.batcher controls the batch size during the training steps. For the default batcher you'd lower size; for other batchers, look up the relevant settings for that function:

https://spacy.io/api/top-level#batchers

If it only shows up rarely, you can ign…
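For reference, a training config might contain a batcher section along these lines (a sketch based on spaCy's default batch_by_words batcher; the exact values and schedule are illustrative, so check your own config's batcher function before editing):

```ini
[nlp]
# Default batch size used for evaluation and nlp.pipe
batch_size = 1000

[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2

[training.batcher.size]
# Lowering these values reduces memory use during training
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
```

With batch_by_words, size is measured in words rather than documents, so lowering start and stop shrinks the training batches directly.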

Answer selected by svlandeg