[BUG] AttributeError: 'RobertaTokenizer' object has no attribute 'max_len' #309

FirstGalacticEmpire · 2022-07-11T11:56:35Z

import torch
from box import Box

# Training configuration for language-model fine-tuning
args = Box({
    "seed": 42,
    "task_name": "Medical_language_modelling",
    "model_name": "roberta-base",
    "model_type": "roberta",
    "train_batch_size": 16,
    "learning_rate": 4e-5,
    "num_train_epochs": 20,
    "fp16": True,
    "fp16_opt_level": "O2",
    "warmup_steps": 1000,
    "logging_steps": 0,
    "max_seq_length": 512,
    # Use multi-GPU mode only when more than one CUDA device is visible
    "multi_gpu": torch.cuda.device_count() > 1,
})

databunch_lm = BertLMDataBunch.from_raw_corpus(
    data_dir=Path("./raw_text/"),
    text_list=list_of_files,
    tokenizer=args.model_name,
    batch_size_per_gpu=args.train_batch_size,
    max_seq_length=args.max_seq_length,
    multi_gpu=args.multi_gpu,
    model_type=args.model_type,
    logger=logger)

Running the BertLMDataBunch.from_raw_corpus call above raises the following error:
"AttributeError: 'RobertaTokenizer' object has no attribute 'max_len'"
I suspect this is caused by a library update that removed the max_len attribute from RobertaTokenizer.
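
A possible workaround, sketched under assumptions: recent tokenizers in the transformers library expose the maximum length as model_max_length, while older releases used max_len. If the failing code reads tokenizer.max_len directly (an assumption, since no traceback is included here), a small compatibility helper could look up whichever attribute exists. The helper name tokenizer_max_length and the two stand-in classes are hypothetical and only illustrate the lookup; they are not part of fast-bert or transformers.

```python
def tokenizer_max_length(tokenizer, default=512):
    """Return the tokenizer's maximum sequence length across library versions.

    Checks the newer attribute name first, then the older one, and falls
    back to `default` if neither is present.
    """
    for name in ("model_max_length", "max_len"):
        value = getattr(tokenizer, name, None)
        if value is not None:
            return value
    return default


# Hypothetical stand-ins for a newer and an older tokenizer object
class NewStyleTokenizer:
    model_max_length = 512


class OldStyleTokenizer:
    max_len = 256


print(tokenizer_max_length(NewStyleTokenizer()))
print(tokenizer_max_length(OldStyleTokenizer()))
```

With a real tokenizer, the same helper would be called as tokenizer_max_length(tokenizer) wherever tokenizer.max_len was read before. Pinning transformers to a release that still provides max_len is another option, though which release that is would need to be checked against the installed fast-bert version.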
