Saving Tokenizer using pretrained format #3

Open
widyaputeriaulia10 opened this issue Jul 27, 2023 · 0 comments

Hi! I want to rebuild the model for question answering, but when I try to save the tokenizer with .save_pretrained() using this code:
tokenizer.save_pretrained('./model')

I get the following error:
NotImplementedError Traceback (most recent call last)
Cell In[4], line 1
----> 1 tokenizer.save_pretrained('./model')

File /usr/local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2174, in PreTrainedTokenizerBase.save_pretrained(self, save_directory, legacy_format, filename_prefix, push_to_hub, **kwargs)
2170 logger.info(f"Special tokens file saved in {special_tokens_map_file}")
2172 file_names = (tokenizer_config_file, special_tokens_map_file)
-> 2174 save_files = self._save_pretrained(
2175 save_directory=save_directory,
2176 file_names=file_names,
2177 legacy_format=legacy_format,
2178 filename_prefix=filename_prefix,
2179 )
2181 if push_to_hub:
2182 self._upload_modified_files(
2183 save_directory,
2184 repo_id,
(...)
2187 token=kwargs.get("use_auth_token"),
2188 )

File /usr/local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2222, in PreTrainedTokenizerBase._save_pretrained(self, save_directory, file_names, legacy_format, filename_prefix)
2219 f.write(out_str)
2220 logger.info(f"added tokens file saved in {added_tokens_file}")
-> 2222 vocab_files = self.save_vocabulary(save_directory, filename_prefix=filename_prefix)
2224 return file_names + vocab_files + (added_tokens_file,)

File /usr/local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2242, in PreTrainedTokenizerBase.save_vocabulary(self, save_directory, filename_prefix)
2226 def save_vocabulary(self, save_directory: str, filename_prefix: Optional[str] = None) -> Tuple[str]:
2227 """
2228 Save only the vocabulary of the tokenizer (vocabulary + added tokens).
2229
(...)
2240 Tuple(str): Paths to the files saved.
2241 """
-> 2242 raise NotImplementedError

NotImplementedError:

Do you have any suggestions for how to resolve this?
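
A note on reading the traceback: the last frame is PreTrainedTokenizerBase.save_vocabulary, which only raises NotImplementedError. That suggests the tokenizer object in use does not override save_vocabulary, so save_pretrained falls through to the base-class hook. The sketch below reproduces that pattern with plain Python stand-ins. It does not import or reproduce transformers code: BaseTokenizer, WordTokenizer, demo, and the vocab.json filename are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile
from typing import Dict, Optional, Tuple


class BaseTokenizer:
    """Hypothetical stand-in for a tokenizer base class (not transformers code)."""

    def save_pretrained(self, save_directory: str) -> Tuple[str, ...]:
        os.makedirs(save_directory, exist_ok=True)
        # The base class delegates the vocabulary write to a hook method.
        return self.save_vocabulary(save_directory)

    def save_vocabulary(
        self, save_directory: str, filename_prefix: Optional[str] = None
    ) -> Tuple[str, ...]:
        # Same shape as the last traceback frame: the base hook is abstract.
        raise NotImplementedError


class WordTokenizer(BaseTokenizer):
    """Hypothetical subclass that overrides the hook, so saving works."""

    def __init__(self, vocab: Dict[str, int]) -> None:
        self.vocab = dict(vocab)

    def save_vocabulary(
        self, save_directory: str, filename_prefix: Optional[str] = None
    ) -> Tuple[str, ...]:
        prefix = f"{filename_prefix}-" if filename_prefix else ""
        path = os.path.join(save_directory, prefix + "vocab.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.vocab, f)
        return (path,)


def demo() -> Tuple[bool, str]:
    """Return whether the base class failed, and the file name the subclass wrote."""
    with tempfile.TemporaryDirectory() as tmp:
        try:
            BaseTokenizer().save_pretrained(tmp)
            base_failed = False
        except NotImplementedError:
            base_failed = True
        (path,) = WordTokenizer({"hello": 0, "world": 1}).save_pretrained(tmp)
        return base_failed, os.path.basename(path)
```

If this reading is right, printing type(tokenizer) and type(tokenizer).__mro__ for the object in the failing cell would show which class it actually is and whether any class in its hierarchy defines save_vocabulary.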
