I've been attempting to fine-tune a GPT-2 base model using Adapter from OpenDelta. While training the model, I came across this error: `element 0 of tensors does not require grad and does not have a grad_fn`. Upon investigating the source of the error, I discovered that it occurs after calling `model.train()`. Any suggestions on how to resolve this?
Code:

    model = GPT2LMHeadModel.from_pretrained('gpt2', device_map=device)
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    tokenizer.add_tokens(['<p>'])
    model.resize_token_embeddings(len(tokenizer))  # Resizing the embedding layer
    model.gradient_checkpointing_enable()

    delta_model = AdapterModel(model, bottleneck_dim=32)
    delta_model.freeze_module(exclude=["deltas"])
    delta_model.log()

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    optimizer.zero_grad()
    model.train()  # Causing the error

    text = "Random str"
    input_ids = tokenizer(text, return_tensors='pt')
    out = model(input_ids['input_ids'].to(device),
                attention_mask=input_ids['attention_mask'].to(device),
                labels=input_ids['input_ids'].to(device))
    out.loss.backward()
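One possible explanation, offered as a sketch rather than a confirmed diagnosis: gradient checkpointing is typically only active in training mode, and PyTorch's reentrant checkpoint returns an output that does not require grad when none of its tensor inputs do. If the embeddings are frozen, the hidden states entering each checkpointed block carry no grad, so the loss ends up without a `grad_fn` even though the adapter weights are trainable. The toy model below reproduces that pattern in plain PyTorch. `TinyBlock`, `TinyModel`, `make_model`, `try_backward`, and the `inputs_require_grad` flag are hypothetical names made up for this illustration and are not part of OpenDelta or transformers.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class TinyBlock(nn.Module):
    """Stand-in for a transformer block with a frozen layer plus an adapter."""

    def __init__(self, dim):
        super().__init__()
        self.frozen = nn.Linear(dim, dim)
        self.adapter = nn.Linear(dim, dim)  # the only trainable part

    def forward(self, x):
        return self.frozen(x) + self.adapter(x)


class TinyModel(nn.Module):
    """Frozen embedding followed by a block that is checkpointed in train mode."""

    def __init__(self, vocab=10, dim=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.block = TinyBlock(dim)

    def forward(self, ids, inputs_require_grad=False):
        h = self.embed(ids)
        if inputs_require_grad:
            # Mark the hidden states as requiring grad so the reentrant
            # checkpoint has a tensor input to attach the graph to.
            h.requires_grad_(True)
        if self.training:
            # Checkpointing is applied only in training mode (assumption
            # mirroring the usual behaviour of checkpointed models).
            h = checkpoint(self.block, h, use_reentrant=True)
        else:
            h = self.block(h)
        return h.sum()


def make_model():
    """Build the toy model and freeze everything except the adapter."""
    model = TinyModel()
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name
    return model


def try_backward(model, ids, **kwargs):
    """Return True if forward + backward succeeds, False on RuntimeError."""
    model.zero_grad()
    try:
        model(ids, **kwargs).backward()
        return True
    except RuntimeError:
        return False


if __name__ == "__main__":
    ids = torch.tensor([[1, 2, 3]])
    model = make_model()

    model.eval()   # no checkpointing, backward should work
    print("eval mode:", try_backward(model, ids))

    model.train()  # checkpointing on, frozen inputs: backward should fail
    print("train mode:", try_backward(model, ids))

    # Same train mode, but hidden states are marked as requiring grad.
    print("train mode + input grads:",
          try_backward(model, ids, inputs_require_grad=True))
```

If this is indeed what is happening in the snippet above, one thing to try is calling `model.enable_input_require_grads()` before training, assuming the installed transformers version provides that method; it makes the embedding output require grad in much the same way as the `inputs_require_grad` flag in the toy model. Dropping `model.gradient_checkpointing_enable()` temporarily is another quick way to check whether checkpointing is involved.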