Help finding the cause of different inference results from the original Python script #476

Open
mazksr opened this issue Dec 20, 2024 · 4 comments

@mazksr

mazksr commented Dec 20, 2024

I'm trying to rewrite the inference script for a fine-tuned BERT model I made in Python into Rust. Here's how I save my model:

indobert-finetuned
├── config.json
├── model.safetensors
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer_config.json
├── training_args.bin
└── vocab.txt

The Python inference script:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_path = "indobert-finetuned/"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device)

# Tokenize input
inputs = tokenizer("anjing", return_tensors='pt', padding=True, truncation=False)

# Move inputs to the same device as the model
inputs = {key: value.to(device) for key, value in inputs.items()}

# Perform inference
with torch.inference_mode():
    outputs = model(**inputs)

logits = outputs.logits
print(logits, model)

predicted_label = torch.softmax(logits, dim=-1).squeeze().cpu()

print(predicted_label.tolist())

Prints the following:

[
    0.0022081946954131126,
    0.9808889031410217,
    0.0035459366627037525,
    0.0007706195465289056,
    0.00043291927431710064,
    0.00036624292260967195,
    0.0012361736735329032,
    0.0010357820428907871,
    0.004077346995472908,
    0.004229736048728228,
    0.000952212605625391,
    0.00025596615159884095
  ]

Here's how I load the model for rust-bert:

use rust_bert::pipelines::common::{ModelResource, ModelType, TokenizerOption};
use rust_bert::pipelines::zero_shot_classification::{
  ZeroShotClassificationConfig, ZeroShotClassificationModel
};
use rust_bert::resources::LocalResource;

let model_res = LocalResource {
  local_path: "indobert-finetuned/model.safetensors".into()
};

let sequence_classification_config = ZeroShotClassificationConfig::new(
  ModelType::Bert, // model_type
  ModelResource::Torch(Box::new(model_res)), // model_resource
  LocalResource {
      local_path: "indobert-finetuned/config.json".into()
  }, // config_resource
  LocalResource {
      local_path: "indobert-finetuned/vocab.txt".into()
  }, // vocab_resource
  None, // merges_resource
  true, // lowercase
  None, // strip_accents
  None // add_prefix_space
);

// loading tokenizer.json & special_tokens_map.json
let tokenizer = TokenizerOption::from_hf_tokenizer_file(
  "indobert-finetuned/tokenizer.json",
  "indobert-finetuned/special_tokens_map.json"
).unwrap();

// zero shot classification pipeline
let sequence_classification_model = ZeroShotClassificationModel::new_with_tokenizer(
  sequence_classification_config,
  tokenizer
);

let input = [
  "anjing",
];

let labels = [...]; // string array with 12 elements

// Run model inference
let output = sequence_classification_model.unwrap().predict_multilabel(
  &input,
  &labels,
  None,
  256
);

if let Ok(out) = output {
  let scores: Vec<_> = out[0].iter()
      .map(|label| label.score)
      .collect();

  println!("{:?}", scores);
}

Outputs:

[
0.6780490875244141, 
0.7056891918182373, 
0.6685047745704651, 
0.563891589641571, 
0.6616213917732239, 
0.6930138468742371, 
0.6496127247810364, 
0.6565861701965332, 
0.6566416621208191, 
0.7492241859436035, 
0.68653404712677, 
0.7110175490379333
]

Any help would be appreciated :).

@mazksr
Author

mazksr commented Dec 21, 2024

Additional info: I've checked the tokenized inputs from both scripts and they're the same, so I'd probably have to check whether the weights are loaded correctly. Any idea how I could do that?
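
For reference, one way to compare the loaded weights might be to read the safetensors file directly on the Rust side and print a few per-tensor statistics, then compute the same numbers in Python from safetensors.torch.load_file. A rough sketch, assuming a recent tch (the backend rust-bert uses) that exposes Tensor::read_safetensors:

use tch::{Kind, Tensor};

// Read the raw weight file and print name, shape and sum for the first few tensors;
// the same values can be computed in Python from
// safetensors.torch.load_file("indobert-finetuned/model.safetensors") and compared.
let tensors = Tensor::read_safetensors("indobert-finetuned/model.safetensors").unwrap();
for (name, tensor) in tensors.iter().take(10) {
  println!(
      "{name}: shape {:?}, sum {:.6}",
      tensor.size(),
      tensor.sum(Kind::Float).double_value(&[])
  );
}

If the names, shapes, and sums match on both sides, the weights themselves are loaded correctly and the difference is more likely in how the pipeline uses them.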

@guillaume-be
Owner

It seems you apply a softmax operation to the model output in the Python script, but use multilabel for the Rust version, which just passes the logits through a sigmoid layer
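
For illustration, a minimal sketch of the two entry points, reusing the sequence_classification_model, input, and labels names from the snippet above (not a definitive fix, just to show where the normalisation differs):

// `predict` normalises the candidate-label scores with a softmax across labels
// (the closer analogue of `torch.softmax(logits, dim=-1)` in the Python script)
// and returns the best label per input.
let model = sequence_classification_model.unwrap();
let single_label_output = model.predict(&input, &labels, None, 256);

// `predict_multilabel` scores every candidate label independently and passes each
// logit through a sigmoid, so the scores do not sum to 1.
let multilabel_output = model.predict_multilabel(&input, &labels, None, 256);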

@mazksr
Author

mazksr commented Dec 21, 2024

It seems you apply a softmax operation to the model output in the Python script, but use multilabel for the Rust version, which just passes the logits through a sigmoid layer

Oh yeah, it was sigmoid originally. I changed it because I saw softmax being used in zero_shot_classification.rs, the pipeline's source code.

Here's the output when passing the logits through sigmoid:

predicted_label = torch.sigmoid(logits).squeeze().cpu()

[
    0.04973661154508591,
    0.9587621688842773,
    0.07753138244152069,
    0.01793798990547657,
    0.01015706080943346,
    0.00860617682337761,
    0.028466375544667244,
    0.023962369188666344,
    0.08812660723924637,
    0.09112019836902618,
    0.022071700543165207,
    0.006030460353940725
  ]

As you can see, it's basically the same.
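
Just to make that concrete: softmax and sigmoid are both monotonic, so whichever one is applied to the same logits, the same index dominates. A toy tch sketch with made-up logits shaped like the Python output above (assuming a recent tch where Tensor::from_slice is available):

use tch::{Kind, Tensor};

// Made-up logits: one large value at index 1, the rest small.
let logits = Tensor::from_slice(&[-3.0f32, 4.0, -2.5, -3.5, -4.0, -4.2, -3.2, -3.4, -2.3, -2.3, -3.6, -4.5]);

// Softmax: scores sum to 1, index 1 dominates.
logits.softmax(-1, Kind::Float).print();
// Sigmoid: scores are independent, index 1 still dominates.
logits.sigmoid().print();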

@mazksr
Author

mazksr commented Dec 21, 2024

In case anyone wants to reproduce: https://huggingface.co/mazksr/safespeak-bert
