RuntimeError: Class values must be smaller than num_classes. | protein_mpnn_utils.py & mask_size issue? #99

Open
emilyrkang opened this issue Apr 4, 2024 · 0 comments


I get the following RuntimeError when running ProteinMPNN on a Windows machine with Python 3.7. My approach works with the example 6 inputs, but with my own protein structure, 4rjj, it fails with: RuntimeError: Class values must be smaller than num_classes. I've tried using the biological assembly downloaded directly from the PDB and removing the ligands and all non-"ATOM" lines from the structure, but I still get this error.

The following command works, but I would like to model the protein as a homooligomer and fix some residues.

py protein_mpnn_run.py --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --pdb_path 4rjj.pdb --pdb_path_chains "A B C D" --out_folder "C:\ProteinMPNN\myoutputs\4rjj" --num_seq_per_target 20 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

Also, when I use the helper scripts, I need to remove the path ("C:\ProteinMPNN\my_input_PDBS\" in the text below) from the jsonl files or I get the following error message: OSError: [Errno 22] Invalid argument: 'my_outputs_directory//seqs/C:\ProteinMPNN\my_input_PDBS\4rjj.fa'
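The manual path removal described above could be scripted. The following is a minimal sketch, assuming each line of parsed_pdbs.jsonl is a JSON object with a "name" key that holds the structure name (an assumption about the file layout, not something confirmed in this report). The function name strip_name_directories is hypothetical. It uses ntpath so that Windows-style backslash paths are split correctly on any operating system.

```python
import json
import ntpath


def strip_name_directories(jsonl_text):
    """Return JSONL text where each record's "name" has its directory removed.

    Assumes every non-empty line is a JSON object; records without a
    "name" key are passed through unchanged.
    """
    out_lines = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if "name" in record:
            # ntpath.basename treats both "\\" and "/" as separators.
            record["name"] = ntpath.basename(record["name"])
        out_lines.append(json.dumps(record))
    return "\n".join(out_lines)


# Toy record for illustration only (the sequence is made up).
example = json.dumps(
    {"name": r"C:\ProteinMPNN\my_input_PDBS\4rjj", "seq_chain_A": "MKT"}
)
cleaned = strip_name_directories(example)
print(json.loads(cleaned)["name"])
```

If the other jsonl files (for example tied_pdbs.jsonl) are keyed by the same name, their keys would presumably need the same treatment so that the names still match.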

Here is my complete output:

C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\outputs\example_6_outputs\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\outputs\example_6_outputs\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 6EHB
12 sequences of length 960 generated in 75.8997 seconds
Generating sequences for: 4GYT
12 sequences of length 354 generated in 52.1981 seconds

C:\ProteinMPNN>py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 4rjj
Traceback (most recent call last):
  File "protein_mpnn_run.py", line 469, in <module>
    main(args)
  File "protein_mpnn_run.py", line 331, in main
    sample_dict = model.tied_sample(X, randn_2, S, chain_M, chain_encoding_all, residue_idx, mask=mask, temperature=temp, omit_AAs_np=omit_AAs_np, bias_AAs_np=bias_AAs_np, chain_M_pos=chain_M_pos, omit_AA_mask=omit_AA_mask, pssm_coef=pssm_coef, pssm_bias=pssm_bias, pssm_multi=args.pssm_multi, pssm_log_odds_flag=bool(args.pssm_log_odds_flag), pssm_log_odds_mask=pssm_log_odds_mask, pssm_bias_flag=bool(args.pssm_bias_flag), tied_pos=tied_pos_list_of_lists_list[0], tied_beta=tied_beta, bias_by_res=bias_by_res_all)
  File "C:\ProteinMPNN\protein_mpnn_utils.py", line 1218, in tied_sample
    permutation_matrix_reverse = torch.nn.functional.one_hot(decoding_order, num_classes=mask_size).float()
RuntimeError: Class values must be smaller than num_classes.

C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: C:\ProteinMPNN\my_input_PDBS\4rjj
Traceback (most recent call last):
  File "protein_mpnn_run.py", line 469, in <module>
    main(args)
  File "protein_mpnn_run.py", line 323, in main
    with open(ali_file, 'w') as f:
OSError: [Errno 22] Invalid argument: 'my_outputs_directory//seqs/C:\ProteinMPNN\my_input_PDBS\4rjj.fa'

C:\ProteinMPNN> py protein_mpnn_run.py --jsonl_path "C:\ProteinMPNN\myparsedfilesetc\parsed_pdbs.jsonl" --tied_positions_jsonl "C:\ProteinMPNN\myparsedfilesetc\tied_pdbs.jsonl" --path_to_model_weights "C:\ProteinMPNN\vanilla_model_weights" --out_folder "my_outputs_directory" --num_seq_per_target 4 --sampling_temp "0.1 0.2 0.3" --batch_size 1 --omit_AAs='XC'

chain_id_jsonl is NOT loaded

fixed_positions_jsonl is NOT loaded

pssm_jsonl is NOT loaded

omit_AA_jsonl is NOT loaded

bias_AA_jsonl is NOT loaded

bias by residue dictionary is not loaded, or not provided

discarded {'bad_chars': 0, 'too_long': 0, 'bad_seq_length': 0}

Number of edges: 48
Training noise level: 0.2A
Generating sequences for: 4rjj
Traceback (most recent call last):
  File "protein_mpnn_run.py", line 469, in <module>
    main(args)
  File "protein_mpnn_run.py", line 331, in main
    sample_dict = model.tied_sample(X, randn_2, S, chain_M, chain_encoding_all, residue_idx, mask=mask, temperature=temp, omit_AAs_np=omit_AAs_np, bias_AAs_np=bias_AAs_np, chain_M_pos=chain_M_pos, omit_AA_mask=omit_AA_mask, pssm_coef=pssm_coef, pssm_bias=pssm_bias, pssm_multi=args.pssm_multi, pssm_log_odds_flag=bool(args.pssm_log_odds_flag), pssm_log_odds_mask=pssm_log_odds_mask, pssm_bias_flag=bool(args.pssm_bias_flag), tied_pos=tied_pos_list_of_lists_list[0], tied_beta=tied_beta, bias_by_res=bias_by_res_all)
  File "C:\ProteinMPNN\protein_mpnn_utils.py", line 1218, in tied_sample
    permutation_matrix_reverse = torch.nn.functional.one_hot(decoding_order, num_classes=mask_size).float()
RuntimeError: Class values must be smaller than num_classes.
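
torch.nn.functional.one_hot raises this error when an index in its input is greater than or equal to num_classes, so here some entry of decoding_order is at least mask_size. One possible cause, not confirmed in this report, is a tied position that points past the end of a chain, for instance when the chains in the parsed file have different lengths. The sketch below checks for that. It assumes the parsed record stores each chain sequence under a "seq_chain_<ID>" key and that the tied positions are a list of dicts mapping chain ID to 1-indexed positions. Both layouts are assumptions, and find_out_of_range_ties is a hypothetical helper, not part of ProteinMPNN.

```python
def find_out_of_range_ties(parsed_record, tied_groups):
    """Return (chain, position, chain_length) for tied positions outside the chain.

    parsed_record: dict with "seq_chain_<ID>" keys (assumed layout).
    tied_groups: list of dicts mapping chain ID -> list of 1-indexed positions
                 (assumed layout).
    """
    problems = []
    for group in tied_groups:
        for chain, positions in group.items():
            # A missing chain is treated as length 0, so every position is flagged.
            length = len(parsed_record.get("seq_chain_" + chain, ""))
            for pos in positions:
                if pos < 1 or pos > length:
                    problems.append((chain, pos, length))
    return problems


# Toy data for illustration only (not taken from 4rjj):
# chain B is shorter than chain A, but positions 1..5 are tied on both.
record = {"name": "toy", "seq_chain_A": "MKTAY", "seq_chain_B": "MKT"}
ties = [{"A": [i], "B": [i]} for i in range(1, 6)]
print(find_out_of_range_ties(record, ties))
# prints [('B', 4, 3), ('B', 5, 3)]
```

An empty result would rule out this particular cause for the files being checked; it would not rule out other sources of an out-of-range decoding_order.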
