Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint at ./output and are newly initialized because the shapes did not match:
model.layers.0.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.0.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.0.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.0.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.0.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.0.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.0.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.1.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.1.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.1.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.1.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.1.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.1.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.1.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.10.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.10.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.10.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.10.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.10.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.10.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.10.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.11.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.11.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.11.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.11.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.11.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.11.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.11.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.12.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.12.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.12.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.12.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.12.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.12.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.12.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.13.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.13.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.13.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.13.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.13.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.13.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.13.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.14.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.14.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.14.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.14.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.14.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.14.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.14.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.15.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.15.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.15.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.15.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.15.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.15.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.15.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.16.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.16.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.16.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.16.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.16.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.16.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.16.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.17.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.17.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.17.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.17.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.17.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.17.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.17.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.18.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.18.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.18.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.18.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.18.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.18.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.18.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.19.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.19.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.19.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.19.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.19.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.19.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.19.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.2.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.2.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.2.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.2.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.2.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.2.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.2.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.20.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.20.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.20.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.20.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.20.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.20.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.20.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.21.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.21.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.21.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.21.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.21.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.21.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.21.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.22.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.22.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.22.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.22.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.22.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.22.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.22.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.23.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.23.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.23.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.23.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.23.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.23.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.23.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.24.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.24.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.24.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.24.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.24.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.24.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.24.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.25.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.25.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.25.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.25.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.25.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.25.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.25.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.26.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.26.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.26.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.26.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.26.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.26.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.26.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.27.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.27.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.27.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.27.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.27.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.27.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.27.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.3.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.3.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.3.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.3.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.3.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.3.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.3.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.4.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.4.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.4.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.4.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.4.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.4.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.4.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.5.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.5.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.5.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.5.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.5.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.5.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.5.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.6.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.6.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.6.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.6.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.6.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.6.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.6.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.7.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.7.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.7.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.7.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.7.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.7.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.7.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.8.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.8.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.8.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.8.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.8.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.8.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.8.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.9.self_attn.k_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.9.self_attn.k_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
model.layers.9.self_attn.o_proj.weight: found shape torch.Size([1792, 2048]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.9.self_attn.q_proj.bias: found shape torch.Size([2048]) in the checkpoint and torch.Size([1792]) in the model instantiated
model.layers.9.self_attn.q_proj.weight: found shape torch.Size([2048, 1792]) in the checkpoint and torch.Size([1792, 1792]) in the model instantiated
model.layers.9.self_attn.v_proj.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([448]) in the model instantiated
model.layers.9.self_attn.v_proj.weight: found shape torch.Size([512, 1792]) in the checkpoint and torch.Size([448, 1792]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
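One possible reading of the shapes above, sketched under assumptions: if the model class derives the per-head dimension as hidden_size // num_attention_heads, then a config with hidden_size=1792 and 16 attention heads gives head_dim=112 and reproduces the "model instantiated" shapes, while the checkpoint shapes match head_dim=128. The head counts (16 query heads, 4 key/value heads) are inferred from the numbers in the log, not read from config.json, and the helper name qwen2_attn_shapes is hypothetical.

```python
def qwen2_attn_shapes(hidden_size, num_heads, num_kv_heads, head_dim=None):
    """Return the attention projection weight shapes implied by a config.

    Assumption: when head_dim is not given, it is derived as
    hidden_size // num_heads, which is how the instantiated shapes
    in the warning appear to have been computed.
    """
    if head_dim is None:
        head_dim = hidden_size // num_heads
    q_out = num_heads * head_dim      # rows of q_proj.weight
    kv_out = num_kv_heads * head_dim  # rows of k_proj.weight and v_proj.weight
    return {
        "q_proj.weight": (q_out, hidden_size),
        "k_proj.weight": (kv_out, hidden_size),
        "v_proj.weight": (kv_out, hidden_size),
        "o_proj.weight": (hidden_size, q_out),
    }


# Head counts below are inferred from the log, not taken from the real config.
derived = qwen2_attn_shapes(hidden_size=1792, num_heads=16, num_kv_heads=4)
explicit = qwen2_attn_shapes(hidden_size=1792, num_heads=16, num_kv_heads=4, head_dim=128)

for name in derived:
    print(f"{name}: checkpoint-like {explicit[name]} vs instantiated-like {derived[name]}")
```

If this reading is right, the checkpoint was saved with a head dimension that no longer equals hidden_size // num_attention_heads, so the saved attention weights cannot be loaded into the shapes the model class builds from the config.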
Traceback (most recent call last):
  File "/home/llm_infer.py", line 84, in <module>
    infer_qwen(model_name='./output', messages=messages,
  File "/home/llm_infer.py", line 7, in infer_qwen
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4303, in from_pretrained
    dispatch_model(model, **device_map_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/big_modeling.py", line 494, in dispatch_model
    model.to(device)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3157, in to
    return super().to(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1174, in to
    return self._apply(convert)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 805, in _apply
    param_applied = fn(param)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1167, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
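The final error can be reproduced in isolation, which may help separate it from the shape mismatch. The sketch below assumes PyTorch 2.x and uses a small stand-in layer, not the actual Qwen2 model: a module created on the meta device has shapes but no data, so .to() fails, while .to_empty() allocates uninitialized storage. It seems likely, though the log does not confirm it, that the mismatched attention weights were left on the meta device during loading and that dispatch_model then hit this when calling model.to(device).

```python
import torch

# Stand-in layer on the meta device: shape metadata only, no storage.
layer = torch.nn.Linear(4, 4, device="meta")
was_meta = layer.weight.is_meta

# .to() has to copy data, and a meta tensor has none to copy.
copy_failed = False
try:
    layer.to("cpu")
except NotImplementedError as err:
    copy_failed = True
    print(f"to() failed: {err}")

# .to_empty() allocates storage on the target device without copying.
# The values are uninitialized, so real weights must still be loaded afterwards.
layer = layer.to_empty(device="cpu")
print(layer.weight.device, tuple(layer.weight.shape))
```

Note that to_empty() only removes the exception. The parameters it produces hold arbitrary values, so in this issue the shape mismatch between checkpoint and config would still need to be resolved for the model to give meaningful output.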