Conversion fails on model loaded via torch.load or torch.jit.load #221
Comments
Hi @saseptim, I don't believe ai-edge-torch can handle that file format; for this repo the PyTorch model needs to be torch.export compliant. You can find more details here: https://github.com/google-ai-edge/ai-edge-torch/blob/main/docs/pytorch_converter/README.md#conversion. Do you have an example script showing what you are doing? Thanks.
Marking this issue as stale since it has been open for 7 days with no activity. This issue will be closed if no further activity occurs.
Hi @pkgoogle, I just came across this as I was trying to convert a pix2pix model (training code here). I save using
Hi @jchwenger, we don't support non-torch-exportable models (plenty of custom models are torch exportable, though not all). Fundamentally the root issue is with torch.export, so we cannot fix that; however, once/if that is fixed, this library should be able to convert it, and if it still can't, the root cause may be a bug on our end. To test for torch exportability, follow the steps here: https://pytorch.org/docs/stable/export.html, i.e. load the model and see if you can export it with the PyTorch APIs. If you don't run into an exception, it is probably torch exportable. torch.export exports it to StableHLO, an MLIR dialect that is more interoperable with the ecosystem of libraries supporting MLIR, including this one. You can think of it as a different saving format that is more interoperable with other libraries. This is important for getting models working on heterogeneous hardware such as edge devices, mobile, TPUs, etc.
Hi @pkgoogle, thanks for this! I was confused by the phrasing in the docs, it's as simple as that: when you say "must be compliant with I have a very simple and runnable Colab here, maybe you will see something super obvious I missed? Strangely, I get an error around frozen tensors in the Pix2Pix for the in-place
Side note: the in-place
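If the frozen-tensor error really does come from in-place ops, one common workaround (a generic sketch, not an ai-edge-torch API) is to switch the activations to out-of-place before exporting:

```python
import torch
import torch.nn as nn

def disable_inplace(model: nn.Module) -> nn.Module:
    # Flip in-place activations to their out-of-place form; a common
    # workaround when torch.export complains about mutating frozen tensors.
    for m in model.modules():
        if isinstance(m, (nn.ReLU, nn.LeakyReLU)):
            m.inplace = False
    return model

# Toy network mimicking a GAN block that uses inplace activations.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(inplace=True))
disable_inplace(net)
```

This changes behavior only in memory use, not in the computed values, so it is usually safe to apply before export.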
Hi again, bumping this up, @pkgoogle. Just wondering: do you believe there is reasonable hope of fixing this discrepancy (off-the-shelf ResNet and a small dense net OK, but DCGAN and pix2pix not) when converting? Or should I try and post this issue on the PyTorch repo? Any thoughts welcome, thanks!
Hi @jchwenger, I'm having trouble figuring out which discrepancies you are referring to. Are you saying that we can convert ResNet and a small dense net, but not DCGAN and pix2pix? The answer will depend on what is causing the issue. If it's due to PyTorch export then the root cause is with PT Export (in which case you should create an issue there); if it's something else... well, I will have to investigate. For DCGAN and pix2pix, if we haven't confirmed it's PT Export, can you provide me a reproducible script that shows the error? (Sometimes people make small changes/adjustments in their code that actually affect the investigation.)
Thanks @pkgoogle for the answer! It's quite simple: with the custom dense net and ResNet, the test described in the original docs passes with "Inference result with Pytorch and TfLite was within tolerance", whereas with the DCGAN and pix2pix models it fails with "Something wrong with Pytorch --> TfLite". As you say, I don't know if it's the PT export or the conversion... I have all four examples in this Colab, which should be runnable out of the box, with only a session restart needed after installing the dependencies. Thanks in advance!
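For context, the pass/fail messages quoted here boil down to a numerical closeness check between the two backends' outputs; the comparison logic is essentially `np.allclose` (the arrays below are made up for illustration, not real converter output):

```python
import numpy as np

def within_tolerance(pt_out, tfl_out, atol=1e-5):
    # Element-wise closeness check, as in the docs' verification snippet.
    return bool(np.allclose(pt_out, tfl_out, atol=atol))

pt_out = np.array([0.10, 0.20, 0.30], dtype=np.float32)
tfl_out = pt_out + 1e-7  # tiny conversion noise: should still pass
msg = ("Inference result with Pytorch and TfLite was within tolerance"
       if within_tolerance(pt_out, tfl_out)
       else "Something wrong with Pytorch --> TfLite")
```

A failure of this check after conversion means the converted graph runs but produces numerically different results, which is a distinct problem from the export error in the original post.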
Hi @jchwenger, I'm looking into it, but are you associated with the OP? The reason I ask is that it feels like we are hijacking this thread, since the original problem seems different. If you are not, we would much prefer you create a new issue to track progress on your issues. In this case it looks like an accuracy issue post-conversion for DCGAN & pix2pix.
Fair point, all done, see here! |
Description of the bug:
I have a PyTorch model which was saved with torch.jit.save(). I tried both a traced model and a scripted model. The error is:
```
File /orcam/ear/scratch/usr/avis/VENV_AI_EDGE/lib/python3.10/site-packages/torch/export/_trace.py:1449, in _export(mod, args, kwargs, dynamic_shapes, strict, preserve_module_call_signature, pre_dispatch, _allow_complex_guards_as_runtime_asserts, _disable_forced_specializations, _is_torch_jit_trace)
   1447 original_state_dict = mod.state_dict(keep_vars=True)
   1448 if not _is_torch_jit_trace:
-> 1449     forward_arg_names = _get_forward_arg_names(mod, args, kwargs)
   1450 else:
   1451     forward_arg_names = None

File /orcam/ear/scratch/usr/avis/VENV_AI_EDGE/lib/python3.10/site-packages/torch/export/_trace.py:753, in _get_forward_arg_names(mod, args, kwargs)
    739 def _get_forward_arg_names(
    740     mod: torch.nn.Module,
    741     args: Tuple[Any, ...],
    742     kwargs: Optional[Dict[str, Any]] = None,
    743 ) -> List[str]:
    744     """
    745     Gets the argument names to forward that are used, for restoring the
    746     original signature when unlifting the exported program module.
    (...)
    751     export lifted modules.
    752     """
--> 753     sig = inspect.signature(mod.forward)
    754     _args = sig.bind_partial(*args).arguments
    756     names: List[str] = []

File /usr/lib/python3.10/inspect.py:3254, in signature(obj, follow_wrapped, globals, locals, eval_str)
   3252 def signature(obj, *, follow_wrapped=True, globals=None, locals=None, eval_str=False):
   3253     """Get a signature object for the passed callable."""
-> 3254     return Signature.from_callable(obj, follow_wrapped=follow_wrapped,
   3255                                    globals=globals, locals=locals, eval_str=eval_str)

File /usr/lib/python3.10/inspect.py:3002, in Signature.from_callable(cls, obj, follow_wrapped, globals, locals, eval_str)
   2998 @classmethod
   2999 def from_callable(cls, obj, *,
   3000                   follow_wrapped=True, globals=None, locals=None, eval_str=False):
   3001     """Constructs Signature for the given callable object."""
-> 3002     return _signature_from_callable(obj, sigcls=cls,
   3003                                     follow_wrapper_chains=follow_wrapped,
   3004                                     globals=globals, locals=locals, eval_str=eval_str)

File /usr/lib/python3.10/inspect.py:2550, in _signature_from_callable(obj, follow_wrapper_chains, skip_bound_arg, globals, locals, eval_str, sigcls)
   2548 except ValueError as ex:
   2549     msg = 'no signature found for {!r}'.format(obj)
-> 2550     raise ValueError(msg) from ex
   2552 if sig is not None:
   2553     # For classes and objects we skip the first parameter of their
   2554     # call, new, or init methods
   2555     if skip_bound_arg:

ValueError: no signature found for <torch.ScriptMethod object at 0x7f942662ffb0>
```
Actual vs expected behavior:
No response
Any other information you'd like to share?
No response