This is a follow-up to #117958, which exposes only the bare JIT code of the executor object.
To improve the debugging experience for the JIT implementation, a map between each uop and its generated machine code should be implemented. @brandtbucher suggested the following in his comment:
Hello, thanks for the PR! It certainly does the job of capturing the machine code generated by the JIT, but I was hoping to have a map between the uop bytecode and the related machine code, similar to what I was envisaging here
So, I've thought about this, and it should be possible with a couple of tweaks.
Basically, this current PR returns a byte string, which consists of the code for each instruction in sequence, followed by the auxiliary data for each instruction in sequence.
However, the executor knows the uops that make up its trace. If we #include "jit_stencils.h", we should be able to use stencil_groups[instruction->opcode].code.body_size and stencil_groups[instruction->opcode].data.body_size to compute these chunks.
Maybe @tonybaloney and @diegorusso can confirm, but it seems like the most useful info to return would be a 3-tuple of base address, a list of code byte strings (corresponding to uops) and a list of data byte strings (again, corresponding to uops).
So, for the above example, the return value would be:
(I think base address is needed for some absolute addressing that we use in places.)
So each of the code or data lists can be zip'd with the executor to map them to individual uops. And if I want the raw string of data that this PR returns now, I can just take this tuple and do b"".join(result[1] + result[2]).
Would this meet everyone's needs, or am I overthinking it? Even though it's internal, I don't want to tweak this too much after the beta freeze on Monday, so I'm leaning towards providing more information rather than less.
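The layout described in the comment above can be illustrated with a small Python sketch. It is not the CPython implementation: `split_jit_blob`, `JitChunks`, the uop names, the byte values, and the size lists are all hypothetical stand-ins. In the real JIT the sizes would come from the stencil groups, and this sketch assumes the blob contains nothing except the per-uop code chunks followed by the per-uop data chunks.

```python
from typing import NamedTuple


class JitChunks(NamedTuple):
    """Hypothetical 3-tuple: base address, per-uop code, per-uop data."""
    base_address: int
    code: list[bytes]
    data: list[bytes]


def split_jit_blob(base_address, blob, code_sizes, data_sizes):
    """Split a flat blob (all code chunks, then all data chunks) per uop.

    code_sizes[i] and data_sizes[i] are assumed to be the code and data
    body sizes of the i-th uop in the trace.
    """
    if len(code_sizes) != len(data_sizes):
        raise ValueError("need one code size and one data size per uop")
    if sum(code_sizes) + sum(data_sizes) != len(blob):
        raise ValueError("sizes do not add up to the blob length")

    offset = 0
    code = []
    for size in code_sizes:
        code.append(blob[offset:offset + size])
        offset += size
    data = []
    for size in data_sizes:
        data.append(blob[offset:offset + size])
        offset += size
    return JitChunks(base_address, code, data)


# Made-up trace of two uops; names, bytes, and address are placeholders.
uops = ["_UOP_A", "_UOP_B"]
blob = b"\x01\x02\x03" + b"\x04" + b"\xaa\xbb"
result = split_jit_blob(0x1000, blob, code_sizes=[3, 1], data_sizes=[2, 0])

# Zip the chunk lists with the trace to map each uop to its bytes.
per_uop = dict(zip(uops, zip(result.code, result.data)))

# The original flat byte string can be recovered from the tuple.
flat = b"".join(result[1] + result[2])
```

Returning the chunks separately keeps the flat form recoverable with a single join, while the reverse direction would require the caller to know every stencil size.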
Feature or enhancement
Proposal:
Originally posted by @brandtbucher in #117959 (comment)
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
https://discuss.python.org/t/jit-mapping-bytecode-instructions-and-assembly/50809
#117958