Map UOps with code generated by the JIT #118467

Open
diegorusso opened this issue May 1, 2024 · 0 comments
Labels
type-feature A feature request or enhancement

diegorusso commented May 1, 2024

Feature or enhancement

Proposal:

This is a follow-up to #117958, which exposes only the raw JIT code of the executor object.

To improve the debugging experience for the JIT implementation, a map between each UOp and the machine code generated for it should be implemented.
@brandtbucher suggested the following in his comment:

Hello, thanks for the PR! It certainly does the job of capturing the machine code generated by the JIT, but I was hoping to have a map between the uop bytecode and the related machine code, similar to what I was envisaging here

So, I've thought about this, and it should be possible with a couple of tweaks.

Basically, this current PR returns a byte string, which consists of the code for each instruction in sequence, followed by the auxiliary data for each instruction in sequence.

Meaning, for a trace of:

[A, B, C, D]

It returns:

b"".join([<A code>, <B code>, <C code>, <D code>, <A data>, <B data>, <C data>, <D data>, <padding>])

However, the executor knows the uops that make up its trace. If we #include "jit_stencils.h", we should be able to use stencil_groups[instruction->opcode].code.body_size and stencil_groups[instruction->opcode].data.body_size to compute these chunks.
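The chunking described above can be sketched in Python for illustration. This is only a rough model, not the proposed C implementation: `split_jit_blob` is a hypothetical helper, and the `code_sizes` and `data_sizes` lists stand in for the per-uop `code.body_size` and `data.body_size` values that would be looked up in `stencil_groups`.

```python
from itertools import accumulate


def split_jit_blob(blob, code_sizes, data_sizes):
    """Split a flat JIT byte string into per-uop code and data chunks.

    Hypothetical helper. Assumes the layout described in the comment:
    all code bodies in trace order, then all data bodies in trace order,
    then optional padding, which is ignored here.
    """
    if len(code_sizes) != len(data_sizes):
        raise ValueError("expected one code size and one data size per uop")
    sizes = list(code_sizes) + list(data_sizes)
    if sum(sizes) > len(blob):
        raise ValueError("sizes exceed the length of the blob")
    # Running totals give the boundary offsets of every chunk.
    offsets = [0, *accumulate(sizes)]
    chunks = [blob[start:end] for start, end in zip(offsets, offsets[1:])]
    n = len(code_sizes)
    return chunks[:n], chunks[n:]


# Made-up bytes for a four-uop trace [A, B, C, D], plus two bytes of padding.
blob = b"AAABBCDDDD" + b"aabccc" + b"\x00\x00"
code, data = split_jit_blob(blob, [3, 2, 1, 4], [2, 1, 3, 0])
print(code)
print(data)
```

A uop with no auxiliary data simply yields an empty byte string, so both lists stay the same length as the trace.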

Maybe @tonybaloney and @diegorusso can confirm, but it seems like the most useful info to return would be a 3-tuple of base address, a list of code byte strings (corresponding to uops) and a list of data byte strings (again, corresponding to uops).

So, for the above example, the return value would be:

(
    <base address>,
    [<A code>, <B code>, <C code>, <D code>],
    [<A data>, <B data>, <C data>, <D data>],
)

(I think base address is needed for some absolute addressing that we use in places.)
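As a rough illustration of why the base address is useful, the sketch below derives a start address for every chunk from the 3-tuple. `chunk_addresses` is a hypothetical helper, and it assumes the chunks are laid out contiguously from the base address with no gap between the code and data regions, which may not hold for the real JIT.

```python
def chunk_addresses(base, code, data):
    """Return (code_addresses, data_addresses) for the given chunks.

    Hypothetical helper. Assumes code chunks start at `base` and that
    data chunks follow the last code chunk immediately.
    """
    addresses = []
    cursor = base
    for chunk in [*code, *data]:
        addresses.append(cursor)
        cursor += len(chunk)
    n = len(code)
    return addresses[:n], addresses[n:]


# Made-up base address and bytes for a two-uop trace.
code_addrs, data_addrs = chunk_addresses(0x1000, [b"AAA", b"BB"], [b"aa", b"b"])
print([hex(a) for a in code_addrs])
print([hex(a) for a in data_addrs])
```

With these addresses, an absolute pointer embedded in the code could be matched against the chunk it points into.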

So each of the code or data lists can be zip'd with the executor to map them to individual uops. And if I want the raw string of data that this PR returns now, I can just take this tuple and do b"".join(result[1] + result[2]).
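A minimal sketch of that usage follows. The `result` tuple and the `trace` list are made-up stand-ins for the proposed return value and for the executor's uops; no real API is called here.

```python
# Stand-in for the uops of an executor's trace (hypothetical names).
trace = ["A", "B", "C", "D"]

# Stand-in for the proposed return value: (base address, code list, data list).
result = (
    0x1000,
    [b"AAA", b"BB", b"C", b"DDDD"],
    [b"aa", b"b", b"ccc", b""],
)

base, code, data = result

# Pair every uop with its own code and data bytes. A list of tuples is used
# rather than a dict because the same uop can appear more than once in a trace.
per_uop = list(zip(trace, code, data))
for name, code_bytes, data_bytes in per_uop:
    print(name, code_bytes.hex(), data_bytes.hex())

# Recover the flat byte string (code bodies followed by data bodies).
raw = b"".join(result[1] + result[2])
print(raw)
```

Since the tuple carries no padding element, the joined string in this sketch would not include any trailing padding.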

Would this meet everyone's needs, or am I overthinking it? Even though it's internal, I don't want to tweak this too much after the beta freeze on Monday, so I'm leaning towards providing more information rather than less.

Originally posted by @brandtbucher in #117959 (comment)

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

https://discuss.python.org/t/jit-mapping-bytecode-instructions-and-assembly/50809

#117958

@diegorusso diegorusso added the type-feature A feature request or enhancement label May 1, 2024