Is it possible to have debug symbols/stack unwinding for debugging? #313

Open
sakjain92 opened this issue Nov 6, 2020 · 10 comments

@sakjain92

sakjain92 commented Nov 6, 2020

I am mostly interested in using the assembler to JIT-compile machine code and then execute it. Is it possible to make the JIT code debuggable with GDB at runtime, or to make it work with backtrace()? If not, does this mean there is no support for stack unwinding when I use the assembler?
(I have an external function, called from the JIT code, that uses int backtrace(void **buffer, int size); (man 3 backtrace) to get a stack trace.)
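To make the setup concrete, here is a minimal sketch of the kind of external function meant above. The name captureStackDepth is made up for illustration, and it assumes glibc's <execinfo.h>. Without unwind info for the JIT frames, the walk would be expected to stop or go wrong once it reaches JIT code.

```cpp
#include <execinfo.h>  // backtrace(), backtrace_symbols_fd(); glibc-specific

// Hypothetical helper that JIT-compiled code would call.
// Returns how many frames the unwinder managed to walk.
extern "C" int captureStackDepth() {
  void* frames[64];
  // backtrace() fills `frames` with return addresses and returns the count.
  int depth = backtrace(frames, 64);
  // Print the raw frames to stderr (fd 2); JIT frames have no symbol names.
  backtrace_symbols_fd(frames, depth, 2);
  return depth;
}
```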

From what I have read, stack unwinding (e.g. backtrace()) would require the JIT code to have an .eh_frame ELF section and to register it. GDB, though, requires a different method for registering JIT code (debug symbols have to be registered with GDB itself, and I am assuming that the assembler doesn't currently output any debug symbols section).
(source: http://www.corsix.org/content/libunwind-dynamic-code-x86-64)

@garazdawi

garazdawi commented Nov 6, 2020

We use the jit-reader support in gdb to do stack unwinding and symbol resolution. It works well enough and we found it easier to work with than elf and dwarf.
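For readers unfamiliar with it: the jit-reader support builds on GDB's JIT compilation interface, where the program exposes two well-known symbols and GDB sets a breakpoint on one of them. Below is a rough sketch of the program side, with declarations following the GDB manual. The helper registerJitSymfile is a made-up name, and the sketch ignores thread safety and unregistration. With a custom reader loaded through jit-reader-load, the symfile blob can be in whatever private format that reader understands.

```cpp
#include <cstdint>

extern "C" {
typedef enum { JIT_NOACTION = 0, JIT_REGISTER_FN, JIT_UNREGISTER_FN } jit_actions_t;

struct jit_code_entry {
  jit_code_entry* next_entry;
  jit_code_entry* prev_entry;
  const char* symfile_addr;    // start of the symbol blob
  std::uint64_t symfile_size;  // size of the blob in bytes
};

struct jit_descriptor {
  std::uint32_t version;
  std::uint32_t action_flag;   // one of jit_actions_t
  jit_code_entry* relevant_entry;
  jit_code_entry* first_entry;
};

// GDB puts a breakpoint here, so the function must not be inlined away.
__attribute__((noinline)) void __jit_debug_register_code() { __asm__ volatile(""); }

// The version field must be 1.
jit_descriptor __jit_debug_descriptor = {1, 0, nullptr, nullptr};
}

// Hypothetical helper: links `entry` into the list and notifies GDB.
inline void registerJitSymfile(jit_code_entry* entry, const char* blob,
                               std::uint64_t size) {
  entry->symfile_addr = blob;
  entry->symfile_size = size;
  entry->prev_entry = nullptr;
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry) entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();  // GDB reads the descriptor when this is hit
}
```

Note that this only helps GDB; it does nothing for backtrace() or libunwind inside the process.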

@sakjain92
Author

I will look into that also. Thanks. (BTW the link you added @garazdawi is broken)

My problem, though, is that external functions called from the JIT code call backtrace(), so I need support for that as well (support for glibc backtrace()/libunwind is unfortunately going to be different from GDB support). How could I work around that?

@garazdawi

Seems like you need something more complex than what we needed.

We use this as the jit-reader code: https://github.com/erlang/otp/blob/master/erts/etc/unix/jit-reader.c

And the data is generated here: https://github.com/erlang/otp/blob/master/erts/emulator/beam/jit/beam_asm.cpp#L575

@sakjain92
Author

It would be cool if AsmJit could add support for CFI directives (http://web.archive.org/web/20130111101034/http://blog.mozilla.org/respindola/2011/05/12/cfi-directives) like LLVM did.

I am assuming AsmJit doesn't have support for CFI directives currently, right? I guess I will have to generate the .eh_frame section manually outside AsmJit then.

@kobalicek
Member

No, it doesn't, but I would definitely review a PR that adds CFI directives. I see a need for some support for defining unwind info.

@vogelsgesang
Contributor

Leaving some notes/hints here in case anyone wants to implement this.
I got debug info with AsmJit working in a project I am working on, but unfortunately won't be able to contribute that source code. In case anyone implements this in AsmJit, I would be happy to provide feedback.

backtrace() and unwind tables for Linux/macOS

You need to write DWARF information.
In particular, you want to write an .eh_frame section and load it at runtime.
The .eh_frame section is an extended variant of the .debug_frame section from the DWARF spec. Of particular interest in that spec are section 6.4 (overview of .debug_frame concepts) and section 7.23 (the binary encoding).

Although the basic structure of eh_frame is the same as that of debug_frame, it deviates in a few important aspects.
The "System V Application Binary Interface" defines the eh_frame section. Of particular interest are sections 3.7 and 4.2.4.

For debugging/inspiration, llvm-dwarfdump --eh-frame is very useful. Using this tool, you can inspect the DWARF info that gcc/clang generate for a C++ binary and use it as a reference.

Once you have the eh_frame section, you need to register it at runtime. To do so, pass the address at which the eh_frame section was loaded to __register_frame. To unload the debug info, use __deregister_frame.
On OSX (and also on Linux, for projects which use libunwind instead of the default exception handling library), there is one more twist: you need to call __{register,deregister}_frame multiple times. See the LLVM source code for details.
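As far as I understand it, the reason for the multiple calls is that libunwind's __register_frame expects a single FDE, whereas libgcc's accepts the start of the whole section. A rough sketch of walking an .eh_frame image and visiting each FDE follows. The name forEachFde is made up, the record layout (32-bit length, then a CIE id that is 0 for a CIE) follows the .eh_frame format, and 64-bit DWARF records are deliberately not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

// Hypothetical helper: invokes `onFde` once per FDE found in the image and
// returns the number of FDEs visited.
inline std::size_t forEachFde(
    const std::uint8_t* start, std::size_t size,
    const std::function<void(const std::uint8_t*)>& onFde) {
  std::size_t count = 0;
  const std::uint8_t* p = start;
  const std::uint8_t* end = start + size;
  while (static_cast<std::size_t>(end - p) >= 4) {
    std::uint32_t length;  // record length, excluding the length field itself
    std::memcpy(&length, p, 4);
    // 0 terminates the section; 0xffffffff marks 64-bit DWARF (unsupported).
    if (length == 0 || length == 0xffffffffu) break;
    // A record needs at least the 4-byte id and must fit into the image.
    if (length < 4 || static_cast<std::size_t>(end - p) - 4 < length) break;
    std::uint32_t id;  // 0 for a CIE, a non-zero CIE pointer for an FDE
    std::memcpy(&id, p + 4, 4);
    if (id != 0) {
      onFde(p);
      ++count;
    }
    p += 4 + static_cast<std::size_t>(length);
  }
  return count;
}
```

With libunwind, the callback would pass each FDE pointer to __register_frame (and later to __deregister_frame). With libgcc, a single call with the section start should be enough.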

backtrace() and unwind tables for Windows

Windows exception handling works slightly differently.

The two main differences:

You register/deregister your unwind info using RtlAddFunctionTable and RtlDeleteFunctionTable.

@kobalicek
Member

What about having directives in AsmJit that would help with these? Would that help?

@vogelsgesang
Contributor

vogelsgesang commented Aug 31, 2021

Not sure what exactly you mean by "directives in AsmJit".

For me, something like the following interface would be pretty helpful:

CodeHolder code;
code.init(rt.environment());
x86::Assembler a(&code);
EHFrameWriter eh(&code);

eh.startFunction(); // Mark the function start
a.push(x86::rbp);
eh.addRegisterSpill(x86::rbp); // Register spills need to be annotated to emit correct `.eh_frame` info
a.sub(x86::rsp, 64); // Allocate 64 bytes on the stack
eh.allocateLocalStack(64); // Needs to be mirrored into the `.eh_frame` info

// emit function body...

a.add(x86::rsp, 64); // Deallocate stack
eh.allocateLocalStack(-64); // Mirror into `.eh_frame` by allocating negative memory
a.pop(x86::rbp);
eh.unspillRegister(x86::rbp); // Need to also mirror "unspills" to keep call-frame offset in Dwarf info in sync

a.ret();
eh.endFunction(); // Need to mark function end

I am using only the Assembler level of AsmJit, with no FuncFrame or Compiler. Hence, I wouldn't need support for exception-handling info/unwind tables in the Compiler layer.
If support for Compiler were needed, I guess one would have to adjust Assembler::emitProlog to emit the necessary .eh_frame info by calling into EHFrameWriter.
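To illustrate what such a writer would record, here is a standalone sketch that does not use AsmJit at all. CfiProgram and its methods are made-up names mirroring the interface above, except that the code offset is passed in explicitly instead of being read from the assembler. It assumes x86-64, a CIE with CFA = rsp + 8 on entry, and a data alignment factor of -8. The opcode values are the standard DWARF call-frame instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of AsmJit: collects the DWARF call-frame
// instructions of one function (the instruction part of an FDE).
class CfiProgram {
public:
  static constexpr std::uint8_t kRbp = 6;  // DWARF number of rbp on x86-64

  // `codeOffset` is the offset right after the instruction being mirrored.
  void addRegisterSpill(std::uint32_t codeOffset, std::uint8_t dwarfReg) {
    advanceTo(codeOffset);
    cfaOffset_ += 8;  // the push moved rsp down by 8
    emitDefCfaOffset();
    bytes_.push_back(static_cast<std::uint8_t>(0x80 | dwarfReg));  // DW_CFA_offset
    emitUleb(static_cast<std::uint32_t>(cfaOffset_ / 8));  // factored by -8
  }
  void unspillRegister(std::uint32_t codeOffset, std::uint8_t dwarfReg) {
    advanceTo(codeOffset);
    cfaOffset_ -= 8;  // the pop moved rsp up by 8
    emitDefCfaOffset();
    bytes_.push_back(static_cast<std::uint8_t>(0xc0 | dwarfReg));  // DW_CFA_restore
  }
  // A negative `bytes` mirrors a deallocation, as in the interface above.
  void allocateLocalStack(std::uint32_t codeOffset, std::int32_t bytes) {
    advanceTo(codeOffset);
    cfaOffset_ += bytes;
    emitDefCfaOffset();
  }
  const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
  void advanceTo(std::uint32_t codeOffset) {
    std::uint32_t delta = codeOffset - lastOffset_;
    assert(delta < 64 && "sketch only supports the short DW_CFA_advance_loc");
    if (delta != 0) bytes_.push_back(static_cast<std::uint8_t>(0x40 | delta));
    lastOffset_ = codeOffset;
  }
  void emitDefCfaOffset() {
    assert(cfaOffset_ >= 0);
    bytes_.push_back(0x0e);  // DW_CFA_def_cfa_offset
    emitUleb(static_cast<std::uint32_t>(cfaOffset_));
  }
  void emitUleb(std::uint32_t v) {  // unsigned LEB128 encoding
    do {
      std::uint8_t b = v & 0x7f;
      v >>= 7;
      if (v != 0) b |= 0x80;
      bytes_.push_back(b);
    } while (v != 0);
  }

  std::vector<std::uint8_t> bytes_;
  std::int32_t cfaOffset_ = 8;  // CFA = rsp + 8 on entry (return address)
  std::uint32_t lastOffset_ = 0;
};
```

For the prologue above, one would call addRegisterSpill(1, CfiProgram::kRbp) after the 1-byte push and allocateLocalStack(5, 64) after the 4-byte sub. The resulting bytes would still need to be wrapped into a CIE/FDE pair before being registered.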

@kobalicek
Member

Something like that, but I think it would be possible to emit this into an additional section instead of having a separate write process for it. That way it would work with Builder/Compiler as well.

@vogelsgesang
Contributor

Yes, I also had "emit into a separate section" in mind here.
The EHFrameWriter class I sketched above was mostly meant as a wrapper/utility class that encapsulates the DWARF encoding.

Conceptually, it would do something like

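// Conceptual sketch: `a` is the assembler and `ehFrameSection` the unwind-info
// section, both assumed to be held by the writer.
void EHFrameWriter::addRegisterSpill(x86::Gpq reg) {
   auto* prevSection = a.currentSection(); // remember the section being emitted into
   a.section(ehFrameSection);              // switch to the unwind-info section

   // emit Dwarf info here

   a.section(prevSection);                 // switch back to the previous section
}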

but afaik switching sections frequently is relatively expensive...
I was wondering whether it would make sense to let EHFrameWriter inherit directly from BaseEmitter and thereby avoid frequent switches between sections. I am not sure whether it is valid to have multiple emitters attached to the same code holder, as long as the emitters emit into separate sections...
