Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/jit-rt: revive and migrates the whole thing to OrcJIT v2 #4774

Draft
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

liushuyu
Copy link
Contributor

@liushuyu liushuyu commented Nov 7, 2024

This pull request migrates the dynamic compile system to LLVM OrcJIT v2 (the newer "v2" version from post-LLVM 16 with opaque pointers, not the older "v2" version from the LLVM 12 ~ 15 era).

The implementation is currently in very rough shape, and I am not entirely sure what I am doing. However, the "Hello World" example in the documentation has started to work now.

#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Verifier.h"
//===-- optimizer.cpp -----------------------------------------------------===//
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied almost verbatim from gen/optimizer.cpp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use a single file for both projects then, with a little predefine if needed.

@kinke
Copy link
Member

kinke commented Nov 7, 2024

Hehe, looks like one's motivated, nice! - In case you haven't seem them, there are quite a few lit-tests for this feature: https://github.com/ldc-developers/ldc/tree/master/tests/dynamiccompile. No idea if anything can still be salvaged from #3184, Ivan's WIP from 5 (!) years ago by now.

@liushuyu
Copy link
Contributor Author

liushuyu commented Nov 7, 2024

Hehe, looks like one's motivated, nice! - In case you haven't seem them, there are quite a few lit-tests for this feature: https://github.com/ldc-developers/ldc/tree/master/tests/dynamiccompile. No idea if anything can still be salvaged from #3184, Ivan's WIP from 5 (!) years ago by now.

I see. I did not know there was another pull request doing the same thing from 5 years ago. Now that I see they made very similar changes before, I realize I duplicated a fair amount of changes (but this seems like a form of cross-validation? Two independent implementations look very similar).

@liushuyu
Copy link
Contributor Author

liushuyu commented Nov 9, 2024

there are quite a few lit-tests for this feature: https://github.com/ldc-developers/ldc/tree/master/tests/dynamiccompile.

Now, 100% of those LIT tests are passing (with LLVM 18).

@liushuyu liushuyu force-pushed the orcjit-v2 branch 4 times, most recently from 2c90767 to bd2c4b5 Compare November 9, 2024 19:28
@kinke
Copy link
Member

kinke commented Nov 10, 2024

Very nice progress! - I've pushed 2 commits to reduce the remaining failures on aarch64.

@liushuyu liushuyu force-pushed the orcjit-v2 branch 2 times, most recently from 465a543 to 8cbe0d1 Compare November 11, 2024 18:15
@@ -28,7 +28,7 @@ void main(string[] args)
settings.optLevel = 3;
settings.dumpHandler = (DumpStage stage, in char[] str)
{
if (DumpStage.FinalAsm == stage)
if (DumpStage.FinalAsm == stage || DumpStage.OptimizedModule == stage)
Copy link
Member

@kinke kinke Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this doesn't just output the optimized IR now too in case the test fails, but also makes the indexOf substring searches now also search in the IR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this doesn't just output the optimized IR now too in case the test fails, but also makes the indexOf substring searches now also search in the IR.

Yes, LLVM on arm64 uses very clever optimizations, and the constants do not appear in the assembly (because LLVM loads the constant into the register using bit shifts).

@kinke
Copy link
Member

kinke commented Nov 11, 2024

In my Windows VM, the tests don't hang as they do for CI (with a pushed Win32 fix wrt. __chkstk, now on 32-bit Windows too). Instead, some dynamic-compile tests pass (on both 32 and 64-bit Windows), others fail with errors like:

  • dynamiccompile/simd_simple_opt.d:
    • 64-bit:
      Dynamic compiler fatal: Symbol not found in jitted code: "�.jit_bind" (".jit_bind")
      
    • 32-bit:
      Incorrect number of arguments passed to called function!
        call x86_stdcallcc void @"\01__D3ldc15dynamic_compile__T4bindTPFG16hZG32hTxG16hZQyFQvxQlZ7wrapperFSQCpQCo__TQCaTQByTxQBqZQCnFQClxQCcZ7ContextQCzZQCy"(ptr inreg noalias align 1 %3, ptr byval(%"ldc.dynamic_compile.bind!(ubyte[32] function(ubyte[16]), const(ubyte[16])).bind.Context") align 4 %4) #0
      LLVM ERROR: Broken module found, compilation aborted!
      
  • dynamiccompile/simple.d:
    • 64-bit:
      Dynamic compiler fatal: Symbol not found in jitted code: "�_D6simple3bazFZv" ("_D6simple3bazFZv")
      
    • 32-bit:
      Dynamic compiler fatal: Symbol not found in jitted code: "�__D6simple3barFZi" ("__D6simple3barFZi")
      

Looks like there's some trouble with the \1 () mangle prefix on Windows, which is used to make LLVM not apply any extra mangling (e.g., extra leading underscore on Win32, and vectorcall/stdcall suffix on Win64).

Edit: We do use \1.jit_bind in general for all targets (https://github.com/liushuyu/ldc/blob/9531ab3828e6b4d1fe675cfcc15085d4ce948155/runtime/jit-rt/cpp-so/bind.cpp#L47-L48), but it's only a problem on Windows. So maybe not name/mangle-related after all, but some other Windows-specific issue.

@liushuyu
Copy link
Contributor Author

liushuyu commented Nov 11, 2024

dynamiccompile/simple.d:

  • 64-bit:
    Dynamic compiler fatal: Symbol not found in jitted code: "�_D6simple3bazFZv" ("_D6simple3bazFZv")
    

I have tried to debug this problem on a Windows 11 installation with Visual Studio 2022 (Community Edition).
It seems like either our dynamic compile mechanism setup LLVM in the wrong way, or there is an LLVM bug on Windows: the symbols were not materialized when they should be (LLVM on Unix platforms will materialize the JIT symbols during the lookup, i.e. lazily compile/resolve the symbols).

@liushuyu liushuyu force-pushed the orcjit-v2 branch 2 times, most recently from 60fedcc to 36bd753 Compare November 12, 2024 02:05
@liushuyu
Copy link
Contributor Author

On Ubuntu 20.04 aarch64:

********************
Unexpectedly Passed Tests (1):
  LDC :: dynamiccompile/throw.d

But it's still broken on macOS aarch64 unfortunately

liushuyu and others added 11 commits November 16, 2024 21:43
... currently only supports LLVM 18+

Co-Authored-By: Ivan <[email protected]>
To avoid Apple linker errors on arm64 wrt. misaligned pointers.
AFAICT, there's no need to pack the struct, just use natural alignment.
Bump the LDC_DYNAMIC_COMPILE_API_VERSION accordingly.
... LLVM can be very clever about loading values into registers on
aarch64
Fix 'unresolved external symbol ___chkstk' by NOT letting LLVM add
an implicit extra leading underscore - the final mangle is `__chkstk`
on both 32/64-bit Windows.
@liushuyu
Copy link
Contributor Author

It seems like I may have discovered several bugs in LLVM that are exclusive to Windows. I have currently worked around them because I am not entirely sure if they are "bugs" or "intended behaviors".

... also adds a batched lookup function
... there seems to be some LLVM behavior issues on Windows, where
looking up a symbol on the second time will crash either LLVM itself or
corrupts the symbol address obtained from LLVM
... JITLink does not work very well for the code patterns generated by
LDC
@liushuyu liushuyu force-pushed the orcjit-v2 branch 2 times, most recently from 297af71 to 6662b46 Compare November 17, 2024 06:13
@liushuyu liushuyu force-pushed the orcjit-v2 branch 2 times, most recently from f2266e0 to b0606f7 Compare November 17, 2024 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants