Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vim bottle is miscompiled because Xcode 16 has codegen bugs #195325

Closed
4 tasks done
ychin opened this issue Oct 23, 2024 · 23 comments · Fixed by #195456
Closed
4 tasks done

Vim bottle is miscompiled because Xcode 16 has codegen bugs #195325

ychin opened this issue Oct 23, 2024 · 23 comments · Fixed by #195456
Labels
bug Reproducible Homebrew/homebrew-core bug

Comments

@ychin
Copy link
Contributor

ychin commented Oct 23, 2024

brew gist-logs <formula> link OR brew config AND brew doctor output

HOMEBREW_VERSION: 4.4.2-23-g62c1f5b
ORIGIN: https://github.com/Homebrew/brew
HEAD: 62c1f5b42bb9959414d1e016f96385f48b9b961d
Last commit: 77 minutes ago
Core tap JSON: 23 Oct 08:36 UTC
Core cask tap JSON: 23 Oct 08:36 UTC
HOMEBREW_PREFIX: /opt/homebrew
HOMEBREW_CASK_OPTS: []
HOMEBREW_DISPLAY: /private/tmp/com.apple.launchd.UhYxpJnKWC/org.macosforge.xquartz:0
HOMEBREW_MAKE_JOBS: 10
HOMEBREW_SORBET_RUNTIME: set
Homebrew Ruby: 3.3.5 => /opt/homebrew/Library/Homebrew/vendor/portable-ruby/3.3.5/bin/ruby
CPU: 10-core 64-bit arm_firestorm_icestorm
Clang: 16.0.0 build 1600
Git: 2.47.0 => /opt/homebrew/bin/git
Curl: 8.7.1 => /usr/bin/curl
macOS: 14.6.1-arm64
CLT: 15.3.0.0.1.1708646388
Xcode: 16.0
Rosetta 2: false

Your system is ready to brew

Verification

  • My brew doctor output says Your system is ready to brew. and am still able to reproduce my issue.
  • I ran brew update and am still able to reproduce my issue.
  • I have resolved all warnings from brew doctor and that did not fix my problem.
  • I searched for recent similar issues at https://github.com/Homebrew/homebrew-core/issues?q=is%3Aissue and found no duplicates.

What were you trying to do (and why)?

The current bottled Vim seems to have behavior bugs because the compiler is miscompiling it. From what we can tell Xcode 16 seems to have a fair amount of codegen bugs leading to this.

This issue was discovered in vim/vim#15764, when we attempted to upgrade Vim's CI to use macos-15 runners. It failed one of the tests, and after investigation it appears to be an inherent Xcode 16 clang bug that has not yet been fixed in Xcode.

Note that Vim doesn't seem to be the only affected software. From browsing around it appears JDK is also facing issues with Xcode 16 clang generating wrong results (https://bugs.openjdk.org/browse/JDK-8340341).

I tried to look around but I couldn't find other reports in Homebrew. I'm guessing this is either an niche bug affecting only few software, or the bug is so subtle that it is not easy to detect. Are all bottles in Homebrew built with Xcode 16?

Edit: Note that this affects Neovim as well. One caveat is that Vim bottles seem to be broken in both macOS 14/15 but Neovim bottles seem to be broken only in macOS 15. Using otool -l and looking under LC_BUILD_VERSION, it does seem that Vim is built with Xcode 16 for both macOS 14/15, and Neovim is only using Xcode 16 to build bottles for macOS 15, so that matches the observed behaviors.

What happened (include all command output)?

The cursor landed on the <82> part, which is not the expected behavior. This doesn't happen when Vim is built in Xcode 15 or when building in Xcode 16 using -O1/-O0.

What did you expect to happen?

The cursor should land on the <e2> part, which is where the invalid UTF-8 sequence starts. This is how it works on macOS 14.

Step-by-step reproduction instructions (by running brew commands)

1. `brew install vim` (this will download the v9.1.0800 bottled version of Vim as of this writing)
2. `vim`
3. Enter this command in Vim `:call setline(1, "abcdef\xE2\x82xyz")`.
4. Type `8g8` (this Vim command finds the first invalid UTF-8 sequence).
5. Observe
@ychin ychin added the bug Reproducible Homebrew/homebrew-core bug label Oct 23, 2024
@ychin ychin changed the title Vim bottle is miscompiled under macOS 15 because Xcode 16 has codegen bugs Vim bottle is miscompiled because Xcode 16 has codegen bugs Oct 23, 2024
@carlocab
Copy link
Member

carlocab commented Oct 23, 2024

I wonder if it's related to Apple's new libiconv implementation.

See https://savannah.gnu.org/bugs/?65686 and discussion at #194691.

Does it still happen if you link against https://formulae.brew.sh/formula/libiconv#default instead?

@ychin
Copy link
Contributor Author

ychin commented Oct 23, 2024

It's not iconv. This is a code miscompilation issue, which I verified by using a debugger/disassembler. The code in question does not call iconv code and displays different results depending on whether one uses -O2 or -O1 in clang.

I actually have a dedicated repository that one can use to reproduce this bug and it does not rely on iconv and is a pretty simple one-file test (it uses extracted Vim code to make it easier to inspect): https://github.com/ychin/xcode16-clang-codegen-bug


Speaking of iconv, Apple has fixed iconv around macOS 14.2 to be conformant to how GNU iconv works. It was only broken around macOS 14.0 - 14.1. I was going to file an issue about that, but it's another topic of discussion.

@ychin
Copy link
Contributor Author

ychin commented Oct 23, 2024

From further testing, seems like system Vim (/usr/bin/vim) is broken as well, probably because Apple uses Xcode 16 to build macOS 15's suite of bundled software. I think for Homebrew we should still ideally not use a known defective compiler though.

@carlocab
Copy link
Member

Speaking of iconv, Apple has fixed iconv around macOS 14.2 to be conformant to how GNU iconv works. It was only broken around macOS 14.0 - 14.1.

Doesn't seem that way, since gnulib rejects all Apple libiconv implementations starting macOS 14. The libunistring test suite also fails if you insist on linking against Apple libiconv. (We do it anyway to avoid propagating the GNU libiconv dependency everywhere, since it doesn't seem to cause too many user-facing issues.)

@carlocab
Copy link
Member

I think for Homebrew we should still ideally not use a known defective compiler though.

A bit late for that, since we just upgraded our macOS 14 machines to Xcode 16. If the issue is more widespread we could maybe discuss rolling it back. CC @Homebrew/maintainers

@carlocab
Copy link
Member

I actually have a dedicated repository that one can use to reproduce this bug and it does not rely on iconv and is a pretty simple one-file test (it uses extracted Vim code to make it easier to inspect): ychin/xcode16-clang-codegen-bug

Are you able to reproduce the issue with llvm@17?

@ychin
Copy link
Contributor Author

ychin commented Oct 23, 2024

Are you able to reproduce the issue with llvm@17?

I tried llvm@17/18/19 and could not reproduce the issue. This seems to only exist in Xcode 16 clang. I did file a bug with Apple using their Feedback Assistant but as usual I haven't heard back regarding the bug. My test repo also has CI set up but right now it's only testing Xcode 15 vs Xcode 16 and now the open source LLVM compilers (but not that hard to add them).

Edit: I added CI tests to my repo to test open source LLVM as well, as can be seen here: https://github.com/ychin/xcode16-clang-codegen-bug/actions/runs/11486836312

A bit late for that, since we just upgraded our macOS 14 machines to Xcode 16. If the issue is more widespread we could maybe discuss rolling it back

To be fair I don't know how widespread this bug is, but at least from my linked thread in the issue it does seem to affect OpenJDK as well. In that case, the miscompiled code causes a runtime exception so it's causing build step to fail it seems. For now they are just saying "don't use Xcode 16".

@carlocab
Copy link
Member

carlocab commented Oct 23, 2024

To be fair I don't know how widespread this bug is, but at least from my linked thread in the issue it does seem to affect OpenJDK as well. In that case, the miscompiled code causes a runtime exception so it's causing build step to fail it seems. For now they are just saying "don't use Xcode 16".

Yes, we've seen that as well. We are, in fact, not using Xcode 16 to build OpenJDK:

on_macos do
if DevelopmentTools.clang_build_version == 1600
depends_on "llvm" => :build
fails_with :clang do
cause <<~EOS
Exception in thread "main" java.lang.ClassFormatError: StackMapTable format error: bad verification type
at jdk.compiler/com.sun.tools.javac.Main.compile(Main.java:64)
at jdk.compiler/com.sun.tools.javac.Main.main(Main.java:52)
EOS
end
end
end

But the OpenJDK bug seems separate from the one affecting Vim.

@Bo98
Copy link
Member

Bo98 commented Oct 23, 2024

The problem doesn't seem to have been wide spread. OpenJDK has a ton of for loops but the problem for some reason only happened with a very specific one and it wouldn't miscompile if you touched it slightly so was extremely difficult to narrow down into a test case to report to Apple beyond "compile this massive project".

Your reproducer repo is helpful for narrowing it down - it's the smallest we've managed so far. I'll have a look and see if I can push Apple on it more.

Xcode 15 unfortunately does not run on macOS 15 and fixing Xcode 16 is important before Xcode's support for macOS 14 is dropped in a few months. Some packages are now requiring Xcode 16 to even build.

@ychin
Copy link
Contributor Author

ychin commented Oct 23, 2024

Right. In the case I see with Vim, it seems to be a bug related to inline functions, whereas in JDK it's some weird issue with the loop. It's hard to tell if the underlying cause is the same though unless we can narrow down to what optimizer pass is causing the issue specifically.

But I think this is why I'm raising an issue here. Maybe it's not "wide spread" per se, but so far two projects have already run into it. In JDK's case the miscompilation breaks the program in an obvious way that throws an exception which is good. In Vim's case, it's an infrequently used feature that only got caught by a test in CI. But it's hard to tell what other programs are affected. I do understand the fact that Xcode 16 is what Apple uses for everything and mandatory for targeting new macOS features so it may not easy or even possible to switch away.

@p-linnane
Copy link
Member

@ychin Thanks for the reproducer. I've verified this still occurs with the update in Xcode 16.2 beta (Apple clang version 16.0.0 (clang-1600.0.26.4).

@Bo98
Copy link
Member

Bo98 commented Oct 23, 2024

As an extra data point, I wasn't able to reproduce with the open sourced parts of Swift Clang: https://github.com/swiftlang/llvm-project/commits/swift-6.0-RELEASE. So seems limited to a bad optimiser pass that Apple have implemented internally for Xcode Clang only.

Thanks to your test case, I also narrowed down the bad pass to:

BISECT: running pass (193) ConstraintEliminationPass on utf_find_illegal

So it can be worked around I think with -mllvm -enable-constraint-elimination=0, but I've not tested OpenJDK yet to see.

@ychin
Copy link
Contributor Author

ychin commented Oct 23, 2024

Oh interesting, thanks for looking into this. I was not aware of this LLVM trick (optimization passes bisect) before. That seems very useful for future. I can confirm this flag fixes this particularly issue. I'll update the Apple bug with this information.

You can see that this flag would fix the issue: https://github.com/ychin/xcode16-clang-codegen-bug/actions/runs/11488583311/job/31975597614

Let me try to build Vim with this and see if tests pass, how it affects the program, etc.

@Bo98
Copy link
Member

Bo98 commented Oct 23, 2024

The optimisation eliminates a condition (unsurprisingly from the name) seemingly by attempting to collapse into a previous condition so sounds about right that it's the cause:

in function utf_find_illegal:
  in block %27 / %27:
    >   br i1 false, label %46, label %51
    <   %28 = icmp slt i8 %4, 0
    <   br i1 %28, label %47, label %52
19:                                               ; preds = %16, %13
  %20 = phi i64 [ 1, %13 ], [ %17, %16 ]
  %21 = getelementptr inbounds i8, ptr %3, i64 %20
  %22 = load i8, ptr %21, align 1, !tbaa !6
  %23 = and i8 %22, -64
  %24 = icmp eq i8 %23, -128
  br i1 %24, label %16, label %27

25:                                               ; preds = %6, %16
  %26 = icmp slt i8 %4, 0
  br i1 %26, label %28, label %51

27:                                               ; preds = %19
  br i1 false, label %46, label %51

LLVM opt does not do this change.

@ychin
Copy link
Contributor Author

ychin commented Oct 24, 2024

I ran Vim tests with this command line flag (only for Apple clang 16) and they all pass now (https://github.com/ychin/vim/actions/runs/11489113066), so the flag is working. I also did different testing and benchmarking and it does not appear to affect performance in any significant / concrete ways.

I would recommend modifying Vim / MacVim / Neovim formulas to use this flag when building using Xcode 16 until they have this bug fixed. As I mentioned I don't know if other projects other than JDK is affected but this would at least make sure Vim is not buggy.

@Bo98
Copy link
Member

Bo98 commented Oct 24, 2024

I tested OpenJDK and basic bootstrapping passes now (#195392) whereas before javac would crash instantly on launch. So it does seem to be the same issue.

I agree it makes sense to apply to Vim in the meantime.

I'm not sure if all the information here has been forwarded but we've at least forwarded your bug report FB number to our Apple contacts so hopefully it'll be something that can get fixed prior to macOS 14 support ending in Xcode.

@p-linnane
Copy link
Member

I'm not sure if all the information here has been forwarded but we've at least forwarded your bug report FB number to our Apple contacts so hopefully it'll be something that can get fixed prior to macOS 14 support ending in Xcode.

Apple has also been provided the repo for the reproducer and the flag.

carlocab added a commit to Homebrew/brew that referenced this issue Oct 24, 2024
This can result in build failures (e.g. OpenJDK) and silently miscompile
some formulae (e.g. Vim).

See Homebrew/homebrew-core#195325.
carlocab added a commit to Homebrew/brew that referenced this issue Oct 24, 2024
This can result in build failures (e.g. OpenJDK) and silently miscompile
some formulae (e.g. Vim).

See Homebrew/homebrew-core#195325.
@carlocab
Copy link
Member

Probably safer to do it globally. Adding it to superenv at Homebrew/brew#18620.

@fxcoudert
Copy link
Member

@carlocab disagree that introducing a new flag across all of Homebrew is a good solution. It seems to heavy-handed to me, adding the flag to all formulas without actually testing.

@fxcoudert
Copy link
Member

I've opened a PR for vim, passing the right CFLAGS. If someone can try it and confirm it fixes the issue, that'd be great. #195456

@calvinit
Copy link
Contributor

#194719

@p-linnane
Copy link
Member

@ychin This is fixed in Xcode 16.2 beta 3 (clang-1600.0.26.6).

@ychin
Copy link
Contributor Author

ychin commented Nov 22, 2024

Thanks for the update and follow-up! I tried locally and can confirm that the compiler issue seems to be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Reproducible Homebrew/homebrew-core bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants