tcc: update tcc to latest version, fix stdatomic #20726

Open
wants to merge 4 commits into
base: master

Contributor

@kbkpbot kbkpbot commented Feb 4, 2024

As v/thirdparty/tcc/README.md says, "This is a prebuild tcc (git://repo.or.cz/tinycc.git), cut at commit:
806b3f9 from 2021-03-17
which is the last good commit found by bisecting, that does not cause vlib/sync/channel_close_test.v to wait indefinitely."

It seems that TCC's atomic_load/atomic_store do not include memory barriers (the Windows version has MemoryBarrier()).
Adding atomic_thread_fence(memory_order_acquire)/atomic_thread_fence(memory_order_release) before/after atomic_load/atomic_store solves the problem.
Also, the latest TCC stdatomic.h already supports a generic interface like gcc/clang.

NOTE: to use this PR, tcc must be updated to the latest version.
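
To illustrate the idea, here is a minimal sketch in standard C11, not the actual contents of this PR. The helper names (fenced_store, fenced_load, publish, consume) are hypothetical, and the fence placement is an assumption based on the usual C11 pairing: a release fence before a relaxed store, and an acquire fence after a relaxed load.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helpers for illustration only; not the macros from the PR. */

/* Release side: the fence comes BEFORE the relaxed store, so earlier writes
 * are ordered before the store becomes visible to other threads. */
static inline void fenced_store(atomic_int *obj, int value) {
    assert(obj);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(obj, value, memory_order_relaxed);
}

/* Acquire side: the fence comes AFTER the relaxed load, so later reads
 * are ordered after the load. */
static inline int fenced_load(atomic_int *obj) {
    assert(obj);
    int value = atomic_load_explicit(obj, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    return value;
}

/* Classic message-passing pattern built on the helpers above. */
static int payload;      /* plain, non-atomic data */
static atomic_int ready; /* zero-initialized flag */

/* Producer: write the data first, then raise the flag. */
static void publish(int value) {
    payload = value;
    fenced_store(&ready, 1);
}

/* Consumer: wait for the flag, then read the data. */
static int consume(void) {
    while (!fenced_load(&ready)) {
        /* spin until the producer has published */
    }
    return payload;
}
```

In a real program publish() and consume() would run on different threads; without the fences (or an equivalent barrier inside the atomic operations themselves), the consumer could observe the flag before the payload write.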

@spytheman
Member

Wow, that will be very nice indeed!
Thank you for investigating 🙇🏻‍♂️ .

@medvednikov
Member

Thanks @kbkpbot! This is some great work you did.

@spytheman
Member

I've rebuilt thirdparty/tcc's thirdparty-linux-amd64 branch, so that it has
tcc version 0.9.28rc 2024-02-05 HEAD@105d70f7 (x86_64 Linux) (your latest commit in tcc's mob branch).
I'm still testing it, but so far, it works very well.
I'll push the changes to the thirdparty-linux-amd64 branch soon, then I will update the other prebuilt branches too.

@spytheman
Member

>          OK: STBI load does not leak with GC on, when loading images multiple times (use < 10MB)
>          OK: V can crun a script, that lacks a .vsh extension
>          OK: V can run a script, that lacks a .vsh extension
Tue Feb  6 13:15:32 EET 2024
all tests done
CPU: 1159.63s   Real: 691.84s   Elapsed: 11:31.84       RAM: 474564KB   v_fast_all_tests

VFLAGS="-cc tcc" ./v test-all passed locally.

I've pushed the branch to https://github.com/vlang/tccbin .

@spytheman
Member

As a bonus, with the new tcc build, the executable sizes are now significantly smaller 🥳, while the compile speed is ~ the same, or even slightly better:

Before:

#0 13:26:16 ᛋ master /v/oo❱thirdparty/tcc/tcc.exe --version
tcc version 0.9.27 (x86_64 Linux)
#0 13:26:24 ᛋ master /v/oo❱
#0 13:26:25 ᛋ master /v/oo❱xtime ./v examples/hello_world.v
CPU: 0.10s      Real: 0.12s     Elapsed: 0:00.12        RAM: 40832KB    ./v examples/hello_world.v
#0 13:26:29 ᛋ master /v/oo❱xtime ./v examples/hello_world.v
CPU: 0.10s      Real: 0.11s     Elapsed: 0:00.11        RAM: 40900KB    ./v examples/hello_world.v
#0 13:26:29 ᛋ master /v/oo❱xtime ./v examples/hello_world.v
CPU: 0.10s      Real: 0.11s     Elapsed: 0:00.11        RAM: 40876KB    ./v examples/hello_world.v
#0 13:26:30 ᛋ master /v/oo❱xtime ./v examples/2048/
CPU: 0.34s      Real: 0.36s     Elapsed: 0:00.36        RAM: 102784KB   ./v examples/2048/
#0 13:26:33 ᛋ master /v/oo❱xtime ./v examples/2048/
CPU: 0.34s      Real: 0.36s     Elapsed: 0:00.36        RAM: 102840KB   ./v examples/2048/
#0 13:26:34 ᛋ master /v/oo❱xtime ./v examples/2048/
CPU: 0.34s      Real: 0.35s     Elapsed: 0:00.35        RAM: 102812KB   ./v examples/2048/
#0 13:26:34 ᛋ master /v/oo❱
#0 13:26:35 ᛋ master /v/oo❱
#0 13:26:35 ᛋ master /v/oo❱ls -la examples/hello_world examples/2048/2048
-rwxrwxr-x 1 delian delian 3632932 Feb  6 13:26 examples/2048/2048
-rwxrwxr-x 1 delian delian  794988 Feb  6 13:26 examples/hello_world
#0 13:26:38 ᛋ master /v/oo❱

After:

#0 13:27:01 ᛋ master /v/oo❱thirdparty/tcc/tcc.exe --version
tcc version 0.9.28rc 2024-02-05 HEAD@105d70f7 (x86_64 Linux)
#0 13:27:04 ᛋ master /v/oo❱xtime ./v examples/hello_world.v
CPU: 0.11s      Real: 0.12s     Elapsed: 0:00.12        RAM: 40860KB    ./v examples/hello_world.v
#0 13:27:09 ᛋ master /v/oo❱xtime ./v examples/hello_world.v
CPU: 0.11s      Real: 0.12s     Elapsed: 0:00.12        RAM: 40884KB    ./v examples/hello_world.v
#0 13:27:10 ᛋ master /v/oo❱xtime ./v examples/hello_world.v
CPU: 0.10s      Real: 0.11s     Elapsed: 0:00.11        RAM: 40884KB    ./v examples/hello_world.v
#0 13:27:10 ᛋ master /v/oo❱xtime ./v examples/2048/
CPU: 0.31s      Real: 0.33s     Elapsed: 0:00.33        RAM: 102756KB   ./v examples/2048/
#0 13:27:18 ᛋ master /v/oo❱xtime ./v examples/2048/
CPU: 0.32s      Real: 0.34s     Elapsed: 0:00.34        RAM: 102788KB   ./v examples/2048/
#0 13:27:19 ᛋ master /v/oo❱xtime ./v examples/2048/
CPU: 0.31s      Real: 0.32s     Elapsed: 0:00.32        RAM: 102804KB   ./v examples/2048/
#0 13:27:19 ᛋ master /v/oo❱
#0 13:27:20 ᛋ master /v/oo❱ls -la examples/hello_world examples/2048/2048
-rwxrwxr-x 1 delian delian 2804512 Feb  6 13:27 examples/2048/2048
-rwxrwxr-x 1 delian delian  686584 Feb  6 13:27 examples/hello_world
#0 13:27:23 ᛋ master /v/oo❱

@kbkpbot
Contributor Author

kbkpbot commented Feb 8, 2024

Something interesting: after applying this PR, the v self compile time becomes about 8x slower.
(I just overwrote v/thirdparty/stdatomic/nix/atomic.h.)
Before applying the PR:

~/github/kbkpbot/v/thirdparty/stdatomic/nix$ time v self
V self compiling ...
V built successfully as executable "v".

real    0m1.169s
user    0m1.321s
sys     0m0.127s

After applying the PR:

~/github/kbkpbot/v/thirdparty/stdatomic/nix$ time v self
V self compiling ...
V built successfully as executable "v".

real    0m9.105s
user    0m8.566s
sys     0m0.401s

Sorry, I found that this is because the TCC compilation fails, and V tries CC instead.
I will find out what happened.

@medvednikov
Member

Yes, exactly. When tcc fails, V falls back to gcc/clang.

It's the same error as before:

./v1.exe -no-parallel -o v2.exe -cc tcc -no-retry-compilation cmd/v
vlib/v/gen/c/cheaders.v:849:69: error: this number has unsuitable digit `u`
  847 | #ifndef __vinix__
  848 | // convert any 64 bit pseudo random numbers to uniform distribution [0,1). It can be combined with wyrand, wyhash64 or wyhash.
  849 | static inline double wy2u01(uint64_t r){ const double _wynorm=1.0/(1ull<<52); return (r>>12)*_wynorm;}
      |                                                                     ^
  850 | 
  851 | // convert any 64 bit pseudo random numbers to APPROXIMATE Gaussian distribution. It can be combined with wyrand, wyhash64 or wyhash.
make: *** [GNUmakefile:114: all] Error 1
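
For reference, the C line quoted in the error output can be checked in isolation. The sketch below copies wy2u01 from that output; the wy2u01_selfcheck helper is hypothetical and only there to show that the function maps a 64-bit value into [0,1).

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the line shown in the error output above:
 * keep the top 52 bits of r and scale them by 2^-52. */
static inline double wy2u01(uint64_t r) {
    const double _wynorm = 1.0 / (1ull << 52);
    return (r >> 12) * _wynorm;
}

/* Hypothetical self-check, not part of V: the result stays in [0,1). */
static void wy2u01_selfcheck(void) {
    assert(wy2u01(0) == 0.0);           /* smallest input maps to 0 */
    assert(wy2u01(UINT64_MAX) < 1.0);   /* largest input stays below 1 */
    assert(wy2u01(1ull << 63) == 0.5);  /* top bit alone maps to one half */
}
```

The 1ull literal is an ordinary unsigned long long constant in C, so the expression itself is valid for a C compiler.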
