Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: enable Seastar to build shared and static libs in a single build #2427

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

tchaikov
Copy link
Contributor

@tchaikov tchaikov commented Sep 12, 2024

before this change, we respect the CMake variable named BUILD_SHARED_LIBS, and build shared libraries if it is set, build static libraries otherwise. but this model cannot fulfill the needs of a parent project which needs to build both static and shared seastar libraries with different configuration in a multi-config generator settings. as BUILD_SHARED_LIBS is a CMake variable which cannot be changed at build time, and hence cannot be assigned with
different values using generator expression.

so, in this change, we add two options

  • Seastar_BUILD_STATIC_LIBS
  • Seastar_BUILD_SHARED_LIBS

to configure if the build should generated static libraries and shared libraries respectively. but BUILD_SHARED_LIBS is still supported, and the behavior is backward compatible. the only user-visible differences are, the libraries of

  • seastar
  • seastar_testing
  • seastar_perf_testing

are now aliases of the targets which builds the static or shared libraries. so one cannot build "seastar" as a target anymore, but should specify which library to build:

$ cmake --build build --target seastar # does not work anymore
$ cmake --build build --target seastar-static
$ cmake --build build --target seastar-shared

the reason is the limit of an ALIAS library, which cannot be used as a target. but we still need to use a non-interface library to generate .pc file, where, for instance, we use $<TARGET_FILE_NAME:seastar_testing> for the library list of a
certain Seastar library.

but a CMake-based project including seastar can still link against it using "seastar" as a library.

please note, in d8a70b3, we pass --ftls-mode=initial-exec when building the tree, to address the problem of recursive call of malloc when using seastar allocator. but we cannot use different compiling options when building both shared libraries and static libraries, so we always pass this option as long as BUILD_SHARED_LIBS OR Seastar_BUILD_SHARED_LIBS is set and when we are building dev/release modes. as these two modes enables the seastar allocator. this should not hurt the overall performance, as we are not using global-dynamic mode, but initial-exec. this is verified using

$ cmake \
  -DCMAKE_CONFIGURATION_TYPES="Debug;RelWithDebInfo" \
  -DCMAKE_CROSS_CONFIGS="Debug;RelWithDebInfo" \
  -DCMAKE_DEFAULT_CONFIGS="Debug;RelWithDebInfo" \
  -DSeastar_BUILD_SHARED_LIBS=ON \
  -DSeastar_BUILD_STATIC_LIBS=ON  \
  -G "Ninja Multi-Config" -B build -S .
$ cmake --build build --config RelWithDebInfo --target seastar-static
$ readelf -r
build/CMakeFiles/seastar-objects.dir/RelWithDebInfo/src/core/reactor.cc.o| \
  grep -i tls | tr -s ' ' | cut -d' ' -f 3,5 | uniq
R_X86_64_TPOFF32 __tls_guard
R_X86_64_PLT32 _ZStlsRSoRKNSt15_[...]

and

$ cmake -DCMAKE_BUILD_TYPE="RelWithDebInfo" -G "Ninja" -B build -S .
$ cmake --build build --target seastar-static
$ readelf -r build/CMakeFiles/seastar-objects.dir/src/core/reactor.cc.o| \
  grep -i tls | tr -s ' ' | cut -d' ' -f 3,5 | uniq
R_X86_64_TPOFF32 __tls_guard
R_X86_64_PLT32 _ZStlsRSoRKNSt15_[...]

so the TLS variables are using "TPOFF32" relocation type, in general, it implies the initial-exec TLS model. the PLT32 is not specific to TLS, it's more related to external symbols linkages.

Refs scylladb/scylladb#2717
Signed-off-by: Kefu Chai [email protected]

@tchaikov tchaikov force-pushed the seastar-shared-static branch 5 times, most recently from cea6dbc to 0508fb3 Compare September 12, 2024 08:27
@tchaikov
Copy link
Contributor Author

@denesb could you please help review this change? it implements the proposal at scylladb/scylladb#2717 (comment)

Copy link
Contributor

@denesb denesb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I don't understand the Cmake changes in detail, but the concept as explained in the cover letter as well as the code in high level looks good.

@tchaikov
Copy link
Contributor Author

thank your Botond, for your generous review.

@tchaikov
Copy link
Contributor Author

@scylladb/seastar-maint hello maintainers, could you help review this change?

@tchaikov
Copy link
Contributor Author

v2:

  • rebased against master HEAD.

@tchaikov
Copy link
Contributor Author

@scylladb/seastar-maint hello maintainers, could you help review this change?

1 similar comment
@tchaikov
Copy link
Contributor Author

@scylladb/seastar-maint hello maintainers, could you help review this change?

@avikivity
Copy link
Member

If I understand correctly, we build the object files just once.

Doesn't this degrade the performance of thread-local variables? It moves tls-model from initial-exec (fast) to global-dynamic. Or maybe global-dynamic. Need to examine the generated code to be sure.

@tchaikov
Copy link
Contributor Author

will do and report my findings here later on.

@tchaikov
Copy link
Contributor Author

tchaikov commented Sep 26, 2024

@avikivity i checked the emitted elf files of memory.cc.o and reactor.cc.o, they are using initial-exec TLS mode. could you help take another look ?

@tchaikov
Copy link
Contributor Author

v2:

  • add the verification steps and report on the TLS mode in the commit message

@avikivity
Copy link
Member

Ok, this was due to my confusion. In d8a70b3 we switched shared libraries to initial-exec. I remembered we downgraded it from global-dynamic, but thought it wasn't to initial-exec.

@avikivity
Copy link
Member

But looking at that commit, I see we have another different: -fvisibility=hidden. What happens now? Do shared libraries gain -fvisibility=hidden? Or do static libraries lose it?

@tchaikov
Copy link
Contributor Author

tchaikov commented Sep 27, 2024

@avikivity

But looking at that commit, I see we have another different: -fvisibility=hidden. What happens now?

the commit of f34335e includes -fvisibility=hidden in compile flags only when building static libraries. i am quoting part of the commit message of that commit here:

    * do not pass `-fvisibility=hidden` to as a CXXFLAGS, if
      "BUILD_SHARED_LIBS" is TRUE. so all symbols are exported to
      application linked against libseastar.so. we don't have a set
      of mechinary or policy to expose/version symbols exposed via a
      shared library, so all of them are exposed. in the light of
      C++20 modular support, probably we don't need to introduce this
      mechinary in future.

so,

  • when building shared libraries we do not include -fvisibility=hidden
  • when building static libraries we do include -fvisibility=hidden

Do shared libraries gain -fvisibility=hidden? Or do static libraries lose it?

ahh. i see your point now. the latter, the static libraries lose it after this pull request, as long as we build the shared libraries and the static libraries in a single pass. do you think it's important? IMHO, it's not important at this moment. because:

  • the symbol visibility matters only if we link to a DSO, unless we hide symbols as local ones using static or put them into an anonymous namespace
  • in the build modes where we care about performance, like "release" mode, we build seastar libraries as static library

but i agree that a smaller list of elf global symbols is desirable. for instance, it helps to reduce the size of DSO, shorten the load time of the DSO, etc. but because the commit message puts:

      we don't have a set
      of mechinary or policy to expose/version symbols exposed via a
      shared library, so all of them are exposed.

we cannot differentiate the public symbols from the internal ones, without adding the directives like __attribute__((__visibility__("default"))) and __attribute__((__visibility__("hidden"))) (or [[gnu::visibility("default")]]) so, adding this compiler option do not really help.

IMHO, the recommended approach, although potentially tedious, is to:

  • explicitly mark public interfaces with __attribute__((__visibility__("default"))).
  • and then, apply the -fvisibility=hidden compiler flag when building both shared and static libraries.

but this is not in the scope of this change. and probably has lower priority as it's more about the DSO and is not trivial. what do you think?

@tchaikov
Copy link
Contributor Author

BTW, i created #2457 to track the enhancement for adding [[gnu::visibility("default")]] to public symbols.

@tchaikov tchaikov marked this pull request as draft October 1, 2024 03:59
@tchaikov

This comment was marked as resolved.

before this change, we respect the CMake variable named
`BUILD_SHARED_LIBS`, and build shared libraries if it is set, build
static libraries otherwise. but this model cannot fulfill the needs
of a parent project which needs to build both static and shared
seastar libraries with different configuration in a multi-config
generator settings. as `BUILD_SHARED_LIBS` is a CMake variable which
cannot be changed at build time, and hence cannot be assigned with
different values using generator expression.

so, in this change, we add two options

- Seastar_BUILD_STATIC_LIBS
- Seastar_BUILD_SHARED_LIBS

to configure if the build should generated static libraries and
shared libraries respectively. but `BUILD_SHARED_LIBS` is still
supported, and the behavior is backward compatible. the only
user-visible differences are, the libraries of

- seastar
- seastar_testing
- seastar_perf_testing

are now aliases of the targets which builds the static or shared
libraries. so one cannot build "seastar" as a target anymore,
but should specify which library to build:

```
$ cmake --build build --target seastar # does not work anymore
$ cmake --build build --target seastar-static
$ cmake --build build --target seastar-shared
```

the reason is the limit of an ALIAS library, which cannot be used
as a target. but we still need to use a non-interface library
to generate .pc file, where, for instance, we use
`$<TARGET_FILE_NAME:seastar_testing>` for the library list of a
certain Seastar library.

but a CMake-based project including seastar can still link against
it using "seastar" as a library.

please note, in d8a70b3, we pass `--ftls-mode=initial-exec` when
building the tree, to address the problem of recursive call of malloc
when using seastar allocator. but we cannot use different compiling
options when building both shared libraries and static libraries, so
we always pass this option as long as `BUILD_SHARED_LIBS` OR
`Seastar_BUILD_SHARED_LIBS` is set *and* when we are building
dev/release modes. as these two modes enables the seastar allocator.
this should not hurt the overall performance, as we are not using
global-dynamic mode, but initial-exec. this is verified using
```console
$ cmake \
  -DCMAKE_CONFIGURATION_TYPES="Debug;RelWithDebInfo" \
  -DCMAKE_CROSS_CONFIGS="Debug;RelWithDebInfo" \
  -DCMAKE_DEFAULT_CONFIGS="Debug;RelWithDebInfo" \
  -DSeastar_BUILD_SHARED_LIBS=ON \
  -DSeastar_BUILD_STATIC_LIBS=ON  \
  -G "Ninja Multi-Config" -B build -S .
$ cmake --build build --config RelWithDebInfo --target seastar-static
$ readelf -r
build/CMakeFiles/seastar-objects.dir/RelWithDebInfo/src/core/reactor.cc.o| \
  grep -i tls | tr -s ' ' | cut -d' ' -f 3,5 | uniq
R_X86_64_TPOFF32 __tls_guard
R_X86_64_PLT32 _ZStlsRSoRKNSt15_[...]
```

and

```console
$ cmake -DCMAKE_BUILD_TYPE="RelWithDebInfo" -G "Ninja" -B build -S .
$ cmake --build build --target seastar-static
$ readelf -r build/CMakeFiles/seastar-objects.dir/src/core/reactor.cc.o| \
  grep -i tls | tr -s ' ' | cut -d' ' -f 3,5 | uniq
R_X86_64_TPOFF32 __tls_guard
R_X86_64_PLT32 _ZStlsRSoRKNSt15_[...]
```

so the TLS variables are using "TPOFF32" relocation type, in general,
it implies the initial-exec TLS model. the PLT32 is not specific TLS,
it's more related to external symbols linkages.

Refs scylladb/scylladb#2717
Signed-off-by: Kefu Chai <[email protected]>
@tchaikov tchaikov marked this pull request as ready for review October 1, 2024 06:06
@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 1, 2024

v3:

  • add alias libraries in cmake/SeastarConfig.cmake.in.

@avikivity
Copy link
Member

I'm not worried about shared object size. I don't want to lose optimization opportunities.

Maybe we can replace it with -fvisibility-inlines-hidden -fno-semantic-interposition? (is -fno-sematic-interposition the default?)

@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 1, 2024

I'm not worried about shared object size. I don't want to lose optimization opportunities.

i see.

Maybe we can replace it with -fvisibility-inlines-hidden -fno-semantic-interposition?

we cannot have -fvisibility-inlines-hidden as explained in #2427 (comment). but yes, we can, and probably should use -fno-semantic-interposition.

(is -fno-sematic-interposition the default?)

GCC

semantic-interposition is enabled by default. guess GCC intends to do the right thing as the ELF spec suggests. and it is off only when -Ofast or explicitly disabled. please note, -O3 does not disable it . see also gcc-mirror/gcc@458d2c6 .

Clang

semantic-interposition is off by default for better performance. see llvm/llvm-project@d1fd723 . to be more specific, i am quoting the commit message here

  • -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
  • (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
  • -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.

so we do want to optimize -fpic, i will send a change to include -fno-semantic-interposition in compile options if we build shared libraries.

EDIT, if we don't care about semantic interposition, i think we can disable it for all builds.

@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 1, 2024

so we do want to optimize -fpic, i will send a change to include -fno-semantic-interposition in compile options if we build shared libraries.

EDIT, if we don't care about semantic interposition, i think we can disable it for all builds.

created #2465

@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 2, 2024

@avikivity could you please review my responses? i believe i've addressed all of your questions.

@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 7, 2024

@avikivity hi Avi, could you help re-review this change ?

@avikivity avikivity merged commit 54e25a9 into scylladb:master Oct 9, 2024
15 checks passed
@tchaikov tchaikov deleted the seastar-shared-static branch October 10, 2024 00:16
@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 11, 2024

we still need to pass -DCMAKE_POSITION_INDEPENDENT_CODE=ON to build shared and static libraries in a single pass, this does not change the way how the shared libraries are built. but this would build static library with -fPIC. but since the object files are copied into the executable when it is linked, so if a seastar executable links against a static seastar library built with -fPIC, the executable won't suffer from the overhead of address resolving when accessing the variables/functions exposed by the library.

@lersek
Copy link
Member

lersek commented Oct 11, 2024

Also, even for a dynamically linked Seastar application, I guess that at least some of the runtime overhead should be possible to offset, by launching the application with LD_BIND_NOW set in the environment. The startup could take longer, but later behavior might be more predictable (no PLT / GOT manipulation after launch).

@tchaikov
Copy link
Contributor Author

tchaikov commented Oct 11, 2024

thank you for your comment Laszlo. i agree with you that LD_BIND_NOW helps to address the "cold start" issue. but the address translation with GOT cannot be eliminated even the addresses are cached eagerly not lazily. anyway, i wanted to assure that posterity won't be worried about the performance regression introduced by the -fPIC when reviewing this change and its follow-up change at #2488. -- i should have included it in this change though.

@lersek
Copy link
Member

lersek commented Oct 11, 2024

Right; AIUI, reading the GOT all the time cannot be eliminated for dynamically linked applications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants