Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: Add option to build monolithic, shared library #10732

Closed
wants to merge 9 commits into from

Conversation

assignUser
Copy link
Collaborator

@assignUser assignUser commented Aug 13, 2024

This introduces the option to build a shared version of the monolithic libvelox.a, the result is a much smaller build tree, library and executables. They are now so small that moving them is much easier, e.g. for test sharding.

I also fixed some miscellaneous things in this PR:

  • Removed FILESYSTEM, the link is not necessary with GCC >= 9 which is our minimum.
  • Replaced FOLLYBENCHMARK with the proper target
  • Improved bundled folly's use of glog and gflags (when also bundled)
  • Cleaned up the usage of an addition proto wrapper target so we don't link generated proto files twice (which doesn't cause issues in a static build)
  • Added a marker in the log so it's clear when we are running velox cmake vs. dependency cmake:
-- [xsimd] xsimd v10.0.0
-- [stemmer] Building stemmer from source
-- Setting Arrow source to AUTO
-- [Arrow] Checking for module 'thrift'

Initially I looked at forcing a bundled build of folly and gflags to make sure they are shared but that introduces weird rpath issues when the dependencies are also on the system (like our CI image), instead I just changed the build script to build folly as shared in the image, we maybe want to make that an option so people can keep folly as static (though even for static build shared folly will likely make the binary size much smaller.). If folly is build from source it will be built static or shared matching the velox build.

I have changed the wrapper function so that only the mono library target is explicitly created as shared, the few other velox utils, test libraries etc. will be build statically to avoid linking and test issues.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 13, 2024
Copy link

netlify bot commented Aug 13, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 5920c93
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/675310f4f6e297000847d3f4

@assignUser assignUser mentioned this pull request Aug 13, 2024
@PHILO-HE
Copy link
Contributor

@assignUser, this is just a reply to the mentioned issue in PR description. In Gluten, we are using static folly lib, as well as gflags, for velox to link. Can upstream velox link these static libs also?

@assignUser
Copy link
Collaborator Author

@PHILO-HE that's what we normaly do as well but when building velox as an .so it doesn't work because of the mentioned linking issue.
gflags(static -> linked into folly(static) -> linked intod velox(shared) -> linked to tests(shared) -> test also links to gflags (static)

@PHILO-HE
Copy link
Contributor

@PHILO-HE that's what we normaly do as well but when building velox as an .so it doesn't work because of the mentioned linking issue. gflags(static -> linked into folly(static) -> linked intod velox(shared) -> linked to tests(shared) -> test also links to gflags (static)

@assignUser, I see. Thanks for your clarification!
Can we use link option --version-script to hide gflags symbols in the linking? We did this similarly in Gluten, which is a bit tricky maybe. https://github.com/apache/incubator-gluten/blob/b7918a9a4b3482c03283dc542a909669d2ef4534/cpp/core/symbols.map#L10

@assignUser assignUser force-pushed the mono-shared branch 2 times, most recently from f0b2964 to 654ad37 Compare August 22, 2024 15:58
@assignUser assignUser marked this pull request as ready for review August 22, 2024 22:47
@assignUser
Copy link
Collaborator Author

I am unclear on the one failed test, would be great if someone could have a look. But otherwise this is pretty done, we are currently using a temp CI image of course.

@assignUser assignUser mentioned this pull request Aug 22, 2024
@majetideepak majetideepak requested a review from czentgr August 23, 2024 14:44
Copy link
Collaborator

@czentgr czentgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this!

scripts/setup-helper-functions.sh Outdated Show resolved Hide resolved
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@assignUser Thanks for adding this!

@DanielHunte
Copy link

I tried doing the monolithic shared build and ran into the error: Undefined symbols for architecture arm64. It seems to have something to do with facebook::velox::fuzzer::ArgumentTypeFuzzer::randOrderableType()?

[2584/3469] Linking CXX shared library lib/libvelox.dylib
ld: warning: ignoring duplicate libraries: 'CMake/resolve_dependency_modules/arrow/arrow_ep/install/lib/libarrow.a', 'CMake/resolve_dependency_modules/arrow/arrow_ep/src/arrow_ep-build/thrift_ep-install/lib/libthrift.a', '_deps/duckdb-build/src/libduckdb_static.a', '_deps/fmt-build/libfmtd.a', '_deps/protobuf-build/libprotobufd.a', '_deps/re2-build/libre2.a', '_deps/simdjson-build/libsimdjson.a'
[2586/3469] Linking CXX shared library velox/vector/tests/utils/libvelox_vector_test_lib.dylib
ld: warning: ignoring duplicate libraries: '/opt/homebrew/lib/libgtest.a'
[2595/3469] Linking CXX shared library velox/dwio/common/tests/utils/libvelox_dwio_common_test_utils.dylib
ld: warning: ignoring duplicate libraries: '/opt/homebrew/lib/libgtest.a', '_deps/fmt-build/libfmtd.a'
[2611/3469] Linking CXX shared library velox/expression/fuzzer/libvelox_expression_test_utility.dylib
FAILED: velox/expression/fuzzer/libvelox_expression_test_utility.dylib
: && /Library/Developer/CommandLineTools/usr/bin/c++ -mcpu=apple-m1+crc -D USE_VELOX_COMMON_BASE -D HAS_UNCAUGHT_EXCEPTIONS -Wall -Wextra -Wno-unused        -Wno-unused-parameter        -Wno-sign-compare        -Wno-ignored-qualifiers        -Wno-range-loop-analysis          -Wno-mismatched-tags          -Wno-nullability-completeness  -g -arch arm64 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk -mmacosx-version-min=14.6 -dynamiclib -Wl,-headerpad_max_install_names -Wl,-flat_namespace -o velox/expression/fuzzer/libvelox_expression_test_utility.dylib -install_name @rpath/libvelox_expression_test_utility.dylib velox/expression/fuzzer/CMakeFiles/velox_expression_test_utility.dir/ArgumentTypeFuzzer.cpp.o velox/expression/fuzzer/CMakeFiles/velox_expression_test_utility.dir/FuzzerToolkit.cpp.o  -Wl,-rpath,/Users/danielhunte/velox/_build/debug/lib -Wl,-rpath,/Users/danielhunte/velox/_build/debug/_deps/folly-build -Wl,-rpath,/opt/homebrew/lib  lib/libvelox.dylib  /opt/homebrew/lib/libgtest.a  _deps/folly-build/libfolly.0.58.0-dev.dylib  /opt/homebrew/lib/libboost_context-mt.dylib  /opt/homebrew/lib/libboost_filesystem-mt.dylib  /opt/homebrew/lib/libboost_atomic-mt.dylib  /opt/homebrew/lib/libboost_program_options-mt.dylib  /opt/homebrew/lib/libboost_system-mt.dylib  /opt/homebrew/lib/libboost_thread-mt.dylib  /opt/homebrew/lib/libdouble-conversion.dylib  /opt/homebrew/lib/libgflags.2.2.2.dylib  /opt/homebrew/lib/libglog.dylib  /opt/homebrew/lib/libevent.dylib  /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/lib/libz.tbd  /opt/homebrew/Cellar/openssl@3/3.3.2/lib/libssl.dylib  /opt/homebrew/Cellar/openssl@3/3.3.2/lib/libcrypto.dylib  /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/lib/libbz2.tbd  /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/lib/liblzma.tbd  /opt/homebrew/lib/liblz4.dylib  /opt/homebrew/lib/libzstd.dylib  /opt/homebrew/lib/libsnappy.dylib  /opt/homebrew/lib/libsodium.dylib  -lc++abi  _deps/fmt-build/libfmtd.a  /opt/homebrew/lib/libgflags.2.2.2.dylib  /opt/homebrew/lib/libglog.dylib  _deps/re2-build/libre2.a  /opt/homebrew/lib/libboost_regex-mt.dylib  /opt/homebrew/lib/libdouble-conversion.3.3.0.dylib  _deps/protobuf-build/libprotobufd.a  /opt/homebrew/lib/libsnappy.dylib  /opt/homebrew/lib/libzstd.dylib  /opt/homebrew/lib/liblz4.dylib  /opt/homebrew/lib/liblzo2.dylib  /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/lib/libz.tbd  CMake/resolve_dependency_modules/arrow/arrow_ep/install/lib/libarrow.a  CMake/resolve_dependency_modules/arrow/arrow_ep/src/arrow_ep-build/thrift_ep-install/lib/libthrift.a  velox/tpch/gen/dbgen/libdbgen.a  _deps/simdjson-build/libsimdjson.a  _deps/libstemmer/src/libstemmer/libstemmer.a  _deps/duckdb-build/src/libduckdb_static.a  _deps/duckdb-build/third_party/fsst/libduckdb_fsst.a  _deps/duckdb-build/third_party/fmt/libduckdb_fmt.a  _deps/duckdb-build/third_party/libpg_query/libduckdb_pg_query.a  _deps/duckdb-build/third_party/re2/libduckdb_re2.a  _deps/duckdb-build/third_party/miniz/libduckdb_miniz.a  _deps/duckdb-build/third_party/utf8proc/libduckdb_utf8proc.a  _deps/duckdb-build/third_party/hyperloglog/libduckdb_hyperloglog.a  _deps/duckdb-build/third_party/fastpforlib/libduckdb_fastpforlib.a  _deps/duckdb-build/third_party/mbedtls/libduckdb_mbedtls.a && :
Undefined symbols for architecture arm64:
  "facebook::velox::randOrderableType(std::__1::mersenne_twister_engine<unsigned int, 32ul, 624ul, 397ul, 31ul, 2567483615u, 11ul, 4294967295u, 7ul, 2636928640u, 15ul, 4022730752u, 18ul, 1812433253u>&, int)", referenced from:
      facebook::velox::fuzzer::ArgumentTypeFuzzer::randOrderableType() in ArgumentTypeFuzzer.cpp.o
  "facebook::velox::randType(std::__1::mersenne_twister_engine<unsigned int, 32ul, 624ul, 397ul, 31ul, 2567483615u, 11ul, 4294967295u, 7ul, 2636928640u, 15ul, 4022730752u, 18ul, 1812433253u>&, int)", referenced from:
      facebook::velox::fuzzer::ArgumentTypeFuzzer::randType() in ArgumentTypeFuzzer.cpp.o
ld: symbol(s) not found for architecture arm64
c++: error: linker command failed with exit code 1 (use -v to see invocation)
[2620/3469] Building CXX object velox/expression/fuzzer/CMakeFiles/velox_expression_fuzzer.dir/ExpressionFuzzer.cpp.o
In file included from /Users/danielhunte/velox/velox/expression/fuzzer/ExpressionFuzzer.cpp:28:
In file included from /Users/danielhunte/velox/./velox/expression/SimpleFunctionRegistry.h:21:
/Users/danielhunte/velox/./velox/expression/SimpleFunctionAdapter.h:286:62: warning: 'unique' is deprecated [-Wdeprecated-declarations]
  286 |       if (args[POSITION]->isFlatEncoding() && args[POSITION].unique() &&
      |                                                              ^
/Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/include/c++/v1/__memory/shared_ptr.h:730:3: note: 'unique' has been explicitly marked deprecated here
  730 |   _LIBCPP_DEPRECATED_IN_CXX17 _LIBCPP_HIDE_FROM_ABI bool unique() const _NOEXCEPT { return use_count() == 1; }
      |   ^
/Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/include/c++/v1/__config:1022:41: note: expanded from macro '_LIBCPP_DEPRECATED_IN_CXX17'
 1022 | #    define _LIBCPP_DEPRECATED_IN_CXX17 _LIBCPP_DEPRECATED
      |                                         ^
/Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk/usr/include/c++/v1/__config:995:49: note: expanded from macro '_LIBCPP_DEPRECATED'
  995 | #      define _LIBCPP_DEPRECATED __attribute__((__deprecated__))
      |                                                 ^
1 warning generated.
ninja: build stopped: subcommand failed.
make[1]: *** [build] Error 1
make: *** [debug] Error 2```

@assignUser assignUser changed the title Add option to build monolithic, shared library build: Add option to build monolithic, shared library Nov 19, 2024
@assignUser assignUser marked this pull request as draft November 19, 2024 16:06
@assignUser assignUser force-pushed the mono-shared branch 2 times, most recently from 629ef65 to c8bfb15 Compare November 21, 2024 15:28
.github/workflows/linux-build-base.yml Show resolved Hide resolved
CMake/ResolveDependency.cmake Outdated Show resolved Hide resolved
@@ -93,13 +93,26 @@ function(velox_link_libraries TARGET)
# TODO(assignUser): Handle scope keywords (they currently are empty calls ala
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we create an issue and remove the TODO

CMake/VeloxUtils.cmake Show resolved Hide resolved
# Together with the patch applied above prevents folly from test compiling a
# snippet to find the right namespace (which would fail because gflags isn't
# built yet)
set(FOLLY_UNUSUAL_GFLAGS_NAMESPACE OFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why not build gflags before folly then ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this finishes in the configure step, nothing is built only cmake is run BUT the test compilation happens in folly's cmake. So we have to work around this.

"NOT VELOX_BUILD_SHARED"
OFF)

if(VELOX_BUILD_SHARED AND NOT VELOX_MONO_LIBRARY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If we only have VELOX_BUILD_SHARED then we should set VELOX_MONO_LIBRARY to ON if its unset , rather than throw.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, might be confusing but I guess we flip the switch for necessary components of other features as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we do flip the switch in other cases. Let's set VELOX_MONO_LIBRARY to ON as Krishna suggested.

CMakeLists.txt Show resolved Hide resolved
@@ -0,0 +1,77 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice ! - Doesnt this require a comment header?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, actually I added that for testing (that's why it's bare bones). I'll pull it out, completely and add it back in in a follow up.

No there are no comments in JSON so we have to exclude it from the check (or it seems to already be).

# These files wrap the generated source and header files from
# velox_dwio_dwrf_proto with pragamas to disable certain warnings TODO: transfer
# disabling the warnings to CMake/Buck
velox_add_library(velox_dwio_dwrf_proto orc-proto-wrapper.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we not need this before ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were built twice and linked inconsistently so I unified this. I did not test if we actually still need the pragmas.

velox/dwio/dwrf/proto/CMakeLists.txt Outdated Show resolved Hide resolved
@assignUser assignUser marked this pull request as ready for review November 22, 2024 03:09
@kgpai
Copy link
Contributor

kgpai commented Nov 26, 2024

Importing to check internal build.

@facebook-github-bot
Copy link
Contributor

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Dec 6, 2024
Summary:
I have pulled these changes from #10732 as they are not specifically part of the shared build.
- Add `[<dependency name>]` as an indicator that a dependency is running CMake. This makes it much easier to parse the log and track issues:
```
-- [simdjson] Using BUNDLED simdjson
-- Setting folly source to BUNDLED
-- [folly] Building Folly from source
-- [folly] Using Boost - Bundled
```
- use targets instead of variables for Folly::follybenchmark and glog::glog
- remove explicit linking of std::fs as this is no longer necessary for gcc >= 9 (our minimum is 11)
- unify generation and linking of dwrf protobuf files
- don't set C++ std for dependencies

Pull Request resolved: #11751

Reviewed By: DanielHunte

Differential Revision: D66843322

Pulled By: kgpai

fbshipit-source-id: a3444a3ff919028523f403e78ade05afb40295c5
@facebook-github-bot
Copy link
Contributor

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@majetideepak
Copy link
Collaborator

@kgpai, @assignUser Should we wait until #11745 lands?

@facebook-github-bot
Copy link
Contributor

@kgpai merged this pull request in 164b4c2.

facebook-github-bot pushed a commit that referenced this pull request Dec 12, 2024
Summary:
#10732 temporarily changed to a different container. The docker images should be rebuilt with the necessary shared ependencies now so we can witch back.

Pull Request resolved: #11796

Reviewed By: Yuhta

Differential Revision: D67109007

Pulled By: bikramSingh91

fbshipit-source-id: 174e186a254f6adc866574660a51a7ba1859c372
facebook-github-bot pushed a commit that referenced this pull request Dec 18, 2024
Summary:
This pr just reverts one line of change by #10732. I think it is not necessary to fully align gflags build with velox build. It means we should set `BUILD_SHARED_LIBS=ON` for gflags when `VELOX_BUILD_SHARED=ON`, but it is not necessary to set `BUILD_SHARED_LIBS=OFF` for gflags when `VELOX_BUILD_SHARED=OFF`. We can just make the shared lib always be produced, as I also note `setup-ubuntu.sh` always installs shared gflags from package (see [link](https://github.com/facebookincubator/velox/blob/main/scripts/setup-ubuntu.sh#L120)). Another reason for this fix is, it is better to keep using same lib type (static or shared) for gflags and glog to avoid any potential issues.

Related discussion on this issue: #10732 (comment)

Pull Request resolved: #11889

Reviewed By: xiaoxmeng

Differential Revision: D67414560

Pulled By: pedroerp

fbshipit-source-id: a797f44c2b659ca680ac555928a9f9d64ac9c8f0
athmaja-n pushed a commit to athmaja-n/velox that referenced this pull request Jan 10, 2025
Summary:
I have pulled these changes from facebookincubator#10732 as they are not specifically part of the shared build.
- Add `[<dependency name>]` as an indicator that a dependency is running CMake. This makes it much easier to parse the log and track issues:
```
-- [simdjson] Using BUNDLED simdjson
-- Setting folly source to BUNDLED
-- [folly] Building Folly from source
-- [folly] Using Boost - Bundled
```
- use targets instead of variables for Folly::follybenchmark and glog::glog
- remove explicit linking of std::fs as this is no longer necessary for gcc >= 9 (our minimum is 11)
- unify generation and linking of dwrf protobuf files
- don't set C++ std for dependencies

Pull Request resolved: facebookincubator#11751

Reviewed By: DanielHunte

Differential Revision: D66843322

Pulled By: kgpai

fbshipit-source-id: a3444a3ff919028523f403e78ade05afb40295c5
athmaja-n pushed a commit to athmaja-n/velox that referenced this pull request Jan 10, 2025
…tor#10732)

Summary:
This introduces the option to build a shared version of the monolithic `libvelox.a`, the result is a much smaller build tree, library and executables. They are now so small that moving them is much easier, e.g. for test sharding.

I also fixed some miscellaneous things in this PR:
- Removed `FILESYSTEM`, the link is not necessary with GCC >= 9 which is our minimum.
- Replaced `FOLLYBENCHMARK` with the proper target
- Improved bundled folly's use of glog and gflags (when also bundled)
- [Cleaned up the usage of an addition proto wrapper target](https://github.com/facebookincubator/velox/pull/10732/files#diff-6e1ef8f965a01fe12b35484f26064a6e30207eb3e8722eabe552dba288fbf934R18) so we don't link generated proto files twice (which doesn't cause issues in a static build)
- Added a marker in the log so it's clear when we are running velox cmake vs. dependency cmake:
```
-- [xsimd] xsimd v10.0.0
-- [stemmer] Building stemmer from source
-- Setting Arrow source to AUTO
-- [Arrow] Checking for module 'thrift'
```
Initially I looked at forcing a bundled build of folly and gflags to make sure they are shared but that introduces weird rpath issues when the dependencies are also on the system (like our CI image), instead I just changed the build script to build folly as shared in the image, we maybe want to make that an option so people can keep folly as static (though even for static build shared folly will likely make the binary size much smaller.).  If folly is build from source it will be built static or shared matching the velox build.

I have changed the wrapper function so that only the mono library target is explicitly created as shared, the few other velox utils, test libraries etc. will be build statically to avoid linking and test issues.

Pull Request resolved: facebookincubator#10732

Reviewed By: xiaoxmeng

Differential Revision: D66515203

Pulled By: kgpai

fbshipit-source-id: 260e5c8c9027c560bfc4c0d63a999fa284ef9775
athmaja-n pushed a commit to athmaja-n/velox that referenced this pull request Jan 10, 2025
Summary:
facebookincubator#10732 temporarily changed to a different container. The docker images should be rebuilt with the necessary shared ependencies now so we can witch back.

Pull Request resolved: facebookincubator#11796

Reviewed By: Yuhta

Differential Revision: D67109007

Pulled By: bikramSingh91

fbshipit-source-id: 174e186a254f6adc866574660a51a7ba1859c372
athmaja-n pushed a commit to athmaja-n/velox that referenced this pull request Jan 10, 2025
…r#11889)

Summary:
This pr just reverts one line of change by facebookincubator#10732. I think it is not necessary to fully align gflags build with velox build. It means we should set `BUILD_SHARED_LIBS=ON` for gflags when `VELOX_BUILD_SHARED=ON`, but it is not necessary to set `BUILD_SHARED_LIBS=OFF` for gflags when `VELOX_BUILD_SHARED=OFF`. We can just make the shared lib always be produced, as I also note `setup-ubuntu.sh` always installs shared gflags from package (see [link](https://github.com/facebookincubator/velox/blob/main/scripts/setup-ubuntu.sh#L120)). Another reason for this fix is, it is better to keep using same lib type (static or shared) for gflags and glog to avoid any potential issues.

Related discussion on this issue: facebookincubator#10732 (comment)

Pull Request resolved: facebookincubator#11889

Reviewed By: xiaoxmeng

Differential Revision: D67414560

Pulled By: pedroerp

fbshipit-source-id: a797f44c2b659ca680ac555928a9f9d64ac9c8f0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants