Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KUDU-3054][SPARK]Init kudu.write_duration accumulator lazily #27

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

liupc
Copy link

@liupc liupc commented Feb 14, 2020

Currently, we encountered a issue in kudu-spark that will causing the following spark sql query failure:


Job aborted due to stage failure: Total size of serialized results of 942 tasks (2.0 GB) is bigger than spark.driver.maxResultSize (2.0 GB)

After carefully debug, we find out that it's the kudu.write_duration accumulators causing single spark task larger than 2M, thus all tasks size of the stage will bigger than the limit.

However, this stage is just reading kudu table and do shuffle exchange, no writing any kudu tables.

So I propose to init this accumulator lazily in KuduContext to avoid such issues.

@adembo
Copy link
Contributor

adembo commented Feb 14, 2020

Hi @liupc.

Thank you for the pull request. The Kudu project doesn't merge new code through PRs; we use the gerrit tool to do code review and submission. Could you please follow the directions here to adapt this PR into a gerrit change?

@liupc
Copy link
Author

liupc commented Feb 17, 2020

@adembo Thanks, I will try it later

haohaoc pushed a commit to haohaoc/kudu that referenced this pull request May 11, 2020
The Messenger's lock is only intended to protect closing_, acceptor_pools_,
and rpc_services_. This change adjusts its usage to reflect that:
1. There's no need to take the lock in the destructor.
2. It was held for longer than necessary in QueueInboundCall.
3. It wasn't needed at all in DumpConnections.

The motivation for this was a TSAN lock inversion warning I saw in a
precommit job, between the Messenger lock and glog's vmodule lock. The
warning seems wrong (the vmodule lock is released after a VLOG statement
ends), but one way to avoid it altogether is to not take the Messenger lock
in its destructor.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=5867)
  Cycle in lock order graph: M1870 (0x7b14000172f8) => M37857528269694952 (0x000000000000) => M1870

  Mutex M37857528269694952 acquired here while holding mutex M1870 in main thread:
    #0 pthread_rwlock_wrlock /home/jenkins-slave/workspace/kudu-master/2/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1352 (kudu+0x4a360f)
    apache#1 glog_internal_namespace_::Mutex::Lock() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/src/glog-0.3.5/src/base/mutex.h:250:30 (libglog.so.0+0x1abe7)
    apache#2 glog_internal_namespace_::MutexLock::MutexLock(glog_internal_namespace_::Mutex*) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/src/glog-0.3.5/src/base/mutex.h:290 (libglog.so.0+0x1abe7)
    apache#3 google::InitVLOG3__(int**, int*, char const*, int) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/src/glog-0.3.5/src/vlog_is_on.cc:199 (libglog.so.0+0x1abe7)
    apache#4 kudu::rpc::Messenger::ShutdownInternal(kudu::rpc::Messenger::ShutdownMode) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/messenger.cc:283:5 (libkrpc.so+0xab101)
    apache#5 kudu::rpc::Messenger::AllExternalReferencesDropped() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/messenger.cc:249:3 (libkrpc.so+0xaaeb7)
    apache#6 std::__1::mem_fun_t<void, kudu::rpc::Messenger>::operator()(kudu::rpc::Messenger*) const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/functional:1120:17 (libkrpc.so+0xaf3a5)
    apache#7 std::__1::__shared_ptr_pointer<kudu::rpc::Messenger*, std::__1::mem_fun_t<void, kudu::rpc::Messenger>, std::__1::allocator<kudu::rpc::Messenger> >::__on_zero_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3586 (libkrpc.so+0xaf3a5)
    apache#8 std::__1::__shared_count::__release_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3490:9 (kudu+0x56affe)
    apache#9 std::__1::__shared_weak_count::__release_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3532 (kudu+0x56affe)
    apache#10 std::__1::shared_ptr<kudu::rpc::Messenger>::~shared_ptr() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:4468 (kudu+0x56affe)
    apache#11 kudu::client::KuduClient::Data::~Data() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/client/client-internal.cc:179:1 (libkudu_client.so+0x136260)
    apache#12 kudu::client::KuduClient::~KuduClient() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/client/client.cc:394:3 (libkudu_client.so+0x1130cc)
    apache#13 std::__1::default_delete<kudu::client::KuduClient>::operator()(kudu::client::KuduClient*) const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:2285:5 (libkudu_client.so+0x12be1b)
    apache#14 std::__1::__shared_ptr_pointer<kudu::client::KuduClient*, std::__1::default_delete<kudu::client::KuduClient>, std::__1::allocator<kudu::client::KuduClient> >::__on_zero_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3586 (libkudu_client.so+0x12be1b)
    apache#15 std::__1::__shared_count::__release_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3490:9 (kudu+0x550d1e)
    apache#16 std::__1::__shared_weak_count::__release_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3532 (kudu+0x550d1e)
    apache#17 std::__1::shared_ptr<kudu::client::KuduClient>::~shared_ptr() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:4468 (kudu+0x550d1e)
    apache#18 kudu::tools::LeaderMasterProxy::~LeaderMasterProxy() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action_common.h:233:7 (kudu+0x576cf9)
    apache#19 kudu::tools::(anonymous namespace)::ListMasters(kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action_master.cc:180:1 (kudu+0x572d0b)
    apache#20 _ZNSt3__18__invokeIRPFN4kudu6StatusERKNS1_5tools13RunnerContextEEJS6_EEEDTclclsr3std3__1E7forwardIT_Efp_Espclsr3std3__1E7forwardIT0_Efp0_EEEOSA_DpOSB_ /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/type_traits:4482:1 (kudu+0x52e48b)
    apache#21 kudu::Status std::__1::__invoke_void_return_wrapper<kudu::Status>::__call<kudu::Status (*&)(kudu::tools::RunnerContext const&), kudu::tools::RunnerContext const&>(kudu::Status (*&)(kudu::tools::RunnerContext const&), kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/__functional_base:318 (kudu+0x52e48b)
    apache#22 std::__1::__function::__func<kudu::Status (*)(kudu::tools::RunnerContext const&), std::__1::allocator<kudu::Status (*)(kudu::tools::RunnerContext const&)>, kudu::Status (kudu::tools::RunnerContext const&)>::operator()(kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/functional:1562:12 (kudu+0x52e3bd)
    apache#23 std::__1::function<kudu::Status (kudu::tools::RunnerContext const&)>::operator()(kudu::tools::RunnerContext const&) const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/functional:1916:12 (libkudu_tools_util.so+0x6c1c4)
    apache#24 kudu::tools::Action::Run(std::__1::vector<kudu::tools::Mode*, std::__1::allocator<kudu::tools::Mode*> > const&, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) const /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action.cc:258:10 (libkudu_tools_util.so+0x6a8d4)
    apache#25 kudu::tools::DispatchCommand(std::__1::vector<kudu::tools::Mode*, std::__1::allocator<kudu::tools::Mode*> > const&, kudu::tools::Action*, std::__1::deque<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_main.cc:132:15 (kudu+0x5b42b6)
    apache#26 kudu::tools::RunTool(int, char**, bool) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_main.cc:204:16 (kudu+0x5b5211)
    apache#27 main /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_main.cc:265:10 (kudu+0x5b557e)

    Hint: use TSAN_OPTIONS=second_deadlock_stack=1 to get more informative warning message

  Mutex M1870 acquired here while holding mutex M37857528269694952 in thread T8:
    #0 AnnotateRWLockAcquired /home/jenkins-slave/workspace/kudu-master/2/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cc:271 (kudu+0x4d53ff)
    apache#1 kudu::rw_spinlock::lock() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/util/locks.h:112:5 (libkudu_client.so+0x177762)
    apache#2 kudu::percpu_rwlock::lock() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/util/locks.h:222:22 (libkudu_client.so+0x1776f2)
    apache#3 std::__1::lock_guard<kudu::percpu_rwlock>::lock_guard(kudu::percpu_rwlock&) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/__mutex_base:104:27 (libkrpc.so+0xac9c9)
    apache#4 kudu::rpc::Messenger::~Messenger() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/messenger.cc:430 (libkrpc.so+0xac9c9)
    apache#5 std::__1::default_delete<kudu::rpc::Messenger>::operator()(kudu::rpc::Messenger*) const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:2285:5 (libkrpc.so+0xb246b)
    apache#6 std::__1::__shared_ptr_pointer<kudu::rpc::Messenger*, std::__1::default_delete<kudu::rpc::Messenger>, std::__1::allocator<kudu::rpc::Messenger> >::__on_zero_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3586 (libkrpc.so+0xb246b)
    apache#7 std::__1::__shared_count::__release_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3490:9 (kudu+0x56affe)
    apache#8 std::__1::__shared_weak_count::__release_shared() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:3532 (kudu+0x56affe)
    apache#9 std::__1::shared_ptr<kudu::rpc::Messenger>::~shared_ptr() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:4468 (kudu+0x56affe)
    apache#10 std::__1::shared_ptr<kudu::rpc::Messenger>::reset() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/memory:4603:5 (libkrpc.so+0xc0771)
    apache#11 kudu::rpc::ReactorThread::RunThread() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/reactor.cc:499 (libkrpc.so+0xc0771)
    apache#12 boost::_mfi::mf0<void, kudu::rpc::ReactorThread>::operator()(kudu::rpc::ReactorThread*) const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/boost/bind/mem_fn_template.hpp:49:29 (libkrpc.so+0xca669)
    apache#13 void boost::_bi::list1<boost::_bi::value<kudu::rpc::ReactorThread*> >::operator()<boost::_mfi::mf0<void, kudu::rpc::ReactorThread>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, kudu::rpc::ReactorThread>&, boost::_bi::list0&, int) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/boost/bind/bind.hpp:259:9 (libkrpc.so+0xca5ba)
    apache#14 boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::rpc::ReactorThread>, boost::_bi::list1<boost::_bi::value<kudu::rpc::ReactorThread*> > >::operator()() /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/boost/bind/bind.hpp:1222:16 (libkrpc.so+0xca543)
    apache#15 boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf0<void, kudu::rpc::ReactorThread>, boost::_bi::list1<boost::_bi::value<kudu::rpc::ReactorThread*> > >, void>::invoke(boost::detail::function::function_buffer&) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/boost/function/function_template.hpp:159:11 (libkrpc.so+0xca339)
    apache#16 boost::function0<void>::operator()() const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/boost/function/function_template.hpp:770:14 (libkrpc.so+0xba0b1)
    apache#17 kudu::Thread::SuperviseThread(void*) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/util/thread.cc:657:3 (libkudu_util.so+0x1ee174)

  Thread T8 'rpc reactor-588' (tid=5886, running) created by main thread at:
    #0 pthread_create /home/jenkins-slave/workspace/kudu-master/2/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:992 (kudu+0x490e36)
    apache#1 kudu::Thread::StartThread(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, boost::function<void ()> const&, unsigned long, scoped_refptr<kudu::Thread>*) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/util/thread.cc:601:15 (libkudu_util.so+0x1ed95b)
    apache#2 kudu::Status kudu::Thread::Create<void (kudu::rpc::ReactorThread::*)(), kudu::rpc::ReactorThread*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, void (kudu::rpc::ReactorThread::* const&)(), kudu::rpc::ReactorThread* const&, scoped_refptr<kudu::Thread>*) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/util/thread.h:164:12 (libkrpc.so+0xc5a15)
    apache#3 kudu::rpc::ReactorThread::Init() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/reactor.cc:185:10 (libkrpc.so+0xc026e)
    apache#4 kudu::rpc::Reactor::Init() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/reactor.cc:759:18 (libkrpc.so+0xc4911)
    apache#5 kudu::rpc::Messenger::Init() /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/messenger.cc:446:5 (libkrpc.so+0xaad72)
    apache#6 kudu::rpc::MessengerBuilder::Build(std::__1::shared_ptr<kudu::rpc::Messenger>*) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/rpc/messenger.cc:205:3 (libkrpc.so+0xaa7cd)
    apache#7 kudu::client::KuduClientBuilder::Build(std::__1::shared_ptr<kudu::client::KuduClient>*) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/client/client.cc:349:3 (libkudu_client.so+0x112561)
    apache#8 kudu::tools::LeaderMasterProxy::Init(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, kudu::MonoDelta const&) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action_common.cc:786:30 (libkudu_tools_util.so+0x7740c)
    apache#9 kudu::tools::LeaderMasterProxy::Init(kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action_common.cc:792:10 (libkudu_tools_util.so+0x774d6)
    apache#10 kudu::tools::(anonymous namespace)::ListMasters(kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action_master.cc:109:3 (kudu+0x572be3)
    apache#11 _ZNSt3__18__invokeIRPFN4kudu6StatusERKNS1_5tools13RunnerContextEEJS6_EEEDTclclsr3std3__1E7forwardIT_Efp_Espclsr3std3__1E7forwardIT0_Efp0_EEEOSA_DpOSB_ /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/type_traits:4482:1 (kudu+0x52e48b)
    apache#12 kudu::Status std::__1::__invoke_void_return_wrapper<kudu::Status>::__call<kudu::Status (*&)(kudu::tools::RunnerContext const&), kudu::tools::RunnerContext const&>(kudu::Status (*&)(kudu::tools::RunnerContext const&), kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/__functional_base:318 (kudu+0x52e48b)
    apache#13 std::__1::__function::__func<kudu::Status (*)(kudu::tools::RunnerContext const&), std::__1::allocator<kudu::Status (*)(kudu::tools::RunnerContext const&)>, kudu::Status (kudu::tools::RunnerContext const&)>::operator()(kudu::tools::RunnerContext const&) /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/functional:1562:12 (kudu+0x52e3bd)
    apache#14 std::__1::function<kudu::Status (kudu::tools::RunnerContext const&)>::operator()(kudu::tools::RunnerContext const&) const /home/jenkins-slave/workspace/kudu-master/2/thirdparty/installed/tsan/include/c++/v1/functional:1916:12 (libkudu_tools_util.so+0x6c1c4)
    apache#15 kudu::tools::Action::Run(std::__1::vector<kudu::tools::Mode*, std::__1::allocator<kudu::tools::Mode*> > const&, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) const /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_action.cc:258:10 (libkudu_tools_util.so+0x6a8d4)
    apache#16 kudu::tools::DispatchCommand(std::__1::vector<kudu::tools::Mode*, std::__1::allocator<kudu::tools::Mode*> > const&, kudu::tools::Action*, std::__1::deque<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_main.cc:132:15 (kudu+0x5b42b6)
    apache#17 kudu::tools::RunTool(int, char**, bool) /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_main.cc:204:16 (kudu+0x5b5211)
    apache#18 main /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tools/tool_main.cc:265:10 (kudu+0x5b557e)

Change-Id: I1fd93c06b14bc97a9ac4a37a5b6ca55ffa38f544
Reviewed-on: http://gerrit.cloudera.org:8080/14250
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants