Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arc::try_unwrap fails intermittently during breach_depressions_least_cost #407

Open
jontwo opened this issue May 17, 2024 · 3 comments
Open

Comments

@jontwo
Copy link

jontwo commented May 17, 2024

I don't have a good repro case for this, but one of our system tests is failing intermittently. I managed to get a stacktrace out, but it doesn't really tell us anything, just the line number where it failed.

INFO     farm:d8_pointer.py:878 *****************************************
INFO     farm:d8_pointer.py:878 * Welcome to BreachDepressionsLeastCost *
INFO     farm:d8_pointer.py:878 * Powered by WhiteboxTools              *
INFO     farm:d8_pointer.py:878 * www.whiteboxgeo.com                   *
INFO     farm:d8_pointer.py:878 *****************************************
INFO     farm:d8_pointer.py:878 Reading data...
INFO     farm:d8_pointer.py:878 Num. solved pits: 168
INFO     farm:d8_pointer.py:878 Num. unsolved pits: 1
INFO     farm:d8_pointer.py:878 Filling remaining depressions...
INFO     farm:d8_pointer.py:878 thread 'main' panicked at 'Error unwrapping 'output'', whitebox-tools-app/src/tools/hydro_analysis/breach_depressions_least_cost.rs:657:27
INFO     farm:d8_pointer.py:878 stack backtrace:
INFO     farm:d8_pointer.py:878 0:     0x562e60e0b16c - std::backtrace_rs::backtrace::libunwind::trace::h3fea1eb2e0ba2ac9
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/../../backtrace/src/backtrace/libunwind.rs:90:5
INFO     farm:d8_pointer.py:878 1:     0x562e60e0b16c - std::backtrace_rs::backtrace::trace_unsynchronized::h849d83492cbc0d59
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
INFO     farm:d8_pointer.py:878 2:     0x562e60e0b16c - std::sys_common::backtrace::_print_fmt::he3179d37290f23d3
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/sys_common/backtrace.rs:67:5
INFO     farm:d8_pointer.py:878 3:     0x562e60e0b16c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h140f6925cad14324
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/sys_common/backtrace.rs:46:22
INFO     farm:d8_pointer.py:878 4:     0x562e60e324bc - core::fmt::write::h31b9cd1bedd7ea38
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/fmt/mod.rs:1150:17
INFO     farm:d8_pointer.py:878 5:     0x562e60e07515 - std::io::Write::write_fmt::h1fdf66f83f70913e
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/io/mod.rs:1667:15
INFO     farm:d8_pointer.py:878 6:     0x562e60e0cce0 - std::sys_common::backtrace::_print::he7ac492cd19c3189
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/sys_common/backtrace.rs:49:5
INFO     farm:d8_pointer.py:878 7:     0x562e60e0cce0 - std::sys_common::backtrace::print::hba20f8920229d8e8
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/sys_common/backtrace.rs:36:9
INFO     farm:d8_pointer.py:878 8:     0x562e60e0cce0 - std::panicking::default_hook::{{closure}}::h714d63979ae18678
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:210:50
INFO     farm:d8_pointer.py:878 9:     0x562e60e0c897 - std::panicking::default_hook::hf1afb64e69563ca8
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:227:9
INFO     farm:d8_pointer.py:878 10:     0x562e60e0d394 - std::panicking::rust_panic_with_hook::h02231a501e274a13
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:624:17
INFO     farm:d8_pointer.py:878 11:     0x562e60e0ce42 - std::panicking::begin_panic_handler::{{closure}}::h5b5e738edf361af7
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:519:13
INFO     farm:d8_pointer.py:878 12:     0x562e60e0b634 - std::sys_common::backtrace::__rust_end_short_backtrace::h601a115def7987b1
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/sys_common/backtrace.rs:141:18
INFO     farm:d8_pointer.py:878 13:     0x562e60e0cdd9 - rust_begin_unwind
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:517:5
INFO     farm:d8_pointer.py:878 14:     0x562e5ff1f961 - core::panicking::panic_fmt::h7a58c8fffc5559a4
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/panicking.rs:101:14
INFO     farm:d8_pointer.py:878 15:     0x562e6078cdfd - <whitebox_tools::tools::hydro_analysis::breach_depressions_least_cost::BreachDepressionsLeastCost as whitebox_tools::tools::WhiteboxTool>::run::h81c67dd42e84ab9e
INFO     farm:d8_pointer.py:878 16:     0x562e6069e8d2 - whitebox_tools::tools::ToolManager::run_tool::h53a77cdd713cd574
INFO     farm:d8_pointer.py:878 17:     0x562e607748b5 - whitebox_tools::run::h7f1fd320ded50c56
INFO     farm:d8_pointer.py:878 18:     0x562e6076d7fa - whitebox_tools::main::h803c24de18fe5847
INFO     farm:d8_pointer.py:878 19:     0x562e60af40c3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h3026df5b86c4c0c7
INFO     farm:d8_pointer.py:878 20:     0x562e605ad7e9 - std::rt::lang_start::{{closure}}::h7ae49518c4aa84f7
INFO     farm:d8_pointer.py:878 21:     0x562e60e0d99a - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h2790017aba790142
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/ops/function.rs:259:13
INFO     farm:d8_pointer.py:878 22:     0x562e60e0d99a - std::panicking::try::do_call::hd5d0fbb7d2d2d85d
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:403:40
INFO     farm:d8_pointer.py:878 23:     0x562e60e0d99a - std::panicking::try::h675520ee37b0fdf7
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:367:19
INFO     farm:d8_pointer.py:878 24:     0x562e60e0d99a - std::panic::catch_unwind::h803430ea0284ce79
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panic.rs:129:14
INFO     farm:d8_pointer.py:878 25:     0x562e60e0d99a - std::rt::lang_start_internal::{{closure}}::h3a398a8154de3106
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/rt.rs:45:48
INFO     farm:d8_pointer.py:878 26:     0x562e60e0d99a - std::panicking::try::do_call::hf60f106700df94b2
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:403:40
INFO     farm:d8_pointer.py:878 27:     0x562e60e0d99a - std::panicking::try::hb2022d2bc87a9867
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:367:19
INFO     farm:d8_pointer.py:878 28:     0x562e60e0d99a - std::panic::catch_unwind::hbf801c9d61f0c2fb
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panic.rs:129:14
INFO     farm:d8_pointer.py:878 29:     0x562e60e0d99a - std::rt::lang_start_internal::hdd488b91dc742b96
INFO     farm:d8_pointer.py:878 at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/rt.rs:45:20
INFO     farm:d8_pointer.py:878 30:     0x562e60776d92 - main
INFO     farm:d8_pointer.py:878 31:     0x7f27edb1324a - <unknown>
INFO     farm:d8_pointer.py:878 32:     0x7f27edb13305 - __libc_start_main
INFO     farm:d8_pointer.py:878 33:     0x562e5ff20134 - <unknown>
INFO     farm:d8_pointer.py:878 34:                0x0 - <unknown>

The line referenced above is this:

output = match Arc::try_unwrap(output2) {

I'm assuming this is a race condition as 95% of the time the test passes. Is there anyway to add a delay or a retry so it doesn't error straight away? We might try and implement that in our Python code to see if we can make the test more stable.

@erakhmatulina
Copy link

having the same issue

@jblindsay
Copy link
Owner

Are you able to share your data with me for testing? Otherwise this is a very difficult bug to identify the cause of. Also, how are you using WhiteboxTools, e.g. are you using a particular frontend or perhaps the R or Python APIs?

@jontwo
Copy link
Author

jontwo commented Sep 30, 2024

bdlc_repro.tar.gz

We're using the Python frontend. I've attached a repro case that should show the issue - the Python script is pretty quick to run, but you need to run it a few times to see it fail. My bash-fu isn't great so the run.sh script doesn't actually exit when it fails, but if you pipe the output into a log, run it for a couple of minutes, then cancel it, you can search the logs for "panic". It comes up a few times.

Example:

...
Finding pits: 98%
Finding pits: 99%
Finding pits: 100%
Breaching: 0%
Breaching: 25%
Breaching: 50%
Breaching: 75%
Breaching: 100%
Num. solved pits: 4
Num. unsolved pits: 1
Filling remaining depressions...
Finding pit cells: 12%
Finding pit cells: 25%
Finding pit cells: 37%
Finding pit cells: 50%
Finding pit cells: 62%
Finding pit cells: 75%
Finding pit cells: 87%
Finding pit cells: 100%
thread 'main' panicked at whitebox-tools-app/src/tools/hydro_analysis/breach_depressions_least_cost.rs:657:27:
Error unwrapping 'output'
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
./whitebox_tools --run="BreachDepressionsLeastCost" --dem='/home/jontwo/bdlc_repro/data/burned_dem.tif' --output='/tmp/d8_repro0/fixed_dem_streams.tif' --dist='10' --fill -v --compress_rasters=True

At the moment I'm running whitebox 2.3.5 in Python 3.12, but we've had this fail in CI for a few different versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants