Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concuerror internal error "replaying a built-in returned a different result than expected" #309

Open
bford opened this issue Mar 10, 2020 · 1 comment

Comments

@bford
Copy link

bford commented Mar 10, 2020

Describe the bug
Trying to model-check the simple Erlang model of Que Sera Consensus, Concuerror worked for a couple days and then failed with an error saying "Please notify the developers, as this is a bug of Concuerror." Full console text included below. It looks like there's an internal replay inconsistency of some kind - see the end of the log below.

To Reproduce
Steps to reproduce the behavior:

  1. git clone [email protected]:dedis/tlc.git
  2. cd tlc/erlang/model
  3. git checkout 7f2345ab847796e2df08eaa986b99b1b737d0016
  4. Make this one-line change to qsc.erl to run a trivial 3-node test for only one time-step:
@@ -132,6 +132,6 @@ test_check(A, B) -> erlang:error({inconsistency, A, B}).

 % Run QSC and TLC through a test suite.
 test() ->
-       [test_run(F, 1000) || F <- [1,2,3,4,5]],        % simple test suite
+       [test_run(F, 1) || F <- [1]],   % simple test suite
        io:fwrite("Tests completed~n").
  1. Typing ./run.sh should work immediately
  2. Run concuerror --non_racing_system user -m qsc
  3. Wait a couples days... :/ Error happened for me after about 33 million interleavings - see log below.

Expected behavior
Something other than a bug in Concuerror. ;)

Environment (please complete the following information):

  • OS: Mac OS X 10.15.3 Catalina
  • Concuerror Version: 0.20.0+build.2238.refa676f94

Additional context
Relevant console log:

$ Concuerror/bin/concuerror --non_racing_system user -m qsc
Concuerror 0.20.0+build.2238.refa676f94 started at 07 Mar 2020 14:37:51

  • Info: Showing progress (-h progress, for details)
  • Info: Writing results in concuerror_report.txt
  • Info: Automatically instrumented module io_lib
  • Info: Showing PIDs as "<symbolic name(/last registered name)>" ('-h symbolic_names').
  • Info: Automatically instrumented module error_handler
  • Info: Automatically instrumented module qsc
  • Info: Automatically instrumented module io
  • Tip: Running without a scheduling_bound corresponds to verification and may take a long time.
  • Tip: Your test sends messages to the 'user' process to write output. Such messages from different processes may race, producing spurious interleavings. Consider using '--non_racing_system user' to avoid them.
  • Info: Automatically instrumented module erlang
  • Info: Automatically instrumented module lists
  • Info: Automatically instrumented module rand
  • Info: Automatically instrumented module sets
  • Info: You can see pairs of racing instructions (in the report and '--graph') with '--show_races true'
  • Tip: Increase '--print_depth' if output/graph contains "...".
  • Error: Stop testing on first error. (Check '-h keep_going').
  • Error: On step 79, replaying a built-in returned a different result than expected:
    original:
    <0.120.0>: receives message ({3,{hist,1,2,{msg,1,2},2,{hist,0,undefined,undefined,undefined,undefined}}})
    in qsc.erl line 65
    new:
    <0.120.0>: receives message ({4,[{hist,1,2,{msg,1,2},2,{hist,0,undefined,undefined,undefined,undefined}},{hist,1,2,{msg,1,2},2,{hist,0,undefined,undefined,undefined,undefined}}]})
    in qsc.erl line 65
    Please notify the developers, as this is a bug of Concuerror.
    Done at 09 Mar 2020 23:46:49 (Exit status: fail)
    Summary: 1 errors, 33130687/33133636 interleavings explored
@kostis
Copy link
Contributor

kostis commented Mar 11, 2020

Thanks for the bug report @bford and for the excellent description of how to reproduce this.

I've played a bit with the test case and it seems that a faster (ca 30 mins) way to reproduce (a similar) error is to use a --scheduling_bound 6 option (FYI: with 5 it completes normally). Execution crashes with:

~/tlc/erlang/model$ concuerror --scheduling_bound 6 --non_racing_system user -m qsc
...
* Error: On step 76, replaying a built-in returned a different result than expected:
  original:
    {event,<0.123.0>,
    {receive_event,
        {message,
            {4,
             [{hist,1,1,
                  {msg,1,1},
                  1,
                  {hist,0,undefined,undefined,undefined,undefined}},
              {hist,1,1,
                  {msg,1,1},
                  1,
                  {hist,0,undefined,undefined,undefined,undefined}}]},
            {<0.123.0>,12}},
        {10,#Fun},
        <0.123.0>,infinity,false},
    #Ref<0.1668025383.1918894081.80515>,
    [65,{file,"/home/kostis/tlc/erlang/model/qsc.erl"}],
    [{message_received,{<0.123.0>,12}}]}
  new:
    {event,<0.123.0>,
    {receive_event,
        {message,
            {4,
             [{hist,1,1,
                  {msg,1,1},
                  1,
                  {hist,0,undefined,undefined,undefined,undefined}},
              {hist,1,1,
                  {msg,1,1},
                  1,
                  {hist,0,undefined,undefined,undefined,undefined}}]},
            {<0.122.0>,12}},
        {10,#Fun},
        <0.123.0>,infinity,false},
    #Ref<0.1668025383.1918894081.80515>,
    [65,{file,"/home/kostis/tlc/erlang/model/qsc.erl"}],
    [{message_received,{<0.122.0>,12}}]}
 Please notify the developers, as this is a bug of Concuerror.
Done at 11 Mar 2020 07:54:05 (Exit status: fail)
  Summary: 1 errors, 697421/697792 interleavings explored (the scheduling bound was reached)

Notes:

  • The diff in the above is a bit hard to spot, but it's a problem with PIDs (<0.123.0> vs <0.122.0> in the tuple before the #Fun).
  • With the --observers option, I was also able to obtain a crash slightly faster than the above (~22mins and 470013/470384 interleavings explored), but I am not so sure it's easier to debug.

Edit
With --observers --scheduling_bound 8 the error is produced slightly faster (18 mins) and looks as follows:

* Error: On step 82, replaying a built-in returned a different result than expected:
  original:
    <0.123.0>: receives message ({3,{hist,1,1,{msg,1,1},3,{hist,0,undefined,undefined,undefined,undefined}}})
    in qsc.erl line 65
  new:
    <0.123.0>: receives message ({4,[{hist,1,1,{msg,1,1},3,{hist,0,undefined,undefined,undefined,undefined}},{hist,1,1,{msg,1,1},3,{hist,0,undefined,undefined,undefined,undefined}}]})
    in qsc.erl line 65
 Please notify the developers, as this is a bug of Concuerror.
Done at 11 Mar 2020 09:40:03 (Exit status: fail)
  Summary: 1 errors, 375129/376003 interleavings explored (the scheduling bound was reached)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants