Create an API for storing user data in the Concuerror state #334

Open
k32 opened this issue Jul 17, 2021 · 2 comments
Comments

@k32

k32 commented Jul 17, 2021

Is your feature request related to a problem? Please describe.

For performance reasons, it would be great to have an API for maintaining a user-provided global state, with two requirements:

  1. Updating this state from the instrumented processes should not be considered a race condition, and it should not trigger exploration of additional interleavings.
  2. At the same time, this state should be maintained as part of the instrumented system.

Describe the solution you'd like

  1. Expand the #trace_state record with a new user_state field (though I may be mistaken about whether this is the right place)
  2. Create a magic function that works like this:
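%% Somewhere in the testcase (proposed API, not yet implemented)...
ReturnValue = concuerror:update_user_state(
                fun(OldState) ->
                        %% do_stuff/1 computes the result and the new state
                        {Result, NewState} = do_stuff(OldState),
                        {Result, NewState}
                end)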
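
To make the intended semantics concrete, here is a minimal sketch of how such an update could thread the user state through a record. Everything in it is an assumption for illustration: the module `user_state_sketch`, the record `sketch_state` (a stand-in, not Concuerror's real #trace_state), and the two-argument `update_user_state/2` are hypothetical and do not exist in Concuerror.

```erlang
-module(user_state_sketch).
-export([new/0, update_user_state/2, demo/0]).

%% Hypothetical stand-in for the checker's per-interleaving state.
%% This is NOT Concuerror's actual #trace_state record.
-record(sketch_state, {user_state = []}).

new() ->
    #sketch_state{}.

%% Apply Fun to the old user state. Fun must return {ReturnValue, NewState}.
%% The return value goes back to the caller; the new state is stored.
update_user_state(Fun, #sketch_state{user_state = Old} = S) ->
    {ReturnValue, New} = Fun(Old),
    {ReturnValue, S#sketch_state{user_state = New}}.

%% Record two events and return how many are stored after the second update.
demo() ->
    S0 = new(),
    {ok, S1} = update_user_state(fun(Events) -> {ok, [ping | Events]} end, S0),
    {Count, _S2} = update_user_state(
                     fun(Events) -> {length(Events) + 1, [pong | Events]} end,
                     S1),
    Count.
```

Because the state lives in a value that the checker itself would own, it could be saved and restored together with each explored interleaving, which is the property requested in point 2 above.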

Describe alternatives you've considered
Hide the state in a module that is not instrumented by Concuerror. This doesn't seem to work correctly.

Additional context

We've been experimenting with a special style of testcases that rely heavily on inspecting the system's execution trace (https://github.com/kafka4beam/snabbkaffe). Our library is extremely naive: it intercepts structured log messages from the system while it runs and forwards them to a collector process, which later dumps the event trace so that it can be checked for any desired properties (e.g. https://github.com/emqx/ekka/blob/master/test/ekka_rlog_props.erl#L41). This approach has proved to be quite elegant in some cases where we're dealing with eventually consistent systems that can restart and fail over.
Unfortunately, when the snabbkaffe library runs under Concuerror, the collector process creates a lot of unnecessary interleavings, so many that the whole snabbkaffe+Concuerror combination becomes impractical. I wonder if it is possible to move snabbkaffe's internal state from a separate process into Concuerror's internal state.

@aronisstav
Member

Hi @k32 !

This is a reasonable proposal, and one of my own "headaches" too: making it easier to "hide" "benign" racy operations from Concuerror, and still use them to control a test's scheduling. I am also curious about how the exclude_module option is failing in such a scenario.

I want to explore this more, so I think a good way to start is to have a small example of a snabbkaffe use that highlights the problem. Something like a snabbkaffe test case, together with a way to invoke Concuerror on it, should be good enough.

Is this something that you can send me?

@k32
Author

k32 commented Jul 23, 2021

I want to explore this more, so I think a good way to start is to have a small example of a snabbkaffe use that highlights the problem. Something like a snabbkaffe test case, together with a way to invoke Concuerror on it, should be good enough.

Thanks for the answer! We have a small testsuite where snabbkaffe runs under concuerror. Consider the following test for example:

https://github.com/kafka4beam/snabbkaffe/blob/master/test/concuerror_tests.erl#L14

It spawns three processes: the first one waits for a ping message, and the other two compete to send that message to the first one. Once the first process receives a message, it produces a pong trace event. The main process of the testcase waits for the pong event. There is a lot of preprocessor trickery going on in the ?block_until macro, but this is essentially what it does: it constructs a predicate fun matching the event, then sends the fun to the snabbkaffe gen_server, which uses it to match both past and incoming events. Once the gen_server finds a matching event, it replies to the caller.
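
The scenario above can be approximated in plain Erlang without any snabbkaffe macros. This is only a rough sketch under stated assumptions: the module `ping_pong_sketch`, the message shapes (`{event, E}`, `{block_until, From, Pred}`, `{dump, From}`), and the plain-process collector are hypothetical simplifications, not snabbkaffe's actual gen_server or protocol.

```erlang
-module(ping_pong_sketch).
-export([run/0]).

%% Collector loop: Events holds past events (newest first),
%% Waiters holds {Pid, Predicate} pairs that have not matched yet.
collector(Events, Waiters) ->
    receive
        {event, E} ->
            %% Wake up every waiter whose predicate matches the new event.
            {Hit, Miss} = lists:partition(fun({_From, P}) -> P(E) end, Waiters),
            lists:foreach(fun({From, _P}) -> From ! {found, E} end, Hit),
            collector([E | Events], Miss);
        {block_until, From, Pred} ->
            %% Check past events first; otherwise wait for a future one.
            case lists:filter(Pred, Events) of
                [Match | _] ->
                    From ! {found, Match},
                    collector(Events, Waiters);
                [] ->
                    collector(Events, [{From, Pred} | Waiters])
            end;
        {dump, From} ->
            From ! {trace, lists:reverse(Events)},
            collector(Events, Waiters);
        stop ->
            ok
    end.

run() ->
    Collector = spawn(fun() -> collector([], []) end),
    %% First process: waits for one ping, then emits a pong event.
    Receiver = spawn(fun() ->
                         receive {ping, N} -> Collector ! {event, {pong, N}} end
                     end),
    %% Two senders race to deliver the ping.
    [spawn(fun() -> Receiver ! {ping, N} end) || N <- [1, 2]],
    %% Main process: block until a pong event is seen.
    Collector ! {block_until, self(), fun({pong, _}) -> true; (_) -> false end},
    receive {found, {pong, Winner}} -> ok end,
    Collector ! {dump, self()},
    receive {trace, Trace} -> ok end,
    Collector ! stop,
    {Winner, Trace}.
```

Calling `ping_pong_sketch:run()` returns the id of whichever sender won the race together with the recorded trace. Every message exchanged with the collector is an extra scheduling point for a systematic checker, which is why keeping this state in a process multiplies the interleavings to explore.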

Currently all the snabbkaffe processes are instrumented by Concuerror, and the testcase works as I expect: there is always a ping/pong pair of events in the trace. However, it breaks when I exclude the snabbkaffe module here: https://github.com/kafka4beam/snabbkaffe/blob/master/Makefile#L5

* Error: A process (<0.106.0>) took more than 5000ms to report a built-in event. You can try to increase the '--timeout' limit and/or ensure that there are no infinite loops in your test.

This could be a minor issue in the snabbkaffe code, but I suspect that the problem may be more fundamental: events from different runs of the instrumented code may all get mixed together in snabbkaffe's trace. However, I can only speculate that this is what happens.
