
Detect environment-specific departure from mean for combined data #924

Open
tomato42 opened this issue Apr 22, 2024 · 0 comments

Feature request

Is your feature request related to a problem? Please describe

When running tests in parallel on multiple machines, or in multiple environments, there may be situations where there is a signal on some machines but not on others, yet when the data is analysed as a single combined data set the signal gets "averaged out".

E.g. data is collected on two hosts, one with CPU A and one with CPU B. The implementation in the System Under Test uses different assembly for those CPUs, so in effect the test is executing in two different environments. When one of those assembly implementations is leaky and the other is not, but the leak is small relative to the sample size, the resulting p-values for the combined data can still be large enough to indicate a chance occurrence rather than a definite vulnerability. At the same time, if the data were analysed on a per-CPU basis, the result could cross the threshold of statistical significance.
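A rough simulation of this dilution effect, using the Friedman test via scipy. The numbers here are hypothetical and exaggerated for illustration (ten environments, a half-sigma shift in one class of one environment); the key point is that the deterministic part of the Friedman statistic scales as 1/n, so folding quiet environments into the sample divides the leaky environment's contribution roughly by the number of environments:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(7)
n_envs, n_blocks, k = 10, 500, 3   # hypothetical: 10 environments, 3 classes

# Quiet environments: all classes have identical timing distributions.
envs = [rng.normal(100.0, 1.0, size=(n_blocks, k)) for _ in range(n_envs)]
# One leaky environment: class 0 is consistently ~0.5 sigma slower.
envs[0][:, 0] += 0.5

# Per-environment analysis of the leaky environment alone.
stat_env, p_env = friedmanchisquare(*envs[0].T)

# Combined analysis of all environments pooled together.
combined = np.vstack(envs)
stat_all, p_all = friedmanchisquare(*combined.T)

# Pooling the nine quiet environments dilutes the statistic roughly
# tenfold, so p_env can cross the significance threshold while p_all,
# for a small enough leak, does not.
```

With a smaller shift or more quiet environments, `p_all` stays well above the usual significance threshold while `p_env` does not.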

Describe the solution you'd like

Both the Friedman and Skillings-Mack tests produce a test statistic whose distribution depends only on the number of classes (groups) in the sample, not on the sample size.
We could calculate the test statistic continuously and graph how it changes over time.
We could also calculate it for parts of the sample (as many measurements as are represented by a single column in the heatmap graphs) and then plot those.

Then we could see whether a) the departure of the test statistic from its expected value is consistent over time, and b) there are any discontinuities in the graphs. If there are, it would suggest that environment-specific side-channels are present.
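The per-chunk idea above could be sketched roughly like this (chunk size, data shape, and placeholder data are hypothetical; chunks correspond to heatmap columns):

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
n_blocks, k = 1000, 3          # measurements (blocks) x classes (groups)
data = rng.normal(100.0, 1.0, size=(n_blocks, k))  # placeholder timing data

chunk = 100                    # hypothetical: measurements per heatmap column
per_chunk, cumulative = [], []
for end in range(chunk, n_blocks + 1, chunk):
    # statistic for this chunk only (one point per heatmap column)
    stat, _ = friedmanchisquare(*data[end - chunk:end].T)
    per_chunk.append(stat)
    # statistic for all data collected so far ("continuous" view)
    cstat, _ = friedmanchisquare(*data[:end].T)
    cumulative.append(cstat)

# Both series could then be plotted over time: a consistent departure from
# the expected value, or a jump in either series, would hint at an
# environment-specific side channel.
```

Since the null distribution is (asymptotically) chi-squared with k-1 degrees of freedom regardless of chunk size, the per-chunk points are directly comparable to each other and to the cumulative curve.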

Describe alternatives you've considered

An alternative is to run multiple tests: one for the whole data set, and then one for each unique environment. That's rather inconvenient, and requires a lot of information to be analysed by a human.

Additional context

While it should be simple to collect this information while running the Skillings-Mack test we have, it will require us to reimplement the Friedman test (which we should probably do anyway, to parallelize it).
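One property that would make a reimplementation parallelizable: the per-block rank sums are additive, so chunks can be ranked independently (in parallel or streamed) and merged before the statistic is computed. A minimal sketch, assuming no ties within a block (true for high-resolution timing data; a real implementation would need the tie correction):

```python
import numpy as np
from scipy.stats import rankdata, chi2, friedmanchisquare

def chunk_rank_sums(chunk):
    """Rank each block (row) of an (n, k) chunk; return per-group rank sums."""
    return np.apply_along_axis(rankdata, 1, chunk).sum(axis=0)

def friedman_from_rank_sums(rank_sums, n):
    """Friedman chi-squared statistic and p-value from merged rank sums."""
    k = len(rank_sums)
    stat = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3.0 * n * (k + 1)
    return stat, chi2.sf(stat, k - 1)

# Merging two independently-processed chunks is just adding their
# rank-sum vectors:
rng = np.random.default_rng(1)
part1 = rng.normal(size=(400, 3))
part2 = rng.normal(size=(600, 3))
sums = chunk_rank_sums(part1) + chunk_rank_sums(part2)
stat, pvalue = friedman_from_rank_sums(sums, n=1000)

# Sanity check against scipy's reference implementation (no ties here,
# so the tie correction factor is 1 and the results should agree).
ref = friedmanchisquare(*np.vstack([part1, part2]).T)
```

The same merged rank sums could also be snapshotted after every chunk, which would give the cumulative graph described above essentially for free.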
