Is your feature request related to a problem? Please describe
When running tests in parallel on multiple machines, or in multiple environments, there may be situations where there is a signal on some machines but not on others, and when the data is analysed as a combined data set the signal gets "averaged out".
E.g. data is collected on two hosts, one with CPU A and one with CPU B. The implementation in the System Under Test uses different assembly for those CPUs, so in effect the test is executing in two different environments. When one of those assembly implementations is leaky and the other is not, and the leak is small in relation to the sample, the resulting p-values can still be large enough to indicate a chance occurrence rather than a definite vulnerability. At the same time, if the data were analysed on a per-CPU basis, the result could cross the threshold of statistical significance.
Describe the solution you'd like
Both the Friedman and Skillings-Mack tests produce a test statistic whose distribution does not depend on the sample size, only on the number of classes (groups) in the sample.
We could calculate the test statistic continuously, and then graph how it changes over time.
We could also calculate it for parts of the sample (as many measurements as are represented by a single column in the heatmap graphs) and then plot those.
Then we could see whether a) the departure of the test statistic from its expected value is consistent over time, and b) there are any discontinuities in the graphs. If there were, it would suggest that environment-specific side-channels are present.
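The chunked variant could look roughly like this. This is a minimal sketch, not code from the project: the data layout, the injected signal, and the chunk size are all made up for illustration, and `scipy.stats.friedmanchisquare` stands in for whatever Friedman implementation we end up with.

```python
# Sketch: compute the Friedman test statistic per chunk of measurements
# so it can be plotted over time. All names and sizes are illustrative.
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(42)

# Synthetic data: 1000 blocks of measurements across 3 classes (groups).
n_blocks, n_classes = 1000, 3
data = rng.normal(size=(n_blocks, n_classes))
data[:, 0] += 0.5  # inject a small signal into one class

chunk_size = 100  # e.g. as many measurements as one heatmap column

stats = []
for start in range(0, n_blocks, chunk_size):
    chunk = data[start:start + chunk_size]
    # friedmanchisquare takes one sample per class
    stat, _ = friedmanchisquare(*(chunk[:, i] for i in range(n_classes)))
    stats.append(stat)

# `stats` can now be plotted against chunk index; a consistent departure
# from the expected value suggests a real signal, while discontinuities
# suggest environment-specific behaviour.
```

With a real data set, the chunk boundaries would ideally line up with the heatmap columns so the two graphs can be read side by side.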
Describe alternatives you've considered
An alternative is to run multiple tests: one for the whole data set, and then one for each unique environment. That's rather inconvenient, and requires a human to analyse a lot of information.
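For comparison, the alternative amounts to something like the following sketch. The per-environment data layout and the environment names are hypothetical; the point is only that every environment produces its own p-value that then has to be inspected by hand.

```python
# Sketch of the alternative: one test on the combined data set plus one
# test per environment. The data layout here is hypothetical.
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)

# Hypothetical per-environment measurements: env -> (blocks, classes)
envs = {
    "cpu_A": rng.normal(size=(500, 3)) + np.array([0.3, 0.0, 0.0]),  # leaky
    "cpu_B": rng.normal(size=(500, 3)),  # not leaky
}

results = {}
for name, data in envs.items():
    _, p = friedmanchisquare(*data.T)  # one sample per class
    results[name] = p

combined = np.vstack(list(envs.values()))
_, results["combined"] = friedmanchisquare(*combined.T)

# A human then has to compare all of these p-values manually, which is
# exactly the inconvenience described above.
```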
Additional context
While it should be simple to collect this information from the Skillings-Mack test implementation we already have, it will require us to reimplement the Friedman test (which we should probably do anyway, to parallelize it).
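A reimplemented Friedman test would lend itself to this naturally, since the statistic is built from per-block rank sums, which are additive across chunks and therefore easy to compute in parallel or incrementally. A sketch (without the tie correction that a production implementation would also need):

```python
# Sketch: Friedman statistic from per-block rank sums. Because the rank
# sums are additive over blocks, chunks can be ranked independently (in
# parallel) and merged. No tie correction is applied here.
import numpy as np
from scipy.stats import rankdata, friedmanchisquare

def friedman_statistic(data):
    """data: (n_blocks, k_classes) array of measurements."""
    n, k = data.shape
    # rank within each block (row); ties get average ranks
    ranks = np.apply_along_axis(rankdata, 1, data)
    rank_sums = ranks.sum(axis=0)  # additive, so partial sums can be merged
    return 12.0 / (n * k * (k + 1)) * (rank_sums ** 2).sum() - 3 * n * (k + 1)

# Sanity check against scipy on continuous data (no ties, so the missing
# tie correction doesn't matter).
rng = np.random.default_rng(1)
data = rng.normal(size=(200, 4))
stat, _ = friedmanchisquare(*data.T)
```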