Rust based benchmarks & Tests #8

koraa · 2020-03-01T11:55:47Z

Requirements

The benchmarks should be written entirely in Rust.
The benchmarks should be portable and not rely on the presence of platform-defined dictionary files.
The benchmarks should be runnable with specific parameters:
1. Number of input lines
2. Fraction of duplicates
3. Distribution of input line length
4. Char set (binary/text)
The benchmarks should still be able to run against all the preexisting commands (sort|uniq).

Design

A CLI application should be written that produces a set of random tokens according to the parameters specified on the CLI:

genbench --charset ascii/binary --delim CHAR --number NUM --duplicates PERCENTAGE --short LEN --long LEN

The short/long parameters each indicate the 90th percentile of string lengths, with lengths drawn from a Gaussian distribution.
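
A minimal sketch of what the generator core could look like, using only the standard library. Everything here is an assumption for illustration: the `Params` struct, the SplitMix64 `Rng` (swapped in so no external crate is needed), the seed argument, and in particular the reading of short/long as the bounds of the central 90% of a Gaussian length distribution, which is only one possible interpretation of the sentence above.

```rust
/// Hypothetical parameter set mirroring the proposed genbench flags.
pub struct Params {
    pub binary: bool,    // --charset binary (true) or ascii (false)
    pub delim: u8,       // --delim
    pub number: usize,   // --number
    pub duplicates: f64, // --duplicates, as a fraction in 0.0..=1.0
    pub short: usize,    // --short
    pub long: usize,     // --long
}

/// SplitMix64: a tiny deterministic PRNG, used here only to avoid dependencies.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        Rng(seed)
    }
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
    /// Uniform float in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
    /// Standard normal sample via the Box-Muller transform.
    pub fn next_gaussian(&mut self) -> f64 {
        let u1 = 1.0 - self.next_f64(); // in (0, 1], avoids ln(0)
        let u2 = self.next_f64();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Assumed reading: `short` and `long` bound the central 90% of lengths.
fn sample_len(rng: &mut Rng, p: &Params) -> usize {
    const Z95: f64 = 1.6449; // 95th percentile of the standard normal
    let mean = (p.short + p.long) as f64 / 2.0;
    let sigma = (p.long as f64 - p.short as f64) / (2.0 * Z95);
    let len = mean + sigma * rng.next_gaussian();
    len.round().max(1.0) as usize // never emit an empty token
}

/// Draws one token that never contains the delimiter byte.
fn random_token(rng: &mut Rng, p: &Params) -> Vec<u8> {
    let len = sample_len(rng, p);
    let mut tok = Vec::with_capacity(len);
    while tok.len() < len {
        let b = if p.binary {
            (rng.next_u64() & 0xFF) as u8
        } else {
            0x20 + (rng.next_u64() % 95) as u8 // printable ASCII 0x20..=0x7E
        };
        if b != p.delim {
            tok.push(b);
        }
    }
    tok
}

/// Produces `number` delimiter-terminated tokens; roughly `duplicates`
/// of them repeat a token that was already emitted.
pub fn generate(p: &Params, seed: u64) -> Vec<u8> {
    let mut rng = Rng::new(seed);
    let mut tokens: Vec<Vec<u8>> = Vec::new();
    let mut out = Vec::new();
    for _ in 0..p.number {
        let repeat = !tokens.is_empty() && rng.next_f64() < p.duplicates;
        let idx = if repeat {
            (rng.next_u64() % tokens.len() as u64) as usize
        } else {
            tokens.push(random_token(&mut rng, p));
            tokens.len() - 1
        };
        out.extend_from_slice(&tokens[idx]);
        out.push(p.delim);
    }
    out
}
```

A fixed seed makes runs reproducible, which matters when the same input is fed to several implementations. Freshly drawn tokens can collide by chance, so the duplicate fraction is approximate rather than exact.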

For the actual benchmark we should write a benchmark executor that runs each of the implementations with a variety of parameters handed to genbench.
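
The executor could be sketched roughly as follows. This assumes a POSIX `sh` is available so that pipelines such as `sort | uniq` can be passed as a single string; the function names `time_pipeline` and `run_all` are made up for this example, and a real harness would likely want repeated runs and warm-up on top of this.

```rust
use std::io::Write;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `sh -c <pipeline>` with `input` on stdin and returns the
/// wall-clock time together with everything written to stdout.
pub fn time_pipeline(pipeline: &str, input: &[u8]) -> std::io::Result<(Duration, Vec<u8>)> {
    let start = Instant::now();
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(pipeline)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // Feed stdin from a separate thread: writing and reading on the same
    // thread can deadlock once both pipe buffers are full.
    let mut stdin = child.stdin.take().expect("stdin was requested as piped");
    let data = input.to_vec();
    let writer = thread::spawn(move || stdin.write_all(&data));

    let output = child.wait_with_output()?;
    let elapsed = start.elapsed();
    writer.join().expect("writer thread panicked")?;
    Ok((elapsed, output.stdout))
}

/// Times every pipeline against the same input.
pub fn run_all(pipelines: &[&str], input: &[u8]) -> std::io::Result<Vec<(String, Duration)>> {
    let mut results = Vec::new();
    for p in pipelines {
        let (elapsed, _stdout) = time_pipeline(p, input)?;
        results.push((p.to_string(), elapsed));
    }
    Ok(results)
}
```

Calling `run_all(&["sort | uniq", "huniq"], &data)` with generated data would then give one timing per implementation, assuming both commands are on the `PATH`. The measured time includes process start-up, which is probably acceptable for large inputs but skews results for tiny ones.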

Tests

We can reuse the same strategy for testing by generating test data with genbench and then comparing the output of the full huniq against that of a deliberately naive, unoptimized huniq implementation. We should specifically make sure that buffer growth is tested (supply some very long strings, >20 kB).
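
As a rough sketch, the naive reference could be little more than a `HashSet` over the input tokens. The helper names are hypothetical, and emitting tokens in first-seen order is an assumption about the expected output; `same_tokens_ignoring_order` is there for comparisons against `sort | uniq`, whose output is sorted.

```rust
use std::collections::HashSet;

/// Splits delimiter-terminated input into tokens, ignoring one trailing delimiter.
pub fn tokens(input: &[u8], delim: u8) -> Vec<&[u8]> {
    if input.is_empty() {
        return Vec::new();
    }
    let body = match input.last() {
        Some(&b) if b == delim => &input[..input.len() - 1],
        _ => input,
    };
    body.split(|&b| b == delim).collect()
}

/// Naive reference: emits each distinct token once, in first-seen order.
pub fn naive_uniq(input: &[u8], delim: u8) -> Vec<u8> {
    let mut seen: HashSet<&[u8]> = HashSet::new();
    let mut out = Vec::new();
    for tok in tokens(input, delim) {
        // `insert` returns true only the first time a token is seen.
        if seen.insert(tok) {
            out.extend_from_slice(tok);
            out.push(delim);
        }
    }
    out
}

/// Order-insensitive comparison of two outputs, token by token.
pub fn same_tokens_ignoring_order(a: &[u8], b: &[u8], delim: u8) -> bool {
    let mut x = tokens(a, delim);
    let mut y = tokens(b, delim);
    x.sort();
    y.sort();
    x == y
}
```

A test would feed the same generated input to the optimized binary and to `naive_uniq`, then assert that the outputs match. Including a few tokens well above 20 kB in that input exercises the buffer-growth path mentioned above.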

koraa mentioned this issue Mar 1, 2020

Building problems on macOS #5

Closed
