We have some benchmarks to see model performance using Ollama, and some profiling scripts to check CPU and memory usage.
You can run the tests as follows:
make test # unit tests
To measure the speed of Ollama models, we have a benchmark that runs a selection of models over a few prompts:
cargo run --release --example ollama
You can also benchmark these models using a larger task list at a given path, with the following command:
JSON_PATH="./path/to/your.json" cargo run --release --example ollama
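The way the benchmark picks up JSON_PATH is not shown here, but reading an optional environment variable like this is straightforward; a minimal sketch (the function name and the fallback behavior are assumptions, not the project's actual code):

```rust
use std::env;
use std::path::PathBuf;

/// Resolve the task-list path: use JSON_PATH if it is set,
/// otherwise signal that the built-in default prompts apply.
fn task_list_path() -> Option<PathBuf> {
    env::var("JSON_PATH").ok().map(PathBuf::from)
}

fn main() {
    match task_list_path() {
        Some(path) => println!("Benchmarking tasks from {}", path.display()),
        None => println!("Benchmarking with the default prompt set"),
    }
}
```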
We have scripts to profile both CPU and memory usage. For profiling, a special build is created via a custom profiling feature, such that the output inherits release mode but also has debug symbols.
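In Cargo, a build that inherits release mode but keeps debug symbols is typically expressed as a custom profile; a sketch of what such a configuration might look like (the profile name and settings are assumptions, not copied from the project's actual Cargo.toml):

```toml
# Cargo.toml (sketch): a profiling profile that inherits
# release optimizations but keeps debug symbols.
[profile.profiling]
inherits = "release"
debug = true
strip = false
```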
Furthermore, the profiling build exits automatically after a certain time, as if Ctrl+C had been pressed. This is required by the memory profiling tool in particular.
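The auto-exit behavior could be implemented with a watchdog thread; a minimal sketch under stated assumptions (the 120-second timeout is a placeholder, and in the real build the call would be gated on the profiling feature):

```rust
use std::{process, thread, time::Duration};

/// How long a profiling run lasts before auto-exiting; the project's
/// actual value is not documented here, 120 seconds is a placeholder.
fn profiling_timeout() -> Duration {
    Duration::from_secs(120)
}

/// Spawn a watchdog thread that terminates the process after the
/// timeout, mimicking a Ctrl+C press for the memory profiler.
fn spawn_exit_timer(timeout: Duration) {
    thread::spawn(move || {
        thread::sleep(timeout);
        process::exit(0); // clean exit so profiler data is flushed
    });
}

fn main() {
    // In the real build this would be compiled in only for the
    // profiling build, e.g. with #[cfg(feature = "profiling")].
    spawn_exit_timer(profiling_timeout());

    // ... run the application as usual ...
}
```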
To create a flamegraph of the application, do:
make profile-cpu
This will create a profiling build that inherits release mode, but with debug information included.
Note: CPU profiling may require super-user access.
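The profile-cpu target presumably wraps cargo-flamegraph, which explains the possible need for super-user access (perf/dtrace permissions); a sketch of what such a Makefile rule might look like (the flags and feature name are assumptions about this project):

```make
# Makefile (sketch): build with the profiling feature and record
# a flamegraph; cargo-flamegraph may need root for perf access.
profile-cpu:
	cargo flamegraph --features profiling --root
```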
To profile memory usage, we make use of cargo-instruments:
make profile-mem