twiggy
is a code size profiler.
It analyzes a binary's call graph to answer questions like:
-
Why was this function included in the binary in the first place?
-
What is the retained size of this function? I.e. how much space would be saved if I removed it and all the functions that become dead code after its removal.
Use twiggy
to make your binaries slim!
Ensure that you have the Rust toolchain installed, then run:
cargo install twiggy
Consider the following functions:
pub fn shred() {
gnar_gnar();
bluebird();
}
fn gnar_gnar() {
weather_report();
pow();
}
fn bluebird() {
weather_report();
}
fn weather_report() {
shred();
}
fn pow() {
fluffy();
soft();
}
fn fluffy() {}
fn soft() {}
pub fn baker() {
hood();
}
fn hood() {}
If we treat every function as a vertex in a graph, and if we add an edge from A to B if function A calls function B, then we get the following call graph:
If there is a path where A β B β ... β C through the call graph, then we say
that C is reachable through from A. Dead code is code that is not
reachable in the call graph from any publicly exported functions (for
libraries) or the main
function (for executables).
Imagine that shred
from the last example was our executable's main
function. In this scenario, there is no path through the call graph from shred
to baker
or hood
, so they are dead code. We would expect that the linker
would remove them, and they wouldn't show up in the final binary.
But what if some function that you thought was dead code is appearing inside your binary? Maybe it is deep down in some library you depend on, but inside a submodule of that library that you aren't using, and you wouldn't expect it to be included in the final binary.
In this scenario, twiggy
can show you all the paths in the call graph that
lead to the unexpected function. This lets you understand why the unwelcome
function is present, and decide what you can do about it. Maybe if you
refactored your code to avoid calling Y, then there wouldn't be any paths to
the unwelcome function anymore, it would be dead code, and the linker would
remove it.
Imagine the pow
function itself might is not very large. But it calls
functions soft
and fluffy
, both of which are huge. And they are both
only called by pow
, so if pow
were removed, then soft
and fluffy
would
both become dead code and get removed as well. Therefore, pow
's "real" size is
huge, even though it doesn't look like it at a glance. The dominator
relationship gives us a way to reason about the retained size of a function.
In a graph that is rooted at vertex R, vertex A is said to dominate vertex B if every path in the graph from R to B includes A. It follows that if A were removed from the graph, then B would become unreachable.
In our call graphs, the roots are the main
function (for executables) or
publicly exported functions (for libraries).
V is the immediate dominator of a vertex U if V != U, and there does not
exist another distinct vertex W that is dominated by V but also dominates
U. If we take all the vertices from a graph, remove the edges, and then add
edges for each immediate dominator relationship, then we get a tree. Here is the
dominator tree for our call graph from earlier, where shred
is the root:
Using the dominator relationship, we can find the retained size of some function by taking its shallow size and adding the retained sizes of each function that it immediately dominates.
Generic functions with type parameters in Rust and template functions in C++ can lead to code bloat if you aren't careful. Every time you instantiate these generic functions with a concrete set of types, the compiler will monomorphize the function, creating a copy of its body replacing its generic placeholders with the specific operations that apply to the concrete types. This presents many opportunities for compiler optimizations based on which particular concrete types each copy of the function is working with, but these copies add up quickly in terms of code size.
Example of monomorphization in Rust:
fn generic_function<T: MyTrait>(t: T) { ... }
// Each of these will generate a new copy of `generic_function`!
generic_function::<MyTraitImpl>(...);
generic_function::<AnotherMyTraitImpl>(...);
generic_function::<MyTraitImplAlso>(...);
Example of monomorphization in C++:
template<typename T>
void generic_function(T t) { ... }
// Each of these will also generate a new copy of `generic_function`!
generic_function<uint32_t>(...);
generic_function<bool>(...);
generic_function<MyClass>(...);
If you can afford the runtime cost of dynamic dispatch, then changing these functions to use trait objects in Rust or virtual methods in C++ can likely save a significant amounts of code size. With dynamic dispatch, the generic function's body is not copied, and the generic bits within the function become indirect function calls.
Example of dynamic dispatch in Rust:
fn generic_function(t: &MyTrait) { ... }
// or
fn generic_function(t: Box<MyTrait>) { ... }
// etc...
// No more code bloat!
let x = MyTraitImpl::new();
generic_function(&x);
let y = AnotherMyTraitImpl::new();
generic_function(&y);
let z = MyTraitImplAlso::new();
generic_function(&z);
Example of dynamic dispatch in C++:
class GenericBase {
public:
virtual void generic_impl() = 0;
};
class MyThing : public GenericBase {
public
virtual void generic_impl() override { ... }
};
class AnotherThing : public GenericBase {
public
virtual void generic_impl() override { ... }
};
class AlsoThing : public GenericBase {
public
virtual void generic_impl() override { ... }
};
void generic(GenericBase& thing) { ... }
// No more code bloat!
MyThing x;
generic(x);
AnotherThing y;
generic(y);
AlsoThing z;
generic(z);
twiggy
can analyze a binary to find which generic functions are being
monomorphized repeatedly, and calculate an estimation of how much code size
could be saved by switching from monomorphization to dynamic dispatch.
twiggy
is primarily a command line tool.
To get the most up-to-date usage for the version of twiggy
that you've
installed, you can always run:
twiggy --help
Or, to get more information about a sub-command, run:
twiggy subcmd --help
The twiggy top
sub-command summarizes and lists the top code size offenders in
a binary.
$ twiggy top path/to/wee_alloc.wasm
Shallow Bytes β Shallow % β Item
ββββββββββββββββΌββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
1034 β 36.71% β data[3]
774 β 27.48% β "function names" subsection
225 β 7.99% β wee_alloc::alloc_first_fit::h9a72de3af77ef93f
164 β 5.82% β hello
152 β 5.40% β wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e
136 β 4.83% β <wee_alloc::size_classes::SizeClassAllocPolicy<'a> as wee_alloc::AllocPolicy>::new_cell_for_free_list::h3987e3054b8224e6
76 β 2.70% β <wee_alloc::LargeAllocPolicy as wee_alloc::AllocPolicy>::new_cell_for_free_list::h8f071b7bce0301ba
44 β 1.56% β goodbye
The twiggy paths
sub-command finds the call paths to a function in the given
binary's call graph. This tells you what other functions are calling this
function, why this function is not dead code, and therefore why it wasn't
removed by the linker.
$ twiggy paths path/to/wee_alloc.wasm 'wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e'
Shallow Bytes β Shallow % β Retaining Paths
ββββββββββββββββΌββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
152 β 5.40% β wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e
β β β¬ func[2]
β β β¬ <wee_alloc::size_classes::SizeClassAllocPolicy<'a> as wee_alloc::AllocPolicy>::new_cell_for_free_list::h3987e3054b8224e6
β β β¬ func[5]
β β β¬ elem[0]
β β β¬ hello
β β β¬ func[8]
β β β¬ export "hello"
The twiggy monos
sub-command lists the generic function monomorphizations that
are contributing to code bloat.
$ twiggy monos path/to/input.wasm
Apprx. Bloat Bytes β Apprx. Bloat % β Bytes β % β Monomorphizations
βββββββββββββββββββββΌβββββββββββββββββΌββββββββΌββββββββΌββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
1977 β 3.40% β 3003 β 5.16% β alloc::slice::merge_sort
β β 1026 β 1.76% β alloc::slice::merge_sort::hb3d195f9800bdad6
β β 1026 β 1.76% β alloc::slice::merge_sort::hfcf2318d7dc71d03
β β 951 β 1.63% β alloc::slice::merge_sort::hcfca67f5c75a52ef
1302 β 2.24% β 3996 β 6.87% β <&'a T as core::fmt::Debug>::fmt
β β 2694 β 4.63% β <&'a T as core::fmt::Debug>::fmt::h1c27955d8de3ff17
β β 568 β 0.98% β <&'a T as core::fmt::Debug>::fmt::hea6a77c4dcddb7ac
β β 433 β 0.74% β <&'a T as core::fmt::Debug>::fmt::hfbacf6f5c9f53bb2
β β 301 β 0.52% β <&'a T as core::fmt::Debug>::fmt::h199e8e1c5752e6f1
The twiggy dominators
sub-command displays the dominator tree of a binary's
call graph.
$ twiggy dominators path/to/input.wasm
Retained Bytes β Retained % β Dominator Tree
βββββββββββββββββΌβββββββββββββΌββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
284691 β 47.92% β export "items_parse"
284677 β 47.91% β β€· func[17]
284676 β 47.91% β β€· items_parse
128344 β 21.60% β β€· func[47]
128343 β 21.60% β β€· twiggy_parser::wasm::<impl twiggy_parser::Parse<'a> for parity_wasm::elements::module::Module>::parse_items::h033e4aa1338b4363
98403 β 16.56% β β€· func[232]
98402 β 16.56% β β€· twiggy_ir::demangle::h7fb5cfffc912bc2f
34206 β 5.76% β β€· func[20]
34205 β 5.76% β β€· <parity_wasm::elements::section::Section as parity_wasm::elements::Deserialize>::deserialize::hdd814798147ca8dc
2855 β 0.48% β β€· func[552]
2854 β 0.48% β β€· <alloc::btree::map::BTreeMap<K, V>>::insert::he64f84697ccf122d
1868 β 0.31% β β€· func[53]
1867 β 0.31% β β€· twiggy_ir::ItemsBuilder::finish::h1b98f5cc4c80137d
The twiggy diff
sub-command computes the delta size of each item between old
and new versions of a binary.
$ twiggy diff path/to/old.wasm path/to/new.wasm
Delta Bytes β Item
ββββββββββββββΌββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
-1476 β <total>
-1034 β data[3]
-593 β "function names" subsection
+395 β wee_alloc::alloc_first_fit::he2a4ddf96981c0ce
+243 β goodbye
-225 β wee_alloc::alloc_first_fit::h9a72de3af77ef93f
-152 β wee_alloc::alloc_with_refill::hb32c1bbce9ebda8e
+145 β <wee_alloc::neighbors::Neighbors<'a, T>>::remove::hc9e5d4284e8233b8
The twiggy garbage
sub-command finds and displays dead code and data that is
not transitively referenced by any exports or public functions.
$ twiggy garbage path/to/input.wasm
Bytes β Size % β Garbage Item
ββββββββΌβββββββββΌββββββββββββββββββββββ
11 β 5.58% β unusedAddThreeNumbers
8 β 4.06% β unusedAddOne
7 β 3.55% β type[2]
5 β 2.54% β type[1]
5 β 2.54% β unusedChild
4 β 2.03% β type[0]
1 β 0.51% β func[0]
1 β 0.51% β func[1]
1 β 0.51% β func[2]
twiggy
is divided into a collection of crates that you can use
programmatically, but no long-term stability is promised. We will follow semver
as best as we can, but will err on the side of being more conservative with
breaking version bumps than might be strictly necessary.
Here is a simple example:
extern crate twiggy_analyze;
extern crate twiggy_opt;
extern crate twiggy_parser;
use std::fs;
use std::io;
fn main() {
let mut file = fs::File::open("path/to/some/binary").unwrap();
let mut data = vec![];
file.read_to_end(&mut data).unwrap();
let items = twiggy_parser::parse(&data).unwrap();
let options = twiggy_opt::Top::default();
let top = twiggy_analyze::top(&mut items, &options).unwrap();
let mut stdout = io::stdout();
top.emit_text(&items, &mut stdout).unwrap();
}
For a more in-depth example, take a look at is the implementation of the
twiggy
CLI crate.
First, ensure you have the wasm32-unknown-unknown
Rust target installed and
up-to-date:
rustup install nightly
rustup update nightly
rustup target add wasm32-unknown-unknown --toolchain nightly
Next, install wasm-bindgen
:
cargo +nightly install wasm-bindgen-cli
Finally, build twiggy
's WebAssembly API with wasm-bindgen
:
cd twiggy/wasm-api
cargo +nightly build --release --target wasm32-unknown-unknown
wasm-bindgen ../target/wasm32-unknown-unknown/release/twiggy_wasm_api.wasm --out-dir .
This should produce two artifacts in the current directory:
twiggy_wasm_api_bg.wasm
: The WebAssembly file containingtwiggy
.twiggy_wasm_api.js
: The JavaScript bindings totwiggy
's WebAssembly.
You can now use twiggy
from JavaScript like this:
import { Items, Monos } from './twiggy_wasm_api';
// Parse a binary's data into a collection of items.
const items = Items.parse(myData);
// Configure an analysis and its options.
const opts = Monos.new();
opts.set_max_generics(10);
opts.set_max_monos(10);
// Run the analysis on the parsed items.
const monos = JSON.parse(items.monos(opts));
twiggy
currently supports these binary formats:
- βοΈ WebAssembly's
.wasm
format
twiggy
doesn't support these binary formats (yet!):
- β ELF
- β Mach-O
- β PE/COFF
Although twiggy
doesn't currently support these binary formats, it is designed
with extensibility in mind. The input is translated into a format-agnostic
internal representation (IR), and adding support for new formats only requires
parsing them into this IR. The vast majority of twiggy
will not need
modification.
We would love to gain support for new binary formats, and if you're interested
in doing that implementation work,
check out CONTRIBUTING.md
!
See CONTRIBUTING.md for hacking.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.