Releases: lucidrains/mixture-of-experts
0.2.3
0.2.2
weighting is already done when computing combine_tensor
0.2.1
0.2.0
make sure moe works with reversible networks for routing transformer
0.1.1
fix initialization of experts
0.1.0
default to GELU activation
0.0.4
bump for release
0.0.3
add ability to pass in custom experts
0.0.2
complete first pass of hierarchical mixture of experts (2 levels) as …
0.0.1
update readme