Some researchers let the Optimizers
maintain intermediate variables,
for example the weights
in the proximal term, or the gradient cache
for variance reduction.
In contrast, we let the Algorithms
maintain such variables and pass them to the relevant Optimizers as input parameters.
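A minimal sketch of this design choice, with hypothetical class and parameter names (not the repository's actual API): the Algorithm owns the proximal center, and the Optimizer receives it as an argument to `step` instead of storing it.

```python
import numpy as np

class ProxSGD:
    """Hypothetical proximal SGD optimizer: the proximal center is NOT
    stored by the optimizer; the Algorithm passes it in at each step."""

    def __init__(self, lr: float, mu: float):
        self.lr = lr  # learning rate
        self.mu = mu  # proximal coefficient

    def step(self, params: np.ndarray, grad: np.ndarray,
             prox_center: np.ndarray) -> np.ndarray:
        # Gradient step on loss + (mu / 2) * ||params - prox_center||^2
        return params - self.lr * (grad + self.mu * (params - prox_center))

class Algorithm:
    """The Algorithm maintains the intermediate variable (here the
    proximal center) and feeds it to the optimizer as a parameter."""

    def __init__(self, init_params: np.ndarray):
        self.global_params = init_params.copy()  # kept here, not in the optimizer
        self.optimizer = ProxSGD(lr=0.1, mu=0.01)

    def local_update(self, params: np.ndarray, grad: np.ndarray) -> np.ndarray:
        return self.optimizer.step(params, grad,
                                   prox_center=self.global_params)
```

Keeping the optimizer stateless with respect to such variables means one optimizer instance can serve several algorithms that manage their own proximal terms or gradient caches.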
Most (inner) optimizers are based on the ProxSGD optimizer.
Other (inner) optimizers include:
- FedPD Optimizers (not checked yet)