v0.4.1
Version v0.4.1 is a stable release that brings stability fixes, API improvements, and code optimizations.
Changes Since v0.4.0
API Improvements
- Add modelPath, description, and imageTag fields to the Model/ModelVersion specification.
- Introduce CacheBackend to integrate with cloud-native distributed cache systems for training jobs.
- Introduce Notebook to enable Jupyter virtual environments.
Workloads
- Fix bugs in the MPIJob implementation.
- Fix bugs in Cron scheduling.
Runtime & Dashboard
- Improve error and stack-trace messages.
- Support the Volcano gang scheduler protocol.
- Remove authentication from the dashboard backend.
- Set TerminationMessageFallbackToLogsOnError as default termination policy.
- Scale in extra pods/services when the expected replica count decreases.
- Refactor to improve code reusability and robustness.
- Support failover based on failure reasons.
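
To illustrate the scale-in item above, here is a minimal Python sketch of how extra replicas might be selected when the expected replica count drops. It is not the project's implementation: the function name pods_to_scale_in and the pod naming scheme <job>-<replica-type>-<index> are assumptions made for this example only.

```python
import re


def pods_to_scale_in(pod_names, expected_replicas):
    """Return pod names whose replica index is >= expected_replicas.

    Hypothetical helper: assumes each pod name ends in "-<index>".
    """
    extra = []
    for name in pod_names:
        match = re.search(r"-(\d+)$", name)
        if match is None:
            continue  # not an indexed replica; leave it alone
        if int(match.group(1)) >= expected_replicas:
            extra.append(name)
    # Remove the highest indices first so the remaining set stays contiguous.
    return sorted(extra, key=lambda n: int(n.rsplit("-", 1)[1]), reverse=True)


pods = ["job-worker-0", "job-worker-1", "job-worker-2", "job-worker-3"]
print(pods_to_scale_in(pods, 2))
```

With four workers and an expected count of two, the sketch selects the workers at index 3 and 2 for removal; the same selection would apply to the per-replica services.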