This repository has been archived by the owner on May 8, 2024. It is now read-only.

Releases: douban/dpark

Release 0.5.0

27 Jul 04:05

API change

  • Remove module-level APIs such as dpark.textFile.
  • Support Streaming shuffle and Disk shuffle (Experimental, compatible).
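
As a hedged illustration of the first item: code that previously called module-level helpers such as dpark.textFile would instead go through a context object. The sketch below follows the usage pattern shown in the project README (DparkContext, textFile, flatMap, map, reduceByKey, collectAsMap). The function name and the input path are placeholders, and the import is kept inside the function so the snippet can be loaded on a machine without DPark installed.

```python
def word_count(path):
    """Count words in a text file through a DparkContext instead of module-level calls."""
    # Imported lazily: requires the dpark package to be installed.
    from dpark import DparkContext

    ctx = DparkContext()
    lines = ctx.textFile(path)  # replaces the removed module-level dpark.textFile(path)
    pairs = lines.flatMap(lambda line: line.split()).map(lambda word: (word, 1))
    return pairs.reduceByKey(lambda a, b: a + b).collectAsMap()


# Example call (placeholder path, assumed to exist on the machine running the job):
# counts = word_count("/tmp/words.txt")
```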

Fixes

  • Fix a bug when parsing MooseFS (mfs) chunk info.

Improvement

  • Better broadcast implementation: tasks on the same slave use shared memory, which reduces memory cost.
  • Better offer-matching logic for MesosScheduler, which now remembers bad slaves.
  • Refactor: style and layout.

New Feature

  • Multi segment dump to save memory.
  • Gather statistics for each stage.
  • Support running tests/test_rdd on Mesos.
  • Add colorful progress bar for dpark.
  • Support Mesos roles.
  • Support multiple named Mesos masters in conf.
  • Loghub for admin.

Release 0.4.2

19 Mar 10:35
54f1edd
  • Support Python3 & PyPy
  • Support MooseFS 3.x & refactor on file-system interface

Release 0.4.1

09 Mar 07:09
  • Enhancement for the containerizer in DPark
  • Use broadcast when parallelizing a big dataset
  • Fix missing-line bug for bzip2 files
  • Add TopByKey in RDD
  • Fix other minor bugs
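
For readers unfamiliar with the idea behind TopByKey, the following plain-Python sketch shows what a per-key top-N selection computes. It is not DPark's implementation and does not use the DPark API; the function name top_by_key and its parameters are hypothetical, and the semantics (keep the n largest values per key, in descending order) are an assumption.

```python
import heapq
from collections import defaultdict


def top_by_key(pairs, n):
    """Return {key: [n largest values, descending]} for an iterable of (key, value) pairs."""
    if n <= 0:
        raise ValueError("n must be positive")
    heaps = defaultdict(list)  # one bounded min-heap per key
    for key, value in pairs:
        heap = heaps[key]
        if len(heap) < n:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the smallest kept value, so swap it in.
            heapq.heapreplace(heap, value)
    return {key: sorted(heap, reverse=True) for key, heap in heaps.items()}


pairs = [("a", 3), ("b", 7), ("a", 9), ("a", 1), ("b", 2)]
result = top_by_key(pairs, 2)
```

Keeping a bounded heap per key means memory grows with the number of keys times n rather than with the number of values.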

Release 0.4.0

06 Dec 02:48
  • Bugfix: deserialization error for old-style classes.
  • Refactor beansdb RDD
  • Web UI support for dpark
  • Use pymesos >= 0.2.0
  • Eagerly serialize values of ParallelCollection