Releases · ARM-software/ComputeLibrary

19 Dec 11:46

developer-compute

v24.12

32bcced

v24.12 Latest


v24.12 Public Major Release

Feat

Add a build flag to make scheduler object thread_local and make it default in Bazel build

Fix

CPU regression in Reshape from excess threads
NEDeconvolutionLayer regression
Ensure bias type is BF16 for BF16 indirect convolutions

Perf

Disable mmul kernel selection for fp16 in GPU backend

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.12/index.xhtml


02 Dec 17:46

developer-compute

v24.11.1

1f3bf6b

v24.11.1

v24.11.1 Public Minor Release

Feat

Add stateless GEMM execution via ICPPKernel::run_op
TensorShape class supports dynamic shapes
Add skeletons for Dynamic GEMM operator
Convert Double rounding to Single rounding quantization behaviour in both Cpu/Gpu backend
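As an illustration of why the double-to-single rounding change above can alter quantized results, here is a minimal sketch. It is not the Compute Library implementation: the function names are hypothetical, and the scaling is modelled as plain rounding right shifts on non-negative integers. Rounding in two stages can land on a different integer than rounding once over the combined shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round-to-nearest (ties up) right shift.
// Intended for small, non-negative demo values only.
inline std::int64_t rounding_shift_right(std::int64_t value, int shift)
{
    assert(value >= 0);
    assert(shift >= 1 && shift < 32);
    return (value + (std::int64_t{1} << (shift - 1))) >> shift;
}

// "Double rounding": round after the first shift, then round again after the second.
inline std::int64_t scale_double_rounding(std::int64_t value, int first_shift, int second_shift)
{
    return rounding_shift_right(rounding_shift_right(value, first_shift), second_shift);
}

// "Single rounding": apply the combined shift and round exactly once.
inline std::int64_t scale_single_rounding(std::int64_t value, int first_shift, int second_shift)
{
    return rounding_shift_right(value, first_shift + second_shift);
}
```

For example, scaling 5 by 1/4 (two shifts of 1) gives 1.25: single rounding yields 1, while double rounding first produces 3 (from 2.5) and then 2 (from 1.5).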

Fix

Detect Advanced SIMD support on Windows®

Perf

Implement activation heuristics for Neoverse™ V1
Optimize PReLU on quantized datatypes

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.11.1/index.xhtml


18 Nov 11:53

developer-compute

v24.11

f44f09d

v24.11

v24.11 Public Major Release

Feat

Add SVE SoftmaxLayer kernel for BF16
Provide stateless API for CpuGemmLowpMatrixMultiplyCore, CpuQuantize, and DequantizationLayer
Extend static quantization interface for both matmul and convolution operations

Fix

Clarify Third-Party IP licenses
Check if CpuGemmAssemblyDispatch is configured in CpuMatMul before continuing
Add BF16 support for CpuGemmAssemblyDispatchWrapper
Detect SVE support on Windows® to run the available kernels
Add missing cstdint include, needed when building with GCC 15
Disable -O2 when building for Windows® as this crashes when certain compiler versions are used
Make cast on CPU truncate float to int instead of round to be consistent with other ML frameworks
Return error in validate() for CpuGemmLowpMatrixMultiplyCore if pretransposed A or B are true as this is not supported
Avoid implicit conversion from __fp16 to arm_compute::bfloat16 to avoid illegal instructions in hardware with FP16 but no BF16 support
Softmax SME2 kernel selection now correctly detects if SME2 is supported
Requantization rounding issues in CPU/GPU Quantize
Scale normalising coefficient in GPU LogSoftmax
Apply consistent rounding policy in NEReduceMean
Revert default memory manager for NEQLSTMLayer
Create default memory manager when none is provided
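The float-to-int cast fix listed above changes observable results for non-integral inputs. The sketch below uses standard C++ only, not Compute Library calls, and the function names are hypothetical. It contrasts truncation toward zero with one possible rounding mode (round half away from zero via `std::lround`); the rounding mode the library used previously is not stated in these notes.

```cpp
#include <cmath>
#include <cstdint>

// Truncating cast: discards the fractional part (rounds toward zero).
// Behaviour is undefined if the value does not fit in int32_t.
inline std::int32_t cast_truncate(float value)
{
    return static_cast<std::int32_t>(value);
}

// Rounding cast: nearest integer, halfway cases away from zero.
// Shown only for contrast; this is an assumed rounding mode.
inline std::int32_t cast_round(float value)
{
    return static_cast<std::int32_t>(std::lround(value));
}
```

With these definitions, 2.7f becomes 2 when truncated and 3 when rounded, and -2.7f becomes -2 when truncated and -3 when rounded.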

Refactor

Turn duplicated code in the elementwise_binary kernel into templates to reduce code size
Move CpuSoftmaxKernel LUT to LUTManager to consolidate location of all LUTs

Perf

Use SME instead of SVE for subtractions in SoftmaxLayer for Q8 relating to LUT address calculation

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.11/index.xhtml


27 Sep 13:56

developer-compute

v24.09

c61bd33

v24.09

v24.09 Public Major Release

Feat

Provide a wrapper class to expose cpu::CpuSoftmaxGeneric
Detect number of cores in Windows®
Add Optimized SME kernel for QASYMM8_SIGNED elementwise addition operation

Fix

LogSoftmax Int8/UInt8 mismatches in Cpu
Rounding of negative integers in pooling 2d/3d gpu kernels
OpenMP® linker error on Windows®
Rounding of negative integers in pooling 2d/3d kernels
Linker failure for cpu::CpuSoftmaxGeneric in partial builds
Cpu/Gpu Reverse data type support
QSYMM16 broadcasted subtraction failures
CpuMulKernel validation when there is x-broadcasting for some types
Data type validation in depthwise op in Cpu
Update macOS® build instructions
Validation tests compute reference and target on each iteration
Reset permuted input and weights on configure in NEDepthwiseConvolutionLayer
Selectively enable CL job chaining
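The pooling fixes above concern rounding of negative integers. These notes do not describe the root cause, so the following is only a sketch of a common pitfall in integer average pooling, with hypothetical function names. C++ integer division truncates toward zero, so adding half the divisor before dividing rounds correctly for non-negative sums but not for negative ones.

```cpp
#include <cassert>
#include <cstdint>

// Naive rounding average: correct for sum >= 0, biased toward zero for sum < 0,
// because integer division truncates toward zero.
inline std::int32_t average_naive(std::int32_t sum, std::int32_t count)
{
    assert(count > 0);
    return (sum + count / 2) / count;
}

// Sign-aware rounding average: moves half the divisor away from zero
// before dividing, so negative sums round to the nearest integer too.
inline std::int32_t average_round_nearest(std::int32_t sum, std::int32_t count)
{
    assert(count > 0);
    const std::int32_t half = count / 2;
    return sum >= 0 ? (sum + half) / count : (sum - half) / count;
}
```

For a window sum of -7 over 4 elements (exact mean -1.75), the naive version returns -1 while the sign-aware version returns -2.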

Refactor

Generate only one shared library when building with CMake
Add BF16 LUT for Softmax Layer with tests
Move heuristic logic of activation kernel into separate class
Remove unused CommandBuffer

Perf

Allocate Persistent and Prepare tensors at start of prepare()
Use mws in OMPScheduler for better thread throttling
Enable FP16 winograd in CpuConv2d for v8a multi_isa builds.

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.09/index.xhtml


28 Aug 12:51

developer-compute

v24.08.1

de7288c

v24.08.1

v24.08.1 Public Patch Release

Fix

Change inheritance qualifiers of experimental Cpu operator interface classes to public for cpu-wrappers.
Mismatches in static quantization updated after configure tests
CpuSoftmax configure ignores is_log on validation
Linker errors in armv8.2a Windows® builds

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.08.1/index.xhtml


16 Aug 16:16

developer-compute

v24.08

f1929dc

v24.08

v24.08 Public Major Release

Feature

Expose CpuAdd functionality using the experimental operators api
Expose CpuDepthwiseConv2d functionality using the experimental operators api
Expose CpuElementwiseDivision functionality using the experimental operators api
Expose CpuElementwiseMax functionality using the experimental operators api
Expose CpuElementwiseMin functionality using the experimental operators api
Expose CpuGemmAssemblyDispatch functionality using the experimental operators low-level api
Expose CpuMul functionality using the experimental operators api
Expose CpuSub functionality using the experimental operators api

Performance

Solve performance issue on Arm® Mali™-G78

Fix

Illegal instruction in multi_isa armv8a
Set num_threads in ThreadInfo correctly in OMPScheduler
AlexNet graph example giving incorrect results

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.08/index.xhtml


26 Jul 21:03

developer-compute

v24.07

c5dd775

v24.07

Public major release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.07


18 Jun 18:46

ramelg01

v24.06

505adb9

v24.06

Public minor release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.06


30 May 15:18

mk-arm

v24.05

a53ffdc

v24.05

Public major release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.05


02 May 09:05

mk-arm

v24.04

4fda7a8

v24.04

Public major release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.04

