This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
preparation for 0.10.0 - changelog and release notes #889
jorgecarleitao started this conversation in General
Replies: 1 comment
Released https://github.com/jorgecarleitao/arrow2/releases/tag/v0.10.0
Hi, I have generated the change log since 0.9.2 and am preparing an overview of what this release brings us.
Below is the current draft. Suggestions and ideas to improve it are welcome!
Arrow2 0.10 is out!
Continuing to break new ground, this is one of the most feature-rich releases of this crate so far!
Thank you to everyone for the impressive work over the past 2 months that makes arrow2 so feature-rich, safe, fast, and easy to use! 🙇
Here are the main headlines:
Copy on Write
So far, whenever we applied a transformation to an array, we had to create a new array. When multiple operations were chained (e.g. `c1 x 2 + 1`), each step allocated a new array. This was identified by @sundy-li on #741 and addressed by @ritchie46 on #794.
Users can now re-use `Arc`ed arrays, just like `std::sync::Arc::get_mut`. As expected, if the array is being used in multiple places, it returns `None` and users need to allocate a new region (exclusive mutability).
This is being used in Polars to further re-use allocated regions and therefore reduce both memory pressure and compute cycles wasted allocating new regions.
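To make the idea concrete, here is a minimal sketch of the copy-on-write pattern using only the standard library. It is not arrow2's API: `apply_cow` is a hypothetical helper, and a plain `Arc<Vec<i32>>` stands in for an array. It only illustrates how `std::sync::Arc::get_mut` lets an operation mutate in place when the data is uniquely owned and fall back to a fresh allocation when it is shared.

```rust
use std::sync::Arc;

/// Hypothetical helper (not part of arrow2): applies `op` to every value.
/// If the `Arc` is uniquely owned, the existing allocation is mutated in place;
/// otherwise a new vector is allocated and the shared one is left untouched.
fn apply_cow<F: Fn(i32) -> i32>(mut values: Arc<Vec<i32>>, op: F) -> Arc<Vec<i32>> {
    if let Some(unique) = Arc::get_mut(&mut values) {
        // Exclusive access: no other `Arc` points at this data, so reuse it.
        for v in unique.iter_mut() {
            *v = op(*v);
        }
        return values;
    }
    // Shared: `get_mut` returned `None`, so we must allocate a new region.
    Arc::new(values.iter().map(|&v| op(v)).collect())
}
```

With this helper, an expression like `c1 x 2 + 1` can be written as `apply_cow(apply_cow(c1, |v| v * 2), |v| v + 1)`; when `c1` is not shared, both steps run over the same allocation.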
Support for ODBC
This release now supports reading from, and writing to, any ODBC driver.
This builds on top of the superb odbc-api created by @pacman82, which allows this crate to use the columnar format provided by the ODBC specification.
Given a performant ODBC driver, this is expected to be the fastest way to load data into the Arrow format, as many operations are simple memory copies.
Check out the example and guide for details on how to use it!
`async` support for writing to Arrow's IPC
Until now, we had limited support for writing to Arrow IPC asynchronously. @dexterduck closed this gap on #878, offering complete `async` support for both Arrow files and Arrow streams, including implementations of `futures::Stream` and `futures::Sink` for them!
Migrated to `std::simd`
After some back and forth with the working group of the portable SIMD project, this release replaces `packed_simd2` with `std::simd`. This resulted in no performance difference but allows us to leverage the great work that is happening on `std::simd`.
Serde support for metadata
A common pain point in using arrow2's logical types is that they are quite complex, which sometimes makes them difficult to visualise or represent in e.g. JSON. @houqp closed this with #858, which adds compatibility with Serde for schema-related structs in this crate (`PhysicalType`, `DataType`, `Field`, `Schema`).
Support for Arrow C stream interface
Arrow has an experimental specification for an FFI interface to iterators of Arrow arrays. This release now fully supports this interface.
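For readers unfamiliar with the interface, the sketch below mirrors the `ArrowArrayStream` struct from the Arrow C stream interface specification as a `#[repr(C)]` Rust type. It is an illustration only and not arrow2's actual definition (the crate's own type names and helpers may differ). `ArrowSchema` and `ArrowArray` are left as opaque placeholders here, and `empty` and `is_released` are hypothetical helpers added for the example.

```rust
use std::os::raw::{c_char, c_int, c_void};

/// Opaque placeholders for the C data interface structs (fields omitted in this sketch).
#[repr(C)]
pub struct ArrowSchema {
    _private: [u8; 0],
}

#[repr(C)]
pub struct ArrowArray {
    _private: [u8; 0],
}

/// Rust mirror of the C `ArrowArrayStream` struct: four callbacks plus producer-owned data.
#[repr(C)]
pub struct ArrowArrayStream {
    /// Writes the schema shared by all arrays of the stream; returns 0 on success.
    pub get_schema: Option<unsafe extern "C" fn(*mut ArrowArrayStream, *mut ArrowSchema) -> c_int>,
    /// Writes the next array; the end of the stream is signalled by a released array.
    pub get_next: Option<unsafe extern "C" fn(*mut ArrowArrayStream, *mut ArrowArray) -> c_int>,
    /// Returns a description of the last error, or null.
    pub get_last_error: Option<unsafe extern "C" fn(*mut ArrowArrayStream) -> *const c_char>,
    /// Frees producer resources; a null callback marks the stream as released.
    pub release: Option<unsafe extern "C" fn(*mut ArrowArrayStream)>,
    /// Opaque pointer owned by the producer.
    pub private_data: *mut c_void,
}

impl ArrowArrayStream {
    /// Hypothetical helper: a stream with no callbacks, i.e. already released.
    pub fn empty() -> Self {
        Self {
            get_schema: None,
            get_next: None,
            get_last_error: None,
            release: None,
            private_data: std::ptr::null_mut(),
        }
    }

    /// Hypothetical helper: a stream counts as released once `release` is null.
    pub fn is_released(&self) -> bool {
        self.release.is_none()
    }
}
```

Using `Option` around each `extern "C"` function pointer keeps the layout identical to a nullable C function pointer, so the struct can cross the FFI boundary unchanged.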
Made crate `deny(missing_docs)`
This makes us developers more conscious about documenting APIs, thereby giving users more context about them. We have also started documenting whether IO-related APIs are CPU-bound or IO-bound, so that users know which ones block `async` contexts.
v0.10.0 (2022-03-06)
Full Changelog
Breaking changes:
- `Ffi_ArrowArray` and `Ffi_ArrowSchema` #859
New features:
- `futures::Sink` for parquet async writer #877 (dexterduck)
- `try_new` and `new` to all arrays #873 (jorgecarleitao)
- `Decimal128` to Avro #837 (potter420)
- `LargeUtf8` and `LargeBinary` to Avro #828 (illumination-k)
- `BooleanArray::from_trusted_len_values_iter_unchecked` #799 (ritchie46)
- `MutableUtf8Array::extend_values` #798 (ritchie46)
- `Buffer`, `Bitmap` and some arrays #794 (ritchie46)
Fixed bugs:
- `None` as ipc_fields in flight API #780 (jorgecarleitao)
Enhancements:
- `FixedSizeBinaryScalar` #782
- `MutableFixedList::mut_values` #886 (jorgecarleitao)
- `try_new` #879 (jorgecarleitao)
- `ListValuesIter` #874 (ritchie46)
- `into_mut` implementations #801 (ritchie46)
- `FixedSizeListScalar` and `FixedSizeBinaryScalar` #786 (illumination-k)
Documentation updates:
- `deny(missing_docs)` #808 (jorgecarleitao)
- `Bitmap::set_bit` #802 (yjshen)
- `dyn Array::slice` docstring #792 (ritchie46)
Testing updates:
- `unsafe` #843 (jorgecarleitao)