Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impl Copy for GcVec #30

Closed
Techcable opened this issue Jul 6, 2021 · 0 comments
Closed

impl Copy for GcVec #30

Techcable opened this issue Jul 6, 2021 · 0 comments
Milestone

Comments

@Techcable
Copy link
Member

Techcable commented Jul 6, 2021

This basically boils down to iterator invalidation. If we have two references to a vector,
a GcVec::push on one may mutate the vector's length while the user coerced the other reference to a slice (or iterator).

To put it another way, it's impossible for a garbage collected vector/list to implement all of the following properties:

  1. Coerce the vector into a slice that directly points to the underlying memory
  2. Allow multiple references to the vector (implement Copy)
  3. The ability to mutate the length and potentially re-allocate
  4. Keep capacity immutable, and have the area [len, capacity) uninitialized (as opposed to initialized with null)

This is a problem common to all garbage-collected languages, not just zerogc (as I describe below).

What Java/Python do

Most languages sidestep the problem by forbidding option 1 and refusing to give direct pointer-access to garbage-collected arrays.
Usually, forbidding option 1 is not a problem, since arrays in Java and lists in Python are builtin into the language and replace most of the users of pointers.

There are some exceptions though. Java's JNI allows raw access to an arrays's memory with GetPrimitiveArrayCritical. However, there is no need to worry about mutating length because java arrays have a fixed size (although it does temporarily prevent garbage collection). Python's native CAPI exposes access to a list's raw memory with PyListObject->ob_items, although native code must be careful of any user code changing .

This problem can also come up in unexpected places. For example, the C implementation of Python's list.sort coerces the list into a raw pointer, then sorts the memory directly using the pointers. The original code continued to use the original pointer even if a user comparison, leading to bugs. The current implementation temporarily sets the length to zero and carefully guards against any mutations.

What to do

If we took the Java/Python approach of refusing to give out raw pointers, it would also prevent access as a slice.

In idiomatic rust, this is unrealistic, since slices are so pervasive.
As a temporary solution, I ended up forbidding option 2.

However, doesn't work well if you're trying to use GcVec as the underlying implementation of Java arrays or Python lists. These languages allow users to create multiple references to an array or a list. In our current system, this would require double-indirection of Gc<GcVec<T>>.

SIDENOTE: Right now GcVec

Instead, I propose a new system of four types and two traits (which will incidentally solve the write-barrier problem):

  • GcRawVec - Allows multiple references, but has unsafe slice or pointer access
    • This will even allow mutable slices, as long as the user correctly triggers write barriers.
  • GcVec - Effectively a RefCell<GcRawVec<T>>. Includes safe slice access.
  • GcVecBuffer - A GcRawVec that is !Copy, but gains safe slice access.
  • trait GcVecAccess - Abstracts over different forms of GcVec. Calls to push require a GcContext. Requires the user to pass a context
  • trait GcArrayAccess - Allows read/write access to arrays, without mutating their length.
    • Implemented for GcArray in addition to GcVec

TODO: Better naming for GcVec and friends?

Techcable added a commit that referenced this issue Jul 6, 2021
Merges #29 
Closes #16

Right now, access to arrays doesn't trigger write barriers. I need to implement that before I can work on generational
or incremental collection......

The vector API is a little rough around the edges
because `GcVec` doesn't implement `Copy` as all the other `Gc` types do.
See #30 for more details
@Techcable Techcable added this to the 0.2.0 milestone Jul 6, 2021
@Techcable Techcable changed the title , allowing Garbage Collected Vectors to implement Copy impl Copy for GcVec Jul 6, 2021
@Techcable Techcable pinned this issue Jul 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant