-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand on "Goroutine safety" section of README #242
Comments
I took a look at the (*Bitmap).Or method for example and any operations that occur on length2 := x2.highlowcontainer.size() The call to size is just a func (ra *roaringArray) size() int {
return len(ra.keys)
} Next up is: s2 := x2.highlowcontainer.getKeyAtIndex(pos2) getKeyAtIndex just reads a value from the func (ra *roaringArray) getKeyAtIndex(i int) uint16 {
return ra.keys[i]
} That call happens a few more times with different indexes, they should still be safe. rb.highlowcontainer.insertNewKeyValueAt(pos1, s2, x2.highlowcontainer.getContainerAtIndex(pos2).clone()) The getContainerAtIndex call again is just reading from a slice, should be safe: func (ra *roaringArray) getContainerAtIndex(i int) container {
return ra.containers[i]
} The func (bc *bitmapContainer) clone() container {
ptr := bitmapContainer{bc.cardinality, make([]uint64, len(bc.bitmap))}
copy(ptr.bitmap, bc.bitmap[:])
return &ptr
} func (ac *arrayContainer) clone() container {
ptr := arrayContainer{make([]uint16, len(ac.content))}
copy(ptr.content, ac.content[:])
return &ptr
} func newRunContainer16CopyIv(iv []interval16) *runContainer16 {
rc := &runContainer16{
iv: make([]interval16, len(iv)),
}
copy(rc.iv, iv)
return rc
}
func (rc *runContainer16) clone() container {
return newRunContainer16CopyIv(rc.iv)
} All of these operations are just calls to |
Perhaps the easiest solution here would be to verify which of the package level functions are safe for concurrent use on a bitmap:
Then we can extrapolate that the receiver versions of them are safe as well for the bitmap passed in as a parameter (obviously they are not safe for the receiver itself) |
This is a good idea. I would expect that all operations that generate a new bitmap do not modify the input bitmaps. We need to check the effect of copy-and-write. How do you propose we go forward? |
@lemire do you expect that to be guaranteed moving forward (excluding major version bumps)? If so I think it probably makes more sense to add it to the godoc for each method explicitly mentioning which arguments are safe for concurrent reads. If not, I think the README is a more appropriate place with a nice disclaimer that this won't always be true but is as of today. Once we have a list of verified safe methods I can send over a PR for either option. As far as copy-on-write I was thinking about that as well as a good starting point. Does this sounds reasonable:
Given those, we can build a blacklist of "unsafe" methods based on any method that checks the value of |
@bored-engineer The godoc approach sounds good. The README should be modified to point the reader to the godoc. I would qualify safe/unsafe with respect to the value of copyOnWrite. A call to Clone, if you do not use copy-on-write, is just a full copy and thus entirely thread safe/concurrent. It seems to me that there are two very distinct use cases. There are people relying on copy-on-write and people who do not. The people who do use copy-on-write need extra care. In some sense, though copy-on-write has some benefits, it does incur some extra complexity, and I think that the users should be made aware of this if they want performance. |
Sounds good, will do that once we have a finalized list of methods.
Let me restate this just to make sure I understand it: We should update the docs to indicate which methods are safe for concurrent reads, except with a special note that if
Finally, it wasn't clear to me if this was a statement that the methods are already safe and we don't need to do any validation, or if you're saying that's how you would expect them to work and we should still verify before updating any docs saying they are. |
I think we are in agreement. I was just stating my expectation: we should verify. However, there is only so much state we can change when taking bitmaps as inputs and returning a new bitmap. I can’t think of any except for issues with copy-on-write. |
Another option would be to create a dedicated type or new side-effect free functions. Anyhow, documentation would be a step forward. |
My general understanding is that the safety of a Bitmap basically operates under Rowlock semantics. Non-mutating reads can happen concurrently, but any updates need exclusive access, as they can corrupt assumptions. For instance, if you remove a container after length1 is calculated in and/or/not it will panic |
Is sync.rwmutex safe to use? Contains will not mutate the underlying data? |
Contains does not mutate the content. |
It would be great if the Goroutine Safety section of the README.md about what operations are currently goroutine-safe could be expanded on, even/especially if that's not a guarantee that they will remain thread-safe moving forward.
For example, if bitmap
rb1
is created, optimized, and never written to again (or has appropriate locking), can it be safely used in concurrent read operations by another bitmaprb2.And(rb1)
orrb2.Or(rb1)
?Testing this with
go test -race
(which I know isn't a anywhere near complete test) doesn't report any issues:The text was updated successfully, but these errors were encountered: