
Detections.empty() invalidates detections, causes crashes when Detections.merge() is called. #943

Open
LinasKo opened this issue Feb 25, 2024 · 3 comments
Labels
bug Something isn't working

Comments


LinasKo commented Feb 25, 2024

Search before asking

  • I have searched the Supervision issues and found no similar bug report.

This is the underlying reason for #928. I believe it is important enough to deserve a separate issue.

Bug

Detections.empty() creates a Detections object with one specific, fixed set of fields, which is guaranteed to be incompatible with the results of some models when merging.

Example

Suppose we run an instance segmentation model on a video. It finds one detection in frame 1, but no detections in frame 2. When we try calling Detections.merge on these detections, it raises an error.

import cv2
import numpy as np
import supervision as sv
from inference import get_roboflow_model

IMAGE_PATH = "cat.png"
image = cv2.imread(IMAGE_PATH)
black_image = np.zeros_like(image, dtype=np.uint8)  # guaranteed to produce no detections

model = get_roboflow_model(model_id="yolov8n-640")        # Error type 1
# model = get_roboflow_model(model_id="yolov8n-seg-640")  # Error type 2

result = model.infer(image)[0]
detections_1 = sv.Detections.from_inference(result)

result = model.infer(black_image)[0]
detections_2 = sv.Detections.from_inference(result)  # empty: no detections found

sv.Detections.merge([detections_1, detections_2])  # raises ValueError
Error type 1
 ValueError                                Traceback (most recent call last)
 <ipython-input-8-173450170fe7> in <cell line: 18>()
     16 detections_2 = sv.Detections.from_inference(result)
     17 
---> 18 sv.Detections.merge([detections_1, detections_2])

1 frames
/usr/local/lib/python3.10/dist-packages/supervision/detection/core.py in merge(cls, detections_list)
    768         tracker_id = stack_or_none("tracker_id")
    769 
--> 770         data = merge_data([d.data for d in detections_list])
    771 
    772         return cls(

/usr/local/lib/python3.10/dist-packages/supervision/detection/utils.py in merge_data(data_list)
    629     all_keys_sets = [set(data.keys()) for data in data_list]
    630     if not all(keys_set == all_keys_sets[0] for keys_set in all_keys_sets):
--> 631         raise ValueError("All data dictionaries must have the same keys to merge.")
    632 
    633     for data in data_list:

ValueError: All data dictionaries must have the same keys to merge.
Error type 2
ValueError                                Traceback (most recent call last)
<ipython-input-10-1ac0380e7de6> in <cell line: 20>()
     18 
     19 
---> 20 sv.Detections.merge([detections_1, detections_2])

1 frames
/usr/local/lib/python3.10/dist-packages/supervision/detection/core.py in stack_or_none(name)
    756                 return None
    757             if any(d.__getattribute__(name) is None for d in detections_list):
--> 758                 raise ValueError(f"All or none of the '{name}' fields must be None")
    759             return (
    760                 np.vstack([d.__getattribute__(name) for d in detections_list])

ValueError: All or none of the 'mask' fields must be None

Deeper explanation

Detections contains these variables:

xyxy: np.ndarray
mask: Optional[np.ndarray] = None
confidence: Optional[np.ndarray] = None
class_id: Optional[np.ndarray] = None
tracker_id: Optional[np.ndarray] = None
data: Dict[str, Union[np.ndarray, List]] = field(default_factory=dict)

Suppose we call Detections.merge([detections_1, detections_2]). An error is rightfully raised when the same variable is defined differently in the detections. For example - when mask is None in detections_1 and np.array([]) in detections_2. It makes sense as we don't want to merge incompatible detections.

However, Detections.empty(), frequently used to initialize the no-detections case, sets only a specific subset of the fields - xyxy, confidence and class_id.
Many models use other fields as well. Because of this, the set of fields defined in an empty detection can differ from the set the model returns. When trying to merge these, an error is raised.
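To make the mismatch concrete, here is a minimal, self-contained sketch. MockDetections and check_mergeable are hypothetical stand-ins (not supervision code) that mirror empty()'s field choice and the all-or-none check inside merge:

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

# Hypothetical mock of supervision's Detections, just enough to show the
# field mismatch. Field names mirror the real class; logic is simplified.
@dataclass
class MockDetections:
    xyxy: np.ndarray
    mask: Optional[np.ndarray] = None
    confidence: Optional[np.ndarray] = None
    class_id: Optional[np.ndarray] = None

    @classmethod
    def empty(cls):
        # Mirrors the current behavior: only xyxy, confidence and class_id
        # are set; mask stays None.
        return cls(
            xyxy=np.empty((0, 4)),
            confidence=np.array([]),
            class_id=np.array([]),
        )

def check_mergeable(a, b, name):
    # Simplified version of the all-or-none check in Detections.merge.
    if (getattr(a, name) is None) != (getattr(b, name) is None):
        raise ValueError(f"All or none of the '{name}' fields must be None")

seg_result = MockDetections(
    xyxy=np.array([[0, 0, 10, 10]]),
    mask=np.ones((1, 32, 32), dtype=bool),  # segmentation models set mask
    confidence=np.array([0.9]),
    class_id=np.array([0]),
)

try:
    check_mergeable(seg_result, MockDetections.empty(), "mask")
except ValueError as e:
    print(e)  # mask is set on one side but None on the empty() side
```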

Environment

No response

Minimal Reproducible Example

See the example above.

This Colab contains examples and a list of affected functions.

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!

I could help submit a PR once I figure out a solution, but this is a non-trivial bug and, by my best estimate, solving it 'correctly' would require a redesign of Detections. Quick remediation may mean band-aid code and a hope that future features are written defensively wherever merge is involved.

@LinasKo LinasKo added the bug Something isn't working label Feb 25, 2024
LinasKo pushed a commit to LinasKo/supervision that referenced this issue Feb 25, 2024
* Addresses issue roboflow#928 matching implementation in ByteTrack,
  but sidestepping the underlying cause outlined in roboflow#943.

LinasKo commented Feb 26, 2024

Affected code

The good news is - the error mostly appears when Detections.merge is called on results from unusual models: where confidence is missing, where data is set, where mask exists (and there are no detections at least once).

Affected functions:

  • supervision/detection/core.py -> from_inference()
  • supervision/detection/core.py -> from_roboflow()
  • dataset/formats/yolo.py -> yolo_annotations_to_detections()
  • dataset/formats/yolo.py -> load_yolo_annotations()
  • dataset/formats/pascal_voc.py -> detections_from_xml_obj()
  • dataset/formats/coco.py -> coco_annotations_to_detections()
  • supervision/detection/core.py -> merge()
  • tracker/byte_tracker/core.py -> update_with_detections()
  • supervision/detection/tools/inference_slicer.py -> __call__()
  • (maybe) detection/tools/smoother.py -> DetectionsSmoother.get_smoothed_detections()

See Colab for more details.


LinasKo commented Feb 28, 2024

Mitigation Options

TL;DR: I think Option 3 is the least-bad solution. It adds complexity in exactly one location, keeps the design and architecture intact, stops the errors, and prevents future worries. I also advocate for two eventual changes regardless of the option taken (final section).

Here's how I see the various mitigation options and their downsides. I can only speculate about Roboflow's design goals and future plans, and I do not have access to the internal message board, so my goal here is to bring other devs up to speed and help think through the solutions.

My Assumptions

  1. Detections is meant to be universal and easily-understood, fitting both detection and segmentation models.
  2. Therefore, it must hold undefined variables, ready to take values from different model types.
  3. There's a need for Detections.empty, which represents empty detections of ANY type.

Caveats

  1. I have done little research on the data field, but found an example where it fails for the same reason.
  2. I may be making generalizations about which models are used less frequently than others. This comes from observing their presence in the code and how Roboflow prioritizes them on the site.

Option 1: Do nothing

The error has two stages:

  1. Calling Detections.empty from specific model loaders sets it up - this happens frequently.
  2. Calling Detections.merge triggers it - this happens rarely.

Since merge is triggered rarely, maybe it's not worth fixing? There haven't been many complaints.

The devs just need to be extra careful of using merge in anything new, largely relating to instance segmentation. Who knows - maybe there's not that many features left unimplemented.

Option 2: Forbid everything affected.

(The worst approach, in my opinion)
Very radical - comment out merge and the inference slicer, disallow smoothing. Who cares if there are incompatible objects around - they're not interacting with each other anyway.
(I care).

Option 3: Change merge.

I like how merge works, given the assumptions. But maybe it's harmful, even if correct?

If (xyxy, confidence, class_id) merges with (xyxy, mask, class_id), maybe find a way to let it happen? Form (xyxy, mask, confidence, class_id) somehow.

(Brainstorming) When we need new values for:

  • xyxy - make a bounding box around mask, 0-sized box otherwise.
  • confidence - default to 1.
  • class_id - set to -1 (raaaarely seen - e.g. from_sam)
  • tracker_id - won't be an issue after [Smoother] - tracker_id is None when there is no detection for an extended period #928 - in other cases update_with_detections ensures the ID is set.
  • mask - expand xyxy into a rectangular mask.
  • data - I don't know enough to say.

This seems to avoid both errors and architectural shifts by redefining what "compatible Detections" means, at least until the next feature comparing multiple detections is added. Is this the least-bad solution?
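A minimal sketch of this brainstorm (all names are hypothetical, not the supervision API): before merging, each missing field is filled with a default derived from the fields that are present.

```python
import numpy as np

# Hypothetical helpers sketching Option 3's defaults.
def xyxy_to_mask(xyxy, frame_hw):
    """Expand each bounding box into a rectangular boolean mask."""
    h, w = frame_hw
    masks = np.zeros((len(xyxy), h, w), dtype=bool)
    for i, (x1, y1, x2, y2) in enumerate(xyxy.astype(int)):
        masks[i, y1:y2, x1:x2] = True
    return masks

def fill_defaults(fields, frame_hw):
    """fields: dict holding 'xyxy' and optionally 'mask'/'confidence'/'class_id'."""
    n = len(fields["xyxy"])
    if fields.get("mask") is None:
        fields["mask"] = xyxy_to_mask(fields["xyxy"], frame_hw)
    if fields.get("confidence") is None:
        fields["confidence"] = np.ones(n)    # default confidence to 1
    if fields.get("class_id") is None:
        fields["class_id"] = np.full(n, -1)  # -1 = unknown class
    return fields

det = fill_defaults({"xyxy": np.array([[2, 3, 6, 8]])}, frame_hw=(10, 10))
print(det["mask"].sum())  # area of the rectangular mask: 4 * 5 = 20
```

After this normalization, two Detections with different defined-field sets would share the same keys and pass merge's all-or-none checks.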

Option 4: Explicitly define what's being initialized.

empty sets (xyxy, confidence, class_id). What is that? A detection model.

Let's make it initialize (xyxy, mask, class_id) for segmentation models. Some won't conform - SAM returns no class_id. So force it, make a mock id! This way you'd have two distinct model types.

Then, either pass an argument to Detections.empty to tell it which one you're initializing, or define two different empty functions. For the outliers - full-field-set detections and models with data - either set the fields explicitly, or add a third, more flexible Detections.empty. Same thing.
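A sketch of what the two distinct empty constructors could look like (hypothetical names, plain dicts standing in for the real Detections class):

```python
import numpy as np

# Hypothetical Option 4 constructors: one empty() per model family, so the
# caller states up front which field set it is initializing.
def empty_detection_fields():
    # Detection models: boxes + confidence + class ids, no mask.
    return {
        "xyxy": np.empty((0, 4)),
        "confidence": np.array([]),
        "class_id": np.array([], dtype=int),
        "mask": None,
    }

def empty_segmentation_fields(frame_hw=(0, 0)):
    # Segmentation models: boxes + mask + class ids (a mock id would be
    # forced for SAM-like models that return none).
    h, w = frame_hw
    return {
        "xyxy": np.empty((0, 4)),
        "confidence": None,
        "class_id": np.array([], dtype=int),
        "mask": np.empty((0, h, w), dtype=bool),
    }

print(empty_detection_fields()["mask"] is None)   # True
print(empty_segmentation_fields()["mask"].shape)  # (0, 0, 0)
```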

Did you spot an issue?

Suppose you have an intermediary function. A call chain of:
Smoother -> some_func -> specific_empty.

Which type of empty should it call? Somehow, via an argument or context of some sort, some_func needs to know which variables to initialize / which empty to call. At present, if there are no detections, such an intermediary would only receive an empty list - []. No type information!

That, or you need to forbid intermediaries from accepting [], return a specific empty() from the top-level caller instead.

Here's a thought experiment: try inlining Detections.empty. I mean, replace it with the Detections constructor and then explicitly set variables inside. In each case, can you tell which ones should be set? Try this inside Detections.merge to see the issue.

Option 4b: Frickin' Templates

We're passing the type information through function calls; elsewhere that would be done via templates or generics. Detections.empty<Det>() and Detections.empty<Seg>() might be completely normal in other languages.

Type parameter syntax was only introduced in Python 3.12. Perhaps TypeVar could help, but generally this would make the code less accessible to new devs.
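For illustration, a rough Python approximation of the template idea using TypeVar/Generic (entirely hypothetical; note that the "template parameter" still ends up passed as a runtime argument, which hints at the accessibility concern):

```python
from typing import Generic, TypeVar

# Hypothetical marker types carrying the model family.
class Det:  # detection models: xyxy, confidence, class_id
    fields = ("xyxy", "confidence", "class_id")

class Seg:  # segmentation models: xyxy, mask, class_id
    fields = ("xyxy", "mask", "class_id")

T = TypeVar("T", Det, Seg)

class TypedDetections(Generic[T]):
    """Sketch of a Detections whose defined fields depend on a type marker."""
    def __init__(self, defined_fields):
        self.defined_fields = defined_fields

    @classmethod
    def empty(cls, kind):
        # kind plays the role of the template parameter Detections.empty<T>().
        return cls(defined_fields=kind.fields)

print(TypedDetections.empty(Seg).defined_fields)  # ('xyxy', 'mask', 'class_id')
```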

Option 5: Multiple Classes

Did you notice that we're approaching a multi-class solution? My claim is that Detections can be at least two distinct objects - the detection and the segmentation result.

An option is to change the core design, split them in two, then define strict non-nullable members, adjust everything else to match that.

Detections becomes a base class passed to users and annotators, which then have to frequently check its contents. Notice that with nullable members, similar checks might already be happening (I haven't checked).

Downside:

  • Lots of changes.
  • Less flexibility - making the members of the separate classes nullable again would defeat the purpose.
  • Edge cases - data and non-standard parameter sets remain a problem for this solution.

Option 6: Get rid of nullable types

This is my least thought-out option. Null values serve a purpose, representing invalid members - e.g. "this Detections result will never have confidence". Removing them requires extra validation and disallows a safe, universal Detections.empty function such as the one we have now.

Perhaps Optional status can be revoked. There would need to be logic such that an empty mask and a never-present mask look the same. Is there harm in that?

I think so. Null checks could be removed in places like __iter__(), but in a for loop, is there a way for users to easily tell which fields might suddenly produce values?

Essential Changes

I advocate for these two changes to be made eventually.

  1. Detections.empty is meant to construct empty detections for ANY model. Therefore it HAS to be weaker-defined than any model. The code now sets (xyxy, confidence, class_id). Does every model have at least all of these? No. Trim it down until it can be used as the constructor for any Detections. Otherwise you'll be setting fields that normal model results may never have, constructing invalid objects.
     After this is done - can the Detections constructor be used instead of Detections.empty()?
  2. Visually separate tracker_id from the other members. It is addressed differently, has a special set of methods ensuring its existence, and is not produced by the models. A single empty line would hint to maintainers at its special status - a minuscule thing, but it would help people learn the code faster.
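As a sketch, the trimmed-down empty from point 1 might define only xyxy and leave every optional member undefined (hypothetical code, not a merged change; a plain dict stands in for the real class):

```python
import numpy as np

# Hypothetical trimmed Detections.empty: only the one field every model
# shares is set; all optional members stay undefined (None / empty dict).
def trimmed_empty_fields():
    return {
        "xyxy": np.empty((0, 4)),
        "mask": None,
        "confidence": None,
        "class_id": None,
        "tracker_id": None,
        "data": {},
    }

fields = trimmed_empty_fields()
print(fields["xyxy"].shape)  # (0, 4)
print(fields["confidence"])  # None: no longer claims a field some models lack
```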


LinasKo commented May 20, 2024

We've now implemented option 3 in #1177

Ideally, I'd like to see an attempt at trimming down Detections.empty to just xyxy before closing this.

Would that cause issues? Would it be easy? Would it entail too much work to be worth it? I'd like to know.
