Potential using mis-ordered KITTI prompt #21

haooooooqi · 2022-11-10T20:27:16Z

Hi,
Thanks so much for providing such impactful work (LAION, Open CLIP, Open CLIP benchmark) to the community!
I noticed that the KITTI prompts from https://github.com/openai/CLIP/blob/main/data/prompts.md#kitti may be used in a mismatched order.

The prompts shared by OpenAI are:

classes = [
    'a photo i took of a car on my left or right side.',
    'a photo i took with a car nearby.',
    'a photo i took with a car in the distance.',
    'a photo i took with no car.',
]

Here is the default order of the KITTI (distance) dataset:

above_20  below_20  below_8  no_vehicle

If we pair them:

above_20:  'a photo i took of a car on my left or right side.'
below_20: 'a photo i took with a car nearby.'
below_8: 'a photo i took with a car in the distance.'
no_vehicle: 'a photo i took with no car.'

It seems the right pairing should be:

above_20:  'a photo i took with a car in the distance.'
below_20: 'a photo i took with a car nearby.'
below_8: 'a photo i took of a car on my left or right side.'
no_vehicle: 'a photo i took with no car.'
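
To make the two pairings above concrete, here is a minimal sketch that zips the class names with the prompts. The variable names are only illustrative and are not taken from the CLIP_benchmark code; the class order is the one quoted above, which also happens to be the lexicographic order of the names.

```python
# Class names in the default order quoted above
# (this order coincides with lexicographic sorting of the names).
class_names = ["above_20", "below_20", "below_8", "no_vehicle"]

# Prompts in the order published in OpenAI's prompts.md.
openai_prompts = [
    "a photo i took of a car on my left or right side.",
    "a photo i took with a car nearby.",
    "a photo i took with a car in the distance.",
    "a photo i took with no car.",
]

# Pairing by position reproduces the first (suspect) mapping.
positional = dict(zip(class_names, openai_prompts))

# Proposed mapping: swap the prompts of the first and third classes.
proposed_prompts = [
    openai_prompts[2],
    openai_prompts[1],
    openai_prompts[0],
    openai_prompts[3],
]
proposed = dict(zip(class_names, proposed_prompts))

for name in class_names:
    print(f"{name}: {positional[name]!r} -> {proposed[name]!r}")
```

Whether the swap is actually needed depends on how the dataset code assigns label indices, which need not follow this name order.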

With the order change, I can get 29.3 with the ViT-L/14 laion400m_e32 model.

Hope that could be helpful :)

Haoqi


mehdidc · 2022-11-10T21:13:38Z

Thanks a lot @haooooooqi for catching and reporting this! I will fix that now in the code and recalculate the benchmark for KITTI.

mehdidc · 2022-11-10T22:52:53Z

@haooooooqi Quick question: could you explain where you found the default order of the classes? I was trying to read VTAB's original code (https://github.com/LAION-AI/CLIP_benchmark/blob/main/clip_benchmark/datasets/kitti.py#L101), and I have trouble understanding what they were trying to do. If we follow the code and dist is a positive number, then the order would be: class 0: <=8, class 1: >8 and <=20, class 2: >20, class 3: no vehicle. Or am I missing something?

mehdidc · 2022-11-13T12:23:56Z

Investigating further, I printed the actual distances (dist) and the corresponding labels (label) (from https://github.com/LAION-AI/CLIP_benchmark/blob/main/clip_benchmark/datasets/kitti.py#L101) for the first 32 examples, one "dist,label" pair per line:

26.96,2
7.06,0
15.06,1
73.79,2
64.83,2
10.37,1
12.54,1
12.35,1
1000.00,3
28.99,2
74.99,2
66.73,2
7.22,0
21.21,2
5.85,0
47.72,2
10.28,1
62.85,2
12.07,1
1000.00,3
1000.00,3
36.25,2
12.39,1
6.50,0
1000.00,3
15.51,1
10.28,1
11.50,1
10.88,1
54.08,2
8.93,1
5.06,0

So class 0 does indeed correspond to <=8, class 1 to between 8 and 20, class 2 to >20, and class 3 to no vehicle (value of 1000).
The current order of the classes still looks correct to me:

        "a photo i took of a car on my left or right side.",
        "a photo i took with a car nearby.",
        "a photo i took with a car in the distance.",
        "a photo i took with no car.",
    ],

It is thus intriguing that you got better results with the following order (if I understand correctly):

  "a photo i took with a car in the distance.",
  "a photo i took with a car nearby.",
  "a photo i took of a car on my left or right side.",
  "a photo i took with no car.",
]
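
As a sanity check on the reading of the thresholds, here is a minimal sketch that bins the 32 printed distances and compares them with the printed labels. The helper `distance_to_label` is hypothetical and is not the function in kitti.py; the behaviour at exactly 8 and 20, and the treatment of 1000 as a "no vehicle" sentinel, are assumptions inferred from the values listed in this thread.

```python
def distance_to_label(dist: float) -> int:
    """Hypothetical helper mirroring the thresholds inferred in this thread."""
    if dist >= 1000.0:  # assumption: 1000 is the "no vehicle" sentinel
        return 3
    if dist <= 8.0:     # assumption: boundary value 8 falls in class 0
        return 0
    if dist <= 20.0:    # assumption: boundary value 20 falls in class 1
        return 1
    return 2


# The (dist, label) pairs printed for the first 32 examples.
samples = [
    (26.96, 2), (7.06, 0), (15.06, 1), (73.79, 2), (64.83, 2), (10.37, 1),
    (12.54, 1), (12.35, 1), (1000.00, 3), (28.99, 2), (74.99, 2), (66.73, 2),
    (7.22, 0), (21.21, 2), (5.85, 0), (47.72, 2), (10.28, 1), (62.85, 2),
    (12.07, 1), (1000.00, 3), (1000.00, 3), (36.25, 2), (12.39, 1), (6.50, 0),
    (1000.00, 3), (15.51, 1), (10.28, 1), (11.50, 1), (10.88, 1), (54.08, 2),
    (8.93, 1), (5.06, 0),
]

# Collect every pair where the inferred thresholds disagree with the label.
mismatches = [(d, lab) for d, lab in samples if distance_to_label(d) != lab]
print(f"{len(samples)} samples, {len(mismatches)} mismatches")
```

An empty mismatch list only shows that these 32 examples are consistent with the inferred thresholds; it does not explain the score difference reported above.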

rom1504 added the new datasets label Feb 3, 2023
