
lvu_durations.csv #5

Open
nbgundavarapu opened this issue Oct 19, 2022 · 8 comments

Comments

@nbgundavarapu

nbgundavarapu commented Oct 19, 2022

Hi authors,

How are the durations in lvu_durations.csv computed? The last 20s of most videos show previews for other videos. Does lvu_durations.csv give the number of seconds in the video excluding the preview duration?

Thanks

@nbgundavarapu
Author

nbgundavarapu commented Oct 26, 2022

These lines of code

for i in range(int(duration)):
    idx = int(video.shape[0] / duration * i)
    x = torch.unsqueeze(video[idx], 0).to(device)
    x = model.forward_features(x)

suggest that these previews are used in training and evaluation. Could you confirm? Thanks!
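For concreteness, here is a toy sketch (my own numbers, not from the repo) of how this sampling pulls indices from the outro region whenever the downloaded mp4 is longer than the lvu_durations.csv duration:

```python
# Toy sketch: a downloaded video that is 184 frames long (assuming 1 fps),
# sampled with the outro-free duration of 154 s from lvu_durations.csv.
num_frames = 184   # frames in the full mp4, including the outro
duration = 154     # outro-free duration listed in lvu_durations.csv

# Same indexing as the loop above.
sampled = [int(num_frames / duration * i) for i in range(int(duration))]

# At 1 fps, every index >= duration points at an outro frame.
outro_hits = [idx for idx in sampled if idx >= duration]
print(len(outro_hits), max(sampled))
```

With these toy numbers, a sizable tail of the sampled indices lands inside the outro region.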

@md-mohaiminul
Owner

Hi,
Thanks for reaching out. We used the durations from the Condensed Movies dataset. They removed the outro/preview from each video, as described in Section 3.1 of their paper. Therefore, lvu_durations.csv does not include the outro/preview of each video.

@nbgundavarapu
Author

nbgundavarapu commented Nov 1, 2022

Thanks for your reply! Do the downloaded mp4 videos have outro/preview removed?

If not, then in the following code the outro/preview seems to be included, and those frames are later used in training/evaluation.

video = get_video(video_fp)
video = torch.from_numpy(video.transpose([0, 3, 1, 2])).float()
duration = duration_data.loc[video_id]['duration']
print(cnt, video_id, video.shape, duration)
features = np.zeros((duration+1, 197, 1024))
for i in range(int(duration)):
    idx = int(video.shape[0] / duration * i)
    x = torch.unsqueeze(video[idx], 0).to(device)
    x = model.forward_features(x)

E.g., consider the video 9NG5mJgw6Yg in the writer set, with duration = 154s in lvu_durations.csv and an actual video length of 184s. The above code will include frames after 154s, which contain the outro/preview.

@nbgundavarapu
Author

nbgundavarapu commented Nov 14, 2022

In the above example, could you walk through the above code from your codebase at i=153?
idx = int(184/154*153) = 182
Hence, features[153] = model_fwd(video[182])

In effect, features[153] contains outro frame 182. So, during LVU evals, frame 182 will be used for this video, which is not what you intended. This looks like a bug. The same is true for many videos and frames.
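A quick self-contained check of that walkthrough (1 fps assumed, as in the example above):

```python
duration = 154    # outro-free duration from lvu_durations.csv
num_frames = 184  # full video length in frames (1 fps assumed)

i = int(duration) - 1                 # last loop iteration, i = 153
idx = int(num_frames / duration * i)  # same indexing as the repo's loop
print(idx, idx >= duration)           # the index lies past the 154 s cutoff
```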

@md-mohaiminul
Owner

Hi,
I think you are right. You need to remove the outro first, which is what we did. You can use the durations from 'lvu_durations.csv' to do this.
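A minimal sketch of that fix (hypothetical helper; the fps value and toy video are assumptions, not the repo's exact code): truncate the frame array to the lvu_durations.csv duration before the sampling loop runs.

```python
import numpy as np

def trim_outro(video, duration, fps=1.0):
    """Drop all frames past the outro-free duration (seconds) at the given fps."""
    return video[:int(duration * fps)]

# Toy 184-frame video at 1 fps with a listed outro-free duration of 154 s.
video = np.zeros((184, 3, 8, 8), dtype=np.float32)
trimmed = trim_outro(video, duration=154, fps=1.0)

# The original sampling loop now stays inside the outro-free region.
duration = 154
max_idx = max(int(trimmed.shape[0] / duration * i) for i in range(int(duration)))
print(trimmed.shape[0], max_idx)  # no sampled index reaches the outro
```

The same idea works with any decoded frame rate, as long as `fps` matches how the mp4 was decoded.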

@nbgundavarapu
Author

nbgundavarapu commented Nov 15, 2022

Thanks! In light of the above bug, could you please check and confirm whether the reported results in the paper were computed with the outro included?
The current state of the codebase definitely uses the outro.

Context:
I'm struggling to reproduce the results from the paper. There is a 1% difference in performance depending on whether I include or exclude the outro, and including the outro brings the results closer to those reported in the paper.

@md-mohaiminul
Owner

Which task did you try, and what performance are you getting? Also, how did you solve the NaN issue? Can you please reply on the other issue so that others can benefit from it?

@nbgundavarapu
Author

I've not been able to solve the NaN issue. I'm working on a reimplementation in JAX, building upon annotated-s4.

I've tried all the classification tasks. There is a ~1% gap on relationship, director, writer, and speaking depending on whether the outro is included or excluded.
