
EgoMotion

As team "TheSSVL" (also known as "EgoMotion-COMPASS"), we took 2nd place in both the Object State Change Classification (OSCC) and PNR Temporal Localization tasks of the Ego4D Challenge 2022. Please refer to our validation report for more details on our methodology.

Moreover, our work on egocentric video understanding will be made publicly available soon.

TODO

  • Post technical report on arXiv
  • Release the code we used in the Ego4D Challenge 2022
  • Release the code of our latest work on egocentric video understanding

Environment requirements

In addition to "wandb", we use the same environment as VideoMAE and the Ego4D OSCC I3D-ResNet50 baseline. Please refer to those repositories for more information.
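
As a rough sketch only (the package list below is an assumption about a typical VideoMAE-style setup; follow the VideoMAE and Ego4D baseline repositories for the exact requirements and versions), an environment could be prepared like this:

    # create and activate a fresh environment (conda assumed)
    conda create -n egomotion python=3.8 -y
    conda activate egomotion

    # dependencies commonly needed by VideoMAE-style code (assumed, not exhaustive)
    pip install torch torchvision timm decord einops opencv-python

    # experiment logging used in this repo
    pip install wandb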

Data Preparation

Please refer to the Ego4D instructions and download the videos required for the fho_oscc task. For convenience, we save clips where a state change occurs and clips where no state change occurs in two different directories (a rough clip-extraction sketch is given after the directory layout below). You can use the same directory for both kinds of clips if you want.

Our directory structure:

/path/to/ego4d:
	/v1
		/full_scale
			/*.mp4
		...
		/annotations
			/*.json
		
		# for saving clips where state change occurs
		/pos
			/unique_id
			...
		# for saving clips where no state change occurs
		/neg
			/unique_id
			...
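
As a rough illustration of how such clips can be produced (the paths, clip id, timestamp, and duration below are hypothetical; the actual clip boundaries come from the annotation JSON files, and whether clips are stored as videos or as extracted frames depends on your dataloader), a clip can be cut from a full-scale video with ffmpeg:

    # hypothetical example: cut an 8-second clip starting at 01:02:03
    # from a full-scale video and store it under the positive-clip directory
    ffmpeg -ss 01:02:03 -i /path/to/ego4d/v1/full_scale/video_uid.mp4 \
           -t 8 -c copy /path/to/ego4d/v1/pos/unique_id.mp4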

After downloading the required videos, you can start finetuning experiments by following the instructions below.

Usage

  1. Finetune pretrained weights on Ego4D OSCC and temporal localization at the same time:
  • Modify the required parameters, including the dataset path, in config/finetune_vitb_ego4d.yml or config/finetune_vitl_ego4d.yml, e.g.
    finetune: "" # path to the pretrained weight

PS: you can download pretrained VideoMAE weights from the VideoMAE repository: ViT-L, ViT-B.

  • Modify the required parameters in ./scripts/finetune_ego4d.sh

  • Finally, in ./scripts, run

    # arguments: <node index> <address of the first (master) node>

    # finetune on single node
    bash finetune_ego4d.sh 0 0.0.0.0

    # finetune on two nodes:
    # run on first node
    bash finetune_ego4d.sh 0 0.0.0.0
    # run on second node
    bash finetune_ego4d.sh 1 ip_address_of_first_machine
  2. Test on Ego4D OSCC and temporal localization:
  • Similar to step 1, modify the required parameters, including the dataset path, in config/test_ego4d.yml

  • Modify the required parameters in ./scripts/test_ego4d.sh

  • Finally, in ./scripts, run

    # test on single node 
    bash test_ego4d.sh 0 0.0.0.0

    # test on two nodes:
    # run on first node
    bash test_ego4d.sh 0 0.0.0.0
    # run on second node
    bash test_ego4d.sh 1 ip_address_of_first_machine

Note that two JSON files (one for OSCC, one for temporal localization) in the format specified by the Ego4D Challenge will be generated and stored in the directory:

$output_dir/$name

where $output_dir is specified in config/test_ego4d.yml and $name is specified in test_ego4d.sh.
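
For example, with hypothetical values output_dir=/path/to/output and name=my_run, the generated predictions could be located and inspected as follows (the actual file names are determined by the test scripts):

    # hypothetical values for $output_dir and $name
    ls /path/to/output/my_run/*.json
    # pretty-print one of the generated prediction files (file name is a placeholder)
    python -m json.tool /path/to/output/my_run/oscc_predictions.json | head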

Reference

[1] VideoMAE by Tong et al.: VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
[2] Spatiotemporal MAE by Feichtenhofer et al.: Masked Autoencoders As Spatiotemporal Learners
[3] Vanilla MAE by He et al.: Masked Autoencoders Are Scalable Vision Learners
[4] Ego4D by Grauman et al.: Ego4D: Around the World in 3,000 Hours of Egocentric Video

Contact

If you have any questions about our projects or implementation, please open an issue or contact via email:
Jiachen Lei: [email protected]

Acknowledgements

We built our code on top of ego4d-i3dresnet50, VideoMAE, and MAE-pytorch. Thanks to all the contributors of these great repositories.
