Run script
export WANDB_MODE='offline'
JSON_FOLDER="train_json"
IMAGE_FOLDER="/workspace/vl-data/"
VIDEO_FOLDER="/workspace/vl-data/"
HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 deepspeed videollava/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --model_name_or_path ./vicuna-7b-v1.5 \
    --version v1 \
    --data_path ${JSON_FOLDER}/llava_image_tune_.json ${JSON_FOLDER}/videochatgpt_tune_.json ${JSON_FOLDER}/nlp_tune.json \
    --image_folder ${IMAGE_FOLDER} \
    --image_tower ./LanguageBind/LanguageBind_Image \
    --video_folder ${VIDEO_FOLDER} \
    --video_tower ./LanguageBind/LanguageBind_Video_merge \
    --mm_projector_type mlp2x_gelu \
    --pretrain_mm_mlp_adapter ./checkpoints/videollava-7b-pretrain/mm_projector.bin \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir ./checkpoints/videollava-sb \
    --num_train_epochs 1 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 --tokenizer_model_max_length 3072 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --cache_dir "./cache_dir"
Run log
reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
[h264 @ 0xa1e6680] mmco: unref short failure
{'loss': 1.8232, 'learning_rate': 1.1111111111111112e-07, 'epoch': 0.0}
{'loss': 1.737, 'learning_rate': 2.2222222222222224e-07, 'epoch': 0.0}
0%| | 2/5979 [00:33<26:09:12, 15.75s/it][h264 @ 0xb311740] mmco: unref short failure
[h264 @ 0xb311740] mmco: unref short failure
[h264 @ 0xaaffd80] mmco: unref short failure
{'loss': 1.7796, 'learning_rate': 3.3333333333333335e-07, 'epoch': 0.0}
{'loss': 1.708, 'learning_rate': 4.444444444444445e-07, 'epoch': 0.0}
0%| | 4/5979 [00:54<20:20:27, 12.26s/it][h264 @ 0xbabe380] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 5.555555555555555e-07, 'epoch': 0.0}
0%| | 5/5979 [01:05<19:08:34, 11.54s/it][h264 @ 0x25e23040] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 6.666666666666667e-07, 'epoch': 0.0}
0%| | 6/5979 [01:15<18:37:10, 11.22s/it][h264 @ 0x1e8e4c80] Missing reference picture, default is 65530
[h264 @ 0x1e7b9780] Missing reference picture, default is 65530
[h264 @ 0x8d7e80] mmco: unref short failure
[h264 @ 0x8d7e80] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 7.777777777777779e-07, 'epoch': 0.0}
0%| | 7/5979 [01:26<18:32:16, 11.17s/it][h264 @ 0x106b5ec00] mmco: unref short failure
[h264 @ 0x106b5ec00] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 8.88888888888889e-07, 'epoch': 0.0}
{'loss': 0.0, 'learning_rate': 1.0000000000000002e-06, 'epoch': 0.0}
{'loss': 0.0, 'learning_rate': 1.111111111111111e-06, 'epoch': 0.0}
..
[20:06:01] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /workspace/vl-data/videochatgpt_tune/v_D2JvqkKa-qM.mp4, Invalid data found when processing input
Error with Error reading /workspace/vl-data/videochatgpt_tune/v_D2JvqkKa-qM.mp4...
{'loss': 0.0, 'learning_rate': 1.99712517503872e-05, 'epoch': 0.05}
5%|▌ | 320/5979 [54:49<16:08:41, 10.27s/it][h264 @ 0xfa115240] mmco: unref short failure
[h264 @ 0xfa115240] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 1.9970839794784918e-05, 'epoch': 0.05}
5%|▌ | 321/5979 [54:59<16:09:54, 10.29s/it][h264 @ 0xabcbf80] mmco: unref short failure
[h264 @ 0xabcbf80] mmco: unref short failure
[h264 @ 0xabcbf80] mmco: unref short failure
[mov,mp4,m4a,3gp,3g2,mj2 @ 0xd33399c0] moov atom not found
[20:06:22] /github/workspace/src/video/video_reader.cc:83: ERROR opening: /workspace/vl-data/videochatgpt_tune/v_Nx4rK_jvvR4.mp4, Invalid data found when processing input
Error with Error reading /workspace/vl-data/videochatgpt_tune/v_Nx4rK_jvvR4.mp4...
[h264 @ 0xd2a44b00] Missing reference picture, default is 65530
[h264 @ 0xaa8b7840] Missing reference picture, default is 65530
[h264 @ 0xb8fa740] mmco: unref short failure
[h264 @ 0xb8fa740] mmco: unref short failure
[h264 @ 0xd2a44b00] mmco: unref short failure
[h264 @ 0xd2a44b00] mmco: unref short failure
[h264 @ 0xd2a44b00] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 1.9970424912839455e-05, 'epoch': 0.05}
{'loss': 0.0, 'learning_rate': 1.997000710467258e-05, 'epoch': 0.05}
{'loss': 0.0, 'learning_rate': 1.9969586370406913e-05, 'epoch': 0.05}
{'loss': 0.0, 'learning_rate': 1.996916271016593e-05, 'epoch': 0.05}
{'loss': 0.0, 'learning_rate': 1.996873612407397e-05, 'epoch': 0.05}
...
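The "moov atom not found" and "Invalid data found when processing input" messages above suggest that at least some files under the video folder are truncated or incomplete. Whether those files are related to the zero loss is not established by this log. As a rough pre-check, the sketch below walks the top-level boxes of an MP4 file with the Python standard library and reports whether a `moov` box is present. The function names `top_level_boxes` and `has_moov` are made up for this illustration and are not part of Video-LLaVA; a file that passes this check can still be undecodable for other reasons.

```python
import struct
from pathlib import Path


def top_level_boxes(path):
    """Yield the 4-character type of each top-level box in an MP4/MOV file."""
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        offset = 0
        while offset + 8 <= file_size:
            f.seek(offset)
            # Each box starts with a 32-bit big-endian size and a 4-byte type.
            size, box_type = struct.unpack(">I4s", f.read(8))
            header_len = 8
            if size == 1:
                # A size of 1 means a 64-bit size follows the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    return
                size = struct.unpack(">Q", ext)[0]
                header_len = 16
            elif size == 0:
                # A size of 0 means the box runs to the end of the file.
                size = file_size - offset
            yield box_type.decode("latin-1")
            if size < header_len:
                return  # malformed header; stop instead of looping forever
            offset += size


def has_moov(path):
    """Return True if the file contains a top-level 'moov' box."""
    return "moov" in top_level_boxes(path)
```

One way to use it would be to loop over `Path("/workspace/vl-data/videochatgpt_tune").glob("*.mp4")` and print every path for which `has_moov` returns `False`, then re-download or exclude those files before training.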
[h264 @ 0xd2a43940] mmco: unref short failure
[h264 @ 0xd2a43940] mmco: unref short failure
[h264 @ 0xa2373c0] Missing reference picture, default is 65530
[h264 @ 0xd2a43940] Missing reference picture, default is 65530
[h264 @ 0xb94c21c0] mmco: unref short failure
[h264 @ 0xb94c21c0] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 7.757440216011661e-06, 'epoch': 0.58}
{'loss': 0.0, 'learning_rate': 7.752161053801734e-06, 'epoch': 0.59}
{'loss': 0.0, 'learning_rate': 7.746882551310377e-06, 'epoch': 0.59}
[h264 @ 0x105e9fa40] mmco: unref short failure
[h264 @ 0x105e9fa40] mmco: unref short failure
[h264 @ 0x12a1c2c0] Missing reference picture, default is 65530
[h264 @ 0x105e9fa40] Missing reference picture, default is 65530
[h264 @ 0xca735980] mmco: unref short failure
[h264 @ 0xca735980] mmco: unref short failure
{'loss': 0.0, 'learning_rate': 7.741604710086778e-06, 'epoch': 0.59}
{'loss': 0.0, 'learning_rate': 7.736327531679933e-06, 'epoch': 0.59}
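To pin down exactly where the loss collapses, the metric records in a saved copy of this log can be parsed and scanned for the first run of exact zeros. The following is a minimal sketch, assuming the records keep the dict-style format shown above and contain only literal numbers (a record containing `nan` would need extra handling). The helper names `parse_losses` and `first_collapse` and the `patience` parameter are hypothetical and not part of the training code.

```python
import ast
import re

# Matches the dict-style metric records printed during training, e.g.
# {'loss': 1.737, 'learning_rate': 2.2222222222222224e-07, 'epoch': 0.0}
RECORD = re.compile(r"\{'loss':[^{}]*\}")


def parse_losses(log_text):
    """Return the loss values in the order they appear in the log text."""
    return [ast.literal_eval(m.group(0))["loss"] for m in RECORD.finditer(log_text)]


def first_collapse(losses, patience=3):
    """Return the 0-based index of the first record that starts a run of
    `patience` consecutive exact-zero losses, or None if there is no such run."""
    run = 0
    for i, loss in enumerate(losses):
        run = run + 1 if loss == 0.0 else 0
        if run == patience:
            return i - patience + 1
    return None


# The first seven records from the log above, with decoder warnings left in.
sample = """
{'loss': 1.8232, 'learning_rate': 1.1111111111111112e-07, 'epoch': 0.0}
{'loss': 1.737, 'learning_rate': 2.2222222222222224e-07, 'epoch': 0.0}
[h264 @ 0xb311740] mmco: unref short failure
{'loss': 1.7796, 'learning_rate': 3.3333333333333335e-07, 'epoch': 0.0}
{'loss': 1.708, 'learning_rate': 4.444444444444445e-07, 'epoch': 0.0}
{'loss': 0.0, 'learning_rate': 5.555555555555555e-07, 'epoch': 0.0}
{'loss': 0.0, 'learning_rate': 6.666666666666667e-07, 'epoch': 0.0}
{'loss': 0.0, 'learning_rate': 7.777777777777779e-07, 'epoch': 0.0}
"""

losses = parse_losses(sample)
print(first_collapse(losses))  # prints 4, i.e. the fifth logged record
```

Because the script sets `--logging_steps 1`, record index 4 corresponds to the fifth optimizer step, which matches the `5/5979` progress line that follows the first zero loss in the log.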