
2BiVQA


2BiVQA: Double Bi-LSTM based Video Quality Assessment of UGC Videos

This repository contains the code of our paper "2BiVQA: Double Bi-LSTM based Video Quality Assessment of UGC Videos". 2BiVQA is a no-reference, deep-learning-based video quality assessment metric. If you use any part of our code, please cite:

```
@article{telili20222bivqa,
  title={2BiVQA: Double Bi-LSTM based Video Quality Assessment of UGC Videos},
  author={Telili, Ahmed and Fezza, Sid Ahmed and Hamidouche, Wassim and Meftah, Hanene FZ},
  journal={arXiv preprint arXiv:2208.14774},
  year={2022}
}
```

Requirements

```
pip install -r requirements.txt
```

Feature extraction

Please note that the metadata should be a CSV file with two columns: video name and MOS (mean opinion score).
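For illustration, such a file might look like the following; the video names and MOS values here are hypothetical, and the exact column headers expected by extract_features.py may differ, so check the script:

```
video_name,mos
video_001.mp4,3.72
video_002.mp4,2.15
video_003.mp4,4.10
```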

```
python3 extract_features.py [-h] [-v 'path to videos directory']
                            [-f 'path to meta-data csv file']
                            [-o 'overlapping between patches']
                            [-fl 'flag: 0 for videos and 1 for images']
```

To extract features from images rather than videos, set the flag to 1.

ResNet-50 is used for feature extraction.
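As a rough illustration of what this step computes (a sketch under assumed shapes, not the repository's actual implementation), the snippet below pools ResNet-50 features over fixed-size patches with TensorFlow/Keras. It uses non-overlapping patches for simplicity, whereas extract_features.py supports overlap via the -o option:

```python
# Minimal sketch: pooled ResNet-50 features per image patch.
# Patch size, function names, and shapes are illustrative assumptions.
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

# ResNet-50 without its classifier head; global average pooling turns
# each 224x224 patch into a 2048-dim feature vector.
backbone = ResNet50(weights="imagenet", include_top=False, pooling="avg")

def extract_patch_features(frame: np.ndarray, patch: int = 224) -> np.ndarray:
    """Split a frame (H, W, 3) into non-overlapping patches and
    return an (n_patches, 2048) feature matrix."""
    h, w, _ = frame.shape
    patches = [
        frame[y:y + patch, x:x + patch]
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ]
    batch = preprocess_input(np.stack(patches).astype("float32"))
    return backbone.predict(batch, verbose=0)
```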

Model Training (optional):

This step can be skipped; you can test the model directly in the next section using the pre-trained models.

To train your own model:

```
python End2End_train.py [-h] [-nf number of frames to be extracted] [-b batch_size]
```

To train your own spatial pooling model on other image datasets:

```
python spatial_train.py [-h] [-p number of patches] [-b batch_size]
```
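For orientation, the sketch below shows the double Bi-LSTM idea behind both training scripts in Keras: one Bi-LSTM spatially pools per-patch features into a frame descriptor, and a second Bi-LSTM temporally pools frame descriptors into a MOS prediction. All shapes, layer sizes, and names are illustrative assumptions, not the repository's actual model code:

```python
# Minimal sketch of the double Bi-LSTM design (illustrative only).
from tensorflow.keras import layers, Model

n_frames, n_patches, feat_dim = 30, 25, 2048  # assumed shapes

# Stage 1 (spatial pooling): Bi-LSTM over the patches of one frame.
patch_in = layers.Input(shape=(n_patches, feat_dim))
frame_desc = layers.Bidirectional(layers.LSTM(128))(patch_in)
spatial_model = Model(patch_in, frame_desc, name="spatial_pooling")

# Stage 2 (temporal pooling): apply the spatial model to every frame,
# then run a second Bi-LSTM over the sequence of frame descriptors.
video_in = layers.Input(shape=(n_frames, n_patches, feat_dim))
frame_seq = layers.TimeDistributed(spatial_model)(video_in)
pooled = layers.Bidirectional(layers.LSTM(64))(frame_seq)
score = layers.Dense(1)(pooled)  # predicted MOS

model = Model(video_in, score, name="double_bilstm_sketch")
model.compile(optimizer="adam", loss="mse")
model.summary()
```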
                                         

Test:

To test the model:

a) On KonViD-1K:

```
python test_model.py --dataset konvid
```

| Methods | SROCC  | PLCC   | KROCC  | RMSE   |
|---------|--------|--------|--------|--------|
| 2BiVQA  | 0.8463 | 0.8404 | 0.6529 | 0.3620 |

b) On LIVE-VQC:

```
python test_model.py --dataset live
```

| Methods | SROCC  | PLCC   | KROCC  | RMSE   |
|---------|--------|--------|--------|--------|
| 2BiVQA  | 0.7614 | 0.8325 | 0.6212 | 9.9799 |

Demo:

To predict the quality of your own videos using a pre-trained model:

```
python demo.py [-h] [-nf number of frames to be extracted] [-m path to pretrained model] [-f path to videos dir]
```
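For example (the frame count, model path, and directory below are hypothetical placeholders):

```
python demo.py -nf 30 -m model_weights.h5 -f ./videos
```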

Evaluate:

To evaluate the model:

Please note that your CSV file should have two columns: 'Mos' and 'Predicted'.

```
python evaluate.py --mos_pred konvid.csv
```
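If you want to compute the four reported metrics yourself, a minimal sketch is shown below, assuming NumPy, pandas, SciPy, and the 'Mos'/'Predicted' columns above. Note that evaluate.py may differ in detail (e.g., fitting a nonlinear mapping before computing PLCC/RMSE, as is common in VQA evaluation):

```python
# Minimal sketch: SROCC, PLCC, KROCC and RMSE from a results CSV.
import numpy as np
import pandas as pd
from scipy import stats

df = pd.read_csv("konvid.csv")
mos = df["Mos"].to_numpy()
pred = df["Predicted"].to_numpy()

srocc = stats.spearmanr(mos, pred)[0]   # monotonic (rank) correlation
plcc = stats.pearsonr(mos, pred)[0]     # linear correlation
krocc = stats.kendalltau(mos, pred)[0]  # ordinal association
rmse = float(np.sqrt(np.mean((mos - pred) ** 2)))
print(f"SROCC={srocc:.4f} PLCC={plcc:.4f} KROCC={krocc:.4f} RMSE={rmse:.4f}")
```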

Performance Benchmark:

KonViD-1K [1]:

| Methods    | SROCC  | PLCC   | KROCC  | RMSE   |
|------------|--------|--------|--------|--------|
| BRISQUE    | 0.6567 | 0.6576 | 0.4761 | 0.4813 |
| NIQE       | 0.5417 | 0.5530 | 0.3790 | 0.5336 |
| ILNIQE     | 0.5264 | 0.5400 | 0.3692 | 0.5406 |
| VIIDEO     | 0.2988 | 0.3002 | 0.2036 | 0.6101 |
| GM-LOG     | 0.6578 | 0.6636 | 0.4770 | 0.4818 |
| HIGRADE    | 0.7206 | 0.7269 | 0.5319 | 0.4391 |
| FRIQUEE    | 0.7472 | 0.7482 | 0.5509 | 0.4252 |
| CORNIA     | 0.7169 | 0.7135 | 0.5231 | 0.4486 |
| HOSA       | 0.7654 | 0.7664 | 0.5690 | 0.4142 |
| V-BLIINDS  | 0.7101 | 0.7037 | 0.5188 | 0.4595 |
| TLVQM      | 0.7729 | 0.7688 | 0.5770 | 0.4102 |
| ResNet-50  | 0.8018 | 0.8104 | 0.6100 | 0.3749 |
| VGG-19     | 0.7741 | 0.7845 | 0.5841 | 0.3958 |
| KonCept512 | 0.7349 | 0.7489 | 0.5425 | 0.4260 |
| VIDEVAL    | 0.7832 | 0.7803 | 0.5845 | 0.4026 |
| RAPIQUE    | 0.8072 | 0.8175 | 0.6189 | 0.3623 |
| 2BiVQA     | 0.8463 | 0.8404 | 0.6529 | 0.3620 |

LIVE-VQC [2]:

| Methods    | SROCC  | PLCC   | KROCC  | RMSE    |
|------------|--------|--------|--------|---------|
| BRISQUE    | 0.5925 | 0.6380 | 0.4162 | 13.100  |
| NIQE       | 0.5957 | 0.6286 | 0.4252 | 13.110  |
| ILNIQE     | 0.5037 | 0.5437 | 0.3555 | 14.148  |
| VIIDEO     | 0.0332 | 0.0231 | 0.2146 | 16.654  |
| GM-LOG     | 0.5881 | 0.6212 | 0.4180 | 13.223  |
| HIGRADE    | 0.6103 | 0.6332 | 0.4391 | 13.027  |
| FRIQUEE    | 0.6579 | 0.7000 | 0.4770 | 12.198  |
| CORNIA     | 0.6719 | 0.7183 | 0.4849 | 11.832  |
| HOSA       | 0.6873 | 0.7414 | 0.5033 | 11.353  |
| V-BLIINDS  | 0.6939 | 0.7178 | 0.5078 | 11.765  |
| TLVQM      | 0.7988 | 0.8025 | 0.6080 | 10.145  |
| ResNet-50  | 0.6636 | 0.7205 | 0.4786 | 11.591  |
| VGG-19     | 0.6568 | 0.7160 | 0.4722 | 11.783  |
| KonCept512 | 0.6645 | 0.7278 | 0.4793 | 11.626  |
| VIDEVAL    | 0.7522 | 0.7514 | 0.5639 | 11.100  |
| RAPIQUE    | 0.7415 | 0.7659 | 0.5576 | 10.6653 |
| 2BiVQA     | 0.7614 | 0.8325 | 0.6212 | 9.9799  |

YouTube-UGC [3]:

| Methods    | SROCC  | PLCC   | KROCC  | RMSE   |
|------------|--------|--------|--------|--------|
| BRISQUE    | 0.3820 | 0.3952 | 0.2635 | 0.5919 |
| NIQE       | 0.2379 | 0.2776 | 0.1600 | 0.6174 |
| ILNIQE     | 0.2918 | 0.3302 | 0.1980 | 0.6052 |
| VIIDEO     | 0.0580 | 0.1534 | 0.0389 | 0.6359 |
| GM-LOG     | 0.3678 | 0.3920 | 0.2517 | 0.5896 |
| HIGRADE    | 0.7376 | 0.7216 | 0.5478 | 0.4471 |
| FRIQUEE    | 0.7652 | 0.7571 | 0.5688 | 0.4169 |
| CORNIA     | 0.5972 | 0.6057 | 0.4211 | 0.5136 |
| HOSA       | 0.6025 | 0.6047 | 0.4257 | 0.5132 |
| V-BLIINDS  | 0.5590 | 0.5551 | 0.3899 | 0.5356 |
| TLVQM      | 0.6693 | 0.6590 | 0.4816 | 0.4849 |
| ResNet-50  | 0.7183 | 0.7097 | 0.5229 | 0.4538 |
| VGG-19     | 0.7025 | 0.6997 | 0.5091 | 0.4562 |
| KonCept512 | 0.5872 | 0.5940 | 0.4101 | 0.5135 |
| VIDEVAL    | 0.7787 | 0.7733 | 0.5830 | 0.4049 |
| RAPIQUE    | 0.7610 | 0.7620 | 0.5610 | 0.4060 |
| 2BiVQA     | 0.7716 | 0.7904 | 0.5812 | 0.4047 |

All-Combined:

| Methods    | SROCC  | PLCC   | KROCC  | RMSE   |
|------------|--------|--------|--------|--------|
| BRISQUE    | 0.5695 | 0.5861 | 0.4030 | 0.5617 |
| NIQE       | 0.4622 | 0.4773 | 0.322  | 0.6112 |
| ILNIQE     | 0.4592 | 0.4741 | 0.3213 | 0.6119 |
| VIIDEO     | 0.1039 | 0.1621 | 0.0688 | 0.6804 |
| GM-LOG     | 0.5650 | 0.5942 | 0.3995 | 0.5588 |
| HIGRADE    | 0.7398 | 0.7368 | 0.5471 | 0.4674 |
| FRIQUEE    | 0.7568 | 0.7550 | 0.5651 | 0.4549 |
| CORNIA     | 0.6764 | 0.6974 | 0.4846 | 0.4946 |
| HOSA       | 0.6957 | 0.7082 | 0.5038 | 0.4893 |
| V-BLIINDS  | 0.6545 | 0.6599 | 0.4739 | 0.5200 |
| TLVQM      | 0.7271 | 0.7342 | 0.5347 | 0.4705 |
| ResNet-50  | 0.7557 | 0.7747 | 0.5613 | 0.4385 |
| VGG-19     | 0.7321 | 0.7482 | 0.5399 | 0.4610 |
| KonCept512 | 0.6608 | 0.6763 | 0.4759 | 0.5091 |
| VIDEVAL    | 0.7960 | 0.7939 | 0.6032 | 0.4268 |
| RAPIQUE    | 0.8086 | 0.8186 | 0.6148 | 0.4076 |
| 2BiVQA     | 0.8003 | 0.7941 | 0.6088 | 0.4218 |

References

[1] V. Hosu, F. Hahn, M. Jenadeleh, H. Lin, H. Men, T. Szirányi, S. Li, and D. Saupe, "The Konstanz natural video database (KoNViD-1k)," in 2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX). IEEE, 2017, pp. 1–6.

[2] Z. Sinno and A. C. Bovik, "Large-scale study of perceptual video quality," IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 612–627, 2018.

[3] Y. Wang, S. Inguva, and B. Adsumilli, "YouTube UGC dataset for video compression research," in 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP). IEEE, 2019, pp. 1–5.